Will it crystallise?

One of the biggest barriers to studying the structures of molecules is the difficulty of obtaining them in a crystalline form suitable for x-ray diffraction. Now, Richard Cooper and Jerome Wicker at the University of Oxford, UK, have developed a machine learning approach to predict whether a small organic molecule will crystallise. Since crystallinity is vital both for determining structures and for delivering many drugs, such predictions could indicate in advance which compounds are worth trying to crystallise.

Machine learning involves the construction of algorithms that can learn from data, and it has been used in the past to predict the solubilities and melting points of materials. Cooper and Wicker set out to test whether simple two-dimensional information, such as atom types, bond types and molecular volume, could be used to predict if a material would crystallise.

Data sets were obtained from the Cambridge Crystallographic Data Centre (CCDC) and ZINC, a database of commercially available chemical compounds. The model was trained and tested on a few properties of the molecules to determine which were the most significant in predicting crystallinity. Rotatable bond count and ⁰χᵛ, a molecular connectivity index that gives an indirect measure of 3D volume, proved to be the key variables and produced a model that was 80% accurate.

⁰χᵛ was found to give the highest predictive accuracy in determining crystallisation propensity
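
To give a feel for what the zero-order valence connectivity index (⁰χᵛ) measures, here is a minimal sketch that computes it by hand for ethanol. It assumes the textbook Kier-Hall definition, in which each non-hydrogen atom contributes the inverse square root of its valence delta, and the valence delta of a second-row atom is its valence electron count minus its attached hydrogens. The article does not say how Cooper and Wicker calculated the descriptor, so the function names and the hand-written molecule description below are illustrative assumptions, not their code.

```python
import math

# Valence electron counts for a few second-row elements.
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6, "F": 7}


def valence_delta(element: str, n_hydrogens: int) -> int:
    """Kier-Hall valence delta for a second-row atom (hypothetical helper).

    Defined here as valence electrons minus the number of attached hydrogens.
    """
    return VALENCE_ELECTRONS[element] - n_hydrogens


def chi0_valence(heavy_atoms) -> float:
    """Zero-order valence connectivity index for a list of heavy atoms.

    heavy_atoms is a list of (element, attached_hydrogen_count) pairs.
    Atoms whose valence delta is zero (for example the carbon in methane)
    are not handled by this sketch.
    """
    return sum(1.0 / math.sqrt(valence_delta(el, h)) for el, h in heavy_atoms)


# Ethanol, CH3-CH2-OH, written out by hand as heavy atoms with hydrogen counts.
ethanol = [("C", 3), ("C", 2), ("O", 1)]
print(round(chi0_valence(ethanol), 3))  # 2.154
```

Because every heavy atom adds a positive term, the index grows with the size of the molecule, which is consistent with its description above as an indirect measure of volume.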

‘The analysis tells us whether a material should crystallise, and therefore when to expend effort trying to obtain a crystalline sample,’ explains Cooper. The model could also indicate whether changing a small feature, such as a functional group, might make a molecule more or less likely to crystallise.

Crystallography experts put the work into context. Simon Coles, director of the UK National Crystallography Service, says ‘many areas of science are on the verge of a new age – we have been collecting individual datasets for decades and can now apply informatics-based approaches across these collections, not only to observe trends and derive rules but also to predict.’ Pete Wood, a scientist at the CCDC, says ‘the likelihood of crystallinity, or crystallisability, of small molecules is of great significance in the pharmaceutical industry as the majority of small molecule drugs are delivered in the crystalline state.’

In the future, Cooper and Wicker hope to incorporate other variables into the model, such as temperature and solvent. They are currently testing it on a range of materials on the ‘edge of crystallinity’ to gain more insight into the mechanisms that determine whether these materials crystallise.


This article is free to access until 2 January 2015. Download it here:

J G P Wicker and R I Cooper, CrystEngComm, 2015, DOI: 10.1039/c4ce01912a
