Will it crystallise?

One of the biggest barriers to studying the structures of molecules is the difficulty of obtaining them in a crystalline form suitable for X-ray diffraction. Now, Richard Cooper and Jerome Wicker at the University of Oxford, UK, have developed a machine learning approach to predict whether a small organic molecule will crystallise. Since crystallinity is vital both for determining structures and for delivering many drugs, such predictions could help researchers decide where to focus their crystallisation efforts.

Machine learning involves the construction of algorithms that can learn from data, and it has been used in the past to predict the solubilities and melting points of materials. Cooper and Wicker set out to test whether simple two-dimensional information, such as atom types, bond types and molecular volume, could be used to predict if a material would crystallise.

Data sets were obtained from the Cambridge Crystallographic Data Centre (CCDC) and from ZINC, a database of commercially available chemical compounds. The model was trained and tested on a few properties of the molecules to determine which were the most significant in predicting crystallinity. Rotatable bond count and ⁰χᵛ, a molecular connectivity index that gives an indirect measure of 3D volume, proved to be the key variables and produced a model that was 80% accurate.
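The article does not spell out how the connectivity index mentioned above is calculated. As a rough illustration only, the sketch below assumes the standard Kier-Hall definition of the zeroth-order valence connectivity index: each non-hydrogen atom contributes the inverse square root of its valence delta, which for second-row elements is the number of valence electrons minus the number of attached hydrogens. The function name chi0v, the small element table and the hand-written atom list are hypothetical choices for this example and are not taken from Cooper and Wicker's code.

```python
import math

# Valence electron counts for a few second-row elements.
# The table is deliberately small; heavier elements need a modified
# delta formula and are outside the scope of this sketch.
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6, "F": 7}


def chi0v(heavy_atoms):
    """Zeroth-order valence connectivity index (Kier-Hall definition, assumed).

    heavy_atoms: list of (element, attached_hydrogens) tuples,
    one entry per non-hydrogen atom in the molecule.
    """
    total = 0.0
    for element, n_hydrogens in heavy_atoms:
        # Valence delta for second-row atoms: valence electrons minus hydrogens.
        delta_v = VALENCE_ELECTRONS[element] - n_hydrogens
        if delta_v <= 0:
            # e.g. methane carbon: the index is undefined for this atom.
            raise ValueError(f"valence delta must be positive, got {delta_v}")
        total += 1.0 / math.sqrt(delta_v)
    return total


# Ethanol, CH3-CH2-OH: three heavy atoms with 3, 2 and 1 hydrogens.
ethanol = [("C", 3), ("C", 2), ("O", 1)]
print(round(chi0v(ethanol), 3))  # 2.154
```

Because every heavy atom adds a positive term, the index grows with the size of the molecule, which is consistent with its use as an indirect measure of volume. In practice a cheminformatics toolkit would compute this and the rotatable bond count directly from a structure file rather than from a hand-written atom list.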

⁰χᵛ was found to give the highest predictive accuracy in determining crystallisation propensity.

‘The analysis tells us whether a material should crystallise, and therefore when to expend effort trying to obtain a crystalline sample,’ explains Cooper. The model could also indicate whether changing a small feature, such as a functional group, might make a molecule more or less likely to crystallise.

Crystallography experts put the work into context. Simon Coles, director of the UK National Crystallography Service, says ‘many areas of science are on the verge of a new age – we have been collecting individual datasets for decades and can now apply informatics-based approaches across these collections, not only to observe trends and derive rules but also to predict.’ Pete Wood, a scientist at the CCDC, says ‘the likelihood of crystallinity, or crystallisability, of small molecules is of great significance in the pharmaceutical industry as the majority of small molecule drugs are delivered in the crystalline state.’

In the future, Cooper and Wicker hope to incorporate other variables into the model, such as temperature and solvent. They are currently testing the model on a range of materials on the ‘edge of crystallinity’ to gain more insight into the mechanisms that determine whether these materials crystallise.


This article is free to access until 2 January 2015. Download it here:

J G P Wicker and R I Cooper, CrystEngComm, 2015, DOI: 10.1039/c4ce01912a
