Research project

Active Learning for Computational Polymorph Landscape Analysis

Project overview

The proposed research will develop advanced computational methods for predicting the possible crystal structures of drug-like molecules. The work is motivated by the importance of anticipating polymorphism, where a molecule can crystallise in more than one crystal structure depending on the crystallisation conditions. For pharmaceutical materials, we must know when polymorphs exist that have not yet been characterised. Such polymorphs present a risk to property control: a change in crystal structure can dramatically alter important properties of a crystalline drug, affecting its processing, tabletting and bioavailability. Hence, there has been a huge investment in crystal structure prediction methods. Predicted structures could guide experimental screening by indicating where to focus effort and, in the long run, which experimental variables to vary to maximise the likelihood of isolating new structures.
Structure prediction has progressed impressively but has still not made the expected impact on risk assessment. A root cause is over-prediction: current methods always predict many competing crystal forms, most of which are never observed. Accordingly, all candidate drug molecules appear to have significant uncertainty as to the expected extent of polymorphism, and this adversely impacts risk analysis. The underlying problem is that the lattice energy surface, on which local minima represent possible structures, is extremely complex, and current methods for predicting polymorphism do not describe this surface in sufficient detail. We will develop statistical learning methods to guide crystal structure calculations so that they efficiently map out the global features of lattice energy surfaces in a way that is not possible with current computational methods.
Two lines of study are proposed: to improve the fidelity of energetic assessment and, more importantly, to map the energy landscape of structures more globally. The starting point is to develop advanced statistical learning methods for correcting the approximate computational models used to assess the lattice energies of predicted crystal structures. Our goal is to reduce the uncertainty in the ranking of predicted structures at a controlled computational cost. We will then move to a completely unexplored problem: learning more detailed features of the lattice energy surface, such as the depth, shape and connectivity of energy basins. Key to this work is the development of multi-fidelity (multiple models of known accuracy and computational cost) and multi-objective Bayesian optimisation approaches that make use of the hierarchy of energy models (a series of approximate energy models with known, ordered accuracy) used in crystal structure prediction. The objective is to judge the thermodynamic robustness and kinetic accessibility of individual predicted crystal structures and so address the polymorphism over-prediction problem. This approach is completely new in the area and could be transformative in guiding experimental screening. Thus, the vision is that active learning methods will guide the computer simulations that, in turn, will provide guidance to experimental polymorph screening.
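As an illustration only, the idea of correcting a cheap energy model with a small, actively chosen set of expensive calculations can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's method: the functions cheap_energy and expensive_energy are hypothetical toy functions standing in for two levels of an energy-model hierarchy, each structure is reduced to a single made-up descriptor value, and the kernel settings are arbitrary. A Gaussian process is fitted to the difference between the two models, and the next structure sent for expensive evaluation is the one where the predicted correction is most uncertain.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of descriptors."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a zero-mean Gaussian process."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

def cheap_energy(x):
    """Hypothetical low-cost energy model (toy function, not a force field)."""
    return np.sin(6.0 * x)

def expensive_energy(x):
    """Hypothetical high-accuracy model: cheap model plus a smooth error term."""
    return cheap_energy(x) + 0.3 * np.cos(3.0 * x) + 0.1 * x

def learn_correction(x_pool, n_steps=6):
    """Actively choose structures for expensive evaluation, then correct all."""
    labelled = [0, len(x_pool) - 1]  # start from the two ends of the pool
    for _ in range(n_steps):
        x_lab = x_pool[labelled]
        delta = expensive_energy(x_lab) - cheap_energy(x_lab)
        _, var = gp_posterior(x_lab, delta, x_pool)
        var[labelled] = -np.inf  # never re-select an evaluated structure
        labelled.append(int(np.argmax(var)))
    # Final fit on every expensive evaluation gathered so far
    x_lab = x_pool[labelled]
    delta = expensive_energy(x_lab) - cheap_energy(x_lab)
    mean, _ = gp_posterior(x_lab, delta, x_pool)
    return labelled, cheap_energy(x_pool) + mean

x_pool = np.linspace(0.0, 1.0, 50)  # made-up descriptor values for 50 structures
labelled, corrected = learn_correction(x_pool)
reference = expensive_energy(x_pool)
cheap_error = np.max(np.abs(cheap_energy(x_pool) - reference))
corrected_error = np.max(np.abs(corrected - reference))
print(f"expensive evaluations: {len(labelled)}")
print(f"max error, cheap model: {cheap_error:.3f}")
print(f"max error, corrected:   {corrected_error:.3f}")
```

Selecting by largest posterior variance is only one possible acquisition rule, chosen here for brevity. A multi-fidelity or multi-objective scheme of the kind described above would also weigh the cost of each energy model and more than one target quantity.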

Staff

Lead researchers

Professor Graeme Day

Professor of Chemical Modelling
Research interests
  • crystal structure prediction
  • materials discovery
  • computational chemistry

Other researchers

Professor Simon Coles

Professor of Structural Chemistry
Research interests
  • The work we do is highly collaborative and multidisciplinary and can broadly be split into th…
  • 1) National Crystallography Service (NCS, www.ncs.ac.uk) & Physical Sciences Data-science…
  • 2) Structural Chemistry We have an interest in determining the mechanisms of solid-state rea…

Professor Dave Woods

Professor of Statistics
Research interests
  • Design of experiments
  • Bayesian statistics
  • Statistical computing

Research outputs

Olga Egorova, Roohollah Hafizi, David Woods & Graeme M. Day, 2020, Journal of Physical Chemistry A, 124(39), 8065-8078
Type: article