
QSAR & Data Mining

This section of the course deals with a number of methods that use statistics to identify correlations between some observed property (usually a biological activity) and a set of molecular properties. A common application is to define an equation that reproduces the measured activity for the compounds in a particular data set, and then to use that equation to search a chemical library (which could have been constructed for a completely different purpose) for new active compounds.
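The fit-then-screen workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three descriptor columns, the activity values (generated from an arbitrary linear rule so the fit can be checked), and the helper names `fit_qsar` and `predict` are all hypothetical and are not taken from any of the papers listed below. Ordinary least squares stands in for whichever regression method a real study would use.

```python
import numpy as np

# Hypothetical training set: one row per compound, one column per descriptor.
# The descriptor values are made up for illustration only.
X_train = np.array([
    [1.2, 1.8, 1.0],
    [2.5, 2.3, 0.0],
    [0.7, 1.5, 2.0],
    [3.1, 3.0, 1.0],
    [1.9, 2.1, 3.0],
    [2.8, 2.6, 2.0],
])

# Synthetic "measured" activities, generated from an arbitrary linear rule
# (an assumption, so that the recovered coefficients can be verified).
y_train = 4.0 + X_train @ np.array([0.8, -0.5, 0.3])


def fit_qsar(X, y):
    """Fit activity = c0 + c1*d1 + ... + cn*dn by ordinary least squares."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted equation to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


coef = fit_qsar(X_train, y_train)

# Hypothetical screening library described by the same three descriptors.
library = np.array([
    [3.0, 1.0, 2.0],
    [0.5, 3.0, 0.0],
    [2.0, 2.0, 1.0],
])

scores = predict(coef, library)
ranking = np.argsort(scores)[::-1]  # library indices, most active first

print("coefficients:", np.round(coef, 3))
print("predicted activities:", np.round(scores, 3))
print("ranking:", ranking.tolist())
```

In practice the equation is only trustworthy for library compounds whose descriptors fall within the range covered by the training set, so a real screen would also check that before ranking.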

General Discussions

  • A historical perspective on QSAR (here)
  • A good general review (here)
  • Errors associated with high-throughput screening data (here)

Descriptors

  • Use of PCA to identify descriptors for the amino acids (here)
  • Use of Genetic Algorithms to identify descriptor sets, applied to the prediction of antifungal activity (here)

Papers Used in Discussing QSAR Methods

  • Linear regression methods, applied to passage of drugs into breast milk (here)
  • Neural Networks, applied to antifungal activity (here)
  • 3D QSAR: development of a pharmacophore model to identify transporters for various drugs (here)

QSAR Applications

  • Antifungal activity (here)
  • Nonsteroidal Anti-Inflammatory Drugs (here)
  • Anti-tuberculosis agents (P.M. Sivakumar et al., Chem. Pharm. Bull. 2007, 55, 44-49)
  • Melanoma toxicity (here)
  • Anti-epileptic sulfamides (here)
  • Drug transfer into human breast milk (here)
  • ADME/tox evaluation (here)