The Role of Cheminformatics in Drug Discovery
The completion of the Human Genome Project promises an avalanche of potential drug targets. Advances in genomics, proteomics and bioinformatics are accelerating drug target identification, validation and selection. In contrast, lead generation, selection and optimization are becoming a key bottleneck in drug discovery. The pressure to identify good leads, and therefore drug candidates, at an ever-increasing pace has led to the rapid advancement of cheminformatics.
Cheminformatics is the management of information about small molecules. It involves the storage, display and searching of chemical structures and their physical and biological properties. Applied to the drug discovery process, it can include anything from simple Computer-Assisted Molecular Modelling (CAMM) methods to fully integrated Computer-Aided Drug Design (CADD) technologies. CAMM has become an essential tool for structural molecular biology, with applications in drug design, protein engineering and molecular recognition. Quantitative Structure-Activity Relationships (QSAR) are established by relating the biological data of congeneric structures to physicochemical properties such as lipophilicity and electronic and steric effects. A statistically sound QSAR regression equation can be used for lead optimization. CADD, in the broadest sense, is the science and art of finding molecules of potential therapeutic value that satisfy a whole range of quantitative criteria, for example high potency, high specificity, minimal toxic effects and good bioavailability. It should be noted that CADD is a valuable tool in ligand design because it provides insight into molecular recognition processes, but making a drug involves many more processes and disciplines.
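As an illustration of what a QSAR regression equation looks like in practice, the following minimal Python sketch fits a one-descriptor linear model, activity = slope * logP + intercept, by ordinary least squares. The descriptor and activity values are invented purely for illustration and the function names are hypothetical; real QSAR models usually combine several descriptors (lipophilic, electronic and steric) and require proper statistical validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Invented example data for a congeneric series (NOT measured values):
# lipophilicity (logP) and biological activity expressed as log(1/C).
LOG_P = [0.5, 1.0, 1.5, 2.0, 2.5]
ACTIVITY = [3.1, 3.6, 3.9, 4.6, 4.9]

slope, intercept = fit_line(LOG_P, ACTIVITY)
fit_quality = r_squared(LOG_P, ACTIVITY, slope, intercept)

# Use the fitted equation to estimate the activity of an untested analogue.
predicted_activity = slope * 1.8 + intercept
```

In lead optimization, such an equation would be used to rank proposed analogues before synthesis, provided the new compounds stay within the descriptor range covered by the training series.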
CADD can be divided into four different approaches:
If both the receptor and the ligand structures are unknown, computational chemistry is used to generate 3D structures and, in parallel, to perform chemical similarity and diversity analysis before and after combinatorial-chemistry-based high-throughput screening. For example, Molecular Simulations Group (MSI; Cambridge, MA) provides a combinatorial chemistry software package that lets scientists pre-analyze and select diverse building blocks for their libraries. This promises the widest possible range of drug-like compounds and increases the chance of finding compounds that are active against their targets.
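A minimal sketch of similarity and diversity analysis, under simplifying assumptions: each building block is represented by a set of "on" fingerprint bits, similarity is the Tanimoto coefficient, and a diverse subset is chosen by a greedy MaxMin selection. The fingerprints and names below are hypothetical, and this is not a description of how any commercial package works.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_diverse(fingerprints, k):
    """Greedy MaxMin selection of up to k mutually dissimilar entries.

    Starts from the alphabetically first entry, then repeatedly adds the
    candidate whose highest similarity to anything already picked is lowest.
    """
    names = sorted(fingerprints)
    picked = [names[0]]
    while len(picked) < min(k, len(names)):
        candidates = [n for n in names if n not in picked]
        best = min(
            candidates,
            key=lambda n: max(
                tanimoto(fingerprints[n], fingerprints[p]) for p in picked
            ),
        )
        picked.append(best)
    return picked


# Hypothetical building blocks with made-up fingerprint bits.
BUILDING_BLOCKS = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},   # close analogue of A
    "C": {6, 7, 8},      # shares nothing with A
    "D": {3, 4, 9, 10},  # partly overlaps with A
}

diverse_subset = pick_diverse(BUILDING_BLOCKS, 3)
```

Here the near-duplicate of the starting block is left out in favour of the two less similar candidates, which is the behaviour a diversity selection is meant to produce.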
If only a few lead structures are known, an extension of the traditional QSAR approach, also known as ligand-based drug design, is taken. Pharmacophore models and hypotheses are developed, followed by searches of databases of molecular structures and by similarity searching based on 2D and 3D QSAR methods. Dope.de (Guelph, ON) has developed highly sophisticated, proprietary technologies for ligand-based drug design: using QSAR data of known ligands, "virtual receptors" are generated that represent the physicochemical properties of the binding site and are then used as a template for assembling novel compounds with designed properties.
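Searching a structure database with a pharmacophore hypothesis can be sketched as follows. This is a deliberately simplified model, assuming the hypothesis is a set of required distances between feature types and each database molecule is a single rigid conformer with pre-assigned feature coordinates. All names, coordinates and the tolerance are hypothetical; real tools also handle conformational flexibility and feature perception.

```python
import itertools
import math


def matches(hypothesis, features, tolerance=0.5):
    """Return True if some choice of one feature per type satisfies
    every required distance (in angstroms) within the tolerance.

    hypothesis: {(type_a, type_b): required_distance}
    features:   {feature_type: [(x, y, z), ...]}
    """
    types = sorted({t for pair in hypothesis for t in pair})
    if any(t not in features for t in types):
        return False
    # Try every assignment of one concrete feature to each required type.
    for combo in itertools.product(*(features[t] for t in types)):
        placed = dict(zip(types, combo))
        if all(
            abs(math.dist(placed[a], placed[b]) - d) <= tolerance
            for (a, b), d in hypothesis.items()
        ):
            return True
    return False


# Hypothetical three-point pharmacophore (distances in angstroms).
HYPOTHESIS = {
    ("donor", "acceptor"): 3.0,
    ("donor", "aromatic"): 5.0,
    ("acceptor", "aromatic"): 4.0,
}

# Hypothetical database of rigid conformers with made-up coordinates.
DATABASE = {
    "mol_1": {
        "donor": [(0.0, 0.0, 0.0)],
        "acceptor": [(3.0, 0.0, 0.0)],
        "aromatic": [(3.0, 4.0, 0.0)],
    },
    "mol_2": {
        "donor": [(0.0, 0.0, 0.0)],
        "acceptor": [(3.0, 0.0, 0.0)],
        "aromatic": [(10.0, 0.0, 0.0)],
    },
    "mol_3": {"donor": [(0.0, 0.0, 0.0)], "acceptor": [(3.0, 0.0, 0.0)]},
}

hits = [name for name in sorted(DATABASE) if matches(HYPOTHESIS, DATABASE[name])]
```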
"Rational drug design" is applied when only the receptor structure is known. Hereby the target protein is purified and crystallized, then its structure is investigated using X-ray and NMR technologies. Next, potential binding pockets are analyzed with a focus on the shape and physicochemical properties of regions thought to be relevant for direct ligand / target interactions. To propose new lead compounds that would be complementary to the receptor-binding site, either ligands or ligand fragments are selected by their ability to fill receptor sites. Software for this kind of de novo design includes LUDI from MSI. Well-known successes of rational design include HIV protease inhibitors such as ritonavir developed by Abbott nearly ten years ago.
If both structures are known, the ligand can be "docked" into the receptor site, and molecular mechanics can be used to simulate receptor-ligand interactions and dynamics. Although this appears fairly straightforward, "docking" software is often perceived as disappointing because it still fails to predict accurately either the position in which a ligand will bind to an active site or the relative strength of that interaction.
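To make the idea of scoring a docked pose concrete, here is a toy sketch that ranks candidate ligand poses by a pairwise Lennard-Jones 12-6 interaction energy with the receptor atoms. It assumes a single, hypothetical pair of parameters (epsilon, sigma) for every atom pair and ignores electrostatics, solvation, entropy and flexibility, which is one reason simple scores of this kind predict binding poorly.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Lennard-Jones 12-6 energy for one atom pair at distance r.

    epsilon (well depth) and sigma (zero-energy distance) are
    hypothetical values chosen for illustration only.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def interaction_energy(ligand_atoms, receptor_atoms):
    """Sum the pair energies over all ligand/receptor atom pairs."""
    return sum(
        lennard_jones(math.dist(a, b))
        for a in ligand_atoms
        for b in receptor_atoms
    )


def best_pose(poses, receptor_atoms):
    """Return the name of the pose with the lowest interaction energy."""
    return min(poses, key=lambda name: interaction_energy(poses[name], receptor_atoms))


# Made-up coordinates: a one-atom "receptor" and three one-atom ligand poses.
RECEPTOR = [(0.0, 0.0, 0.0)]
POSES = {
    "clash": [(2.0, 0.0, 0.0)],     # too close: strongly repulsive
    "contact": [(3.93, 0.0, 0.0)],  # near the energy minimum
    "far": [(10.0, 0.0, 0.0)],      # almost no interaction
}

chosen = best_pose(POSES, RECEPTOR)
```

A production docking program would also have to search over ligand positions, orientations and conformations rather than score a fixed list of poses.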
A number of tools are available for molecular modelling, supporting continuous improvements in computational chemistry. Since the mid-1980s, Silicon Graphics (Mountain View, CA) has maintained a dominant position as the hardware provider for molecular modelling applications. Oxford Molecular Group (Oxford, UK), Tripos (St. Louis, MO) and MSI are some of the leading suppliers of cheminformatics software. Newcomer Chemical Computing Group (Montreal, QC) is moving swiftly into this market with its modelling program MOE (Molecular Operating Environment), which is supplied with its source code so that end users can customize it. Other developments beneficial to the cheminformatics field are advances in artificial intelligence, such as neural networks and genetic algorithms. Whereas neural networks are based on the concept of the computer "learning" the important properties of molecules and/or their corresponding binding sites, genetic algorithms search for optimal solutions in the face of changing conditions by applying the principles of Darwinian evolution.
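The evolutionary principle behind genetic algorithms can be sketched in a few lines. This is a generic toy example, not the method of any product named above: candidate "molecules" are bit strings, and the hypothetical fitness function simply counts how many bits match a made-up target profile. Selection, one-point crossover, mutation and elitism are applied over a fixed number of generations.

```python
import random

# Made-up target profile standing in for "desired properties".
TARGET = [1, 0] * 8


def match_count(bits):
    """Hypothetical fitness: number of positions agreeing with TARGET."""
    return sum(b == t for b, t in zip(bits, TARGET))


def evolve(fitness, n_bits=16, pop_size=20, generations=40,
           mutation_rate=0.05, seed=0):
    """Return (best individual, best fitness recorded per generation)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]   # selection: keep the fitter half
        children = [parents[0][:]]       # elitism: best survives unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [b ^ 1 if rng.random() < mutation_rate else b
                     for b in child]                # random bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


best, history = evolve(match_count)
```

Because the best individual of each generation is carried forward unchanged, the recorded best fitness never decreases from one generation to the next.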
In the future we will see an even greater increase in the amount of data produced by advances in genomics, proteomics, combinatorial chemistry and high-throughput screening technologies, requiring even more sophisticated tools in bioinformatics and cheminformatics. Since high affinity alone does not make a drug, a deeper understanding and inclusion of metabolism, toxicity, pharmacokinetics and the role of physicochemical properties in the absorption process are becoming critical in computer-assisted lead finding and optimization. Only once biology and chemistry tools have been integrated seamlessly with advanced physiological computer models of organs, such as those provided by Physiome Sciences Inc. (Princeton, NJ), will the concept of virtual R&D become more of a reality.