Bringing together worldwide experts in the field, the Handbook of Chemoinformatics Algorithms provides an overview of the most common chemoinformatics algorithms in a single source. After a historical perspective of the applications of algorithms and graph theory to chemical problems, the book presents algorithms for two-dimensional chemical structures and three-dimensional representations of molecules. It then focuses on molecular descriptors, virtual screening methods, and quantitative structure-activity relationship (QSAR) models, before introducing algorithms to enumerate and sample chemical structures. The book also covers computer-aided molecular design, reaction network generation, and open source software and database technologies.
|Published (Last):||23 June 2016|
For example, the molecular center-of-mass may be placed on the origin, so that all molecules are centered at the same location.
Algorithm 3. Centering the molecule around the origin is then done by subtracting the coordinates of the center-of-mass from the atomic coordinates. However, although Cartesian coordinates are universal, they are not always the best choice with respect to computation times or algorithm simplicity.
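The centering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's algorithm listing: the function names `center_of_mass` and `center_on_origin` are hypothetical, and the masses in the usage lines are rounded, illustrative values.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def center_of_mass(masses: Sequence[float], coords: Sequence[Vec3]) -> Vec3:
    """Return the mass-weighted mean position of the atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )


def center_on_origin(masses: Sequence[float], coords: Sequence[Vec3]) -> List[Vec3]:
    """Subtract the center-of-mass from every atomic coordinate."""
    com = center_of_mass(masses, coords)
    return [tuple(c[axis] - com[axis] for axis in range(3)) for c in coords]


# Illustrative two-atom system (rounded masses, arbitrary distance).
masses = [12.0, 16.0]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
centered = center_on_origin(masses, coords)
```

After centering, recomputing the center-of-mass of `centered` should give the origin up to floating-point rounding.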
For geometry optimization calculations, internal coordinates are more suitable, requiring less computation to reach the same results. Tomczak reports an approximately fourfold speedup when using internal coordinates instead of Cartesian coordinates [ 3 ]. Internal coordinates describe the molecular geometry in terms of distances between atoms, and angles and torsions between bonds.
This closely overlaps with force field approaches where the molecular energy is expressed in terms of bond length, angles, and torsions, defining a well-structured search space for geometrical optimization. Many molecular dynamics and quantum mechanics algorithms take advantage of this representation. The internal coordinates for ethanol shown in Figure 3. The atomic numbering is the same as for the list of Cartesian coordinates and is shown in Figure 3.
These coordinates are interpreted as follows. The first distance given 1. The first angle given The first torsion angle Note that atom 1 is located behind atom 2 in this figure. These lines do not necessarily have to coincide with bonds.

Figure 3. Note that atom 1 is depicted behind atom 2. The lines between atoms are not bonds; in fact, atom 3 is bonded to atom 2 and not to atom 1. However, the vector between atom 4 and atom 2 does coincide with an actual bond.
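To make the three kinds of internal coordinates concrete, the sketch below computes a distance, an angle, and a signed torsion from Cartesian positions. The helper names are hypothetical, and the sign of the torsion follows one common convention (positive when the front atom must rotate clockwise to eclipse the rear atom, looking along the central vector); other software may use the opposite sign.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def distance(a, b):
    """Distance between points a and b."""
    return _norm(_sub(a, b))


def angle(a, b, c):
    """Angle a-b-c in degrees, with the vertex at b."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def torsion(a, b, c, d):
    """Signed torsion a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Note that the four points passed to `torsion` need not be bonded to each other, which matches the remark above that the lines between atoms do not have to coincide with bonds.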
The unit cell axes can be described as in notional axes 5. Alternatively, the axes can be described as vectors in Euclidean space. This leaves a choice of how to rotate the unit cell in Euclidean space.
If we fix the A axis on the x axis and the B axis in the xy plane, then the orientation of the unit cell in Euclidean space is fixed. The coordinates of atoms in the unit cell are expressed as fractions of the axes A, B, and C. The fractional coordinates of the four sodium atoms in the shown unit cell are 0, 0, 0, 0. The chloride ions are located at 0. The notional coordinates of this unit cell are defined by the A, B, and C axis lengths all 5.
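With the A axis fixed on the x axis and the B axis in the xy plane, the conversion from fractional to Cartesian coordinates can be sketched as follows. This uses the standard crystallographic construction of the three axis vectors from the axis lengths and angles; the function names are hypothetical, and the cell parameters in the usage lines are illustrative values, not those of the unit cell discussed in the text.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Build Cartesian axis vectors from notional axes.

    Lengths a, b, c; angles in degrees, with alpha between B and C,
    beta between A and C, and gamma between A and B.
    A lies on the x axis and B lies in the xy plane.
    """
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    sg = math.sin(math.radians(gamma))
    # Volume of the cell with unit axis lengths.
    vol = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return (
        (a, 0.0, 0.0),
        (b * cg, b * sg, 0.0),
        (c * cb, c * (ca - cb * cg) / sg, c * vol / sg),
    )


def fractional_to_cartesian(frac, vectors):
    """Cartesian position = u*A + v*B + w*C for fractional (u, v, w)."""
    return tuple(
        sum(frac[k] * vectors[k][axis] for k in range(3))
        for axis in range(3)
    )


# Illustrative cubic cell with an assumed edge length of 4.0.
vectors = cell_vectors(4.0, 4.0, 4.0, 90.0, 90.0, 90.0)
position = fractional_to_cartesian((0.5, 0.5, 0.0), vectors)
```

For a cubic cell the three vectors reduce to the scaled x, y, and z unit vectors, so the conversion is a simple multiplication by the edge length; the general form only matters for non-orthogonal cells.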
These diagrams are aimed at graphical visualization of the connection table and typically focus on depiction of atom and bond properties, such as isotope and charge details for atoms, and bond properties like bond order, delocalization, and stereochemistry. This 2D coordinate space is outside the scope of this chapter. It is mentioned here, however, because 2D diagrams are often the input in algorithms that create 3D molecular structures.
These algorithms create 3D Cartesian coordinates from the information presented in 2D molecular representations. Primarily, this information includes the connection table, and atom- and bond-type information.
Additionally, coordinate generation for ring systems can use a template library that may or may not contain information on the layout of the attachment points to assemble the geometries of ring and nonring systems.
The general concept is given in Algorithm 3. Algorithms to interconvert coordinate systems are abundant, but may differ in detail between implementations. This section discusses two algorithms: conversion of internal coordinates into Cartesian coordinates and conversion of fractional coordinates into Cartesian coordinates.
The algorithm has two degrees of freedom: (1) at which Cartesian coordinates the first atom is placed and (2) in which plane the first two bonds are located. The algorithm description given in Algorithm 3. Atom Numbering Follows those from Table 3.
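A minimal sketch of the internal-to-Cartesian conversion is given below. It is one possible implementation under stated assumptions, not the book's listing: the two degrees of freedom are resolved by placing the first atom on the origin, the second on the x axis, and the third in the xy plane. The row layout (reference atom, distance, reference atom, angle, reference atom, torsion, with 0-based indices) and the function name `zmatrix_to_cartesian` are hypothetical, and the geometry in the usage lines is illustrative rather than the ethanol example from the text.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)


def _combine(origin, terms):
    """origin + sum(scale * vector) over (scale, vector) pairs."""
    return tuple(origin[k] + sum(s * v[k] for s, v in terms) for k in range(3))


def zmatrix_to_cartesian(rows):
    """Convert internal coordinates to Cartesian coordinates (angles in degrees)."""
    pos = []
    for k, row in enumerate(rows):
        if k == 0:
            pos.append((0.0, 0.0, 0.0))          # first atom on the origin
        elif k == 1:
            _, d = row
            pos.append((d, 0.0, 0.0))            # second atom on the x axis
        elif k == 2:
            i, d, j, theta = row                 # third atom in the xy plane
            t = math.radians(theta)
            u = _unit(_sub(pos[j], pos[i]))      # points along +x or -x
            pos.append(_combine(pos[i], [(d * math.cos(t), u),
                                         (d * math.sin(t), (0.0, 1.0, 0.0))]))
        else:
            ia, d, ib, theta, ic, phi = row
            t, p = math.radians(theta), math.radians(phi)
            a, b, c = pos[ia], pos[ib], pos[ic]
            bc = _unit(_sub(a, b))               # axis from angle ref to distance ref
            n = _unit(_cross(_sub(b, c), bc))    # normal of the c-b-a plane
            m = _cross(n, bc)                    # completes the local frame
            pos.append(_combine(a, [(-d * math.cos(t), bc),
                                    (d * math.sin(t) * math.cos(p), m),
                                    (d * math.sin(t) * math.sin(p), n)]))
    return pos


# Illustrative four-atom chain: unit distances, right angles, 90 degree torsion.
rows = [(), (0, 1.0), (1, 1.0, 0, 90.0), (2, 1.0, 1, 90.0, 0, 90.0)]
xyz = zmatrix_to_cartesian(rows)
```

Choosing a different origin or plane for the first three atoms yields the same molecule translated and rotated as a whole, which is why these choices are free.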