J. Chem. Inf. Comput. Sci. (2004), 44, 2133-2144. [ doi:10.1021/ci049780b ]
The crystallographically determined bond length, valence angle, and torsion angle information in the Cambridge Structural Database (CSD) has been made accessible by development of a new program (Mogul) for automated retrieval of molecular geometry data from the CSD. The program uses a system of keys to encode the chemical environments of fragments (bonds, valence angles, and acyclic torsions) from CSD structures. Fragments with identical keys are deemed to be chemically identical and are grouped together, and the distribution of the appropriate geometrical parameter (bond length, valence angle, or torsion angle) is computed and stored. Validation experiments indicate that, with rare exceptions, search results afford precise and unbiased estimates of molecular geometrical preferences. Such estimates may be used, for example, to validate the geometries of libraries of modeled molecules or of newly determined crystal structures or to assist structure solution from low-resolution (e.g. powder diffraction) X-ray data.
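The key-based grouping described above can be illustrated with a minimal sketch. It is not Mogul's implementation: the key strings, the sample values, and the function names (`group_by_key`, `summarize`, `z_score`) are all hypothetical and chosen only for illustration. The abstract does not specify how keys are encoded or which statistics are stored. The sketch assumes only that fragments sharing a key are pooled and that a distribution is summarized per key.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical fragment records as (key, measured value) pairs.
# The key strings are invented for this example and are NOT Mogul's
# actual key encoding; the bond lengths (in angstroms) are made-up samples.
fragments = [
    ("bond|C.ar-C.ar|ring6", 1.39),
    ("bond|C.ar-C.ar|ring6", 1.38),
    ("bond|C.ar-C.ar|ring6", 1.40),
    ("bond|C.sp3-O.sp3|acyclic", 1.43),
    ("bond|C.sp3-O.sp3|acyclic", 1.41),
]


def group_by_key(records):
    """Pool values of fragments that share an identical key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return dict(groups)


def summarize(values):
    """Summarize one pooled distribution (assumed statistics, for illustration)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        # Sample standard deviation needs at least two observations.
        "stdev": stdev(values) if len(values) > 1 else 0.0,
    }


def z_score(value, summary):
    """Deviation of a query value from the pooled mean, in standard deviations.

    Returns None when the spread is zero, since the ratio is undefined.
    """
    if summary["stdev"] == 0.0:
        return None
    return (value - summary["mean"]) / summary["stdev"]


# One summary per key: the "computed and stored" distribution in this sketch.
distributions = {key: summarize(vals) for key, vals in group_by_key(fragments).items()}
```

Under these assumptions, checking a modeled bond length reduces to looking up the summary for its key and calling `z_score(query_value, distributions[key])`. A large absolute value would flag the geometry as unusual relative to the pooled observations.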