Cheminformatics Glossary



A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | W | X | Z

A


Acca
In the context of molecular modelling or chemical informatics, Acca is a program which assists in conformation searching, by transferring information between related conformation searches.

ACID
In the context of databases, ACID stands for Atomic, Consistent, Isolated and Durable. The ACID test for a database transaction requires: (i) Atomic - a transaction either succeeds completely, or fails completely, so the database is not left in a half-updated state. (ii) Consistent - a transaction always leaves the database in a correct state. (iii) Isolated - executing transactions do not affect other transactions. (iv) Durable - once a transaction has been committed, the data should survive subsequent system failures.
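
The atomic property can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name vials, its columns and the function move_stock are hypothetical, chosen only for this example; SQLite stands in for whichever database is actually in use.

```python
import sqlite3

def move_stock(conn, src, dst, mg):
    """Move `mg` milligrams between two vials as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE vials SET mg = mg - ? WHERE label = ?", (mg, src))
            (left,) = conn.execute(
                "SELECT mg FROM vials WHERE label = ?", (src,)
            ).fetchone()
            if left < 0:
                # Abort: the first UPDATE is undone by the rollback.
                raise ValueError("not enough material in source vial")
            conn.execute("UPDATE vials SET mg = mg + ? WHERE label = ?", (mg, dst))
        return True
    except ValueError:
        return False

# Hypothetical in-memory inventory used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vials (label TEXT PRIMARY KEY, mg INTEGER)")
conn.executemany("INSERT INTO vials VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

ok = move_stock(conn, "A", "B", 30)       # succeeds completely
failed = move_stock(conn, "A", "B", 500)  # fails completely, nothing changes
stock = dict(conn.execute("SELECT label, mg FROM vials"))
```

After the failed call the source vial still holds the amount left by the first transfer, so the database is never seen in a half-updated state.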

ADME
ADME stands for Absorption, Distribution, Metabolism and Excretion. These four aspects of how a drug behaves in the body are all important.

AM1
Austin Model 1. A semi-empirical molecular orbital method

AMBER
A molecular mechanics force field

API
Application Programming Interface: many computer programs (including operating systems) are designed so that other programs can access some of their functionality. The specification of how to do this is the API.

ASP
Application Service Provider: Applications can be delivered over the internet as well as data. Is this the future? Many companies are looking into this evolving delivery model. More information is available from the ASP Industry Consortium. (October 2000)

ASP
Active Server Pages: a web page which contains a script which works out what information to send to the user. Typically it is used to process a request for some information from a database. For more information see the ASP toolbox or LearnASP (September 2001)


B


BDNR
Block Diagonal Newton Raphson. A minimisation algorithm

BFGS
Broyden-Fletcher-Goldfarb-Shanno. A minimisation algorithm

BLAST
Basic Local Alignment Search Tool: a set of similarity search programs for DNA and protein sequences, originally published in the Journal of Molecular Biology (1990, 215(3):403-10 Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ). Several web resources are available including the NCBI - NIH and Washington University.

BLYP
A functional for density functional calculations

BRENDA
A collection of enzyme functional data


C


CADPAC
Cambridge Analytical Derivatives Package. An ab initio molecular orbital theory package

CAOS
Computer-Aided Organic Synthesis

CAS
Chemical Abstracts Service. An organisation connected with the American Chemical Society which abstracts the world's chemical literature.

CCD
Cambridge Crystallographic Database (also called CCDC for Cambridge Crystallographic Data Centre)

CFF93
A molecular mechanics force field

CHARMm
A molecular mechanics force field

Chemical Informatics
'Computer-assisted storage, retrieval and analysis of chemical information, from data to chemical knowledge.' Chem. Inf. Lett. 2003, 6, 14. This definition is distinct from 'Chemoinformatics' (and the synonymous cheminformatics and chemiinformatics) which focus on drug design.

Chemiinformatics
See Chemoinformatics and Chemical Informatics

Cheminformatics
See Chemoinformatics and Chemical Informatics

Chemoinformatics
'The mixing of those information resources [information technology and information management] to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization.' (Frank K Brown 'Chemoinformatics: what is it and how does it impact drug discovery.' Ann. Rep. Med. Chem. 1998, 33, 375-384.) This article also says that chemometrics is a subset of chemoinformatics. See also Chemical Informatics, which includes chemoinformatics and also encompasses areas of chemistry outside drug design.

Chemometrics
Statistical analysis of chemical data

CIF
Crystallographic Information File. A standard format to exchange crystallographic information

CML
Chemical Markup Language: an SGML for chemistry, designed by Peter Murray-Rust. A browser is available, called JUMBO
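
As a rough illustration of how marked-up chemistry can be read by a program, the sketch below parses a simplified, CML-style fragment with Python's standard XML parser. The fragment is an assumption written for this example (no namespace, water only) and is not a complete or validated CML document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified CML-style description of water, written for this example.
CML_TEXT = """
<molecule id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>
"""

root = ET.fromstring(CML_TEXT.strip())

# Count atoms by element and list each bond as a pair of atom ids.
elements = Counter(atom.get("elementType") for atom in root.iter("atom"))
bonds = [tuple(bond.get("atomRefs2").split()) for bond in root.iter("bond")]
```

Because the labels are explicit, the same few lines work for any molecule written in this style.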

CNDO
Complete Neglect of Differential Overlap. A semi-empirical molecular orbital method

CODATA
Committee on Data for Science and Technology

COMFA
Comparative Molecular Field Analysis: a 3D-QSAR technique, which explores molecular fields around a molecule. (Cramer, R. D.; Patterson, D. E.; Bunce, J. D. J. Am. Chem. Soc. 1988, 110, 5959-5967.)

Concord
A program, developed by Robert Pearlman, for generating 3D structures from 2D information, distributed by Tripos.

CORBA
Common Object Request Broker Architecture

CORINA
A program from the Gasteiger Group at Erlangen, which automatically generates 3D molecular structures from 2D information

COSMIC
A molecular mechanics force field, and also a molecular modelling program

CPK
Corey, Pauling and Koltun design for plastic models of molecules

CVFF
Consistent Valence Force Field. See CFF93


D


DFT
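Density Functional Theory. An approach to electronic structure calculations which works with the electron density, as an alternative to ab initio molecular orbital theory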

DNA
Deoxyribonucleic acid

DOM
Document Object Model

DOS
Disk Operating System. Microsoft's operating system for PCs which has grown into Windows.

DOS
Denial of Service: Computers can be attacked to prevent them providing access to their resources. More information is available from CERT.

DREIDING
A molecular mechanics force field

dtd
Document Type Definition. This is an explanation of all the labels that may be used in an SGML.

Dublin Core
The Dublin Core is a set of core elements which can usefully be used to structure metadata. The name comes from a workshop in Dublin, Ohio.


E


Eadfrith
Eadfrith is a free program which produces high-quality pictures of molecules.

ECEPP
A molecular mechanics force field

ExPASy
ExPASy (Expert Protein Analysis System) is the proteomics server of the Swiss Institute of Bioinformatics (SIB)


F


FEP
Free Energy Perturbation


G


GAMESS
An ab initio molecular orbital theory package

Gaussian
An ab initio molecular orbital theory package

Global minimum
The lowest energy point on a potential energy surface

Globus
The Globus project provides a toolkit of software tools that make it easier to build computational grids and grid-based applications.

GPL
GNU General Public License. How can a program be licensed as free software? GPL is one answer.

GRID
A program for finding binding sites on biologically important macromolecules, developed by Peter Goodford (J. Med. Chem. 1985, 28, 849-857).

GRID
Computational Grids enable computation as well as data to be shared over a network of computers.

GROMACS
A molecular dynamics package, primarily designed for biochemical molecules like proteins and lipids

GROMOS
A molecular mechanics force field


H


Hessian
A matrix of the second derivatives of energy with respect to molecular coordinates. The Hessian can be used to determine whether a stationary point is a minimum (no negative eigenvalues), a transition state (exactly one negative eigenvalue, corresponding to one imaginary-frequency normal mode) or a higher order saddle point.
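
The classification can be sketched for a model surface with two coordinates. Everything here is an assumption made for illustration: the surface function, the finite-difference step and the tolerance are hypothetical, and a real molecule would also have zero eigenvalues for overall translation and rotation, which this sketch ignores.

```python
import math

def hessian_2d(f, x, y, h=1e-3):
    """Central finite-difference second derivatives of f(x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

def eigenvalues_sym_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], ascending."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

def classify(f, x, y, tol=1e-6):
    """Classify a stationary point by counting negative Hessian eigenvalues."""
    eigs = eigenvalues_sym_2x2(*hessian_2d(f, x, y))
    if any(abs(e) <= tol for e in eigs):
        return "undetermined"
    negative = sum(1 for e in eigs if e < 0)
    return ["minimum", "transition state", "higher order saddle point or maximum"][negative]

def surface(x, y):
    """Hypothetical double-well surface: minima at (+1, 0) and (-1, 0)."""
    return (x * x - 1) ** 2 + y * y

at_well = classify(surface, 1.0, 0.0)
at_barrier = classify(surface, 0.0, 0.0)
```

On this surface the two wells are minima and the point between them has a single negative eigenvalue, so it is classified as a transition state.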

HOMO
Highest Occupied Molecular Orbital

HTML
HyperText Markup Language. This is the language that WWW browsers understand

http
Hypertext Transfer Protocol


I


ICSTI
International Council for Scientific and Technical Information

IUBMB
International Union of Biochemistry and Molecular Biology

IUPAC
International Union of Pure and Applied Chemistry


J


Java
A computer language which was designed with World-Wide Web applications particularly in mind

Javascript
A client-side, HTML-embedded scripting language for World-Wide Web browsers, which is not closely related to Java.

Journal Abbreviations
Chemistry journals have standard abbreviations, which are listed at the University of British Columbia website

JCAMP
The Joint Committee on Atomic and Molecular Physical Data developed standard data formats. The work has now been taken over by IUPAC. JCAMP formats are used for NMR, Mass Spectrometry and other spectral data.


K


KEGG
Kyoto Encyclopedia of Genes and Genomes


L


LCAO
Linear combinations of atomic orbitals. A technique used to build up molecular orbitals.

LDA
Lithium diisopropylamide: a strong, non-nucleophilic base

LDAP
Lightweight Directory Access Protocol. See, for example, the OpenLDAP project

Lhasa
Lhasa is a program to help plan organic syntheses, originally developed by E J Corey at Harvard.

Local minimum
A structure that is minimised with respect to all its coordinates, but higher in energy than the global minimum

LUMO
Lowest Unoccupied Molecular Orbital


M


MacroModel
A molecular modelling program (Mohamadi, F.; Richards, N. G. J.; Guida, W. C.; Liskamp, R.; Lipton, M.; Caufield, C.; Chang, G.; Hendrickson, T.; Still, W. C. "MacroModel- an Integrated Software System for Modeling Organic and Bioorganic Molecules using Molecular Mechanics" J. Comp. Chem. 1990, 11, 440-467.)

Magnus
Magnus is a group of programs for doing chemical calculations and handling chemical information. Most of the programs will run within a web browser.

MC
Monte Carlo. The name for any of a wide range of stochastic methods which involve random numbers, or even a mountain with a casino. The meaning may be clear from the context.
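
A minimal sketch of the stochastic idea: estimating pi by scattering random points over a unit square and counting how many fall inside the quarter circle. The function name, sample count and seed are arbitrary choices for this example, and it is not a molecular simulation.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4
    return 4.0 * inside / n_samples

estimate = estimate_pi(100_000)
```

The error falls off only as the inverse square root of the number of samples, which is why Monte Carlo methods usually need many random trials.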

MD
Molecular Dynamics

MIME
Multipurpose Internet Mail Extension. A MIME-type describes the sort of information that a mail message, or other computer file, contains, and so a computer knows whether to expect an image, a molecule, or a spectrum, for example.

MINDO/3
A semi-empirical molecular orbital method

MM2, MM3, MM4
Molecular mechanics force fields

MM2*
MM2 as implemented in MacroModel

MMFF
A molecular mechanics force field

MNDO
A semi-empirical molecular orbital method
 
MOL
The MOL file format is defined by MDL (Molecular Design Ltd). A MOL file can describe a chemical structure, but not properties or references. For further detail, see the manual 'MDL CTfile Formats' provided by MDL.

MOPAC
A semi-empirical molecular orbital program

MP2
Second order Møller-Plesset correction to a Hartree-Fock calculation

mSQL
A lightweight database engine developed by David Hughes

Multiplicity
A measure of the number of unpaired electrons in a molecule. Singlet multiplicity means that all the electron spins are paired; a doublet has one unpaired spin.
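
The relationship is multiplicity = 2S + 1, where S is the total spin; with n unpaired electrons of parallel spin, S = n/2 and the multiplicity is n + 1. A small sketch, with hypothetical function names:

```python
NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}

def multiplicity(n_unpaired):
    """Spin multiplicity 2S + 1 for n unpaired electrons with parallel spins."""
    if n_unpaired < 0:
        raise ValueError("number of unpaired electrons cannot be negative")
    # S = n_unpaired / 2, so 2S + 1 = n_unpaired + 1
    return n_unpaired + 1

def multiplicity_name(n_unpaired):
    """Conventional name for the multiplicity, where one is listed above."""
    m = multiplicity(n_unpaired)
    return NAMES.get(m, f"multiplicity {m}")

closed_shell = multiplicity_name(0)    # e.g. water
methyl_radical = multiplicity_name(1)  # one unpaired electron
oxygen_ground = multiplicity_name(2)   # ground-state O2 has two unpaired electrons
```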

MySQL
An open-source database management system, available under GPL (GNU General Public License), developed by Monty Widenius.


N


NCE
New Chemical Entity

NCI
National Cancer Institute: a part of the NIH (qv)

NIH
National Institutes of Health, USA

NIST
National Institute of Standards and Technology, USA

NSF
National Science Foundation

nOe
Nuclear Overhauser Effect. Used in NMR spectroscopy to determine which atoms are close to each other


O


OASIS
The Organization for the Advancement of Structured Information Standards (OASIS) is a non-profit, international consortium that creates interoperable industry specifications based on public standards such as XML and SGML (See OMG).

OMG
The Object Management Group (OMG) is an open membership, not-for-profit organisation that produces and maintains computer industry specifications. Its specifications include CORBA and UML.

OPLS
A molecular mechanics force field

Oracle
The world's leading supplier of software for information management, and the world's second largest independent software company. The Oracle database, which uses SQL, is being made increasingly internet aware.


P


P2P
Peer-to-peer: a concept for networking computers, used by Gnutella and other applications.

P3P
Platform for Privacy Preferences Project (P3P), not to be confused with P2P, is a simple, automated way for users to gain more control over the use of personal information, developed by the World Wide Web Consortium.

PCA
Principal Component Analysis. A set of variables which may be correlated is transformed to a smaller set of uncorrelated variables.
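
A minimal sketch for the simplest case, two correlated variables reduced to one component, using the closed-form eigenvectors of a 2x2 covariance matrix. The data points and the function name are made up for this example; practical work would use a numerical library.

```python
import math

def pca_2d(points):
    """Return (first axis, scores on that axis, fraction of variance explained)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix in closed form
    mean = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    largest, smallest = mean + radius, mean - radius

    # Direction of the eigenvector belonging to the largest eigenvalue
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))

    # Project each centred point onto the first principal axis
    scores = [x * axis[0] + y * axis[1] for x, y in centred]
    return axis, scores, largest / (largest + smallest)

# Made-up data in which the second variable is roughly twice the first.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
axis, scores, explained = pca_2d(data)
```

For data like these almost all of the variance lies along the first axis, so the single score per point can replace the original pair of variables with little loss.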

PCR
Polymerase Chain Reaction

PCR
Principal Component Regression. A combination of principal component analysis (PCA) with a regression analysis.

PDB
Protein Data Bank, formerly the Brookhaven Protein Data Bank

PDF
Portable Document Format.

Perl
Perl is an interpreted language optimized for scanning text files, extracting information and printing reports.

PHP
PHP: Hypertext Preprocessor. A server-side, HTML-embedded scripting language, closely connected to database access. PHP offers compatibility with a number of SQL database servers. Its syntax is borrowed from C, Java and Perl.

PLS
Partial Least Squares. A fitting algorithm closely related to principal component regression (PCR)

PM3
Parameterised Model 3. A Semi-Empirical Molecular Orbital Theory Hamiltonian, developed by Stewart

PostgreSQL
PostgreSQL is an open-source object-relational database management system.

PRCG
Polak-Ribiere Conjugate Gradient algorithm for minimisation


Q


QM
Quantum Mechanics

QSAR
Quantitative Structure Activity Relationship


R


RDBMS
Relational Database Management System

RDF
Radial Distribution Function

RDF
Resource Description Framework: a foundation for processing metadata providing interoperability between applications that exchange machine-understandable information on the Web

RDF
RDFile (reaction-data file) is a file format from MDL.

RFP
Request for proposals. The OMG periodically issues requests for proposals for standards in data exchange and interoperable applications

RHF
Restricted Hartree-Fock. A useful approximation in ab initio molecular orbital theory, which forces all electrons to be paired


S


SCF
Self Consistent Field. Used in molecular orbital theory

SDF
The SDF (Structure/Data File) format is defined by MDL (Molecular Design Ltd). An SDF file can contain multiple compounds together with properties and references. For further detail, see the manual 'MDL CTfile Formats' provided by MDL.
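
A minimal reading sketch, assuming the common V2000 layout: each record is a MOL block (three header lines, then a counts line whose first two three-character fields give the numbers of atoms and bonds), followed by data items introduced by lines such as ">  <NAME>", and closed by a line containing "$$$$". The function names and the sample records are hypothetical, and this is not a full CTfile parser.

```python
def parse_record(lines):
    """Parse one SDF record given as a list of lines (without the $$$$ line)."""
    counts = lines[3]  # counts line follows the three header lines
    record = {
        "title": lines[0].strip(),
        "atoms": int(counts[0:3]),
        "bonds": int(counts[3:6]),
        "data": {},
    }
    i = 4
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            name = line[line.index("<") + 1 : line.rindex(">")]
            i += 1
            values = []
            # A data item's value runs until the next blank line
            while i < len(lines) and lines[i].strip():
                values.append(lines[i])
                i += 1
            record["data"][name] = "\n".join(values)
        i += 1
    return record

def parse_sdf(text):
    """Split SDF text into records at each $$$$ delimiter line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return records

# Two hand-written records used only for this illustration.
SDF_TEXT = "\n".join([
    "water", "  example", "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9600    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9300    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0", "  1  3  1  0", "M  END",
    ">  <NAME>", "water", "", ">  <MW>", "18.02", "", "$$$$",
    "carbon atom", "  example", "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    ">  <NAME>", "methane (heavy atom only)", "", "$$$$",
])

records = parse_sdf(SDF_TEXT)
```

Each record keeps its structure block and its properties together, which is what distinguishes an SDF file from a single MOL file.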
 
SELF
Standard Electronic Data Files: developed at an IUCOSPED meeting, an IUPAC task group chaired by Dr Henry Kehiaian

SGML
Standard Generalised Markup Language. HTML is an SGML with a particular dtd.

SHAKE
A method of speeding up molecular dynamics simulations by constraining C-H bond lengths

Simplex
A simple minimisation algorithm which does not require the calculation of derivatives

SNP
Single Nucleotide Polymorphisms. DNA sequence variations

SN1
Substitution, Nucleophilic, Unimolecular

SN2
Substitution, Nucleophilic, Bimolecular

SOMO
Singly occupied molecular orbital. Used instead of HOMO or LUMO when the highest occupied orbital contains only one electron.

Spartan
An ab initio molecular orbital theory package

SQL
Structured query language: a language for interacting with relational databases including Oracle, MySQL, PostgreSQL, and mSQL. There are several dialects of SQL, and a standardisation process. SQL can be used for a variety of tasks including: querying data, updating and deleting rows in a table, altering objects, controlling access to a database and ensuring database consistency.
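
The querying, updating and deleting tasks can be sketched with SQLite, used here through Python's built-in sqlite3 module in place of the database systems named above. The compounds table and its columns are hypothetical, and dialect details differ between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Define a table and load a few rows (hypothetical schema)
conn.execute("CREATE TABLE compounds (name TEXT PRIMARY KEY, formula TEXT, mw REAL)")
conn.executemany(
    "INSERT INTO compounds VALUES (?, ?, ?)",
    [("water", "H2O", 18.02), ("ethanol", "C2H6O", 46.07), ("benzene", "C6H6", 78.11)],
)

# Querying data: names of compounds heavier than 40, lightest first
heavy = [row[0] for row in conn.execute(
    "SELECT name FROM compounds WHERE mw > ? ORDER BY mw", (40,)
)]

# Updating and deleting rows
conn.execute("UPDATE compounds SET mw = ? WHERE name = ?", (18.015, "water"))
conn.execute("DELETE FROM compounds WHERE name = ?", ("benzene",))
conn.commit()

remaining = conn.execute("SELECT name, mw FROM compounds ORDER BY name").fetchall()
```

The question marks are placeholders filled in by the driver, which avoids building SQL statements by pasting strings together.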

STO
Slater Type Atomic Orbital: An early basis set for molecular orbital theory. These are close in shape to atomic orbitals, but much harder to manipulate mathematically than gaussian functions, so the latter are now used almost exclusively.

SVG
Scalable Vector Graphics, an XML-based vector graphics format developed by the World Wide Web Consortium.

SWISS-PROT
A curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases


T


TIP3P, TIP4P
Models for the properties of water molecules (Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. "Comparison of Simple Potential Functions for Simulating Liquid Water" J. Chem. Phys. 1983, 79, 926-935.)

Tripos
A molecular mechanics force field


U


UFF
A molecular mechanics force field

UHF
Unrestricted Hartree-Fock. Unlike RHF, this permits a system to have any multiplicity. Can lead to spin contamination

URL
Uniform Resource Locator: the address of a WWW page


W


WWW
World Wide Web


X


XED
Extended Electron Distribution (Vinter, J. G. Extended electron distributions applied to the molecular mechanics of some intermolecular interactions J Comp.-Aided. Mol. Design 1994, 8, 653-668.)

XML
Extensible Markup Language: a unified format for structured documents and data on the web. A less general, and perhaps more useful, SGML.


Z


Z-matrix
An internal coordinate description of a molecule
