Nomenclature and Conventions for Database files

General format:

Single molecules:

<molecule>(charge +/-)
e.g., NH4(+)
<molecule>(charge +/-)_<conformation specification>
e.g., P(O)(O)(OCH3)(OCH3)(-)_g_g

Molecular complexes:

<molecule>(charge +/-):<molecule>(charge +/-)
e.g., HOH:HOH
e.g., P(O)(O)(-O-CH2CH2-O-)(-):NH4(+)


<reactant A>(charge +/-)...<reactant B>(charge +/-)
e.g., CH3O(-)...P(O)(O)(-O-CH2CH2-O-)(-)
Tags such as _min, _ts, and _ns are always appended to these names (see below).
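
The general format above can be taken apart mechanically. The following is a
minimal Python sketch, not part of the database tooling: the function names
(charge_of, split_name) and the labels "single", "complex", and "reactants"
are assumptions made for illustration. It assumes that any trailing tags
(_min, _ts, ...) have already been removed and that the name carries no
metal ion part.

```python
import re

# A trailing charge looks like (+), (-), (2-), (4-), ...
CHARGE = re.compile(r"\((\d*)([+-])\)$")

def charge_of(molecule):
    """Return the integer charge written at the end of a molecule name."""
    match = CHARGE.search(molecule)
    if match is None:
        return 0  # no charge group means a neutral molecule
    magnitude = int(match.group(1)) if match.group(1) else 1
    return magnitude if match.group(2) == "+" else -magnitude

def split_name(name):
    """Split a name into its molecules and report the charge of each."""
    if "..." in name:
        kind, parts = "reactants", name.split("...")
    elif ":" in name:
        kind, parts = "complex", name.split(":")
    else:
        kind, parts = "single", [name]
    return kind, [(part, charge_of(part)) for part in parts]

print(split_name("P(O)(O)(-O-CH2CH2-O-)(-):NH4(+)"))
```

A bridging group such as (-O-CH2CH2-O-) is not mistaken for a charge,
because the charge pattern allows only digits between the opening
parenthesis and the sign.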

For any name such as Imidazole, the suffix H(+) signifies that one position
has been protonated, resulting in a (+) charge on the molecule and an
extra hydrogen.

-O-CH2CH2-O-    deprotonated ethylene glycol
-O-sugar-O-     similar to the ribose ring in RNA
Imidazole       an imidazole ring, C3H4N2

Specific Nomenclature conventions:


Non-cyclic: P(R)(S)(A)(B)

Cyclic: P(R)(S)(-A-bridge-B-)

--Since we are not concerned with optical isomers, sort the ligands first by
increasing number of atoms (i.e. one-atom groups, two-atom groups, etc.)
and then by increasing atomic number (e.g. OH before SH, but after S). In
general, this is to be used for multiple ligands that occupy energetically
indistinguishable positions upon conformational averaging.
So: P(S)(OH)(OCH3)(O)(-) should be P(O)(S)(OH)(OCH3)(-)
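
The sorting rule can be sketched in Python as follows. This is an
illustration only: the names atoms, sort_key, and canonical_order are
hypothetical, the element table covers only the elements used in this
file, and comparing the atomic numbers atom by atom in written order is an
assumption (the rule above does not say how to break ties beyond the
first atom). Bridging groups and named groups such as "sugar" are not
handled.

```python
import re

# Atomic numbers for the elements that appear in the ligand names here.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "P": 15, "S": 16}

def atoms(group):
    """Expand a group such as 'OCH3' into ['O', 'C', 'H', 'H', 'H']."""
    expanded = []
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", group):
        expanded.extend([symbol] * (int(count) if count else 1))
    return expanded

def sort_key(group):
    """Sort by number of atoms first, then by atomic numbers in order."""
    symbols = atoms(group)
    return (len(symbols), [ATOMIC_NUMBER[s] for s in symbols])

def canonical_order(ligands):
    """Return the ligands in the order the convention asks for."""
    return sorted(ligands, key=sort_key)

ligands = canonical_order(["S", "OH", "OCH3", "O"])
print("P" + "".join(f"({g})" for g in ligands) + "(-)")
```

For the example above this prints P(O)(S)(OH)(OCH3)(-).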


(S = back, into plane; R = forward, out of plane)

Non-cyclic: P(R)(S)(E)(A)(B)

Cyclic: P(R)(S)(-E-Bridge-A-)(B)

Metal ion binding

--Metal ions will be shown at the end of the file name, separated from the
phosphate/phosphorane by a colon. Immediately following the metal
abbreviation (from the periodic table) are the numbers of the atoms the
metal is bound to, separated by commas and enclosed in parentheses. The
atoms are numbered in increasing order from left to right, and only atoms
which can bind to the metal and the phosphorus are included (oxygen and sulfur,

e.g. bidentate coordination of cadmium(II) to a phosphate at the
O and S positions


or magnesium(II) binding to a phosphorane at the OH and S positions


--In the case of a dimetal species, the second metal is written in the
same format as the first, and the bridging group is surrounded by hyphens
and placed between the two metals.

e.g. P(S)(OH)(-O-sugar-O-)(OCH3):Mg(1)(HOH)4-OH-Mg(5)(HOH)5

--a two-metal complex, with one metal bound to the first group on
the phosphorane and the second bound to the fifth group, connected by a
hydroxide, with the coordination spheres filled by four and five waters,
respectively.
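
The metal part of such a name can be read back with a short Python sketch.
The function name metal_sites and the regular expression are assumptions
made for illustration; the sketch also assumes that the first colon in the
name separates the phosphate/phosphorane from the metal part, so it does
not cover names that also contain a molecular complex.

```python
import re

# A metal symbol followed by comma-separated atom numbers, e.g. Mg(1).
METAL = re.compile(r"([A-Z][a-z]?)\((\d+(?:,\d+)*)\)")

def metal_sites(name):
    """Return (metal, [atom numbers]) for each metal listed after the colon."""
    _, _, metal_part = name.partition(":")
    return [
        (symbol, [int(n) for n in numbers.split(",")])
        for symbol, numbers in METAL.findall(metal_part)
    ]

print(metal_sites("P(S)(OH)(-O-sugar-O-)(OCH3):Mg(1)(HOH)4-OH-Mg(5)(HOH)5"))
```

Water ligands such as (HOH)4 are skipped because the pattern requires
digits inside the parentheses.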

Multiple Phosphates:

--For multiple phosphates, the naming is similar. The group closest to
the bulk of the organic matter (i.e. the alpha phosphorus) is printed
first, directly following the organic matter. Its nonbridging R and S
groups are named next, followed by the next bridging atom, followed by that
phosphorus and its R and S groups, etc. For example, ATP would be named
Adenosine-O-P(O)(O)-O-P(O)(O)-O-P(O)(O)(O)(4-). A sulfur
substitution at the bridging oxygen between the beta and gamma phosphates
would be Adenosine-O-P(O)(O)-O-P(O)(O)-S-P(O)(O)(O)(4-).

e.g., something-B1-P(Xa)(Ya)-B2-P(Xb)(Yb)-B3-P(Xc)(Yc)(Tc)
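
The pattern above can be assembled programmatically. The following Python
sketch is only an illustration; the function name polyphosphate_name and
its argument layout are hypothetical.

```python
def polyphosphate_name(organic, units, terminal, charge):
    """Build a multiple-phosphate name.

    organic  -- the organic part, e.g. "Adenosine"
    units    -- one (bridge, X, Y) tuple per phosphate, alpha phosphate first
    terminal -- the extra group on the last phosphorus (Tc in the pattern)
    charge   -- the total charge as written in the name, e.g. "4-"
    """
    name = organic
    for bridge, x, y in units:
        name += f"-{bridge}-P({x})({y})"
    return name + f"({terminal})({charge})"

atp = polyphosphate_name("Adenosine", [("O", "O", "O")] * 3, "O", "4-")
print(atp)

# Sulfur at the bridging position between the beta and gamma phosphates.
thio = polyphosphate_name(
    "Adenosine",
    [("O", "O", "O"), ("O", "O", "O"), ("S", "O", "O")],
    "O",
    "4-",
)
print(thio)
```

The two calls reproduce the ATP name and the sulfur-substituted name given
above.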

Reaction suffixes:

--Structures found with a minimum energy optimization (reactants,
products, intermediates) will use _min_#, with # identifying the minimum
structure along the reaction pathway (e.g. R is the reactant structure,
I_1 is the first intermediate formed, I_2 is the second intermediate
formed, P is the product).

--Transition states will be labeled with _ts_#, where # is one of the following:

PT--proton transfer
ET--electron transfer
HT--hydride transfer
CT--conformational reaction

SN--nucleophilic substitution

To keep track of the steps of complex nucleophilic substitution reactions, we use the following conventions:

AN--bond making for nucleophilic substitution
DN--bond breaking for nucleophilic substitution
ANDN--concerted "SN2" displacement for nucleophilic substitution
DN + AN--"SN1" process for nucleophilic substitution
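
The transition state labels above lend themselves to a lookup table. The
Python sketch below is an illustration under stated assumptions: the names
TS_CODES and describe_ts are hypothetical, the label is assumed to be
written in capital letters directly after _ts_, and the "DN + AN" case is
left out because this file does not say how it is spelled in a file name.
The example name is constructed from the reactant example earlier in this
file and may not exist in the database.

```python
import re

# Labels listed in this file, keyed by the code that follows _ts_.
TS_CODES = {
    "PT": "proton transfer",
    "ET": "electron transfer",
    "HT": "hydride transfer",
    "CT": "conformational reaction",
    "SN": "nucleophilic substitution",
    "AN": "bond making for nucleophilic substitution",
    "DN": "bond breaking for nucleophilic substitution",
    "ANDN": 'concerted "SN2" displacement for nucleophilic substitution',
}

def describe_ts(name):
    """Return the description of the _ts_ label in a name, or None."""
    match = re.search(r"_ts_([A-Z]+)", name)
    if match is None:
        return None  # not a transition state name
    return TS_CODES.get(match.group(1))

print(describe_ts("CH3O(-)...P(O)(O)(-O-CH2CH2-O-)(-)_ts_ANDN"))
```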

Sulfur Substitutions:

-- Sulfur substitutions should be placed directly below the original molecule.
Mark the original molecule with an asterisk (*) if the substitutions are NOT
complete. A two-space indentation means the input files are complete, but the
jobs have not finished yet.

Optimizations in solvent:

-- These should be named as above, with the suffix _sSOLV appended to the
file name, where SOLV is the solvent used (HOH for water).

e.g. P(O)(O)(-O-CH2CH2-O-)(OCH3)(2-)_sHOH.out
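
Reading the solvent back from a file name can be sketched in Python as
follows. The function name solvent_of is hypothetical, and the sketch
assumes the solvent suffix is the last tag in the name, optionally followed
by a file extension such as .out.

```python
import re

# _s, the solvent name, then an optional extension at the end of the name.
SOLVENT = re.compile(r"_s([A-Za-z0-9]+)(?:\.[A-Za-z0-9]+)?$")

def solvent_of(filename):
    """Return the solvent named in a file name, or None if there is none."""
    match = SOLVENT.search(filename)
    return match.group(1) if match else None

print(solvent_of("P(O)(O)(-O-CH2CH2-O-)(OCH3)(2-)_sHOH.out"))
```

A name without the _sSOLV suffix gives None.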

Optical Isomers:

-- These were added because of anionic dimethyl phosphate. A structure with the _optical tag has almost exactly the same energy as the related file, but it is oriented differently.

e.g. P(O)(O)(OCH3)(OCH3)(-)_min1.out