AtomSets
Predefined Atom Sets
Jmol recognizes a number of keywords (tokens) that serve various purposes: commands in the scripting language, colors, and so on. Among them are keywords for predefined atom sets:
By element names:
carbon, oxygen, hydrogen, sulphur, etc.
In PDB format, Jmol identifies the element from columns 77-78 (element symbol, right-justified). If these columns are blank, it interprets the "atom name" field instead (columns 13-14).
Note: as of June 2006 (Jmol 10.2) there is a bug whereby Jmol may read calcium as an alpha carbon based on its atom ID, although identification by element name works properly.
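The lookup order described above can be sketched outside Jmol in a few lines of Python. This is only an illustration of the column rule, not Jmol's actual code: the names `element_from_pdb_line` and `make_record` are hypothetical, and stripping a leading digit in the fallback is an assumption made for the example.

```python
def element_from_pdb_line(line: str) -> str:
    """Guess the element symbol of one ATOM/HETATM record (hypothetical helper)."""
    # Columns 77-78 (1-based) hold the right-justified element symbol.
    explicit = line[76:78].strip()
    if explicit:
        return explicit.capitalize()
    # Fallback: columns 13-14, the start of the atom name field.
    # Assumption: a leading digit (as in "1HA") is not part of the symbol.
    fallback = line[12:14].strip().lstrip("0123456789")
    return fallback.capitalize()


def make_record(atom_name: str, element: str = "") -> str:
    """Build a minimal ATOM line with a 4-character atom name in columns 13-16."""
    prefix = "ATOM".ljust(6) + "1".rjust(5) + " "   # columns 1-12
    return (prefix + atom_name).ljust(76) + element.rjust(2)


alpha_carbon = make_record(" CA ")   # "C" right-justified in columns 13-14
calcium = make_record("CA  ")        # two-letter symbol fills columns 13-14
print(element_from_pdb_line(alpha_carbon))   # C
print(element_from_pdb_line(calcium))        # Ca
```

The two records differ only in where the name sits inside columns 13-16, which is why alpha carbon and calcium can be told apart even when columns 77-78 are blank. The fallback is deliberately naive: a four-character hydrogen name that starts in column 13 would be misread.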
Parts in proteins:
Backbone
Inclusion in this set is determined by atom id*, as follows:
- Peptide bond: N, H (bound to N), CA (alpha carbon), HA (bound to CA), C (carbonyl carbon), O or O1 (bound to C)
- In glycine, the two equivalent hydrogens are both in the backbone set: either H1 and 1HA or 1HA and 2HA.
- Termini:
- second carbonyl oxygen on C-terminus: OXT
- terminal amino hydrogens: 1H, 2H, 3H
(*) Note: in PDB format, the atom id is called the atom name and must occupy these columns:
- 13-14 : Chemical symbol, right justified, except for hydrogen atoms
- 15 : Remoteness indicator (alphabetic); e.g., in amino acid residues, alpha = A, beta = B, gamma = G, delta = D, epsilon = E.
- 16 : Branch designator (numeric).
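As a rough illustration of this column layout, the following Python sketch splits a 4-character atom name field into the three parts listed above. The names `AtomNameParts` and `split_atom_name` are hypothetical, and the sketch assumes a non-hydrogen atom, since hydrogen names do not follow the right-justified symbol rule.

```python
from typing import NamedTuple


class AtomNameParts(NamedTuple):
    symbol: str       # columns 13-14, chemical symbol
    remoteness: str   # column 15, remoteness indicator
    branch: str       # column 16, branch designator


def split_atom_name(field: str) -> AtomNameParts:
    """Split the atom name field (columns 13-16) into its parts."""
    field = field.ljust(4)[:4]   # pad or trim to exactly four characters
    return AtomNameParts(field[0:2].strip(), field[2].strip(), field[3].strip())


print(split_atom_name(" CA "))   # alpha carbon: symbol C, remoteness A
print(split_atom_name(" CG1"))   # gamma carbon on branch 1
```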
Sidechain
Defined as (not backbone).
Alpha
A set defined by atom id CA.
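The three protein sets above amount to a membership test on the atom id. A minimal Python sketch, assuming the atom ids of a single amino acid residue are already available as strings; the names `PROTEIN_BACKBONE_IDS` and `split_protein_atoms` are hypothetical, and the id list is copied from the text above rather than from Jmol's source.

```python
# Atom ids listed above for the protein backbone set.
PROTEIN_BACKBONE_IDS = frozenset({
    "N", "H", "CA", "HA", "C", "O", "O1",   # peptide bond
    "H1", "1HA", "2HA",                     # glycine hydrogens, as listed above
    "OXT", "1H", "2H", "3H",                # termini
})


def split_protein_atoms(atom_ids):
    """Return (backbone, sidechain, alpha) lists for one residue's atom ids."""
    backbone = [a for a in atom_ids if a in PROTEIN_BACKBONE_IDS]
    # Sidechain is defined as "not backbone".
    sidechain = [a for a in atom_ids if a not in PROTEIN_BACKBONE_IDS]
    # Alpha is the set defined by atom id CA.
    alpha = [a for a in atom_ids if a == "CA"]
    return backbone, sidechain, alpha


serine = ["N", "CA", "C", "O", "CB", "OG"]
backbone, sidechain, alpha = split_protein_atoms(serine)
print(backbone, sidechain, alpha)
```

The sketch looks only at atom ids, so it is meaningful only for atoms already known to belong to a protein residue.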
Parts in nucleic acids:
Backbone
Inclusion in this set is determined by atom id*, as follows:
- Phosphate groups:
- phosphorus: P
- oxygens bound to phosphorus: O1P, O2P
- Atoms in pentose:
- carbon ring: C1', C2', C3', C4', C5'
- hydrogens attached to carbon ring: H1', 1H2', 2H2' (only DNA), H3', H4', 1H5' and 2H5'
- hydroxyls and ring oxygen: O2', O3', O4', O5', 2HO' (H on 2'-hydroxyl, only RNA). The ring oxygen is denoted O4', not O1'.
Note: PDB files label pentose atoms with asterisks instead of prime signs. Jmol copes with this easily: since the asterisk is a wildcard, "select C3*" will select pentose carbons labeled with either a prime or an asterisk.
- Termini:
- 5'-terminus oxygen (no phosphate): O5T
- 5'-terminus hydrogen (attached to O5T or O5'): H5T
- 3'-terminus hydrogen (on 3'-hydroxyl): H3T
- Atoms in bases:
- ring, both purines and pyrimidines: N1, C2, N3, C4, C5, C6
- ring, purines: N7, C8, N9
- ring, pyrimidines: O2
- substituents on ring:
- in cytosine: N4
- in guanine: N2
- in adenine: N6
- in thymine: C5M
- in guanine and hypoxanthine: O6
- in thymine and uracil: O4
- in thiouracil: S4
(*) Note: in PDB format, the atom id is called the atom name and must occupy these columns:
- 13-14 : Chemical symbol, right justified, except for hydrogen atoms
- 15 : Remoteness indicator (alphabetic).
- 16 : Branch designator (numeric).
Sidechain
Defined as (not backbone).
Bases
Synonym of sidechain.
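The note on asterisk-labeled pentose atoms can be illustrated with ordinary wildcard matching. The Python sketch below uses the standard library's `fnmatch` as a stand-in for Jmol's own wildcard handling, which may differ in detail; the function name `select_by_pattern` and the sample atom ids are assumptions made for the example.

```python
from fnmatch import fnmatchcase


def select_by_pattern(atom_ids, pattern):
    """Return the atom ids that match a shell-style wildcard pattern."""
    return [a for a in atom_ids if fnmatchcase(a, pattern)]


# The same carbon labeled with a prime and with an asterisk, plus other atoms.
atom_ids = ["C3'", "C3*", "C4'", "O3'", "N3"]
print(select_by_pattern(atom_ids, "C3*"))   # ["C3'", "C3*"]
```

Because the trailing asterisk matches any suffix, including none, the pattern would also pick up an atom named just C3 if one were present.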
By type of molecule:
protein, nucleic, dna, rna, water, solvent, ligand...
By type of residue:
Inclusion in these sets is determined by residue id (only insofar as it is written in the appropriate field of the molecular coordinate file, usually in PDB format).
Residue IDs:
- Nucleotides: A, G, C, T, U
- Amino acids: the 3-letter standard abbreviation
Residue sets:
- Nucleotides: purine, pyrimidine, at, cg
- Amino acids:
- acyclic, cyclic, aliphatic, aromatic
- large, medium, small
- polar, nonpolar, hydrophobic, neutral, charged, acidic, negative, basic, positive, ...
- buried, surface
- hetero, ions, ligand, water, solvent
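To show how a residue set follows from residue ids, here is a small Python sketch for the four nucleotide sets. The memberships are the usual biochemical ones (purines A and G; pyrimidines C, T and U) and are assumed, not confirmed, to match Jmol's definitions; `NUCLEOTIDE_SETS` and `residues_in_set` are hypothetical names.

```python
# Assumed memberships for the nucleotide residue sets named above.
NUCLEOTIDE_SETS = {
    "purine": {"A", "G"},
    "pyrimidine": {"C", "T", "U"},
    "at": {"A", "T"},
    "cg": {"C", "G"},
}


def residues_in_set(sequence, set_name):
    """Return 1-based positions of residues whose id belongs to the named set."""
    members = NUCLEOTIDE_SETS[set_name]
    return [i for i, res in enumerate(sequence, start=1) if res in members]


sequence = "GATTACA"
print(residues_in_set(sequence, "purine"))   # [1, 2, 5, 7]
print(residues_in_set(sequence, "cg"))       # [1, 6]
```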
By structure of the polymer:
- amino, protein, nucleic
- helix, sheet, turn
- bonded