Difference between revisions of "AtomSets"

From Jmol

Revision as of 08:41, 6 June 2006

Predefined Atom Sets

Jmol recognizes keywords (tokens) for several purposes: commands in the scripting language, colors, etc. Among them are keywords for predefined atom sets:

By element names:

carbon, oxygen, hydrogen, sulphur, etc.

In PDB format, Jmol identifies the element from columns 77-78 (element symbol, right-justified). If this is absent, it interprets the "atom name" field instead (columns 13-14).

Note: there is currently (June 2006, Jmol 10.2) a bug by which Jmol may read calcium as an alpha carbon based on its atom ID (both are named CA), although identification by element name works properly.
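The lookup order described above can be sketched in Python. This is only an illustration of the column rules, not Jmol's actual code; the function name `element_from_pdb_line` is hypothetical, and hydrogen atom names (which do not follow the columns 13-14 rule) are not handled.

```python
def element_from_pdb_line(line: str) -> str:
    """Return the element symbol for one ATOM/HETATM record (sketch)."""
    # PDB columns are 1-based, so columns 77-78 are line[76:78].
    symbol = line[76:78].strip()
    if not symbol:
        # Fallback: columns 13-14 of the atom name field (right-justified symbol).
        symbol = line[12:14].strip()
    return symbol.capitalize()


# Two records without an element field: only the column position of "CA"
# tells an alpha carbon (" CA ") from a calcium ion ("CA  ").
alpha_carbon = "ATOM".ljust(12) + " CA "
calcium = "HETATM".ljust(12) + "CA  "
print(element_from_pdb_line(alpha_carbon))  # C
print(element_from_pdb_line(calcium))       # Ca
```

When columns 77-78 are filled in, the fallback is never reached, so the result no longer depends on the atom name.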

Parts in proteins:

Backbone

Inclusion in this set is determined by atom id*, as follows:

  • Peptide bond: N, H (bound to N), CA (alpha carbon), HA (bound to CA), C (carbonyl carbon), O or O1 (bound to C)
  • In glycine, the two equivalent hydrogens are both in the backbone set: either H1 and 1HA or 1HA and 2HA.
  • Termini:
    • OXT : second carbonyl oxygen on C-terminus
    • 1H, 2H, 3H : terminal amino hydrogens

(*) Note: in PDB format, the atom id is called the atom name and must occupy these positions/columns:

  • 13-14 : Chemical symbol, right justified, except for hydrogen atoms
  • 15 : Remoteness indicator (alphabetic); e.g., in amino acid residues, alpha = A, beta = B, gamma = G, delta = D, epsilon = E.
  • 16 : Branch designator (numeric).
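As a rough illustration of this column layout, the atom name field can be split into its three parts. This is a sketch under the assumption that the record is a fixed-width PDB line and the atom is not a hydrogen; `AtomName` and `split_atom_name` are hypothetical names.

```python
from typing import NamedTuple


class AtomName(NamedTuple):
    symbol: str      # columns 13-14, chemical symbol (right-justified)
    remoteness: str  # column 15, e.g. A, B, G, D, E
    branch: str      # column 16, numeric branch designator


def split_atom_name(line: str) -> AtomName:
    """Split columns 13-16 of a PDB record into symbol, remoteness, branch."""
    # Columns are 1-based; pad so that short lines do not raise IndexError.
    field = line[12:16].ljust(4)
    return AtomName(field[0:2].strip(), field[2].strip(), field[3].strip())


# Gamma-1 carbon (e.g. in valine) and an alpha carbon.
print(split_atom_name("ATOM".ljust(12) + " CG1"))
print(split_atom_name("ATOM".ljust(12) + " CA "))
```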

Sidechain

Defined as (not backbone).
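The backbone list and the sidechain definition amount to a set-membership test and its complement. A minimal Python sketch follows, assuming atom ids are compared as plain strings; the names `is_backbone` and `is_sidechain` are hypothetical, and the glycine ids are copied as listed above.

```python
# Atom ids from the peptide bond and termini entries of the backbone list.
BACKBONE_IDS = frozenset({
    "N", "H", "CA", "HA", "C", "O", "O1",  # peptide bond
    "OXT", "1H", "2H", "3H",               # termini
})
# Extra hydrogens counted as backbone only in glycine.
GLYCINE_EXTRA_IDS = frozenset({"H1", "1HA", "2HA"})


def is_backbone(atom_id: str, residue: str = "") -> bool:
    """True if the atom id belongs to the protein backbone set."""
    atom_id = atom_id.strip().upper()
    if atom_id in BACKBONE_IDS:
        return True
    return residue.strip().upper() == "GLY" and atom_id in GLYCINE_EXTRA_IDS


def is_sidechain(atom_id: str, residue: str = "") -> bool:
    """Sidechain is defined as (not backbone)."""
    return not is_backbone(atom_id, residue)


print(is_backbone("CA"))          # True
print(is_sidechain("CB", "ALA"))  # True
print(is_backbone("2HA", "GLY"))  # True
```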

Parts in nucleic acids:

Backbone

Inclusion in this set is determined by atom id*, as follows:

  • Phosphate groups:
    • phosphorus: P
    • oxygens bound to phosphorus: O1P, O2P
  • Atoms in pentose:
    • carbon ring: C1', C2', C3', C4', C5'
    • hydrogens attached to carbon ring: H1', 1H2', 2H2' (only DNA), H3', H4', 1H5' and 2H5'
    • oxygens: O2', O3', O4', O5'; hydroxyl hydrogen: 2HO' (H on 2'-hydroxyl, only RNA). The ring oxygen is denoted O4', not O1'.

Note: PDB files label pentose atoms with asterisks instead of prime signs. This causes little trouble in Jmol: since the asterisk is a wildcard, "select C3*" will get the pentose carbon whether it is labeled with a prime or with an asterisk.
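Why a trailing wildcard covers both spellings can be shown with Python's `fnmatch` module as a stand-in. This assumes Jmol's asterisk behaves like a shell-style "match anything" wildcard; it is not Jmol's own matching code, and `matches_atom_id` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches_atom_id(atom_id: str, pattern: str) -> bool:
    """Shell-style, case-sensitive match of an atom id against a pattern."""
    return fnmatchcase(atom_id, pattern)


# The wildcard matches any trailing characters, including a literal
# asterisk, so both labelling conventions are selected.
print(matches_atom_id("C3'", "C3*"))  # True
print(matches_atom_id("C3*", "C3*"))  # True
print(matches_atom_id("C4'", "C3*"))  # False
```

The same reasoning means the pattern would also match any other atom id that starts with C3.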

  • Termini:
    • 5'-terminus oxygen (no phosphate): O5T
    • 5'-terminus hydrogen (attached to O5T or O5'): H5T
    • 3'-terminus hydrogen (on 3'-hydroxyl): H3T
  • Atoms in bases:
    • ring, both purines and pyrimidines: N1, C2, N3, C4, C5, C6
    • ring, purines: N7, C8, N9
    • ring, pyrimidines: O2
    • substituents on ring:
      • in cytosine: N4
      • in guanine: N2
      • in adenine: N6
      • in thymine: C5M
      • in guanine and hypoxanthine: O6
      • in thymine and uracil: O4
      • in thiouracil: S4

(*) Note: in PDB format, the atom id is called the atom name and must occupy these positions/columns:

  • 13-14 : Chemical symbol, right justified, except for hydrogen atoms
  • 15 : Remoteness indicator (alphabetic).
  • 16 : Branch designator (numeric).

Sidechain

Defined as (not backbone).

Bases

Synonym of sidechain.


By type of molecule:

protein, nucleic, dna, rna, water, solvent, ligand...

By type of residue:

Inclusion in these sets is determined by residue id (only insofar as it is written in the appropriate field of the molecular coordinate file, usually in PDB format).

Residue IDs:

  • Nucleotides: A, G, C, T, U
  • Amino acids: the 3-letter standard abbreviation

Residue sets:

  • Nucleotides: purine, pyrimidine
  • Amino acids: polar, nonpolar, charged, acidic, negative, basic, positive, aromatic, ...
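Membership by residue id can be pictured as a lookup from set name to residue ids. The Python sketch below is illustrative only: the page does not list the members of each set, so the groupings shown (purines A and G, pyrimidines C, T and U, acidic ASP and GLU, basic ARG, HIS and LYS) are the conventional biochemical ones and are an assumption here, as are the names `RESIDUE_SETS` and `sets_for_residue`.

```python
# Assumed memberships (conventional groupings, not taken from this page).
RESIDUE_SETS = {
    "purine": frozenset({"A", "G"}),
    "pyrimidine": frozenset({"C", "T", "U"}),
    "acidic": frozenset({"ASP", "GLU"}),
    "basic": frozenset({"ARG", "HIS", "LYS"}),
}


def sets_for_residue(residue_id: str) -> list[str]:
    """Return the names of the sets that contain this residue id, sorted."""
    rid = residue_id.strip().upper()
    return sorted(name for name, members in RESIDUE_SETS.items() if rid in members)


print(sets_for_residue("G"))    # ['purine']
print(sets_for_residue("ASP"))  # ['acidic']
print(sets_for_residue("GLY"))  # []
```

Because the lookup uses only the residue id, a residue whose id is missing or nonstandard in the coordinate file falls into none of the sets.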