Basetype Notation

The basetype of a monosaccharide describes the residue size, the stereochemistry (including the absolute configuration and the anomeric configuration), and the ring closure. In addition, it may contain a number of core modifications.


Absolute Configuration

For the use of the configurational symbols and prefixes, see the IUPAC definition 2-Carb-4.


Anomeric

For the definition of the anomeric configuration, see the IUPAC definition 2-Carb-6.


Core Modifications

The monosaccharide basetype can feature a number of core modifications. Several of them result in achiral positions and thus influence stereochemistry.

The following list summarizes the core modifications that are used in MonosaccharideDB. Each entry is given as "Name (valence): description", with any comments as bullets below it.

DEOXY (valence 1): Deoxygenation of a position: the OH group is removed and replaced by a hydrogen atom.
  • results in achiral position
KETO (valence 1): A carbonyl group in the open chain version of a monosaccharide. This modification is omitted if it is only present at position 1 (standard aldose).
ALDI (valence 1): Alditol: reduction of the aldehyde group to CH2OH.
ACID (valence 1): Carboxyl (COOH) group.
EN (valence 2): Double bond in the basetype backbone. This modification implies that hydroxyl groups are preserved, unless explicitly stated otherwise with a deoxy modification.
  • results in achiral positions
ENX (valence 2): Double bond in the basetype backbone with unknown deoxygenation pattern.
  • results in achiral positions
YN (valence 2): Triple bond in the basetype backbone.
  • results in achiral positions
ANHYDRO (valence 2): Intramolecular anhydride.
SP (valence 1): Triple bond to a substituent.
  • only possible at terminal positions
  • always in combination with substituent
SP2 (valence 1): Double bond to a substituent.
  • results in achiral position
  • always in combination with substituent
GEMINAL (valence 1): Loss of stereochemistry due to identical substituents with DEOXY and H_LOSE linkage types at a single position.
  • results in achiral position
  • always in combination with substituents
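
For readers who want to work with these core modifications programmatically, they can be captured in a small lookup structure. The following Python sketch is only an illustration: the class `CoreModification`, its field names, and the helper `achiral_modifications` are hypothetical and not part of MonosaccharideDB. The values are copied from the entries above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreModification:
    """One core modification entry (hypothetical container type)."""
    name: str
    valence: int
    achiral: bool = False            # comment "results in achiral position(s)"
    needs_substituent: bool = False  # comment "always in combination with substituent(s)"
    terminal_only: bool = False      # comment "only possible at terminal positions"


# Values transcribed from the core modification entries in the text.
CORE_MODIFICATIONS = {
    m.name: m
    for m in (
        CoreModification("DEOXY", 1, achiral=True),
        CoreModification("KETO", 1),
        CoreModification("ALDI", 1),
        CoreModification("ACID", 1),
        CoreModification("EN", 2, achiral=True),
        CoreModification("ENX", 2, achiral=True),
        CoreModification("YN", 2, achiral=True),
        CoreModification("ANHYDRO", 2),
        CoreModification("SP", 1, needs_substituent=True, terminal_only=True),
        CoreModification("SP2", 1, achiral=True, needs_substituent=True),
        CoreModification("GEMINAL", 1, achiral=True, needs_substituent=True),
    )
}


def achiral_modifications():
    """Names of the modifications marked as producing achiral positions."""
    return [name for name, mod in CORE_MODIFICATIONS.items() if mod.achiral]


print(achiral_modifications())
# ['DEOXY', 'EN', 'ENX', 'YN', 'SP2', 'GEMINAL']
```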

Stereochemistry

Many monosaccharides only differ in the stereochemistry of the basetype backbone carbons. In most notations, this stereochemistry is denoted using the IUPAC stem type ("parent") names.

In addition to this indirect description of the stereochemistry based on parent names, MonosaccharideDB features a Stereocode field, which contains a direct description of the stereochemistry. The stereocode is a string that contains one character for each carbon of the basetype backbone. "1" indicates that the corresponding carbon is in L-configuration (OH group pointing left in the Fischer projection), "2" marks a D-configuration (OH group pointing right in the Fischer projection), and "0" is used to describe achiral positions. D-Glucose in open chain form, for example, has the stereocode "021220":
[Figure: D-Glucose stereocode]

When a ring is formed from this, the anomeric center (position 1 in this example) becomes a chiral atom and thus the stereocode of that position is adjusted depending on the anomer. For example, the stereocode of β-D-Glcp is "121220", that of α-D-Glcp is "221220".
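
The encoding and the ring-closure adjustment can be sketched in a few lines of Python. The function names `encode_fischer` and `close_ring` are hypothetical and not part of MonosaccharideDB. The anomer digits are inferred from the two D-glucose examples above and are assumed here to apply to D-configured residues only.

```python
# Hypothetical helpers for illustration; not part of MonosaccharideDB.

# Digit per backbone carbon, as defined in the text: OH on the left in the
# Fischer projection -> "1" (L), on the right -> "2" (D), achiral -> "0".
SIDE_TO_DIGIT = {"left": "1", "right": "2", None: "0"}

# Anomer digits inferred from the D-glucose examples in the text
# (alpha-D-Glcp starts with "2", beta-D-Glcp with "1").
# Assumption: this mapping is only used for D-configured residues.
D_SERIES_ANOMER_DIGIT = {"alpha": "2", "beta": "1"}


def encode_fischer(oh_sides):
    """Build a stereocode from the OH side of each backbone carbon (C1 first)."""
    return "".join(SIDE_TO_DIGIT[side] for side in oh_sides)


def close_ring(open_chain, anomer, anomeric_position=1):
    """Replace the achiral anomeric position with the digit for the anomer."""
    index = anomeric_position - 1
    if open_chain[index] != "0":
        raise ValueError("anomeric carbon must be '0' in the open-chain code")
    digit = D_SERIES_ANOMER_DIGIT[anomer]
    return open_chain[:index] + digit + open_chain[index + 1:]


d_glucose = encode_fischer([None, "right", "left", "right", "right", None])
print(d_glucose)                       # 021220
print(close_ring(d_glucose, "beta"))   # 121220
print(close_ring(d_glucose, "alpha"))  # 221220
```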

If MonosaccharideDB is queried with a residue name that does not specify the absolute configuration (e.g. "a-Fucp" in CarbBank notation), the stereocode is built as if the residue were in D-configuration, and "1" and "2" are then replaced by "3" and "4", respectively. Thus, the stereocode "443340" is assigned to the CarbBank residue "a-Fucp".
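
The digit replacement for residues without a stated absolute configuration is a plain character substitution. A minimal Python sketch follows; the function name `mask_absolute_configuration` is hypothetical. The input "221120" is the D-based code obtained by reversing the replacement on the "a-Fucp" example above.

```python
# Hypothetical helper; not part of MonosaccharideDB.
# "1" -> "3" and "2" -> "4", as described in the text; "0" is left unchanged.
_UNKNOWN_CONFIGURATION = str.maketrans("12", "34")


def mask_absolute_configuration(d_based_stereocode):
    """Mark a D-based stereocode as having an unknown absolute configuration."""
    return d_based_stereocode.translate(_UNKNOWN_CONFIGURATION)


print(mask_absolute_configuration("221120"))  # 443340
```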


Extended Stereocode

The stereocode described above only distinguishes between D-configuration, L-configuration, and achiral positions. The latter, however, can be caused by various core modifications or simply by a terminal position. The "Extended Stereocode" takes this into account. While the symbols for chiral positions remain the same as in the standard stereocode, various symbols are used instead of "0" for achiral positions, depending on the cause of the achirality:

Symbol Description
h "head or tail group", CH2OH group at a terminal position
d DEOXY core modification at non-terminal position
m DEOXY core modification at terminal position ("methyl" group)
a ACID core modification
o aldehyde group
k KETO core modification at non-terminal position
e EN + deoxy core modifications
n EN core modification without DEOXY core modification
E EN core modification with unknown deoxygenation status
y YN core modification at non-terminal position
s SP2 core modification
t SP core modification (always at terminal position)
1 "L-Configuration" carbon atom
2 "D-Configuration" carbon atom
x unknown configuration (D or L) carbon atom
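
Because every achiral symbol of the extended stereocode corresponds to "0" in the standard stereocode, an extended code can be collapsed to a standard one by substitution. The Python sketch below illustrates this under stated assumptions: the function name `to_standard_stereocode` is hypothetical, the symbol "x" is left unchanged because the standard stereocode described above defines no counterpart for it, and the example codes "o2122h" and "o2112m" are built by hand from the symbol table rather than taken from the database.

```python
# Hypothetical helper; not part of MonosaccharideDB.

# Extended-stereocode symbols that stand for achiral positions (from the table).
ACHIRAL_SYMBOLS = "hdmaokenEyst"
CHIRAL_SYMBOLS = "12x"

_TO_STANDARD = str.maketrans(ACHIRAL_SYMBOLS, "0" * len(ACHIRAL_SYMBOLS))


def to_standard_stereocode(extended):
    """Collapse every achiral symbol to '0'; '1', '2' and 'x' are kept as-is."""
    unexpected = set(extended) - set(ACHIRAL_SYMBOLS + CHIRAL_SYMBOLS)
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return extended.translate(_TO_STANDARD)


# Assumed extended code for open-chain D-glucose: aldehyde at C1, CH2OH at C6.
print(to_standard_stereocode("o2122h"))  # 021220
```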