June 6, 2022
by Mikhail Elyashberg, Leading Researcher, ACD/Labs
Comatulin A
Several types of polyketide-derived pigments can be found in crinoids. These include anthraquinones, naphthopyrones, bisanthrones and phenanthroperylenequinones. These pigments give crinoids their colorful appearance. They also appear to serve as a chemical defense against fish, and some exhibit in vitro activity in biomedical assays.
Lum and co-workers studied crinoid samples collected at different locations on the Great Barrier Reef [1]. Analysis of Comatula rotalaria extracts by UHPLC-MS in conjunction with database searching led to the isolation and structure elucidation of a series of new taurine-conjugated anthraquinones, including comatulin A (1).
1
The published spectroscopic data used for the structure elucidation of compound 1 (Table 1) were used to challenge ACD/Structure Elucidator (ACD/SE).
Comatulin A (1) was isolated as a red amorphous powder and was assigned the molecular formula C20H19NO10S from the HR-ESI-MS ion at m/z 464.0643 [M −H]−, which corresponds to 12 degrees of unsaturation. The presence of an SO3H fragment was determined from the mass spectrum. The molecule is not large but is highly hydrogen deficient: the ratio of skeletal atoms to hydrogens is 1.7. A “silent fragment”, whose carbon atoms bear no hydrogens, is marked in red in structure 1.
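The figures quoted above can be re-derived from the molecular formula alone. The short Python sketch below is illustrative only and is not part of ACD/SE: the function names are our own, the monoisotopic masses are standard values rounded to six decimals, and sulfur is assumed to be divalent when counting rings and double bonds.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded to 6 decimals.
MONO_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "S": 31.972071}
PROTON_MASS = 1.007276  # mass of the H+ lost on forming [M-H]-

formula = {"C": 20, "H": 19, "N": 1, "O": 10, "S": 1}  # comatulin A


def degree_of_unsaturation(f):
    """Rings plus double bonds; divalent O and S drop out of the count."""
    return (2 * f.get("C", 0) + 2 + f.get("N", 0) - f.get("H", 0)) // 2


def mz_deprotonated(f):
    """Calculated m/z of the singly charged [M-H]- ion."""
    neutral = sum(MONO_MASS[element] * count for element, count in f.items())
    return neutral - PROTON_MASS


def skeletal_to_hydrogen_ratio(f):
    """Number of non-hydrogen atoms divided by the number of hydrogens."""
    skeletal = sum(count for element, count in f.items() if element != "H")
    return skeletal / f["H"]


print(degree_of_unsaturation(formula))
print(round(mz_deprotonated(formula), 4))
print(round(skeletal_to_hydrogen_ratio(formula), 1))
```

This gives 12 for the degree of unsaturation and 1.7 for the ratio of skeletal atoms to hydrogens (32 skeletal atoms, 19 hydrogens), and a calculated m/z that agrees with the measured 464.0643 to within a few mDa.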
Table 1. NMR spectroscopic data of compound 1. Weak HMBC correlations are marked with the letter w.
C/X Label | δC | δC calc (HOSE) | XHn | δH | COSY | H to C HMBC |
C 1 | 127.7 | 123.08 | C | | | |
C 2 | 161.2 | 166.19 | C | | | |
C 3 | 112.1 | 114.54 | C | | | |
C 4 | 162.7 | 167.23 | C | | | |
C 5 | 106.9 | 108.03 | C | | | |
C 6 | 164 | 164.41 | C | | | |
C 7 | 107.1 | 107.26 | CH | 6.89 | 7.15 | C 10, C 15, C 6, C 8, C 14w |
C 8 | 165.8 | 165.77 | C | | | |
C 9 | 56.4 | 56.1 | CH3 | 3.92 | | C 8 |
C 10 | 107.8 | 107.18 | CH | 7.15 | 6.89 | C 7, C 15, C 11, C 8, C 12, C 14w |
C 11 | 134.3 | 136.06 | C | | | |
C 12 | 181.8 | 180.88 | C | | | |
C 13 | 130.6 | 129.07 | C | | | |
C 14 | 188.2 | 189.12 | C | | | |
C 15 | 109.6 | 110.52 | C | | | |
C 16 | 201.6 | 202.89 | C | | | |
C 17 | 31 | 30.47 | CH3 | 2.42 | | C 1, C 16 |
C 18 | 39.2 | 46.49 | CH2 | 4.25 | 8.71 | C 19, C 3, C 2, C 4 |
C 19 | 44.3 | 43.01 | CH2 | 3.29 | 2.85, 8.71 | C 18, C 20 |
C 20 | 46.6 | 46.86 | CH2 | 2.85 | 3.29 | C 19 |
N 1 | | | NH | 8.71 | 3.29, 4.25 | |
O 1 | | | OH | 13.22 | | C 5, C 3, C 4 |
O 2 | | | OH | 12.1 | | C 7, C 15, C 6 |
The spectroscopic data were entered into ACD/SE. The two HMBC correlations reported as weak in the article [1] were allowed a length of 2-4 bonds. The Molecular Connectivity Diagram (MCD) was created automatically by the program; a version with slight manual edits is presented in Figure 1.
Figure 1. Molecular connectivity diagram. The hybridization of each carbon atom is color-coded: sp2 – violet, sp3 – blue, not sp (sp2 or sp3) – light blue. The labels “ob” and “fb” are set by the program on carbon atoms for which a neighboring heteroatom is either obligatory (ob) or forbidden (fb).
MCD overview. The MCD contains six light blue carbon atoms with ambiguous hybridization. The SO3H fragment was also drawn manually. The carbon atom at 130.60 ppm shows no correlations in the HMBC spectrum. The seven carbons with chemical shifts in the interval 161.20 – 201.60 ppm were assumed to be connected to at least one oxygen atom (label ob). The five free oxygen atoms are incorporated into structures during the structure generation process. With these initial data, and given the hydrogen deficiency, structure generation can be expected to take a long time.
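The bookkeeping in this overview can be checked against Table 1. The sketch below is not ACD/SE code; it applies the same shift window to the experimental 13C values and counts the oxygens left over, assuming that the five free oxygens are those remaining once the manually drawn SO3H fragment and the two OH groups listed in Table 1 are accounted for. All variable names are our own.

```python
# Experimental 13C shifts (ppm) from Table 1, keyed by carbon number.
shifts = {
    1: 127.7, 2: 161.2, 3: 112.1, 4: 162.7, 5: 106.9,
    6: 164.0, 7: 107.1, 8: 165.8, 9: 56.4, 10: 107.8,
    11: 134.3, 12: 181.8, 13: 130.6, 14: 188.2, 15: 109.6,
    16: 201.6, 17: 31.0, 18: 39.2, 19: 44.3, 20: 46.6,
}

# Shift window in which a carbon is assumed to carry at least one oxygen ("ob").
OB_WINDOW = (161.2, 201.6)
ob_carbons = [c for c, delta in shifts.items() if OB_WINDOW[0] <= delta <= OB_WINDOW[1]]

# Oxygen bookkeeping (assumption: SO3H and the two OH groups are already placed).
total_oxygens = 10      # from C20H19NO10S
oxygens_in_so3h = 3
oxygens_in_oh = 2       # O 1 and O 2 in Table 1
free_oxygens = total_oxygens - oxygens_in_so3h - oxygens_in_oh

print("ob carbons:", ob_carbons)
print("free oxygens:", free_oxygens)
```

The window selects C 2, C 4, C 6, C 8, C 12, C 14 and C 16, which matches the seven carbons mentioned above, and leaves five oxygens for the generator to place.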
Checking the MCD for consistency showed that the combined HMBC and COSY data contain contradictions and that the minimum number of nonstandard correlations is 1. Fuzzy Structure Generation (FSG) was therefore initiated with the options set automatically. To reduce the number of generated structures, structures containing 4-membered rings, which are rare in natural product chemistry, were forbidden. Structure generation was accompanied by 13C chemical shift prediction using the incremental approach and neural networks, and was combined with spectral filtering. Results: k = 646,818 → (spectral filtering) → 13 → (duplicate removal) → 13, tg = 1 h 34 min; one COSY connectivity was lengthened during FSG.
13C chemical shift prediction was then performed using the HOSE code-based algorithm, and the output file was ranked in ascending order of the average deviation dA(13C) between the experimental and calculated chemical shifts. The three top ranked structures of the output file are shown in Figure 2.
Figure 2. The three top ranked structures of the output file.
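To make the ranking criterion concrete, the sketch below computes an average deviation from the two δC columns of Table 1. It assumes that dA(13C) is the mean absolute difference between experimental and calculated shifts and that the HOSE column of Table 1 refers to structure 1; the helper names are hypothetical and this is not how ACD/SE is implemented internally.

```python
# (experimental, HOSE-calculated) 13C shifts in ppm, copied from Table 1.
TABLE_1 = {
    "C 1": (127.7, 123.08), "C 2": (161.2, 166.19), "C 3": (112.1, 114.54),
    "C 4": (162.7, 167.23), "C 5": (106.9, 108.03), "C 6": (164.0, 164.41),
    "C 7": (107.1, 107.26), "C 8": (165.8, 165.77), "C 9": (56.4, 56.1),
    "C 10": (107.8, 107.18), "C 11": (134.3, 136.06), "C 12": (181.8, 180.88),
    "C 13": (130.6, 129.07), "C 14": (188.2, 189.12), "C 15": (109.6, 110.52),
    "C 16": (201.6, 202.89), "C 17": (31.0, 30.47), "C 18": (39.2, 46.49),
    "C 19": (44.3, 43.01), "C 20": (46.6, 46.86),
}


def average_deviation(pairs):
    """Mean absolute difference between experimental and calculated shifts."""
    return sum(abs(exp - calc) for exp, calc in pairs) / len(pairs)


def rank_by_deviation(candidates):
    """Sort candidate names by ascending dA; `candidates` maps name -> list of pairs."""
    return sorted(candidates, key=lambda name: average_deviation(candidates[name]))


d_a = average_deviation(list(TABLE_1.values()))
worst_label, _ = max(TABLE_1.items(), key=lambda kv: abs(kv[1][0] - kv[1][1]))
print(f"dA(13C) = {d_a:.2f} ppm; largest deviation at {worst_label}")
```

With the Table 1 values this gives a dA of roughly 1.8 ppm, with the largest single deviation at C 18 (39.2 vs 46.49 ppm). Ranking a full output file would apply `rank_by_deviation` to the shift pairs of every candidate structure.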
The best structure is identical to the structure of comatulin A proposed by the authors [1]. However, solving the problem took more than 1.5 h. It was of interest to see how much this time could be reduced by adding some evident structural information. To this end, three carbonyl bonds were drawn in the MCD, as shown in Figure 3.
Figure 3. MCD in which three evident carbonyl bonds were manually added.
FSG initiated from the MCD shown in Figure 3 was completed with the following results: k = 8372 → (spectral filtering) → 13 → (duplicate removal) → 12, tg = 18 s. The three best structures are shown in Figure 4.
Figure 4. The three best structures obtained with the MCD shown in Figure 3.
Comparison of Figures 2 and 4 shows that the two solutions are very similar, while the second was obtained in about 1/300th of the time. This example demonstrates that adding evident additional information, new “axioms”, can dramatically accelerate structure generation. However, the additional information is useful only if it is reliable: a wrong “axiom” inevitably leads to a wrong solution. In this particular case, assuming that all carbons with chemical shifts above 180 ppm are carbonyls is very safe.
The structure of comatulin A together with the assigned 13C chemical shifts is shown below. The red arrow indicates the COSY correlation that was lengthened by FSG.
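The size of the gain can be worked out from the two result lines reported above. The following sketch only does arithmetic on those published numbers and applies the 180 ppm rule to the Table 1 shifts; the dictionary layout and names are our own.

```python
from datetime import timedelta

# Results reported for the two FSG runs.
runs = {
    "Figure 1 MCD": {"generated": 646_818, "time": timedelta(hours=1, minutes=34)},
    "Figure 3 MCD": {"generated": 8_372, "time": timedelta(seconds=18)},
}

slow, fast = runs["Figure 1 MCD"], runs["Figure 3 MCD"]
speedup = slow["time"] / fast["time"]            # timedelta / timedelta gives a float
reduction = slow["generated"] / fast["generated"]

# Experimental 13C shifts (ppm) from Table 1, keyed by carbon number.
shifts = {
    1: 127.7, 2: 161.2, 3: 112.1, 4: 162.7, 5: 106.9,
    6: 164.0, 7: 107.1, 8: 165.8, 9: 56.4, 10: 107.8,
    11: 134.3, 12: 181.8, 13: 130.6, 14: 188.2, 15: 109.6,
    16: 201.6, 17: 31.0, 18: 39.2, 19: 44.3, 20: 46.6,
}
# The "axiom" used in Figure 3: carbons above 180 ppm are treated as carbonyls.
carbonyl_candidates = [c for c, delta in shifts.items() if delta > 180.0]

print(f"speed-up: about {speedup:.0f}x")
print(f"generated structures reduced about {reduction:.0f}-fold")
print("carbonyl candidates:", carbonyl_candidates)
```

The run time drops by a factor of about 313 and the number of generated structures by a factor of about 77, and the 180 ppm threshold picks out exactly three carbons (C 12, C 14 and C 16), consistent with the three carbonyl bonds drawn in Figure 3.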
References
- K. Y. Lum, A. C. Taki, R. B. Gasser, I. Tietjen, M. G. Ekins, J. M. White, R. S. Addison, S. Hayes, J. St John, R. A. Davis. (2020). Comatulins A–E, Taurine-Conjugated Anthraquinones from the Australian Crinoid Comatula rotalaria. J. Nat. Prod., 83 (6), 1971-1979, DOI: 10.1021/acs.jnatprod.0c00267