November 5, 2024
by Mikhail Elyashberg, Leading Researcher, ACD/Labs
Two Polyenol Natural Products
In 2023, an article entitled “Structural reassignment of two polyenol natural products” [1] was published in Eur. J. Org. Chem. The article attracted great interest from readers: over the course of a year it received more than 2,000 views. The two natural product structures revised in the article were similar, and the reasons for the incorrect assignments turned out to be the same in both cases. The Structure Elucidator Suite expert system was used to verify and subsequently revise these structures. Here we describe how both structures, initially assigned to the polyenol class, were reassigned.
In 2016 Ma et al. [2] reported the isolation of (5S,6R,7S,8R)-5-amino-(2Z,4Z)-1,2,3-trihydroxybuta-2,4-dienyloxy-pentane-6,7,8,9-tetraol (1) from the southeast Asian spice Murraya koenigii (L.).
More recently Siebatcheu et al. [3] isolated 1, together with a related isomer, (Z)-5-amino-5-(1,1,2-trihydroxybuta-1,3-dienyloxy)pentane-6,7,8,9-tetraol (2), from an endophytic fungus, Trichoderma erinaceum.
These purported natural products were elucidated based on extensive spectroscopic methods (e.g., NMR, MS, UV, IR, and CD), and were isolated as stable solid substances (i.e., colorless and white powders, respectively). From a structural perspective, however, both 1 and 2 contain multiple enol moieties that would, under normal chemical circumstances, be expected to exist in the respective keto (carbonyl) tautomeric forms (Scheme 1).
For example, 1 would be better represented as either 3 or 4 (and 2 as 7), and in both cases the hemiaminal ether moiety would also be considered quite sensitive to even mild acid (Scheme 1). The latter would undergo hydrolysis to give degradation products 5 and 6 (Scheme 1). Furthermore, stable enols are rare and exist only when stabilized in some form, while stable ene-diols are even rarer, with ascorbic acid (vitamin C) being one of very few examples. Therefore, based on these chemical principles, it was suspected that a structure misassignment had occurred in both cases (i.e., 1 and 2).
Revision of Structure 1
The proposed structure 1 (C9H17NO8) was entered into Structure Elucidator Suite and 13C chemical shift prediction was performed using the empirical methods implemented in the expert system (see Figure 1).
Given that acceptable deviations for correct structures are usually in the range of 1-2.5 ppm, Figure 1 unambiguously shows that structure 1 is incorrect. Therefore, the 13C, 1H, HSQC, and HMBC data reported by Ma et al. [2] were entered into the program (Table 1), and a Molecular Connectivity Diagram (MCD) was generated (Figure 2).
Table 1. NMR spectroscopic data of compound 1 [1]
Label | δC (ppm) | δC calc (HOSE) | XHn | δH (ppm) | M (H) | H to C HMBC |
C 1 | 153.5 | 127.05 | CH | 8.19 | s | C 3, C 2 |
C 2 | 157.6 | 133.63 | C | | | |
C 3 | 150 | 141.82 | C | | | |
C 4 | 142 | 132.49 | CH | 8.32 | s | C 3 |
C 5 | 91.3 | 81.46 | CH | 5.97 | d | C 6, C 4 |
C 6 | 75.5 | 72.92 | CH | 4.75 | t | C 8, C 5 |
C 7 | 72.7 | 75.64 | CH | 4.34 | u | C 9, C 8, C 5 |
C 8 | 88.2 | 71.75 | CH | 4.18 | u | C 7 |
C 9 | 63.5 | 64.01 | CH2 | 3.92 | u | C 7, C 8 |
C 9 | 63.5 | 64.01 | CH2 | 3.77 | u | |
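The size of the mismatch in Table 1 can be checked by hand. The following is a minimal sketch, assuming that the average deviation is the mean absolute difference between experimental and HOSE-predicted 13C shifts; the function name `deviation_stats` is hypothetical and is not part of Structure Elucidator Suite.

```python
# Experimental and HOSE-predicted 13C shifts (ppm) for proposed structure 1,
# transcribed from Table 1 as (experimental, calculated) pairs.
shifts = {
    "C1": (153.5, 127.05),
    "C2": (157.6, 133.63),
    "C3": (150.0, 141.82),
    "C4": (142.0, 132.49),
    "C5": (91.3, 81.46),
    "C6": (75.5, 72.92),
    "C7": (72.7, 75.64),
    "C8": (88.2, 71.75),
    "C9": (63.5, 64.01),
}


def deviation_stats(pairs):
    """Return (mean absolute deviation, maximum absolute deviation) in ppm.

    Assumption: the average deviation is taken as the mean absolute difference.
    """
    diffs = [abs(exp - calc) for exp, calc in pairs]
    return sum(diffs) / len(diffs), max(diffs)


d_avg, d_max = deviation_stats(shifts.values())
print(f"average deviation = {d_avg:.2f} ppm, maximum deviation = {d_max:.2f} ppm")
```

Under this assumption the average deviation for structure 1 comes out at roughly 11 ppm, with a maximum above 26 ppm at C 1, far outside the 1-2.5 ppm range expected for a correct structure.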
Structure generation was performed from the MCD with the following results: k = 25, tg = 1 s, where k is the number of structures and tg is the processor time. 13C chemical shift prediction was carried out for the output file, and the structures were ranked in increasing order of average deviation dA. The top six structures of the ranked structural file are shown in Figure 3.
Proposed structure 1 was ranked first, but with very large deviations. This means that no structure with acceptable average deviations can be generated from the initial data. Therefore, structure 1 should be revised.
The reassessment of 1 was initially approached by inspecting the non-controversial moiety of the molecule, which in this case was the carbohydrate fragment (right-hand fragment) concerning carbons C6-C9. All four carbon chemical shift values were in the range (δc 75.5, 72.7, 88.2, 63.5 ppm) matching that expected for hydroxylated sp3 carbon atoms. The hemiaminal ether at C5 would also be expected to resonate in the recorded region (i.e., δc 91.3 ppm) and this was further supported by the observed HSQC correlations.
Inspection of the 13C NMR spectrum of compound 1 revealed an additional resonance at ~121 ppm. Attention was therefore focused on the reported molecular formula, C9H17NO8 (m/z 290.0853 [M+Na]+, calcd. 290.0846), which contained an unusually high number of hydrogen atoms. The observed value of m/z 290.0853 was also consistent with the molecular formula C10H13N5O4 (calcd. m/z 290.0860 [M+Na]+), which requires an additional carbon atom. Considering that the new molecular formula also indicated the presence of four additional nitrogen atoms, and that purine bases are often associated with carbohydrate moieties, it was conceivable that a nucleoside had been isolated.
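The ambiguity between the two formulas can be reproduced with a short calculation. This is a minimal sketch, assuming standard monoisotopic masses and a singly charged [M+Na]+ ion with the electron mass subtracted; the helper names (`neutral_mass`, `mz_sodium_adduct`, `ppm_error`) are illustrative only.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858  # electron mass (u)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral formula such as 'C10H13N5O4'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return sum(MONO[element] * int(count or 1) for element, count in tokens)


def mz_sodium_adduct(formula):
    """m/z of the singly charged [M+Na]+ ion (electron mass subtracted)."""
    return neutral_mass(formula) + MONO["Na"] - ELECTRON


def ppm_error(observed, calculated):
    """Relative mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


observed = 290.0853
for formula in ("C9H17NO8", "C10H13N5O4"):
    calc = mz_sodium_adduct(formula)
    print(f"{formula}: calcd {calc:.4f}, error {ppm_error(observed, calc):+.1f} ppm")
```

Both formulas land within about 2.5 ppm of the observed ion, on opposite sides of it, so accurate mass alone cannot distinguish them.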
A new MCD was created from the spectroscopic data (Table 1) and the molecular formula C10H13N5O4 (Figure 4), and structure generation was repeated, which gave the following results: k = 2,160,807 → (Filter) → 2151 → (duplicate removal) → 1978, tg = 43 min.
The six top-ranked structures, in increasing order of average deviation, are presented in Figure 5.
The best structure in Figure 5 coincided with the structural formula of a known compound, adenosine. This assignment was supported by calculating the DP4 probabilities for the three top-ranked structures (Figure 6).
DU8ML [4], the DFT-empowered NMR chemical shift and coupling constant prediction tool enabled with machine learning capability, also confirmed the hypothesis of adenosine, which was subsequently proven beyond doubt through direct 1H and 13C NMR comparison to commercial material.
Revision of Structure 2
Having deduced the presence of nucleosides, the focus was turned to the assessment of 2. Evaluation of the HRMS-ESI spectrum for 2 revealed two peaks [3]. The major peak closely matched the molecular formula previously seen for adenosine [i.e., C10H13N5O4 (m/z 268.1041 [M+H]+, calcd. 268.1046)], and the minor peak was observed at m/z 245.0769. Therefore, assuming the minor ion was also an [M+H]+ peak, an ion formula of C9H13N2O6 (calcd. 245.0774) could be generated, which corresponds to C9H12N2O6 for the neutral molecule.
The revision of structure 2 followed the same steps as described above, so we present only the figures showing the data that were processed or obtained, with short explanations. In the case of 2, Structure Elucidator again demonstrated that this structure was incorrect (see Figure 7).
In [3], the COSY and main HMBC correlations were presented graphically (Figure 8).
13C, 1H, HSQC, and HMBC data presented in Figure 8 were entered into the program, and a Molecular Connectivity Diagram (MCD) was created (Figure 9).
Results of structure generation: k = 75 → (removal of duplicates) → 58, tg = 2.5 s, where k is the number of structures and tg is the processing time. 13C chemical shift prediction was performed for the output file, and the structures were ranked in increasing order of average deviation dA. The top six structures of the ranked structural file are shown in Figure 10. Proposed structure 2 was placed in the fourth position by the ranking procedure, while the “best” structure is characterized by very large values of average and maximum deviations.
The next actions were similar to those performed in the previous case. Therefore, we will present the corresponding pictures assuming that their meaning will be clear.
Results of structure generation: k = 1300 → (Filter) → 402 → (duplicate removal) → 402, tg = 9 s. The six top-ranked structures are shown in Figure 12.
As in the previous case, the structure of uridine was confirmed by DU8ML calculations, as well as by comparison of the experimental 1H NMR spectrum with literature data.
Thus, structures 1 and 2 were revised to adenosine and uridine, respectively.
Several methodological conclusions can be drawn from the examples considered:
- First and foremost, chemical knowledge must be used to assess the stability of each proposed chemical structure.
- All possible variants of the molecular formula that follow from the mass spectrum should be carefully checked.
- It is necessary to predict 13C chemical shifts for the proposed structures, which is easily achieved with the help of fast empirical methods. In both cases considered, this would have immediately shown that the structures were wrong.
- As much information about functional groups as possible should be extracted from the infrared spectrum. Obviously, the infrared spectrum of compound 2 (if recorded) would show the presence of a carbonyl group in the molecule.
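The recommendation to check all molecular formula variants can be illustrated with a brute-force search. This is only a sketch under stated assumptions: CHNO-only compositions, a singly charged [M+Na]+ ion, a 5 ppm tolerance, and arbitrary element limits. The function `candidate_formulas` is hypothetical and does not reflect how Structure Elucidator Suite or any other formula generator works internally.

```python
from itertools import product

# Monoisotopic masses (u); the tolerance and element limits below are assumptions.
C, H, N, O = 12.0, 1.00782503, 14.0030740, 15.9949146
NA, ELECTRON = 22.9897693, 0.00054858


def candidate_formulas(mz_observed, tol_ppm=5.0,
                       max_c=15, max_h=30, max_n=6, max_o=10):
    """List CHNO formulas whose [M+Na]+ m/z lies within tol_ppm of mz_observed.

    Returns (formula, ppm error, RDBE) tuples sorted by absolute ppm error.
    """
    hits = []
    ranges = (range(1, max_c + 1), range(1, max_h + 1),
              range(max_n + 1), range(max_o + 1))
    for c, h, n, o in product(*ranges):
        # Twice the rings-plus-double-bonds equivalent (RDBE); a neutral,
        # closed-shell molecule needs a non-negative integer RDBE.
        rdbe2 = 2 * c + 2 + n - h
        if rdbe2 < 0 or rdbe2 % 2:
            continue
        mz = c * C + h * H + n * N + o * O + NA - ELECTRON
        err = (mz_observed - mz) / mz * 1e6
        if abs(err) <= tol_ppm:
            hits.append((f"C{c}H{h}N{n}O{o}", round(err, 1), rdbe2 // 2))
    return sorted(hits, key=lambda hit: abs(hit[1]))


candidates = candidate_formulas(290.0853)
for formula, err, rdbe in candidates:
    print(f"{formula:<14} {err:+5.1f} ppm  RDBE = {rdbe}")
```

For the ion at m/z 290.0853 the list contains both the originally reported C9H17NO8 (RDBE = 2) and C10H13N5O4 (RDBE = 7), so the choice between them has to rest on chemical knowledge and the NMR data rather than on accurate mass alone.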
References
[1] A. G. Kutateladze, R. W. Bates, M. E. Elyashberg, C. M. Williams. (2023). Structural reassignment of two polyenol natural products. Eur. J. Org. Chem., 26, e202201316.
[2] Q.-G. Ma, K. Xu, Z.-P. Sang, R.-R. Wei, W.-M. Liu, Y.-L. Su, J.-B. Yang, A.-G. Wang, T.-F. Ji, L.-J. Li. (2016). Alkenes with antioxidative activities from Murraya koenigii (L.) Spreng. Bioorg. Med. Chem. Lett., 26, 799.
[3] C. Siebatcheu, D. Wetadieu, O. Y. Youassi, M. A. B. Boat, K. G. Bedane, N. S. Tchameni, M. L. Sameza. (2022). Secondary metabolites from an endophytic fungus Trichoderma erinaceum with antimicrobial activity towards Pythium ultimum. Nat. Prod. Res., 37(4), 657-662.
[4] I. M. Novitskiy, A. G. Kutateladze. (2022). DU8ML: Machine learning-augmented Density Functional Theory nuclear magnetic resonance computations for high-throughput in silico solution structure validation and revision of complex alkaloids. J. Org. Chem., 87, 4818.