August 1, 2019
by Mikhail Elyashberg, Leading Researcher, ACD/Labs
Penerpene A
The paxilline-type indole-terpenoids are a class of fungal metabolites with a common core consisting of a cyclic diterpene skeleton fused with an indole moiety, derived from geranylgeranyl diphosphate and indole-3-glycerol phosphate, respectively. They are one of the largest classes of fungal indole-terpenoids and have diverse structures. More than 1000 have been discovered to date. Many of them show diverse biological activities, including anti-H1N1, antibacterial, cytotoxic, and ion channel antagonistic effects.
In a search for new metabolites from marine fungi, the fungus Penicillium sp. KFD28 was isolated from a bivalve mollusk, Meretrix lusoria, from Haikou Bay in China [1]. From the EtOAc extract of the fermentation broth, the authors isolated and identified four new paxilline-type indole-terpenoids, named penerpenes A-D.
Compound 1 (penerpene A) is a unique spiro indole-diterpene with a 1,4-dihydro-2H-benzo[d][1,3]oxazine motif.
1
Its structure was elucidated using 1D and 2D NMR spectra. These data were used to challenge ACD/Structure Elucidator.
Compound 1 has the molecular formula C28H35NO6 as derived from its HR-ESI-MS and 13C NMR data. Complete 1D NMR and HSQC data were presented in tables in [1], but only selected HMBC and COSY correlations were presented graphically (see Figure 1).
Figure 1. Selected COSY and HMBC correlations presented in [1]
The NMR spectroscopic data from [1] are summarized in Table 1.
Table 1. Spectroscopic data of compound 1
Label | δC (ppm) | δC calc, HOSE (ppm) | XHn | δH (ppm) | M (J, Hz) | COSY | H to C HMBC |
C 1 | 121.000 | 120.960 | C | | | | |
C 2 | 125.700 | 126.860 | CH | 6.920 | d (8.0) | 6.67 | C 8 |
C 3 | 118.100 | 120.300 | CH | 6.670 | t (8.0) | 6.92, 7.03 | C 1 |
C 4 | 127.600 | 128.790 | CH | 7.030 | t (8.0) | 6.64, 6.67 | C 6 |
C 5 | 116.600 | 112.030 | CH | 6.640 | d (8.0) | 7.03 | |
C 6 | 143.100 | 142.940 | C | | | | |
C 7 | 39.800 | 38.630 | CH2 | 2.200 | u | 2.66 | C 8, C 9 |
C 7 | 39.800 | 38.630 | CH2 | 1.960 | u | | |
C 8 | 81.100 | 90.800 | C | | | | |
C 9 | 213.700 | 216.900 | C | | | | |
C 10 | 53.700 | 54.790 | C | | | | |
C 11 | 36.300 | 37.130 | CH | 2.660 | u | 1.80, 2.20 | |
C 12 | 20.300 | 23.930 | CH2 | 1.800 | u | 1.89, 2.66 | |
C 12 | 20.300 | 23.930 | CH2 | 1.610 | u | | |
C 13 | 31.900 | 33.860 | CH2 | 1.660 | u | | |
C 13 | 31.900 | 33.860 | CH2 | 1.890 | u | 1.80 | |
C 14 | 76.600 | 77.020 | C | | | | |
C 15 | 42.500 | 42.680 | C | | | | |
C 16 | 24.500 | 28.070 | CH2 | 1.820 | u | | |
C 16 | 24.500 | 28.070 | CH2 | 2.410 | u | 2.10 | |
C 17 | 28.200 | 28.360 | CH2 | 2.100 | u | 2.41, 4.74 | |
C 17 | 28.200 | 28.360 | CH2 | 1.650 | u | | |
C 18 | 73.000 | 72.810 | CH | 4.740 | u | 2.10 | |
C 19 | 168.300 | 165.330 | C | | | | |
C 20 | 119.500 | 122.210 | CH | 5.770 | d (1.9) | | C 18, C 14 |
C 21 | 197.500 | 198.050 | C | | | | |
C 22 | 83.400 | 85.840 | CH | 3.640 | d (1.9) | | |
C 23 | 71.000 | 71.590 | C | | | | |
C 24 | 25.900 | 26.150 | CH3 | 1.190 | s | | C 23, C 22 |
C 25 | 25.900 | 25.420 | CH3 | 1.150 | s | | C 23, C 22 |
C 26 | 13.000 | 27.390 | CH3 | 0.830 | s | | C 15, C 10, C 14 |
C 27 | 17.800 | 16.260 | CH3 | 1.410 | s | | C 11, C 15, C 10, C 9 |
C 28 | 69.300 | 68.830 | CH2 | 4.510 | u | 6.34 | C 8, C 6 |
C 28 | 69.300 | 68.830 | CH2 | 4.740 | u | | |
N 1 | | | NH | 6.340 | t (4.7) | 4.51 | C 1 |
O 1 | | | OH | 5.070 | s | | |
O 2 | | | OH | 4.340 | s | | |
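As a quick consistency check, the molecular formula C28H35NO6 and the XHn column of Table 1 can be compared with a few lines of arithmetic. The sketch below is not part of the published workflow; the function and variable names are illustrative assumptions. It computes the number of rings plus double bonds from the formula and tallies the carbons and hydrogens implied by the atom types listed in the table.

```python
# Molecular formula of compound 1 as stated in the text: C28H35NO6.
formula = {"C": 28, "H": 35, "N": 1, "O": 6}


def double_bond_equivalents(f):
    """Rings plus double bonds for a C/H/N/O formula: C - H/2 + N/2 + 1."""
    return f["C"] - f["H"] / 2 + f["N"] / 2 + 1


# Number of atoms of each XHn type, counted by hand from Table 1.
# Each CH2 carbon appears on two table rows (one per proton) but is counted once.
xhn_counts = {"C": 10, "CH": 8, "CH2": 6, "CH3": 4, "NH": 1, "OH": 2}
hydrogens_per_type = {"C": 0, "CH": 1, "CH2": 2, "CH3": 3, "NH": 1, "OH": 1}

carbon_count = sum(n for kind, n in xhn_counts.items() if kind.startswith("C"))
hydrogen_count = sum(n * hydrogens_per_type[kind] for kind, n in xhn_counts.items())

print("Rings + double bonds:", double_bond_equivalents(formula))
print("Carbons in Table 1:", carbon_count)
print("Hydrogens in Table 1:", hydrogen_count)
```

Both tallies agree with the formula (28 carbons, 35 hydrogens). Only two of the six oxygens carry a hydrogen, so the remaining four must be carbonyl or ether-type oxygens.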
Entering the data shown in Table 1 into ACD/Structure Elucidator automatically produced the Molecular Connectivity Diagram (MCD) shown in Figure 2.
Figure 2. Molecular Connectivity Diagram
MCD overview. The diagram contains five light blue carbon atoms: C 71.00 ppm, C 76.60 ppm, C 81.10 ppm, C 119.50 ppm and C 121.00 ppm. This color indicates atoms with ambiguous hybridization, either sp3 or sp2 (but not sp). The possibility of bonding to a heteroatom (ob = obligatory, fb = forbidden) was set for the carbon atoms both by the program and manually, taking into account the 13C and 1H chemical shifts of the corresponding atoms. Two evident carbonyl bonds were added to the MCD manually.
Structure generation, accompanied by fast 13C chemical shift prediction using the incremental method and by spectral filtering, was initiated from the constraints summarized in the MCD. Generation was completed with the following results: k = 297,588 → (spectral filtering) → 266 → (duplicate removal) → 77, tg = 10 min 42 s. The number of initially generated structures is large (~300,000), which is a consequence of the relatively small number of HMBC correlations (Figure 1) and the uncertainty of the constraints displayed in the MCD (five light blue carbons). However, the final number of candidate structures is less than 80.
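To put the generation statistics in perspective, the following sketch works out what fraction of the generated structures survives each stage. It uses only the three counts and the generation time quoted above; the variable names are illustrative and do not correspond to anything in ACD/Structure Elucidator.

```python
# Counts and timing quoted in the text for the generation run.
generated = 297_588          # k: structures produced by the generator
after_filtering = 266        # structures that passed spectral filtering
after_dedup = 77             # structures left after duplicate removal
generation_time_s = 10 * 60 + 42  # tg = 10 min 42 s


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


pass_rate = percent(after_filtering, generated)
unique_rate = percent(after_dedup, after_filtering)
overall_rate = percent(after_dedup, generated)

print(f"Passed spectral filtering: {pass_rate:.3f}% of generated structures")
print(f"Unique after duplicate removal: {unique_rate:.1f}% of filtered structures")
print(f"Overall survival: {overall_rate:.4f}%")
print(f"Total generation time: {generation_time_s} s")
```

Fewer than one in a thousand generated structures passes spectral filtering, which is why the weakly constrained MCD still leads to a short candidate list.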
At that point, 13C chemical shift prediction was performed by all the methods available in ACD/Structure Elucidator, and the structural file was then ranked in increasing order of dA(13C), the average deviation between the experimental 13C chemical shifts and those calculated using the HOSE codes approach, so that the best-matching structure comes first. 1H chemical shifts were also calculated using the neural networks algorithm. The six top ranked structures are presented in Figure 3.
Figure 3. Six top ranked structures of the output file.
The average deviations calculated by all methods identify structure #1 as the best one, and this structure matches the structure of penerpene A reported by the authors [1]. However, the maximum deviation max_dA(13C) between the experimental and the predicted chemical shifts is 14.40 ppm, which is unusually large. The agreement between the experimental and predicted 13C chemical shifts for structure #1 is shown below: atoms marked green deviate by no more than 3 ppm, and atoms marked yellow by more than 3 ppm but no more than 15 ppm.
1
The maximum deviation was found for the CH3 group at 13.00 ppm. To explain this deviation, the protocol of the 13C chemical shift prediction is displayed in Figure 4.
Figure 4. Protocol of the 13C chemical shift prediction for the CH3 group (13.00 ppm)
The protocol shows that only one reference structure exists in the ACD/Labs Predictors Training Database. The experimental chemical shift of that group is 24.00 ppm in the spectrum of the reference structure, while the algorithm predicted 27.4 ppm for penerpene A. Despite this discrepancy, the correct structure of compound 1 was identified by the program.
Figure 3 shows that the candidate structures contain similar structural elements. It was interesting to see how the program sorts such similar structures. For this purpose, the output file was ranked in decreasing order of the similarity coefficient calculated relative to structure #1 (Figure 5).
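The deviation statistics discussed above can be reproduced from the two δC columns of Table 1. The sketch below assumes that the average deviation is the mean absolute difference between experimental and HOSE-calculated shifts; the program's own definition of dA may differ, and all names in the code are illustrative. It also applies the green (up to 3 ppm) and yellow (above 3 and up to 15 ppm) thresholds mentioned in the text.

```python
# (label, experimental δC, HOSE-calculated δC) copied from Table 1, in ppm.
shifts = [
    ("C 1", 121.0, 120.96), ("C 2", 125.7, 126.86), ("C 3", 118.1, 120.30),
    ("C 4", 127.6, 128.79), ("C 5", 116.6, 112.03), ("C 6", 143.1, 142.94),
    ("C 7", 39.8, 38.63), ("C 8", 81.1, 90.80), ("C 9", 213.7, 216.90),
    ("C 10", 53.7, 54.79), ("C 11", 36.3, 37.13), ("C 12", 20.3, 23.93),
    ("C 13", 31.9, 33.86), ("C 14", 76.6, 77.02), ("C 15", 42.5, 42.68),
    ("C 16", 24.5, 28.07), ("C 17", 28.2, 28.36), ("C 18", 73.0, 72.81),
    ("C 19", 168.3, 165.33), ("C 20", 119.5, 122.21), ("C 21", 197.5, 198.05),
    ("C 22", 83.4, 85.84), ("C 23", 71.0, 71.59), ("C 24", 25.9, 26.15),
    ("C 25", 25.9, 25.42), ("C 26", 13.0, 27.39), ("C 27", 17.8, 16.26),
    ("C 28", 69.3, 68.83),
]

# Absolute deviation per carbon (assumed definition, see the note above).
deviations = {label: abs(exp - calc) for label, exp, calc in shifts}

mean_dev = sum(deviations.values()) / len(deviations)
worst_label = max(deviations, key=deviations.get)
max_dev = deviations[worst_label]

# Colour classes as described in the text.
green = [label for label, d in deviations.items() if d <= 3.0]
yellow = [label for label, d in deviations.items() if 3.0 < d <= 15.0]

print(f"Mean absolute deviation: {mean_dev:.2f} ppm")
print(f"Largest deviation: {max_dev:.2f} ppm at {worst_label}")
print(f"Green atoms: {len(green)}, yellow atoms: {len(yellow)}")
```

With the rounded values of Table 1, the largest deviation comes out at 14.39 ppm for the methyl carbon at 13.00 ppm (C 26), in line with the 14.40 ppm quoted in the text.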
Figure 5. First six structures of the output file ranked by similarity coefficient.
Figure 5 shows that the most similar structures, #2 – #4, have average deviations much larger than #1, which allows one to safely rule them out. This example illustrates the high structural “resolving power” of ACD/Structure Elucidator.
References
[1] F.-D. Kong, P. Fan, L.-M. Zhou, Q.-Y. Ma, Q.-Y. Xie, H.-Z. Zheng, Z.-H. Zheng, R.-S. Zhang, J.-Z. Yuan, H.-F. Dai, D.-Q. Luo, Y.-X. Zhao. Penerpenes A−D, Four Indole Terpenoids with Potent Protein Tyrosine Phosphatase Inhibitory Activity from the Marine-Derived Fungus Penicillium sp. KFD28. Org. Lett. 2019, 21, 4864−4867.