Bioactivity-Based Molecular Networking for the Discovery of Drug Leads in Natural Product Bioassay-Guided Fractionation

Louis-Félix Nothias, Mélissa Nothias-Esposito,# Ricardo da Silva, Mingxun Wang, Ivan Protsyuk, Zheng Zhang, Abi Sarvepalli, Pieter Leyssen, David Touboul, Jean Costa,# Julien Paolini,# Theodore Alexandrov, Marc Litaudon, and Pieter C. Dorrestein*

Collaborative Mass Spectrometry Innovation Center, and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, USA
Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, California 92093, USA
Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, University of Paris-Saclay, 91198 Gif-sur-Yvette, France
# Laboratoire de Chimie des Produits Naturels, CNRS, UMR SPE 6134, University of Corsica, 20250 Corte, France
European Molecular Biology Laboratory, EMBL, Heidelberg, Germany
Laboratory for Virology and Experimental Chemotherapy, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium

Author Contributions
L.-F.N. and M.N.-E. contributed equally to this manuscript. L.-F.N., M.E., M.L., and P.C.D. conceived and designed the research and wrote the manuscript. M.N.-E. isolated and identified the molecules. P.L. performed the antiviral evaluation. I.P. and T.A. adapted the Optimus workflow. L.-F.N. and M.N.-E. generated and annotated the molecular networks. R.S. prepared the Jupyter notebook. Z.Z. and A.B. coded the MZmine2 modules. All other authors discussed and commented on the manuscript preparation.

Supporting Information

STRUCTURAL ELUCIDATION

HRESIMS of compound 20 showed a sodiated molecular ion at m/z 591.3272 [M + Na]+, corresponding to the molecular formula C34H48O7 and indicating 11 degrees of unsaturation.
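The formula assignment above can be checked with a short calculation. The following is a minimal sketch, assuming standard monoisotopic masses and the usual rings-plus-double-bonds formula; the function names are illustrative and are not part of the authors' workflow.

```python
# Monoisotopic masses in unified atomic mass units (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON_MASS = 0.00054858


def degrees_of_unsaturation(c, h, n=0, halogens=0):
    """Rings plus pi bonds for a C/H/N/O/halogen formula: (2C + 2 + N - H - X) / 2."""
    return (2 * c + 2 + n - h - halogens) / 2


def sodium_adduct_mz(formula):
    """m/z of the singly charged [M + Na]+ ion for a dict of element counts."""
    neutral = sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())
    return neutral + MONOISOTOPIC_MASS["Na"] - ELECTRON_MASS


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


formula_20 = {"C": 34, "H": 48, "O": 7}  # molecular formula reported for compound 20
dou = degrees_of_unsaturation(c=34, h=48)
mz_calc = sodium_adduct_mz(formula_20)
error = ppm_error(591.3272, mz_calc)  # 591.3272 is the reported HRESIMS value
print(f"DoU = {dou:.0f}, calc. [M + Na]+ = {mz_calc:.4f}, error = {error:.1f} ppm")
```

Under these assumptions the calculation gives 11 degrees of unsaturation, and the calculated [M + Na]+ value lies within a few ppm of the measured one.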
Examination of the HSQC, 1H, and 13C NMR spectra (CDCl3; Figure S1, table of 1H and 13C NMR data) revealed one ketone carbonyl (δC 210.1), two ester carbonyls (δC 166.1 and 179.7), and olefinic signals attributable to two vicinal-monosubstituted double bonds [δC 160.1 (C-1), 136.5 (C-2) and δH 7.54 (br t, J = 1.7 Hz, H-1)] and [δC 142.1 (C-6), 126.7 (C-7) and δH 5.52 (d, J = 5.4 Hz, H-7)]. The COSY and HMBC experiments made it possible to deduce the nature of the esters, which were assigned as one deca-2Z,4E-dienoyl group (δC 115.0, 146.3, 127.0, 146.6, 33.2, 28.6, 31.6, 22.7, 14.2; δH 5.52, 6.56, 7.29, 6.07, 2.15, 1.41, 1.27, 1.28, 0.87) and one isobutyrate group (δC 34.4, 18.8, 18.7; δH 2.57, 1.18, 1.15). Of the 11 degrees of unsaturation, 7 were attributed to the two ester groups (including the two double bonds of the dienoyl chain), one ketone carbonyl, and two double bonds; the four remaining degrees were therefore assigned to the four rings of a tigliane skeleton.

Examination of the COSY spectrum of compound 20 revealed three spin systems (H-1, H-10, H-4, and H-5; H-7, H-8, and H-14; and H-11, H-12, and H-18). The first is represented by =CH–CH–CH–CH2 [H-1 and δH 3.23 (m, H-10); 2.44 (td, J = 9.5 and 4.5 Hz, H-4)], connected through a methylene group [δH 2.82 and 2.12 (dd, J = 18.0 and 10.0 Hz, H-5α and H-5β, respectively)]. The second is characterized by =CH–CH–CH [H-7; δH 2.35 (t, J = 5.4 Hz, H-8) and 1.01 (d, J = 5.4 Hz, H-14)]. The third can be built as CH(CH3)–CH [δH 1.56 (dd, J = 6.4 and 9.8 Hz, H-11); 0.91 (d, J = 6.4 Hz, H3-18) and 5.43 (d, J = 9.8 Hz, H-12)].

The first spin system was connected to an α,β-unsaturated carbonyl through HMBC correlations from H-1 to C-2, C-3, and C-19 [δC 136.5, 210.1, and 10.4 (δH 1.70, dd, J = 2.4 and 1.2 Hz, H3-19)] and from H-4 to C-3. The first two spin systems were connected to each other through HMBC correlations from H-5 and H-7 to C-6 and C-20 [δC 142.1 and 67.7 (δH 4.01, br s, H2-20)] and from H-10 and H-8 to C-9 (δC 77.9). The correlations from H-8, H-14, H-12, H3-16, and H3-18 to C-15, C-13, C-16, C-18, C-17, and C-9 account for the rest of the molecule. Together, these observations established a 4β-deoxyphorbol skeleton. The attachment of a deca-2Z,4E-dienoyl group at C-12 and an isobutyryl group at C-13 was established from HMBC correlations between the oxymethine protons and the ester carbonyl carbons (Figure S2). The ROESY spectrum allowed the two hydroxyl groups at δH 5.81 and 5.27 to be placed at C-9 and C-20, respectively.
Based on previous data and on the observed ROESY correlations, the relative configuration of compound 20 was established as depicted in Figure S2, and the compound was identified as 12β-O-[deca-2Z,4E-dienoyl]-13α-isobutyryl-4β-deoxyphorbol. The same general approach was used to elucidate the structures of the 4β-deoxyphorbol esters 21–23. These compounds possess the same 4β-deoxyphorbol skeleton and the same isobutyrate group at C-13 as compound 20 but differ in the ester at C-12. Their acylation patterns were determined from COSY and HMBC experiments (Figures S12, S14; S17, S19; and S22, S24).

EXPERIMENTAL SECTION

Spectral feature detection with the Optimus workflow. Optimus is an open-source and open-format LC-MS processing workflow (http://github.com/molecularcartography/Optimus) that uses OpenMS algorithms [ http://www.nature.com/nmeth/journal/v13/n9/full/nmeth.3959.html ]. It is available as a KNIME workflow and was run on KNIME v. 3.2.1 on macOS 10.12. The parameters were set as follows.

Detect LC-MS features: m/z tolerance (15 ppm), Noise threshold (1 000 000), Half of the MS/MS isolation window (0.075 Da), RT tolerance for MS/MS acquisition (10 s).

Detect LC-MS features/set advanced FD settings: common_chrom_peak_snr (3.0), common_chrom_fwhm (25 s), mtd_reestimate_mt_sd (yes), mtd_trace_termination_criterion (outlier), mtd_trace_termination_outliers (3 spectra), mtd_min_sample_rate (0.5), mtd_max_sample_rate (100), epd_width_filtering (fixed), epd_min_fwhm (3), epd_max_fwhm (40), epd_masstrace_snr_filtering (no), ffm_local_rt_range (20 s), ffm_local_mz_range (5 s), ffm_charge_lower_bound (1), ffm_charge_upper_bound (3).

Detect LC-MS features/detect LC-MS features/decharger: Potential adducts (H+:0.1 / Na+:0.1).

Align and quantify features: RT tolerance (60 s), Enable re-integration of missing features (yes), Enable pose clustering alignment (no).

Filter features: Minimum intensity ratio as compared to blanks (3.0), Minimal occurrence number (2.0), High chromatographic peak quality (yes), Elution time to discard at start (300 s), Elution time to discard at end (600 s), Presence of MS/MS (yes).

Normalize features: Enable feature normalization (no).

Prepare results: Save as mgf (yes), MS1 noise threshold (5000), MS2 noise threshold (1000).
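To illustrate what two of the settings above mean in practice, the sketch below applies a 15 ppm m/z tolerance and the blank-ratio and occurrence filters to toy data. It is not the Optimus or OpenMS implementation: the function names are hypothetical, and the exact definitions (ratio taken as maximum sample intensity over maximum blank intensity, occurrence counted as samples with non-zero intensity) are assumptions made for illustration only.

```python
def within_ppm(mz_observed, mz_reference, tol_ppm=15.0):
    """True if two m/z values agree within a relative tolerance given in ppm."""
    return abs(mz_observed - mz_reference) / mz_reference * 1e6 <= tol_ppm


def keep_feature(sample_intensities, blank_intensities,
                 min_blank_ratio=3.0, min_occurrence=2):
    """Hypothetical feature filter mirroring two 'Filter features' settings.

    Assumed definitions (not taken from the Optimus source):
    - occurrence = number of samples in which the feature has non-zero intensity
    - blank ratio = max sample intensity / max blank intensity
    """
    occurrence = sum(1 for value in sample_intensities if value > 0)
    if occurrence < min_occurrence:
        return False
    max_blank = max(blank_intensities, default=0.0)
    if max_blank == 0:
        return True  # feature absent from blanks, so the ratio test passes
    return max(sample_intensities) / max_blank >= min_blank_ratio


# Toy example: a feature near m/z 591.33 detected in two of three samples.
print(within_ppm(591.3272, 591.3292))               # about 3.4 ppm apart
print(keep_feature([5.0e6, 4.0e6, 0.0], [1.0e6]))   # ratio 5, occurrence 2
print(keep_feature([2.0e6, 2.5e6], [1.0e6]))        # ratio 2.5, below 3.0
```

A relative (ppm) tolerance scales with m/z, which is why it is generally preferred over a fixed Da window for high-resolution data.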
Contents

Figure S1. Table of 1H and 13C NMR Data for Compound 20 (300 MHz) in CDCl3 (δH and δC in ppm, J in Hz)
Figure S2. Key COSY (bold, left), HMBC (blue arrows, left), and ROESY (red arrows, right) correlations of compound 20
Figure S3. Extracted ion chromatogram of compound 20 in LC-MS/MS with neutral loss and backbone fragment ions in zoom
Figure S4. 1H NMR spectrum (CDCl3, 300 MHz) of 20
Figure S5. 13C NMR spectrum (CDCl3, 75 MHz) of 20
Figure S6. COSY NMR spectrum (CDCl3, 300 MHz) of 20
Figure S7. HSQC NMR spectrum (CDCl3, 300 MHz) of 20
Figure S8. HMBC NMR spectrum (CDCl3, 300 MHz) of 20
Figure S9. ROESY NMR spectrum (CDCl3, 500 MHz) of 20
Figure S10. HRESIMS spectrum (TOF) of 20
Figure S11. 1H NMR spectrum (CDCl3, 300 MHz) of 21
Figure S12. COSY NMR spectrum (CDCl3, 300 MHz) of 21
Figure S13. HSQC NMR spectrum (CDCl3, 300 MHz) of 21
Figure S14. HMBC NMR spectrum (CDCl3, 300 MHz) of 21
Figure S15. HRESIMS spectrum (TOF) of 21
Figure S16. 1H NMR spectrum (CDCl3, 300 MHz) of 22
Figure S17. COSY NMR spectrum (CDCl3, 300 MHz) of 22
Figure S18. HSQC NMR spectrum (CDCl3, 300 MHz) of 22
Figure S19. HMBC NMR spectrum (CDCl3, 300 MHz) of 22
Figure S20. HRESIMS spectrum (TOF) of 22
Figure S21. 1H NMR spectrum (CDCl3, 500 MHz) of 23
Figure S22. COSY NMR spectrum (CDCl3, 500 MHz) of 23
Figure S23. HSQC NMR spectrum (CDCl3, 500 MHz) of 23
Figure S24. HMBC NMR spectrum (CDCl3, 500 MHz) of 23
Figure S25. HRESIMS spectrum (TOF) of 23
Figure S26. Table of Antiviral Activities of Compounds 20–23 against CHIKV in Vero Cells
Figure S1. Table of 1H and 13C NMR Data for Compound 20 (300 MHz) in CDCl3 (δH and δC in ppm, J in Hz)

position   δC      δH, multiplicity (J in Hz)
1          160.1   7.54, br t (1.7)
2          136.5   -
3          210.1   -
4          44.4    2.44, td (9.5, 4.5)
5          29.8    2.82, dd (18.0, 10.0)
                   2.12, dd (18.0, 10.0)
6          142.1   -
7          126.7   5.52, d (5.4)
8          42.2    2.35, t (5.4)
9          77.9    -
10         54.4    3.23, m
11         42.7    1.56, dd (6.4, 9.8)
12         76.2    5.43, d (9.8)
13         65.1    -
14         35.8    1.01, d (5.4)
15         26.1    -
16         23.9    1.18, s
17         16.9    1.18, s
18         15.2    0.91, d (6.4)
19         10.4    1.70, dd (2.4, 1.2)
20         67.7    4.01, br s
9-OH       -       5.81, br s
20-OH      -       5.27, br s
12-R       166.1   -
1'         115.0   5.52, d (11.3)
2'         146.3   6.56, t (11.3)
3'         127.0   7.29, dd (15.6, 11.5)
4'         146.6   6.07, ddd (15.6, 7.1, 6.8)
5'         33.2    2.15, dd (13.6, 7.1)
6'         28.6    1.41, m
7'         31.6    1.27, m
8'         22.7    1.28, m
9'         14.2    0.87, t (6.9)
13-iBu     -       -
1''        179.7   -
2''        34.4    2.57, q (7.0)
3''-Me     18.8    1.18, d (7.0)
4''-Me     18.7    1.15, d (7.0)

Figure S2. Key COSY (bold, left), HMBC (blue arrows, left), and ROESY (red arrows, right) correlations of compound 20.
Figure S3. Extracted ion chromatogram of compound 20 in LC-MS/MS, with the neutral loss and backbone fragment ions shown in the zoomed view. GNPS entries: http://gnps.ucsd.edu//proteosafe/result.jsp?task=548b824c6fab4ebe8b5447f5bf4b68bb&view=cluster_details&protein=118 and CCMSLIB00000840550
Figure S4. 1H NMR spectrum (CDCl3, 300 MHz) of 20
Figure S5. 13C NMR spectrum (CDCl3, 75 MHz) of 20
Figure S6. COSY NMR spectrum (CDCl3, 300 MHz) of 20
Figure S7. HSQC NMR spectrum (CDCl3, 300 MHz) of 20
Figure S8. HMBC NMR spectrum (CDCl3, 300 MHz) of 20
Figure S9. ROESY NMR spectrum (CDCl3, 500 MHz) of 20
Figure S10. HRESIMS spectrum (TOF) of 20, with the [M + Na]+ ion labeled
Figure S11. 1H NMR spectrum (CDCl3, 300 MHz) of 21
Figure S12. COSY NMR spectrum (CDCl3, 300 MHz) of 21
Figure S13. HSQC NMR spectrum (CDCl3, 300 MHz) of 21
Figure S14. HMBC NMR spectrum (CDCl3, 300 MHz) of 21
Figure S15. HRESIMS spectrum (TOF) of 21, with the [M + Na]+ ion labeled
Figure S16. 1H NMR spectrum (CDCl3, 300 MHz) of 22
Figure S17. COSY NMR spectrum (CDCl3, 300 MHz) of 22
Figure S18. HSQC NMR spectrum (CDCl3, 300 MHz) of 22
Figure S19. HMBC NMR spectrum (CDCl3, 300 MHz) of 22
Figure S20. HRESIMS spectrum (TOF) of 22, with the [M + Na]+ ion labeled
Figure S21. 1H NMR spectrum (CDCl3, 500 MHz) of 23
Figure S22. COSY NMR spectrum (CDCl3, 500 MHz) of 23
Figure S23. HSQC NMR spectrum (CDCl3, 500 MHz) of 23
Figure S24. HMBC NMR spectrum (CDCl3, 500 MHz) of 23
Figure S25. HRESIMS spectrum (TOF) of 23, with the [M + Na]+ ion labeled
Figure S26. Table of Antiviral Activities of Compounds 20–23 against CHIKV in Vero Cells (a)

Compound      EC50 CHIKV (a)   CC50 Vero (a)   SI (b)
20            0.9 ± 0.1        5.2 ± 0.8       6
21            0.6 ± 0.6        25.8 ± 5.3      41
22            0.4 ± 0.02       13.4 ± 2.6      34
23            12.6 ± 8.4       46.2 ± 18.4     4
Chloroquine   11 ± 7           89 ± 28         8

(a) Data in µM. Values are the median ± median absolute deviation calculated from at least three independent assays.
(b) SI, or window for antiviral selectivity, calculated as CC50 Vero / EC50 CHIKV.
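The selectivity index defined in footnote (b) is a simple ratio. The sketch below recomputes it from the rounded medians in the table; the dictionary layout and function name are illustrative assumptions. Ratios of rounded medians may differ slightly from the tabulated SI values (for compound 21, for example), presumably because the published values were derived from unrounded data.

```python
# Median EC50 (CHIKV) and CC50 (Vero) values in µM, copied from the table above.
activities = {
    "20": {"ec50": 0.9, "cc50": 5.2},
    "21": {"ec50": 0.6, "cc50": 25.8},
    "22": {"ec50": 0.4, "cc50": 13.4},
    "23": {"ec50": 12.6, "cc50": 46.2},
    "Chloroquine": {"ec50": 11.0, "cc50": 89.0},
}


def selectivity_index(cc50, ec50):
    """SI = CC50 (cytotoxicity) / EC50 (antiviral potency); higher is more selective."""
    return cc50 / ec50


si = {name: selectivity_index(v["cc50"], v["ec50"]) for name, v in activities.items()}

# List compounds from most to least selective.
for name, value in sorted(si.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<12} SI = {value:.1f}")
```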