
Discovery of FoTO1 and Taxol genes enables biosynthesis of baccatin III

Chemical and biological materials

Chemical standards were purchased from the following vendors (with catalogue number listed): taxusin (TargetMol; TN6763), 1-hydroxybaccatin I (LKT Labs; T0092), baccatin VI (Santa Cruz Biotechnology; sc-503244), 10-deacetylbaccatin III (Sigma-Aldrich; D3676), baccatin III (MedChemExpress; HY-N6985) and 9-dihydro-13-acetylbaccatin III (TargetMol; T5132). Taxadien-5α-ol was synthesized as previously described18. Taxus media var. hicksii was obtained from FastGrowingTrees.

Tissue preparation and single-nucleus sequencing

The cells of Taxus species, like those of many plants, are often two to three times larger than the 35-µm diameter limit for the standard 10x Genomics Chromium single-cell library devices. Consequently, single-cell isolation approaches (such as protoplasting) risked introducing a severe cell-type bias, and we instead adapted previously described nuclei isolation methods52 into a conifer-compatible snRNA-seq protocol. Taxus media var. hicksii aerial tissues (needles, stems and bud scales) were manually disrupted by razor blade and detergent treatment, followed by DNA staining, fluorescence-activated cell sorting (FACS) purification and library synthesis on the 10x Chromium platform. Nuclei extraction buffer (NIB) consisted of 5 mM MgCl2, 10 mM HEPES pH 7.6, 0.8 M sucrose, 0.1% Triton X-100 and (for density matching to prevent nuclei settling during flow sorting) 1% dextran T40 and 2% Ficoll. On the day of use, NIB was supplemented with 1 mM dithiothreitol. All nuclei-extraction steps were performed at 4 °C and wide-bore pipette tips were used when handling nuclei. Steps between tissue collection and loading into the Chromium device were completed within 90 min to avoid RNA loss. To isolate nuclei, approximately 1 g of T. media tissue was removed from the plant and immediately placed in a Petri dish with 10 ml NIB. Tissue was chopped by hand at around 200 rpm with a fresh razor blade for 5 min until most of the large tissue was broken down, and was then gently rocked at 4 °C for 15 min. To remove large debris, disrupted tissue was then passed through a pre-wet 100-μm cell strainer stacked on top of a 40-μm cell strainer. Nuclei were gently pelleted at 300g at 4 °C for 5 min and resuspended in 1 ml NIB with 5 ng μl−1 4′,6-diamidino-2-phenylindole (DAPI, Thermo Fisher Scientific) and 5 ng μl−1 propidium iodide. 
Using a Sony SH800 cell sorter with a 70-μm chip, 140,000–200,000 nuclei were sorted into a tube containing 1 ml PBS+ (PBS, 0.1% bovine serum albumin and 20 U ml−1 Invitrogen ribonuclease inhibitor). The gating strategy is shown in Supplementary Fig. 22. Nuclei were centrifuged at 300g at 4 °C for 5 min, then gently resuspended in 40 μl PBS+. Nuclei were immediately loaded onto a 10x Genomics Chromium controller and libraries were generated using v3 chemistry. Libraries were sequenced on an Illumina NextSeq 3000.

Multiplexed tissue elicitation

For the multiplexed elicitation experiment, Taxus needles were subjected to perturbation in deep-well 96-well plates with 200 μl MS medium (7.5 g l−1 Murashige and Skoog macronutrients (Fisher), 3 g l−1 sucrose, pH 5.7) supplemented with elicitor. Two needles (biological replicates), each from two developmental stages (young and mature), were treated with each elicitation condition (17 conditions listed in Supplementary Table 2) for each time point (1, 2, 3 and 4 days), resulting in 272 tissue samples (2 replicates of 136 perturbations). To minimize contamination, needles were washed thoroughly in sterile water before moving to MS plates, which were sealed with breathable rayon film (VWR) and placed under 18-h light cycles. Tissue elicitation was started at staggered times so that all tissues could be collected simultaneously. To extract nuclei from elicited tissues, all tissues were combined in a wire mesh, washed with water and subjected to the above nuclei-extraction protocol.
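The sample count in the design above can be checked by enumerating the factors. The following is a minimal sketch, assuming placeholder condition labels (the real elicitors are listed in Supplementary Table 2); only the factor sizes come from the text.

```python
from itertools import product

# Factor sizes are taken from the text; the labels are hypothetical placeholders.
conditions = [f"condition_{i + 1}" for i in range(17)]
time_points_days = [1, 2, 3, 4]
stages = ["young", "mature"]
replicates = [1, 2]

# One perturbation = one condition at one time point on one developmental stage.
perturbations = list(product(conditions, time_points_days, stages))

# Each perturbation is applied to two needles (biological replicates).
samples = [(c, t, s, r) for (c, t, s) in perturbations for r in replicates]

print(len(perturbations))  # 136
print(len(samples))        # 272
```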

Analysis of single-cell data

Reads were cleaned with Trimmomatic53 and mapped to the T. chinensis genome5 with STARsolo (v.2.7.10b)54 (STAR … --runThreadN 32 --alignIntronMax 10000 --soloUMIlen 12 --soloCellFilter EmptyDrops_CR --soloFeatures GeneFull --soloMultiMappers EM --soloType CB_UMI_Simple). Ambient RNA was removed with CellBender (v.0.3.0)55. Using the doubletdetection (v.4.2) library56, doublets were removed, as well as cells with outlier numbers of reads or in which most reads came from the most highly expressed genes (pct_counts_in_top_20_genes < 25). Genes were removed from analysis if expressed in fewer than 50 cells. For integrated UMAP plots, scVI was used to integrate cells from multiple single-cell experiments57. Scanpy (v.1.10.1)58 was used for processing and plotting post-filtered nuclear transcriptomes. For co-expression analysis and gene–gene correlation calculations, scVI-normalized57 transcriptomes (8,039 elicited transcriptomes, 3,027 naive transcriptomes from young tissues and 6,077 naive transcriptomes from mature tissues) were clustered into 2,901 cell states (around 10 cells per state) by Leiden clustering58, and then raw reads from each cluster were pooled to yield pseudobulk transcriptomes. These pseudobulk transcriptomes were used to calculate gene–gene correlations. For module analysis, raw reads were analysed with the cNMF package28 (run with default parameters, except ‘total modules’) to yield gene modules and their usage across cells. Factorization approximates the observed dataset as the product of two smaller, meaningful matrices: (i) a gene–module matrix (a weight value for each gene in each module); and (ii) a cell–module matrix (expression values of each module in each cell) (Fig. 2c). The weight values of the gene–module matrix can be used as scores that identify the genes that dominate each module; top-scoring genes from the same module have coordinated expression patterns and are likely to be part of the same molecular processes. 
This approach adapts to the rich but noisy data inherent in single-cell analysis, and reveals patterns of coordinated gene expression that might not be apparent from linear correlation analysis. For example, it allows for genes to be in multiple, overlapping modules, which is likely to better represent how genes in a highly branched metabolism may be expressed. The ‘total modules’ parameter was scanned from k = 50 to k = 400 to determine the sensitivity of the results on this parameter (Supplementary Fig. 1).
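The factorization idea described above can be illustrated with a toy example. This is a minimal sketch of plain non-negative matrix factorization with multiplicative updates, not the consensus procedure of the cNMF package; the count matrix, the function names and the choice of k = 2 are assumptions made purely for illustration. Here `usage` plays the role of the cell–module matrix and `weights` holds the gene weights for each module (stored as modules × genes).

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(x, usage, weights):
    """Frobenius norm of x - usage @ weights."""
    approx = matmul(usage, weights)
    return sum((xv - av) ** 2
               for xr, ar in zip(x, approx)
               for xv, av in zip(xr, ar)) ** 0.5

def scale(m, num, den, eps):
    """Element-wise multiplicative update: m * num / (den + eps)."""
    return [[v * n / (d + eps) for v, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, num, den)]

def nmf(x, k, n_iter=200, seed=0, eps=1e-9):
    """Factor x (cells x genes) into usage (cells x k) and weights (k x genes)."""
    rng = random.Random(seed)
    usage = [[rng.random() + 0.1 for _ in range(k)] for _ in x]
    weights = [[rng.random() + 0.1 for _ in x[0]] for _ in range(k)]
    for _ in range(n_iter):
        ut = transpose(usage)
        weights = scale(weights, matmul(ut, x),
                        matmul(matmul(ut, usage), weights), eps)
        wt = transpose(weights)
        usage = scale(usage, matmul(x, wt),
                      matmul(usage, matmul(weights, wt)), eps)
    return usage, weights

# Hypothetical counts: cells 0-1 express genes 0-1, cells 2-3 express genes 2-3.
counts = [[5, 4, 0, 0],
          [10, 8, 0, 0],
          [0, 0, 3, 6],
          [0, 0, 1, 2]]

usage, weights = nmf(counts, k=2)
print(frobenius_error(counts, usage, weights))
```

In this toy case the top-weighted genes of each module are the pair of genes that vary together, which is the property the module scores rely on.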

Bulk RNA-seq analysis

Raw fastq files from six previous studies5,59,60,61,62,63 were downloaded from NCBI (PRJNA493167, PRJNA251671, PRJNA733140, PRJNA427840, PRJNA497542, PRJNA499080 and PRJNA864083), cleaned with Trimmomatic53 and aligned to the T. chinensis genome5 (STAR map64). Gene–gene correlation was calculated with numpy. Mutual rank (mr), used to calculate the gene linkage maps (Fig. 1d), is defined as:

$${\rm{mr}}_{ij}=\sqrt{{\rm{rank}}_{ij}\times {\rm{rank}}_{ji}},$$

where ${\rm{rank}}_{ij}$ indicates the Pearson correlation rank of gene i to gene j.
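As an illustration of the mutual rank definition, the sketch below computes it for a handful of genes in pure Python. It assumes that the rank of gene j for gene i is the position of j when all other genes are sorted by decreasing Pearson correlation with i (rank 1 is the most correlated partner); the gene names and expression values are hypothetical, and ties are broken arbitrarily by sort order.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def mutual_rank(expr):
    """expr maps gene -> expression values across samples; returns {(i, j): mr}."""
    genes = list(expr)
    corr = {(g, h): pearson(expr[g], expr[h])
            for g in genes for h in genes if g != h}
    rank = {}
    for g in genes:
        # Assumption: rank 1 is the partner most strongly correlated with g.
        partners = sorted((h for h in genes if h != g),
                          key=lambda h: corr[(g, h)], reverse=True)
        for r, h in enumerate(partners, start=1):
            rank[(g, h)] = r
    return {(g, h): sqrt(rank[(g, h)] * rank[(h, g)])
            for g in genes for h in genes if g != h}

# Hypothetical expression values across five samples.
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 11],
    "geneC": [5, 3, 4, 1, 2],
    "geneD": [1, 3, 2, 5, 4],
}
mr = mutual_rank(expr)
print(mr[("geneA", "geneB")])  # 1.0: each gene is the other's top partner
```

Because the measure multiplies the two directional ranks, it is symmetric and penalizes pairs in which only one gene ranks the other highly.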

Cloning of Taxus genes

The cloning of cytosolic diterpenoid boost genes (tHMGR and GGPPS), cytosolic TDS1 and TDS2, T5αH, TAT, T10βH, DBAT, T13αH and TAX19 genes has been described previously18,65. Candidate genes were amplified from T. media gDNA or cDNA (generated with SuperScript IV, Thermo Fisher Scientific) by PCR (PrimeStar, Takara Bio R045B, primers in Supplementary Table 13), and the PCR products were ligated with AgeI- and XhoI-linearized (New England Biolabs) pEAQ-HT vector66 using HiFi DNA assembly mix (New England Biolabs). Gene annotations used for cloning were taken from the T. chinensis genome5 by default, but were BLAST-searched against the T. media genome (NCBI PRJNA1136025) to determine whether alternative gene models were available. Constructs were transformed into 10-beta competent E. coli cells (New England Biolabs). Plasmid DNA was isolated using the QIAprep Spin Miniprep kit (QIAGEN) and the sequence was verified by whole-plasmid sequencing (Plasmidsaurus).

Transient expression of Taxus genes in N. benthamiana by Agrobacterium-mediated infiltration

pEAQ-HT plasmids containing the Taxus gene were transformed into Agrobacterium tumefaciens (strain GV3101) cells using the freeze–thaw method. Transformed cells were grown on bacteria screening medium 523-agar (Phytotech Labs) plates containing kanamycin and gentamicin (50 μg ml−1 and 30 μg ml−1, respectively; same for the 523 medium below), at 30 °C for two days. Single colonies were then picked and grown overnight at 30 °C in 523-kanamycin–gentamicin liquid medium. The overnight cultures were used to make dimethyl sulfoxide (DMSO) stocks (7% DMSO) for long-term storage in a −80 °C freezer. For routine N. benthamiana infiltration experiments, individual Agrobacterium DMSO stocks were streaked out on 523-agar containing kanamycin and gentamicin and grown for around one to two days at 30 °C. Patches of cells were scraped off from individual plates using 10-μl inoculation loops and resuspended in around 1–2 ml of Agrobacterium induction buffer (10 mM MES pH 5.6, 10 mM MgCl2 and 150 μM acetosyringone; Acros Organics) in individual 2-ml safe-lock tubes (Eppendorf). The suspensions were briefly vortexed to homogeneity and incubated at room temperature for 2 h. The optical density at 600 nm (OD600 nm) of the individual Agrobacterium suspensions was measured, and the final infiltration solution, in which the OD600 nm was 0.2 for each strain (except for TDS, T7AT and T7dA; OD600 nm of 0.6, 0.4 and 0.1, respectively), was prepared by mixing individual strains and diluting with the induction buffer. Leaves of four-week-old N. benthamiana were infiltrated using needleless 1-ml syringes from the abaxial side. Each experiment was tested on leaves 6, 7 and 8 (numbered by counting from the bottom) of the same N. benthamiana plant, as three biological replicates.
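Preparing the final infiltration solution is a dilution calculation: each strain contributes a volume proportional to its target OD600 nm divided by its measured OD600 nm, and induction buffer makes up the remainder. The sketch below shows one way to work this out; the measured OD values, the strain selection and the 10 ml final volume are hypothetical, and only the target values (0.2 by default, 0.6 for TDS) come from the text.

```python
def infiltration_mix(measured_od, target_od, final_volume_ml):
    """Return (volume of each suspension in ml, volume of induction buffer in ml)."""
    volumes = {}
    for strain, od in measured_od.items():
        # C1 * V1 = C2 * V2, solved for V1 (volume of the resuspended culture).
        volumes[strain] = target_od[strain] * final_volume_ml / od
    buffer_ml = final_volume_ml - sum(volumes.values())
    if buffer_ml < 0:
        raise ValueError("suspensions are too dilute to reach the target ODs")
    return volumes, buffer_ml

# Hypothetical measured OD600 values after the 2-h induction.
measured = {"TDS": 3.0, "T5aH": 2.0, "TAT": 4.0}
# Targets from the text: 0.2 per strain, except 0.6 for TDS.
targets = {"TDS": 0.6, "T5aH": 0.2, "TAT": 0.2}

volumes, buffer_ml = infiltration_mix(measured, targets, final_volume_ml=10.0)
print(volumes)    # ml of each suspension to add
print(buffer_ml)  # ml of induction buffer to add
```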

For the reconstitution of pathways that involve TBT, the following modifications were made to the procedure above to increase the production of the desired benzoylated products: N. benthamiana plants were watered with 2 mM benzoic acid in water (buffered to pH 5.6) a day before Agrobacterium infiltration, 1 mM benzoic acid was added to the induction buffer and the pH was adjusted to 5.6 before being used for the resuspension of Agrobacterium and preparation of the final infiltration solution.

Phylogenomic analysis

FoTO1 homologues were identified by scanning the Thousand Plant Transcriptome (1KP)67, RefSeq plants and Uniprot Viridiplantae databases with jackhmmer36 (command: jackhmmer -o tempout.txt -E 1e-5 -N 4). Hits with greater than 40% sequence gaps to the original query were discarded. A phylogenetic tree was generated with the remaining protein sequences with FastTree68.
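The gap filter can be sketched as follows, assuming that each hit is available as an aligned sequence row relative to the query and that the gap percentage is the share of `-` characters in that row. Both assumptions, along with the hit names and sequences, are illustrative and are not taken from the jackhmmer output format.

```python
def gap_fraction(aligned_hit):
    """Fraction of alignment columns in which the hit has a gap character."""
    return aligned_hit.count("-") / len(aligned_hit)

def filter_hits(aligned_hits, max_gap_fraction=0.40):
    """Keep hits whose gap fraction does not exceed the threshold."""
    return {name: seq for name, seq in aligned_hits.items()
            if gap_fraction(seq) <= max_gap_fraction}

# Hypothetical aligned hits (10 alignment columns each).
hits = {
    "hit_1": "MKT-LLV-AG",  # 20% gaps: kept
    "hit_2": "M---L---AG",  # 60% gaps: discarded
    "hit_3": "MK--LL--AG",  # exactly 40% gaps: kept (not greater than 40%)
}
kept = filter_hits(hits)
print(sorted(kept))  # ['hit_1', 'hit_3']
```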

Metabolite extraction of N. benthamiana leaves

Five days after Agrobacterium infiltration, N. benthamiana leaf tissue was collected using a leaf disc cutter 1 cm in diameter and placed inside a 2-ml safe-lock tube (Eppendorf). Each biological replicate consisted of four leaf discs from the same leaf (approximately 40 mg fresh weight). The leaf discs were flash-frozen and lyophilized overnight. Analyses of the more hydrophobic metabolites (for example, compounds 1–6) were done by GC–MS, and analyses of the more hydrophilic metabolites (for example, compounds 4–18) were done by liquid chromatography–mass spectrometry (LC–MS). To extract metabolites, ethyl acetate (ACS reagent grade; J.T. Baker) or 75% acetonitrile (high-performance liquid chromatography (HPLC) grade; Fisher Chemical) in 500 μl water was added to each sample along with one 5-mm stainless steel bead for GC–MS or LC–MS analysis, respectively. The samples were homogenized in a ball mill (Retsch MM 400) at 25 Hz for 2 min. After homogenization, the samples were centrifuged at 18,200g for 10 min. For GC–MS samples, the supernatants were transferred to 50-μl glass inserts, placed in 2-ml vials and analysed by the GC–MS instrument. For LC–MS samples, the supernatants were filtered using 96-well hydrophilic PTFE filters with a pore size of 0.45 μm (Millipore) and analysed by the LC–MS instrument.

GC–MS analysis

GC–MS samples were analysed using an Agilent 7820A gas chromatography system coupled to an Agilent 5977B single quadrupole mass spectrometer. Data were collected with Agilent Enhanced MassHunter and analysed by MassHunter Qualitative Analysis B.07.00. Separation was done using an Agilent VF-5HT column (30 m × 0.25 mm × 0.1 μm) with a constant flow rate of helium of 1 ml per min. The inlet was set at 280 °C in split mode with a 10:1 split ratio. The injection volume was 1 μl. Oven conditions were as follows: start and hold at 130 °C for 2 min, ramp to 250 °C at 8 °C per min, ramp to 310 °C at 10 °C per min and hold at 310 °C for 5 min. The post-run condition was set to 320 °C for 3 min. MS data were collected with a mass range 50–550 m/z and a scan speed of 1,562 u s−1 after a 4-min solvent delay. The MSD transfer line was set to 250 °C, the MS source was set to 230 °C and the MS Quad was set to 150 °C.
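The oven program above implies a fixed run time, which can be worked out from the ramp rates and holds. The following is a small sketch of that arithmetic; the function name and the tuple layout are illustrative choices, and the temperatures, rates and holds are those stated in the text.

```python
def oven_run_minutes(start_c, initial_hold_min, ramps):
    """Total oven time for a program of (target_c, rate_c_per_min, hold_min) ramps."""
    total = initial_hold_min
    current = start_c
    for target_c, rate_c_per_min, hold_min in ramps:
        total += (target_c - current) / rate_c_per_min + hold_min
        current = target_c
    return total

# 130 °C for 2 min, then 8 °C/min to 250 °C, then 10 °C/min to 310 °C, hold 5 min.
ramps = [(250, 8, 0), (310, 10, 5)]
run_min = oven_run_minutes(start_c=130, initial_hold_min=2, ramps=ramps)

print(run_min)      # 28.0 (2 + 15 + 6 + 5)
print(run_min + 3)  # 31.0 including the 3-min post-run at 320 °C
print(run_min - 4)  # 24.0 min of MS acquisition after the 4-min solvent delay
```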

LC–MS analysis

LC–MS samples were analysed on either or both of our two instruments: (1) an Agilent 1260 HPLC system coupled to an Agilent 6520 Q-TOF mass spectrometer or (2) an Agilent 1290 HPLC system coupled to an Agilent 6546 Q-TOF mass spectrometer. Typically, the 6520 system shows better sensitivity for the more hydrophobic metabolites, such as 4–6, whereas the 6546 system works better for the more hydrophilic, highly modified taxanes. Data were collected with Agilent MassHunter Workstation Data Acquisition and analysed by MassHunter Qualitative Analysis 10.0. Separation was done using a Gemini 5-μm NX-C18 110-Å column (2 × 100 mm; Phenomenex) with a mixture of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B) at a constant flow rate of 400 μl per min at room temperature. The injection volume was 2 μl or 1 μl for the 6520 or the 6546 system, respectively. The following gradient of solvent B was used: 3% 0–1 min, 3%–50% 1–2 min, 50%–97% 2–12 min, 97% 12–14 min, 97%–3% 14–14.5 min and 3% 14.5–21 min (6520 system) and 3% 0–1 min, 3%–50% 1–5 min, 50%–97% 5–10 min, 97% 10–12 min, 97%–3% 12–12.5 min and 3% 12.5–15 min (6546 system). MS data were collected using electrospray ionization (ESI) in positive mode with a mass range of 50–1,200 m/z and a rate of one spectrum per second (6520 system), or Dual AJS ESI in positive mode with a mass range of 100–1,700 m/z and a rate of one spectrum per second (6546 system). The ionization source was set as follows: 325 °C gas temperature, 10 l min−1 drying gas, 35 psi nebulizer, 3,500 V VCap, 150 V fragmentor, 65 V skimmer and 750 V octupole 1 RF Vpp (6520 system), or 325 °C gas temperature, 10 l min−1 drying gas, 20 psi nebulizer, 3,500 V VCap, 150 V fragmentor, 65 V skimmer and 750 V octupole 1 RF Vpp (6546 system). MS/MS fragmentations were generated using [M+Na]+ as the precursor ion and fragmented with a collision energy of 30 eV unless otherwise stated.
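The two solvent B gradients can be written as breakpoint tables, which makes it easy to look up the mobile-phase composition at any retention time. This sketch assumes linear ramps between the stated breakpoints; the table and function names are illustrative.

```python
# (time in min, % solvent B) breakpoints transcribed from the text.
GRADIENT_6520 = [(0, 3), (1, 3), (2, 50), (12, 97), (14, 97), (14.5, 3), (21, 3)]
GRADIENT_6546 = [(0, 3), (1, 3), (5, 50), (10, 97), (12, 97), (12.5, 3), (15, 3)]

def percent_b(t_min, program):
    """Linearly interpolate % solvent B at time t_min (assumes linear ramps)."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

print(percent_b(3.0, GRADIENT_6546))  # 26.5
print(percent_b(7.0, GRADIENT_6520))  # 73.5
```

Comparing the two tables at the same retention time shows how differently the instruments elute a given compound, which matters when matching peaks across the 6520 and 6546 systems.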

Quantification of baccatin III (16)

The samples in Fig. 5i were analysed by an Agilent 1290 HPLC system coupled to an Agilent 6470 triple quadrupole (QQQ) mass spectrometer to accurately quantify the concentration of baccatin III. Data were collected with Agilent MassHunter Workstation Data Acquisition and analysed by MassHunter Quantitative Analysis 10.1 and Microsoft Excel. Separation was done using a ZORBAX RRHD Eclipse Plus C18 Column (2.1 × 50 mm, 1.8 µm; Agilent) with a mixture of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B) at a constant flow rate of 600 μl per min at 30 °C. The injection volume was 0.5 μl. The following gradient of solvent B was used: 30% 0–1 min, 30%–100% 1–5 min, 100% 5–6.5 min, 100%–30% 6.5–7 min and 30% 7–8 min. MS data were collected using AJS ESI in positive mode. Multiple reaction monitoring was used to monitor the 609.2 to 549.2 ion transition at a collision energy of 24 eV as the quantifier, and the 609.2 to 427.1 ion transition at a collision energy of 32 eV as the qualifier. The ionization source was set as follows: 250 °C gas temperature, 12 l min−1 drying gas, 25 psi nebulizer, 300 °C sheath gas temperature, 12 l min−1 sheath gas flow, 3,500 V VCap, 0 V nozzle voltage.

Extraction and purification of taxanes from N. benthamiana

Nicotiana benthamiana plants were infiltrated with the combinations of biosynthetic genes shown in Supplementary Table 12 for the purification of taxusin (6), taxusin (6’), 1β-hydroxytaxusin (6-O1) and 15-hydroxy-11(15→1)abeo-taxusin (6-O2). Lyophilized N. benthamiana materials were cut into small pieces and extracted with 1 l ethyl acetate (ACS reagent grade; J.T. Baker) in a 2-l flask for 48 h at room temperature with constant stirring. Extracts were filtered using vacuum filtration and dried using rotary evaporation. Two rounds of chromatography were used to isolate compounds of interest. The chromatography conditions for each compound are summarized in Supplementary Table 12. In brief, the first chromatography was performed using a 7-cm-diameter column loaded with P60 silica gel (SiliCycle) and using hexane (HPLC grade; VWR) and ethyl acetate as the mobile phases. The second chromatography was performed on an automated Biotage Selekt system with a Biotage Sfär C18 Duo 6-g column using Milli-Q water and acetonitrile as the mobile phases. Fractions were analysed by LC–MS to identify those containing the compound of interest. Desired fractions were pooled and dried using rotary evaporation (first round) or lyophilization (second round). Purified products were analysed by NMR.

NMR analysis of purified compound

CDCl3 (Acros Organics) was used as the solvent for all NMR samples. 1H, 13C and 2D-NMR spectra were acquired on a Varian Inova 600-MHz or a Bruker NEO 500-MHz spectrometer at room temperature using VNMRJ 4.2, and the data were processed, analysed and visualized in MestReNova v.14.3.1-31739. Chemical shifts are reported in ppm downfield from Me4Si, using the residual solvent (CDCl3) peak as an internal standard (7.26 ppm for the 1H and 77.16 ppm for the 13C chemical shift).

Taxane feeding experiments

Taxus genes were expressed in N. benthamiana leaves using the Agrobacterium-mediated infiltration method described above. Three days after Agrobacterium infiltration, taxanes (purified 3O2A (4), taxusin (6), 10-deacetylbaccatin III or 9-dihydro-13-acetylbaccatin III (13); unless otherwise specified, a 100-μM solution prepared by diluting a 10 mM DMSO stock was used) were fed into the leaves. Approximately 150 μl of solution was used per leaf to yield a circle with a diameter of around 3 cm, which was marked for reference. After 18–24 h, four leaf discs were collected within the marked area with a 1-cm diameter cutter, and LC–MS samples were prepared following the methods described above.

Construction of phylogenetic trees

Sequences from the T. chinensis genome were selected using Pfam to identify 672 P450s (PF00067), 218 2-ODDs (PF03171) and 195 acyltransferases (PF02458). P450s were further filtered to those longer than 300 amino acids (467 P450s). Multiple sequence alignment for each family was performed using Clustal Omega, and the phylogenetic trees were constructed using the neighbour-joining method in Geneious Prime (v.2024.0.4) with 100 bootstrap replicates for initial analysis. Arabidopsis thaliana cinnamate 4-hydroxylase (AtC4H, accession NP_180607.1), A. thaliana gibberellin 20-oxidase1 (AtGA20ox1, accession NP_194272.1), and Hordeum vulgare agmatine coumaroyltransferase (HvACT, accession AAO73071.1) were used as outgroups for the P450, 2-ODD and acyltransferase families, respectively. All analyses were performed with default settings unless otherwise specified. Representative genes from major clades of the initial analyses and the Taxol biosynthetic genes were then selected to construct the final phylogenetic trees (Extended Data Fig. 9) using the neighbour-joining method with 1,000 bootstrap replicates.

Purification of proteins and binding assays

All proteins were purified from standard pET28a vectors expressed in BL21(DE3) cells (New England Biolabs, C2527H). FoTO1 and FoTO1(ΔCterm) were purified as C-terminal fusions: His6-3×Flag-TEV-mTurq2-GSG-FoTO1. T5αH and TDS were purified with N-terminal purification tags (His6-3×Flag-TEV-enzyme) with N-terminal signal peptides removed (T5αH, 47 amino acids removed; TDS2, 60 amino acids removed). Proteins were purified as previously described69, with post-lysis steps done at 4 °C. In brief, 1 l of cells was grown to an OD600 nm of 0.4–0.5, induced with 0.3 mM IPTG and expressed for 16 h overnight at 18 °C. Cell pellets were lysed by sonication in lysis buffer (0.5 M NaCl, 20 mM HEPES pH 8.0, 0.1% Triton X-100, 1 mg ml−1 lysozyme, HALT protease cocktail (Thermo Fisher Scientific) and 1 μl ml−1 DNAse I (New England Biolabs)), and the lysate was clarified by centrifugation for 1 h at 8,000g. Proteins were purified on pre-equilibrated Ni-NTA beads (New England Biolabs) and exchanged into a protein storage buffer (10 mM HEPES-KOH pH 8.0, 50 mM KCl, 10% glycerol, 1 mM DTT and 1 mM EDTA). Purified proteins were quantified by Bradford assay, and SDS–PAGE gels were used to verify protein size and correct protein concentration.

For each microscale thermophoresis (MST) experiment, one protein was first labelled with the NanoTemper His-Tag labelling kit (RED-tris-NTA v2, MO-L018) for 30 min at room temperature according to the manufacturer's protocol. MST experiments were performed in PBS with 0.05% Tween-20 with labelled query protein (T5αH- or TDS-labelled) at 100 nM and a titration series of target protein.

Co-IP

Nicotiana benthamiana leaves were harvested four days after infiltration. Leaf tissue was homogenized in liquid nitrogen and resuspended in extraction buffer (50 mM Tris pH 7.5, 150 mM NaCl, 0.6% NP-40, 0.6% CHAPS and 1 mM β-mercaptoethanol)70. Lysates were kept on ice and centrifuged at 20,000g for 10 min at 4 °C. The protein content of the clarified extract was determined by Bradford assay (Abcam, 119216). Ten microlitres of protein-G-coated magnetic beads (Invitrogen, 10003d) were washed twice in binding buffer (50 mM Na2HPO4, 25 mM citric acid, pH 5.0) before a 1-h incubation at room temperature under agitation with 1 μl anti-V5 antibody. Lysates were incubated with the indicated compounds for 15 min under agitation. Antibody-bound beads were then washed twice in extraction buffer and incubated for 15 min under agitation at room temperature with lysate corresponding to 100 μg of total protein content (approximately 40 μl). After incubation, bead complexes were washed three times in extraction buffer and mixed with LDS sample buffer (Invitrogen, NP0007) for subsequent analysis by immunoblotting.

Immunoblotting

Lysates were separated for 1.5 h at 80 V on a NuPAGE gel (Invitrogen, NP0321) before transfer onto a PVDF membrane using a Bio-Rad Trans-Blot Turbo Transfer System (Bio-Rad, 1704150). Immunoblots were incubated with the indicated antibodies (anti-V5 at 1:1,000 and anti-HA-HRP at 1:2,500) for 3 h at room temperature under agitation. Blots were subsequently washed and incubated with HRP–protein G (Genscript, M00090) for 1 h, then imaged on the iBright FL1500 Imaging System (Invitrogen, a44241). The extraction buffer was adapted from a previously published procedure70.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
