Discovery of FoTO1 and Taxol genes enables biosynthesis of baccatin III
Abstract
Plants make complex and potent therapeutic molecules [1,2], but sourcing these molecules from natural producers or through chemical synthesis is difficult, which limits their use in the clinic. A prominent example is the anti-cancer therapeutic paclitaxel (sold under the brand name Taxol), which is derived from yew trees (Taxus species) [3]. Identifying the full paclitaxel biosynthetic pathway would enable heterologous production of the drug, but this has yet to be achieved despite half a century of research [4]. Within Taxus’ large, enzyme-rich genome [5], we suspected that the paclitaxel pathway would be difficult to resolve using conventional RNA-sequencing and co-expression analyses. Here, to improve the resolution of transcriptional analysis for pathway identification, we developed a strategy we term multiplexed perturbation × single nuclei (mpXsn) to transcriptionally profile cell states spanning tissues, cell types, developmental stages and elicitation conditions. Our data show that paclitaxel biosynthetic genes segregate into distinct expression modules that suggest consecutive subpathways. These modules resolved seven new genes, allowing a de novo 17-gene biosynthesis and isolation of baccatin III, the industrial precursor to Taxol, in Nicotiana benthamiana leaves, at levels comparable with the natural abundance in Taxus needles. Notably, we found that a nuclear transport factor 2 (NTF2)-like protein, FoTO1, is crucial for promoting the formation of the desired product during the first oxidation, resolving a long-standing bottleneck in paclitaxel pathway reconstitution. Together with a new β-phenylalanine-CoA ligase, the eight genes discovered here enable the de novo biosynthesis of 3′-N-debenzoyl-2′-deoxypaclitaxel. More broadly, we establish a generalizable approach to efficiently scale the power of co-expression analysis to match the complexity of large, uncharacterized genomes, facilitating the discovery of high-value gene sets.
An approach that combines single-nucleus RNA sequencing and multiplexed perturbation identifies genes that enable the biosynthesis of direct precursors of the anti-cancer drug Taxol, whose current production involves a laborious extraction process from yew trees.




