The genomics revolution has thus far supplied us with over 200,000 publicly available microbial genome sequences. Mining this vast resource for natural product biosynthetic gene clusters whose chemical products could hold future value to humankind is one avenue to fill the drug discovery pipeline. This process is relatively trivial for non-ribosomal peptide and polyketide natural products, owing to the large size of the requisite biosynthetic genes and the genetic commonality within each compound class. Considerably more challenging is the bioinformatic identification of ribosomally synthesized and post-translationally modified peptides, the so-called RiPP natural products. There are now over 30 distinct structural classes of RiPPs, and they often surpass the products of non-ribosomal peptide synthetases (NRPS) and polyketide synthases (PKS) in structural complexity, breadth of bioactivities, and phylogenetic distribution. RiPP biosynthetic pathways and the resulting natural products are a largely untapped source of new chemical matter created in ways that challenge existing biochemical paradigms.
This lecture will cover our motivations, initial forays, and ongoing work on how big-data genomics, high-throughput mining, and reactivity-based screening can be brought to bear in cataloguing all observable members of a desired RiPP class. Retrospective analysis of these specific datasets naturally leads to the generation of new biosynthetic and mechanistic hypotheses while also accelerating the discovery of new members of that RiPP class. By focusing on divergence within protein sequence-function space, one can even discover first-in-class RiPPs using bioinformatic approaches, contrary to popular belief. With the prevalence of RiPP biosynthesis starting to come into focus, early estimates suggest this molecular class will far outnumber other classes of specialized microbial metabolites.