Pavlopoulos, Georgios A. and Baltoumas, Fotis A. and Liu, Sirui and Selvitopi, Oguz and Camargo, Antonio Pedro and Nayfach, Stephen and Azad, Ariful and Roux, Simon and Call, Lee and Ivanova, Natalia N. and Chen, I. Min and Paez-Espino, David and Karatzas, Evangelos and Acinas, Silvia G. and Ahlgren, Nathan and Attwood, Graeme and Baldrian, Petr and Berry, Timothy and Bhatnagar, Jennifer M. and Bhaya, Devaki and Bidle, Kay D. and Blanchard, Jeffrey L. and Boyd, Eric S. and Bowen, Jennifer L. and Bowman, Jeff and Brawley, Susan H. and Brodie, Eoin L. and Brune, Andreas and Bryant, Donald A. and Buchan, Alison and Cadillo-Quiroz, Hinsby and Campbell, Barbara J. and Cavicchioli, Ricardo and Chuckran, Peter F. and Coleman, Maureen and Crowe, Sean and Colman, Daniel R. and Currie, Cameron R. and Dangl, Jeff and Delherbe, Nathalie and Denef, Vincent J. and Dijkstra, Paul and Distel, Daniel D. and Eloe-Fadrosh, Emiley and Fisher, Kirsten and Francis, Christopher and Garoutte, Aaron and Gaudin, Amelie and Gerwick, Lena and Godoy-Vitorino, Filipa and Guerra, Peter and Guo, Jiarong and Habteselassie, Mussie Y. and Hallam, Steven J. and Hatzenpichler, Roland and Hentschel, Ute and Hess, Matthias and Hirsch, Ann M. and Hug, Laura A. and Hultman, Jenni and Hunt, Dana E. and Huntemann, Marcel and Inskeep, William P. and James, Timothy Y. and Jansson, Janet and Johnston, Eric R. and Kalyuzhnaya, Marina and Kelly, Charlene N. and Kelly, Robert M. and Klassen, Jonathan L. and Nüsslein, Klaus and Kostka, Joel E. and Lindow, Steven and Lilleskov, Erik and Lynes, Mackenzie and Mackelprang, Rachel and Martin, Francis M. and Mason, Olivia U. and McKay, R. Michael and McMahon, Katherine and Mead, David A. and Medina, Monica and Meredith, Laura K. and Mock, Thomas and Mohn, William W. and Moran, Mary Ann and Murray, Alison and Neufeld, Josh D. and Neumann, Rebecca and Norton, Jeanette M. and Partida-Martinez, Laila P. and Pietrasiak, Nicole and Pelletier, Dale and Reddy, T. B. K. and Reese, Brandi Kiel and Reichart, Nicholas J. and Reiss, Rebecca and Saito, Mak A. and Schachtman, Daniel P. and Seshadri, Rekha and Shade, Ashley and Sherman, David and Simister, Rachel and Simon, Holly and Stegen, James and Stepanauskas, Ramunas and Sullivan, Matthew and Sumner, Dawn Y. and Teeling, Hanno and Thamatrakoln, Kimberlee and Treseder, Kathleen and Tringe, Susannah and Vaishampayan, Parag and Valentine, David L. and Waldo, Nicholas B. and Waldrop, Mark P. and Walsh, David A. and Ward, David M. and Wilkins, Michael and Whitman, Thea and Woolet, Jamie and Woyke, Tanja and Iliopoulos, Ioannis and Konstantinidis, Konstantinos and Tiedje, James M. and Pett-Ridge, Jennifer and Baker, David and Visel, Axel and Ouzounis, Christos A. and Ovchinnikov, Sergey and Buluç, Aydin and Kyrpides, Nikos C. (2023) Unraveling the functional dark matter through global metagenomics. Nature, 622 (7983). pp. 594-602. ISSN 0028-0836
Abstract
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
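As a rough illustration of the filter-then-cluster idea in the abstract, the sketch below keeps proteins longer than a length threshold, links them through pairwise similarity hits, and reports only the groups with more than a minimum number of members. It is a toy under stated assumptions, not the authors' pipeline: connected components (via union-find) stand in for the massively parallel graph-based clustering, and the function name `cluster_proteins`, its parameters, and the example identifiers are all hypothetical.

```python
from collections import defaultdict


def cluster_proteins(lengths, similar_pairs, min_length=35, min_members=100):
    """Group proteins into clusters from pairwise similarity hits.

    lengths: dict mapping a protein id to its length in amino acids.
    similar_pairs: iterable of (id_a, id_b) pairs judged similar.
    Keeps proteins longer than min_length and returns clusters with
    more than min_members members, matching the abstract's thresholds.
    Connected components are an assumption made for this sketch only.
    """
    kept = {pid for pid, n in lengths.items() if n > min_length}
    parent = {pid: pid for pid in kept}

    def find(x):
        # Walk to the root, halving the path on the way up.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        # Ignore hits that involve a protein removed by the length filter.
        if a in kept and b in kept:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(list)
    for pid in kept:
        groups[find(pid)].append(pid)

    clusters = [sorted(g) for g in groups.values() if len(g) > min_members]
    return sorted(clusters, key=lambda g: g[0])


# Toy input (hypothetical ids); min_members is lowered so the example is small.
toy_lengths = {"p1": 120, "p2": 80, "p3": 60, "p4": 30, "p5": 200, "p6": 90}
toy_pairs = [("p1", "p2"), ("p2", "p3"), ("p3", "p4"), ("p5", "p6")]
toy_clusters = cluster_proteins(toy_lengths, toy_pairs, min_members=2)
```

In this toy input, `p4` is dropped by the length filter and the two-member group of `p5` and `p6` falls below the size cut-off, so a single three-member cluster remains.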
Item Type: | Article |
---|---|
Subjects: | GO for ARCHIVE > Multidisciplinary |
Depositing User: | Unnamed user with email support@goforarchive.com |
Date Deposited: | 10 Nov 2023 06:07 |
Last Modified: | 10 Nov 2023 06:07 |
URI: | http://eprints.go4mailburst.com/id/eprint/1712 |