Dr. Levi Waldron, a professor at the CUNY School of Public Health, has co-authored two published bioinformatics papers. These papers present novel databases and bioinformatic methods that enable effective analyses of major cancer and human microbiome datasets by a much broader range of researchers than could previously use these publicly generated resources. The methods are implemented as components of the free R and Bioconductor software for statistical analysis of high-throughput biological data. The work was published in the journals Cancer Research and Nature Methods. For the paper in Cancer Research, his co-authors included alumni Mr. Marcel Ramos, Mr. Lucas Schiffer, Ms. Carmen Rodriguez, Ms. Tiffany Chan, and Mr. Hanish Kodali, as well as colleagues from around the world. For the paper in Nature Methods, his co-authors included Dr. Jennifer B. Dowd, also a professor at the CUNY School of Public Health, and alumni Mr. Lucas Schiffer, Ms. Audrey Rensen, Ms. Valerie Obenchain, Mr. Faizan Malik, and Mr. Marcel Ramos, as well as other colleagues from around the world.
[Photo: Dr. Levi Waldron]
The first of these papers, published in Cancer Research, presents a novel data structure for representing and analyzing multi-omics experiments: a biological analysis approach utilizing multiple types of observations, such as DNA mutations and abundance of RNA and proteins, in the same biological specimens. These kinds of experiments generate comprehensive molecular portraits of cancer tumors and other biological tissues but can be extremely complex to analyze. The published method introduces a network representation linking each observation to its patient and associated clinical data, providing an integrative representation for any number of heterogeneous kinds of measurements. This harmonized representation provides researchers and other methods developers with a simpler interface for previously complicated and error-prone analysis procedures.
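The idea of linking every observation back to its patient can be illustrated with a small sketch. The code below is not the published R/Bioconductor implementation; it is a toy Python illustration, and the class name `MultiOmicsContainer`, its methods, and all patient and sample identifiers and values are hypothetical, invented only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MultiOmicsContainer:
    """Toy container (hypothetical): clinical data per patient plus several
    assays, each linked to patients through a shared sample map."""

    clinical: dict                                   # patient_id -> clinical fields
    assays: dict = field(default_factory=dict)       # assay name -> {sample_id: value}
    sample_map: dict = field(default_factory=dict)   # (assay, sample_id) -> patient_id

    def add_assay(self, name, measurements, sample_to_patient):
        """Register one assay and link each of its samples to a known patient."""
        for sample_id in measurements:
            patient = sample_to_patient[sample_id]
            if patient not in self.clinical:
                raise KeyError(f"unknown patient {patient!r} for sample {sample_id!r}")
            self.sample_map[(name, sample_id)] = patient
        self.assays[name] = dict(measurements)

    def for_patient(self, patient_id):
        """Collect the clinical record and every measurement linked to one patient."""
        result = {"clinical": self.clinical[patient_id], "assays": {}}
        for (assay, sample_id), patient in self.sample_map.items():
            if patient == patient_id:
                value = self.assays[assay][sample_id]
                result["assays"].setdefault(assay, {})[sample_id] = value
        return result

    def complete_patients(self):
        """Return patients that have at least one sample in every assay."""
        seen = {}
        for (assay, _sample), patient in self.sample_map.items():
            seen.setdefault(patient, set()).add(assay)
        return sorted(p for p, names in seen.items() if names == set(self.assays))


# Invented toy data: two patients, two assays, one patient with replicate RNA samples.
moc = MultiOmicsContainer(clinical={"P1": {"age": 61}, "P2": {"age": 48}})
moc.add_assay(
    "mutations",
    {"P1-tumor": 12, "P2-tumor": 40},
    {"P1-tumor": "P1", "P2-tumor": "P2"},
)
moc.add_assay(
    "rna",
    {"P1-rna-a": 3.2, "P1-rna-b": 3.0},
    {"P1-rna-a": "P1", "P1-rna-b": "P1"},
)
print(moc.complete_patients())
print(moc.for_patient("P1"))
```

Because every assay goes through the same sample map, questions such as "which patients have data in all assays?" become a single lookup instead of a manual cross-referencing exercise, even when a patient contributes several samples to one assay and none to another.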
The method and its software implementation are applicable across numerous diseases and data types. The team integrated 12 types of molecular data with clinical and pathological information from over 11,000 patients with 33 different cancer types from The Cancer Genome Atlas (TCGA), a nationwide project of the National Cancer Institute, and made these integrated data publicly available. Whereas other software has provided downloading capabilities for these data, or integrated small subsets of them, this work represents the first comprehensive integration of the TCGA data. The authors demonstrate how previously laborious analyses, such as correlating the rates of DNA copy number alterations with the rates of somatic mutation in breast and colorectal cancer, can be accomplished in just a few lines of code.
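To give a sense of why a patient-linked representation shortens this kind of analysis, here is a minimal Python sketch of correlating two per-patient rates. It is not the authors' code and does not use TCGA data: the function names `pearson` and `correlate_by_patient` are hypothetical, and the numbers are invented toy values with no biological meaning.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlate_by_patient(cna_rate, mutation_rate):
    """Match two per-patient measurements by patient ID, then correlate them.

    Patients missing from either measurement are dropped before correlating.
    """
    shared = sorted(set(cna_rate) & set(mutation_rate))
    r = pearson([cna_rate[p] for p in shared], [mutation_rate[p] for p in shared])
    return shared, r


# Invented toy values keyed by patient ID (not real data).
cna_rate = {"P1": 0.10, "P2": 0.30, "P3": 0.50, "P4": 0.20}
mutation_rate = {"P1": 4.0, "P2": 3.0, "P3": 2.0, "P5": 9.0}

shared_patients, r = correlate_by_patient(cna_rate, mutation_rate)
print(shared_patients, round(r, 3))
```

Once both measurements are keyed by patient, the matching step reduces to a set intersection, which is the bookkeeping that is laborious when each data type arrives with its own sample identifiers.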
Discussing the significance of the studies, Dr. Waldron explains, “Our hope is that patients who donate their specimens and their DNA to help find preventions and cures for their disease, will see the importance of their contributions as newly empowered researchers from around the world are able to set to work turning their data into discoveries.”