Elaine Frederick

The School of Medicine introduced the new Yale Center for Biomedical Data Science on Wednesday, an initiative that will support and enhance data science research in biology and biomedicine across the University.

As biomedical data has grown rapidly in importance over the past decade, driven in particular by innovations in large-scale genomic sequencing, the Center will serve as a collaborative hub for the emerging biomedical data science community at Yale. The Center, which has yet to announce a formal space, will be led by two faculty directors: Mark Gerstein, a professor of biomedical informatics, computer science and molecular biophysics and biochemistry, and Hongyu Zhao, department chair of biostatistics and professor of genetics and statistics and data science.

“The field of data science has become particularly relevant in the biomedical realm — in genomic sequencing, imaging data, patient record data, data on molecules like nucleotides, proteins and metabolites and wearable personal health devices,” Gerstein said. “All of these create data streams that are growing particularly large, and there’s a lot of value in mining and integrating these different data streams.”

According to Gerstein, the Center was created to provide both a physical and an intellectual space for researchers and faculty across the University who are interested in data science, including those doing large-scale analysis and those generating large amounts of data.

The Center held its “inaugural workshop” at the medical school on Wednesday afternoon to introduce itself and highlight some of its programs. In their kickoff talks, Gerstein and Zhao spoke about the emergence of the field of data science and the significance of establishing the Center.

“It was incredible that the entire auditorium [at The Anlyan Center] was packed, with people standing,” Zhao said. “That’s precisely what we hope to achieve — without the Center, the people in that room would not have all come together in the interest of data science.”

Initial ideas for the Center arose almost five years ago, Gerstein noted, and its conception drew inspiration from Yale’s existing Computational Biology & Bioinformatics program. Faculty members in this program, led by Gerstein and Zhao, hope to support the CBB program through the Center.

“The Center for Biomedical Data Science will work closely with the CBB program and the fellowship program in medical informatics to strengthen Yale’s formal curriculum in bioinformatics, medical informatics and data science,” Zhao said.

The Center has already recruited about 50 faculty members from Yale, including those from the medical school, School of Engineering, Science Hill and West Campus. The faculty directors said that they hope to continue to draw more, from both within and outside the University. The Center is also currently searching for an executive director, who will handle its more practical, day-to-day responsibilities.

According to Zhao, the exact location for the Center has not been decided yet, but 300 George St. — home to the Yale Center for Medical Informatics and several Yale biostatistics laboratories, including Zhao’s lab — is a strong candidate.

At the workshop, Gerstein noted that the growth of data collection in genomics is outpacing even the growth in computing power. The phenomenon exemplifies how biomedicine is transforming the field of data science, he added.

The professors also gave special thanks to Carolyn Slayman, the late former deputy dean for academic and scientific affairs at the School of Medicine, noting that she was instrumental to the creation of the Center.

The workshop next provided a forum for six Yale researchers from different areas of the University to share their current work in large-scale analysis. The speakers included Murat Günel, chair of neurosurgery at the medical school; Harlan Krumholz ’80, a cardiology professor at the medical school and health policy professor at the School of Public Health; and Daniel Spielman ’92, a professor in computer science and statistics and data science.

Following the talks, keynote speaker Xiaole Shirley Liu, a professor of statistics, biostatistics and computational biology at the Harvard T.H. Chan School of Public Health, gave a presentation on her computational biology research in cancer.

In the future, the Center plans to host smaller workshops and courses focused on developing the technical skills needed to work in biomedical data science, according to Zhao. These targeted workshops will be open to faculty members, postdoctoral fellows and students.

A major goal of the Center is to enhance biomedical data science education at the University, Gerstein added. The Center will provide resources and collaborative opportunities for faculty to develop and improve data science courses — for both undergraduates and students in Yale’s graduate schools.

“Although this initiative was started by the medical school, it is meant for the whole campus,” Gerstein said. “We want undergraduates to do research and take courses in biomedical data science — and to be engaged in this center.”

Amy Xiong | amy.xiong@yale.edu
