In the era of precision medicine, enormous amounts of data are being generated from disparate sources including omics, imaging, sensing and beyond. Today, computational scientists need to develop better tools to manage, integrate and share data to make it clinically actionable. The Bioinformatics for Big Data program at the Molecular Medicine Tri-Conference 2019 will showcase how medical centers and the pharma industry are developing such tools and software to meet this goal.

Final Agenda

Monday, March 11

10:30 am Conference Program Registration Open

ANALYTICAL INNOVATION IN PERSONALIZED MEDICINE VIA BIG DATA

11:50 Chairperson’s Opening Remarks

Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston

12:00 pm Identifying Actionable and Druggable Mutations from Cancer Big Data

Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston

In this talk, I will first review the computational methods and tools for detecting cancer driver genes and mutations from cancer big data. Then I will present our informatics and biostatistics approaches for identifying cancer mutations and genes from a large amount of somatic mutation data. Finally, I will present an integrative network-based framework for identifying new druggable targets and anticancer indications from existing drugs.

12:30 ML/AI for Pharma R&D: Analytical Challenges and Opportunities

Ray Liu, PhD, Senior Director, Advanced Analytics and Statistical Consultation, Takeda

1:00 Session Break

1:10 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

2:10 Session Break

INTEGRATING MULTIOMICS FOR KNOWLEDGE GENERATION

2:30 Chairperson’s Remarks

Zhongming Zhao, PhD, Professor and Director, Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston

2:40 Wearables and Wired Health

Mike Snyder, PhD, Stanford W. Ascherman Professor and Chair, Department of Genetics; Director, Center for Genomics and Personalized Medicine, Stanford University

Wearable portable biosensors allow frequent measurement of health-related physiology. We have used smart watches and other devices to detect the onset of infectious diseases such as Lyme disease. We have used continuous glucose monitors to detect individuals with glucose dysregulation. Using these devices, we can build personalized models for monitoring health status and early onset of disease.

3:10 Methods for Functional Microbiome by Shotgun Metagenomic Sequencing

Hongzhe Li, Professor of Biostatistics and Statistics, Director, Center for Statistics in Big Data; Chair, Biostatistics Graduate Program, Vice Chair for Integrative Research, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania

Shotgun metagenomic sequencing provides a powerful tool for studying the functions of microbial communities. Current methods mainly focus on quantifying microbial compositions and gene/pathway compositions. However, such data in combination with metabolomics data provide important information on the functional microbiome. I will present methods for quantifying microbiome growth dynamics and for predicting the metabolic potential of a given microbial community, and show how to use these quantities to study disease and treatment outcome.

3:40 Quantifying Wellness Using Personal, Dense, Dynamic Data Clouds

John Earls, PhD, Senior Software Engineer, Institute for Systems Biology

We used personal dense, dynamic data clouds (pD3 clouds), where thousands of multi-modal, longitudinal measurements quantify individual health status, to estimate the biological age of thousands of individuals. I will present our work on integrating measurements of clinical labs, proteomics, metabolomics, and genetics to better understand and quantify wellness through the lens of aging. I will show how the aging process affects these measures and how deviations of biological age from chronological age are manifested in disease. I will also present results demonstrating the effect of lifestyle

4:10 Presentation to be Announced

4:25 Sponsored Presentation (Opportunity Available)

4:40 Refreshment Break and Transition to Plenary Session


8:00 Plenary Keynote Session

6:00 Grand Opening Reception in the Exhibit Hall with Poster Viewing

7:30 Close of Day

Tuesday, March 12

7:30 am Registration Open and Morning Coffee


8:00 Plenary Keynote Session

9:15 Refreshment Break in the Exhibit Hall with Poster Viewing

PROCESSING AND ANALYZING HETEROGENEOUS DATA

10:15 Chairperson’s Remarks

Hongzhe Li, Professor of Biostatistics and Statistics, Director, Center for Statistics in Big Data; Chair, Biostatistics Graduate Program, Vice Chair for Integrative Research, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania

10:25 Accommodating Heterogeneity in Big Biomedical Data Analyses

Nicholas J. Schork, PhD, Professor, Quantitative Medicine, The Translational Genomics Research Institute

Artificial Intelligence (AI) and Machine Learning (ML) techniques are being used more and more often in the biomedical sciences. Although extremely powerful for developing predictive models, AI and ML have limitations in handling heterogeneity in a data set. We show how the use of models accounting for heterogeneity in large data sets can lead to important insights and complement existing AI and ML methods.

10:55 Unlocking the Data Trapped within the Electronic Health Record Using EMERSE

David Hanauer, MD, Program Director for Clinical Informatics, Michigan Institute for Clinical and Health Research, Associate CMIO, Michigan Medicine

The most detailed clinical data are trapped within free-text clinical notes, and these data are needed when the structured/coded data are inaccurate or incomplete. For over a decade, Michigan Medicine has been developing and using an open source search engine designed for clinical notes, called EMERSE (Electronic Medical Record Search Engine). EMERSE has been used to support a wide range of operational, clinical, and research tasks.

11:25 Heterogeneity in “Dirty Data”: Blessings in Disguise for Accelerating Translational Medicine

Purvesh Khatri, PhD, Associate Professor, Stanford University School of Medicine

This talk will discuss translational bioinformatics approaches to translational medicine in the broad domains of autoimmunity, infection, and inflammation.


11:55 Presentation to be Announced

12:10 pm Sponsored Presentation (Opportunity Available)

12:25 Session Break

12:35 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:35 Refreshment Break in the Exhibit Hall with Poster Viewing

DISCOVERY VIA DATA COLLECTION

2:05 Chairperson’s Remarks

Olga Sazonova, PhD, Product Scientist II, 23andMe

2:10 Natural Language Processing for Clinical and Translational Research

Hua Xu, PhD, Professor, Director, Center for Computational Biomedicine, The University of Texas Health Science Center at Houston, School of Biomedical Informatics

Over the past few decades, growing use of Electronic Health Record (EHR) systems has established large practice-based clinical datasets, which are emerging as valuable resources for clinical and translational research. One of the major challenges of using EHRs for clinical research is that much of the detailed patient information is embedded in narrative reports. This presentation will describe our recent development of natural language processing (NLP) methods and software for extracting phenotypic information from clinical text in EHRs, as well as how such NLP methods and tools can be used to support clinical research, such as drug outcome studies.

2:40 Discover, Predict, Prevent: 23andMe and the Mission of Personalized Healthcare, Part 1

Olga Sazonova, PhD, Product Scientist II, 23andMe

23andMe has built the world’s largest consented, re-contactable database for genetic research, with more than four million consented participants and one billion individual survey responses. 23andMe researchers leverage this unprecedented resource by applying statistical genetics and machine learning to a) uncover novel genetic risk factors for complex disease, b) advance drug discovery, and c) offer personalized predictions of disease risk to all 23andMe customers.

3:10 Discover, Predict, Prevent: 23andMe and the Mission of Personalized Healthcare, Part 2

Sarah Laskey, PhD, Scientist, Health R&D, 23andMe

In addition to characterizing and treating disease, researchers at 23andMe are working toward a future of personalized disease prevention. Researchers are building models to estimate disease risk based on genetics, lifestyle, environment, and behavior, and data collection at 23andMe is expanding its focus to longitudinal surveys and interventional studies, allowing researchers to move from association and correlation to causation — what actions can people take to get results?

3:40 Sponsored Presentation (Opportunity Available)

4:10 St. Patrick’s Day Celebration in the Exhibit Hall with Poster Viewing

5:00 Breakout Discussions in the Exhibit Hall (see website for details)

6:00 Close of Day

Wednesday, March 13

7:30 am Registration Open and Morning Coffee


8:00 Plenary Keynote Session

10:00 Refreshment Break and Poster Competition Winner Announced in the Exhibit Hall

DATA COMMONS PANEL SESSION

Moderator: Matthew Trunnell, Vice President, CIO, Fred Hutchinson Cancer Research Center

10:50 Open and Distributed Approaches to Biomedical Research

Michael Kellen, PhD, CTO, Sage Bionetworks

Today’s biomedical researchers are increasingly challenged to integrate diverse, complex datasets and analysis methods into their work. Sage Bionetworks develops open tools that support distributed, data-driven science, and tests their deployment in a variety of research contexts. These experiences informed the development of Synapse, a cloud-native informatics platform that serves as a data repository for dozens of multi-institutional research consortia working with large-scale genomics, bioimaging, clinical, and mobile health datasets.

11:00 The Data Commons/Data STAGE Initiatives

Stanley Ahalt, PhD, Director, Renaissance Computing Institute; Professor, Department of Computer Science, University of North Carolina, Chapel Hill

This talk describes the NIH Data Commons and NHLBI Data STAGE initiatives. The Data Commons aims to establish a shared, universal virtual space where scientists can work with the digital objects of biomedical research, including data and analytical tools. A closely related project, Data STAGE, aims to use the Data Commons to drive discovery using diagnostic tools, therapeutic options, and prevention strategies to treat heart, lung, blood, and sleep disorders.

11:10 Innovation through Collaboration: New Data-Driven Research Paradigms Being Developed by the Pediatric and Rare Disease Communities

Adam C. Resnick, PhD, Director, Center for Data Driven Discovery in Biomedicine (D3b); Director, Neurosurgical Translational Research, Division of Neurosurgery; Director, Scientific Chair, Children’s Brain Tumor Tissue Consortium in Neurosurgery (CBTTC); Scientific Chair, Pediatric Neuro-Oncology Consortium (PNOC); Alexander B. Wheeler Endowed Chair in Neurosurgical Research, The Children’s Hospital of Philadelphia

11:20 Building Trust in Large Biomedical Data Networks

Lucila Ohno-Machado, MD, PhD, Associate Dean, Informatics and Technology, University of California, San Diego Health

11:30 PANEL DISCUSSION: Definitions, Challenges and Innovations of Data Commons

Moderator: Matthew Trunnell, Vice President, CIO, Fred Hutchinson Cancer Research Center

Panelists: Stanley Ahalt, PhD, Director, Renaissance Computing Institute; Professor, Department of Computer Science, University of North Carolina, Chapel Hill

Adam C. Resnick, PhD, Director, Center for Data Driven Discovery in Biomedicine (D3b); Director, Neurosurgical Translational Research, Division of Neurosurgery; Director, Scientific Chair, Children’s Brain Tumor Tissue Consortium in Neurosurgery (CBTTC); Scientific Chair, Pediatric Neuro-Oncology Consortium (PNOC); Alexander B. Wheeler Endowed Chair in Neurosurgical Research, The Children’s Hospital of Philadelphia

Lucila Ohno-Machado, MD, PhD, Associate Dean, Informatics and Technology, University of California, San Diego Health

Michael Kellen, PhD, CTO, Sage Bionetworks

  • What is a data commons and what are the common challenges in building and maintaining data commons?
  • Why should you organize your data into a commons?
  • NIH data commons pilot phase updates and future directions
  • The role of data commons in promoting open access and open science
  • Technology innovations

12:30 pm Session Break

12:40 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:10 Refreshment Break in the Exhibit Hall and Last Chance for Poster Viewing

MACHINE LEARNING TO MAKE BIG DATA CLINICALLY ACTIONABLE

1:50 Chairperson’s Remarks

Hua Xu, PhD, Professor, Director, Center for Computational Biomedicine, The University of Texas Health Science Center at Houston, School of Biomedical Informatics

2:00 Machine Learning for Data Driven Decision Making of Clinical Trials

Kevin Hua, PhD, Senior Manager, AI Machine Learning Development, Digital Health Intelligence Group, Bayer

Clinical trials are expensive business expenditures. Advances in AI/machine learning and data mining technology and availability of data make data-driven decision making possible in drug development. We would like to present a case study where wearable devices and deep learning models are used to help clinical scientists make faster and more accurate decisions during clinical trials.

2:30 Informatics Approaches to Reducing the Sanger Burden in Clinical NGS Laboratories

Matthew Lebo, PhD, FACMG, Director, Bioinformatics, Partners Personalized Medicine; Instructor, Pathology, Brigham and Women’s and Harvard Medical School

Recent work has highlighted the accuracy and completeness of NGS such that these additional assays may not be required, especially in the realm of orthogonal confirmation of variants. However, many of these studies have been underpowered to accurately define thresholds for ensuring high confidence in NGS variant calling. In this talk, we’ll discuss algorithmic and machine learning approaches to tackle this problem, demonstrating the ability to dramatically reduce, but crucially not eliminate, the burden of orthogonal confirmation in germline NGS assays.

3:00 From Pixels to Phenotypes: Analysis of Cellular Images with Multi-Scale Convolutional Neural Networks

William J. Godinez, PhD, Research Investigator, Novartis Institutes for BioMedical Research (NIBR)

Large-scale cellular imaging and phenotyping is a widely adopted strategy for understanding biological systems and chemical perturbations. Quantitative analysis of cellular images for identifying phenotypic changes is a key challenge within this strategy, and has recently seen promising progress with approaches based on deep learning. In this talk we describe our approaches based on deep multi-scale convolutional neural networks for phenotyping cellular images. We discuss supervised as well as unsupervised learning strategies, with the latter requiring no phenotypic labels for training. We present an example application based on images of E. coli bacteria to show how we use machine learning to predict the binding preferences of antibiotics directly from microscopy image data.

 

3:30 Session Break

PERSONALIZED MEDICINE STRATEGIES FOR CLINICALLY ACTIONABLE DATA

3:40 Chairperson’s Remarks

Funda Meric-Bernstam, MD, Chair, Executive, Investigational Cancer Therapeutics, MD Anderson Cancer Center

3:45 Precision Oncology Decision Support

Funda Meric-Bernstam, MD, Chair, Executive, Investigational Cancer Therapeutics, MD Anderson Cancer Center

Molecular profiling is increasingly utilized in the management of cancer patients. Decision support for precision oncology includes guidance on optimal testing and interpretation of test results, including the functional impact of genomic alterations and their therapeutic implications. We will review strategies for decision support and resources for identifying optimal approved or investigational therapies.

4:15 High-Performance Integrated Virtual Environment (HIVE) and BioCompute Objects for Regulatory Sciences

Raja Mazumder, PhD, Associate Professor, Biochemistry and Molecular Medicine, George Washington University

Advances in sequencing technologies combined with extensive systems-level -omics analysis have contributed to a wealth of data that requires sophisticated bioinformatic analysis pipelines. Accurate communication describing these pipelines is critical for knowledge and information transfer. In my talk I will provide an overview of how we have been engaging with the scientific community to develop BioCompute specifications, building a framework to standardize how bioinformatics computations and analyses are communicated to the US FDA. I will also describe how BioCompute Objects (https://osf.io/h59uh/) can be created using the High-performance Integrated Virtual Environment (HIVE) and other bioinformatics platforms.

4:45 Integrating Genomic and Immunologic Data to Accelerate Translational Discovery at the Parker Institute for Cancer Immunotherapy

Danny Wells, PhD, Scientist, Informatics, Parker Institute for Cancer Immunotherapy

Immunotherapy is rapidly changing how we treat both solid and hematologic malignancies, and combinations of these therapies are quickly becoming the norm. For any given treatment strategy only a subset of patients will respond, and an emerging challenge is how to effectively identify the right treatment strategy for each patient. This challenge is compounded by a concomitant explosion in the amount of data collected from each patient, from high-dimensional single-cell measurements to whole-exome tumor sequencing. In this talk I will discuss translational research at the Parker Institute, and how we are integrating multiple molecular and clinical data types to characterize the tumor-immune phenotype of each patient.

5:15 Close of Conference Program