Using Machine Learning on Home Health Care Assessments to Predict Fall Risk (Active)

Falls are the leading cause of injury among older adults, particularly in the more vulnerable home health care (HHC) population. Existing standardized fall risk assessments often require supplemental data collection and tend to have low specificity. We curated a home health care assessment dataset with over 100 clinical, behavioral, and cognitive features and applied a random forest algorithm to identify the factors that predict fall risk and to quantify that risk. Our model achieves higher precision and balanced accuracy than the commonly used multifactorial Missouri Alliance for Home Care fall risk assessment. This could reduce paperwork for nursing staff and improve targeting of patients at high risk of falling. We will extend the analysis to incorporate longitudinal assessments and apply natural language processing techniques to visit notes to identify fall cases unreported in structured data.
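
As a small illustration of the two metrics used in the comparison above, the sketch below computes precision and balanced accuracy from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not results from this project, and the function names are our own.

```python
def precision(tp, fp):
    """Fraction of patients flagged as high risk who actually fell."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on fallers) and specificity (recall on non-fallers)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical counts for illustration only (not project data).
tp, fp, tn, fn = 30, 20, 130, 20
prec = precision(tp, fp)
bal_acc = balanced_accuracy(tp, fp, tn, fn)
print(f"precision={prec:.3f}, balanced accuracy={bal_acc:.3f}")
```

Balanced accuracy is useful when falls are much rarer than non-falls, because plain accuracy would reward a model that simply predicts "no fall" for everyone.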

Services Provided: Data Wrangling, Data Analysis, Publication Support

Principal Investigator: Kathryn H. Bowles, PhD, RN, FAAN, FACMI, Professor of Nursing and vanAmeringen Chair in Nursing Excellence
Department: Biobehavioral Health Sciences, School of Nursing
Publications:
Y Lo, SF Lynch, et al. A Machine Learning Approach to Improve Fall Risk Prediction in Home Health Care. May 2018. Symposium on Data Science and Statistics. Reston, Virginia (Poster)
Y Lo, SF Lynch, et al. Using machine learning on home health care assessments to predict fall risk. Accepted March 2019. The 17th World Congress of Medical and Health Informatics. Lyon, France (Full Paper)

Anticancer Therapy at the End of Life: Lessons From a Community Cancer Institute (Completed)

Studies have shown that aggressive cancer care at the end of life is associated with decreased quality of life, decreased median survival, and increased cost of care. We worked with Dr. Shanthi Sivendran and her team to perform a retrospective cohort study of 201 patients who received systemic anti-cancer therapy at Lancaster General Hospital. We defined our outcome variable as the receipt of anti-cancer treatment in the last 14 days of a patient’s life. We evaluated 20 clinical exposure variables with respect to the outcome classes using univariate risk ratios and the Benjamini-Hochberg method for determining significance in light of multiple testing. Our findings demonstrate that those enrolled in the Oncology Care Model (OCM) and those with hematologic malignancies have a higher risk of receiving anti-cancer therapy in the last 14 days of life. These observations highlight the need to better identify the needs of high-risk patients and to provide good quality care throughout the disease trajectory so that end-of-life care aligns with patients’ wishes.
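
For readers unfamiliar with the two statistical tools named above, here is a minimal sketch of a univariate risk ratio and the Benjamini-Hochberg step-up procedure. It is not the study's analysis code: the function names, counts, and p-values are hypothetical and serve only to show the mechanics.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical counts and p-values for illustration only (not study data).
rr = risk_ratio(10, 50, 10, 100)
p_values = [0.045, 0.001, 0.20, 0.012, 0.025]
significant = benjamini_hochberg(p_values, alpha=0.05)
print(rr, significant)
```

The procedure is less conservative than a Bonferroni correction, which is often a reasonable trade-off when screening a moderate number of exposure variables.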

Services Provided: Data Analysis, Publication Support

Principal Investigator: Shanthi Sivendran MD
Department: Oncology, Lancaster General Health, Penn Medicine
Publication: S Sivendran, SF Lynch, et al. Anticancer Therapy at the End of Life: Lessons From a Community Cancer Institute. March 2019. Journal of Palliative Care (Under Review)

Comparing AM vs PM Administration of Warfarin in Patients after Mechanical Mitral Valve Surgery (Active)

The optimal dose of warfarin varies among individuals, and the prediction of a maintenance dose is difficult. We are conducting a retrospective cohort study of 122 patients who underwent mechanical mitral valve surgery at Penn Medicine. Approximately half of these patients received post-surgery warfarin treatment in the morning (10 am), while the other patients received warfarin in the evening (6 pm). We use a survival analysis approach to compare the effect of warfarin administration time on the time required to reach a therapeutic INR level, adjusting for time-dependent (e.g. daily dose) and time-independent (e.g. sex, race) clinical features.

Services Provided: Data Analysis

Principal Investigator: Justin R. Harris, PharmD, BCPS, Cardiology Pharmacy Specialist
Department: Cardiology, Penn Medicine
Publication: Anticipated summer journal submission

Phenotyping of Autism (Active)

Autism specialists are in short supply, and waitlists for patients to be definitively diagnosed can be very long. An automated screening tool that generates reliable predictions of whether a patient is likely to have autism spectrum disorder could streamline the process. We collaborated with Dr. Whitney Guthrie and Dr. Robert Schultz to conduct a retrospective study of 1086 patients with gold-standard autism diagnoses and electronic health record data, with the aim of developing machine learning methods that generate a score indicating the likelihood that a given patient has autism.

Services Provided: Data Analysis
Principal Investigators:
Whitney Guthrie PhD, Postdoctoral Fellow and Co-Director of the Data and Statistical Core at the Center for Autism Research
Robert Schultz PhD, Director of the Center for Autism Research
Department: Center for Autism Research, Children’s Hospital of Philadelphia

Prevalence and Characterization of Yoga Mentions in the Electronic Health Record (Active)

There is a growing patient population using yoga as a therapeutic intervention, but little is known about how yoga actually interfaces with healthcare in a clinical setting. We collaborated with Dr. Nadia Penrod to perform a retrospective observational cohort study using electronic health records from throughout the Penn Medicine health system. Through text mining and natural language processing, we identified clinical chart notes that mention the word yoga and looked at how these notes were distributed among patients, clinicians, and clinical service departments over a ten-year period. To assign a context to yoga notes, we built a text-based classifier to separate the yoga notes into three classes: clinician recommendation, documentation of patient practice, and other. We found widespread and growing documentation of yoga in the clinical chart notes. We also identified nine medical conditions for which clinicians recommend yoga as treatment, including Parkinson’s disease, anxiety, depression, pregnancy, and backache. Work is ongoing to link yoga practice with health outcomes.
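
The first text-mining step described above, finding notes that mention the word yoga and counting them over time, can be sketched as follows. This is only an illustration under our own assumptions: the note structure (a `year` and a `text` field), the function names, and the example notes are all hypothetical, and the project's actual pipeline and classifier are not shown here.

```python
import re
from collections import Counter

# Match "yoga" as a whole word, regardless of capitalization.
YOGA_PATTERN = re.compile(r"\byoga\b", re.IGNORECASE)


def notes_mentioning_yoga(notes):
    """Keep only the notes whose text contains the word 'yoga'."""
    return [note for note in notes if YOGA_PATTERN.search(note["text"])]


def mentions_per_year(notes):
    """Count yoga-mentioning notes for each year."""
    return Counter(note["year"] for note in notes_mentioning_yoga(notes))


# Made-up notes for illustration only (not real chart data).
notes = [
    {"year": 2016, "text": "Patient reports practicing Yoga twice weekly."},
    {"year": 2017, "text": "Recommended yoga for chronic low back pain."},
    {"year": 2017, "text": "No changes to medication."},
    {"year": 2017, "text": "Discussed yoga and stretching for anxiety."},
]
counts = mentions_per_year(notes)
print(dict(counts))
```

The word-boundary anchors keep the pattern from matching longer tokens that merely contain the letters "yoga".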

Services Provided: Data Wrangling, Publication Support
Principal Investigator: Nadia Penrod PhD, Postdoctoral Scholar with the Computational Genetics Laboratory
Department: Biostatistics, Epidemiology & Informatics
Publication: NM Penrod, SF Lynch, S Thomas, N Seshadri, JH Moore. Prevalence and characterization of yoga mentions in the electronic health record. March 2019. (Submitted)

Predicting Risk of Readmission after Joint Replacement Surgery (Completed)

Reducing unnecessary hospital readmissions has the potential to both improve patient satisfaction and lower health care costs. In collaboration with Dr. Eric Hume, we used electronic health record information from patients admitted for hip and knee joint replacement surgery to estimate the risk of hospital readmission within 30 days of discharge. We identified the lab measures of hemoglobin and albumin as the top predictors of readmission risk and noted that neither was being used in practice to identify at-risk patients prior to discharge. Relatedly, we found evidence that a nutritional intervention could be particularly valuable in reducing readmission risk. The orthopedic surgery department will take this knowledge into consideration as they make decisions about patient discharge and pre-surgery interventions.

Services Provided: Data Analysis

Principal Investigator: Eric L. Hume, MD, Associate Professor of Clinical Orthopaedic Surgery
Department: Orthopaedic Surgery, Penn Medicine
Publication: SF Lynch, R Wong, Y Lo, et al. Readmission Risk after Orthopedic Surgery. May 2018. Symposium on Data Science and Statistics. Reston, Virginia (Poster)

Precision Phenotyping of Hypertension (Completed)

Primary aldosteronism is an underlying cause of chronic hypertension that is highly treatable but often goes undiagnosed. We collaborated with Dr. Daniel Herman to develop a phenotyping algorithm for hypertension using Penn Medicine electronic health record data: medications, diagnoses, chart notes, and lab results. We developed a more accurate classification based on EHR data than is currently available through the existing Penn hypertension registry. This will allow primary aldosteronism researchers to more effectively select cohorts for future primary aldosteronism studies.

Services Provided: Data Wrangling, Data Analysis

Principal Investigator: Daniel S Herman, MD, PhD, Assistant Professor Of Pathology And Laboratory Medicine
Department: Pathology and Laboratory Medicine

Supporting Translational Research using Penn Medicine BioBank Informatics (Active)

Translational research studies often leverage heterogeneous data from disparate sources, including patient electronic health record, survey, and genomic data. In collaboration with key Penn translational investigators (listed below) and the Penn Medicine Biobank (PMBB), we are providing informatics support and developing infrastructure to support the needs of the Penn community. More recently, we have deployed Carnival, a graph-based data wrangling tool, to enable PMBB participant data harmonization, deep phenotyping, cohort exploration, case-control cohort matching, and research data set generation.

Services Provided: Data Wrangling, Data Analysis

Principal Investigators: Daniel J. Rader, MD, Seymour Gray Professor Of Molecular Medicine; Michael D. Feldman, MD, PhD, Professor Of Pathology And Laboratory Medicine; Christian Stoeckert PhD, Professor of Genetics; Marylyn Ritchie PhD, Professor of Genetics
Department(s): Genetics, Pathology And Laboratory Medicine, Molecular Medicine
Publications: Birtwell D, Williams H, Pyeritz R, Damrauer S, Mowery D. Carnival: A Graph-based Data Integration and Query Tool to Support Patient Cohort Generation for Clinical Research. MedInfo 2019. (accepted).


Phenotyping Epilepsy Patients using Graph Technology (Active)

We are creating a property graph representation of aggregated epilepsy data, including discrete data from the electronic medical record, instrument data, and variables extracted from unstructured physician reports via automated natural language processing. We will leverage these data to deeply phenotype epilepsy patients according to their response to epilepsy therapeutics. In collaboration with Dr. Brian Litt and his team, we have submitted several grants to support the efforts of early investigators and have obtained a faculty undergraduate mentoring grant to support our undergraduate research assistant.

Services Provided: Data Wrangling, Data Analysis

Principal Investigators: Brian Litt, MD, Professor Of Neurology And Professor Of Bioengineering; Colin Ellis MD, Postdoctoral Fellow; Leah Blank MD, Assistant Professor; Pouya Khankhanian MD, Clinical Fellow
Department: Neurology


Identifying Risk Factors for Persistent Opioid Use following Surgery (Active)

We are creating a unified and normalized data set for a cohort of surgical patients to test the hypothesis that response to opioids is different for cancer versus non-cancer patients.

Services Provided: Data Wrangling, Feature Engineering, Data Analysis

Principal Investigators:
Caryn Lerman, Ph.D., Emeritus Professor Of Psychiatry
Justin E. Bekelman, M.D., Associate Professor Of Radiation Oncology
Departments: Psychiatry, Radiation Oncology


Future Projects

Effect of Beverage Tax on Obesity and Diabetes

Philadelphia has the highest prevalence of obesity and diabetes among the nation’s largest cities. In January 2017, Philadelphia implemented a beverage excise tax that was in part motivated by these health issues. We will evaluate the efficacy of this tax in reducing obesity and diabetes using BMI and A1C readings from patients in the Penn Medicine network region. By comparing readings from patients inside and outside the city, before and after the tax took effect, we will determine whether the tax is affecting these conditions.
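
One common way to frame an inside/outside, before/after comparison like the one described above is a difference-in-differences estimate. The sketch below assumes that framing, which the project description does not name explicitly, and uses made-up BMI readings; the function and variable names are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up BMI readings for illustration only (not project data).
inside_pre = [30.0, 31.0, 29.0]    # inside the city, before the tax
inside_post = [29.5, 30.5, 28.5]   # inside the city, after the tax
outside_pre = [28.0, 29.0, 27.0]   # outside the city, before the tax
outside_post = [28.0, 29.0, 27.0]  # outside the city, after the tax

effect = diff_in_diff(inside_pre, inside_post, outside_pre, outside_post)
print(f"Estimated change attributable to the tax: {effect:+.2f} BMI points")
```

Subtracting the change seen outside the city removes trends that affect both groups, under the assumption that the two groups would otherwise have followed parallel trends.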

Principal Investigator: Christina Roberto PhD, Assistant Professor at Perelman School of Medicine
Department: Center for Health Incentives & Behavioral Economics, University of Pennsylvania
Publication: R01 grant resubmission under review


Developing an Accurate Prognostic Instrument for Predicting Mortality in Prolonged-Stay Intensive Care Unit Patients

Mortality prediction in the ICU is a challenging but vital task. Many models focus on mortality prediction at the onset of care, but much is learned about a patient's status during the ICU stay, which creates the potential for improved predictions and better planning of care goals. We aim to predict the mortality of patients at day 14 of their ICU stay.

Principal Investigators: Justin Hatchimonji MD and Niels Martin MD
Department: Surgery


Determining Optimal Team Composition to Reduce Effects of Individual Providers on In-patient Hospital Mortality after Injury

Each year in the US, over 190,000 people die as a result of injury. A critical knowledge gap exists in our understanding of how individual provider performance is associated with patient outcomes after injury. Using EHR data, we will develop a novel instrumental variable approach to measure the effects of individual providers on in-hospital mortality after injury.

Principal Investigator: Daniel Holena MD
Department: Surgery


Data-Driven Framework for Classification and Surgical Planning of Spinal Deformity

We aim to improve and custom-tailor scoliosis treatments for individual patients. As a first step towards classifying spinal deformity changes pre- and post-treatment, we used Python and OpenCV to automatically recognize spine curvature from radiology images. We then explored structured surgery outcomes and brainstormed methods for recommending surgery parameters for patients based on their demographic and spine curvature information. We have submitted a grant to support further study.

Principal Investigator: Saba Pasha PhD, Assistant Professor, Department of Orthopedic Surgery
Department: Orthopaedic Surgery