NYUTron: Health System-scale Language Models for Clinical Operations: 30-day Readmissions
, Senior Data Scientist, NVIDIA
, Ph.D. Student, New York University
Highly Rated
Unplanned 30-day readmissions impose health risks on patients and increase the cost of care. Predicting unplanned readmissions early could help with discharge planning and resource allocation, which in turn could improve patient care and reduce costs. We propose to treat readmission prediction as a natural language processing task involving creation of a large language model pre-trained on a health-system scale corpus of clinical text on high-end multi-node GPU servers, a.k.a. NYUTron. Having fine-tuned on a labeled dataset, we'll evaluate its performance based on the change in readmission rates after deployment. As we develop our model, we try to address the following questions: 1) How to handle long sequence length? 2) How to address label imbalance? 3) How to assess the impact of noisy labels on model evaluation?