A Sepsis Prediction Engine for Telehealth applications.
This work is part of the paper “Minimal Vital Sensor Architectures for Early Warning of Sepsis in Telehealth Patients” (under review).
The engine applies gradient-boosted decision trees (XGBoost) to features extracted from vital signs obtained from wearable sensors.
Sequence of hourly measurements of the following vital signs:
These measurements, obtained from patients of two different hospitals, are contained in the following zip files. Each zip file, when extracted, yields the individual patient data files.
The raw files come from the PhysioNet CinC 2019 database and are preprocessed (by applying the inclusion and exclusion criteria, among other steps) to generate the curated datasets used for this study.
The input should be formatted so that the measurements span a minimum of 3 hours and a maximum of 6 hours.
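The windowing rule above can be sketched as a small check. This is a minimal illustration, assuming each patient record is a sequence of hourly rows ordered oldest to newest. The function name `clip_to_window` and the choice to keep the most recent hours are assumptions made here, not part of the repository.

```python
from typing import List, Sequence

MIN_HOURS = 3  # minimum span stated in the text
MAX_HOURS = 6  # maximum span stated in the text


def clip_to_window(hourly_rows: Sequence[Sequence[float]]) -> List[Sequence[float]]:
    """Return at most MAX_HOURS of the most recent hourly rows.

    Raises ValueError when fewer than MIN_HOURS rows are available.
    Keeping the most recent rows is an assumption made for illustration.
    """
    if len(hourly_rows) < MIN_HOURS:
        raise ValueError(
            f"need at least {MIN_HOURS} hourly measurements, got {len(hourly_rows)}"
        )
    # Negative slicing keeps the last MAX_HOURS rows (or all rows if fewer).
    return list(hourly_rows[-MAX_HOURS:])
```

A record with eight hourly rows would be trimmed to its last six, while a record with two rows would be rejected.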
Input data files are zipped and can be accessed from the repository: Raw Dataset
Curated dataset for this study
The algorithm is implemented as the following three Python modules:
Module: Tele-SEP-train-model.py
Parameters: each of the 15 sensor configurations (Si); each of the 16 timing tuples (W, L)
Output: AUROC for each (Si,W,L)
For each sensor configuration, the model with the highest AUROC is chosen for validation in the next module.
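The selection step of module 1 can be sketched as follows. This is a minimal illustration, assuming the AUROC values have already been collected in a dictionary keyed by (configuration, W, L). The function name `best_timing_per_config`, the configuration labels and the scores are hypothetical and are not results from the study.

```python
from typing import Dict, Tuple

Timing = Tuple[int, int]  # the timing tuple (W, L) from the text


def best_timing_per_config(
    auroc: Dict[Tuple[str, int, int], float]
) -> Dict[str, Tuple[Timing, float]]:
    """Map each sensor configuration to its highest-AUROC (W, L) tuple."""
    best: Dict[str, Tuple[Timing, float]] = {}
    for (config, w, lead), score in auroc.items():
        # Keep the first entry seen, then replace it only on a strictly higher score.
        if config not in best or score > best[config][1]:
            best[config] = ((w, lead), score)
    return best


# Made-up scores for illustration only; not results from the study.
example_auroc = {
    ("S1", 3, 4): 0.71,
    ("S1", 6, 4): 0.78,
    ("S2", 3, 6): 0.83,
    ("S2", 6, 6): 0.80,
}
best = best_timing_per_config(example_auroc)
print(best)
```

Ties are resolved in favour of the tuple encountered first, which is a design choice of this sketch rather than something the text specifies.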
Module: Tele-SEP-ModelLoadRunOnly.py
Parameters: each of the 15 sensor configurations (Si); the best-performing timing tuple (WAUC, LAUC) corresponding to Si
import pickle
from sklearn.metrics import classification_report, confusion_matrix
model_filename = 'trained-models/XGBoost/XGB-Model-PPG-RR-Temp-L6-M4-verified.sav'
# load the model from disk (X_test and y_test are assumed to be prepared beforehand)
with open(model_filename, 'rb') as model_file:
    loaded_model = pickle.load(model_file)
# make predictions for test data
y_pred = loaded_model.predict(X_test)
# print classification report
print(classification_report(y_test, y_pred))
# confusion matrix
cnf_matrix = confusion_matrix(y_test, y_pred)
print(cnf_matrix)
Output: the AUC and its difference from the AUC obtained in module 1 (for each sensor configuration)
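The comparison described in this output can be sketched without any third-party dependency. The block below is a minimal illustration, assuming binary labels (1 = sepsis) and predicted scores are available. It computes AUROC by pairwise ranking, which is the same quantity that `sklearn.metrics.roc_auc_score` returns for binary labels. The helper names `auroc` and `auc_gap`, and the sign convention (validation minus training), are assumptions made here.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUROC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_gap(validation_auc: float, training_auc: float) -> float:
    """Difference between the validation AUC and the AUC from module 1.

    The sign convention (validation minus training) is an assumption.
    """
    return validation_auc - training_auc
```

A negative gap under this convention would indicate that the validated model scored lower than it did during training.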
Module: automation being implemented
Parameters: AUROC threshold value AUCmin; lead time threshold value Lmin
Output: from the list of sensor configurations arranged in ascending order of the number and complexity of vitals, the first configuration Smin for which the AUROC obtained in module 1 and the corresponding lead time are greater than or equal to their respective threshold values AUCmin and Lmin.
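The selection rule for Smin can be sketched as a first-match search. This is a minimal illustration, assuming the configurations have already been sorted in ascending order of number and complexity of vitals. The names `ConfigResult` and `choose_minimal_config` are hypothetical and do not come from the repository.

```python
from typing import Iterable, NamedTuple, Optional


class ConfigResult(NamedTuple):
    name: str             # sensor configuration label, e.g. "S1"
    auroc: float          # AUROC obtained in module 1
    lead_time_hours: int  # lead time of the corresponding model


def choose_minimal_config(
    ordered_results: Iterable[ConfigResult],
    auc_min: float,
    lead_min: int,
) -> Optional[ConfigResult]:
    """Return the first configuration meeting both thresholds, or None.

    `ordered_results` must already be sorted from simplest to most complex.
    """
    for result in ordered_results:
        if result.auroc >= auc_min and result.lead_time_hours >= lead_min:
            return result
    return None
```

Returning `None` when no configuration qualifies is a choice of this sketch; the text does not say what should happen in that case.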
Modules 1, 2 and 3 are run once at setup time, and a subset of the best-performing pre-trained and validated models corresponding to various sensor configurations is also provided in the repository. At runtime, the following algorithm is used to predict sepsis for a new patient.
Parameters: patient’s wearable sensor configuration Sp; Patient_vitals = new patient data; lead time = 3, 4, 5, 6 hours
Subroutines: Choose the Tele-SEP model that satisfies the patient’s wearable sensor configuration Sp. For the sensor configuration Sp, retrieve four sets of models Mp3, Mp4, Mp5, Mp6 corresponding to the four lead times. From each set choose the best performing model Mp3*, Mp4*, Mp5*, Mp6*. Run these on Patient_vitals to compute the sepsis probabilities.
Output: The maximum of the four sepsis probabilities and the corresponding lead time resulting from the above computation
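The runtime step can be sketched as follows. This is a minimal illustration, assuming the best model for each lead time has already been selected and is exposed as a callable that returns a sepsis probability. With a pickled XGBoost classifier, such a callable would presumably wrap `predict_proba`. The function name `predict_sepsis` and the stub models are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Vitals = Sequence[Sequence[float]]  # hourly rows of vital-sign features
Model = Callable[[Vitals], float]   # returns a sepsis probability in [0, 1]


def predict_sepsis(
    models_by_lead: Dict[int, Model], patient_vitals: Vitals
) -> Tuple[float, int]:
    """Run one model per lead time and return (max probability, its lead time)."""
    if not models_by_lead:
        raise ValueError("at least one model is required")
    probabilities = {
        lead: model(patient_vitals) for lead, model in models_by_lead.items()
    }
    # Pick the lead time whose model reports the highest probability.
    best_lead = max(probabilities, key=probabilities.get)
    return probabilities[best_lead], best_lead


# Stub models returning fixed, made-up probabilities for illustration only.
stub_models: Dict[int, Model] = {
    3: lambda vitals: 0.41,
    4: lambda vitals: 0.55,
    5: lambda vitals: 0.62,
    6: lambda vitals: 0.48,
}
probability, lead_time = predict_sepsis(stub_models, [[0.0, 0.0, 0.0]])
print(probability, lead_time)
```

Keying the models by lead time keeps the reported probability tied to the lead time that produced it.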