Smart Digital Therapeutics for Alcohol Use Disorder: Algorithms for Prediction and Adaptive Intervention

John J. Curtin, Ph.D.

University of Wisconsin-Madison

Mental Healthcare Needs are High and Unmet

  • In 2019, 52 million Americans had an active mental illness
    • More than half did not receive any treatment
  • 20 million adults had an active substance use disorder
    • 9 out of 10 did not receive any treatment
  • Large treatment disparities exist by race, ethnicity, geography, and income

Mental Healthcare Needs are High and Unmet

  • Failure to treat is not surprising given many treatment barriers:
    • Access
    • Availability
    • Affordability
    • Acceptability

Digital Therapeutics (DTx)

Digital therapeutics are smartphone “apps” that are designed to prevent, manage, or treat disease, including mental illness.

Can augment mental health services to address barriers

  • Accessible everywhere
  • Available 24/7
  • Highly scalable (affordable?)
  • Effective!

Smart Digital Therapeutics

“Could you predict not only who might be at greatest risk for relapse …
… but precisely when that relapse might occur …
… and how best to intervene to prevent it?”

Lapse Prediction in Patients with AUD

  • 151 patients with AUD
  • Early in recovery (1-8 weeks)
  • Committed to abstinence throughout study
  • Followed for up to 3 months
  • Collected active and passive personal sensing data streams

risk1_pis.png 
niaaa_logo.png 

GOAL: Develop a temporally precise lapse monitoring (prediction) system for patients with AUD

Personal Sensing Data Streams

  • 4x daily ecological momentary assessments (EMA)

  • Monthly self-report

  • Geolocation (GPS)

  • Cellular communications (voice and text messages)

    • Metadata
    • Text message content
  • Sleep sensor (Wake/sleep times; sleep efficiency; wakings; restlessness)

4x Daily Ecological Momentary Assessments

risk1_ema_questions.png 

  • Current
    • Craving
    • Affect
    • Risky situations
    • Stressful events
    • Pleasant events
  • Future
    • Risky situations
    • Stressful events
    • Confidence

Feature Engineering

  • Features based on recent past experiences (12, 24, 48, 72, 168 hours)

  • Min, max, and median response (all items)

  • History (count) of past lapses (item 1) and completed EMAs (compliance)

  • Raw scores and change scores (from baseline/all past responses)
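The feature ideas above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the function name `window_features`, the item name `craving`, the example ratings, and the choice of "median of all past responses" as the reference for the change score are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import median

WINDOWS_HOURS = (12, 24, 48, 72, 168)  # look-back widths listed above


def window_features(responses, now, item="craving"):
    """Summarize one EMA item over each look-back window ending at `now`.

    `responses` holds (timestamp, value) pairs for one participant. Only
    responses strictly before `now` are used, so no future data leaks in.
    """
    past = [v for t, v in responses if t < now]
    baseline = median(past) if past else None  # median of all past responses
    features = {}
    for hours in WINDOWS_HOURS:
        start = now - timedelta(hours=hours)
        values = [v for t, v in responses if start <= t < now]
        prefix = f"{item}_{hours}h"
        features[f"{prefix}_n"] = len(values)  # completed EMAs in the window
        if values:
            features[f"{prefix}_min"] = min(values)
            features[f"{prefix}_max"] = max(values)
            features[f"{prefix}_median"] = median(values)
            # Change score (assumed definition): window median minus the
            # median of all past responses.
            features[f"{prefix}_change"] = median(values) - baseline
        else:
            for stat in ("min", "max", "median", "change"):
                features[f"{prefix}_{stat}"] = None
    return features


# Hypothetical craving ratings for one participant.
now = datetime(2024, 1, 8, 12, 0)
ema = [
    (now - timedelta(hours=100), 2),
    (now - timedelta(hours=30), 6),
    (now - timedelta(hours=10), 4),
    (now - timedelta(hours=3), 8),
]
feats = window_features(ema, now)
```

Each window yields a count (a compliance proxy), min, max, median, and a change score, so one EMA item expands into many candidate features.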

Machine Learning Methods

  • Predict hour-by-hour probability of future lapse

  • Lapse window widths

    • 1 hour
    • 1 day
    • 1 week
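One way to picture the hour-by-hour prediction targets is as rolling windows that start every hour and are labeled 1 if any lapse falls inside them. The sketch below is an assumption about how such labels could be built, not the study's code; `label_windows` and its `lead_hours` parameter (0 here) are hypothetical names.

```python
from datetime import datetime, timedelta


def label_windows(first, last, lapse_times, width_hours, lead_hours=0):
    """Label windows that roll forward one hour at a time.

    Each window covers [t + lead, t + lead + width). Its label is 1 if any
    lapse time falls inside it, else 0. Returns (t, label) pairs.
    """
    width = timedelta(hours=width_hours)
    lead = timedelta(hours=lead_hours)
    labels = []
    t = first
    while t + lead + width <= last:
        start = t + lead
        end = start + width
        has_lapse = any(start <= lapse < end for lapse in lapse_times)
        labels.append((t, int(has_lapse)))
        t += timedelta(hours=1)
    return labels


# Hypothetical participant: two days of follow-up and a single lapse.
first = datetime(2024, 1, 1, 0, 0)
last = datetime(2024, 1, 3, 0, 0)
lapses = [datetime(2024, 1, 2, 6, 30)]

hourly = label_windows(first, last, lapses, width_hours=1)
daily = label_windows(first, last, lapses, width_hours=24)
```

With the same single lapse, only one 1-hour window is positive while many overlapping 1-day windows are, which is why the base rate of lapse rises as the window widens.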

Machine Learning Methods

  • Statistical Algorithms
    • ElasticNet GLM (e.g., LASSO, ridge regression)
    • Random Forest
    • XGBoost
    • KNN
  • Model Tuning and Performance Evaluation
    • Area under ROC curve (AUC) as primary performance metric
    • Sensitivity, Specificity, Balanced accuracy, Positive predictive value
    • Using grouped (by participant) 10-fold CV
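Grouping by participant means all of a person's observations land in the same fold, so the model is always tested on people it has never seen. A minimal stdlib sketch of that splitting step follows; the function name `grouped_kfold` and the toy participant IDs are hypothetical, and model fitting and tuning are left out.

```python
import random


def grouped_kfold(participant_ids, k=10, seed=0):
    """Yield (train_idx, test_idx) so each participant is held out exactly once."""
    unique = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Deal shuffled participants into k folds, round-robin.
    fold_of = {pid: i % k for i, pid in enumerate(unique)}
    for fold in range(k):
        test = [i for i, pid in enumerate(participant_ids) if fold_of[pid] == fold]
        train = [i for i, pid in enumerate(participant_ids) if fold_of[pid] != fold]
        yield train, test


# Toy data: 20 hypothetical participants with 5 observations each.
ids = [f"p{n:02d}" for n in range(20) for _ in range(5)]
folds = list(grouped_kfold(ids, k=10))
```

Splitting rows at random instead would let one person's observations appear in both training and test data, which tends to inflate performance estimates.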

1 Week: Probabilities for No Lapse and Lapse

  • Model predicts probability of lapse in next week for “new” observations in test set

  • Can panel predictions by ground truth (true lapse vs. true no-lapse observations)

  • Want predicted probabilities to be high for true lapses and low for true no-lapses

  • Need decision threshold for classification (.50 default)

1 Week: ROC Curve

Area under the ROC curve (AUC)

  • Across all decision thresholds

  • ~.5 (random) – 1.0 (perfect)

Coarse rules of thumb for AUC

.70 - .80 are considered fair
.80 - .90 are considered good
> .90 are considered excellent
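AUC has a simple interpretation: it is the probability that a randomly chosen true lapse gets a higher predicted probability than a randomly chosen true no-lapse, with ties counted as half. The sketch below computes it that way on made-up labels and scores; it is an illustration of the metric, not the evaluation code behind the reported values.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one lapse and one no-lapse observation")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example: three of the four lapse/no-lapse pairs are ranked correctly.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because it compares rankings only, AUC does not depend on any single decision threshold.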

1 Day: ROC Curve

Coarse rules of thumb for AUC

.70 - .80 are considered fair
.80 - .90 are considered good
> .90 are considered excellent

1 Hour: ROC Curve

Coarse rules of thumb for AUC

.70 - .80 are considered fair
.80 - .90 are considered good
> .90 are considered excellent

Performance Metrics by Lapse Window Width

                   Week   Day    Hour
AUC                0.90   0.91   0.93
Sensitivity        0.79   0.82   0.84
Specificity        0.85   0.85   0.86
Balanced Accuracy  0.82   0.83   0.85

Global Variable Importance by Model

  • All EMA items impact lapse probability (both globally and locally)
  • Demographics not particularly important (but limited race/ethnicity diversity)
  • Lapse day and Lapse hour are useful for day and hour level models as expected

Positive Predictive Value (PPV)

                   Week    Day     Hour
Lapse Rate         25.4%   7.7%    0.4%
AUC                0.90    0.91    0.93
Sensitivity        0.79    0.82    0.84
Specificity        0.85    0.85    0.86
Balanced Accuracy  0.82    0.83    0.85
PPV                0.65    0.32    0.02
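The drop in PPV from week to hour follows from the falling lapse rate, even though sensitivity and specificity barely change. Bayes' rule makes this explicit. The sketch below plugs the rounded table values into the standard formula; the function name `ppv` is just for this example, and because the inputs are rounded the results only approximate the reported PPVs.

```python
def ppv(sensitivity, specificity, lapse_rate):
    """Positive predictive value from sensitivity, specificity, and base rate.

    PPV = sens * rate / (sens * rate + (1 - spec) * (1 - rate))
    """
    true_pos = sensitivity * lapse_rate
    false_pos = (1 - specificity) * (1 - lapse_rate)
    return true_pos / (true_pos + false_pos)


# Rounded values from the table above.
ppv_week = ppv(0.79, 0.85, 0.254)
ppv_day = ppv(0.82, 0.85, 0.077)
ppv_hour = ppv(0.84, 0.86, 0.004)
```

With a 0.4% hourly lapse rate, false alarms from the many no-lapse hours swamp the true positives, so PPV stays near .02 even with specificity of .86.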

Impact of Decision Thresholds: 1 day

             Thres = 0.50   Thres = 0.90
Sensitivity  0.81           0.40
Specificity  0.86           0.99
PPV          0.32           0.83
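The same trade-off can be reproduced on a toy example: raising the threshold flags fewer windows, which removes false alarms (higher specificity and PPV) but also misses more true lapses (lower sensitivity). The labels and probabilities below are invented for illustration, and `metrics_at_threshold` is a hypothetical helper.

```python
def metrics_at_threshold(labels, probs, threshold=0.5):
    """Sensitivity, specificity, and PPV when p >= threshold is called a lapse."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_lapse = p >= threshold
        if predicted_lapse and y == 1:
            tp += 1
        elif predicted_lapse and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
    }


# Invented data: 3 true lapses followed by 7 true no-lapses.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probs = [0.95, 0.70, 0.30, 0.80, 0.60, 0.40, 0.20, 0.10, 0.10, 0.05]

at_50 = metrics_at_threshold(labels, probs, threshold=0.50)
at_90 = metrics_at_threshold(labels, probs, threshold=0.90)
```

In this toy data, moving the threshold from .50 to .90 raises PPV and specificity while cutting sensitivity in half.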

Precision - Recall Curves

             Week   Day    Hour
Threshold    0.70   0.88   0.97
Sensitivity  0.67   0.43   0.20
PPV          0.75   0.75   0.75

But of course, as we increase the decision threshold for labeling a window as a lapse, we trade off sensitivity. We can see this trade-off directly in the precision-recall curves. If we decide we need a PPV of at least .75, we still have reasonable sensitivity for the 1-week window, but we start to miss many lapses in the 1-day window and more still in the 1-hour window.

I’ll return to this a bit more later when we discuss emerging plans for how best to implement these models within a digital therapeutic.
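Choosing a threshold to hit a target PPV can be framed as a simple search: scan candidate thresholds from low to high and stop at the first one whose PPV meets the target, then read off the sensitivity that remains. This is a sketch of the idea under invented data; `threshold_for_ppv` is a hypothetical helper, not the procedure used for the values reported here.

```python
def threshold_for_ppv(labels, probs, target_ppv=0.75):
    """Return (threshold, ppv, sensitivity) for the lowest qualifying threshold.

    Candidate thresholds are the observed probabilities. Returns None if no
    threshold reaches the target PPV. Assumes at least one true lapse.
    """
    n_lapses = sum(labels)
    for threshold in sorted(set(probs)):
        flagged = [y for y, p in zip(labels, probs) if p >= threshold]
        true_pos = sum(flagged)
        ppv = true_pos / len(flagged)  # flagged is never empty here
        if ppv >= target_ppv:
            return threshold, ppv, true_pos / n_lapses
    return None


# Invented data: 3 true lapses followed by 7 true no-lapses.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probs = [0.95, 0.70, 0.30, 0.80, 0.60, 0.40, 0.20, 0.10, 0.10, 0.05]

result = threshold_for_ppv(labels, probs, target_ppv=0.75)
```

Here the target is only met at the highest threshold, where a single lapse is still detected, which mirrors the pattern of a high PPV bought at the cost of missed lapses.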

Key Take Home Messages

  • Relatively high combined sensitivity and specificity

  • Comparable performance (AUC) from 1 week down to 1 hour windows

  • Will need to adjust decision thresholds to fit how we use the algorithm.

    • Lower PPV OK for low burden or low cost recommendations
    • Higher PPV needed to recommend “costly” interventions or actions

(Selective) Next Steps

  • Geolocation, cellular communications, and other passively sensed signals

…Imagine my smartphone communications…

smartphone_uber.png 

Context is Critical

smartphone_context.png 

Contextualized Geolocation

context_gps.png 

Contextualized Communications

context_cell.png 

Baseline Feature Engineering for GPS

  • Focus on recent past experiences (6, 12, 24, 48, 72, 168 hours)

  • Raw scores and change scores (from baseline)

  • Time spent at important places (e.g., alcohol present, drank at location in past, risky, unpleasant)

feature_gps.png 
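A "time spent at important places" feature can be sketched as the overlap between each contextualized visit and the look-back window. The data layout (visits as arrive/leave times plus a set of context tags), the tag names, and the function `hours_at_place_type` are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def hours_at_place_type(visits, now, window_hours, place_type):
    """Hours within the look-back window spent at places tagged `place_type`.

    `visits` holds (arrive, leave, tags) tuples, where `tags` is a set of
    context labels the participant attached to that place.
    """
    start = now - timedelta(hours=window_hours)
    total = timedelta()
    for arrive, leave, tags in visits:
        if place_type not in tags:
            continue
        overlap_start = max(arrive, start)
        overlap_end = min(leave, now)
        if overlap_end > overlap_start:  # count only the part inside the window
            total += overlap_end - overlap_start
    return total.total_seconds() / 3600


# Hypothetical visits for one participant.
now = datetime(2024, 1, 8, 12, 0)
visits = [
    (now - timedelta(hours=30), now - timedelta(hours=27), {"alcohol_present", "risky"}),
    (now - timedelta(hours=8), now - timedelta(hours=5), {"alcohol_present"}),
    (now - timedelta(hours=3), now - timedelta(hours=2), {"pleasant"}),
]

alcohol_6h = hours_at_place_type(visits, now, 6, "alcohol_present")
alcohol_24h = hours_at_place_type(visits, now, 24, "alcohol_present")
alcohol_48h = hours_at_place_type(visits, now, 48, "alcohol_present")
```

A change score could then be formed by subtracting the participant's own baseline for the same window width, as with the EMA features.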

(Selective) Next Steps

  • Geolocation, cellular communications, and other passively sensed signals

  • Build models with lead times > 0 hours

  • More diversity in training data

Active Project: Lapse in patients with Opioid Use Disorder

  • Recruiting 400 - 500 patients in recovery from Opioid Use Disorder (~ 300 so far)
  • National sample (size; diversity: demographics, location)
  • More variation in stage of recovery (1 – 6 months at start)
  • 12 months of monitoring
  • Closer to real implementation methods

risk2_pis.png 

nida_logo.png 

(Selective) Next Steps

  • Geolocation, cellular communications, and other passively sensed signals

  • Build models with lead times > 0 hours

  • More diversity in training data

  • Use models to improve DTx engagement and clinical outcomes

    • SMART DTx – algorithm guided use
    • How to craft patient feedback to encourage trust in the algorithm

Relapse Prevention Model

relapse_prevention_flowchart.png 

Optimization/Evaluation of an Algorithm Guided Smart DTx

  • Lapse probabilities updated daily based on EMA and Geolocation features
  • Use lapse probability and locally important features to select optimal DTx modules – guided by Rela
  • Provide recommendations designed to encourage engagement
    • Algorithm transparency (risk level, change, features)
    • Communication factors (empathy, feasibility)
  • MRT to optimize recommendation message components
  • RCT to evaluate Standard vs. Smart DTx on clinical outcomes

nida_logo.png 

CRediTs

credits.png 

Demographics & Alcohol Use History

Demographics and Alcohol Use Information
Characteristic                                                     N      %        M       SD  Range
Age                                                                               41     11.9  21-72
Sex
  Female                                                          74   49.0
  Male                                                            77   51.0
Race
  American Indian/Alaska Native                                    3    2.0
  Asian                                                            2    1.3
  Black/African American                                           8    5.3
  White/Caucasian                                                131   86.8
  Other/Multiracial                                                7    4.6
Hispanic, Latino, or Spanish Origin
  Yes                                                              4    2.6
  No                                                             147   97.4
Education
  Less than high school or GED degree                              1    0.7
  High school or GED                                              14    9.3
  Some college                                                    41   27.2
  2-Year degree                                                   14    9.3
  College degree                                                  58   38.4
  Advanced degree                                                 23   15.2
Employment
  Employed full-time                                              72   47.7
  Employed part-time                                              26   17.2
  Full-time student                                                7    4.6
  Homemaker                                                        1    0.7
  Disabled                                                         7    4.6
  Retired                                                          8    5.3
  Unemployed                                                      18   11.9
  Temporarily laid off, sick leave, or maternity leave             3    2.0
  Other, not otherwise specified                                   9    6.0
Personal Income                                                              $34,298  $31,807  $0-200,000
Marital Status
  Never married                                                   67   44.4
  Married                                                         32   21.2
  Divorced                                                        45   29.8
  Separated                                                        5    3.3
  Widowed                                                          2    1.3
Alcohol Use Disorder Milestones
  Age of first drink                                                            14.6      2.9  6-24
  Age of regular drinking                                                       19.5      6.6  11-56
  Age at which drinking became problematic                                      27.8      9.6  15-60
  Age of first quit attempt                                                     31.5     10.4  15-65
  Number of Quit Attempts*                                                       5.5      5.8  0-30
Lifetime History of Treatment (Can choose more than 1)
  Long-term residential (6+ months)                                8    5.2
  Short-term residential (< 6 months)                             49   31.8
  Outpatient                                                      74   48.1
  Individual counseling                                           97   63.0
  Group counseling                                                62   40.3
  Alcoholics Anonymous/Narcotics Anonymous                        93   60.4
  Other                                                           40   26.0
Received Medication for Alcohol Use Disorder
  Yes                                                             59   39.1
  No                                                              92   60.9
Alcohol Use Disorder DSM-5 Symptom Count                                         8.9      1.9  4-11
Current (Past 3 Month) Drug Use
  Tobacco products (cigarettes, chewing tobacco, cigars, etc.)    84   54.5
  Cannabis (marijuana, pot, grass, hash, etc.)                    66   42.9
  Cocaine (coke, crack, etc.)                                     18   11.7
  Amphetamine type stimulants (speed, diet pills, ecstasy, etc.)  15    9.7
  Inhalants (nitrous, glue, petrol, paint thinner, etc.)           3    1.9
  Sedatives or sleeping pills (Valium, Serepax, Rohypnol, etc.)   22   14.3
  Hallucinogens (LSD, acid, mushrooms, PCP, Special K, etc.)      14    9.1
  Opioids (heroin, morphine, methadone, codeine, etc.)            16   10.4

Note: N = 151.
* Two participants reported 100 or more quit attempts. We removed these outliers prior to calculating the mean (M), standard deviation (SD), and range.

Consort Diagram

ROC Posterior Probabilities

Model Comparison Posterior Probabilities

Posterior probabilities for model contrasts for AUC. The Region of Practical Equivalence (ROPE) is indicated by dashed yellow lines.