Hi everyone,
I have a project I’ve been working on for some time: I’m trying to see whether ECG data alone (no other inputs) can be used to predict atrial fibrillation events with the help of machine learning, ideally at least 30-60 minutes ahead of time. Medical AI is tricky because both precision and recall need to be high to be useful in a clinical setting: if recall is too low, no one will trust the model to replace skilled nurse check-ins, and if precision is too low, it will raise too many unnecessary alarms and overburden staff.
I have a fairly robust dataset: a few hundred thousand hours of ECG data and around 3,000 distinct AFib events. My plan is to take a time-series segment (probably 30-60 minutes long), feed it to some kind of classifier, and output the predicted likelihood that an event will occur. I’ve tried a number of approaches so far with minimal success; my top-performing classifier only reaches about 60% balanced accuracy, which is obviously far below the threshold of utility.
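For concreteness, here is a minimal sketch of the windowing setup I have in mind, on synthetic data. Everything here is an illustrative assumption, not my actual pipeline: a long 1-D signal stands in for the ECG, and a window is labeled positive when an event begins within a horizon after the lead time.

```python
# Toy sketch of the window -> classifier -> likelihood setup (all names,
# rates, and the classifier choice are illustrative assumptions).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
fs = 1                # samples per minute, to keep the toy example tiny
window_len = 30 * fs  # 30-minute input window
lead_time = 30 * fs   # predict events at least 30 minutes ahead

signal = rng.normal(size=10_000)                       # stand-in "ECG"
event_starts = rng.choice(len(signal) - 200, size=50, replace=False)

def make_windows(signal, event_starts, window_len, lead_time, step=10):
    X, y = [], []
    for start in range(0, len(signal) - window_len - lead_time, step):
        end = start + window_len
        X.append(signal[start:end])
        # Positive if any event begins within window_len after the lead time.
        y.append(int(any(end + lead_time <= e < end + lead_time + window_len
                         for e in event_starts)))
    return np.array(X), np.array(y)

X, y = make_windows(signal, event_starts, window_len, lead_time)
clf = LogisticRegression(max_iter=1000).fit(X, y)
proba = clf.predict_proba(X)[:, 1]   # per-window event likelihood
```

On real noise like this the classifier learns nothing, of course; the sketch only fixes the labeling convention (lead time and horizon) that any real model would be trained against.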
I’m starting to wonder if I’ve been approaching the problem too simply - a number of different conditions can cause AFib, and lumping them all together as the “positive” class may dilute the signals I’m trying to detect. So I’m thinking I should check whether the events cluster in ways that reflect the underlying physiological differences, and then use a multiclass approach that predicts one of the causes instead.
I’ve been reading a bit, and it seems like KNN with a dynamic-time-warping measure might be a good way to do this, but I have no experience with this type of unsupervised clustering approach. I’m also unclear how to deal with the fact that I don’t actually know how many clusters there will be in the data; everything I’ve read so far suggests you have to tell the algorithm the number of clusters up front.
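One clarification worth making: k-NN is a *supervised* classifier, so grouping unlabeled events is really a clustering problem (k-means, k-medoids, hierarchical, etc.). The unknown-cluster-count issue has a standard workaround: precompute pairwise distances, run hierarchical clustering, and choose the cut that maximizes a quality score such as silhouette. A self-contained sketch with a naive DTW implementation and toy series standing in for real event segments:

```python
# Sketch: cluster time series by shape when the number of clusters is unknown.
# Pairwise DTW distances -> hierarchical clustering -> silhouette-based choice
# of cluster count. The toy series are stand-ins for real ECG event segments.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score

def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy "events": phase-shifted sines vs. square-ish waves.
t = np.linspace(0, 1, 40)
sines = [np.sin(2 * np.pi * t + phase) for phase in (0.0, 0.1, 0.2)]
squares = [np.sign(np.sin(2 * np.pi * t)) for _ in range(3)]
series = sines + squares

n = len(series)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = dtw(series[i], series[j])

Z = linkage(squareform(D), method="average")
best_k, best_score = None, -1.0
for k in range(2, n):                      # try a range of cluster counts
    labels = fcluster(Z, t=k, criterion="maxclust")
    score = silhouette_score(D, labels, metric="precomputed")
    if score > best_score:
        best_k, best_score = k, score
```

The naive DTW here is far too slow for hundreds of thousands of hours of data; a real run would need a fast DTW library with warping-window constraints, but the select-k-by-silhouette recipe is the same.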
Any help would be appreciated!
(Minor point: DTW is not a metric, only a measure; it does not satisfy the triangle inequality.)
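To make that minor point concrete, a tiny three-series counterexample (the values are arbitrary) shows DTW violating the triangle inequality:

```python
# DTW is not a metric: DTW(a, c) can exceed DTW(a, b) + DTW(b, c).
import numpy as np

def dtw(a, b):
    """Classic dynamic-programming DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

a, b, c = [0], [1, 2], [2, 3, 3]
# dtw(a, b) = 3, dtw(b, c) = 3, but dtw(a, c) = 8 > 3 + 3.
violates = dtw(a, c) > dtw(a, b) + dtw(b, c)
```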
What you are trying to do is precursor discovery [a].
It is unlikely that the 30-to-60-minute lead time will work for you; it seems too ambitious.
DTW is a great shape measure, but it is unlikely to generalize from person to person. In fact, if you change the location of the leads (assuming you are using a 12-lead setup or similar), it is unlikely to generalize even for a single person from one day to the next.
You may wish to consider solving this problem in the feature space rather than the shape space. The catch22 feature set would be a great starting point [b].
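The feature-space idea, sketched: summarize each window as a fixed-length feature vector and classify those vectors instead of raw shapes. The real catch22 features come from the `pycatch22` package; since I can't assume it's installed here, the hand-rolled features below are simple stand-ins used only to keep the sketch self-contained, and the sine-vs-noise toy data is likewise an assumption.

```python
# Sketch: classify windows by summary features rather than raw shape.
# The four features below are simple stand-ins for the catch22 set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def simple_features(x):
    x = np.asarray(x, dtype=float)
    ac1 = np.corrcoef(x[:-1], x[1:])[0, 1]        # lag-1 autocorrelation
    return np.array([x.mean(), x.std(), ac1,
                     np.abs(np.diff(x)).mean()])  # mean absolute change

rng = np.random.default_rng(1)
# Toy data: noisy sines vs. pure-noise windows as stand-in classes.
pos = [np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.3 * rng.normal(size=200)
       for _ in range(40)]
neg = [rng.normal(size=200) for _ in range(40)]
X = np.array([simple_features(w) for w in pos + neg])
y = np.array([1] * 40 + [0] * 40)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```

A nice side effect of working in feature space is that the features are comparable across people and across lead placements in a way raw waveforms are not, which is exactly the generalization problem noted above.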
[a] https://www.cs.ucr.edu/~eamonn/WeaklyLabeledTimeSeries.pdf
[b] https://www.dropbox.com/scl/fi/3vs0zsh4tw63qrn46uyf9/C22MP_ICDM.pdf?rlkey=dyux24kqpagh3i38iw6obiomq&dl=0