DOI: 10.19102/icrm.2012.030405
¹F. ROOSEVELT GILLIAM III, MD, FACP, FACC, FHRS, ²GREGORY A. EWALD, MD and ³ROBERT J. SWEENEY, PhD
¹Cardiology Associates of Northeast Arkansas, Jonesboro, AR
²Washington University School of Medicine, St Louis, MO
³Boston Scientific CRM, St Paul, MN
ABSTRACT. The decompensation detection (DECODE) study collected data from cardiac resynchronization therapy device (CRT-D) patients via a remote monitoring system to develop and evaluate automated algorithms for detecting heart failure (HF) decompensation events. Patients were enrolled for up to 2 years. Device-based and patient-based data were collected via normal use of the remote monitoring system. Quarterly phone screening identified patients with potential HF events, and when events were confirmed that patient entered a more detailed following status where all medical records were examined from the time of enrollment through the completion of the study. For each such patient with a HF event, another patient without a HF event was randomly selected from the same center for data collection in the more detailed following status. These patients with more detailed following status were randomly assigned to either the development set, used to create a probability model for predicting HF events, or the sequestered evaluation set, used to evaluate the model created from the development set. We were able to produce 48% sensitivity (with two false detections per patient-year) in the development set but only 35% in the evaluation set, and we demonstrated that combining multiple measures improved performance for detection of documented HF hospital admissions or HF intravenous (IV) therapies. Automated early detection of HF decompensation using combined remote-monitored data is possible but, for the data available in this study, the performance was modest, suggesting that additional sensors that are more closely related to HF pathophysiology will be required.
KEYWORDS. cardiac resynchronization, hospitalization, impedance, prediction.
F Roosevelt Gilliam III, MD reports he is a consultant and speaker for Boston Scientific.
Gregory A Ewald, MD reports he is a consultant for Boston Scientific.
Robert J Sweeney, PhD reports he is an employee of Boston Scientific CRM, St Paul, MN.
This study was funded by Boston Scientific CRM, St Paul, MN.
This work is submitted on behalf of the DECODE study investigators.
Manuscript received January 31, 2012, final version accepted February 22, 2012.
Address correspondence to: F. Roosevelt Gilliam III, MD, FACP, FACC, FHRS, Cardiology Associates of Northeast Arkansas, Jonesboro, AR 72401. E-mail: roseygill@hotmail.com
Patients with heart failure (HF) often have repeated hospitalizations and frequently need to limit their activities because of fatigue, shortness of breath, and other symptoms of fluid overload. The care of these patients has a large impact on health-care costs. The direct and indirect costs for managing heart failure in the United States were expected to exceed $37.2 billion in 2009, of which more than half are costs associated with hospitalization.1
Patients and/or caregivers can significantly alter the course of HF (improve functional status, quality of life, and survival) while reducing the costs of hospitalization by maintaining attention to the signs and symptoms of volume overload.2–6 This requires close monitoring of the patient’s clinical condition and compliance with diet, exercise, and medications. Implanted devices (implantable cardioverter-defibrillator (ICD) and cardiac resynchronization therapy device (CRT-D)) have both continuous access and the technical capability to monitor physiologic signals in patients with HF. In addition to their primary device therapy, implanted devices may play an important role in recognizing and reporting the signs of worsening HF. This has the potential to prevent or reduce the need for hospitalization and to have a significant impact on the outcomes and costs of caring for patients with HF.
If device-based and patient-measured data (such as weights and symptoms) are collected and stored in a centralized system, automated detection algorithms could frequently examine data to look for signs of worsening HF with little additional expenditure of clinician resources. Such algorithms could provide ongoing patient monitoring between routinely scheduled clinical visits.
To create such algorithms, the Decompensation Detection study (DECODE) was designed to collect data routinely available from patients with CRT-D devices via the normal use of a remote monitoring system and to collect a detailed history of HF decompensation events in enrolled patients. The goal was to develop and evaluate an algorithm that could assist in the early detection of worsening HF.
DECODE is a US multicenter observational study that collected data (up to 2 years) from patients implanted with a CRT-D device and prescribed to a remote monitoring system (LATITUDE Patient Management System, Boston Scientific CRM, St. Paul, MN).7,8 This remote monitoring system enabled automatic in-home collection of device data, patient weight, blood pressure, and symptoms as part of routine use. Data regarding patient medical information and HF status were collected from medical records by the investigators and at 3-month intervals in patients selected for a more detailed following status (see below). The study was designed to end when patients selected for the more detailed following status contained approximately 100 patients with protocol-defined HF events. Accordingly, individual patient duration in the study depended on their enrollment date, withdrawal date, and the date of the decision to end the study. Study planning suggested that patients would be enrolled for a minimum of 8–12 months unless they were withdrawn.
Study inclusion criteria required patients to have an implanted CRT-D device (RENEWAL, Boston Scientific CRM, St Paul, MN) programmed to a non-rate-responsive pacing mode (DDD or VDD) to increase the availability of heart rate variability diagnostics in the study population; to be prescribed to the remote monitoring system with configuration for weekly device interrogation and symptom reporting; and to be willing to measure weights daily. Exclusion criteria included the need for rate-responsive pacing or the need for routinely scheduled IV decongestive therapy for HF in the 90 days prior to enrollment.
Worsening HF events were defined as signs and/or symptoms consistent with congestive HF requiring hospitalization (patient confined in a hospital long enough for a calendar date change) or unscheduled intravenous decongestive therapy (e.g. IV diuretics, IV inotropes, IV vasoactive drugs).
All patients enrolled in the study could be selected for a more detailed following status if a protocol-defined HF event occurred. Data from those patients in the more detailed following status were used to develop and evaluate HF predictive algorithms. All patients had phone follow-up on a 3-month interval to screen for HF events. If a patient’s phone answers suggested a defined HF event may have occurred, the investigator obtained and examined the patient’s pertinent medical records.
Once a protocol-defined HF event was confirmed from medical records, that patient was selected to be included in the more detailed following status group (Figure 1). For patients in this group, in addition to the 3-month phone questions, the investigator also examined medical records from all sources from the time the patient was prescribed to LATITUDE to the current date to confirm protocol-defined events.
Figure 1: Selection of patients into the detailed following and subsequent assignment to development or evaluation groups.
For each patient entered into the more detailed following status due to a protocol-defined HF event, another patient was randomly selected from the same center and moved to the more detailed following status to provide additional event-free data for algorithm development and evaluation.
Upon entering the more detailed following status, patients were randomly assigned (50% probability) to either a development or an evaluation set. Both patients and investigators were blinded to this assignment. Data collected from those patients assigned to the development set were used for HF-predictive algorithm development. The data from the evaluation set were sequestered throughout the study. Once final predictive algorithms were developed, the evaluation set data were opened, and the HF-predictive algorithm performance was formally evaluated using the evaluation set patient group.
Informed consent was required and obtained from all patients prior to data collection, and the study was approved by the institutional review board at all investigating centers. The authors had full access to the data for the manuscript and take responsibility for its integrity. All authors have read and agree to the manuscript as written.
Available data
In addition to the basic demographic and other medical information collected at enrollment, the data available for algorithm development also included both patient-measured data and device-measured data. The patient-measured data required patient compliance and included daily weight and blood pressure measurements and weekly responses to the symptom questions shown in Table 1. Device-measured data did not require patient compliance. Device-based patient-diagnostic data included activity level; minimum, average, and maximum heart rate; an estimate of sympathetic/parasympathetic balance (ABM); and measures of heart rate variability—SDANN and FootPrint area. Other device-based data included daily impedances for right atrial (RA), right ventricular (RV), left ventricular (LV), and shock electrodes. Additionally, because of weekly remote monitoring, the device’s rate histograms permitted a weekly percentage of the time spent at a high atrial rate (>140 bpm) to be calculated. The high atrial rate percentage is related to but different from the device-reported atrial burden that instead reports percentage of time spent in an atrial tachy-response (ATR) fallback mode.
Model development
All data elements measured by the device or patient were examined for potential use in a multivariate “worsening HF event” probability model. For most data elements, both the absolute value and the change relative to 14- to 91-day rolling baselines were examined. For symptoms and high atrial rate percentage, only absolute values were examined. Using only the development set data, the potential predictive content for each element was estimated by paired univariate comparison of mean values in the 28 days prior to documented HF events to mean values at all other times in each patient using paired Wilcoxon and t-tests. Data elements with strong univariate predictive content (i.e. p<0.005 for both tests) were selected for potential use in a multivariate logistic regression model to determine the probability of a worsening HF event (pre-defined as above) occurring within the next 28 days. Using backwards selection, univariate predictors were iteratively removed if they failed to show a strong (i.e. effects test p<0.005) independent improvement of model output.
The resulting model for probability of being within 28 days of an impending worsening HF event was:
where
and
Δ ShockZ is the change in shock lead impedance.
Δ RAZ is the change in right atrial lead impedance.
Δ Weight is the change in weight in pounds.
Δ HRmin is the change in minimum heart rate.
Δ SDANN is the change in SDANN.
Fatigue is 1 when unusually fatigued and –1 otherwise.
Walk/Climb is –1 when no difficulty reported and 1 when symptom is present.
Pillows is 1 when two or more pillows used and –1 otherwise.
It is expected that this model is too complicated for manual processing. Rather, it is envisioned that the resulting probability will be automatically computed on a remote monitoring system on each day having new remotely collected data. The remote monitoring system could then issue an alert when the probability exceeds some threshold value.
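As an illustration of how such a daily probability might be computed automatically, the following is a minimal sketch assuming the standard logistic form implied by the logistic regression described above. The intercept, the coefficients (including their signs), and the names `hf_event_probability` and `should_alert` are hypothetical placeholders and are not the fitted values from this study.

```python
import math

# HYPOTHETICAL placeholder values: the fitted intercept and coefficients
# are not given in the text, so these numbers (and their signs) are
# illustrative assumptions only.
INTERCEPT = -2.0
COEFFICIENTS = {
    "delta_shock_z": -0.02,  # change in shock lead impedance
    "delta_ra_z": -0.01,     # change in right atrial lead impedance
    "delta_weight": 0.10,    # change in weight (pounds)
    "delta_hr_min": 0.03,    # change in minimum heart rate
    "delta_sdann": -0.02,    # change in SDANN
    "fatigue": 0.30,         # +1 when unusually fatigued, -1 otherwise
    "walk_climb": 0.25,      # +1 when symptom present, -1 otherwise
    "pillows": 0.20,         # +1 when two or more pillows used, -1 otherwise
}


def hf_event_probability(features):
    """Return a logistic probability from a dict keyed like COEFFICIENTS."""
    x = INTERCEPT + sum(COEFFICIENTS[k] * features[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-x))


def should_alert(features, threshold):
    """Flag a detection when the daily probability reaches the threshold."""
    return hf_event_probability(features) >= threshold
```

With real coefficients, a remote monitoring system would call `hf_event_probability` once per day of new data and compare the result with a chosen threshold.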
Model evaluation
To evaluate the worsening HF model, the model probability was compared to a threshold value that was varied to form an algorithm performance curve as sensitivity versus false detection rate. A lower threshold would lead to higher sensitivity but more false detections, whereas a higher threshold would have the opposite effect. When the daily probability exceeded the selected threshold, a worsening HF detection was produced, but in this study no actual alerts were sent to the clinicians or patients. Once a detection was produced, a new detection was inhibited for the next 14 days. A detection was considered true if there was a documented HF decompensation event within 28 days following the day of the detection; however, each documented HF event could only be successfully detected once. If there was no documented HF decompensation within 28 days, the detection was considered false. A documented HF event occurring within 14 days after a previous documented HF event was considered undetectable since its pre-event data would be corrupted by the previous event. Such events did not contribute to the true or false detection rate. This same evaluation methodology was used to determine algorithm performance in both the development and evaluation sets.
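The scoring rules above can be sketched for a single patient as follows. This is an illustrative reading rather than the study's actual software: the function name `evaluate_detections` and the one-entry-per-day data layout are assumptions, undetectable events are assumed to have been removed from `event_days` beforehand, and "detected once" is interpreted as each event counting at most once toward sensitivity.

```python
def evaluate_detections(daily_prob, event_days, threshold,
                        inhibit_days=14, window_days=28):
    """Score one patient's daily probabilities against documented events.

    daily_prob: list with one probability per day (None when unavailable).
    event_days: day indices of detectable HF events (assumed pre-filtered).
    Returns (sensitivity, false detections per patient-year).
    """
    detected_events = set()
    false_detections = 0
    next_allowed_day = 0
    for day, prob in enumerate(daily_prob):
        if prob is None or day < next_allowed_day or prob < threshold:
            continue
        # A detection fires; further detections are inhibited for 14 days.
        next_allowed_day = day + inhibit_days + 1
        upcoming = [e for e in event_days if day < e <= day + window_days]
        if upcoming:
            # True detection; the set ensures each event is counted once.
            detected_events.update(upcoming)
        else:
            false_detections += 1
    sensitivity = len(detected_events) / len(event_days) if event_days else 0.0
    patient_years = len(daily_prob) / 365.25
    return sensitivity, false_detections / patient_years
```

Repeating the call over a range of `threshold` values and pooling across patients would trace out a performance curve of sensitivity versus false detections per patient-year.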
The algorithm was designed to become active only after remote data collection was consistently available (i.e. >50% of time) and to remain active until data collection fell to an insufficient level (i.e. <25% of the time). Worsening HF events occurring before the algorithm became active or after it became inactive were not used in algorithm development or evaluation. Similar availability criteria were incorporated so that the algorithm also used weights, symptoms, and HRV data when these data were consistently available. In the model, remote monitored symptoms responses were considered “symptomatic” unless the patient selected the response that indicated “no symptoms” were present. All other data elements were used as collected.
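The activation rule described above behaves like a simple hysteresis on data availability. The sketch below is one possible reading: the 28-day trailing window over which availability is measured is an assumption (the text does not state the window), and `active_days` is a hypothetical name.

```python
def active_days(available, window=28, on_frac=0.50, off_frac=0.25):
    """Return a per-day list of booleans: is the algorithm active?

    available: list of booleans, True when remote data arrived that day.
    window: ASSUMED trailing window (days) for measuring availability.
    """
    active = False
    result = []
    for day in range(len(available)):
        recent = available[max(0, day - window + 1): day + 1]
        # Days before the start of the record count as unavailable.
        frac = sum(recent) / window
        if not active and frac > on_frac:
            active = True      # data consistently available (>50%)
        elif active and frac < off_frac:
            active = False     # data fell to an insufficient level (<25%)
        result.append(active)
    return result
```

Using separate on and off thresholds keeps the algorithm from toggling repeatedly when availability hovers near a single cutoff.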
All statistical comparisons and the model development were performed using JMP 7 (SAS Institute, Inc., Cary, NC).
Study follow-up
Between March 2006 and May 2007, DECODE enrolled 699 patients at 46 centers with a total enrolled duration of 759.5 patient-years. One hundred and ninety-three (27.6%) patients were withdrawn prior to the end of the study due to death (63), patient/physician decision (33), change of location/clinician (30), failure to use remote monitoring system (25), or other reasons (42). Eight percent (56) of the patients were withdrawn prior to the first follow-up and so had no opportunity for HF events to be reported. These 56 patients were excluded, leaving 643 patients for analysis. Of those, 201 patients entered the more detailed following status and were assigned into either the development set (86 patients, 14.6 ± 4.3 months, 104.9 patient-years) or evaluation set (115 patients, 15.6 ± 4.6 months, 149.0 patient-years, p=0.15 versus development). One patient assigned to the development set was withdrawn prior to the start of remote data collection and thus did not contribute to the analysis.
Remote data collection started at a median of 6 days after study enrollment (average, 20 days). Once started, remote monitoring collected device data at approximately 95% of all scheduled collections and patients reported symptom data at 82% of all scheduled collections. Weights were available on 53% of the days with at least three weights/week reported in more than 63% of patients. Blood pressure data were similarly available 48% of the time. Device-based heart rate variability measures were available 45% of the time, but unrelated to the availability of weights or blood pressure.
Of the 643 patients (720.8 patient-years) there were 198 documented HF events in 117 patients between enrollment and the last follow-up (detailed following had 189 events in 108 patients). In the detailed following patients, 157 HF events occurred in 94 patients while remote data were available. Twenty-two of these occurred less than 14 days after a previous event leaving 135 HF events (112 admissions, 23 outpatient) in 94 patients for algorithm development and evaluation.
At enrollment, there were no significant differences between patients assigned to the development and evaluation sets, as shown in Table 2. For each data element, raw values for each patient were averaged over the study duration (with periods 28 days before, during, and 14 days after documented HF events excluded). These patient values were then averaged across patients to compare the development and evaluation sets as shown in Table 3. The development set patients had slightly lower (3–4 mmHg) diastolic blood pressure, lower RA lead impedance and lower shock lead impedance. These impedance differences were not due to a different distribution of lead model types. The evaluation set patients used more pillows.
Algorithm development
The detection algorithm was developed prior to examination of the evaluation set patient data. For development set patients with at least one documented HF event, Table 4 shows data during the 28-day periods before the events and for other times. Values (except symptoms and high atrial rate percentage) were the difference between the current day’s measurement and the average of measurements during a 4-week rolling baseline ending 14 days earlier. For SDANN and heart rates, a 91-day rolling baseline was used. Symptom values were the percentage of symptomatic responses. A difference between pre-event and other times would suggest that the data element changed prior to the decompensation and might therefore be useful to detect it.
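The baseline-relative values described above can be illustrated with a short sketch. It assumes one measurement per day stored in a list, with `None` marking missing days; the function name `change_from_baseline` and the handling of missing data are assumptions, not details taken from the study.

```python
def change_from_baseline(values, day, baseline_days=28, gap_days=14):
    """Difference between today's value and a rolling baseline average.

    The baseline window is `baseline_days` long and ends `gap_days`
    before `day` (4 weeks ending 14 days earlier by default; pass
    baseline_days=91 for SDANN and heart rates).
    Returns None when the value or the baseline is unavailable.
    """
    end = day - gap_days               # last day of the baseline window
    start = end - baseline_days + 1    # first day of the baseline window
    if start < 0 or values[day] is None:
        return None
    window = [v for v in values[start:end + 1] if v is not None]
    if not window:
        return None
    return values[day] - sum(window) / len(window)
```

For example, a patient whose weight was steady at 100 lbs for six weeks and then measured 103 lbs would yield a change of 3 lbs on that day.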
All data elements in Table 4 were initially incorporated into a multivariate logistic regression model to form the probability that the current day was within 28 days before a documented HF event. An effects test for a strong independent contribution to the model was used to remove data elements, starting with the least significant effect, until all remaining data elements had a strong significant independent effect to improve the model.
This regression process left the following development-set data elements in the model: RA and shock lead impedances, weight, the “Fatigue”, “Difficulty Walking/Climbing” and “Number of Pillows” symptoms, minimum heart rate, and SDANN.
Algorithm performance
Figure 2 shows the performance of the detection algorithm based on this model in the development and evaluation datasets. The threshold probability used to detect worsening HF varies with points along the curve. At a 2.0/patient-year false-detection rate, these probability thresholds were approximately 0.14 and 0.18 in the development and evaluation sets respectively. As seen in Figure 2, the detection algorithm showed a sensitivity about 13 percentage points lower for detecting impending HF events in the evaluation set than in the development set (35% versus 48% at two false detections per patient-year). Since the assignment of patients to the evaluation or development sets was completely random, there was no reason to suspect that performance in the evaluation set would be any different than in the development set. For comparison, the open triangle shows the percentage of true HF events that would have been detected solely by chance by having an equivalent number of detections randomly appearing in the datasets.
Figure 2: Performance of the algorithm in development (closed circles) and evaluation (open circles) sets. The y-axis is sensitivity for algorithm to make true detections within 28 days before the detectable heart failure events in the development (54) or evaluation (81) sets. The x-axis is the number of false detections per patient-year. The algorithm’s detection probability threshold varies along each curve. Open triangle shows performance for same number of detections placed at random throughout the dataset.
The results of the DECODE trial demonstrate that implantable device-measured and patient-measured data elements do change leading up to HF decompensation events and that these measures could be used in an automated algorithm to detect impending HF events. The performance of the HF detection algorithm based on the available data elements in this study was not sufficient to reliably identify patients in need of care for worsening HF.
Patient involvement in the study was minimal (3-month phone calls) and thus a good reflection of HF standard of care and routine use of the remote monitoring system. Accordingly, these results are likely to be credible representations for the developed algorithm’s potential performance in the broader remote-monitoring population. Although the algorithm performance was modest, it might still provide useful additional information for guiding the management of HF patients.
Figure 3 shows the algorithm performance in the combined (triangles) datasets (201 patients, 135 HF events). The performance is similar without heart rate variability data (open circles), suggesting that heart rate variability improved performance only slightly in the combined set. Since approximately one-half of typical CRT-D patients are programmed to a rate-responsive mode (where heart rate variability is not available) the two performance curves may reflect the algorithm’s potential performance in the remote-monitored CRT-D population. Figure 3 suggests that when applied to the entire data set, about 40% of HF hospitalizations and unscheduled visits for IV decongestive therapies might have been successfully detected at a cost of approximately two false detections per patient-year. For a center following 100 HF patients with remotely monitored CRT-Ds, 25–30 HF decompensation events would be expected in a year. A detection algorithm with the performance of Figure 3 would predict 10–12 of these events while generating about four false detections per week. The practicality of a worsening HF detection algorithm depends on how much time and resources are expended due to false detections and how much benefit is derived from advance notice for true detections and possible prevention of ER visits and hospitalizations. As a patient management tool, an algorithm-based detection would not necessarily suggest that invasive intervention was required. Rather, it might help identify patients who could benefit from additional attention or phone contact to determine whether further evaluation and treatment are warranted.
Figure 3: Performance of the detection algorithm in the combined (open triangles) development and evaluation sets. The y-axis is sensitivity for algorithm to make true detections within 28 days before the 135 detectable heart failure (HF) events. The x-axis is the number of false detections per patient-year. The similar performance for the combined data when HRV data are removed from the algorithm is shown (open circles). The algorithm’s detection probability threshold varies along each curve. Open triangle shows performance for same number of detections placed at random in the dataset.
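The center-level workload estimate given above (100 patients, 25–30 events per year, about 40% sensitivity, two false detections per patient-year) can be checked with a short calculation. The function name `center_workload` is illustrative, and a 52-week year is assumed.

```python
def center_workload(n_patients, events_per_year, sensitivity,
                    false_per_patient_year):
    """Return (events detected per year, false detections per week)."""
    detected = events_per_year * sensitivity
    false_per_week = n_patients * false_per_patient_year / 52
    return detected, false_per_week


low = center_workload(100, 25, 0.40, 2.0)   # lower end of the event range
high = center_workload(100, 30, 0.40, 2.0)  # upper end of the event range
```

Under these inputs the detected events come to roughly 10 and 12 per year, and the 200 false detections per year work out to a little under four per week, in line with the figures quoted in the text.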
In Figure 3, the probability threshold that produced two false detections/patient-year was 0.17 (detection when daily probability for impending event was ≥0.17). Higher or lower thresholds could be used and the threshold could be increased for patients with higher false-detection rates. Individualizing thresholds for each patient was not part of this study. However, we note that at two false detections/patient-year, 26% of the patients produced about 55% of the false detections suggesting that individualizing the detection thresholds might be useful.
Development versus evaluation set performance
The most plausible explanation for the difference in algorithm performance between the development and evaluation sets is that the algorithm was over-fitted to the development set. However, further analyses revealed that the sets were different in nature with regard to their predictive content in those data elements used by the model (see Table 5). Although HF event rates were not different between the development and evaluation sets (0.75 versus 1.08 HF events/patient-year, p=0.52), all model data elements had smaller changes (i.e. less predictive content) in the evaluation versus development set. For weight, fatigue, and number of pillows, the changes were significantly smaller. The reason for these differences between randomly selected datasets is unknown. Taken in light of the significant differences between sets (Table 3), these observations help explain the lower performance in the evaluation set and suggest that the sets may have been too small.
Algorithm redevelopment
Once the formal evaluation was performed, the datasets were pooled to provide additional insight. The predictive content analysis for potential model data elements was repeated using all data from both sets. Reanalyses suggested that instead of 28-day pre-event periods for the model, 14-day pre-event periods were more appropriate for most data elements whereas a 21-day period was appropriate for symptoms. With a too-long pre-event window, the predictive content of a data element could be diluted by data that were not yet changing before the event. Additionally, the interpretation of the symptom responses considered “symptomatic” was revised to reflect those responses that increased the most significantly prior to true HF decompensations. With the benefit of these additional insights, the algorithm was redeveloped. Using the same methodology, a model for the probability of being within 14 days before a HF event was developed. All data elements were included in model redevelopment but only the following data elements made independent contributions and were thus used in the redeveloped model: RA, RV, and shock electrode impedances, the fatigue and waking breathless symptoms, weight, and the SDANN and average heart rate. The performance of the redeveloped algorithm to detect documented HF events within 28 days is shown in Figure 4. The redeveloped algorithm performance is only slightly better than the original algorithm performance. This suggests that performance is limited by the predictive content of the various data elements.
Figure 4: Performance of the redeveloped algorithm (closed circles) in the combined sets. The y-axis is sensitivity for algorithm to make true detections within 28 days before the 135 detectable heart failure (HF) events. The x-axis is the number of false detections per patient-year. The similar performance is shown using only device and symptoms data (open circles), and device and symptom data with weights (open triangles) or HRV (open squares) added. Large open triangle shows performance for same number of detections placed at random in the dataset.
Combining multiple data types
The examination of the data collected in this study leading up to documented HF events suggests that no single data element consistently changed before an event in the same direction for all patients. For example, of 85 patients with events that consistently measured weights, 35 had HF events and weights measured within 4 days before the event permitting a direct weight gain to be calculated. Of these patients, 69% had a weight increase prior to the event (median increase 3.6 lbs) but 31% had a weight loss (median loss –3.0 lbs). While weight increase may reflect fluid overload in advance of documented HF events, it may be that fluid overload does not always accompany HF events or that, in some patients, other factors controlling weight dominate the weight change or both. RV lead impedance is another example where 76% of patients had an impedance decrease (median –28 ohms) before the HF events but 24% had an increase (median 5 ohms). Although the reason for the relationship between RV pacing lead impedance and HF decompensation is not currently well understood, it may be that other factors dominate impedance changes in some patients. All data elements examined showed this type of variation among patients with regard to data before the HF events.
Although the concept of a single marker for predicting impending HF decompensation is appealing, this variation among patients suggests that a successful algorithm for early detection of HF events would need to make use of multiple sensors. The approach used in this study estimated the probability of an impending HF event by making use of all the data sources that were available at the time. As shown in Figure 4, the benefit of combining data sources was clear from the progressive improvements in sensitivity (at two false positives/patient-year) using device-based impedances/symptoms alone (33%), both device impedances/symptoms and weights (37%), and using all data elements (41%). These findings support the use of multiple factors to improve performance. The modest performance of the detection algorithm suggests that additional data sources that are more closely related to the physiology of decompensated HF could add significant performance improvement. The utilization of ancillary data to predict clinical events may be of great help to the clinician to remotely monitor device function, and this has been demonstrated multiple times. The ability to remotely monitor a disease presents greater challenges. Much of clinical medicine is subjective, and the ability to disregard some clinical findings while placing a greater weight on others is part of the challenge of the practice of clinical medicine. It must be recognized that any parameter, while seemingly important or abnormal, may not be indicative of an impending clinical event. The challenge we face in remotely managing a disease without direct contact with the patient is to determine whether the data collected are adequate to make clinical observations. Recognizing that remote monitoring can either overdiagnose or underdiagnose a problem is critical for the proper use of these systems. The increased use of these systems will be accompanied by a clearer understanding of their place in the management of complex diseases.
Comparison with earlier works
Patient weight increased by about 1.5–2 lbs starting 3–4 weeks before a HF event, consistent with previous studies by Chaudhry et al.9 Lead impedances in this study were approximately 3.5% lower in the 7 days prior to HF decompensation events than at other times. The reduction was smaller than the 12.3% decrease in transthoracic impedance before HF hospitalization that has been previously reported,10 but different impedance vectors and times of day were used. This study used pacing lead impedances (e.g. between the RV tip and RV ring electrodes) that reflect the impedance of the lead conductors and tissue very local to the pacing electrodes. It is unlikely that these pacing impedances could have been substantially affected by the altered lung impedance, which was proposed as the main factor affecting intrathoracic impedance changes.11 The shock electrode impedance was more global and could have been affected by lung impedance. However, for devices in this study, the shock impedance was measured between the RV coil and the device, which was electrically connected to the proximal defibrillation coil. Because of the proximal coil electrode, the ability of lung impedance to affect shock lead impedance may be reduced. In previous studies, transthoracic impedance was measured from the RV coil (or ring or tip) to the device can.10 Further, impedances in the DECODE study were single daily measurements at the same time of day for the same patient, typically between 9:30 AM and 4:00 PM with peaks at 11:00 AM and 3:30 PM. Previously reported transthoracic impedance measures10 were averages between noon and 6:00 PM for each patient.
This study’s findings generally support the conclusions in a recent review by Adamson11 with regard to changes in heart rate and heart rate variability, but the device-based activity in this study was not significantly reduced prior to HF decompensations. The high atrial rate percentage from this study was only a surrogate for the presence of AF, since noise or oversensing could also affect it. However, a high atrial rate percentage did not appear to change prior to HF decompensation events.12 On the other hand, as Adamson11 suggests, the presence of AF may be useful to assess HF risk since the high atrial rate percentage was significantly higher in DECODE study patients that did (3.4±9.0) versus did not (2.0±7.3, p = 0.0002) decompensate during the study. Regarding Adamson’s suggestion that patients with changing or low heart rate variability should be considered at high risk for worsening HF, in the DECODE study patients who decompensated had similar SDANN (70.5±18.9) to those who did not (73.3±19.5, p = 0.53 by), although SDANN was significantly lower (–5.6 ms, p = 0.0002) in the 14 days prior to documented HF events.
Direct comparison with other studies that assessed the performance of early detection of impending HF events is difficult, since the definitions of a HF event have differed substantially. Most involved the use of transthoracic impedance10,13–16 monitoring to detect worsening HF. Hospital admission or the need for IV therapy for HF was usually counted as a worsening HF event, but some studies also accepted a medication adjustment or other treatment for HF as evidence that a detection was a true positive. Using those studies’ reported data to estimate sensitivity for early detection of IV therapy or HF hospitalizations (where possible) led to a median estimated sensitivity of 58.5% (range 10.5–76.0%). A similar median estimate for the false-positive rate was 0.67/patient-year (range 0.46–1.5). The algorithm of this study did not achieve these performance estimates; the difference may reflect a stronger connection of transthoracic impedance with decompensation.
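The two figures used throughout this comparison, sensitivity and false detections per patient-year, can be made concrete with a short sketch. This is only an illustration of the arithmetic, not the DECODE analysis code: the function name, the 365.25-day year, and every count in the usage example are hypothetical placeholders chosen to give round numbers.

```python
from dataclasses import dataclass


@dataclass
class DetectionSummary:
    """Two summary metrics for an early-detection algorithm."""
    sensitivity: float                        # fraction of HF events detected
    false_detections_per_patient_year: float  # false alerts per year of follow-up


def summarize_detection(events_detected: int,
                        events_total: int,
                        false_detections: int,
                        follow_up_days: float) -> DetectionSummary:
    """Compute sensitivity and the false-detection rate per patient-year.

    follow_up_days is the total monitored time summed over all patients.
    The 365.25-day year is an assumption of this sketch.
    """
    if events_total <= 0 or follow_up_days <= 0:
        raise ValueError("events_total and follow_up_days must be positive")
    patient_years = follow_up_days / 365.25
    return DetectionSummary(
        sensitivity=events_detected / events_total,
        false_detections_per_patient_year=false_detections / patient_years,
    )


# Hypothetical counts (NOT study data): 7 of 20 events detected and
# 100 false detections over 50 patient-years of monitoring.
summary = summarize_detection(events_detected=7, events_total=20,
                              false_detections=100, follow_up_days=18262.5)
print(f"sensitivity = {summary.sensitivity:.0%}")
print(f"false detections = {summary.false_detections_per_patient_year:.2f}/patient-year")
```

Because each study defined a true-positive event differently, the same detector can yield different values of both metrics depending on which events enter `events_total`, which is one reason the cross-study estimates above are only approximate.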
Study limitations
Only worsening HF episodes leading to a protocol-defined HF event (i.e. HF admission or outpatient IV treatment) were counted as events for DECODE. It is unknown whether some false detections were due to worsening HF episodes that subsided (due to therapy or improved patient compliance) and thus failed to produce a protocol-defined event.
Another limitation was that 56 (8%) patients were withdrawn prior to the first 3-month follow-up. Little or no remotely monitored data were collected for these patients, and it is unknown whether HF events occurred. Also, 75 (11%) patients (including some with documented HF events) were withdrawn prior to remote data collection. DECODE started just after the initial launch of the remote monitoring system, and many of these early withdrawals occurred because clinicians and patients were unfamiliar with the system.
Another limitation is that 32 (17%) of the 189 documented HF events in the detailed following status occurred while remotely monitored data were not being collected, mostly because the events occurred before remote data collection started. If HF events had occurred at random throughout the study, then only six events would have been expected to be lost prior to remote data collection, suggesting that the density of events was higher near study enrollment. There are insufficient data for further analysis, but it may be that clinical visits associated with worsening HF presented a good opportunity to enroll study patients, so that, overall, patients had a worse HF status and more frequent events near enrollment.
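The expected count quoted above follows from assuming a constant event rate over follow-up: the expected number of events in any window is the total event count scaled by that window's share of the total exposure time. A minimal sketch of that calculation is given below. The function name is hypothetical, and the exposure values are placeholders picked only so that the result lands near the six events stated in the text; the actual follow-up durations are not reported in this section.

```python
def expected_events_in_window(total_events: int,
                              window_exposure: float,
                              total_exposure: float) -> float:
    """Expected events in a window if events occur at a constant rate.

    Exposures may be in any unit (e.g. patient-years) as long as both
    arguments use the same one.
    """
    if total_exposure <= 0 or not 0 <= window_exposure <= total_exposure:
        raise ValueError("need 0 <= window_exposure <= total_exposure, total > 0")
    return total_events * window_exposure / total_exposure


# 189 documented events is taken from the text. The exposure split is a
# placeholder (NOT study data): 3.2% of follow-up time assumed to fall
# before remote data collection started.
expected_lost = expected_events_in_window(total_events=189,
                                          window_exposure=3.2,
                                          total_exposure=100.0)
print(f"expected events before remote collection: {expected_lost:.1f}")
```

An observed count well above this expectation is what points to a higher event density near enrollment.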
Another limitation is that the algorithm did not consider the effects of atrial or ventricular arrhythmia episodes, premature ventricular complexes, other rhythm disturbances, or loss of biventricular pacing.
The developed algorithm had only modest sensitivity when formally applied to the sequestered evaluation set, suggesting that remote monitoring of CRT-D and patient-measured data can result in early detection of approximately 35% of worsening HF events with a false-detection rate of two per patient-year. These findings support the hypothesis that it is possible to use remotely collected device and patient-measured data to detect impending HF decompensations, although the algorithm developed in this study achieved limited sensitivity at an acceptable false-positive detection rate in a typical HF patient population. While the device and patient-measured data could be useful for individual patient management, additional sensors more closely related to the pathophysiology of worsening HF may be necessary to create a widely applicable automated detection algorithm with high performance levels.