Improved subgroup treatment effect estimation in heart failure trials using Bayesian hierarchical shrinkage models
A Henderson, B Bornkamp, M Packer, S Solomon, B Clagett, P Jhund, J McMurray
Abstract
Introduction
Although Phase 3 trials in heart failure are powered only to detect an overall average effect of treatment, considerable attention is often given to apparent variation in treatment effect among subgroups. However, subgroup analyses are underpowered to detect such differences and can produce spurious findings of heterogeneity in treatment effect. Bayesian hierarchical models, which analyse all data simultaneously and borrow information between subgroups, may reduce this risk. We developed Bayesian hierarchical models to estimate subgroup treatment effects of sacubitril/valsartan in PARADIGM-HF and PARAGON-HF.
Methods
We used data from PARADIGM-HF and PARAGON-HF to compare estimates of the effect of treatment in subgroups using stratification (as is current practice in subgroup analyses) and a hierarchical Bayesian shrinkage model with regularised horseshoe priors. The primary outcome used in this analysis was the composite of time to first occurrence of HF hospitalisation or cardiovascular death. We analysed conventional subgroups (age dichotomised at 75, LVEF dichotomised at the median, NYHA class, sex) and implausible (Zodiac "star signs" at randomisation) subgroups.
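The idea of shrinkage can be illustrated with a much simpler model than the one used here. The sketch below is a normal-normal partial-pooling calculation on log hazard ratios, not the regularised horseshoe model of this analysis; the function name `shrink`, the fixed between-subgroup standard deviation `tau`, and all input numbers are hypothetical and chosen only for illustration.

```python
import math


def shrink(estimates, std_errors, tau):
    """Partially pool subgroup estimates towards a common mean.

    Simplified normal-normal model with a fixed between-subgroup SD `tau`
    (an assumption; a full Bayesian model would estimate it from the data).
    """
    # Precision weights for the pooled mean under the random-effects model.
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    shrunk = []
    for y, se in zip(estimates, std_errors):
        # Shrinkage factor: 0 means no pooling, 1 means full pooling.
        b = se ** 2 / (se ** 2 + tau ** 2)
        shrunk.append((1 - b) * y + b * mu)
    return mu, shrunk


# Hypothetical inputs (not trial data): one imprecise and one precise subgroup.
log_hr = [math.log(0.70), math.log(0.90)]
se = [0.20, 0.05]

mu, shrunk = shrink(log_hr, se, tau=0.10)
for y, s in zip(log_hr, shrunk):
    print(f"stratified HR {math.exp(y):.2f} -> shrunk HR {math.exp(s):.2f}")
```

In this toy setting the imprecise subgroup is pulled much further towards the pooled mean than the precise one, which is the qualitative behaviour a hierarchical shrinkage model is designed to produce.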
Results
The overall average treatment effect (hazard ratio) in PARADIGM-HF was 0.80 (95% CI: 0.73-0.87), and in PARAGON-HF it was 0.92 (95% CI: 0.82-1.04). Although true differences in treatment effect are implausible for Zodiac "star signs", there was apparent treatment heterogeneity in PARADIGM-HF for patients randomised under Aries or Capricorn versus all other "star signs" (P-interaction=0.07). By contrast, Bayesian shrinkage estimates of the effects of treatment for these subgroups were closer to their respective overall estimate and were qualitatively similar to each other (Figure). Stratified estimates for conventional subgroups were more precise, so the Bayesian estimates were shrunk less towards the overall estimate than in the Zodiac example. For example, in the sex subgroup in PARAGON-HF, the fully stratified treatment estimates were 0.81 (95% CI: 0.68-0.96) for males versus 1.04 (95% CI: 0.88-1.23) for females (P-interaction=0.04). Bayesian shrinkage estimates were less extreme and closer to the overall average. They still showed some evidence of differing efficacy, although neither showed strong evidence of a benefit for this endpoint: 0.88 (95% CrI: 0.73-1.02) for males and 0.96 (95% CrI: 0.83-1.14) for females.
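As a rough check on how an interaction P-value relates to stratified estimates, the sketch below back-calculates standard errors from the reported sex-subgroup confidence intervals in PARAGON-HF and applies a Wald test to the difference in log hazard ratios. It assumes the two subgroup estimates are independent and approximately normal on the log scale; the helper names are hypothetical, and this is an approximation from summary statistics rather than the interaction test used in the analysis.

```python
import math


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Return the log hazard ratio and an SE recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(hr), se


def interaction_p(group_a, group_b):
    """Two-sided Wald P-value for a difference between two log hazard ratios."""
    (est_a, se_a), (est_b, se_b) = group_a, group_b
    z_stat = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided tail probability of a standard normal variate.
    return math.erfc(abs(z_stat) / math.sqrt(2))


# Stratified estimates for the sex subgroup as reported in the text above.
males = log_hr_and_se(0.81, 0.68, 0.96)
females = log_hr_and_se(1.04, 0.88, 1.23)

p_value = interaction_p(males, females)
print(f"approximate P-interaction: {p_value:.3f}")
```

Under these assumptions the result lands near the reported P-interaction of 0.04, which shows how a nominally significant interaction can arise from two moderately precise stratified estimates.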
Conclusion
Bayesian hierarchical models offer a balance between completely stratified estimates and the overall estimate for a prespecified subgroup. Using the overall estimate may miss true heterogeneity, and completely stratified results can be misleading because of small sample sizes and the play of chance. Bayesian hierarchical models are a valuable complement to conventional subgroup analyses to help interpret the findings from pivotal heart failure trials.
Figure: Selected subgroup estimates in PARADIGM-