Causal forest in the evaluation of heterogeneity of treatment effects in medicine: basic principles and application

Chinese Journal of Evidence-Based Medicine

Authors：

ZHOU Wenyue¹, YI Fei¹, LI Bingli², SUN Feng³, YANG Zhirong¹,²,⁴

1. Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, P. R. China;
2. Research Center for Biomedical Information Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, P. R. China;
3. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, P. R. China;
4. Primary Care Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB1 8RN, United Kingdom.

Corresponding author：

YANG Zhirong, Email: zr.yang@siat.ac.cn

Keywords：

Causal forest; Treatment effect; Heterogeneity; Principles; Application

DOI：

10.7507/1672-2531.202212074

Abstract：

Randomized controlled trials are the gold standard for evaluating the effects of medical interventions, but they primarily estimate the average effect of an intervention in the overall study population. The effect of the same intervention may nevertheless differ substantially across subpopulations with different characteristics, a phenomenon known as heterogeneity of treatment effects. Traditional subgroup and interaction analyses tend to have low power to detect such heterogeneity or to identify its sources. With the recent development of machine learning techniques, causal forest has been proposed as a novel method for evaluating heterogeneity of treatment effects that can help overcome the limitations of the traditional methods. However, its application in medicine is still at an early stage. To promote the proper use of causal forest, this paper introduces its purposes, principles, and implementation, explains worked examples with R code, and highlights points that require attention in practice.

Citation： ZHOU Wenyue, YI Fei, LI Bingli, SUN Feng, YANG Zhirong. Causal forest in the evaluation of heterogeneity of treatment effects in medicine: basic principles and application. Chinese Journal of Evidence-Based Medicine, 2023, 23(4): 485-491. doi: 10.7507/1672-2531.202212074
