SMILESynergy: Anticancer drug synergy prediction based on Transformer pre-trained model

Journal of Biomedical Engineering

Authors: ZHANG Liqiang^1,2, QIN Yufang^1,2, CHEN Ming^1,2

1. College of Information, Shanghai Ocean University, Shanghai 201306, P. R. China;
2. Key Laboratory of Fisheries Information, Ministry of Agriculture and Rural Affairs, Shanghai 201306, P. R. China

Corresponding author: QIN Yufang, Email: yfqin@shou.edu.cn

Keywords: Synergy; Deep learning; Attention; Transformer

DOI: 10.7507/1001-5515.202209043

Synergistic drug combinations can help overcome the acquired resistance that develops under single-drug therapy, and they have great potential for treating complex diseases such as cancer. To explore how interactions between different drug molecules affect anticancer efficacy, this study proposes SMILESynergy, a Transformer-based deep learning prediction model. First, the drug molecules were represented as text using the simplified molecular input line entry system (SMILES), and alternative SMILES strings for each drug molecule were generated through SMILES Enumeration for data augmentation. Next, the attention mechanism in the Transformer was used to encode and decode the augmented drug molecules. Finally, a multi-layer perceptron (MLP) was attached to output the synergy value of the drug pair. In the experiments, the model achieved a mean squared error of 51.34 in regression analysis and an accuracy of 0.97 in classification analysis, outperforming the DeepSynergy and MulinputSynergy models. With this improved predictive performance, SMILESynergy can help researchers rapidly screen for optimal drug combinations and thereby improve cancer treatment outcomes.
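
To make the first step concrete, the following is a minimal sketch of how a SMILES string might be split into tokens and mapped to integer IDs before being passed to a Transformer. It is an illustration only and is not taken from the paper: the regular expression is a pattern widely used for SMILES tokenization, and the function names (`tokenize_smiles`, `build_vocab`, `encode`), the special tokens `<pad>` and `<unk>`, and the fixed-length padding scheme are all assumptions made for this example. The abstract does not state how SMILESynergy tokenizes its input or how the two drugs of a pair are combined.

```python
import re
from typing import Dict, List

# A regex commonly used to split SMILES into chemically meaningful tokens.
# Bracket atoms such as [NH4+] stay whole, and two-letter atoms (Cl, Br)
# are matched before their one-letter prefixes.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; fail loudly on unknown characters."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES string: {smiles!r}")
    return tokens


def build_vocab(smiles_list: List[str]) -> Dict[str, int]:
    """Assign an integer ID to every token seen in the given SMILES strings."""
    # Hypothetical special tokens chosen for this sketch.
    vocab = {"<pad>": 0, "<unk>": 1}
    for smiles in smiles_list:
        for token in tokenize_smiles(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert a SMILES string to a fixed-length list of token IDs."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokenize_smiles(smiles)]
    ids = ids[:max_len]  # truncate sequences that are too long
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad short ones


if __name__ == "__main__":
    vocab = build_vocab(["CCO", "c1ccccc1"])
    print(tokenize_smiles("CC(=O)Cl"))
    print(encode("CCO", vocab, max_len=5))
```

Because SMILES Enumeration produces several valid strings for the same molecule, each enumerated string would pass through the same tokenizer, so one drug contributes multiple token sequences to the training set.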

Citation: ZHANG Liqiang, QIN Yufang, CHEN Ming. SMILESynergy: Anticancer drug synergy prediction based on Transformer pre-trained model. Journal of Biomedical Engineering, 2023, 40(3): 544-551. doi: 10.7507/1001-5515.202209043
