Drug Repositioning with Disease-Drug Clusters from Word Representations

Journal: Proceedings - 2022 IEEE International Conference on Big Data and Smart Computing, BigComp 2022

Citation: Proceedings - 2022 IEEE International Conference on Big Data and Smart Computing, BigComp 2022, pp.182-189

Keyword: Disease-drug clustering; Drug repositioning; Text mining; Word embedding; Word2vec

Mesh Keyword: Candidate drugs; Clusterings; Disease-drug clustering; Drug repositioning; Embeddings; Text data; Text-mining; Word embedding; Word representations; Word2vec

All Science Classification Codes (ASJC): Artificial Intelligence; Computer Science Applications; Computer Vision and Pattern Recognition; Information Systems and Management; Health Informatics

Abstract: With easy access to vast amounts of text data, many studies in the biomedical field have applied text mining. However, most retrieve information solely from the perspective of either diseases or drugs. Going beyond this boundary, we propose an approach that embeds both diseases and drugs from biomedical literature, determines direct relationships between them, and identifies possibilities for drug repositioning. To embed diseases and drugs, we use the word2vec algorithm to generate an embedded word vector for each disease and drug. Hierarchical clustering with Ward's method is then applied for categorization. Moreover, we suggest an evaluation measure that compares clusters from the text data with results from the molecular biology level. The proposed method was applied to 17,606,652 MEDLINE abstracts, from which 4,163 diseases and 3,930 drugs were extracted. By examining heterogeneous clusters in which both diseases and drugs exist, nine candidate drugs were deduced for each disease in combination with 79 diseases and 84 drugs. The results are expected to serve as a baseline for the preliminary selection of candidate drugs for drug repositioning.
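
Illustrative sketch (not the authors' code): the clustering step described in the abstract could look roughly like the following. It assumes the word vectors already exist. The term names, the two-dimensional vectors, and the helper names (`ward_clusters`, `candidate_pairs`) are hypothetical and chosen only for illustration. Ward's method is written out naively in plain Python so the merge criterion is visible. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be used instead.

```python
from itertools import combinations

# Hypothetical toy embeddings: term -> (entity type, vector).
# In the paper these would be word2vec vectors learned from MEDLINE abstracts;
# the names and values here are made up for illustration.
EMBEDDINGS = {
    "disease_a": ("disease", (0.9, 0.1)),
    "disease_b": ("disease", (0.1, 0.9)),
    "disease_c": ("disease", (5.0, 5.0)),
    "drug_x": ("drug", (1.0, 0.2)),
    "drug_y": ("drug", (0.2, 1.0)),
    "drug_z": ("drug", (0.0, 0.8)),
}


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(vectors, k):
    """Naive agglomerative clustering with Ward's criterion, stopped at k clusters."""
    clusters = [[name] for name in vectors]
    while len(clusters) > k:
        # Pick the pair of clusters whose merge increases the variance least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [vectors[n] for n in clusters[ij[0]]],
                [vectors[n] for n in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def candidate_pairs(embeddings, k):
    """Disease-drug pairs that share a heterogeneous cluster."""
    vectors = {name: vec for name, (_, vec) in embeddings.items()}
    pairs = []
    for cluster in ward_clusters(vectors, k):
        diseases = sorted(n for n in cluster if embeddings[n][0] == "disease")
        drugs = sorted(n for n in cluster if embeddings[n][0] == "drug")
        # Clusters holding only diseases or only drugs contribute no pairs.
        pairs.extend((d, g) for d in diseases for g in drugs)
    return sorted(pairs)


if __name__ == "__main__":
    for disease, drug in candidate_pairs(EMBEDDINGS, k=3):
        print(disease, drug)
```

The sketch does not separate already-known disease-drug relationships from new ones, and it omits the evaluation measure that compares clusters against molecular-level results. The abstract does not specify either step in enough detail to reproduce here.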

URI: https://aurora.ajou.ac.kr/handle/2018.oak/36790
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85127589291&origin=inward

Journal URL: http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=9736461

Funding: The authors gratefully acknowledge support from a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A2C2003474), the BK21 FOUR program of the National Research Foundation of Korea funded by the Ministry of Education (NRF5199991014091), and the Ajou University research fund.
