Ajou University repository

Information-theoretic and graph-based approaches for biomarker discovery
  • 왕세희

DC Field | Value | Language
dc.contributor.advisor | Kyung-Ah Sohn | -
dc.contributor.author | 왕세희 | -
dc.date.issued | 2024-08 | -
dc.identifier.other | 33988 | -
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/39194 | -
dc.description | Thesis (Ph.D.)--Department of Artificial Intelligence, 2024. 8 | -
dc.description.abstract | Biomarkers are characteristics that indicate normal biological processes, pathogenic processes, and pharmacological responses, which makes biomarker development crucial in medicine and the life sciences. Various artificial intelligence models have recently been developed to identify potential biomarkers, and the need for more accurate and reliable methodologies is growing. Because the biological characteristics of the human body result from complex interactions among multiple features, it is important to employ methodologies that reflect these interactions effectively. This thesis therefore proposes biomarker discovery methods that use information-theoretic analysis and graph analysis to reflect interactions between features. The first study develops a stable feature scoring method by replacing the formula for mutual information, an information-theoretic measure of relevance, with a reconstruction-error-based proxy. This approach allows faster and more reliable computation of correlations between features and diseases, which aids biomarker discovery. The second study proposes methods for creating and analyzing correlation graphs, using information-theoretic measures to generate and interpret meaningful graphs. The proposed methods were validated through comparative experiments in various settings. The experiments in the first study showed that the selected features could be potential biomarker candidates, and those in the second study showed that the generated networks could be useful for biomarker exploration. Further experiments propose and analyze feature selection methods that consider graph structures among samples. | -
dc.description.tableofcontents |
  1. Introduction
    1.1 Biomarker discovery
    1.2 Information-theoretic based approach
    1.3 Graph-based methods in biomarker discovery
    1.4 Overview
  2. Feature scoring methods using information-theoretic approaches
    2.1 Introduction
    2.2 Feature scoring using reconstruction error as a proxy for mutual information
      2.2.1 Transforming mutual information into a reconstruction error-based concept
      2.2.2 Reconstruction error-based feature scoring
    2.3 Improvements through clustering and limiting bottleneck layer information
      2.3.1 Simplifying bottleneck layer selection
      2.3.2 Advanced feature selection via feature-wise clustering
    2.4 Experiments
      2.4.1 Performance validation for benchmark datasets
      2.4.2 Computational cost validation
      2.4.3 Performance evaluation across varying feature and sample sizes
      2.4.4 Functional enrichment analysis
    2.5 Results
      2.5.1 Performance validation results for benchmark datasets
      2.5.2 Results of computational cost validation
      2.5.3 Results of performance evaluation across varying feature and sample sizes
      2.5.4 Results of functional enrichment analysis: TCGA
      2.5.5 Results of functional enrichment analysis: ARCHS4
    2.6 Discussion
      2.6.1 Discussion: results of TCGA dataset
      2.6.2 Discussion: results of ARCHS4 dataset
      2.6.3 Conclusion
  3. Information theoretic graph-based methods in biomarker discovery
    3.1 Introduction
    3.2 Design of a fast algorithm to generate SNP networks
      3.2.1 Calculation of mutual information through reduction of large-size contingency table
      3.2.2 Experiments: Simulation Result
    3.3 Biomarker discovery through conversion from SNP to gene network
      3.3.1 Construction of SNP epistasis networks using information-theoretic measures
      3.3.2 Gene-gene interaction network construction from SNP epistasis network
      3.3.3 Extraction of a statistically significant interaction network
      3.3.4 Validation through prior knowledge databases
      3.3.5 Graph refinement and validation using network topology
      3.3.6 Functional enrichment analysis
    3.4 Discussion
  4. Unsupervised feature scoring using reconstruction errors in low-dimensional GNN embeddings
    4.1 Unsupervised feature scoring that reflects graph characteristics
    4.2 Performance validation in benchmark datasets
      4.2.1 Comparative experiments with unsupervised feature selection methods
      4.2.2 Comparative experiments with the feature importance of a supervised explainable method
    4.3 Discussion
  5. Conclusion
  References
| -
dc.language.iso | eng | -
dc.publisher | The Graduate School, Ajou University | -
dc.rights | Ajou University theses are protected by copyright. | -
dc.title | Information-theoretic and graph-based approaches for biomarker discovery | -
dc.type | Thesis | -
dc.contributor.affiliation | Ajou University Graduate School | -
dc.contributor.alternativeName | Sehee Wang | -
dc.contributor.department | Graduate School, Department of Artificial Intelligence | -
dc.date.awarded | 2024-08 | -
dc.description.degree | Doctor | -
dc.identifier.url | https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000033988 | -
dc.subject.keyword | biomarker | -
dc.subject.keyword | feature scoring | -
dc.subject.keyword | graph neural network | -
dc.subject.keyword | mutual information | -
dc.subject.keyword | single nucleotide polymorphisms | -
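
For readers unfamiliar with the relevance measure named in the abstract, the following is a minimal sketch of mutual-information feature scoring on discrete data such as SNP genotypes. It uses the standard plug-in estimate I(X; Y) = Σ p(x, y) log2[p(x, y) / (p(x) p(y))] and is not the reconstruction-error proxy developed in the thesis. The function names (`mutual_information`, `score_features`) and the toy columns (`snp_a`, `snp_b`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def score_features(columns, labels):
    """Score each feature column against the labels; highest score first."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Toy data (illustrative only): genotypes coded 0/1/2, binary disease label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "snp_a": [0, 0, 0, 0, 2, 2, 2, 2],  # perfectly tracks the label
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
ranking = score_features(columns, labels)
print(ranking)
```

In this toy case `snp_a` carries one full bit about the label and `snp_b` carries none, so `snp_a` ranks first. On real data the plug-in estimate is biased upward for small samples, which is one reason alternative or more stable scoring schemes are of interest.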

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

File Download

  • There are no files associated with this item.