Ajou University repository

Efficient imputation of missing data using the information of local space defined by the geometric one-class classifier
Citations (SCOPUS): 3


DC Field | Value | Language
dc.contributor.author | Kim, Do Gyun | -
dc.contributor.author | Choi, Jin Young | -
dc.date.issued | 2024-05-15 | -
dc.identifier.issn | 0957-4174 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/33835 | -
dc.description.abstract | Datasets gathered from actual systems may include missing data owing to unintentional faults, such as equipment breakdown, as well as intentional causes, such as sampling inspection. Because missing data can produce incorrect and distorted results when analyzed, they should be addressed before the analysis is performed. Imputation replaces missing entries with values calculated from observed features, which is a more reasonable alternative than simple methods such as complete case analysis. Although various imputation methods exist, most ignore the local space around the missing data, which may be closely related to the missing values. Furthermore, imputation methods that can partially reflect local relationships are susceptible to overfitting and have parameter tuning issues owing to the lack of a systematic definition of the local space. Thus, we propose a composite fuzzy hyper-rectangle (H-RTGL) imputation (CFHRI) method with the following characteristics: (i) it defines the local space using an H-RTGL-based one-class classifier to thoroughly describe the data of the target class, and (ii) it imputes the missing entries using a fuzzy model comprising imputation models calculated from H-RTGLs. These features enable CFHRI to formulate the local space adjacent to missing data systematically and to alleviate the hazard of overfitting to a certain region of the dataset. We validated our method through numerical experiments conducted using a dataset gathered from an actual system and by comparing its imputation performance with that of other imputation methods. CFHRI showed a statistically significant improvement on 5 of the 7 datasets used, with an improvement of around 10% in mean absolute error (MAE). Moreover, the classification accuracy on the imputed datasets increased by 3–5%, which indicates that CFHRI can be a useful preprocessor for datasets intended for classification. | -
dc.description.sponsorship | This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2B4009841). | -
dc.language.iso | eng | -
dc.publisher | Elsevier Ltd | -
dc.subject.mesh | Actual system | -
dc.subject.mesh | Composite fuzzy model | -
dc.subject.mesh | Fuzzy modeling | -
dc.subject.mesh | Hyperrectangles | -
dc.subject.mesh | Imputation | -
dc.subject.mesh | Imputation methods | -
dc.subject.mesh | Local spaces | -
dc.subject.mesh | Missing data | -
dc.subject.mesh | One-class classifier | -
dc.subject.mesh | Overfitting | -
dc.title | Efficient imputation of missing data using the information of local space defined by the geometric one-class classifier | -
dc.type | Article | -
dc.citation.title | Expert Systems with Applications | -
dc.citation.volume | 242 | -
dc.identifier.bibliographicCitation | Expert Systems with Applications, Vol.242 | -
dc.identifier.doi | 10.1016/j.eswa.2023.122775 | -
dc.identifier.scopusid | 2-s2.0-85179000711 | -
dc.identifier.url | https://www.sciencedirect.com/science/journal/09574174 | -
dc.subject.keyword | Composite fuzzy model | -
dc.subject.keyword | Hyper-rectangle | -
dc.subject.keyword | Imputation | -
dc.subject.keyword | Local space | -
dc.subject.keyword | Missing data | -
dc.subject.keyword | One-class classifier | -
dc.description.isoa | false | -
dc.subject.subarea | Engineering (all) | -
dc.subject.subarea | Computer Science Applications | -
dc.subject.subarea | Artificial Intelligence | -
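
The abstract in this record describes CFHRI only at a high level, so the following is a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes that the local spaces are axis-aligned hyper-rectangles fitted to groups of complete rows supplied up front (the paper derives them with a one-class classifier), that each rectangle's imputation model is simply its per-feature mean, and that fuzzy membership decays as 1 / (1 + distance to the rectangle). All names (`HyperRectangle`, `fit_rectangle`, `membership`, `impute`, `mean_absolute_error`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class HyperRectangle:
    lower: List[float]  # per-feature lower bound of the local space
    upper: List[float]  # per-feature upper bound of the local space
    fill: List[float]   # per-feature imputation value (assumed: member mean)


def fit_rectangle(points: Sequence[Sequence[float]]) -> HyperRectangle:
    """Bound a group of complete rows with an axis-aligned box."""
    dims = range(len(points[0]))
    lower = [min(p[d] for p in points) for d in dims]
    upper = [max(p[d] for p in points) for d in dims]
    fill = [sum(p[d] for p in points) / len(points) for d in dims]
    return HyperRectangle(lower, upper, fill)


def membership(rect: HyperRectangle, row: Sequence[Optional[float]]) -> float:
    """Fuzzy membership in (0, 1], using observed features only.

    Equals 1 when every observed value lies inside the box and decays
    with the summed per-feature distance to the box otherwise (assumed form).
    """
    dist = 0.0
    for d, value in enumerate(row):
        if value is None:  # missing entries do not affect membership
            continue
        if value < rect.lower[d]:
            dist += rect.lower[d] - value
        elif value > rect.upper[d]:
            dist += value - rect.upper[d]
    return 1.0 / (1.0 + dist)


def impute(row: Sequence[Optional[float]],
           rects: Sequence[HyperRectangle]) -> List[float]:
    """Fill missing entries with a membership-weighted mix of local models."""
    weights = [membership(r, row) for r in rects]
    total = sum(weights)
    filled = []
    for d, value in enumerate(row):
        if value is None:
            blended = sum(w * r.fill[d] for w, r in zip(weights, rects)) / total
            filled.append(blended)
        else:
            filled.append(value)
    return filled


def mean_absolute_error(true: Sequence[float], pred: Sequence[float]) -> float:
    """MAE, the imputation metric named in the abstract."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


# Two toy local spaces and one row whose second feature is missing (None).
rect_a = fit_rectangle([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
rect_b = fit_rectangle([(10.0, 10.0), (11.0, 11.0), (12.0, 12.0)])
completed = impute([1.0, None], [rect_a, rect_b])
```

Because the observed feature of the toy row lies inside the first rectangle, that rectangle dominates the blend and the imputed value stays close to its mean. A sharper membership function would make the imputation more local, at the cost of one more parameter to tune.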


Related Researcher

Choi, Jin Young (최진영)
Department of Industrial Engineering
