DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Do Gyun | - |
dc.contributor.author | Choi, Jin Young | - |
dc.date.issued | 2024-05-15 | - |
dc.identifier.issn | 0957-4174 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/33835 | - |
dc.description.abstract | Datasets gathered from actual systems may include missing data owing to unintentional faults, such as the breakdown of equipment, as well as intentional reasons, such as sampling inspection. Because missing data can lead to incorrect and distorted results when analyzed, they should be addressed before the analysis is performed. Imputation of missing data involves replacing missing entries with values calculated from observed features, which is a more reasonable alternative than simple methods such as complete case analysis. Although various imputation methods exist for missing data, most ignore the local space around the missing entries, which may be closely related to the missing values. Furthermore, imputation methods that can partially reflect local relationships are susceptible to overfitting and have parameter tuning issues owing to the lack of a systematic definition of the local space. Thus, we propose a composite fuzzy hyper-rectangle (H-RTGL) imputation (CFHRI) method with the following characteristics: (i) it defines the local space using an H-RTGL-based one-class classifier to thoroughly describe the data of the target class, and (ii) it imputes the missing entries using a fuzzy model comprising imputation models calculated from H-RTGLs. These features enable CFHRI to systematically formulate the local space adjacent to missing data and to alleviate the risk of overfitting to a certain region of the dataset. We validated our method through numerical experiments conducted using a dataset gathered from an actual system and by comparing the imputation performance of our method with that of other imputation methods. CFHRI showed a statistically significant improvement on 5 of the 7 datasets used, with an improvement of around 10% in terms of Mean Absolute Error (MAE). Moreover, the classification accuracy on the imputed datasets increased by 3–5%, which indicates that CFHRI can be a useful pre-processor for datasets intended for classification. | - |
dc.description.sponsorship | This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2B4009841). | - |
dc.language.iso | eng | - |
dc.publisher | Elsevier Ltd | - |
dc.subject.mesh | Actual system | - |
dc.subject.mesh | Composite fuzzy model | - |
dc.subject.mesh | Fuzzy modeling | - |
dc.subject.mesh | Hyperrectangles | - |
dc.subject.mesh | Imputation | - |
dc.subject.mesh | Imputation methods | - |
dc.subject.mesh | Local spaces | - |
dc.subject.mesh | Missing data | - |
dc.subject.mesh | One-class classifier | - |
dc.subject.mesh | Overfitting | - |
dc.title | Efficient imputation of missing data using the information of local space defined by the geometric one-class classifier | - |
dc.type | Article | - |
dc.citation.title | Expert Systems with Applications | - |
dc.citation.volume | 242 | - |
dc.identifier.bibliographicCitation | Expert Systems with Applications, Vol.242 | - |
dc.identifier.doi | 10.1016/j.eswa.2023.122775 | - |
dc.identifier.scopusid | 2-s2.0-85179000711 | - |
dc.identifier.url | https://www.sciencedirect.com/science/journal/09574174 | - |
dc.subject.keyword | Composite fuzzy model | - |
dc.subject.keyword | Hyper-rectangle | - |
dc.subject.keyword | Imputation | - |
dc.subject.keyword | Local space | - |
dc.subject.keyword | Missing data | - |
dc.subject.keyword | One-class classifier | - |
dc.description.isoa | false | - |
dc.subject.subarea | Engineering (all) | - |
dc.subject.subarea | Computer Science Applications | - |
dc.subject.subarea | Artificial Intelligence | - |