Ajou University repository

Are We Training with The Right Data? Evaluating Collective Confidence in Training Data using Dempster Shafer Theory

DC Field | Value | Language
dc.contributor.author | Dey, Sangeeta | -
dc.contributor.author | Lee, Seok Won | -
dc.date.issued | 2022-01-01 | -
dc.identifier.issn | 0270-5257 | -
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/36811 | -
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85132965104&origin=inward | -
dc.description.abstract | The latest trend of incorporating various data-centric machine learning (ML) models in software-intensive systems has posed new challenges in the quality assurance practice of software engineering, especially in high-risk environments. ML experts are now focusing on explaining ML models to assure the safe behavior of ML-based systems. However, not enough attention has been paid to explaining the inherent uncertainty of the training data. The current practice of ML-based system engineering lacks transparency in the systematic fitness assessment of the training data before rigorous ML model training begins. We propose a method of assessing the collective confidence in the quality of a training dataset by using Dempster-Shafer theory and its modified combination rule (Yager's rule). With the example of training datasets for pedestrian detection in autonomous vehicles, we demonstrate how stakeholders with diverse expertise can use the proposed approach to combine their beliefs in the quality arguments and evidence about the data. Our results open up a scope of future research on data requirements engineering that can facilitate evidence-based data assurance for ML-based safety-critical systems. | -
dc.description.sponsorship | This work was supported by the BK21 FOUR program of the National Research Foundation (NRF) of Korea funded by the Ministry of Education (NRF5199991014091) and the Basic Science Research Program through the NRF funded by the Ministry of Science and ICT (NRF-2020R1F1A1075605). | -
dc.language.iso | eng | -
dc.publisher | IEEE Computer Society | -
dc.subject.mesh | Data centric | -
dc.subject.mesh | Data uncertainty | -
dc.subject.mesh | Dempster-Shafer theory | -
dc.subject.mesh | High risk environment | -
dc.subject.mesh | Machine learning models | -
dc.subject.mesh | Machine-learning | -
dc.subject.mesh | Quality assurance practices | -
dc.subject.mesh | Software intensive systems | -
dc.subject.mesh | Training data | -
dc.subject.mesh | Training dataset | -
dc.title | Are We Training with The Right Data? Evaluating Collective Confidence in Training Data using Dempster Shafer Theory | -
dc.type | Conference | -
dc.citation.conferenceDate | 2022.5.22. ~ 2022.5.27. | -
dc.citation.conferenceName | 44th ACM/IEEE International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER 2022 | -
dc.citation.edition | Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER 2022 | -
dc.citation.endPage | 15 | -
dc.citation.startPage | 11 | -
dc.citation.title | Proceedings - International Conference on Software Engineering | -
dc.identifier.bibliographicCitation | Proceedings - International Conference on Software Engineering, pp.11-15 | -
dc.identifier.doi | 10.1109/icse-nier55298.2022.9793521 | -
dc.identifier.scopusid | 2-s2.0-85132965104 | -
dc.subject.keyword | data uncertainty | -
dc.subject.keyword | Dempster Shafer theory | -
dc.subject.keyword | machine learning | -
dc.subject.keyword | safety | -
dc.type.other | Conference Paper | -
dc.description.isoa | false | -
dc.subject.subarea | Software | -
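
The abstract describes combining stakeholders' beliefs about training-data quality with Dempster-Shafer theory and Yager's rule. The following is a minimal sketch of that kind of combination, not the authors' implementation. The two-element frame (`adequate` / `inadequate`), the stakeholder names, and all mass values are hypothetical and do not come from the paper. Yager's rule differs from Dempster's rule in that conflicting mass is assigned to the whole frame (ignorance) rather than normalized away.

```python
from itertools import product

def combine(m1, m2, frame, rule="yager"):
    """Combine two mass functions whose keys are frozenset subsets of `frame`."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            # The two focal elements contradict each other.
            conflict += x * y
    if rule == "yager":
        # Yager: move the conflicting mass to the whole frame (ignorance).
        combined[frame] = combined.get(frame, 0.0) + conflict
    elif rule == "dempster":
        # Dempster: discard the conflict and renormalize the rest.
        if conflict >= 1.0:
            raise ValueError("total conflict: Dempster's rule is undefined")
        combined = {k: v / (1.0 - conflict) for k, v in combined.items()}
    else:
        raise ValueError(f"unknown rule: {rule}")
    return combined

def belief(m, hypothesis):
    """Sum of the masses of all focal elements contained in `hypothesis`."""
    return sum(v for k, v in m.items() if k <= hypothesis)

def plausibility(m, hypothesis):
    """Sum of the masses of all focal elements that intersect `hypothesis`."""
    return sum(v for k, v in m.items() if k & hypothesis)

# Hypothetical frame of discernment for one quality argument about a dataset.
GOOD = frozenset({"adequate"})
BAD = frozenset({"inadequate"})
FRAME = GOOD | BAD

# Hypothetical masses from two stakeholders; mass on FRAME expresses "don't know".
m_engineer = {GOOD: 0.6, BAD: 0.1, FRAME: 0.3}
m_safety = {GOOD: 0.5, BAD: 0.3, FRAME: 0.2}

yager = combine(m_engineer, m_safety, FRAME, rule="yager")
dempster = combine(m_engineer, m_safety, FRAME, rule="dempster")

print("Yager belief in adequate:", round(belief(yager, GOOD), 4))
print("Yager plausibility of adequate:", round(plausibility(yager, GOOD), 4))
print("Dempster belief in adequate:", round(belief(dempster, GOOD), 4))
```

In this sketch the interval between belief and plausibility widens under Yager's rule when the stakeholders disagree, which keeps the disagreement visible instead of hiding it in a normalization step. Note that Yager's rule is not associative in general, so combining more than two sources pairwise may depend on the order.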

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Lee, Seok-Won (이석원)
Department of Software and Computer Engineering