Ajou University repository

SwCS: Section-wise content similarity approach to exploit scientific big data
  • Irshad, Kashif ;
  • Afzal, Muhammad Tanvir ;
  • Rizvi, Sanam Shahla ;
  • Shahid, Abdul ;
  • Riaz, Rabia ;
  • Chung, Tae Sun
Citations (SCOPUS): 4


DC Field | Value | Language
dc.contributor.author | Irshad, Kashif | -
dc.contributor.author | Afzal, Muhammad Tanvir | -
dc.contributor.author | Rizvi, Sanam Shahla | -
dc.contributor.author | Shahid, Abdul | -
dc.contributor.author | Riaz, Rabia | -
dc.contributor.author | Chung, Tae Sun | -
dc.date.issued | 2021-01-01 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/31777 | -
dc.description.abstract | The growing collection of scientific data in various web repositories is referred to as Scientific Big Data, as it fulfills the four “V’s” of Big Data: volume, variety, velocity, and veracity. This phenomenon has created new opportunities for startups; for instance, the extraction of pertinent research papers from enormous knowledge repositories using certain innovative methods has become an important task for researchers and entrepreneurs. Traditionally, the content of the papers is compared to list the relevant papers from a repository. The conventional method results in a long list of papers that is often impossible to interpret productively. Therefore, a novel approach that intelligently utilizes the available data is urgently needed. Moreover, the primary element of the scientific knowledge base is a research article, which consists of various logical sections such as the Abstract, Introduction, Related Work, Methodology, Results, and Conclusion. Thus, this study utilizes these logical sections of research articles, because they hold significant potential in finding relevant papers. In this study, comprehensive experiments were performed to determine the role of the logical sections-based terms indexing method in improving the quality of results (i.e., retrieving relevant papers). To address this research objective, we proposed, implemented, and evaluated the logical sections-based content comparison method against a standard method of indexing terms. The section-based approach outperformed the standard content-based approach in identifying relevant documents from all classified topics of computer science. Overall, the proposed approach extracted 14% more relevant results from the entire dataset. Because the experimental results suggest that employing a finer content similarity technique improves the quality of results, the proposed approach lays a foundation for knowledge-based startups. | -
dc.description.sponsorship | Funding Statement: This work was partially supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2020-0-01592) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1F1A1058548). | -
dc.language.iso | eng | -
dc.publisher | Tech Science Press | -
dc.subject.mesh | Content similarity | -
dc.subject.mesh | Content-based approach | -
dc.subject.mesh | Conventional methods | -
dc.subject.mesh | Knowledge repository | -
dc.subject.mesh | Quality of results | -
dc.subject.mesh | Relevant documents | -
dc.subject.mesh | Research objectives | -
dc.subject.mesh | Scientific knowledge base | -
dc.title | SwCS: Section-wise content similarity approach to exploit scientific big data | -
dc.type | Article | -
dc.citation.endPage | 894 | -
dc.citation.startPage | 877 | -
dc.citation.title | Computers, Materials and Continua | -
dc.citation.volume | 67 | -
dc.identifier.bibliographicCitation | Computers, Materials and Continua, Vol.67, pp.877-894 | -
dc.identifier.doi | 10.32604/cmc.2021.014156 | -
dc.identifier.scopusid | 2-s2.0-85099364530 | -
dc.identifier.url | https://www.techscience.com/cmc/v67n1/41183 | -
dc.subject.keyword | ACM classification | -
dc.subject.keyword | Content similarity | -
dc.subject.keyword | Cosine similarity | -
dc.subject.keyword | Scientific big data | -
dc.subject.keyword | Term indexing | -
dc.description.isoa | true | -
dc.subject.subarea | Biomaterials | -
dc.subject.subarea | Modeling and Simulation | -
dc.subject.subarea | Mechanics of Materials | -
dc.subject.subarea | Computer Science Applications | -
dc.subject.subarea | Electrical and Electronic Engineering | -
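
Illustrative note: the abstract contrasts comparing whole papers with comparing their logical sections, and the keywords mention cosine similarity and term indexing. The Python sketch below shows one way that contrast could look. It is not the authors' implementation: the raw term-frequency indexing, the regex tokenizer, the function names, and the two toy papers are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Index a text as lowercase word counts (raw term frequency; an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def full_text_similarity(paper_a, paper_b):
    """Baseline: compare the concatenated content of both papers."""
    text_a = " ".join(paper_a.values())
    text_b = " ".join(paper_b.values())
    return cosine(term_vector(text_a), term_vector(text_b))


def section_wise_similarity(paper_a, paper_b):
    """Compare only matching sections and return one score per shared section."""
    shared = paper_a.keys() & paper_b.keys()
    return {
        name: cosine(term_vector(paper_a[name]), term_vector(paper_b[name]))
        for name in sorted(shared)
    }


# Hypothetical toy papers, each a mapping of section name to section text.
paper_a = {
    "abstract": "section wise similarity for papers",
    "methodology": "cosine similarity over indexed terms",
}
paper_b = {
    "abstract": "section wise similarity for papers",
    "methodology": "graph neural networks",
}

scores = section_wise_similarity(paper_a, paper_b)
baseline = full_text_similarity(paper_a, paper_b)
```

In this toy case the per-section scores separate a matching Abstract from an unrelated Methodology, whereas the baseline folds both into a single number. How the paper actually weights or combines section scores is not stated in this record.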
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chung, Tae-Sun (정태선)
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.