Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Taegyoung | - |
dc.contributor.author | Chung, Tae Sun | - |
dc.contributor.author | Kim, Jongik | - |
dc.date.issued | 2020-01-01 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/31359 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85086454823&origin=inward | - |
dc.description.abstract | In this paper, we study the problem of string similarity search to retrieve in a database all strings similar to a query string within a given threshold. To measure the similarity between strings, we use edit distance. Many algorithms have been proposed under a filtering-and-verification framework to solve the problem. To reduce the overhead of edit distance verification, it is crucial to efficiently generate a small number of candidates in the filtering phase. Recently, an index structure named HSTree has been proposed for efficiently generating candidate strings. To generate candidates, they select and utilize HSTree nodes at a specific level calculated from a given threshold. In this paper, we observe that there are many alternative ways to select HSTree nodes, and propose a novel technique that selects HSTree nodes in an optimized way based on the observation. We also propose a modified HSTree, named a threaded HSTree, which connects inverted lists of an HSTree node to inverted lists of its child nodes. With a threaded HSTree, we can reduce the overhead of index lookups in HSTree nodes while selecting optimal tree nodes. Experimental results show that the proposed technique significantly outperforms the existing technique using the HSTree. | - |
dc.description.sponsorship | This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean Government (Ministry of Science and ICT) under Grant 2019R1F1A1059795 and Grant 2019R1F1A1058548. | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.subject.mesh | Edit distance | - |
dc.subject.mesh | Index structure | - |
dc.subject.mesh | Inverted list | - |
dc.subject.mesh | Novel techniques | - |
dc.subject.mesh | Query string | - |
dc.subject.mesh | Signature selections | - |
dc.subject.mesh | String similarity | - |
dc.subject.mesh | Verification framework | - |
dc.title | Optimized Signature Selection for Efficient String Similarity Search | - |
dc.type | Article | - |
dc.citation.endPage | 98204 | - |
dc.citation.startPage | 98193 | - |
dc.citation.title | IEEE Access | - |
dc.citation.volume | 8 | - |
dc.identifier.bibliographicCitation | IEEE Access, Vol.8, pp.98193-98204 | - |
dc.identifier.doi | 2-s2.0-85086454823 | - |
dc.identifier.scopusid | 2-s2.0-85086454823 | - |
dc.identifier.url | http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639 | - |
dc.subject.keyword | Edit distance | - |
dc.subject.keyword | Hierarchical tree index | - |
dc.subject.keyword | Optimized signature selection | - |
dc.subject.keyword | Partition signature scheme | - |
dc.subject.keyword | String similarity search | - |
dc.type.other | Article | - |
dc.description.isoa | true | - |
dc.subject.subarea | Computer Science (all) | - |
dc.subject.subarea | Materials Science (all) | - |
dc.subject.subarea | Engineering (all) | - |
dc.subject.subarea | Electrical and Electronic Engineering | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.