DC Field | Value | Language |
---|---|---|
dc.contributor.author | Park, Kiejin | - |
dc.contributor.author | Peng, Limei | - |
dc.date.issued | 2018-01-01 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/dev/handle/2018.oak/30114 | - |
dc.description.abstract | Social data such as users' comments are unstructured in nature, and current technologies for analyzing such data are constrained by available storage space and processing time when fast storage and processing are required. Moreover, it is difficult to analyze user features at high speed from a huge amount of dynamically generated social data. To solve this problem, we design and implement a topic association analysis system based on the latent Dirichlet allocation (LDA) model. LDA does not require a training process and can therefore analyze social users' hourly interests in different topics easily. The proposed system is built on the Spark framework, which runs on top of a Hadoop cluster. It achieves high-speed processing because hard disk access is minimized and all intermediate data are processed in main memory. In the performance evaluation, the system required about 5 hours to analyze the topics in about 1 TB of test social data (SNS comments). Moreover, by analyzing the associations among topics, we can track the hourly change in social users' interests in different topics. | - |
dc.description.sponsorship | This paper is partially supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (No. 2015R1C1A1A02036536) and partially supported by Ajou University Research Fund. | - |
dc.language.iso | eng | - |
dc.publisher | Korea Information Processing Society | - |
dc.subject.mesh | Association analysis | - |
dc.subject.mesh | Design and implements | - |
dc.subject.mesh | Hadoop | - |
dc.subject.mesh | High-speed processing | - |
dc.subject.mesh | Latent dirichlet allocations | - |
dc.subject.mesh | LDA (latent Dirichlet allocation) | - |
dc.subject.mesh | Performance evaluations | - |
dc.subject.mesh | Topic Modeling | - |
dc.title | A development of LDA topic association systems based on Spark-Hadoop framework | - |
dc.type | Article | - |
dc.citation.endPage | 149 | - |
dc.citation.startPage | 140 | - |
dc.citation.title | Journal of Information Processing Systems | - |
dc.citation.volume | 14 | - |
dc.identifier.bibliographicCitation | Journal of Information Processing Systems, Vol.14, pp.140-149 | - |
dc.identifier.doi | 10.3745/jips.04.0057 | - |
dc.identifier.scopusid | 2-s2.0-85042754740 | - |
dc.identifier.url | http://www.jips-k.org/file/down?pn=532 | - |
dc.subject.keyword | Association analysis | - |
dc.subject.keyword | Hadoop | - |
dc.subject.keyword | LDA (latent Dirichlet allocation) | - |
dc.subject.keyword | Spark | - |
dc.subject.keyword | Topic model | - |
dc.subject.subarea | Software | - |
dc.subject.subarea | Information Systems | - |