This article offers an empirical exploration of the efficient use of word-level convolutional neural networks (word-CNNs) for large-scale text classification. Word-CNNs are generally difficult to train on large-scale datasets because the word-embedding matrix grows dramatically with the size of the vocabulary. To address this issue, this paper presents a de-noising approach to word embedding. We compare our model with several recently proposed CNN models on publicly available datasets. The experimental results show that the proposed method improves the usefulness of word-CNNs and increases text classification accuracy.
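For context, the word-CNN pipeline referred to above follows the standard pattern of embedding lookup, convolution over windows of adjacent word vectors, and max-over-time pooling to produce a fixed-length feature vector for a classifier. The sketch below is a toy, dependency-free illustration of that generic pipeline only; it is not the de-noising method proposed in this paper, and all vocabulary entries, dimensions, and weights are hypothetical.

```python
import random

random.seed(0)

# Toy vocabulary and randomly initialized word embeddings (illustrative only).
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "terrible": 4}
embed_dim = 4
embeddings = [[random.uniform(-1, 1) for _ in range(embed_dim)]
              for _ in range(len(vocab))]

def word_cnn_features(tokens, filters, window=2):
    """Embedding lookup -> 1D convolution over word windows -> max pooling."""
    embedded = [embeddings[vocab[t]] for t in tokens]
    pooled = []
    for f in filters:  # each filter has window * embed_dim weights
        activations = []
        for i in range(len(embedded) - window + 1):
            # Flatten the window of word vectors and dot it with the filter.
            patch = [v for row in embedded[i:i + window] for v in row]
            activations.append(sum(p * w for p, w in zip(patch, f)))
        # Max-over-time pooling: keep the strongest response per filter.
        pooled.append(max(activations))
    return pooled

# Three random filters spanning two words each (illustrative only).
filters = [[random.uniform(-1, 1) for _ in range(2 * embed_dim)]
           for _ in range(3)]
feats = word_cnn_features(["the", "movie", "was", "great"], filters)
print(len(feats))  # one pooled feature per filter
```

The pooled feature vector would then feed a small fully connected layer for classification. Note that the embedding table, not shown at realistic scale here, is the component whose size explodes with vocabulary growth, which is the training difficulty the paper targets.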
This research was supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the National Program for Excellence in SW, supervised by the IITP (Institute for Information & communications Technology Promotion).