Ajou University repository

Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation
  • Lee, Jae-woong
  • Choi, Minjin
  • Sael, Lee
  • Shim, Hyunjung
  • Lee, Jongwuk
Citations (SCOPUS)
3

Publication Year
2022-05-01
Publisher
Springer Science and Business Media Deutschland GmbH
Citation
Knowledge and Information Systems, Vol.64, pp.1323-1348
Keyword
Collaborative filtering; Data ambiguity; Data sparsity; Knowledge distillation; Top-N recommendation
Mesh Keyword
Classification tasks; Data ambiguities; Data sparsity; Feedback problems; Hit rate; Knowledge distillation; Ranking problems; Student Modeling; Teacher models; Top-N recommendation
All Science Classification Codes (ASJC)
Software; Information Systems; Human-Computer Interaction; Hardware and Architecture; Artificial Intelligence
Abstract
Knowledge distillation (KD) is a successful method for transferring knowledge from one model (the teacher) to another (the student). Despite its success in classification tasks, applying KD to recommender models is challenging because of the sparsity of positive feedback, the ambiguity of missing feedback, and the ranking-oriented nature of top-N recommendation. In this paper, we propose a new KD model for collaborative filtering, named collaborative distillation (CD). Specifically, (1) we reformulate the loss function to handle the ambiguity of missing feedback, (2) we exploit probabilistic rank-aware sampling for top-N recommendation, and (3) to train the student model effectively, we develop two training strategies, called teacher-guided and student-guided training, which adaptively select the most beneficial feedback from the teacher model. Furthermore, we extend our model with self-distillation, called born-again CD (BACD): a teacher and a student with the same model capacity are trained using the proposed distillation method. The experimental results demonstrate that CD outperforms the state-of-the-art method by 2.7–33.2% in hit rate (HR) and 2.7–29.9% in normalized discounted cumulative gain (NDCG). Moreover, BACD improves over the teacher model by 3.5–12.0% in HR and 4.9–13.3% in NDCG.
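The paper contains the full formulation, but the following minimal Python/NumPy sketch illustrates two of the ideas the abstract names: probabilistic rank-aware sampling (unobserved items are drawn with probability that decays with their rank under the teacher) and a soft-target loss that treats missing feedback as ambiguous rather than as a hard negative. Everything here is an assumption for illustration: the function names, the exponential decay over ranks, and the sigmoid/binary cross-entropy loss are not the paper's exact definitions.

```python
import numpy as np

def rank_aware_sample(teacher_scores, k, temperature=10.0, rng=None):
    """Draw k candidate items, favoring those the teacher ranks highly.

    The sampling probability decays exponentially with an item's rank in
    the teacher's scores, so the student is mostly trained on items the
    teacher considers relevant (illustrative distribution, not the paper's).
    """
    rng = rng if rng is not None else np.random.default_rng()
    ranks = np.argsort(np.argsort(-teacher_scores))  # rank 0 = top item
    probs = np.exp(-ranks / temperature)
    probs /= probs.sum()
    return rng.choice(len(teacher_scores), size=k, replace=False, p=probs)

def distillation_loss(student_logits, teacher_probs):
    """Binary cross-entropy of student predictions against teacher soft targets.

    Using the teacher's predicted relevance as a soft label is one way to
    handle ambiguous missing feedback: an unobserved item is pushed toward
    the teacher's estimate instead of being treated as "irrelevant".
    """
    s = 1.0 / (1.0 + np.exp(-student_logits))  # student probabilities
    eps = 1e-8
    return -np.mean(teacher_probs * np.log(s + eps)
                    + (1.0 - teacher_probs) * np.log(1.0 - s + eps))

# Toy usage: 1,000 candidate items for one user, distill on 50 samples.
rng = np.random.default_rng(0)
teacher_scores = rng.normal(size=1000)            # stand-in teacher scores
sampled = rank_aware_sample(teacher_scores, k=50, rng=rng)
teacher_probs = 1.0 / (1.0 + np.exp(-teacher_scores[sampled]))
student_logits = np.zeros(50)                     # stand-in student outputs
print(distillation_loss(student_logits, teacher_probs))
```

The abstract's teacher- and student-guided training strategies would sit on top of this sampling step, adaptively deciding which of the sampled feedback the student learns from; that scheduling is omitted from the sketch.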
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/32653
DOI
https://doi.org/10.1007/s10115-022-01667-8
Type
Article
Funding
This work was supported by the National Research Foundation of Korea (NRF) (NRF-2018R1A5A1060031 and NRF-2021R1F1A1063843). It was also supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01821, ICT Creative Consilience Program).

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Lee, Sael (이슬)
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.