SCOPUS Citation Export
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ma, Xiaohan | - |
dc.contributor.author | Jin, Rize | - |
dc.contributor.author | Chung, Tae Sun | - |
dc.date.issued | 2024-01-01 | - |
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/37104 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85195911057&origin=inward | - |
dc.description.abstract | The task of Sign Language Production (SLP) in machine learning involves converting text-based spoken language into corresponding sign language expressions. Sign language conveys meaning through the continuous movement of multiple articulators, including manual and non-manual channels. However, most current Transformer-based SLP models convert these multi-channel sign poses into a unified feature representation, ignoring the inherent structural correlations between channels. This paper introduces a novel approach called MCST-Transformer for skeletal sign language production. It employs multi-channel spatial attention to capture correlations across various channels within each frame, and temporal attention to learn sequential dependencies for each channel over time. Additionally, the paper explores and experiments with multiple fusion techniques to combine the spatial and temporal representations into naturalistic sign sequences. To validate the effectiveness of the proposed MCST-Transformer model and its constituent components, extensive experiments were conducted on two benchmark sign language datasets from diverse cultures. The results demonstrate that this new approach outperforms state-of-the-art models on both datasets. | - |
dc.description.sponsorship | This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development grant (IITP-2024-RS-2023-00255968), the ITRC (Information Technology Research Center) support program (IITP-2021-0-02051) funded by the Korea government (MSIT), and the Foreign Intelligence support program funded by the Shijiazhuang Science and Technology Bureau (Project No. 20240024). | - |
dc.language.iso | eng | - |
dc.publisher | European Language Resources Association (ELRA) | - |
dc.subject.mesh | Language production | - |
dc.subject.mesh | Machine-learning | - |
dc.subject.mesh | Multi channel | - |
dc.subject.mesh | Production models | - |
dc.subject.mesh | Sign language | - |
dc.subject.mesh | Sign language production | - |
dc.subject.mesh | Spatio-temporal | - |
dc.subject.mesh | Spatio-temporal fusions | - |
dc.subject.mesh | Spoken languages | - |
dc.subject.mesh | Transformer | - |
dc.title | Multi-Channel Spatio-Temporal Transformer for Sign Language Production | - |
dc.type | Conference | - |
dc.citation.conferenceDate | 2024.5.20. ~ 2024.5.25. | - |
dc.citation.conferenceName | Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 | - |
dc.citation.edition | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings | - |
dc.citation.endPage | 11712 | - |
dc.citation.startPage | 11699 | - |
dc.citation.title | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings | - |
dc.identifier.bibliographicCitation | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, pp.11699-11712 | - |
dc.identifier.scopusid | 2-s2.0-85195911057 | - |
dc.subject.keyword | Sign Language Production | - |
dc.subject.keyword | Spatio-Temporal Fusion | - |
dc.subject.keyword | Transformer | - |
dc.type.other | Conference Paper | - |
dc.subject.subarea | Theoretical Computer Science | - |
dc.subject.subarea | Computational Theory and Mathematics | - |
dc.subject.subarea | Computer Science Applications | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
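The abstract above describes the core of the MCST-Transformer: spatial attention across articulator channels within each frame, temporal attention across frames for each channel, and a fusion step combining the two. A minimal NumPy sketch of that idea follows. It is illustrative only, not the paper's implementation: the function names (`mcst_block`, `attention`), the tensor layout `(T, C, D)`, the use of identity projections in place of learned query/key/value weights, and the additive fusion (one of several fusion techniques the abstract says the paper explores) are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention over the last two axes
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def mcst_block(x):
    """x: (T, C, D) pose features -- T frames, C articulator channels
    (e.g. manual and non-manual), D feature dims. Returns fused
    spatio-temporal features of the same shape."""
    # spatial attention: within each frame, attend across the C channels
    spatial = attention(x, x, x)                          # (T, C, D)
    # temporal attention: for each channel, attend across the T frames
    xt = np.swapaxes(x, 0, 1)                             # (C, T, D)
    temporal = np.swapaxes(attention(xt, xt, xt), 0, 1)   # (T, C, D)
    # fusion: simple additive fusion of the two representations
    # (hypothetical choice; the paper compares multiple fusion schemes)
    return spatial + temporal
```

In a full model these blocks would be stacked inside a Transformer with learned projections, but the sketch shows why the two attention passes capture complementary structure: the spatial pass sees channel correlations that a unified single-channel representation would discard, while the temporal pass models each channel's sequential dynamics independently.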