Ajou University repository

EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
  • Cho, Deok Hyeon
  • Oh, Hyung Seok
  • Kim, Seung Bin
  • Lee, Sang Hoon
  • Lee, Seong Whan
Citations (SCOPUS): 12

DC Field: Value
dc.contributor.author: Cho, Deok Hyeon
dc.contributor.author: Oh, Hyung Seok
dc.contributor.author: Kim, Seung Bin
dc.contributor.author: Lee, Sang Hoon
dc.contributor.author: Lee, Seong Whan
dc.date.issued: 2024-01-01
dc.identifier.issn: 1990-9772
dc.identifier.uri: https://aurora.ajou.ac.kr/handle/2018.oak/38119
dc.identifier.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85206434807&origin=inward
dc.description.abstract: Despite rapid advances in the field of emotional text-to-speech (TTS), recent studies primarily focus on mimicking the average style of a particular emotion. As a result, the ability to manipulate speech emotion remains constrained to several predefined labels, compromising the ability to reflect the nuanced variations of emotion. In this paper, we propose EmoSphere-TTS, which synthesizes expressive emotional speech by using a spherical emotion vector to control the emotional style and intensity of the synthetic speech. Without any human annotation, we use the arousal, valence, and dominance pseudo-labels to model the complex nature of emotion via a Cartesian-spherical transformation. Furthermore, we propose a dual conditional adversarial network to improve the quality of generated speech by reflecting the multi-aspect characteristics. The experimental results demonstrate the model's ability to control emotional style and intensity with high-quality expressive speech.
dc.description.sponsorship: This work was partly supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00079, Artificial Intelligence Graduate School Program (Korea University), No. 2021-0-02068, Artificial Intelligence Innovation Hub, and AI Technology for Interactive Communication of Language Impaired Individuals).
dc.language.iso: eng
dc.publisher: International Speech Communication Association
dc.subject.mesh: Complex nature
dc.subject.mesh: Emotional speech
dc.subject.mesh: Emotional speech synthesis
dc.subject.mesh: Emotional style and intensity control
dc.subject.mesh: Expressive emotional speech synthesis
dc.subject.mesh: Human annotations
dc.subject.mesh: Intensity models
dc.subject.mesh: Speech emotions
dc.subject.mesh: Synthetic speech
dc.subject.mesh: Text to speech
dc.title: EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
dc.type: Conference
dc.citation.conferenceDate: 2024.09.01.~2024.09.05.
dc.citation.conferenceName: 25th Interspeech Conference 2024
dc.citation.edition: Interspeech 2024
dc.citation.endPage: 1814
dc.citation.startPage: 1810
dc.citation.title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.identifier.bibliographicCitation: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp.1810-1814
dc.identifier.doi: 10.21437/interspeech.2024-398
dc.identifier.scopusid: 2-s2.0-85206434807
dc.identifier.url: https://www.isca-speech.org/iscaweb/index.php/online-archive
dc.subject.keyword: emotional style and intensity control
dc.subject.keyword: expressive emotional speech synthesis
dc.subject.keyword: Text-to-speech
dc.type.other: Conference Paper
dc.identifier.pissn: 2308457X
dc.description.isoa: true
dc.subject.subarea: Language and Linguistics
dc.subject.subarea: Human-Computer Interaction
dc.subject.subarea: Signal Processing
dc.subject.subarea: Software
dc.subject.subarea: Modeling and Simulation
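
The abstract mentions a Cartesian-spherical transformation of arousal, valence, and dominance values. The record does not give the paper's formulation, so the following is only a minimal sketch of a generic Cartesian-to-spherical conversion. The function names, the axis order (arousal, valence, dominance), the `center` argument standing in for a neutral reference point, and the angle conventions are all assumptions made for illustration.

```python
import math


def cartesian_to_spherical(avd, center=(0.0, 0.0, 0.0)):
    """Convert an (arousal, valence, dominance) point to (r, theta, phi).

    Assumption: `center` is a hypothetical neutral reference point; the
    record does not say how the paper chooses the origin.
    """
    # Shift so the assumed neutral point becomes the origin.
    x, y, z = (p - c for p, c in zip(avd, center))
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; return zeros by convention.
        return 0.0, 0.0, 0.0
    # Clamp guards against tiny floating-point overshoot outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))  # polar angle in [0, pi]
    phi = math.atan2(y, x)                         # azimuth in [-pi, pi]
    return r, theta, phi


def spherical_to_cartesian(r, theta, phi, center=(0.0, 0.0, 0.0)):
    """Inverse of cartesian_to_spherical under the same assumed conventions."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return tuple(v + c for v, c in zip((x, y, z), center))


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    r, theta, phi = cartesian_to_spherical((0.8, 0.3, 0.5), center=(0.5, 0.5, 0.5))
    # Scaling r while keeping the angles fixed moves the point along one
    # direction from the assumed center.
    print(spherical_to_cartesian(0.5 * r, theta, phi, center=(0.5, 0.5, 0.5)))
```

Under these assumptions, the radius `r` could be read as a distance from the neutral point and the two angles as a direction. How the paper maps these quantities to intensity and style is not stated in this record.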

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Lee, Sang-Hoon (이상훈)
Department of Software and Computer Engineering

File Download

  • There are no files associated with this item.