Software Review: Empowering Language Education with D-ID Creative Reality Studio’s Multimodal Capabilities

Chenghao Wang; Xueyun Li

doi:10.4018/IJCALLT.368218

Chenghao Wang^*, Xueyun Li

^*Corresponding author for this work

School of Humanities and Social Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

D-ID Creative Reality Studio (D-ID) is a platform for creating Artificial Intelligence (AI) presenter (digital human) videos, translating videos, and designing conversational agents. D-ID seamlessly integrates deep-learning face animation technology, large language models (LLMs), natural language processing (NLP), and speech synthesis and recognition (SSR), offering new possibilities for immersive language teaching and learning. As highlighted in the slogan, “Amaze your audience with your art” (D-ID, 2024), both language learners and teachers can utilise D-ID to create personalised multimodal digital learning resources. Aligning with the interaction hypothesis (Long, 1996), a conversational agent can serve as a language partner, offering accurate linguistic input and facilitating conversational practice anytime and anywhere. Additionally, the incorporation of a digital human is consistent with the Cognitive Theory of Multimedia Learning (Mayer & Moreno, 2003), which asserts that dual-channel input—integrating visual and auditory modalities—enhances memory retention and overall learning effectiveness. This software review centres on the web-based version of D-ID and provides a detailed analysis of the main features that can contribute to second language acquisition.

Original language: English
Journal: International Journal of Computer-Assisted Language Learning and Teaching
Volume: 15
Issue number: 1
DOI: https://doi.org/10.4018/IJCALLT.368218
Publication status: Published - Jan 2025

Cite this

@article{af9cc45efffb4b789375b95471ede315,

title = "Software Review: Empowering Language Education with D-ID Creative Reality Studio{\textquoteright}s Multimodal Capabilities",

abstract = "D-ID Creative Reality Studio (D-ID) is a platform for creating Artificial Intelligence (AI) presenter (digital human) videos, translating videos, and designing conversational agents. D-ID seamlessly integrates deep-learning face animation technology, large language models (LLMs), natural language processing (NLP), and speech synthesis and recognition (SSR), offering new possibilities for immersive language teaching and learning. As highlighted in the slogan, ``Amaze your audience with your art'' (D-ID, 2024), both language learners and teachers can utilise D-ID to create personalised multimodal digital learning resources. Aligning with the interaction hypothesis (Long, 1996), a conversational agent can serve as a language partner, offering accurate linguistic input and facilitating conversational practice anytime and anywhere. Additionally, the incorporation of a digital human is consistent with the Cognitive Theory of Multimedia Learning (Mayer \& Moreno, 2003), which asserts that dual-channel input---integrating visual and auditory modalities---enhances memory retention and overall learning effectiveness. This software review centres on the web-based version of D-ID and provides a detailed analysis of the main features that can contribute to second language acquisition.",

author = "Chenghao Wang and Xueyun Li",

year = "2025",

month = jan,

doi = "10.4018/IJCALLT.368218",

language = "English",

volume = "15",

journal = "International Journal of Computer-Assisted Language Learning and Teaching",

issn = "2155-7098",

number = "1",

}

TY - JOUR

T1 - Software Review: Empowering Language Education with D-ID Creative Reality Studio’s Multimodal Capabilities

AU - Wang, Chenghao

AU - Li, Xueyun

PY - 2025/1

Y1 - 2025/1

N2 - D-ID Creative Reality Studio (D-ID) is a platform for creating Artificial Intelligence (AI) presenter (digital human) videos, translating videos, and designing conversational agents. D-ID seamlessly integrates deep-learning face animation technology, large language models (LLMs), natural language processing (NLP), and speech synthesis and recognition (SSR), offering new possibilities for immersive language teaching and learning. As highlighted in the slogan, “Amaze your audience with your art” (D-ID, 2024), both language learners and teachers can utilise D-ID to create personalised multimodal digital learning resources. Aligning with the interaction hypothesis (Long, 1996), a conversational agent can serve as a language partner, offering accurate linguistic input and facilitating conversational practice anytime and anywhere. Additionally, the incorporation of a digital human is consistent with the Cognitive Theory of Multimedia Learning (Mayer & Moreno, 2003), which asserts that dual-channel input—integrating visual and auditory modalities—enhances memory retention and overall learning effectiveness. This software review centres on the web-based version of D-ID and provides a detailed analysis of the main features that can contribute to second language acquisition.

U2 - 10.4018/IJCALLT.368218

DO - 10.4018/IJCALLT.368218

M3 - Article

SN - 2155-7098

VL - 15

JO - International Journal of Computer-Assisted Language Learning and Teaching

JF - International Journal of Computer-Assisted Language Learning and Teaching

IS - 1

ER -
