Zhipu AI has released GLM-4-Voice, an open-source speech large language model that combines speech recognition, text generation, and speech synthesis into a single system.
The model can convert speech to text, text to speech, and even speech to speech. GLM-4-Voice is built on the GLM-4 language model and supports both English and Chinese.
Like other open-source releases such as LG's EXAONE 3.0 and Google's Gemma, this one gives researchers and developers the tools to explore and advance speech AI.