R&D

VOICE RECOGNITION AND COGNITION

Overview

Recently, speech-to-text (STT) and text-to-speech (TTS) technology has advanced significantly with the wave of AI and has been applied across many industries, in products such as virtual assistants, intelligent customer service, smart homes, smart speakers, and in-vehicle voice control systems. In the future, AI speech technology is expected to become a foundation for "metaverse" virtual worlds, with STT and TTS as its key elements. Our research on AI speech technologies covers multilingual speech recognition and synthesis, speaker verification, sound event detection, and more. We then turn these research results into API solutions and interactive speech applications.

CORE TECHNOLOGY

  • Speech-to-Text
  • Voiceprint Recognition
  • Sound Event Detection
  • Text-to-Speech

Application Status

Speech-to-Text: We use deep learning models to convert speech into text and to extract voice attributes. Our technologies include multilingual speech recognition localized for Taiwan (Mandarin, English, Taiwanese, and Hakka), voice attribute analysis (language, gender, age, and emotion), voiceprint recognition, and detection of ambient sounds and sound events. These technologies have been applied to Chunghwa Telecom's "iBobby" smart speaker, the MOD voice assistant, IVR voice navigation, the outbound robot, voice-of-customer analysis, and other enterprise customer projects. We also won the TCCDA excellent customer service award "Best Intelligent System Application Enterprise" for two consecutive years, in 2020 and 2021.
Text-to-Speech: We use deep learning audio generation models to convert text into realistic, vivid, and natural-sounding speech. Our technologies include multilingual speech synthesis localized for Taiwan (Mandarin, English, Taiwanese, and Hakka), emotional speech synthesis, and multilingual speaker voice conversion, which generates a personalized synthetic voice from only a few audio files. These technologies have been applied to the "iBobby" smart speaker, the MOD voice assistant, IVR voice navigation, the outbound robot, and enterprise customer projects.