Tencent: Enhanced Real-Time Speech Synthesis

Project phase
Adopt
Themes
Transaction to interaction
Cloud Everywhere
Next UI
Value chain
Service
Firm infrastructure
Technology
Innovation sector
Human-centered AI
SDGs
9. Industry, innovation and infrastructure

Project background

Intelligent speech applications are growing rapidly. The Chinese intelligent speech market is expected to reach CNY 19.48 billion by the end of 2021 [1]. Tencent has been dedicated to artificial intelligence (AI) research and Internet innovations to empower intelligent speech hardware vendors. The company is currently developing the Xiaowei intelligent speech and video service access platform. At the platform's core is Text to Speech (TTS) built on a neural vocoder, which performs high-quality TTS conversion and delivery through end-to-end acoustic models.

Project problem statement

Classic vocoder models such as WaveNet can generate high-fidelity audio, but their high complexity and heavy computation slow down speech synthesis, which limits their ability to meet real-time requirements in production scenarios. Continuous access by a large number of devices also strains the platform's throughput. Simply expanding server capacity is an imperfect solution, because it would sharply raise deployment costs. Tencent therefore decided to adopt newer vocoder models to optimize the Xiaowei platform in depth. In close collaboration with Intel, Tencent developed custom vocoder solutions based on Parallel WaveNet (pWaveNet) and WaveRNN, giving the platform strong TTS performance while reducing the total cost of ownership (TCO).
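The real-time constraint described above is commonly expressed as a real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1 means speech is generated faster than it plays back. The sketch below is illustrative only. The function names, the 16 kHz sample rate, and the timings are assumptions chosen for the example and are not figures from the Xiaowei platform. It also shows why an autoregressive vocoder such as WaveNet is costly: it produces one audio sample per sequential step.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (RTF < 1 is real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def autoregressive_steps(audio_seconds: float, sample_rate_hz: int) -> int:
    """Number of sequential steps when one audio sample is generated per step."""
    return int(round(audio_seconds * sample_rate_hz))


if __name__ == "__main__":
    # Hypothetical values: 2 s of audio at an assumed 16 kHz sample rate.
    steps = autoregressive_steps(2.0, 16_000)
    print(f"sequential steps: {steps}")

    # Hypothetical timing: 0.5 s of compute for 2 s of audio.
    rtf = real_time_factor(0.5, 2.0)
    print(f"real-time factor: {rtf:.2f}")
```

Every sequential step depends on the previous sample, which is why vocoders designed for parallel or lighter-weight generation are attractive for high-throughput serving.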

Technological innovations

Deep learning, Neural Networks, AI

Project objective

The solution uses 3rd Generation Intel® Xeon® Scalable Processors with integrated bfloat16 (BF16) extensions and Intel® Advanced Vector Extensions 512 (Intel® AVX-512). These features greatly reduce memory access and provide hardware acceleration when used together with the Intel® oneAPI Deep Neural Network Library (oneDNN).
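To illustrate why the BFloat (bfloat16) format reduces memory traffic, the sketch below converts a 32-bit float to bfloat16 by keeping its upper 16 bits with round-to-nearest-even. Each value then occupies 2 bytes instead of 4, while keeping the same 8-bit exponent range. This is a plain-Python illustration of the number format only, not the processor's instructions or the oneDNN API. The function names and the one-million-parameter model size are hypothetical, and NaN and out-of-range inputs are not handled.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for a finite x (round to nearest even)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Add a bias so that dropping the low 16 bits rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float32(b: int) -> float:
    """Expand a bfloat16 bit pattern back to a float by zero-filling the low bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


if __name__ == "__main__":
    value = 3.14159
    approx = bfloat16_bits_to_float32(float32_to_bfloat16_bits(value))
    print(f"float32 input: {value}, bfloat16 round trip: {approx}")

    # Hypothetical model size, used only to show the storage ratio.
    n_params = 1_000_000
    print(f"float32 weights:  {n_params * 4} bytes")
    print(f"bfloat16 weights: {n_params * 2} bytes")
```

The trade-off is precision: bfloat16 keeps only 7 explicit mantissa bits, so values are accurate to roughly two to three decimal digits.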

Technology Providers

Intel
