Publications
Speech Synthesis
NeurIPS 2019

FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
- FastSpeech is the first fully parallel end-to-end speech synthesis model.
- Academic Impact: This work has been adopted by many well-known open-source speech synthesis projects, such as ESPNet. It has been covered by more than 20 media outlets and forums, such as 机器之心 and InfoQ.
- Industry Impact: FastSpeech has been deployed in Microsoft Azure TTS service and supports 49 more languages with state-of-the-art AI quality. It was also shown as a text-to-speech system acceleration example in NVIDIA GTC2020.
ICLR 2021

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
- This work has been adopted by many well-known open-source speech synthesis projects, such as PaddlePaddle/Parakeet, ESPNet, and fairseq.
NeurIPS 2021

PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren, Jinglin Liu, Zhou Zhao
Project
AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao
Project
- SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech, Zhenhui Ye, Zhou Zhao, Yi Ren, Fei Wu, IJCAI 2022
- EditSinger: Zero-Shot Text-Based Singing Voice Editing System with Diverse Prosody Modeling, Lichao Zhang, Zhou Zhao, Yi Ren, Liqun Deng, IJCAI 2022 (Oral)
- FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis, Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao, IJCAI 2022 (Oral)
- Revisiting Over-Smoothness in Text to Speech, Yi Ren, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu, ACL 2022
- Learning the Beauty in Songs: Neural Singing Voice Beautifier, Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao, ACL 2022
- ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech, Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao, ICASSP 2022
- EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model, Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei and Zhou Zhao, INTERSPEECH 2021
- WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution, Kexun Zhang, Yi Ren, Changliang Xu and Zhou Zhao, INTERSPEECH 2021 (best student paper award candidate)
- Denoising Text to Speech with Frame-Level Noise Modeling, Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu, ICASSP 2021 | Project
- Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus, Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao, ACM-MM 2021 (Oral)
- FedSpeech: Federated Text-to-Speech with Continual Learning, Ziyue Jiang, Yi Ren, Ming Lei and Zhou Zhao, IJCAI 2021
- DeepSinger: Singing Voice Synthesis with Data Mined From the Web, Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu, KDD 2020 | Project
- LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition, Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu, KDD 2020 | Project
- MultiSpeech: Multi-Speaker Text to Speech with Transformer, Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, INTERSPEECH 2020 | Project
- Almost Unsupervised Text to Speech and Automatic Speech Recognition, Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu, ICML 2019 (Oral) | Project
Lip Generation/Understanding
- Parallel and High-Fidelity Text-to-Lip Generation, Jinglin Liu, Zhiying Zhu, Yi Ren, Wencan Huang, Baoxing Huai, Nicholas Yuan, Zhou Zhao, AAAI 2022
- Flow-based Unconstrained Lip to Speech Generation, Jinzheng He, Zhou Zhao, Yi Ren, Jinglin Liu, Baoxing Huai, Nicholas Yuan, AAAI 2022
- FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire, Jinglin Liu, Yi Ren, Zhou Zhao, Chen Zhang, Baoxing Huai, Jing Yuan, ACM-MM 2020
Machine Translation
- UWSpeech: Speech to Speech Translation for Unwritten Languages, Chen Zhang, Xu Tan, Yi Ren, Tao Qin, Kejun Zhang, Tie-Yan Liu, AAAI 2021 | Project
- Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation, Jinglin Liu, Yi Ren, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao and Tie-Yan Liu, IJCAI 2020
- SimulSpeech: End-to-End Simultaneous Speech to Text Translation, Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu, ACL 2020
- A Study of Non-autoregressive Model for Sequence Generation, Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, Sheng Zhao, Tie-Yan Liu, ACL 2020
- Multilingual Neural Machine Translation with Knowledge Distillation, Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu, ICLR 2019
Music Generation
- SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint, Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin, AAAI 2021
- PopMAG: Pop Music Accompaniment Generation, Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu, ACM-MM 2020 (Oral) | Project
Generative Model
- Pseudo Numerical Methods for Diffusion Models on Manifolds, Luping Liu, Yi Ren, Zhijie Lin, Zhou Zhao, ICLR 2022