Group Relative Policy Optimization for Text-to-Speech with Large Language Models
Chang Liu, Ya-Jun Hu, Ying-Ying Gao, Shi-Lei Zhang, Zhen-Hua Ling
Abstract: This paper proposes a GRPO-based approach to enhance the performance of large language model (LLM)-based text-to-speech (TTS) models by deriving rewards from an off-the-shelf automatic speech recognition (ASR) model. Compared to previous reinforcement learning methods for LLM-based TTS, our method requires no dedicated model for reward computation or training. Moreover, we design a composite reward function that combines character error rate (CER) with negative log-likelihood (NLL) obtained from the ASR model, providing more informative and accurate reward signals. We apply GRPO fine-tuning to pre-trained LLM-based TTS models and evaluate their zero-shot TTS performance. Experimental results show that the proposed method substantially improves both the intelligibility and naturalness of synthesized speech. Ablation studies and further analyses confirm the effectiveness of integrating the two reward components.
Code and models will be released after review.
Code is in progress at https://github.com/ryuclc/CosyVoice2-GRPO
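The abstract describes a composite reward built from the ASR model's character error rate (CER) and negative log-likelihood (NLL), used inside GRPO. The following is a minimal sketch of how such a reward and the group-relative advantage could be computed. It is not the authors' implementation. The weighted-sum form of `composite_reward`, the weights `alpha` and `beta`, and the use of the population standard deviation are all assumptions made for illustration. The ASR transcript (`hyp`) and the NLL value are taken as given inputs here; in practice they would come from the off-the-shelf ASR model.

```python
from statistics import mean, pstdev


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def composite_reward(ref: str, hyp: str, nll: float,
                     alpha: float = 1.0, beta: float = 1.0) -> float:
    """Hypothetical composite reward: lower CER and lower NLL give a higher reward.

    The weighted-sum form and the weights are assumptions, not taken from the paper.
    """
    return -(alpha * cer(ref, hyp) + beta * nll)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group of samples drawn for the same input."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: three sampled syntheses of the same text, each with an ASR transcript
# and an ASR NLL (the NLL values below are made up for illustration).
target = "check out this bad boy"
samples = [("check out this bad boy", 0.2),
           ("check out this bad toy", 0.6),
           ("check it out bad boy", 1.1)]
rewards = [composite_reward(target, hyp, nll) for hyp, nll in samples]
advantages = group_relative_advantages(rewards)
```

Because advantages are normalized within each group, only the relative ranking and spread of rewards among samples of the same text matter, which is why no separate value or reward model is needed in this setup.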
Zero-shot In-context Generation
| Language | Prompt | Text | CosyVoice2 | + GRPO-CER | + GRPO-NLL | + GRPO-CER-NLL |
|---|---|---|---|---|---|---|
| zh | 哎呀,你都多少岁的人了,还像个小孩子! | 哦,等他酒醒以后,我打电话请公司的人来保他出去。 | | | | |
| zh | 没什么没什么,只是平时他总是站在这里,有点奇怪而已。 | 北京在出行规模,城市影响力方面表表现优异。 | | | | |
| zh | 站住。几位是要去沙漠吧?我劝你们,还是考虑清楚再出发。 | 事实上,不是淘宝,阿里巴巴太便宜了,而是商场贵了。 | | | | |
| zh | 偶尔我也会在那边做代班酒保,有什么话想跟我说的话,可以趁那时来酒馆找我。 | 我终生的使命就是爬啊爬啊,爬向大海,不管有多远。 | | | | |
| zh | 天地之间,无处不是野草般的生命。 | 记得我拉满弓弦的那一刻,每一支箭都承载了我的希望与决心,射日亦是如此。 | | | | |
| zh | 乔治,如果你假装做一个小宝宝,我保证会一直对你好的,永远都不会改变。 | 妈妈,我今天在幼儿园学到了好多新东西,不过最有意思的还是和大家一起玩跷跷板。 | | | | |
| zh | 这舞步只有俺老猪会跳!顶多还有那个紫色的家伙! | 只要有我在,咱们的团队就坚不可摧!来吧,让他们见识见识咱的实力! | | | | |
| zh | 寻寻觅觅,冷冷清清,凄凄惨惨戚戚。 | 夜幕低垂,星光璀璨,仿佛是无数诗人在天穹中镌刻下的永恒诗篇,直抵心灵。 | | | | |
| en | No no! Just strange for him to not be around. Paimon always sees him standing here. | The pilot was then re-shot with the different actors and aired. | | | | |
| en | Good morning. I'm in the habit of enjoying a cup of tea before breakfast. Shall I pour some for you? | The nest is a flimsy platform of sticks is built by both sexes. | | | | |
| en | Everyone has their own desires. To bring together and fulfill those desires and make everyone happy — that is the purpose my Archon has bestowed upon me. | Criticism centered on swing states such as Kentucky, Tennessee, Pennsylvania, and Maryland. | | | | |
| en | Hey, what's that supposed to mean!? You trying to say I'm not famous enough and my intentions are no good? | It's a long way to Tipperary, it's a long way to go! | | | | |
| en | President Carter, President Clinton, President Bush, President Obama, fellow Americans. | We must ensure the security and prosperity of our nation. Together, we can restore America's greatness and pride for future generations. | | | | |
| en | I have many brothers. For every Christmas there has ever been. | In that distant era, countless legendary tales unfolded on this land, evoking a long-lost emotion in people's hearts. | | | | |
| en | While the second is part of presentaton. | Our research indicates that prolonged high-intensity work may lead to significant mental health issues, challenging workplace health policies. | | | | |
| en | All right, check out this bad boy, twelve megabytes of RAM. | As the sun set, he finally bid farewell to the town filled with memories, with new hopes in his heart. | | | | |
| ja | どんなに長い戦いだろうと、あなたは耐え抜いてきた。その強敵を倒せば、苦労の末に成功を収めることができるわ! | 具体的当為は、我々が自己自身を否定するものによって生きるという個人的存在、 | | | | |
| ja | あぁん、どういう意味だ?この俺様が有名じゃないと?それともなんだ、俺様が何か企んでるとでも思ってんのか? | 思いっきり投げたボールは見事にすっぽ抜けた | | | | |
| ja | そうだね。アタシのお店にも沢山来てほしいし、楽しく食べて思いっきり遊ぼう! | これ面白いと思ってるの俺だけだろうなあ | | | | |
| ja | 友人が集まるのは、楽しさを求めてのこと——考えすぎは、かえってある種の束縛になってしまうでござるな。 | 入試前の夏休みだというのに、本ばかり読みふけっていた。 | | | | |
| ja | 実は前回ここを訪れた時、我は胸が締め付けられるような思いであった。だが…此度はなぜか、違うようだ。 | 知るということも、存在と存在との関係である。 | | | | |
| ko | 아무리 기나긴 전투라 한들 끝까지 버텨 이 강적을 처치한다면, 우여곡절 끝에 성공을 거머쥘 수 있을 거야! | 자네 허구 살려면 자연히 고생을 하지 않을 수가 없게 되지 않겠나. | | | | |
| ko | 별거 아냐. 항상 여기 있던 사람이 없길래, 이상해서 | 그들은 지게를 지고 갈서서 가면서 이런 말을 하였다. | | | | |
| ko | 그랬구나…. 어떤 표정을 지어야 할지 모르겠네 | 내 입은 진리를 말하며 내 입술은 악을 미워하느니라 | | | | |
| ko | 벗이란 즐거우려고 만나는 거지, 생각이 많아지면 속박이 되는 법이니까 | 간난이는 선비의 허리를 껴안으며 이렇게 중얼거렸다. | | | | |
| ko | 고발자가 한 말이 사실이라면, 이건 정말 심각한 문제인데… | 너는 벙어리와 고독한 자의 송사를 위하여 입을 열지니라 | | | | |