Speech AI on AI Tech Blog

https://jesamkim.github.io/ai-tech-blog/categories/speech-ai/
Recent content in Speech AI on AI Tech Blog

From Understanding Speech to Creating It: The Road from SSL to Zero-Shot Voice Cloning
https://jesamkim.github.io/ai-tech-blog/posts/2026-06-09-modern-speech-ai-from-ssl-to-zero-shot-voice-cloning/
Tue, 09 Jun 2026 20:00:00 +0900

This post summarizes how representation learning from unlabeled speech (SSL) and text-to-speech synthesis (TTS) have each evolved, and how the two lines of work meet in zero-shot voice cloning. It walks through the design decisions behind CPC, wav2vec 2.0, and HuBERT, then WaveNet-family vocoders, flow matching in Voicebox, and LLM-based synthesis in CosyVoice 2, with an emphasis on intuition.