2026-05-24 16:49:21 +01:00
2026-05-24 16:49:21 +01:00
2026-05-22 20:24:58 +01:00
2026-05-22 19:42:08 +01:00
2026-05-22 20:24:58 +01:00
2026-05-22 20:24:58 +01:00
2026-03-30 18:18:41 +01:00
2026-05-22 20:36:33 +01:00
2026-05-24 16:49:21 +01:00
2026-05-22 19:56:46 +01:00
2026-05-24 16:25:39 +01:00

YouTube Auto Dub

YouTube Auto Dub is a Python pipeline that downloads a YouTube video, transcribes its speech with Whisper, translates the subtitle text through a local LM Studio server, and renders a subtitled output video.

What Changed

  • Translation now uses an OpenAI-compatible /v1/chat/completions endpoint.
  • Google Translate scraping has been removed from the active runtime path.
  • OpenAI compatible backend is now the default with no option for Google Translate.
  • Translation settings can be configured with environment variables or CLI flags.

Requirements

  • Python 3.10+
  • uv
  • FFmpeg and FFprobe available on PATH
  • An OpenAI-compatible server

Setup

Create a UV-managed virtual environment in a repo subfolder and install dependencies:

uv venv --python "C:\pinokio\bin\miniconda\python.exe" .venv
uv pip install --python .venv\Scripts\python.exe -r requirements.txt

Verify the local toolchain:

.venv\Scripts\python.exe --version
ffmpeg -version
ffprobe -version
.venv\Scripts\python.exe main.py --help

LM Studio Configuration

Start LM Studio's local server and load a translation-capable model. The default model name in this repo is:

gemma-3-4b-it

If your local LM Studio model name differs, set it with an environment variable or --lmstudio-model.

Environment Variables

$env:LM_STUDIO_BASE_URL="http://127.0.0.1:1234/v1"
$env:LM_STUDIO_API_KEY="lm-studio"
$env:LM_STUDIO_MODEL="gemma-3-4b-it"

Defaults if unset:

  • LM_STUDIO_BASE_URL=http://127.0.0.1:1234/v1
  • LM_STUDIO_API_KEY=lm-studio
  • LM_STUDIO_MODEL=gemma-3-4b-it

Usage

Basic example:

.venv\Scripts\python.exe main.py "https://youtube.com/watch?v=VIDEO_ID" --lang es

Gradio Web UI

Gradio provides a local browser UI for starting dub jobs, watching progress, and downloading finished videos:

.venv\Scripts\python.exe web_app.py

Open http://127.0.0.1:7860 and submit a YouTube URL. Jobs run through the same main.py pipeline, so the CLI options and environment variables still apply.

The OpenAI-compatible translation endpoint, API key, and model can be changed in the UI under OpenAI-Compatible Settings. Click Save Settings to persist them to .cache/web_settings.json for future web jobs. Unsaved values in the fields are still used for the next job you start.

You can also upload a local .mp4 instead of entering a YouTube URL. Uploaded videos are staged under .cache/uploads and processed with the same transcription, translation, dubbing, and render pipeline. Restricted YouTube videos can use the Upload Cookies File control instead of typing a local cookies path.

The web UI automatically refreshes job status, progress, steps, and output choices every few seconds while it is open. The manual Refresh button is still available.

Translations and raw TTS clips are cached under .cache/translations and .cache/tts. This lets reruns skip work that already succeeded, which is especially useful after transient TTS failures. Set TRANSLATION_CACHE_ENABLED=0 or TTS_CACHE_ENABLED=0 to disable those caches.

Docker

Build and run the Gradio UI in a container:

docker build -t youtube-auto-dub:gradio .
docker run --rm -p 7860:7860 `
  -e LM_STUDIO_BASE_URL=http://host.docker.internal:1234/v1 `
  -e LM_STUDIO_API_KEY=lm-studio `
  -e LM_STUDIO_MODEL=gemma-3-4b-it `
  -v ${PWD}\.cache:/app/.cache `
  -v ${PWD}\output:/app/output `
  -v ${PWD}\logs:/app/logs `
  -v ${PWD}\temp:/app/temp `
  youtube-auto-dub:gradio

Or use Compose:

docker compose up --build

When LM Studio runs on the host machine, use http://host.docker.internal:1234/v1 from inside Docker instead of http://127.0.0.1:1234/v1.

Override the LM Studio endpoint or model from the CLI:

.venv\Scripts\python.exe main.py "https://youtube.com/watch?v=VIDEO_ID" `
  --lang fr `
  --translation-backend lmstudio `
  --lmstudio-base-url http://127.0.0.1:1234/v1 `
  --lmstudio-model gemma-3-4b-it

Authentication options for restricted videos still work as before:

.venv\Scripts\python.exe main.py "https://youtube.com/watch?v=VIDEO_ID" --lang ja --browser chrome
.venv\Scripts\python.exe main.py "https://youtube.com/watch?v=VIDEO_ID" --lang de --cookies cookies.txt

Process a local MP4:

.venv\Scripts\python.exe main.py --input-file "C:\path\to\video.mp4" --lang es

CLI Options

Option Description
url YouTube video URL to process
--input-file Local MP4 file to process instead of a YouTube URL
--lang, -l Target language code
--browser, -b Browser name for cookie extraction
--cookies, -c Path to exported cookies file
--gpu Prefer GPU acceleration when CUDA is available
--whisper_model, -wm Override Whisper model
--translation-backend Translation backend, currently lmstudio
--lmstudio-base-url Override LM Studio base URL
--lmstudio-model Override LM Studio model name

Translation Behavior

The LM Studio translator is tuned for subtitle-like text:

  • preserves meaning, tone, and intent
  • keeps punctuation natural
  • returns translation text only
  • preserves line and segment boundaries
  • leaves names, brands, URLs, emails, code, and proper nouns unchanged unless transliteration is clearly needed
  • avoids commentary, summarization, and censorship

Translation is currently performed segment-by-segment to keep subtitle ordering deterministic and reduce the risk of malformed batched output corrupting timing alignment.

Testing

Run the focused validation suite:

.venv\Scripts\python.exe -m pytest
.venv\Scripts\python.exe main.py --help

The tests cover:

  • LM Studio request payload construction
  • response parsing
  • retry handling for transient HTTP failures
  • empty or malformed response handling
  • CLI and environment config precedence

Troubleshooting

LM Studio connection errors

  • Make sure LM Studio's local server is running.
  • Confirm the base URL ends in /v1.
  • Check that the loaded model name matches LM_STUDIO_MODEL or --lmstudio-model.

Empty or malformed translations

  • Try a stronger local instruction-tuned model if your current model ignores formatting.
  • Keep LM Studio in non-streaming OpenAI-compatible mode.
  • Review the server logs for model-side failures.

FFmpeg missing

If startup reports missing ffmpeg or ffprobe, install FFmpeg and add it to your system PATH.

Project Layout

youtube-auto-dub/
|-- main.py
|-- requirements.txt
|-- language_map.json
|-- README.md
|-- LM_STUDIO_MIGRATION.md
|-- src/
|   |-- core_utils.py
|   |-- engines.py
|   |-- media.py
|   |-- translation.py
|   `-- youtube.py
`-- tests/
    |-- conftest.py
    |-- test_main_cli.py
    `-- test_translation.py
Description
No description provided
Readme MIT 215 KiB
Languages
Python 96.5%
PowerShell 3.5%