transcriber

A fast video transcription CLI tool powered by whisper.cpp. Extracts audio from video files and transcribes it to text with GPU acceleration.

Features

Video → Audio → Text: Automatic audio extraction via FFmpeg
GPU Accelerated: Metal (macOS M*), Vulkan (NVIDIA), CUDA, or CPU fallback
Multiple Output Formats: TXT, SRT (subtitle), JSON (with word-level timestamps), MD (Markdown with YAML front matter)
Batch Processing: Transcribe entire directories recursively
Model Quantization: Q4_K/Q5_K/Q6_K/Q8_0 for speed/size tradeoffs
Configurable: YAML configuration file with CLI argument overrides
Progress Reporting: Real-time progress bars for downloads and transcription

Installation

Prerequisites

FFmpeg: Required for audio extraction
- macOS: brew install ffmpeg
- Ubuntu: sudo apt install ffmpeg
- Windows: Download from https://ffmpeg.org/

Homebrew (macOS / Linux)

brew tap RedAtman/tap
brew install transcriber

Taps into RedAtman/homebrew-tap. Formula is automatically updated on each release.

From Source

git clone <repo-url>
cd transcriber
cargo build --release

The binary is at ./target/release/transcriber.

Usage

Single File Transcription

# Transcribe with default settings (base model, txt output)
transcriber -i video.mp4

# Specify model and language
transcriber -i video.mp4 -m medium -l zh

# Custom output directory and SRT format
transcriber -i video.mp4 -o ./subtitles --format srt

Batch Processing

# Transcribe all videos in a directory
transcriber -d ./videos

# Skip already-transcribed files
transcriber -d ./videos --skip-existing

# Multiple output formats
transcriber -d ./videos --format "txt,srt,json"

Inference Parameters

# Initial prompt for decoder context
transcriber -i video.mp4 --initial-prompt "technical terms"

# Sampling temperature (0.0 = deterministic, 1.0 = more random)
transcriber -i video.mp4 --temperature 0.2

# Suppress non-speech tokens
transcriber -i video.mp4 --suppress-non-speech

# No-speech detection threshold
transcriber -i video.mp4 --no-speech-threshold 0.5

# Split on word boundaries
transcriber -i video.mp4 --split-on-word

# Combined example
transcriber -i video.mp4 -m medium -l en --initial-prompt "technology" --temperature 0.3
transcriber -i video.mp4 -m medium -l zh --initial-prompt "科技" --temperature 0.3

Media Metadata

# Add custom metadata to the output (stored in Markdown YAML front matter)
transcriber -i video.mp4 --format md --meta title="My Talk" --meta location=Beijing

# Media metadata is auto-detected via ffprobe when using md format:
#   - source: filename, size, format, bitrate, duration
#   - video: codec, resolution, FPS
#   - audio: codec, sample rate, channels

Configuration

# Generate default config file
transcriber init

# Use custom config
transcriber -i video.mp4 --config ./my-config.yaml

Default config location: ~/.config/transcriber/config.yaml

Available Models

Model	Size	Description
tiny	75 MB	Fastest, lowest quality
base	148 MB	Default, balanced
small	488 MB	Better quality
medium	1.5 GB	High quality
large-v3-turbo	800 MB	Best quality/speed ratio

Output Formats

TXT: Plain text, one segment per line
SRT: SubRip subtitle format with timestamps
JSON: Structured data with word-level timing and metadata
MD: Markdown with YAML front matter — includes auto-detected media metadata (source, video, audio info), custom metadata, and plain text body (no timestamps)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 47 Commits
.githooks		.githooks
.github/workflows		.github/workflows
openspec		openspec
src		src
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
README_zh.md		README_zh.md
RELEASE.md		RELEASE.md
build.rs		build.rs
config.yaml.example		config.yaml.example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

transcriber

Features

Installation

Prerequisites

Homebrew (macOS / Linux)

From Source

Usage

Single File Transcription

Batch Processing

Inference Parameters

Media Metadata

Configuration

Available Models

Output Formats

License

About

Uh oh!

Releases 8

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

transcriber

Features

Installation

Prerequisites

Homebrew (macOS / Linux)

From Source

Usage

Single File Transcription

Batch Processing

Inference Parameters

Media Metadata

Configuration

Available Models

Output Formats

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 8

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages