A fast, efficient, and feature-rich Telegram bot built with grammY, Bun, and SQLite. It provides powerful regex-based substitution (sed style) with a focus on performance, scalability, and robust error handling.
- Sed-Style Substitution: Use `s/pattern/replacement/flags` commands to perform regex substitutions on messages within the chat history or on specific replies.
- Edit Support: Edit your `s/.../.../` commands, and the bot will automatically update its corresponding reply with the new substitution result.
- High-Performance Worker Pool: Regex operations are offloaded to a pool of Bun Worker threads, ensuring the bot remains responsive even under heavy load or with complex patterns.
- Dynamic Worker Pool V2: Optional advanced worker pool with dynamic scaling, health monitoring, and automatic idle worker termination.
- Performance Timing: Use the `p` flag (e.g., `s/pattern/repl/p`) to measure and display the execution time of the substitution chain.
- Regex Pattern Caching: LRU cache with TTL for compiled regex patterns, so repeated patterns are not recompiled on every use.
- Per-User Rate Limiting: Configurable rate limiting to prevent spam and abuse (default: 30 commands/minute per user).
- Health Monitoring: Real-time health metrics with automatic status detection (healthy/degraded/unhealthy).
- Configurable Logging: Features a custom, module-based logger with configurable levels (`none`, `debug`, `info`, `warn`, `error`, `fatal`) and a customizable output template.
- Target Protection: Prevents `s/.../.../` commands from operating on other `s/.../.../` command messages, avoiding unintended behavior.
- Runtime Safety: Includes a configurable timeout (default 60 seconds) for regex execution to prevent hanging on potentially malicious or extremely slow patterns.
- Opportunistic Cleanup: Automatically removes message history and bot reply mappings older than 48 hours on every bot update for efficiency.
- Error Resilience: Handles Telegram API errors gracefully (e.g., "message is not modified", flood control) and avoids resending identical messages unnecessarily.
- Custom Error Hierarchy: Granular error types with user-friendly messages (RegexError, RateLimitError, WorkerError, etc.).
- Circuit Breaker Pattern: Prevents cascading failures by stopping requests to failing services.
- Multi-Language Support: Full i18n with 11 languages including English, German, Spanish, Italian, Polish, Swedish, Russian, Ukrainian, Japanese, Korean, and Chinese (Simplified).
- Grouping Support: Fully supports regex capture groups (`(\w+)`) and references to them in the replacement string using `$1` (modern syntax) or `\1` (legacy regexbot syntax). Both syntaxes can be mixed in the same replacement.
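To make the regex pattern caching feature concrete, here is a minimal sketch of an LRU cache with a TTL for compiled `RegExp` objects. It is not the bot's actual implementation: the class name `RegexCache`, its constructor parameters, and the injectable clock are hypothetical. The defaults mirror the documented `CACHE_MAX_SIZE` and `CACHE_TTL_MS` values.

```typescript
// Hypothetical sketch of an LRU + TTL cache for compiled regexes.
// RegexCache and its parameters are illustrative names, not the bot's API.
interface CacheEntry {
  regex: RegExp;
  expiresAt: number;
}

class RegexCache {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, CacheEntry>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSize = 1000, ttlMs = 300_000, now: () => number = Date.now) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get size(): number {
    return this.entries.size;
  }

  get(pattern: string, flags = ""): RegExp {
    // Flags cannot contain "/", so this key is unambiguous.
    const key = `${flags}/${pattern}`;
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      // Re-insert to mark the entry as most recently used.
      this.entries.delete(key);
      this.entries.set(key, hit);
      hit.regex.lastIndex = 0; // reset state left over from earlier global matches
      return hit.regex;
    }
    const regex = new RegExp(pattern, flags); // throws SyntaxError on invalid input
    this.entries.delete(key); // drop the expired entry, if there was one
    this.entries.set(key, { regex, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return regex;
  }
}
```

Calling `get` twice with the same pattern and flags within the TTL returns the same compiled object. Once the cache is full, the entry that has gone unused the longest is evicted first.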
regexYbot supports multiple languages with automatic detection based on Telegram user settings:
Supported Languages:
- 🇺🇸 English (default)
- 🇩🇪 German (Deutsch)
- 🇪🇸 Spanish (Español)
- 🇮🇹 Italian (Italiano)
- 🇵🇱 Polish (Polski)
- 🇸🇪 Swedish (Svenska)
- 🇷🇺 Russian (Русский)
- 🇺🇦 Ukrainian (Українська)
- 🇯🇵 Japanese (日本語)
- 🇰🇷 Korean (한국어)
- 🇨🇳 Chinese Simplified (简体中文)
Language Commands:
- `/language` - Show your current language
- `/language list` - List all available languages
- `/language set <code>` - Change language (e.g., `/language set de`)
The bot automatically detects your language from Telegram settings. If your language isn't supported, it falls back to English. Translations are stored in the locales/ directory using the Fluent format. Contributions for new languages or improvements are welcome!
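The detection and fallback behavior described above can be sketched as follows. This is an illustration only: `resolveLocale` and `SUPPORTED_LOCALES` are hypothetical names, and apart from `de` (used in the `/language set de` example) the locale codes are assumed from the language list rather than taken from the bot's source. Telegram reports a user's language as an IETF language tag such as `de` or `pt-br`.

```typescript
// Hypothetical sketch of locale resolution with an English fallback.
// The locale codes below are assumptions based on the supported language list.
const SUPPORTED_LOCALES = [
  "en", "de", "es", "it", "pl", "sv", "ru", "uk", "ja", "ko", "zh",
] as const;

type Locale = (typeof SUPPORTED_LOCALES)[number];

function resolveLocale(languageCode: string | undefined, override?: string): Locale {
  // An explicit user choice (e.g. from `/language set`) wins over the Telegram setting.
  for (const candidate of [override, languageCode]) {
    if (!candidate) continue;
    // Reduce tags such as "de-DE" or "pt_BR" to their base language.
    const base = candidate.toLowerCase().split(/[-_]/)[0];
    const match = SUPPORTED_LOCALES.find((locale) => locale === base);
    if (match) return match;
  }
  return "en"; // unsupported or missing language falls back to English
}
```

For example, `resolveLocale("de-DE")` yields `"de"`, while `resolveLocale("fr")` yields `"en"` because French is not in the list.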
- `/start`: Get a greeting message and a brief guide on how to use the bot.
- `/privacy`: Displays the bot's privacy policy.
- `s/find/replace/flags`: Performs a regex substitution.
  - Example: `s/old/new/gi` replaces all occurrences of "old" (case-insensitive) with "new".
  - Example with Groups: `s/(\w+) (\w+)/$2 $1/` (modern syntax) or `s/(\w+) (\w+)/\2 \1/` (legacy regexbot syntax) swaps the first two words in a message. Both syntaxes are supported, and they can be mixed (`/$2 \1/`).
  - Example with Performance: `s/complex_pattern/replacement/gip` performs a global, case-insensitive substitution and prints the execution time.
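The command syntax above can be illustrated with a small parser and executor. This is a simplified sketch, not the bot's `parseSedCommands`: the names `parseSed` and `applySed` are hypothetical, chaining and timeouts are left out, and an escaped backslash directly before a digit in the replacement is not handled. It shows how the `p` flag can be separated from the regex flags and how legacy `\1` references can be normalized to `$1`.

```typescript
// Hypothetical sketch: parse a single s/pattern/replacement/flags command.
interface SedCommand {
  pattern: string;
  replacement: string;
  flags: string; // regex flags only, with the timing flag removed
  timed: boolean; // true when the "p" flag was given
}

function parseSed(input: string): SedCommand | null {
  if (!input.startsWith("s/")) return null;
  const parts: string[] = [];
  let current = "";
  for (let i = 2; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === "\\" && i + 1 < input.length) {
      // Keep escapes intact, except "\/" which becomes a literal slash.
      const next = input.charAt(i + 1);
      current += next === "/" ? "/" : ch + next;
      i++;
    } else if (ch === "/" && parts.length < 2) {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (parts.length === 0 || parts[0] === "") return null;
  // Without a trailing slash, the remainder is the replacement and there are no flags.
  const replacement = parts.length === 2 ? (parts[1] ?? "") : current;
  const rawFlags = parts.length === 2 ? current : "";
  return {
    pattern: parts[0] ?? "",
    replacement,
    flags: rawFlags.replace(/p/g, ""),
    timed: rawFlags.includes("p"),
  };
}

function applySed(text: string, cmd: SedCommand): string {
  const regex = new RegExp(cmd.pattern, cmd.flags);
  // Normalize legacy "\1" references to "$1" so both syntaxes can be mixed.
  const replacement = cmd.replacement.replace(/\\(\d)/g, "$$$1");
  return text.replace(regex, replacement);
}
```

With these helpers, the mixed-syntax command `s/(\w+) (\w+)/$2 \1/` applied to `hello world` swaps the two words.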
Configure the bot's behavior with the following environment variables:
| Variable | Required | Description | Default Value |
|---|---|---|---|
| `TOKEN` | Yes | Your Telegram bot token. | (none) |
| `BASE_URL` | No | Base URL for the Telegram Bot API, useful for local testing. | `https://api.telegram.org` |
| `LOG_LEVEL` | No | Sets the minimum log level. Available levels: `none`, `debug`, `info`, `warn`, `error`, `fatal`. | `debug` (development), `info` (production) |
| `LOG_TEMPLATE` | No | Customizes the log output format. | `[{level}: {module}]: {message}` |
| `NODE_ENV` | No | Set to `production` to default the log level to `info`. | (none) |
| `WORKER_TIMEOUT_MS` | No | Maximum time a regex operation can run before being terminated (milliseconds). | `60000` |
| `WORKER_POOL_MIN_WORKERS` | No | Minimum number of workers to maintain. | `0` |
| `WORKER_POOL_MAX_WORKERS` | No | Maximum number of workers allowed. | `8` |
| `WORKER_POOL_INITIAL_WORKERS` | No | Number of workers to spawn at startup. | `1` |
| `WORKER_POOL_IDLE_TIMEOUT_MS` | No | Time before idle workers are terminated (milliseconds). | `300000` |
| `WORKER_POOL_IDLE_CHECK_INTERVAL_MS` | No | How often to check for idle workers (milliseconds). | `60000` |
| `GRACEFUL_DRAIN` | No | Enable graceful drain on shutdown. Processes pending tasks instead of rejecting them. | `false` |
| `GRACEFUL_DRAIN_TIMEOUT_MS` | No | Maximum time to spend draining the queue during shutdown (milliseconds). Max 9500 ms for Docker compatibility. | `8000` |
| `MAX_CHAIN_LENGTH` | No | Maximum number of sed commands that can be chained together. | `5` |
| `MAX_MESSAGE_LENGTH` | No | Maximum length of the bot's response message. | `4096` |
| `CLEANUP_INTERVAL_MS` | No | How often to clean up old message history (milliseconds). | `172800000` (48 hours) |
| `MAX_HISTORY_PER_CHAT` | No | Maximum number of messages to keep in history per chat. | `20` |
| `HISTORY_QUERY_LIMIT` | No | Maximum number of messages to search when finding a target. | `10` |
| `RETRY_MAX_RETRIES` | No | Maximum number of retries for Telegram API calls. | `3` |
| `RETRY_MAX_DELAY_MS` | No | Maximum delay between retries for Telegram API calls (milliseconds). | `30000` |
| `RATE_LIMIT_ENABLED` | No | Enable per-user rate limiting to prevent spam. | `true` |
| `RATE_LIMIT_COMMANDS_PER_MINUTE` | No | Maximum number of commands a user can send per minute. | `30` |
| `CACHE_ENABLED` | No | Enable LRU caching for compiled regex patterns. | `true` |
| `CACHE_MAX_SIZE` | No | Maximum number of entries in the regex pattern cache. | `1000` |
| `CACHE_TTL_MS` | No | Time-to-live for cached patterns (milliseconds). | `300000` (5 min) |
| `ENABLE_FILE_HEALTHCHECK` | No | Enable file-based healthcheck for Docker environments. | `false` |
| `LIVENESS_FILE` | No | Path to the liveness file when healthcheck is enabled. | `/tmp/bot-alive` |
| `LIVENESS_INTERVAL_MS` | No | How often to update the liveness file (milliseconds). | `30000` |
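As a rough illustration of how a handful of these variables could be read with typed defaults and validation, consider the sketch below. It is not the contents of the bot's `config.ts`: the helpers `readInt`, `readBool`, and `loadConfig` are hypothetical, and accepting only `true` and `false` for booleans is an assumption. The environment is passed in as a plain object so the logic can be exercised without touching the real process environment.

```typescript
// Hypothetical sketch of typed env var loading; not the bot's actual config module.
type Env = Record<string, string | undefined>;

function readInt(env: Env, name: string, fallback: number, min = 0): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min) {
    throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
  }
  return value;
}

function readBool(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name]?.trim().toLowerCase();
  if (raw === undefined || raw === "") return fallback;
  // Assumption: only the literal strings "true" and "false" are accepted.
  if (raw === "true") return true;
  if (raw === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

function loadConfig(env: Env) {
  const token = env.TOKEN;
  if (!token) throw new Error("TOKEN is required");
  return {
    token,
    // Defaults mirror the table above.
    workerTimeoutMs: readInt(env, "WORKER_TIMEOUT_MS", 60_000, 1),
    maxChainLength: readInt(env, "MAX_CHAIN_LENGTH", 5, 1),
    rateLimitEnabled: readBool(env, "RATE_LIMIT_ENABLED", true),
    rateLimitPerMinute: readInt(env, "RATE_LIMIT_COMMANDS_PER_MINUTE", 30, 1),
    gracefulDrain: readBool(env, "GRACEFUL_DRAIN", false),
    gracefulDrainTimeoutMs: readInt(env, "GRACEFUL_DRAIN_TIMEOUT_MS", 8000),
  };
}
```

Failing fast on a malformed value at startup tends to be easier to diagnose than a silently applied default.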
Pre-built binaries are available for Linux and Windows:
- Download the latest release from GitHub Releases:
  - Linux: `regexybot-linux-x64.tar.gz`
  - Windows: `regexybot-windows-x64.zip`
- Extract the binary:

      # Linux
      tar -xzf regexybot-linux-x64.tar.gz

      # Windows (PowerShell)
      Expand-Archive regexybot-windows-x64.zip

- Run with your bot token:

      # Linux
      TOKEN=your_telegram_bot_token ./regexybot-linux-x64

      # Windows (Command Prompt)
      set TOKEN=your_telegram_bot_token
      regexybot-windows-x64.exe

      # Windows (PowerShell)
      $env:TOKEN="your_telegram_bot_token"
      .\regexybot-windows-x64.exe

Note: Binaries are self-contained and don't require Bun or Node.js to be installed.
- Ensure you have Bun installed.
- Clone this repository.
- Set your Telegram bot token in an environment variable:

      export TOKEN="YOUR_TELEGRAM_BOT_TOKEN"

- Run the bot from the project's root directory:

      bun index.ts

Important: This bot uses an in-memory SQLite database (sqlite://:memory:) by default. This means:
- All message history and reply mappings are ephemeral - they are lost when the bot restarts
- The retention window (48 hours by default) is designed to support Telegram's edit window and reply-less sed behavior
- No persistent storage is required or used
- For production deployments, this design is intentional - the bot does not store any data permanently
If you need persistent storage (not recommended for this use case), you would need to modify index.ts to use a file-based SQLite database instead.
The project is organized into several modules for clarity and maintainability:
- `index.ts`: The main application entry point and bot wiring. A thin composition root that orchestrates the other modules.
- `config.ts`: Centralized configuration with typed env var loading and validation.
- `database.ts`: Database service layer with the `DatabaseService` class for message history and reply tracking.
- `workerPool.ts`: Worker pool management for concurrent regex processing.
- `sed.ts`: Sed command parsing and handling logic (`parseSedCommands`, `SedHandler`).
- `hellspawn.ts`: The worker script that performs the actual regex substitution in separate threads.
- `logger.ts`: A custom, configurable logging utility.
- `types.ts`: Contains shared TypeScript types and interfaces.
- `utils.ts`: Houses shared helper functions (regex patterns, escaping, flag normalization).
- grammY: Modern Telegram Bot Framework.
- @grammyjs/runner: For concurrent update processing.
- @grammyjs/commands: For structured command handling.
- Bun: High-performance JavaScript runtime.
- bun:sqlite: Bun's native, fast SQLite driver.
- Bun Worker API: For parallel, non-blocking regex execution.
This project uses a two-branch workflow:
`main` branch:
- Contains production-ready code
- Merges happen from `dev` via pull requests
- Docker images are tagged with:
  - `release` - stable release marker
  - `latest` - floats to most recent build (becomes stable after merge)
  - Version numbers from `package.json` (e.g., `0.1.7.1`, `0.1.7`, `0.1`)
  - Git commit hash

`dev` branch:
- Active development happens here
- Feature branches merge into `dev`
- Docker images are tagged with:
  - `dev` - latest development build
  - `next` - upcoming release preview
  - `latest` - floats to most recent build (overwritten by dev activity)
  - `dev-<version>` - version-specific dev build (e.g., `dev-0.1.7.1`)

Workflow:
- Create feature branches from `dev`
- Open PRs targeting `dev`
- When ready for release, open a PR from `dev` to `main`
- After merging to `main`, Docker images are built with release tags
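The tagging scheme for the two branches can be sketched as a small function. This is only an illustration of the rules listed for each branch, not the project's CI configuration: `versionTags` and `dockerTags` are hypothetical names, and the rule that version prefixes stop at two components is inferred from the `0.1.7.1`, `0.1.7`, `0.1` example.

```typescript
// Hypothetical sketch: derive Docker tags from the branch, version, and commit hash.
function versionTags(version: string): string[] {
  const parts = version.split(".");
  // Assumption: prefixes stop at two components, as in 0.1.7.1 -> 0.1.7 -> 0.1.
  const shortest = Math.min(2, parts.length);
  const tags: string[] = [];
  for (let n = parts.length; n >= shortest; n--) {
    tags.push(parts.slice(0, n).join("."));
  }
  return tags;
}

function dockerTags(branch: "main" | "dev", version: string, commit: string): string[] {
  return branch === "main"
    ? ["release", "latest", ...versionTags(version), commit]
    : ["dev", "next", "latest", `dev-${version}`];
}
```

Because both branches write `latest`, that tag points at whichever branch was built most recently. The `release` tag or a version tag is the safer choice for pinning a stable image.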
The bot supports graceful shutdown for Docker deployments:
Default Behavior (Immediate Shutdown):
- On SIGTERM/SIGINT, immediately stops accepting updates
- Queued tasks are rejected
- Fast shutdown suitable for most use cases
Graceful Drain Mode (Optional):
Enable with GRACEFUL_DRAIN=true to process pending tasks before shutting down:

    environment:
      - GRACEFUL_DRAIN=true
      - GRACEFUL_DRAIN_TIMEOUT_MS=8000

Important considerations:
- Graceful drain must complete within Docker's stop grace period (default: 10s)
- Default drain timeout is 8000ms (8s) to fit within Docker's grace period
- Maximum recommended: 9500ms (9.5s) to avoid SIGKILL
- If queue is too large to drain in time, remaining tasks are lost
- Useful for deployments where you don't want to lose pending operations
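The trade-off in these considerations, where tasks are processed until a deadline and the rest are dropped, can be sketched as a deadline-bounded drain loop. This is a deliberately simplified model and not the bot's shutdown code: tasks are synchronous here, whereas the real worker pool handles asynchronous jobs, and `drainQueue`, `Task`, and the injectable clock are hypothetical.

```typescript
// Hypothetical sketch: drain a task queue until a deadline, then drop the remainder.
type Task = () => void;

interface DrainResult {
  processed: number; // tasks that were run before the deadline
  dropped: number; // tasks still queued when the deadline hit
}

function drainQueue(
  queue: Task[],
  timeoutMs: number,
  now: () => number = Date.now,
): DrainResult {
  const deadline = now() + timeoutMs;
  let processed = 0;
  while (queue.length > 0 && now() < deadline) {
    const task = queue.shift();
    if (!task) break;
    try {
      task();
    } catch {
      // A failing task must not block shutdown.
    }
    processed++;
  }
  const dropped = queue.length;
  queue.length = 0; // remaining tasks are lost, as noted above
  return { processed, dropped };
}
```

The deadline is only checked between tasks, so a task that starts just before the deadline can still overrun it. This is one reason to keep the drain timeout comfortably below the container's stop grace period.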
Adjusting Docker Grace Period: If you need more time for graceful drain, increase the container's grace period:

    services:
      regexybot:
        stop_grace_period: 20s # Increase from default 10s
        environment:
          - GRACEFUL_DRAIN=true
          - GRACEFUL_DRAIN_TIMEOUT_MS=18000 # 18s (under 20s grace period)

A test script is provided to verify graceful shutdown behavior:

    cd docker
    ./test-graceful-shutdown.sh

This tests:
- Immediate shutdown behavior
- Graceful drain with pending tasks
- Docker Compose stop/restart scenarios
- SIGINT vs SIGTERM handling