A powerful command-line diagnostic tool for Apache Kafka connectivity
kshark acts like a network sniffer for Kafka, providing comprehensive health checks of your entire agent-to-broker communication path. It systematically tests every layer from DNS resolution through TLS security to Kafka protocol-level interactions, helping developers and SREs quickly identify and resolve connectivity issues.
Tools are the difference between AI assistants and AI operators.
Without reliable, structured tools, AI agents are limited to conversation. With them, they become autonomous operators capable of real diagnostic work. kshark is designed from the ground up to be agent-executable:
- Deterministic Output: Structured JSON and HTML reports that agents can parse, analyze, and act upon
- Layered Diagnostics: Clear failure isolation (L3 → L4 → L5-6 → L7) so agents know exactly where problems occur
- Zero Human Intervention: Fully automated execution with the `-y` flag, timeout controls, and exit codes
- Rich Context: Every test returns actionable data: not just pass/fail, but metrics, timestamps, and error details
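As a sketch of what agent-side consumption of that structured output could look like, the snippet below gates on a JSON export. The report schema here is a hypothetical stand-in for illustration; kshark's actual JSON layout may differ.

```shell
# Hypothetical agent-side gate: stop and escalate if any check reported FAIL.
# The report schema below is an illustrative assumption, not kshark's real one.
cat > /tmp/report.json <<'EOF'
{"layers":[{"name":"L4","status":"OK"},{"name":"L5-6","status":"FAIL"}]}
EOF
if grep -q '"status":"FAIL"' /tmp/report.json; then
  echo "failure detected: escalate to diagnostics"
else
  echo "all layers OK"
fi
```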
Why This Matters:
Human operators diagnose Kafka issues by running multiple commands: `nslookup`, `telnet`, `openssl s_client`, `kafka-console-producer`. They correlate failures, infer root causes, and retry with different parameters.
AI agents need to do the same work, but they need tools that package that workflow into a single, reliable interface. kshark is that interface for Kafka connectivity validation.
When an agent needs to validate a client configuration, diagnose a connection failure, or verify topic accessibility, kshark provides the structured, comprehensive output required to make informed decisions and take corrective action.
This is the future of infrastructure operations: agents equipped with purpose-built diagnostic tools, operating autonomously to validate, diagnose, and remediate connectivity issues before they impact production.
Made with ❤️ for the Kafka community
- Key Features
- Quick Start
- Installation
- Usage
- Configuration
- Documentation
- Architecture
- Contributing
- License
- Layered Diagnostics: Systematically tests all layers of connectivity:
- L3 (Network): DNS resolution and latency
- L4 (Transport): TCP connection establishment
- L5-6 (Security): TLS handshake, certificate validation, and expiry monitoring
- L7 (Application): Kafka protocol, metadata retrieval, topic visibility
- L7 (HTTP): Schema Registry and REST Proxy connectivity
- Diagnostics: Traceroute, MTU discovery, network path analysis
- End-to-End Testing: Full produce-and-consume loop validation to verify complete data flow
- Multiple Authentication Methods:
- SASL/PLAIN
- SASL/SCRAM-SHA-256
- SASL/SCRAM-SHA-512
- Mutual TLS (mTLS)
- SASL/GSSAPI (Kerberos) - with build tag
- Intelligent Root Cause Detection: Identifies which layer is causing issues
- Actionable Recommendations: Provides specific fix suggestions
- Multiple AI Providers: Supports OpenAI, Scalytics-Connect, and custom endpoints
- Automatic Problem Prioritization: Focuses on critical failures first
- Familiar Configuration: Java properties file format (works with existing Kafka configs)
- Rich Output Formats:
- Color-coded console output
- Detailed HTML reports with visual summaries
- Machine-readable JSON export (Premium)
- Quick Presets: Pre-configured templates for common Kafka distributions
- Confluent Cloud
- Bitnami
- AWS MSK
- Plaintext (development)
- Cross-Platform: Linux, macOS, Windows (amd64, arm64)
- Docker Support: Minimal Alpine-based container (~50MB)
- Kubernetes-Ready: CronJob examples for continuous monitoring
- CI/CD Integration: Automated releases via GitHub Actions + GoReleaser
- Security-Focused:
- Credential redaction in reports
- Command injection prevention
- TLS 1.2+ enforcement
- Non-root container execution
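The credential-redaction feature listed above can be approximated by hand when sharing configs outside kshark. This is a minimal sketch, not kshark's actual implementation; the key names follow the properties example in this README, and the demo file path is arbitrary.

```shell
# Minimal redaction sketch: mask secret-bearing keys before sharing a config.
# Illustrative only; kshark performs its own redaction in generated reports.
cat > /tmp/kshark-demo.properties <<'EOF'
sasl.username=alice
sasl.password=s3cret
EOF
sed -E 's/^(sasl\.password|basic\.auth\.user\.info)=.*/\1=REDACTED/' /tmp/kshark-demo.properties
```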
# 1. Download the latest release for your platform
wget https://github.com/your-org/kshark-core/releases/latest/download/kshark-linux-amd64.tar.gz
tar -xzf kshark-linux-amd64.tar.gz
# 2. Create a configuration file
cat > client.properties <<EOF
bootstrap.servers=your-broker.example.com:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.username=your-api-key
sasl.password=your-api-secret
EOF
# 3. Run the diagnostic
./kshark -props client.properties
# 4. Test with a specific topic
./kshark -props client.properties -topic test-topic

# Use a preset for Confluent Cloud
./kshark --preset cc-plain -props client.properties

Download the latest release for your platform from the Releases page:
# Linux (amd64)
wget https://github.com/your-org/kshark-core/releases/latest/download/kshark-linux-amd64.tar.gz
tar -xzf kshark-linux-amd64.tar.gz
# macOS (arm64 - Apple Silicon)
wget https://github.com/your-org/kshark-core/releases/latest/download/kshark-darwin-arm64.tar.gz
tar -xzf kshark-darwin-arm64.tar.gz
# Windows (amd64)
wget https://github.com/your-org/kshark-core/releases/latest/download/kshark-windows-amd64.zip
unzip kshark-windows-amd64.zip

Verify the checksum:
sha256sum -c checksums.txt

Prerequisites: Go 1.23 or newer
# Clone the repository
git clone https://github.com/your-org/kshark-core.git
cd kshark-core
# Build
go build -o kshark ./cmd/kshark
# Verify
./kshark --version

# Pull the image (when published)
docker pull your-registry/kshark:latest
# Or build locally
docker build -t kshark:latest .
# Run with mounted configuration
docker run -v $(pwd):/config kshark:latest -props /config/client.properties

# Basic connectivity check
./kshark -props client.properties
# Check with topic validation
./kshark -props client.properties -topic my-topic
# Include produce/consume test
./kshark -props client.properties -topic my-topic
# Skip confirmation prompt (for automation)
./kshark -props client.properties -y
# Adjust global timeout
./kshark -props client.properties -timeout 120s
# Adjust Kafka metadata timeout
./kshark -props client.properties -kafka-timeout 20s
# Adjust produce/consume timeout
./kshark -props client.properties -op-timeout 30s
# Adjust produce/consume timeouts independently
./kshark -props client.properties -produce-timeout 20s -consume-timeout 45s
# Control probe read start offset
./kshark -props client.properties -start-offset latest
# Select partition balancer for probes
./kshark -props client.properties -topic my-topic -balancer rr
# Generate HTML report
./kshark -props client.properties -topic my-topic
# Report saved to: reports/analysis_report_<hostname>_<timestamp>.html

# AI-powered analysis
./kshark -props client.properties -topic my-topic --analyze
# Export to JSON
./kshark -props client.properties -json report.json

# Confluent Cloud
./kshark --preset cc-plain -props client.properties
# AWS MSK with SASL/SCRAM
./kshark --preset self-scram -props client.properties
# Local development (no security)
./kshark --preset plaintext -props client.properties

| Flag | Description | Default | Example |
|---|---|---|---|
| `-props` | Path to properties file | (required) | `-props config.properties` |
| `-topic` | Comma-separated topics to test | (optional) | `-topic orders,payments` |
| `-group` | Consumer group for probe | (ephemeral) | `-group kshark-probe` |
| `--preset` | Configuration preset | (none) | `--preset cc-plain` |
| `-timeout` | Global timeout for entire scan | 60s | `-timeout 120s` |
| `-kafka-timeout` | Kafka metadata/dial timeout | 10s | `-kafka-timeout 20s` |
| `-op-timeout` | Produce/consume timeout | 10s | `-op-timeout 30s` |
| `-produce-timeout` | Produce timeout (overrides `-op-timeout`) | (none) | `-produce-timeout 20s` |
| `-consume-timeout` | Consume timeout (overrides `-op-timeout`) | (none) | `-consume-timeout 45s` |
| `-start-offset` | Probe read start offset (`earliest`/`latest`) | earliest | `-start-offset latest` |
| `-balancer` | Probe partition balancer (`least`/`rr`/`random`) |  | `-balancer rr` |
| `-diag` | Enable traceroute/MTU diagnostics | true | `-diag=false` |
| `-log` | Write detailed scan log to file | auto | `-log /tmp/kshark.log` |
| `--analyze` | Enable AI analysis | false | `--analyze` |
| `-no-ai` | Skip AI analysis even if enabled | false | `-no-ai` |
| `-provider` | AI provider name from `ai_config.json` | (default) | `-provider openai` |
| `-json` | Export to JSON file | (none) | `-json output.json` |
| `-y` | Skip confirmation prompt | false | `-y` |
| `--version` | Show version info | - | `--version` |
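The timeout precedence described above (the specific `-produce-timeout` and `-consume-timeout` values winning over the shared `-op-timeout`) can be sketched as the following shell logic. Values are mocked; this is not kshark's actual flag-parsing code.

```shell
# Sketch of timeout precedence: a specific timeout, if set, overrides the
# shared -op-timeout; otherwise the shared value applies. Mocked values.
OP_TIMEOUT="10s"
PRODUCE_TIMEOUT="20s"   # as if -produce-timeout 20s was passed
CONSUME_TIMEOUT=""      # -consume-timeout not passed
effective_produce="${PRODUCE_TIMEOUT:-$OP_TIMEOUT}"
effective_consume="${CONSUME_TIMEOUT:-$OP_TIMEOUT}"
echo "produce=$effective_produce consume=$effective_consume"
```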
kshark uses standard Java properties format, compatible with Kafka client configurations:
# Broker connection
bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092
# Security protocol
security.protocol=SASL_SSL
# SASL configuration
sasl.mechanism=SCRAM-SHA-256
sasl.username=your-username
sasl.password=your-password
# TLS configuration
ssl.ca.location=/path/to/ca-cert.pem
ssl.certificate.location=/path/to/client-cert.pem
ssl.key.location=/path/to/client-key.pem
# Optional: Schema Registry
schema.registry.url=https://schema-registry.example.com
basic.auth.user.info=sr-key:sr-secret
# Optional: REST Proxy
rest.proxy.url=https://rest-proxy.example.com

See the Configuration Guide for the complete list.
For AI-powered analysis, create ai_config.json:
{
"provider": "openai",
"api_key": "sk-...",
"api_endpoint": "https://api.openai.com/v1/chat/completions",
"model": "gpt-4"
}

Or use environment variables:
export KSHARK_AI_PROVIDER=openai
export KSHARK_AI_API_KEY=sk-...

Comprehensive documentation is available in the docs/ directory:
| Document | Description |
|---|---|
| ARCHITECTURE.md | System architecture, components, and design patterns |
| FEATURES.md | Complete feature list and usage examples |
| DEPLOYMENT.md | Deployment guides for Docker, Kubernetes, and CI/CD |
| SECURITY.md | Security audit, OWASP analysis, and recommendations |
- Architecture Overview: docs/ARCHITECTURE.md
- Feature Documentation: docs/FEATURES.md
- Deployment Guide: docs/DEPLOYMENT.md
- Security Best Practices: docs/SECURITY.md
- API Documentation: GoDoc
kshark uses a layered testing approach to systematically validate connectivity:
┌─────────────────────────────────────────┐
│ L3: Network Layer                       │
│ • DNS Resolution                        │
│ • Hostname to IP mapping                │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ L4: Transport Layer                     │
│ • TCP Connection                        │
│ • Latency Measurement                   │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ L5-6: Security Layer                    │
│ • TLS Handshake                         │
│ • Certificate Validation                │
│ • Expiry Monitoring                     │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ L7: Application Layer                   │
│ • Kafka Protocol                        │
│ • SASL Authentication                   │
│ • Metadata Retrieval                    │
│ • Produce/Consume Test                  │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ Diagnostics                             │
│ • Traceroute / Path Analysis            │
│ • MTU Discovery                         │
└─────────────────────────────────────────┘
For detailed architecture information, see ARCHITECTURE.md.
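Because the layers run in order, root-cause isolation reduces to finding the first failing layer: a failure at L3 invalidates everything above it. A minimal sketch of that logic, with mocked statuses (kshark derives real ones from the checks above):

```shell
# Layered failure isolation sketch: report the first FAIL in layer order.
# Statuses are mocked for illustration; this is not kshark's internal code.
first_failure() {
  for entry in "L3:OK" "L4:OK" "L5-6:FAIL" "L7:SKIPPED"; do
    status="${entry#*:}"
    if [ "$status" = "FAIL" ]; then
      echo "root cause at layer ${entry%%:*}"
      return 0
    fi
  done
  echo "all layers passed"
}
first_failure
```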
kshark-core/
├── cmd/kshark/            # Main application source code
│   └── main.go            # Single-file application (1,350 lines)
├── web/templates/         # HTML report templates
├── docs/                  # Documentation
│   ├── ARCHITECTURE.md    # Architecture overview
│   ├── FEATURES.md        # Feature documentation
│   ├── DEPLOYMENT.md      # Deployment guide
│   ├── SECURITY.md        # Security recommendations
│   └── images/            # Documentation images
├── .github/workflows/     # CI/CD automation
├── reports/               # Generated reports (gitignored)
├── Dockerfile             # Container build definition
├── .goreleaser.yaml       # Release configuration
├── go.mod                 # Go module definition
├── LICENSE                # Apache 2.0 license
└── README.md              # This file
Console Output:
┌─────────────────────────────────────────────────────────────────┐
│                    kshark Diagnostic Report                     │
│                   Target: broker.example.com                    │
└─────────────────────────────────────────────────────────────────┘

[L3: Network Layer]
✓ DNS Resolution: broker.example.com → 192.0.2.1 (45ms)

[L4: Transport Layer]
✓ TCP Connection: 192.0.2.1:9092 established (123ms)

[L5-6: Security Layer]
✓ TLS Handshake: TLS 1.3 successful (234ms)
✓ Certificate: CN=broker.example.com, expires in 87 days
⚠ Certificate Expiry: Certificate expires in <90 days

[L7: Application Layer]
✓ Kafka Metadata: 3 brokers, 42 partitions
✓ Topic Visibility: 'orders' found with 6 partitions
✓ Produce/Consume: Message round-trip successful (456ms)

[Diagnostics]
✓ Network Path: 8 hops, avg latency 45ms
✓ MTU: 1500 bytes (standard Ethernet)
Summary: 9 OK, 1 WARN, 0 FAIL
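In CI, the summary line above is enough to gate a pipeline. A hedged sketch, assuming the summary format shown in the sample output stays stable (the `sed` extraction is illustrative, not part of kshark):

```shell
# Hypothetical CI gate on kshark's "Summary: N OK, N WARN, N FAIL" line.
summary="Summary: 9 OK, 1 WARN, 0 FAIL"
fails=$(printf '%s\n' "$summary" | sed -E 's/.*, ([0-9]+) FAIL.*/\1/')
if [ "$fails" -gt 0 ]; then
  echo "gate: block deployment"
else
  echo "gate: pass"
fi
```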
# Build image
docker build -t kshark:latest .
# Run diagnostic
docker run --rm \
-v $(pwd)/client.properties:/config/client.properties:ro \
-v $(pwd)/reports:/reports \
kshark:latest -props /config/client.properties -topic test -y
# Check the report
open reports/analysis_report_*.html

apiVersion: batch/v1
kind: CronJob
metadata:
name: kafka-health-check
spec:
schedule: "*/15 * * * *" # Every 15 minutes
jobTemplate:
spec:
template:
spec:
containers:
- name: kshark
image: kshark:latest
args: ["-props", "/config/client.properties", "-topic", "health-check", "-y"]
volumeMounts:
- name: config
mountPath: /config
readOnly: true
volumes:
- name: config
secret:
secretName: kafka-credentials
restartPolicy: OnFailure

Problem: DNS resolution fails
Solution: Check DNS server configuration, verify hostname is correct
Check: nslookup your-broker.example.com
Problem: TLS handshake fails
Solution: Verify TLS version support, check certificate chain
Check: openssl s_client -connect broker.example.com:9092 -showcerts
Problem: SASL authentication fails
Solution: Verify credentials, check SASL mechanism matches broker config
Common issues: Wrong mechanism (PLAIN vs SCRAM), incorrect credentials
Problem: "license.key required" error
Solution: AI analysis and JSON export are premium features
Option 1: Obtain a license.key file
Option 2: Use standard console/HTML output (free)
For verbose output, check the generated HTML report which includes:
- Full configuration (credentials redacted)
- Detailed error messages
- Network diagnostic output
- Timestamp and version information
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
# Clone your fork
git clone https://github.com/your-username/kshark-core.git
cd kshark-core
# Install dependencies
go mod download
# Run tests (if available)
go test ./...
# Build
go build -o kshark ./cmd/kshark
# Test locally
./kshark -props client.properties.example

For security concerns and vulnerability reports, please see SECURITY.md.
Security Features:
- Credential redaction in all outputs
- Command injection prevention
- Path traversal protection
- TLS 1.2+ enforcement
- Non-root container execution
Known Security Considerations:
- Credentials stored in plain text configuration files (use file permissions 0600)
- SSRF risk with Schema Registry URLs (validate URLs before use)
- See SECURITY.md for detailed analysis and recommendations
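The 0600 file-permission advice above can be verified quickly. The sketch below is self-contained (it creates a throwaway file); the path and the GNU/BSD `stat` fallback are illustrative.

```shell
# Quick permission check per the "use file permissions 0600" recommendation.
demo=/tmp/kshark-perms-demo.properties
touch "$demo" && chmod 600 "$demo"
# GNU stat uses -c '%a'; BSD/macOS stat uses -f '%Lp'.
perm=$(stat -c '%a' "$demo" 2>/dev/null || stat -f '%Lp' "$demo")
if [ "$perm" = "600" ]; then
  echo "permissions OK"
else
  echo "tighten permissions: chmod 600 $demo"
fi
```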
- Unit test coverage
- Concurrency for multi-broker checks
- Historical trend analysis
- Prometheus metrics export
- OpenTelemetry integration
- Modular architecture (separate packages)
- Additional authentication methods (OAuth)
- REST API mode
This project is licensed under the Apache License 2.0. See the LICENSE file for complete details.
Copyright 2025 kshark Contributors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
- Built with segmentio/kafka-go
- Inspired by network diagnostic tools like `tcpdump`, `wireshark`, and `netcat`
- Special thanks to the Kafka community
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ for the Kafka community