I build production web scraping platforms that deliver structured data at scale
Python | Playwright | FastAPI
Production news platform indexing 6,900+ articles per run from 250+ news sources (BBC, Reuters, Guardian, TechCrunch), updated hourly.
4-tier RSS fallback → Playwright scraper (paywalls) → hourly APScheduler → full-text search + REST API
6,900+ articles/run - 250+ sources - 99.9% uptime
FastAPI - PostgreSQL - Playwright - Next.js - APScheduler
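The tiered fallback idea in this pipeline can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the function `fetch_with_fallback`, the tier names, and the stub fetchers are all hypothetical, and the real tiers (RSS variants, then a Playwright scraper) would replace the stubs.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical type: a fetcher takes a source URL and returns article dicts.
# It may raise, or return an empty list, when that tier cannot serve the source.
Fetcher = Callable[[str], List[Dict[str, str]]]


def fetch_with_fallback(
    url: str, tiers: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], List[Dict[str, str]]]:
    """Try each tier in order and return the first one that yields articles."""
    for name, fetch in tiers:
        try:
            articles = fetch(url)
        except Exception:
            # A failing tier is not fatal; fall through to the next one.
            continue
        if articles:
            return name, articles
    # Every tier failed or came back empty.
    return None, []


# Stub fetchers standing in for real tiers (assumed names, for illustration only).
def rss_feed(url: str) -> List[Dict[str, str]]:
    raise ConnectionError("feed unavailable")


def alternate_feed(url: str) -> List[Dict[str, str]]:
    return []


def browser_scrape(url: str) -> List[Dict[str, str]]:
    return [{"title": "Example headline", "url": url}]


TIERS = [("rss", rss_feed), ("alternate", alternate_feed), ("playwright", browser_scrape)]
tier, articles = fetch_with_fallback("https://example.com/news", TIERS)
print(tier, len(articles))  # prints: playwright 1
```

Ordering the tiers from cheapest to most expensive keeps the browser-based scraper as a last resort, which is one plausible reason to structure a fallback chain this way.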
OSINT dataset marketplace: 250K+ social records across 100+ datasets (Reddit, YouTube, GitHub, Medium), enriched with sentiment and topic analysis and updated daily.
Pre-processed: sentiment scores, topic tags, engagement signals. Drop-in ready for Python/Tableau/LLMs.
250K+ total records - 100+ datasets - 15+ free datasets
FastAPI - PostgreSQL - Next.js - Paddle - AWS S3
Production Goodreads scraper → structured CSV/JSON datasets (quotes, authors, tags). BeautifulSoup - Pagination - Data validation - Multi-page
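A rough sketch of how pagination and data validation might fit together in a scraper like this. It uses only the standard library and an injected page fetcher, so HTML parsing (BeautifulSoup in the project) is left out; `scrape_all`, `is_valid`, the record fields, and the sample pages are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# Hypothetical: returns parsed records for a page number, or [] past the last page.
PageFetcher = Callable[[int], List[Record]]


def is_valid(record: Record) -> bool:
    """Accept only records with a non-empty quote and author and a list of tags."""
    quote = record.get("quote")
    author = record.get("author")
    tags = record.get("tags")
    return (
        isinstance(quote, str) and bool(quote.strip())
        and isinstance(author, str) and bool(author.strip())
        and isinstance(tags, list)
    )


def scrape_all(fetch_page: PageFetcher, max_pages: int = 100) -> List[Record]:
    """Walk pages until one comes back empty, keeping valid, unique records."""
    seen = set()
    results: List[Record] = []
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            break  # an empty page is treated as the end of the listing
        for record in records:
            if not is_valid(record):
                continue
            key = (record["quote"], record["author"])
            if key in seen:
                continue  # skip duplicates repeated across pages
            seen.add(key)
            results.append(record)
    return results


# Fake pages standing in for parsed HTML (illustrative data only).
PAGES = {
    1: [
        {"quote": "First quote", "author": "Author A", "tags": ["life"]},
        {"quote": "", "author": "Author B", "tags": []},  # invalid: empty quote
    ],
    2: [
        {"quote": "First quote", "author": "Author A", "tags": ["life"]},  # duplicate
        {"quote": "Second quote", "author": "Author C", "tags": []},
    ],
}
rows = scrape_all(lambda page: PAGES.get(page, []))
print(len(rows))  # prints: 2
```

The `max_pages` cap is a safety limit so a site that never returns an empty page cannot keep the loop running indefinitely.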
- Scraping: Playwright - BeautifulSoup - Asyncio - Proxy rotation
- Data: Pandas - NumPy - Parquet/JSONL exports
- Backend: FastAPI - PostgreSQL - APScheduler - JWT
- Frontend: Next.js 15 - Tailwind - TypeScript
- Infra: Railway - Vercel - Supabase - Docker
freeCodeCamp Certified: Responsive Web Design (Mar 2024) • Scientific Computing with Python (Nov 2025)
Hire Me → Fiverr
Custom scrapers • ETL pipelines • Data platforms • REST APIs