Heshan Sanjuka hexsyro

Hi, I'm Heshan 👨‍💻

I build production web scraping platforms that deliver structured data at scale

Python | Playwright | FastAPI

Featured Projects

Pulse Aggregator

Production news platform that indexes 6,900+ articles per run from 250+ news sources, including BBC, Reuters, Guardian, and TechCrunch, updated hourly.

4-tier RSS fallback → Playwright scraper (paywalls) → hourly APScheduler → full-text search + REST API
6,900+ articles/run - 250+ sources - 99.9% uptime
FastAPI - PostgreSQL - Playwright - Next.js - APScheduler
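The tiered fallback idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `fetch_with_fallback` and the three stand-in fetchers are hypothetical names, and a real setup would plug in RSS parsers and a Playwright-based fetcher as the tiers.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A fetcher takes a URL and returns content, or None/"" when it finds nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, tiers: Iterable[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each tier in order and return (tier_name, content) from the first success."""
    errors: List[str] = []
    for name, fetch in tiers:
        try:
            content = fetch(url)
        except Exception as exc:  # a failing tier should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if content:
            return name, content
        errors.append(f"{name}: empty result")
    raise RuntimeError(f"all tiers failed for {url}: " + "; ".join(errors))


# Hypothetical stand-ins for real tiers (assumptions for illustration only).
def primary_feed(url: str) -> Optional[str]:
    raise ValueError("feed unavailable")


def alternate_feed(url: str) -> Optional[str]:
    return ""  # feed reachable but empty


def browser_scrape(url: str) -> Optional[str]:
    return "<article>full text</article>"


tier_name, body = fetch_with_fallback(
    "https://example.com/story",
    [("primary_feed", primary_feed), ("alternate_feed", alternate_feed), ("browser", browser_scrape)],
)
```

Ordering the cheap tiers first keeps the browser-based scraper as a last resort, since it is typically the slowest and most resource-hungry step.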

Social Intel

OSINT dataset marketplace: 250K+ social records across 100+ datasets from Reddit, YouTube, GitHub, and Medium, enriched with sentiment and topic analysis and updated daily.

Pre-processed: sentiment scores, topic tags, engagement signals. Drop-in ready for Python/Tableau/LLMs.
250K+ total records - 100+ datasets - 15+ free datasets
FastAPI - PostgreSQL - Next.js - Paddle - AWS S3
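As a rough idea of what "drop-in ready for Python" could look like, here is a small sketch that reads a JSONL export and filters by sentiment. The field names (`sentiment`, `topics`, `source`) and the score range are assumptions made for illustration; the actual dataset schema may differ.

```python
import json
from typing import Dict, Iterable, Iterator, List


def read_jsonl(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_by_sentiment(records: Iterable[Dict], min_score: float) -> List[Dict]:
    """Keep records whose (assumed) `sentiment` field is at least min_score."""
    return [r for r in records if r.get("sentiment", 0.0) >= min_score]


# Made-up sample rows standing in for a downloaded export file.
sample_lines = [
    '{"source": "reddit", "sentiment": 0.8, "topics": ["python"]}',
    "",
    '{"source": "youtube", "sentiment": -0.4, "topics": ["hardware"]}',
    '{"source": "github", "sentiment": 0.5, "topics": ["scraping"]}',
]

records = list(read_jsonl(sample_lines))
positive = filter_by_sentiment(records, min_score=0.5)
```

With a real file, `read_jsonl(open(path, encoding="utf-8"))` would stream records line by line instead of loading the whole export into memory.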

GoodQuote Scraper

Production Goodreads scraper → structured CSV/JSON datasets (quotes, authors, tags).
BeautifulSoup - Pagination - Data validation - Multi-page
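The pagination and validation steps might look something like the sketch below. It is not the scraper's actual code: `iter_pages`, `is_valid`, `to_csv`, and the record fields are hypothetical, and `fake_fetch_page` stands in for a real function that would download and parse one page with BeautifulSoup.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def iter_pages(fetch_page: Callable[[int], List[Record]], max_pages: int = 100) -> Iterator[Record]:
    """Walk pages 1..max_pages and stop at the first empty page."""
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            break
        yield from records


def is_valid(record: Record) -> bool:
    """Require a non-blank quote and author, plus a list of tags."""
    quote, author, tags = record.get("quote"), record.get("author"), record.get("tags")
    return (
        isinstance(quote, str) and quote.strip() != ""
        and isinstance(author, str) and author.strip() != ""
        and isinstance(tags, list)
    )


def to_csv(records: Iterable[Record]) -> str:
    """Serialize validated records to CSV text, joining tags with '|'."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["quote", "author", "tags"])
    writer.writeheader()
    for r in records:
        writer.writerow({
            "quote": str(r["quote"]).strip(),
            "author": str(r["author"]).strip(),
            "tags": "|".join(r["tags"]),
        })
    return buf.getvalue()


# Made-up pages standing in for parsed HTML (assumption for illustration).
FAKE_PAGES = {
    1: [{"quote": " Be yourself. ", "author": "A. Writer", "tags": ["life"]},
        {"quote": "", "author": "Nobody", "tags": []}],
    2: [{"quote": "Read more.", "author": "B. Reader", "tags": ["books", "habits"]}],
}


def fake_fetch_page(page: int) -> List[Record]:
    return FAKE_PAGES.get(page, [])


valid_records = [r for r in iter_pages(fake_fetch_page) if is_valid(r)]
csv_text = to_csv(valid_records)
```

Validating before export keeps blank or malformed rows out of the dataset, and the same `valid_records` list could be passed to `json.dumps` for a JSON export.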

Production Tech Stack

Scraping: Playwright - BeautifulSoup - Asyncio - Proxy rotation
Data: Pandas - NumPy - Parquet/JSONL exports
Backend: FastAPI - PostgreSQL - APScheduler - JWT
Frontend: Next.js 15 - Tailwind - TypeScript
Infra: Railway - Vercel - Supabase - Docker

freeCodeCamp Certified: Responsive Web Design (Mar 2024) • Scientific Computing with Python (Nov 2025)

Hire Me → Fiverr

Custom scrapers • ETL pipelines • Data platforms • REST APIs
