🎯 Data Platform Engineer in Beijing
Passionate about building scalable systems and bridging AI with data processing.
I love exploring distributed computing, workflow orchestration, and modern data storage formats.
- Focused on serverless data processing platforms that serve AI workloads
- Experienced with Ray, Lance, Spark, Flink, and KS3
- Enjoy designing composable pipelines: sampling → validation → caching → export
- Currently exploring how LLMs can enhance data workflow automation
- AI Data Infrastructure – storage–compute decoupling, scalable pipelines
- Distributed Systems – Ray, Daft, and high-performance data processing
- System Design & Platformization – from developer experience to performance
- Learning How to Learn – structured note-taking with MarginNote
Languages & Backend:
Go · Python · Java
Frontend:
HTML · CSS · JavaScript · Vue
Big Data / Storage / Lakehouse:
Hadoop · HDFS · Spark · Flink · Hive · serverless computing · Lance
Modern Data Infra & Tools:
Ray · LanceDB · JuiceFS · KS3 · Kubernetes · Daft
Observability / DevOps:
Prometheus · Grafana · Docker · Helm
- AI + data engineering integration
- Efficient note-taking & reflective learning
- Building intelligent data workflows with LLMs
📍 Location: Beijing, China
💬 GitHub: github.com/wangpi26
⭐ "Think deeply, build elegantly, and learn continuously."
