I help teams stabilize data pipelines, improve ML system performance, and integrate modern AI (LLMs) into production systems that drive reliable business decisions and measurable impact.
Former Google Data Scientist with 9+ years of experience building and improving production ML systems with real business impact (~$400M annual revenue impact on YouTube Ads). Currently, I focus on building end-to-end data and predictive ML systems for long-term investment research through Axiom Data Lab.
Typical Problems I Solve
- Fixing unreliable or slow data pipelines
- Reducing false alarms and missed failures in data validation systems
- Untangling logging systems that are difficult to debug or maintain
- Making ML systems robust in production (not just offline)
- Improving observability and debugging for complex systems
- Scaling data systems and pipelines efficiently
- Simplifying systems that are too costly or hard to maintain
- Clarifying where AI/LLMs actually add value and where they only add complexity
Work With Me
Free 30-min consultation
Quick assessment of your data / ML system to identify bottlenecks and opportunities.
Book a Call
📩 Email: guang@axiomdatalab.com
Selected Impact
Ex-Google Data Scientist. Led production ML improvements contributing to ~$400M annual YouTube Ads revenue.
- Built and deployed production ML systems in C++ for large-scale YouTube Ads ranking under strict serving latency constraints
- Balanced precision/recall trade-offs to meet latency and revenue goals
- Designed reusable ML pipeline frameworks to accelerate iteration
- Built validation systems that reduced production failures and false alerts
Investment Research System at Axiom Data Lab
In parallel, I am building a scalable research system for long-horizon investment strategies, focused on extracting predictive signals from large-scale financial data, including both price action and fundamentals.
- Proprietary processed database integrating licensed historical data with live data feeds (65+ years, 10k+ equities), with schema unification and automated updates
- End-to-end pipeline: ingestion → normalization → validation → feature engineering → modeling → backtesting → evaluation
- Feature systems for structured price behavior detection (e.g., consolidation and pre-breakout setups)
- Backtesting engine with realistic execution constraints
Ongoing work focuses on scaling the system to full-market coverage, improving parallelization, and enhancing predictive modeling for robust long-term performance.
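To make the idea of a consolidation feature concrete, here is a minimal Python sketch of one possible approach. It is an illustration only, not the feature system described above: the function name `consolidation_flags`, the 5-bar window, and the 3% band are all assumptions chosen for the example. It flags a bar when the trailing closes have stayed inside a narrow price band.

```python
from typing import List


def consolidation_flags(
    closes: List[float],
    window: int = 5,              # hypothetical lookback length, in bars
    max_range_pct: float = 0.03,  # hypothetical band width (3% of the window low)
) -> List[bool]:
    """Flag each bar whose trailing `window` closes stay within a narrow band."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if any(price <= 0 for price in closes):
        raise ValueError("closing prices must be positive")

    flags: List[bool] = []
    for i in range(len(closes)):
        if i + 1 < window:
            # Not enough history yet to evaluate a full window.
            flags.append(False)
            continue
        recent = closes[i + 1 - window : i + 1]
        # Width of the trailing range, relative to the lowest close in it.
        band = (max(recent) - min(recent)) / min(recent)
        flags.append(band <= max_range_pct)
    return flags


# Five quiet bars followed by a sharp move up.
example_closes = [100.0, 101.0, 100.5, 100.8, 101.2, 110.0, 120.0]
example_flags = consolidation_flags(example_closes)
```

In this example only the fifth bar is flagged: it is the first bar with a full window of history, and the two bars after it fall outside the band once the price jumps. A real feature would likely also use highs, lows, and volume, and would be tuned per instrument.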