RAG policy knowledge platform
Production retrieval-augmented generation system grounding GPT-4 responses in 2,000+ proprietary insurance policy documents. Reduced agent handle time by 41%.
Data architect · AI architect · Data solution architect
I design data platforms and AI systems that turn raw complexity into decisions at scale — from lakehouse foundations to production RAG pipelines.
I am a data and AI architect with over eight years of experience designing and delivering data platforms, lakehouse architectures, and production AI systems across fintech, insurance, and e-commerce domains.
My work sits at the intersection of data engineering and applied AI. I have led teams building petabyte-scale lakehouses on Apache Iceberg, real-time fraud detection pipelines processing hundreds of thousands of events per second, and retrieval-augmented generation systems that ground LLM responses in proprietary knowledge bases — all with a sharp focus on reliability, cost efficiency, and maintainability.
I am pragmatic about technology choices: I reach for the right abstraction rather than the fashionable one, and I treat data infrastructure as a product with real users. I am currently open to senior individual-contributor or staff-level architecture roles at companies where data is a genuine competitive advantage.
Quick facts
Technical skills
AI/ML
Cloud
Data
Languages
Tools
Soft skills
Principal Data Architect
Led the design and delivery of a petabyte-scale open lakehouse on AWS, replacing a fragmented warehouse estate and cutting analytical query latency by 70%.
AI Platform Lead
Built the company's first GenAI platform — a retrieval-augmented generation system grounding LLM responses in proprietary policy documents — serving 3,000 internal users.
Data Engineering Manager
Managed a team of 8 data engineers, migrated a monolithic Oracle DW to Snowflake, and introduced dbt + Airflow as the standard transformation and orchestration stack.
Senior Data Engineer
Built end-to-end data pipelines for a marketing analytics SaaS, delivering multi-touch attribution models that ran on BigQuery scheduled queries and Airflow.
Interested in discussing a role or a project, or just want to connect? Reach out through any of these channels.