Data Engineer with 7+ years of experience building scalable data infrastructure and AI-powered solutions. I specialize in pipeline optimization, cloud architecture, and applying modern data technologies to complex business problems.

What I Do:
- Design and optimize high-performance data pipelines using PySpark, Databricks, and Apache Airflow (see the sketch after this list)
- Build serverless data processing solutions on AWS (Lambda, EMR, SQS, SNS, Firehose)
- Implement AI agent systems using Python, PydanticAI, and LLM APIs
- Develop backend systems with Python, Django, Node.js, and Go
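
As a rough illustration of the pipeline work above (and the ETL speedup listed under Highlights), here is a minimal, hypothetical PySpark sketch of the usual levers: partition and column pruning on the scan, a broadcast join for a small dimension table, and adaptive query execution for skewed shuffles. The paths, tables, and columns are placeholders, not the real pipelines.

```python
# Hypothetical example: table names, columns, and paths are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("etl-optimization-sketch")
    # Let adaptive query execution coalesce small shuffle partitions
    # and split skewed join partitions at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)

# Read only the partition and columns this job needs: partition pruning plus
# column pruning keeps the Parquet scan small.
events = (
    spark.read.parquet("s3://example-bucket/events/")           # placeholder path
    .where(F.col("event_date") == "2024-01-01")
    .select("user_id", "event_type", "amount")
)

# Small dimension table: broadcast it so the large events table never shuffles.
users = spark.read.parquet("s3://example-bucket/dim_users/")    # placeholder path

daily_summary = (
    events
    .join(broadcast(users.select("user_id", "country")), "user_id")
    .groupBy("event_type", "country")
    .agg(F.count("*").alias("events"), F.sum("amount").alias("total_amount"))
)

# Coalesce before writing so the output isn't fragmented into tiny files.
daily_summary.coalesce(8).write.mode("overwrite").parquet(
    "s3://example-bucket/marts/daily_summary/"                  # placeholder path
)
```

Most of the gain in jobs like this typically comes from avoiding a full shuffle of the large side of the join and from scanning fewer partitions in the first place.
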
Highlights:
- Reduced ETL processing time from 2+ hours to under 20 minutes through PySpark optimization
- Built a serverless event-processing pipeline handling thousands of JSON files for under $10 (see the sketch after this list)
- Architected AI agent framework for automated insurance application processing
- Developed self-service SQL platform empowering data analysts to create custom reports
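
To make the serverless bullet concrete, here is a minimal sketch of an S3-triggered Lambda handler, assuming JSON files land in a bucket and compact summaries are forwarded to an SQS queue for downstream consumers. The bucket layout, queue URL, and record fields are invented for illustration; the production system is more involved.

```python
# Hypothetical sketch: S3 ObjectCreated event -> Lambda -> summary message on SQS.
# Queue URL and payload fields are placeholders.
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder


def handler(event, context):
    """Parse each newly uploaded JSON file and forward a small summary to SQS."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        payload = json.loads(body)

        # Keep only the fields downstream consumers care about (illustrative).
        summary = {
            "source_key": key,
            "record_count": len(payload.get("items", [])),
        }
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(summary))

    return {"processed": len(event["Records"])}
```

The appeal of this pattern is that there is no always-on cluster: each file costs only its few hundred milliseconds of Lambda time plus the SQS requests, which is what keeps small event-driven workloads cheap.
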
Tech Stack:
Python • PySpark • Databricks • Apache Airflow • AWS • PostgreSQL • Docker • Django • Node.js • Go • Apache HUDI • Terraform
Background:
Computer Science graduate from Universidade Federal de Minas Gerais (UFMG). Former backend developer at Studio Sol, where I worked on platforms serving tens of millions of daily users across Brazil and Latin America.
When I'm not engineering data solutions, I'm passionate about technology education—I've taught robotics and programming to 70 public school students.
📫 Connect with me on LinkedIn



