Carolina Dias

Machine Learning Engineer / Software Engineer (Python Backend)

Proven track record (3+ years of experience) of developing and integrating REST APIs, implementing software engineering best practices, and automating crucial operations. Passionate about solving complex problems and applying interdisciplinary knowledge in data science, software engineering, and MLOps.


Professional Experience

Machine Learning Engineer / Python Software Engineer

  • Developed and consumed REST APIs (FastAPI and Pydantic), demonstrating a strong understanding of web service architecture and its application in software development.
  • Implemented software engineering best practices to streamline code workflows, significantly reducing manual tasks and making the code more robust, maintainable, and aligned with industry standards.
  • Developed and deployed automation solutions for crucial operations, including model retraining and database table updates. These improvements played a pivotal role in reducing operational bottlenecks and enhancing system reliability.
  • Worked on developing a personalization project from scratch, leveraging LLMs (GPT-4) and prompt engineering.
  • Tech stack: Python (FastAPI, Pydantic, Pandas, ibis, Pytest), Google Cloud (BigQuery), Docker, GitHub Actions, SQL

June 2023 - August 2024

Machine Learning Engineer / Python Software Engineer

  • Improved the Creditas Machine Learning Platform, refactoring the use of 15+ production models and reducing duplicate code.
  • Created an example production model with step-by-step patterns and software engineering good practices, with extensive documentation used by 20+ data scientists.
  • Developed a POC using Amazon SageMaker for building and deploying models directly in the cloud, incorporating an AutoML solution to speed up model experimentation.
  • Upgraded the logging and monitoring of models and ML applications, using Kibana and Datadog.
  • Tech stack: Python (FastAPI, Pydantic, Pandas, Pytest), MLflow, Docker, CI/CD, AWS (EKS, Athena, Elasticsearch, SageMaker), Kotlin

September 2021 - June 2023

Other Relevant Experience

Educational Content Creator

  • Created and developed the entire curriculum for the Data Visualization discipline for the Data Science Degree, from class notes to exercises and video scripts for the lessons.

December 2021 - February 2022

Undergraduate Student Researcher

  • Member of the ParGo Research Group (Parallel Computing, Graph Theory and Optimization).
  • Research focused on graphs and algorithm complexity.
  • Presented regularly at research events.

August 2018 - June 2018

Technology Intern

  • Helped improve the use of Assistive Technologies for people with disabilities, mainly visual disabilities, allowing for independence in the use of computers and smartphones for studying and working.

January 2018 - June 2018

Education

State University of Ceará

Master of Science in Computer Science

  • Thesis: A Knowledge Graph Approach for Analyzing User Engagement in Immersive Virtual Learning Environments

March 2022 - August 2024

PUC Minas

Specialization in Artificial Intelligence and Machine Learning

  • Relevant coursework: Machine Learning, Neural Networks and Deep Learning, Natural Language Processing, Computer Vision, MLOps and DataOps.

October 2021 - October 2022

Federal University of Ceará

Bachelor of Science in Mathematics

  • Member of the ParGo Research Group (Parallel Computing, Graph Theory and Optimization) in the area of Graph Theory, focusing on graph convexity and algorithmic complexity.
  • Relevant coursework: in addition to the pure mathematics courses, Data Structures, Construction and Analysis of Algorithms, Graph Theory, and Combinatorics.

January 2018 - September 2021

Federal University of Ceará

Bachelor of Eng. in Electrical Engineering (Not finished)

  • After finishing about 40% of the degree, I transferred to Mathematics.
  • Relevant coursework: Programming in Python, C, C++ and Matlab, Statistics, Economics.

March 2016 - December 2017

Skills

  • Python
  • Docker
  • Machine Learning
  • Data Analysis
  • Data Visualization
  • Agile Development
  • To be continued...

Projects

Insurance Forecast

Data Science Project - MLZoomcamp

  • Data analysis and understanding, modelling, deployment, and presentation of a regression model that predicts insurance prices.
  • Tech stack: Python, Docker, Streamlit, Heroku.

Drug Discovery With ML

Data Immersion 3 - Alura

  • One of the top 10 projects, chosen from more than 200 participants, awarded a scholarship for Alura's Data Science Bootcamp.
  • Data analysis, understanding, and modelling for drug discovery.


Extras

Extra Courses

  • Introduction to Machine Learning in Production, DeepLearning.AI (15h)
  • AWS Machine Learning Foundations, Udacity
  • Introduction to Python for Natural Language Processing, USP (20h)
  • Data Engineering Bootcamp, How Bootcamps (100h)
  • Data Science Bootcamp, Alura (160h)
  • All Courses Completed at Alura

Awards

  • Alura Stars, Alura (Website)
  • 2nd Place Data Engineering Hackathon, A3 Data (Project)