I'm a Data Engineer & Full-Stack Developer pursuing an MSc in Computer Science and Data Science at ESILV Paris. I build real-time data pipelines, ML systems, and web/mobile applications — bridging the gap between data engineering and product development.
Real-time pipelines with Kafka, Spark Streaming, and InfluxDB. ML models with scikit-learn, PyTorch, TensorFlow, and MLflow. Insights visualised with Grafana, Power BI, and Seaborn.
Web apps with React.js and Next.js. Mobile apps with React Native. Serverless backends with AWS Lambda, DynamoDB, FastAPI, and Firebase.
Deployed on AWS and Cloudflare Pages. Data stored in PostgreSQL, MongoDB, and Firebase. Docker, Azure Data Factory, and Git for containerisation, orchestration, and version control.
End-to-end real-time pipeline ingesting live cryptocurrency data from CoinGecko via Kafka, processed with Spark Streaming, stored in InfluxDB, and visualised on interactive Grafana dashboards with price alerts.
Cross-domain VR education platform built in Unity with the Meta XR SDK. Features 3D-modelled classroom scenes, real-time teacher–student interaction, and performance-optimised rendering for standalone VR headsets.
Production-grade MLOps pipeline for diamond price prediction. Automated ingestion, feature engineering, model versioning with MLflow, workflow orchestration via Prefect, and a REST API served with FastAPI in Docker.
Large-scale recommendation engine modelling user–song interactions as a graph. Applies Personalised PageRank alongside collaborative filtering in PySpark to surface relevant tracks at scale.
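To illustrate the Personalised PageRank idea on a user–song graph, here is a minimal pure-Python sketch. It is not the project's PySpark code: the function name, the toy graph, and the damping and iteration values are all assumptions chosen for illustration.

```python
def personalised_pagerank(graph, source, damping=0.85, iterations=50):
    """Power-iteration Personalised PageRank that restarts at `source`.

    `graph` maps each node to a list of neighbours; every neighbour must
    also appear as a key. All names here are hypothetical.
    """
    rank = {node: 0.0 for node in graph}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            if neighbours:
                # Spread the damped rank evenly across outgoing edges.
                share = damping * rank[node] / len(neighbours)
                for neighbour in neighbours:
                    nxt[neighbour] += share
            else:
                # Dangling node: send its damped rank back to the source.
                nxt[source] += damping * rank[node]
        # Restart mass always returns to the source node.
        nxt[source] += 1.0 - damping
        rank = nxt
    return rank


# Toy bipartite graph (hypothetical data): edges link users to songs they played.
graph = {
    "user_a": ["song_1", "song_2"],
    "user_b": ["song_2", "song_3"],
    "song_1": ["user_a"],
    "song_2": ["user_a", "user_b"],
    "song_3": ["user_b"],
}
scores = personalised_pagerank(graph, "user_a")

# Rank songs user_a has not played yet by their score.
unheard = [n for n in graph if n.startswith("song_") and n not in graph["user_a"]]
print(sorted(unheard, key=scores.get, reverse=True))
```

Restarting at a single user biases the scores toward that user's neighbourhood, which is what distinguishes the personalised variant from global PageRank.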
Scraped smartphone listings, engineered features (RAM, storage, brand), analysed market pricing segments, and benchmarked linear and non-linear regression models for price prediction.
Distributed recommendation system using PySpark's ALS collaborative filtering on a large MovieLens dataset. Optimised Spark SQL queries for performance at scale.
Scraped product reviews and applied multiple sentiment models (VADER, TextBlob, fine-tuned classifier). Combined predictions through a majority-vote ensemble for robust sentiment classification.
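The majority-vote step can be sketched in a few lines of Python. This is an illustration only: the function name, the labels, and the tie-breaking rule (the earliest-listed model wins) are assumptions, not details taken from the project.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among per-model predictions.

    Ties are broken in favour of the label that appears first in
    `predictions` (an assumed rule, chosen here for determinism).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs from three models for a single review.
votes = ["positive", "negative", "positive"]
print(majority_vote(votes))
```

With an odd number of models and two classes a tie cannot occur; the explicit tie-break only matters once a third class such as "neutral" is allowed.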
PyTorch-based CNN to classify NASA space imagery into multiple categories. Conducted architecture search and hyperparameter optimisation, achieving strong accuracy on a curated astronomical dataset.
Designed and trained CNNs for multi-class image classification. Applied data augmentation, batch normalisation, and transfer learning to improve generalisation on benchmark datasets.
Full-stack web application for managing student records built with Jakarta EE and JSP following the MVC pattern. Features secure session handling, role-based access, and full CRUD backed by MySQL.
Analysed a large UK road accidents dataset through EDA and predictive modelling to uncover patterns in accident severity, time, and geography, producing actionable road safety insights.
Merged multiple global climate datasets to analyse historical CO₂ trends, forecast future emissions, and surface the policy implications of different emission scenarios through rich visualisations.
Scraped and analysed historical Olympic medal data to surface trends in country performance, athlete demographics, and sport popularity across the modern Games.
Implemented and benchmarked classical NLP summarisation algorithms (TextRank, TF-IDF extractive, LexRank) on news corpora, evaluating quality with ROUGE metrics to identify the best approach per document type.
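For context on the evaluation, ROUGE-1 scores a summary by its unigram overlap with a reference. The sketch below is a simplified version that assumes lowercased whitespace tokenisation and no stemming; the function name and example sentences are hypothetical, and the project may well have used an existing ROUGE package instead.

```python
from collections import Counter


def rouge_1(reference, candidate):
    """Return (precision, recall, f1) from clipped unigram overlap."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical reference and system summaries.
precision, recall, f1 = rouge_1("the cat sat on the mat", "the cat lay on the mat")
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```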
Developed and shipped an Android app enabling small businesses to showcase products and allowing users to discover nearby shops through location-based services. Live on the Google Play Store.