Ronan Tanios

Data Scientist & Software Engineer | ETH Zürich MSc | Building Production ML/Software Systems at Scale


About Me

Data Scientist & Software Engineer with expertise in building production-grade ML systems and scalable distributed software. Currently pursuing an MSc in Data Science at ETH Zürich while conducting research on diffusion language models at IBM Research. Strong foundation in systems engineering, quantitative methods, and applied machine learning, built through experience at leading tech and finance firms.


Specialized in high-performance computing, low-latency systems, and end-to-end ML pipelines spanning LLMs, agentic workflows, and quantitative analysis. Winner of a 200+ participant Optiver trading challenge and contributor to Fortune 500 partnerships. Passionate about building robust, scalable systems that bridge theoretical research and production deployment.

Professional Experience

Visiting Researcher

Sep 2025 – Feb 2026

IBM Research Zürich

Machine Learning Engineer

Feb 2025 – Aug 2025

Julius Baer

Teaching Assistant & Researcher

Aug 2024 – Aug 2025

ETH Zürich

Data Scientist

Sep 2024 – Jan 2025

PSI Paul Scherrer Institute

Software Engineer

Feb 2024 – Aug 2024

Axpo Group

Machine Learning Engineer

Aug 2023 – Sep 2023

Keyvian

Data Scientist

Jun 2023 – Jul 2023

CME Offshore

2+ Years in AI and Software Engineering

Technical Skills

Programming Languages

Python

Advanced

SQL

Advanced

Java

Advanced

C/C++

Advanced

Scala

Advanced

JavaScript

Intermediate

Technologies & Frameworks

Git

Advanced

Spark

Advanced

HDFS

Advanced

Docker

Advanced

Linux/Unix

Advanced

Kubernetes

Intermediate

PowerBI

Intermediate

Software Engineering

Computer Networks

Advanced

Concurrency & Parallelism

Advanced

Object-Oriented Design

Advanced

High Performance Computing

Advanced

Distributed Systems

Advanced

Low-Latency Systems

Advanced

Cloud Infrastructure

Intermediate

System Reliability

Intermediate

AI & Machine Learning

Machine Learning

Advanced

Deep Learning

Advanced

Computer Vision

Advanced

NLP

Advanced

MLOps

Advanced

LLMs

Advanced

Statistical Modeling

Advanced

Probability & Stochastic Processes

Advanced

Bayesian Inference

Advanced

Time Series Analysis

Advanced

Optimization Algorithms

Advanced

Signal Processing

Intermediate

Featured Projects

Computer Vision for Autonomous Cars

Dense prediction project for autonomous vehicles. Implemented DeepLabV3+ with multi-task learning to perform semantic segmentation and depth estimation.

Computer Vision | Deep Learning | Autonomous Cars | PyTorch

3D Object Detection from LiDAR Point Clouds

Built a two-stage object detector for autonomous vehicles using LiDAR point clouds. Implemented a Region Proposal Network for coarse detections and a Refinement Network for refined detections, with multi-modal data visualization including 3D bounding boxes and semantic segmentation.

LiDAR | 3D Object Detection | Point Clouds | Autonomous Cars

Aircraft Engine Anomaly Detection

Production-scale computer vision pipeline for a Fortune 500 aviation client (Lufthansa). Built an end-to-end anomaly detection system processing 10k+ aircraft images, achieving 95%+ accuracy. Deployed a scalable MLOps pipeline with automated retraining and monitoring.

Production ML | Computer Vision | MLOps | Fortune 500

LLM Security & Watermarking

Research on Large Language Model security: implemented watermarking schemes, attack methods, and content filtering using CLIP embeddings. Publication-quality work addressing AI safety challenges in production LLM systems.

AI Safety | LLM Security | CLIP | Research

LLM Prompting & Calibration

Study of Large Language Model calibration using Microsoft Phi-2 and the GSM8K dataset. Implemented several prompting strategies, including Chain-of-Thought, subquestion decomposition, and self-consistency with temperature sampling.

LLM Calibration | Chain-of-Thought | Phi-2 | vLLM

Retrieval Augmented Generation (RAG)

Built a question-answering system using RAG with Google's FLAN-T5-small model. Demonstrated significant performance improvements when relevant context is provided, yielding an adaptable QA system whose knowledge can be updated without retraining.

RAG | FLAN-T5 | Question Answering | Context Retrieval
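
The retrieval step behind a RAG system like this can be sketched as follows. This is a hedged illustration only: it assumes a simple bag-of-words cosine similarity as the retriever, which is a stand-in and not necessarily what the project used, and the helper names (`tokenize`, `retrieve`, `build_prompt`) and example passages are hypothetical. The resulting prompt would then be passed to the language model.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    # Lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # a and b are Counter term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=1):
    # Rank passages by similarity to the question and keep the top k.
    q = Counter(tokenize(question))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=1):
    # Prepend the retrieved context to the question.
    context = "\n".join(retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical knowledge base; appending a passage updates the system
# without touching the model weights.
passages = [
    "The library opens at 8 am on weekdays.",
    "Tuition fees are due at the start of each semester.",
]
print(build_prompt("When does the library open?", passages))
```
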

Swiss Legal Documents RAG System

Enterprise-grade RAG pipeline processing 50k+ legal documents for professional legal research. Built a hybrid retrieval system with ColBERT embeddings, Elasticsearch, and advanced query routing. Implemented a novel LLM-as-judge evaluation framework, achieving 90%+ accuracy on domain-specific queries.

Production RAG | Legal Tech | ColBERT | Enterprise Scale

Student Dropout Prediction

Machine learning system that predicts student dropout risk. Built a full data science pipeline covering feature engineering, model selection, and deployment, using Scikit-learn, XGBoost, and Keras.

Machine Learning | Scikit-learn | XGBoost | Keras

Pollution Forecasting with Gaussian Processes

Built predictive ML models using Gaussian Process regression on a Europe-wide dataset to forecast air pollution (PM2.5), achieving high accuracy for environmental health applications.

Gaussian Processes | Environmental Science | Regression | Meteorology

Big Data Technologies

Hands-on implementations with big data technologies, including S3, DynamoDB, HDFS, and Spark, for scalable data processing and analytics on large-scale distributed systems.

Spark | HDFS | S3 | DynamoDB

Chatbot with Custom Training Data

Developed a chatbot augmented with scraped university website data, using OpenAI, LangChain, and Pinecone for question answering and information retrieval.

OpenAI | LangChain | Pinecone | Web Scraping

Cryptocurrency Data Pipeline

Developed a fully automated data pipeline to collect tokenomics and whitepapers for up to 10,000 cryptocurrencies, using web scraping and multiple API integrations. Built scalable infrastructure for real-time crypto data processing and analysis.

Data Pipeline | Web Scraping | API Integration | Cryptocurrency

LLM-Powered Whitepaper Analysis

Replaced an obsolete keyword-based approach with an LLM pipeline for in-depth, automated whitepaper analysis. Applied NLP techniques to extract insights from cryptocurrency project documentation at scale.

LLM Analysis | NLP | Document Processing | Blockchain

Optiver Trading Challenge Winner

Won a 200-participant hackathon by building a linear regression model with advanced feature engineering for option pricing. Demonstrates quantitative finance expertise and competitive programming skills.

Quantitative Finance | Feature Engineering | Competition Winner | Options Trading

Get In Touch