Philipp Guldimann

Data & ML Systems Engineer · Zürich, Switzerland

I build evaluation and data infrastructure for AI products — from zero to production. At a Zürich AI startup I built LLM evaluation infrastructure (23 evaluators, 10+ models assessed) that cut QA cycles from two days to four hours, and I’m now building multi-jurisdiction legal AI pipelines that process roughly one million documents across three countries. I hold an MSc in Machine Intelligence from ETH Zürich, and I care about pragmatic, scalable systems that ship and prove their value with metrics.

Technical focus

LLM evaluation · RAG systems · Data pipelines · MLOps · TypeScript / Python · AWS / Azure

Experience

Data Engineer — Omnilex

Feb 2026 – Present · Zürich, Switzerland

Built ingestion and transformation pipelines for legal content across 3 jurisdictions (Switzerland, Germany, Austria), processing ~1M documents from APIs, scraping outputs, and bulk sources.
Developed TypeScript-based data workflows for normalization, citation-aware chunking, embeddings, classification, and entity extraction.
Contributed to RAG-ready indexing and Azure-based search infrastructure for precise, traceable legal AI responses.
Introduced data contracts and validation checks that caught 50K+ duplicate entries before they reached production.

Machine Learning Engineer — LatticeFlow AI

Jan 2025 – Nov 2025 · Zürich, Switzerland

Built v0 evaluation infrastructure for a new AI product from scratch: 23 evaluators, integrations with multiple data and chat-model providers, assessing 10+ LLMs.
Automated evaluation workflows for QA and regression testing, reducing review cycles from ~2 days of manual inspection to ~4 hours.
Extended evaluators with targeted dataset generation to increase coverage across model behaviours and failure modes.

Research & Publications

COMPL-AI — A Benchmarking Framework for Evaluating LLM Compliance with the EU AI Act

Research Intern, Secure Reliable Intelligence Lab, ETH Zürich (Oct 2023 – Mar 2024)

Core contributor to COMPL-AI, evaluating 10+ models across 20 benchmarks spanning capabilities, cybersecurity, privacy, and bias/fairness. Worked on benchmark design, evaluation pipelines, and model integration via Hugging Face Transformers.

Read the paper (arXiv:2410.07959) · Code on GitHub

Education

MSc Computer Science — Machine Intelligence, ETH Zürich (Sept 2022 – Dec 2024) Thesis (top grade): Speech Recognition for Children with Congenital Disorders Using Adaptive Methods — read on the ETH Research Collection.
BSc Computer Science, ETH Zürich (Sept 2018 – Aug 2022) Thesis (top grade): Detecting Disinformation on Twitter Targeting Non-Profit Organisations (in collaboration with the ICRC).

Technical Skills

Languages: Python, TypeScript / JavaScript, SQL
LLM / AI: LLM evaluation, RAG, embeddings, Hugging Face Transformers, LangChain / LangGraph, MCP, PyTorch
Data & orchestration: Airflow, Dagster, vector databases (FAISS, Weaviate, Pinecone), PostgreSQL
Backend & cloud: FastAPI, Node.js, Docker, Kubernetes, AWS, Azure, CI/CD

Philipp Guldimann

Philipp Guldimann

Data & ML Systems Engineer · Zürich, Switzerland

Technical focus

Experience

Data Engineer — Omnilex

Machine Learning Engineer — LatticeFlow AI

Research & Publications

COMPL-AI — A Benchmarking Framework for Evaluating LLM Compliance with the EU AI Act

Education

Technical Skills

Earlier Projects

Computational Intelligence Lab — Text Classification (ETH, 2023)

Bachelor Thesis — Detecting Disinformation on Twitter