Marco Molinati · Senior AI Engineer & Data Scientist

Senior AI Engineer & Data Scientist · Ferno (VA), Italy

I turn messy data into systems that decide

6+ years at Elmec Informatica, from Service Desk to technical reference for the AI team. I build production LLM systems and set the standards the team builds on.

Work

LLM Gateway

OpenAI-compatible LLM gateway in Go: one API in front of OpenAI, Anthropic, and on-prem models (vLLM, Ollama), with token-level access control, RBAC, budgets, rate limiting, and OpenTelemetry observability, plus a 40-tool MCP admin server. In production at Elmec.

Go · Echo · PostgreSQL · React · OpenTelemetry · MCP

LLM Availability Dashboard

Real-time monitoring for LLMs across our Kubernetes infrastructure (KServe, vLLM). A live probe loop tracks model health; when a model drops, an Agno agent auto-diagnoses the cause by inspecting pods, events, and logs through read-only tools and testing the model endpoint directly, with its reasoning streamed live to the UI.

Python · FastAPI · Agno · WebSockets · Kubernetes
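A minimal sketch of the drop-detection step in a probe loop, not the production code. It assumes health is a simple boolean per model and that diagnosis should fire only on a healthy-to-unhealthy transition; `detect_drops`, the `probe` callable, and the model names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def detect_drops(
    previous: Dict[str, bool],
    models: Iterable[str],
    probe: Callable[[str], bool],
) -> List[str]:
    """Probe each model once and return those that just went unhealthy.

    `previous` holds the last known state and is updated in place.
    """
    dropped = []
    for name in models:
        healthy = probe(name)
        # Unseen models count as previously healthy, so a first failed
        # probe is reported as a drop.
        if previous.get(name, True) and not healthy:
            dropped.append(name)
        previous[name] = healthy
    return dropped


# Hypothetical usage with a fake probe backed by a dict.
state: Dict[str, bool] = {}
status = {"model-a": True, "model-b": True}
first = detect_drops(state, status, status.__getitem__)   # nothing dropped
status["model-b"] = False
second = detect_drops(state, status, status.__getitem__)  # model-b just dropped
third = detect_drops(state, status, status.__getitem__)   # still down, not re-reported
```

Reporting only the transition keeps a model that stays down from triggering a new diagnosis on every probe cycle.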

35–40% ↓ analyst load

Phishing Email Triage

FastAPI service that parses incoming emails and scores phishing risk with an LLM, wired into the cybersecurity team’s ticketing flow. Langfuse-managed prompts, deployed on Kubernetes, and the basis for my public AI & Security talks.

Python · FastAPI · OpenAI · Langfuse · Kubernetes
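A rough sketch of the parsing step that could run before the LLM scores a message, using only Python's standard `email` package. The chosen signals (Reply-To mismatch, URLs in the body), the `extract_signals` name, and the sample message are assumptions for illustration, not the service's actual feature set.

```python
import re
from email import message_from_string
from email.policy import default
from email.utils import parseaddr

# Deliberately simple URL pattern; real mail needs more careful handling.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_signals(raw: str) -> dict:
    """Parse a raw RFC 5322 message and collect a few phishing-relevant fields."""
    msg = message_from_string(raw, policy=default)
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""
    from_addr = parseaddr(str(msg.get("From", "")))[1].lower()
    reply_addr = parseaddr(str(msg.get("Reply-To", "")))[1].lower()
    return {
        "subject": str(msg.get("Subject", "")),
        "from": from_addr,
        # A Reply-To that differs from From is a common phishing signal.
        "reply_to_mismatch": bool(reply_addr) and reply_addr != from_addr,
        "urls": URL_RE.findall(body),
    }


SAMPLE = (
    "From: IT Support <support@example.com>\n"
    "Reply-To: helpdesk@example.net\n"
    "Subject: Password expiry\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Reset your password at http://example.org/reset now.\n"
)
signals = extract_signals(SAMPLE)
```

The resulting dictionary is the kind of structured context that can be placed in the prompt alongside the message text.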

up to 50% faster docs

AI Documentation Assistant

LLM workflows for technical authors that validate draft sections against guidelines and offer suggest / rework / extract actions, streamed token by token. Structured output via Instructor, prompts in Langfuse, deployed on Kubernetes.

Python · FastAPI · Instructor · Langfuse · Kubernetes

~79 h saved / mo

Ticket Classifier

Classical-ML service that routes support tickets by type and forwarding group across 50+ enterprise clients: TF-IDF + SVM over FastAPI, predictions persisted to PostgreSQL. ~1,900 interactions/month.

Python · scikit-learn · FastAPI · PostgreSQL

personal project · in progress

Generative UI Dashboard

Upload a CSV / Excel / JSON file and get an interactive dashboard: one LLM call maps columns to chart specs, then everything renders deterministically through ECharts templates, with no generated code. Drag-and-drop layout, SSE streaming. Earlier iterations explored agentic generation.

Python · FastAPI · Polars · React · ECharts
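A small sketch of the "spec in, template out" idea: the LLM returns only a chart spec, and a fixed function turns it into an ECharts option. The spec fields (`type`, `x`, `y`, `title`), the allowed chart types, and `render_option` are assumptions for illustration; the real spec format is not shown here.

```python
from typing import Any, Dict, List

# Only chart types with a known template are accepted.
ALLOWED_TYPES = {"bar", "line"}


def render_option(spec: Dict[str, str], rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fill a fixed ECharts option template from a validated chart spec."""
    if spec["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unsupported chart type: {spec['type']}")
    columns = set(rows[0]) if rows else set()
    for key in ("x", "y"):
        # Reject columns the model invented instead of rendering them.
        if spec[key] not in columns:
            raise ValueError(f"unknown column: {spec[key]}")
    return {
        "title": {"text": spec.get("title", "")},
        "xAxis": {"type": "category", "data": [r[spec["x"]] for r in rows]},
        "yAxis": {"type": "value"},
        "series": [{"type": spec["type"], "data": [r[spec["y"]] for r in rows]}],
    }


rows = [{"month": "Jan", "sales": 10}, {"month": "Feb", "sales": 14}]
spec = {"type": "bar", "x": "month", "y": "sales", "title": "Sales by month"}
option = render_option(spec, rows)
```

Because the model output is validated data rather than code, a bad spec fails with a clear error instead of producing a broken or unsafe chart.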

Procedure Knowledge Base

A knowledge base for retail IT support that routes questions to verified procedures instead of rewriting them: a vision LLM enriches screenshots and diagrams, an LLM builds a routing index, and a query API answers with citations to the source docs.

Python · FastAPI · OpenAI · Vision LLM

with the Elmec AI team

Enterprise Site Search

A search engine on OpenSearch for a corporate site: hybrid full-text search with query optimization, score-based re-ranking, real-time autocomplete, and a Trino-to-OpenSearch ingestion pipeline, plus continuous LLM-as-a-Judge evaluation of result quality. Built with the team.

Python · FastAPI · OpenSearch · Trino · MongoDB · OpenAI

with the Elmec AI team

Conversational AI

Graph-orchestrated support chatbot (semantic RAG retrieval, LLM-as-judge escalation, and multi-turn info collection) that I contributed to with the Elmec AI team.

Python · Pydantic AI · Qdrant · MongoDB · Langfuse

This Site

The site you’re reading. Next.js + React with a generative hero (a feed-forward neural network in a particle field that resolves my name out of noise) and a role-semantic OKLCH color system. Designed and built end-to-end.

Next.js · React · TypeScript · Framer Motion · Tailwind

Capabilities

Generative AI & LLM

Where I spend most of my time. Agentic systems with tool-calling, from an Agno agent that diagnoses live model outages to graph-orchestrated chat, backed by RAG, prompt engineering, and LLM-as-a-Judge for continuous quality. The basis for 6 talks (2024–2026, 2 external incl. ATED).

LLMs · Generative AI · RAG · Agentic AI · Agent Orchestration · Prompt Engineering · LLM-as-a-Judge · Responsible AI · Conversational AI · Intent Detection

LLM Providers & Deployment

Hybrid cloud / on-prem deployment strategy across providers. Currently deepening the end-to-end on-prem lifecycle, with a focus on vLLM (serving, monitoring, scaling, version management).

OpenAI · Azure OpenAI · Anthropic · vLLM · LocalAI · LM Studio · Ollama

MLOps & Observability

I introduced Langfuse as the company-wide observability baseline, now adopted across every internal AI project for tracing, prompt versioning, evaluation, and cost monitoring.

Langfuse · MLOps pipelines · Experiment Tracking · CI/CD for ML · Model Monitoring

Vector Databases & Search

Semantic and hybrid search, re-ranking, and continuous validation via LLM-as-a-Judge, powering RAG and enterprise search systems. Qdrant-certified (01/2026).

Qdrant · ChromaDB · LanceDB · OpenSearch · Hybrid Search · Re-ranking

Frameworks & Programming

Production stack across Python services, ML pipelines, and lightweight UIs. React for full-stack work; FastAPI / Flask / Gradio depending on the surface.

Python · LangChain · LlamaIndex · HuggingFace · Transformers · FastAPI · Flask · Gradio · Pydantic AI · Agno · React · SQL · PowerShell · Bash

Data Science & Visualization

NLP and predictive modeling. Automated EDA tooling and dashboard work directly support the Generative UI Dashboard build.

NLP · Predictive Modeling · Classification · Automated EDA · ECharts · Plotly · Qlik Sense · Matplotlib · Seaborn

Cloud, DevOps & Infrastructure

Foundations from the System Engineering era; still actively shaping AI deployment decisions across cloud and on-prem environments.

AWS · Azure · Docker · Kubernetes · Git · VMware · Hyper-V · Linux · Windows Server

The path

01/2020 – 04/2021

Service Desk Specialist

L1/L2 incident and escalation management with ITSM: the foundation of technical analysis and communication.

04/2021 – 04/2024

System Engineer & Business Data Analyst

Stabilized Windows / Linux, cloud, and virtualization environments; automated operations with Python / PowerShell and pioneered the first ML work on ticketing.

04/2024 – Present

AI Engineer & Data Scientist

Technical reference for the AI team on LLMs, RAG, and MLOps; I own the hybrid deployment strategy and the observability standard.

50%

Doc creation time saved

35–40%

Classification savings

1,900+

Monthly interactions

~79

Hours saved / month

Recognition

NVIDIA-Certified Associate: AI Infrastructure and Operations · 11/2025

Qdrant Essentials Certification · 01/2026

2026 · AI & Automation Training Program · Contracts & Pre-sales Office, Elmec

2025 · RAG & Prompt Engineering for Enterprise AI Solutions · Internal Knowledge Sharing

2025 · AI & Security: Tools for Attackers and Defenders · ATED · external

2025 · AI & Security: Tools for Attackers and Defenders · Company Tech Talk

2025 · AI to Support the End User · Company Tech Talk

2024 · Introduction to Generative AI: LLMs, RAG and Prompt Engineering · Internal Knowledge Sharing

2023 · Scripting for Data Extraction and Manipulation · Internal Knowledge Sharing

Credentials

Data Science & AI

2020 – Present

Self-Study

Specialist courses on DeepLearning.AI, Coursera, O'Reilly: ML, DL, NLP, MLOps, LLMs, agentic systems
External validation via vendor-issued certifications (NVIDIA, Qdrant, HuggingFace)

B.Sc. Computer Science (interrupted)

2018 – 2022

Università degli Studi di Milano

Left the degree by deliberate choice to specialize in Data Science & AI through targeted training and enterprise work; the career path since 2020 bears that choice out.
Foundation acquired: programming, algorithms, databases, software engineering, statistics

Scientific High School Diploma

2013 – 2018

Liceo Scientifico Statale Arturo Tosi

Grade: 78/100

Let's connect

MarcoMolinati