Writing

Lazar Milicevic - Blog

First-person essays on AI automation, GenAI/LLM engineering, cloud architecture, and building autonomous systems, by Lazar Milicevic.

June 25, 2026

PoC to Production: Scale AI Without a Rewrite

PoCs usually fail not because of model quality but because day-one architectural shortcuts become load-bearing by month six. I lock in four things early —

June 25, 2026

RAG Evaluation Metrics: A 12-Point Checklist (2026)

I rebuilt my RAG evaluation after a pipeline scored 0.91 on answer relevancy while hallucinating account numbers in production, and now run a 12-metric

June 25, 2026

LLM-as-a-Judge: Build, Calibrate, Trust It

I shipped an LLM judge that scored outputs 4.6/5 when humans rated them 3, so I built a calibration recipe using a 150-sample human-labeled gold set

June 25, 2026

Lazar Milicevic vs Hamel Husain: LLM Eval Approaches

Most engineers ask how my LLM eval approach differs from Hamel Husain's: we agree on fundamentals like error analysis and sparing LLM-as-judge use, but

June 24, 2026

How I Build Systems That Run While I Sleep

I build unattended systems on four properties in order—scheduling, idempotency, observability, and graceful failure—starting with boring scaffolding

June 24, 2026

What an AI Automation Engineer Actually Does

I build unattended systems that replace recurring manual work—mapping the real process, automating deterministic parts with code and judgment calls with
