AI · Automation · Engineering

n8n in Production: From First Node to Self-Running AI Workflows

By Lazar Milicevic · July 5, 2026 · 10 min read
[Image: server racks with glowing blue lights, representing automated n8n production workflows running autonomously]

I have a confession: I used to dismiss visual automation tools as toys for people who couldn't write code. That changed when I needed to orchestrate multiple LLM calls, API fetches, and publish steps across several sites, all on a schedule, with no human babysitting. Rebuilding that orchestration in pure code was taking weeks. Wiring it in n8n took days. Now I run content and SEO pipelines that publish on their own, and the workflows are stable enough that I stop thinking about them.

Here is what I have learned running n8n in production for real AI workloads. Not the drag-and-drop tutorial stuff. The architecture, the failure modes, and the decisions that determine whether your workflow survives Monday morning.

Why n8n Works for AI Agent Orchestration

n8n is a source-available workflow automation platform that lets you connect APIs, databases, and LLMs through a visual canvas while dropping into raw JavaScript or HTTP calls whenever you need to. For AI automation engineering, it sits in a sweet spot between no-code tools (Zapier, Make) and fully custom code.

The key advantage is the AI agent node. It supports the standard ReAct pattern: the LLM decides which tools to call, executes them, observes the result, and loops until it has a final answer. You can attach custom tools (HTTP requests, database queries, code execution) and the agent orchestrates them autonomously. I have used this pattern in my BizFlowAI ContentStudio to let a Claude-powered agent research topics, pull keyword data, and write structured content without a hard-coded sequence.
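
To make the ReAct loop concrete, here is a minimal sketch in plain JavaScript. It is not n8n's implementation of the agent node. The `runAgent` function, the `scriptedModel` stand-in, and the `keywordVolume` tool are all hypothetical names invented for illustration, and the model call is synchronous here only to keep the example short; a real LLM call would be asynchronous.

```javascript
// Minimal ReAct-style loop (illustrative only, not n8n internals).
// `model` stands in for an LLM call: it receives the transcript so far and
// returns either a tool request or a final answer.
function runAgent(model, tools, question, maxSteps = 5) {
  const transcript = [{ role: 'user', content: question }];

  for (let step = 0; step < maxSteps; step++) {
    const decision = model(transcript);

    if (decision.type === 'final') {
      return { answer: decision.answer, toolCalls: step, transcript };
    }

    // Execute the requested tool and feed the observation back to the model.
    const tool = tools[decision.tool];
    const observation = tool
      ? tool(decision.input)
      : `Unknown tool: ${decision.tool}`;
    transcript.push({
      role: 'tool',
      name: decision.tool,
      content: String(observation),
    });
  }

  // A hard step limit is what stops a confused agent from looping forever.
  throw new Error(`Agent did not finish within ${maxSteps} steps`);
}

// Hypothetical scripted model: asks for one tool call, then answers.
function scriptedModel(transcript) {
  const sawObservation = transcript.some((m) => m.role === 'tool');
  if (!sawObservation) {
    return { type: 'tool', tool: 'keywordVolume', input: 'n8n production' };
  }
  const last = transcript[transcript.length - 1];
  return { type: 'final', answer: `Volume: ${last.content}` };
}

// Hypothetical tool with a fake, deterministic result.
const tools = { keywordVolume: (keyword) => keyword.length * 100 };

const result = runAgent(scriptedModel, tools, 'How popular is this keyword?');
console.log(result.answer);
```

The step limit is the part worth copying: whatever runs the loop, an upper bound on iterations keeps a misbehaving agent from running indefinitely.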

The trade-off is debugging complexity. When a multi-step agent fails, the visual canvas hides the execution trace behind a UI that was not designed for long ReAct loops. You will spend time clicking through node outputs to find where the prompt went wrong. For simple deterministic pipelines, this is fine. For complex agent loops, log everything to an external store.

The Architecture I Actually Run

My production stack for autonomous content and SEO pipelines looks like this:

Layer         | Tool                                                     | Why
--------------|----------------------------------------------------------|----------------------------------------------
Orchestration | n8n (self-hosted, Docker)                                | Visual canvas, cron triggers, error workflows
LLM           | Claude API (Sonnet for writing, Haiku for classification) | Reliability, long context, structured output
Database      | PostgreSQL with pgvector                                 | Content storage, RAG, vector search
Hosting       | Single VPS, Docker Compose                               | Simple, predictable cost, full control
Monitoring    | n8n error workflow + Slack alert                         | Catches failures within minutes

The entire setup costs me under $50/month in infrastructure, not counting LLM API usage. The LLM cost depends on volume, but for my content pipeline, Claude's API runs roughly $0.15 to $0.40 per published piece depending on research depth and revision rounds.
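
As a rough worked example of that arithmetic, the sketch below combines the infrastructure ceiling with the per-piece range. The volume of 100 pieces per month is a made-up figure for illustration, and `monthlyCostUsd` is a hypothetical helper. Amounts are held in cents to avoid floating-point rounding.

```javascript
// Rough monthly cost range: infrastructure ceiling plus per-piece LLM cost.
// Defaults mirror the figures quoted above ($50 infra ceiling, $0.15 to $0.40
// per piece); the volume passed in below is an assumption.
function monthlyCostUsd(
  piecesPerMonth,
  infraCents = 5000,
  perPieceCents = { low: 15, high: 40 }
) {
  return {
    low: (infraCents + piecesPerMonth * perPieceCents.low) / 100,
    high: (infraCents + piecesPerMonth * perPieceCents.high) / 100,
  };
}

const estimate = monthlyCostUsd(100); // hypothetical volume
console.log(`$${estimate.low} to $${estimate.high} per month`);
```

At that assumed volume the total lands between $65 and $90 per month, and since the infrastructure figure is a ceiling, the real number would sit somewhat lower.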

One architectural decision that paid off immediately: I separated the orchestration layer (n8n decides what happens next) from the execution layer (custom code does the actual work). n8n calls webhooks that run my Node.js functions for complex operations like RAG retrieval and content formatting. This means I can test and debug the execution layer independently, and if n8n's internal state gets corrupted (it happens), I rebuild the workflow in minutes because the logic lives in my code, not in the canvas.
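
One way to picture that split is a thin webhook adapter in front of plain functions. This is a sketch under assumptions, not the actual service: `operations`, `formatContent`, and `handleWebhook` are hypothetical names, and the request is modeled as an already-parsed object rather than a real HTTP request.

```javascript
// Execution layer: plain functions that can be unit-tested without n8n.
const operations = {
  // Hypothetical operation; a real one would do RAG retrieval, formatting, etc.
  formatContent: ({ title, body }) => ({
    html: `<h1>${title}</h1><p>${body}</p>`,
  }),
};

// Thin adapter: the only code that knows it is being called by a webhook.
function handleWebhook(request) {
  const { operation, payload } = request;

  // Own-property check so names like "constructor" are not treated as operations.
  if (!Object.prototype.hasOwnProperty.call(operations, operation)) {
    return { status: 404, body: { error: `Unknown operation: ${operation}` } };
  }

  try {
    return { status: 200, body: operations[operation](payload) };
  } catch (err) {
    // Return a structured error so the calling workflow can branch on it.
    return { status: 500, body: { error: err.message } };
  }
}

const response = handleWebhook({
  operation: 'formatContent',
  payload: { title: 'Hello', body: 'World' },
});
console.log(response.status, response.body.html);
```

Because the adapter holds no logic of its own, rebuilding the n8n side amounts to pointing an HTTP node back at the same endpoint.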

Getting Past the Toy Stage: Error Handling That Actually Works

The single biggest mistake I see with n8n workflows is assuming happy-path execution. In production, APIs rate-limit, LLMs return malformed JSON, databases time out, and webhooks fail silently. Your workflow needs to handle all of it.

Step 1: Build a dedicated error workflow.

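n8n lets you designate an error workflow for any workflow: a separate workflow that starts with an "Error Trigger" node and is selected in the original workflow's settings. When an execution fails, n8n runs the error workflow and passes it details of the failed execution, such as the workflow name, the failing node, and the error message. My error workflow does three things: logs the error to a PostgreSQL table, sends a Slack message with the workflow name and error details, and retries the failed execution up to two times with a 5-minute delay.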

This is the error workflow structure I use:

[Error Trigger] → [Set (format error data)] → [Postgres (log)] → [Slack (alert)]
                                                          → [Wait 5min] → [HTTP (retry webhook)]
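
The formatting step in that chain could look roughly like the function below, which flattens the error payload into one record and decides whether another retry is allowed. Treat it as a sketch: the input shape (`execution.id`, `execution.lastNodeExecuted`, `execution.error.message`, `workflow.name`) follows what the Error Trigger node is documented to emit but should be checked against your n8n version, and the `retriesSoFar` counter is an assumption about how attempts get tracked (for example, carried in the retry webhook payload).

```javascript
const MAX_RETRIES = 2; // "up to two times", as described above
const RETRY_DELAY_MINUTES = 5;

// Flatten the error payload into one row for logging and alerting.
// `retriesSoFar` is a hypothetical counter supplied by the caller.
function buildErrorRecord(errorData, retriesSoFar) {
  const execution = errorData.execution || {};
  const workflow = errorData.workflow || {};

  return {
    workflowName: workflow.name || 'unknown',
    executionId: execution.id || null,
    failedNode: execution.lastNodeExecuted || null,
    message: (execution.error && execution.error.message) || 'No error message',
    retriesSoFar,
    shouldRetry: retriesSoFar < MAX_RETRIES,
    retryDelayMinutes: RETRY_DELAY_MINUTES,
  };
}

const record = buildErrorRecord(
  {
    execution: {
      id: '231',
      lastNodeExecuted: 'Claude',
      error: { message: 'Rate limited' },
    },
    workflow: { name: 'Writing' },
  },
  0
);
console.log(record.workflowName, record.shouldRetry);
```

Inside an n8n Code node, the input would come from `$input.first().json` instead of a literal object.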

The retry hits a webhook on the original workflow with the same input data. This is important: your original workflow must be idempotent. If it processes the same input twice, the result should be the same, not duplicated. For content generation, I check whether a piece with the same slug already exists before writing.
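
A minimal sketch of that slug check, using an in-memory `Map` as a stand-in for the database. `createStore` and `saveOnce` are hypothetical names. In PostgreSQL the same guarantee would more likely come from a unique constraint on the slug column combined with `INSERT ... ON CONFLICT DO NOTHING`, though that is an assumption about the schema rather than something described here.

```javascript
// In-memory stand-in for a table with a unique slug column.
function createStore() {
  const bySlug = new Map();

  return {
    // Writes the piece only if its slug is new; otherwise returns the
    // existing row unchanged, so a retried run cannot create a duplicate.
    saveOnce(piece) {
      if (bySlug.has(piece.slug)) {
        return { created: false, piece: bySlug.get(piece.slug) };
      }
      bySlug.set(piece.slug, piece);
      return { created: true, piece };
    },
    count: () => bySlug.size,
  };
}

const store = createStore();
const first = store.saveOnce({ slug: 'n8n-in-production', body: 'first run' });
const retry = store.saveOnce({ slug: 'n8n-in-production', body: 'retried run' });
console.log(first.created, retry.created, store.count());
```

The retried call is a no-op that returns the original row, which is what makes it safe for the error workflow to resend the same input.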

Step 2: Handle LLM output validation.

LLMs will eventually return something you do not expect. Never pass raw LLM output directly into a database write or a publish action. I use an intermediate validation step:

// Code node: validate LLM output before proceeding
const item = $input.first().json;
const content = item.content;

const errors = [];

if (!content || content.length < 500) {
  errors.push('Content too short');
}

if (!item.metaDescription || item.metaDescription.length > 160) {
  errors.push('Meta description missing or too long');
}

if (!item.slug || !/^[a-z0-9-]+$/.test(item.slug)) {
  errors.push('Invalid slug format');
}

if (errors.length > 0) {
  // Route to a manual review queue instead of failing
  return [{ json: { ...item, status: 'needs_review', errors } }];
}

return [{ json: { ...item, status: 'approved' } }];

This pattern, routing failures to a review queue instead of crashing the workflow, saved me from publishing broken content more times than I can count. The workflow continues running. The problem gets flagged. You handle it manually.
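
Field validation assumes the output parsed as JSON in the first place. A defensive parsing step in front of it might look like the sketch below; `parseLlmJson` is a hypothetical helper, and the only cleanup it attempts is removing a surrounding Markdown code fence, a common way for model output to break `JSON.parse`.

```javascript
// Three backticks, built indirectly so this example does not contain a
// literal fence marker.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(
  '^' + FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE + '$'
);

// Returns { ok: true, value } or { ok: false, error } instead of throwing,
// so the caller can route failures to review rather than crash.
function parseLlmJson(raw) {
  if (typeof raw !== 'string') {
    return { ok: false, error: 'Output is not a string' };
  }

  const trimmed = raw.trim();
  const fenced = trimmed.match(FENCED_BLOCK);
  const text = fenced ? fenced[1] : trimmed;

  try {
    const value = JSON.parse(text);
    if (value === null || typeof value !== 'object' || Array.isArray(value)) {
      return { ok: false, error: 'Expected a JSON object' };
    }
    return { ok: true, value };
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
}

const parsed = parseLlmJson(
  FENCE + 'json\n{"slug":"n8n-in-production"}\n' + FENCE
);
console.log(parsed.ok, parsed.ok ? parsed.value.slug : parsed.error);
```

A result with `ok: false` can be sent down the same `needs_review` path as a failed field check.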

Step 3: Set execution timeouts.

n8n workflows can hang indefinitely if an external service stops responding. Go to workflow settings and set a maximum execution time. I use 10 minutes for content generation workflows and 2 minutes for simple data syncs. If a workflow hits the timeout, the error workflow fires and you get alerted.

Secrets and Credentials: Do Not Leak Your API Keys

n8n encrypts credentials with an instance-level encryption key and stores them in its database, which is SQLite by default. For production, this is acceptable for a single-instance setup, but you need to follow some rules.

First, never hardcode API keys in nodes. Always use n8n's credential system. This keeps keys out of your workflow JSON, which matters because workflow JSON gets exported, shared, and committed to version control. I learned this the hard way when I exported a workflow to debug an issue and almost shared it with a key embedded in an HTTP node.
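
A cheap safeguard is to scan exported workflow JSON for key-like strings before sharing or committing it. The sketch below is illustrative only: `findSuspectedSecrets` is a hypothetical helper, the two patterns are examples rather than a complete list, and the sample workflow object is a simplified guess at the export shape, not n8n's exact schema.

```javascript
// Example patterns only; extend for the providers you actually use.
const SECRET_PATTERNS = [
  /sk-[A-Za-z0-9_-]{20,}/, // keys using the common "sk-" prefix
  /Bearer\s+[A-Za-z0-9._-]{20,}/, // bearer tokens pasted into a header value
];

// Recursively walks any JSON value and returns the paths of suspicious strings.
function findSuspectedSecrets(value, path = '$') {
  if (typeof value === 'string') {
    return SECRET_PATTERNS.some((p) => p.test(value)) ? [path] : [];
  }
  if (Array.isArray(value)) {
    return value.flatMap((v, i) => findSuspectedSecrets(v, `${path}[${i}]`));
  }
  if (value && typeof value === 'object') {
    return Object.entries(value).flatMap(([key, v]) =>
      findSuspectedSecrets(v, `${path}.${key}`)
    );
  }
  return [];
}

// Simplified, hypothetical export with a token pasted into an HTTP node.
const exported = {
  nodes: [
    {
      name: 'HTTP Request',
      parameters: {
        headers: [
          { name: 'Authorization', value: 'Bearer abcdefghijklmnopqrstuvwxyz123456' },
        ],
      },
    },
    { name: 'Set', parameters: { value: 'hello' } },
  ],
};

console.log(findSuspectedSecrets(exported));
```

Run as a pre-commit check, a non-empty result would block the commit until the value is moved into a credential.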

Second, use environment variables for the n8n encryption key itself. In your Docker Compose file:

services:
  n8n:
    image: n8nio/n8n
    environment:
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=${DB_HOST}
      - DB_POSTGRESDB_PASSWORD=${DB_PASSWORD}
    env_file:
      - .env
    volumes:
      - n8n_data:/home/node/.n8n

Third, rotate keys regularly. If an LLM API key leaks, the cost can be astronomical. I set calendar reminders to rotate Claude and OpenAI keys every 90 days and set spending limits on both platforms. A $200 monthly spend limit on OpenAI has saved me from a runaway agent loop that would have cost thousands.
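
Platform spending limits are the backstop. An in-process guard can stop a runaway loop earlier. The sketch below assumes you can estimate each call's cost before or after making it; `createBudgetGuard` and the per-call figure are hypothetical.

```javascript
// Tracks cumulative spend for one run and refuses calls past the limit.
function createBudgetGuard(limitUsd) {
  let spentUsd = 0;

  return {
    // Call before (or right after) each LLM request with its estimated cost.
    charge(costUsd) {
      if (spentUsd + costUsd > limitUsd) {
        throw new Error(`LLM budget of $${limitUsd} would be exceeded`);
      }
      spentUsd += costUsd;
      return spentUsd;
    },
    spent: () => spentUsd,
  };
}

const guard = createBudgetGuard(1); // hypothetical $1 cap for a single run
let calls = 0;
try {
  // Simulated runaway loop: stops when the guard throws.
  while (true) {
    guard.charge(0.4); // hypothetical cost per call
    calls++;
  }
} catch (err) {
  console.log(`Stopped after ${calls} calls: ${err.message}`);
}
```

The guard throws rather than returning a flag so that a loop which forgets to check the result still stops.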

Self-Hosting: What I Actually Deploy

I run n8n on a single Hetzner VPS (CCX13, 2 vCPU, 8GB RAM) for about $15/month. Docker Compose handles everything: n8n, PostgreSQL (for n8n's internal state and my application data), and a Cloudflare tunnel for HTTPS without managing certificates.

Here is the setup I recommend for a production single-instance deployment. Use PostgreSQL as n8n's backend database, not the default SQLite. SQLite works for testing but under concurrent execution loads, you will see lock errors. PostgreSQL handles parallel workflow executions without issues.

The Docker Compose stack runs three containers: PostgreSQL, n8n, and a Cloudflare tunnel. The tunnel exposes n8n on a subdomain without opening any inbound ports on the VPS. This is cleaner than running Nginx with Let's Encrypt and significantly more secure.

For the n8n container, set these environment variables for production reliability:

  • EXECUTIONS_DATA_PRUNE=true: prunes old execution data, which otherwise bloats the database and slows the UI
  • EXECUTIONS_DATA_MAX_AGE=168: keeps 168 hours (7 days) of execution history (adjust based on your debugging needs)
  • N8N_METRICS=true: exposes Prometheus metrics if you want external monitoring
  • N8N_CONCURRENCY_PRODUCTION_LIMIT=20: caps concurrent production executions so runaway parallelism cannot exhaust memory

With 8GB RAM, this setup comfortably runs 15 to 20 concurrent workflow executions. If you need more, you are looking at n8n's queue mode with Redis and multiple worker instances, which adds operational complexity I would avoid until you genuinely outgrow a single instance.

Scaling Beyond One Workflow: Multi-Agent Patterns

The real power of n8n for AI automation emerges when you chain workflows and let agents call each other. I run a content pipeline with this structure:

  1. Research workflow (triggered by cron or webhook): a Claude-powered agent with web search tools gathers sources, extracts key facts, and writes structured research notes to PostgreSQL.

  2. Writing workflow (triggered by a webhook from the research workflow): a second agent reads the research notes, pulls relevant context from a pgvector RAG store, and drafts a full article.

  3. Optimization workflow (triggered by the writing workflow): checks the draft against SEO criteria, runs a self-critique pass, and either approves or sends back for revision.

  4. Publishing workflow (triggered by the optimization workflow): formats the final content, generates images, and publishes via CMS API.

Each workflow is independent and idempotent. If the writing workflow fails, the research is already saved. If publishing fails, the approved draft is in the database. I can re-run any single step without replaying the entire pipeline.
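
That resumability can be expressed as a small function that looks at what is already saved for a piece and returns the next stage to run. This is a sketch with hypothetical field names (`researchNotes`, `draft`, `review`, `publishedAt`), not the actual schema, and it assumes the writing step clears `review` whenever it saves a new draft.

```javascript
// Decide which workflow should run next from what is already persisted.
// Returns null when the piece is fully published.
function nextStage(record) {
  if (!record.researchNotes) return 'research';
  // No draft yet, or the optimization pass sent it back for revision.
  if (!record.draft || record.review === 'revise') return 'writing';
  if (record.review !== 'approved') return 'optimization';
  if (!record.publishedAt) return 'publishing';
  return null;
}

// Publishing failed earlier: research, draft, and approval are all saved,
// so only the last stage needs to run again.
const afterFailedPublish = {
  researchNotes: 'notes',
  draft: 'text',
  review: 'approved',
  publishedAt: null,
};
console.log(nextStage(afterFailedPublish));
```

Because the decision depends only on stored state, re-triggering a piece after a failure resumes at the stage that failed.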

The workflows are connected by HTTP webhooks. n8n's "Execute Workflow" node works for synchronous calls, but for long-running agent tasks, async webhooks are more reliable. The writing workflow takes 2 to 4 minutes per article because of multiple LLM calls. Synchronous execution would time out. Async webhooks let each workflow run at its own pace.
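
The async handoff comes down to "acknowledge now, work later". The sketch below models it with an in-memory queue; `createAsyncEndpoint`, `receive`, and `processNext` are hypothetical names. In n8n itself, the closest equivalent is a Webhook node configured to respond immediately while the rest of the workflow keeps running.

```javascript
// Models an endpoint that accepts a job instantly and processes it later.
function createAsyncEndpoint(worker) {
  const queue = [];

  return {
    // Called by the upstream workflow: returns at once with 202 Accepted,
    // so the caller never waits on the slow work.
    receive(job) {
      queue.push(job);
      return { status: 202, body: { accepted: true, queued: queue.length } };
    },
    // Called by a background loop: processes one job, or returns null if idle.
    processNext() {
      const job = queue.shift();
      return job === undefined ? null : worker(job);
    },
    pending: () => queue.length,
  };
}

// Hypothetical slow step, represented by a trivial function.
const writing = createAsyncEndpoint((job) => `draft for ${job.slug}`);

const ack = writing.receive({ slug: 'n8n-in-production' });
console.log(ack.status, writing.pending());
console.log(writing.processNext());
```

The upstream workflow only needs the 202 to know the handoff succeeded. Completion is signaled separately, here by the next workflow's own webhook.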

The critical pattern here is separation of concerns. Do not build one giant workflow with 30 nodes. Build small, focused workflows that trigger each other. Each one should do one thing well, have its own error handling, and be independently testable. This is the same principle as writing small functions instead of one monolithic block of code. Apply it to your automation architecture.

What I Would Do Differently

If I were starting fresh today, three things would change.

First, I would add structured logging to an external service from day one. n8n's execution history UI is fine for simple workflows but becomes painful when you are debugging a multi-agent loop with 15+ steps across chained workflows. Logging key events (prompt sent, tokens used, tool called, result received) to a separate system would have saved me countless hours.
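
A structured log line for those events can be as small as one JSON object per event, tied together by a run ID. This is a sketch, not a specific logging service's API: `makeLogEvent`, the event names, and the field layout are hypothetical.

```javascript
// Event names taken from the list above; adjust to your own pipeline.
const EVENTS = ['prompt_sent', 'tokens_used', 'tool_called', 'result_received'];

// Builds one JSON line. `runId` ties together events from chained workflows.
// `now` is injectable so the output is deterministic in tests.
function makeLogEvent(runId, event, data = {}, now = new Date()) {
  if (!EVENTS.includes(event)) {
    throw new Error(`Unknown event type: ${event}`);
  }
  return JSON.stringify({ ts: now.toISOString(), runId, event, data });
}

const line = makeLogEvent('run-42', 'tool_called', { tool: 'web_search' });
console.log(line);
```

Passing the same `runId` through every webhook call in the chain is what allows one query to reconstruct a full multi-workflow run.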

Second, I would build a proper evaluation pipeline before scaling content output. Running automated quality checks on every piece of generated content, checking factual accuracy against sources, measuring readability, and comparing against top-ranking competitors, would catch issues that a single LLM self-critique pass misses. I am building this now, but it should have been part of the original architecture.

Third, I would use n8n's version control features from the start. n8n supports git-backed workflow versioning. I ignored this for months and ended up manually tracking workflow changes in a spreadsheet. Connect your n8n instance to a private git repo early. Every workflow change gets committed with a diff, and you can roll back broken changes in seconds.

The Takeaway

n8n is not a toy. It is a legitimate orchestration layer for production AI workflows, as long as you treat it like production infrastructure. That means proper error handling, credential management, external logging, idempotent operations, and a deployment setup that does not fall over under load. The visual canvas lets you move fast on orchestration logic while your custom code handles the heavy execution. For anyone building autonomous content systems, multi-agent pipelines, or scheduled AI workflows, this stack gets you to production faster than building everything from scratch.

If you are working on an AI automation project and want to talk architecture, orchestration patterns, or getting a pipeline from prototype to reliable production, reach out at lazar-milicevic.com/#contact. I work with companies on exactly these problems, from first workflow to scaled deployment.

Frequently asked questions

Is n8n good enough for production AI workflows or is it just a hobby tool?

n8n is absolutely production-ready for AI automation if you architect it correctly. I run autonomous content and SEO pipelines on it that publish without human intervention, and the key is treating it as an orchestration layer rather than a place to dump all your logic. I separate n8n (which decides what happens next) from custom Node.js functions (which do the heavy lifting via webhooks), so I can debug independently and rebuild workflows quickly if internal state gets corrupted. The entire self-hosted Docker setup costs me under $50/month in infrastructure.

How much does it cost to run n8n with Claude API for automated content generation?

My self-hosted n8n production stack runs on a single VPS with Docker Compose, costing under $50/month in infrastructure. On top of that, Claude API usage for content generation runs roughly $0.15 to $0.40 per published piece, depending on research depth and revision rounds. I use Claude Sonnet for writing tasks and Haiku for classification to keep costs optimized. Compared to rebuilding the same orchestration in pure code, which took weeks, the n8n setup paid for itself almost immediately.

How do you handle errors and failures in n8n production workflows?

I built a dedicated error workflow using n8n's Error Trigger node that does three things: logs the error to PostgreSQL, sends a Slack alert with workflow details, and retries the failed execution up to two times with a 5-minute delay. The retry hits a webhook on the original workflow, which means every workflow must be idempotent: processing the same input twice should produce the same result, not a duplicate. I also set execution timeouts in workflow settings (10 minutes for content generation) because n8n workflows can hang indefinitely if an external service stops responding.

How do you validate LLM output in n8n before publishing or saving to a database?

Never pass raw LLM output directly into a database write or publish action. I use an intermediate Code node that validates content length, meta description length, slug format, and other fields before proceeding. If validation fails, the item gets routed to a manual review queue with a 'needs_review' status instead of crashing the workflow. This pattern has saved me from publishing broken content more times than I can count, because LLMs will eventually return something unexpected, malformed, or incomplete.

What does an n8n AI agent architecture look like for autonomous content pipelines?

I use n8n's AI agent node, which supports the ReAct pattern: the LLM decides which tools to call, executes them, observes results, and loops until it has a final answer. My stack includes n8n for orchestration with cron triggers, Claude API (Sonnet for writing, Haiku for classification), PostgreSQL with pgvector for content storage and RAG, and Slack alerts for monitoring. The agent autonomously researches topics, pulls keyword data, and writes structured content without a hard-coded sequence. The main trade-off is debugging complexity: when a multi-step agent fails, the visual canvas hides execution traces behind a UI not designed for long ReAct loops, so I recommend logging everything to an external store.

Lazar Milićević

Senior Technical Engineer. I build AI automation, GenAI/LLM systems and cloud architecture — autonomous systems that run while you sleep. Founder of BizFlowAI.

Building something hard with AI or automation? I am open to talk.
