<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Supra Builds</title>
    <link>https://suprahuang.cc</link>
    <description>Full-stack product engineer sharing practical insights on modern web development, AI integration, and open source. Focused on building real-world solutions.</description>
    <language>en</language>
    <ttl>60</ttl>
    <atom:link href="https://suprahuang.cc/rss.xml" rel="self" type="application/rss+xml" />
    <image>
      <url>https://cdn.hashnode.com/res/hashnode/image/upload/v1770478457183/d54d5e81-7845-4264-a213-af27eb137d6d.png</url>
      <title>Supra Builds</title>
      <link>https://suprahuang.cc</link>
    </image>
    <item>
      <title>Beyond Chatbots: Building Real-World Stateful AI Agents on Cloudflare</title>
      <link>https://suprahuang.cc/beyond-chatbots-building-stateful-ai-agents-cloudflare-agents-sdk</link>
      <guid isPermaLink="true">https://suprahuang.cc/beyond-chatbots-building-stateful-ai-agents-cloudflare-agents-sdk</guid>
      <description>Most &quot;AI agents&quot; you see today are just LLM wrappers with a fancy prompt. They process a request, return a response, and forget everything. No memory. No scheduling. No persistence.
Real agents are different. They remember what happened yesterday. Th...</description>
      <content:encoded><![CDATA[<p>Most "AI agents" you see today are just LLM wrappers with a fancy prompt. They process a request, return a response, and forget everything. No memory. No scheduling. No persistence.</p>
<p>Real agents are different. They remember what happened yesterday. They wake up at 3 AM to check on things. They pause and ask for human approval when stakes are high. They maintain state across sessions, making decisions based on accumulated context — not just the current prompt.</p>
<p>In this tutorial, we'll build exactly that: a <strong>Smart Site Reliability Agent</strong> that monitors your websites, uses AI to detect anomalies, and escalates critical issues to you — all running on Cloudflare's edge network with zero cost when idle.</p>
<p>No chatbot UI. No conversational fluff. Just a stateful, autonomous agent doing real work.</p>
<hr />
<h2 id="heading-what-makes-an-ai-agent-stateful">What Makes an AI Agent "Stateful"?</h2>
<p>A <strong>stateful AI agent</strong> is a long-running program that persists its memory, decisions, and context across interactions and restarts. Unlike stateless LLM calls where each request starts from scratch, a stateful agent accumulates knowledge over time.</p>
<p>Here's the key difference:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td></td><td>Stateless LLM Wrapper</td><td>Stateful AI Agent</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Memory</strong></td><td>None between requests</td><td>Persistent across sessions</td></tr>
<tr>
<td><strong>Scheduling</strong></td><td>Only responds when called</td><td>Can wake itself up on a schedule</td></tr>
<tr>
<td><strong>Context</strong></td><td>Single conversation turn</td><td>Accumulated history and patterns</td></tr>
<tr>
<td><strong>Decision Making</strong></td><td>Reactive only</td><td>Proactive — acts on its own</td></tr>
<tr>
<td><strong>Cost When Idle</strong></td><td>$0</td><td>$0 (with hibernation)</td></tr>
</tbody>
</table>
</div><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771268561137/b384ebc7-5817-4335-a89e-9f1534cd91fd.webp" alt="Stateful vs Stateless AI Agents: key differences in memory, scheduling, and decision making" class="image--center mx-auto" /></p>
<p>Think of it this way: a stateless LLM call is like asking a stranger for directions every time. A stateful agent is like having an assistant who knows your route, remembers the traffic patterns, and proactively suggests alternatives before you even ask.</p>
<p>The challenge has always been: <strong>where do you run a stateful agent in production?</strong> Traditional serverless functions are stateless by design. Containers require always-on infrastructure. That's where Cloudflare's approach gets interesting.</p>
<hr />
<h2 id="heading-why-cloudflare-for-ai-agents">Why Cloudflare for AI Agents?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771267527413/5e7aa7a7-098e-4d4b-882d-9c9b91d356fa.webp" alt="Cloudflare Agents SDK architecture: Worker routing to Durable Object agents with built-in SQLite, WebSocket, and scheduling" class="image--center mx-auto" /></p>
<p>Cloudflare's <a target="_blank" href="https://github.com/cloudflare/agents">Agents SDK</a> is built on top of <strong>Durable Objects</strong> — essentially stateful micro-servers that live on Cloudflare's global edge network. Each agent instance is its own isolated server with:</p>
<ul>
<li><p><strong>Built-in SQLite database</strong> — No external database needed. Your agent's memory lives right next to its compute.</p>
</li>
<li><p><strong>WebSocket support with hibernation</strong> — Real-time connections that cost nothing when idle. The agent wakes up only when a message arrives.</p>
</li>
<li><p><strong>Scheduled tasks (alarms)</strong> — Cron-like scheduling built into the runtime. Your agent can wake itself up to do work.</p>
</li>
<li><p><strong>Automatic global distribution</strong> — Each agent instance runs closest to where it's needed.</p>
</li>
</ul>
<p>The killer feature? <strong>Hibernation</strong>. When your agent has no active connections and no pending alarms, it literally costs $0. It's like having a dedicated server that only charges you when it's thinking.</p>
<h3 id="heading-when-to-use-what">When to Use What</h3>
<p>Before reaching for the Agents SDK, consider the alternatives:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Use Case</td><td>Best Choice</td></tr>
</thead>
<tbody>
<tr>
<td>Simple request/response AI</td><td>Regular Worker + Workers AI</td></tr>
<tr>
<td>Multi-step background jobs</td><td>Cloudflare Workflows</td></tr>
<tr>
<td>Stateful, long-lived agent with real-time sync</td><td><strong>Agents SDK</strong> ✅</td></tr>
<tr>
<td>Key-value state without real-time</td><td>Durable Objects directly</td></tr>
</tbody>
</table>
</div><p>The Agents SDK shines when you need <strong>persistent state + real-time communication + scheduled tasks</strong> in one package.</p>
<hr />
<h2 id="heading-what-well-build-a-smart-site-reliability-agent">What We'll Build: A Smart Site Reliability Agent</h2>
<p>Our agent isn't a simple uptime checker. It's an AI-powered reliability monitor that:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Feature</td><td>SDK Capability</td></tr>
</thead>
<tbody>
<tr>
<td>⏰ Runs health checks every 5 minutes</td><td><code>scheduleEvery()</code></td></tr>
<tr>
<td>💾 Stores check history in SQLite</td><td><code>this.sql</code></td></tr>
<tr>
<td>🧠 Uses AI to detect anomaly patterns</td><td>AI SDK integration</td></tr>
<tr>
<td>📡 Pushes live updates to a dashboard</td><td>WebSocket + <code>useAgent</code></td></tr>
<tr>
<td>🔧 Supports manual controls via RPC</td><td><code>@callable()</code></td></tr>
<tr>
<td>🚨 Escalates critical issues for human approval</td><td>Human-in-the-loop</td></tr>
</tbody>
</table>
</div><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771268610732/3ad61907-634f-4659-adf6-4829414602f1.webp" alt="Smart Site Reliability Agent: feature overview showing scheduled checks, AI analysis, real-time dashboard, and human-in-the-loop escalation" class="image--center mx-auto" /></p>
<p>By the end, you'll have a fully deployed agent that watches over your sites and thinks about what it sees — not just whether a URL returns 200.</p>
<hr />
<h2 id="heading-project-setup">Project Setup</h2>
<h3 id="heading-prerequisites">Prerequisites</h3>
<ul>
<li><p>Node.js 20+ (Node 24+ recommended)</p>
</li>
<li><p>A <a target="_blank" href="https://dash.cloudflare.com/sign-up">Cloudflare account</a> (Workers Paid plan for Durable Objects)</p>
</li>
<li><p>An API key from any LLM provider (OpenAI, Anthropic, or Cloudflare Workers AI)</p>
</li>
</ul>
<h3 id="heading-scaffold-the-project">Scaffold the Project</h3>
<pre><code class="lang-bash">npm create cloudflare@latest site-reliability-agent -- --template cloudflare/agents-starter
<span class="hljs-built_in">cd</span> site-reliability-agent
npm install
</code></pre>
<h3 id="heading-project-structure">Project Structure</h3>
<pre><code class="lang-plaintext">site-reliability-agent/
├── src/
│   ├── server.ts          # Agent class + Worker entry
│   └── client.tsx         # React dashboard with useAgent
├── wrangler.jsonc         # Cloudflare configuration
├── .dev.vars              # Local secrets (API keys)
└── package.json
</code></pre>
<h3 id="heading-wrangler-configuration">Wrangler Configuration</h3>
<pre><code class="lang-plaintext">// wrangler.jsonc
{
  "name": "site-reliability-agent",
  "main": "src/server.ts",
  "compatibility_flags": ["nodejs_compat"],
  "durable_objects": {
    "bindings": [
      {
        "name": "SiteAgent",
        "class_name": "SiteAgent"
      }
    ]
  },
  "migrations": [
    {
      "tag": "v1",
      "new_sqlite_classes": ["SiteAgent"]
    }
  ]
}
</code></pre>
<p>Add your LLM API key to <code>.dev.vars</code>:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># .dev.vars (never commit this file)</span>
OPENAI_API_KEY=sk-your-key-here
</code></pre>
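<p>One step that trips people up at deploy time: <code>.dev.vars</code> only feeds local development. For the deployed Worker, the same key has to be stored as an encrypted secret via Wrangler (shown here as a reminder; the secret name must match what your code reads):</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Upload the key as an encrypted Worker secret for production</span>
npx wrangler secret put OPENAI_API_KEY
<span class="hljs-comment"># Wrangler prompts for the value; it never lands in wrangler.jsonc or git</span>
</code></pre>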
<hr />
<h2 id="heading-building-the-agent-core">Building the Agent Core</h2>
<h3 id="heading-defining-state-and-the-agent-class">Defining State and the Agent Class</h3>
<p>Let's start with the agent's state shape and core class:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// src/server.ts</span>
<span class="hljs-keyword">import</span> { Agent, routeAgentRequest } <span class="hljs-keyword">from</span> <span class="hljs-string">"agents"</span>;

<span class="hljs-keyword">type</span> Env = {
  SiteAgent: DurableObjectNamespace;
  OPENAI_API_KEY: <span class="hljs-built_in">string</span>;
};

<span class="hljs-keyword">type</span> SiteStatus = <span class="hljs-string">"healthy"</span> | <span class="hljs-string">"degraded"</span> | <span class="hljs-string">"down"</span> | <span class="hljs-string">"unknown"</span>;

<span class="hljs-keyword">type</span> AgentState = {
  monitoredUrls: <span class="hljs-built_in">string</span>[];
  checkIntervalMinutes: <span class="hljs-built_in">number</span>;
  lastCheckAt: <span class="hljs-built_in">string</span> | <span class="hljs-literal">null</span>;
  currentStatus: Record&lt;<span class="hljs-built_in">string</span>, SiteStatus&gt;;
  alertsEnabled: <span class="hljs-built_in">boolean</span>;
  pendingEscalation: {
    url: <span class="hljs-built_in">string</span>;
    reason: <span class="hljs-built_in">string</span>;
    timestamp: <span class="hljs-built_in">string</span>;
  } | <span class="hljs-literal">null</span>;
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> SiteAgent <span class="hljs-keyword">extends</span> Agent&lt;Env, AgentState&gt; {
  <span class="hljs-comment">// Default state when the agent is first created</span>
  initialState: AgentState = {
    monitoredUrls: [],
    checkIntervalMinutes: <span class="hljs-number">5</span>,
    lastCheckAt: <span class="hljs-literal">null</span>,
    currentStatus: {},
    alertsEnabled: <span class="hljs-literal">true</span>,
    pendingEscalation: <span class="hljs-literal">null</span>,
  };

  <span class="hljs-keyword">async</span> onStart() {
    <span class="hljs-comment">// Initialize the SQLite table for check history</span>
    <span class="hljs-built_in">this</span>.sql<span class="hljs-string">`
      CREATE TABLE IF NOT EXISTS check_history (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        url TEXT NOT NULL,
        status_code INTEGER,
        response_time_ms INTEGER,
        status TEXT NOT NULL,
        ai_analysis TEXT,
        checked_at TEXT DEFAULT (datetime('now'))
      )
    `</span>;
  }
}
</code></pre>
<p>A few things to notice:</p>
<ul>
<li><p><code>initialState</code> sets the default state for new agent instances</p>
</li>
<li><p><code>this.sql</code> is a tagged template literal — it gives you direct SQLite access, no ORM needed</p>
</li>
<li><p>State updates via <code>setState()</code> are automatically synced to all connected WebSocket clients</p>
</li>
</ul>
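<p>One caveat worth internalizing early: <code>setState()</code> replaces the stored state object rather than shallow-merging it (the SDK's own examples spread the previous state for exactly this reason). A minimal sketch of the replace semantics, using a stand-in function rather than a real agent:</p>
<pre><code class="lang-typescript">type AgentState = {
  monitoredUrls: string[];
  alertsEnabled: boolean;
};

// Stand-in for the SDK's replace semantics: the new object wins wholesale
function replaceState(_current: AgentState, next: AgentState): AgentState {
  return next;
}

const state: AgentState = {
  monitoredUrls: ["https://example.com"],
  alertsEnabled: true,
};

// Spread the existing state, then override only the field being changed;
// passing { alertsEnabled: false } alone would drop monitoredUrls
const updated = replaceState(state, { ...state, alertsEnabled: false });
</code></pre>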
<h3 id="heading-health-check-logic-with-scheduled-tasks">Health Check Logic with Scheduled Tasks</h3>
<p>Now let's add the scheduled health checks:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// Inside the SiteAgent class</span>

<span class="hljs-keyword">async</span> onStart() {
  <span class="hljs-comment">// ... SQLite init from above ...</span>

  <span class="hljs-comment">// Start the health check schedule</span>
  <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.state.monitoredUrls.length &gt; <span class="hljs-number">0</span>) {
    <span class="hljs-built_in">this</span>.scheduleEvery(<span class="hljs-string">"runHealthChecks"</span>, <span class="hljs-string">`*/<span class="hljs-subst">${<span class="hljs-built_in">this</span>.state.checkIntervalMinutes}</span> * * * *`</span>);
  }
}

<span class="hljs-keyword">async</span> runHealthChecks() {
  <span class="hljs-keyword">const</span> results: Record&lt;<span class="hljs-built_in">string</span>, SiteStatus&gt; = {};

  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> url <span class="hljs-keyword">of</span> <span class="hljs-built_in">this</span>.state.monitoredUrls) {
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.checkUrl(url);
    results[url] = result.status;

    <span class="hljs-comment">// Store in SQLite</span>
    <span class="hljs-built_in">this</span>.sql<span class="hljs-string">`
      INSERT INTO check_history (url, status_code, response_time_ms, status)
      VALUES (<span class="hljs-subst">${url}</span>, <span class="hljs-subst">${result.statusCode}</span>, <span class="hljs-subst">${result.responseTime}</span>, <span class="hljs-subst">${result.status}</span>)
    `</span>;
  }

  <span class="hljs-built_in">this</span>.setState({
    ...<span class="hljs-built_in">this</span>.state, <span class="hljs-comment">// setState replaces the whole state object, so carry existing fields</span>
    currentStatus: results,
    lastCheckAt: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
  });

  <span class="hljs-comment">// Broadcast to all connected dashboard clients</span>
  <span class="hljs-built_in">this</span>.broadcast(<span class="hljs-built_in">JSON</span>.stringify({
    <span class="hljs-keyword">type</span>: <span class="hljs-string">"health_check_complete"</span>,
    results,
    timestamp: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
  }));
}

<span class="hljs-keyword">private</span> <span class="hljs-keyword">async</span> checkUrl(url: <span class="hljs-built_in">string</span>): <span class="hljs-built_in">Promise</span>&lt;{
  statusCode: <span class="hljs-built_in">number</span>;
  responseTime: <span class="hljs-built_in">number</span>;
  status: SiteStatus;
}&gt; {
  <span class="hljs-keyword">const</span> start = <span class="hljs-built_in">Date</span>.now();

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(url, {
      method: <span class="hljs-string">"GET"</span>,
      signal: AbortSignal.timeout(<span class="hljs-number">10</span>_000), <span class="hljs-comment">// 10s timeout</span>
    });

    <span class="hljs-keyword">const</span> responseTime = <span class="hljs-built_in">Date</span>.now() - start;
    <span class="hljs-keyword">let</span> status: SiteStatus = <span class="hljs-string">"healthy"</span>;

    <span class="hljs-keyword">if</span> (!response.ok) {
      status = response.status &gt;= <span class="hljs-number">500</span> ? <span class="hljs-string">"down"</span> : <span class="hljs-string">"degraded"</span>;
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (responseTime &gt; <span class="hljs-number">3000</span>) {
      status = <span class="hljs-string">"degraded"</span>;
    }

    <span class="hljs-keyword">return</span> { statusCode: response.status, responseTime, status };
  } <span class="hljs-keyword">catch</span> {
    <span class="hljs-keyword">return</span> { statusCode: <span class="hljs-number">0</span>, responseTime: <span class="hljs-built_in">Date</span>.now() - start, status: <span class="hljs-string">"down"</span> };
  }
}
</code></pre>
<p>The <code>scheduleEvery</code> method accepts a cron expression. Every 5 minutes, the agent wakes up from hibernation, runs all health checks, stores results, updates its state, and broadcasts to any connected dashboards — then goes back to sleep.</p>
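<p>One sanity check on the cron string: an interval like <code>*/7 * * * *</code> fires at minutes 0, 7, 14, …, 56 and then again at 0, leaving a single 4-minute gap every hour. A small validation helper (hypothetical, not part of the SDK) keeps intervals even:</p>
<pre><code class="lang-typescript">// Minute intervals that divide 60 evenly, so "*/n" fires at a constant gap
const VALID_INTERVALS = [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30];

// Build an "every N minutes" cron expression, rejecting uneven intervals
function everyNMinutesCron(minutes: number): string {
  if (!VALID_INTERVALS.includes(minutes)) {
    throw new Error("interval must divide 60 evenly: " + VALID_INTERVALS.join(", "));
  }
  return "*/" + minutes + " * * * *";
}
</code></pre>
<p>With the default <code>checkIntervalMinutes</code> of 5, <code>everyNMinutesCron(5)</code> yields <code>*/5 * * * *</code>, the same expression used above.</p>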
<h3 id="heading-querying-history-with-sqlite">Querying History with SQLite</h3>
<p>The built-in SQLite database makes historical queries trivial:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// Inside the SiteAgent class</span>

<span class="hljs-keyword">private</span> getRecentHistory(url: <span class="hljs-built_in">string</span>, limit = <span class="hljs-number">20</span>) {
  <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.sql&lt;{
    status_code: <span class="hljs-built_in">number</span>;
    response_time_ms: <span class="hljs-built_in">number</span>;
    status: <span class="hljs-built_in">string</span>;
    ai_analysis: <span class="hljs-built_in">string</span> | <span class="hljs-literal">null</span>;
    checked_at: <span class="hljs-built_in">string</span>;
  }&gt;<span class="hljs-string">`
    SELECT status_code, response_time_ms, status, ai_analysis, checked_at
    FROM check_history
    WHERE url = <span class="hljs-subst">${url}</span>
    ORDER BY checked_at DESC
    LIMIT <span class="hljs-subst">${limit}</span>
  `</span>;
}

<span class="hljs-keyword">private</span> getStatusTrend(url: <span class="hljs-built_in">string</span>) {
  <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.sql&lt;{ status: <span class="hljs-built_in">string</span>; count: <span class="hljs-built_in">number</span> }&gt;<span class="hljs-string">`
    SELECT status, COUNT(*) as count
    FROM check_history
    WHERE url = <span class="hljs-subst">${url}</span>
      AND checked_at &gt; datetime('now', '-1 hour')
    GROUP BY status
  `</span>;
}
</code></pre>
<p>No external database. No connection strings. No cold starts on DB connections. The data lives right next to the agent's compute.</p>
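<p>Those trend rows also fold neatly into a single metric for the dashboard or the AI prompt. A sketch (the row shape mirrors the query above; the helper itself is hypothetical and counts "degraded" as up):</p>
<pre><code class="lang-typescript">type TrendRow = { status: string; count: number };

// Collapse one hour of grouped check counts into an uptime percentage,
// treating "degraded" as up and only "down" as an outage
function uptimePercent(rows: TrendRow[]): number {
  let total = 0;
  let down = 0;
  for (const row of rows) {
    total += row.count;
    if (row.status === "down") down += row.count;
  }
  if (total === 0) return 100; // no data yet, assume healthy
  return Math.round(((total - down) / total) * 1000) / 10;
}
</code></pre>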
<hr />
<h2 id="heading-adding-ai-powered-analysis">Adding AI-Powered Analysis</h2>
<p>This is where our agent goes from "uptime checker" to "site reliability engineer." Instead of just checking status codes, we feed the check history to an LLM for pattern analysis.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { generateText } <span class="hljs-keyword">from</span> <span class="hljs-string">"ai"</span>;
<span class="hljs-keyword">import</span> { openai } <span class="hljs-keyword">from</span> <span class="hljs-string">"@ai-sdk/openai"</span>;

<span class="hljs-comment">// Inside the SiteAgent class</span>

<span class="hljs-keyword">async</span> runHealthChecks() {
  <span class="hljs-comment">// ... health check logic from above (it builds the `results` map used below) ...</span>

  <span class="hljs-comment">// After checks complete, ask AI to analyze patterns</span>
  <span class="hljs-keyword">const</span> hasIssues = <span class="hljs-built_in">Object</span>.values(results).some(
    <span class="hljs-function">(<span class="hljs-params">s</span>) =&gt;</span> s === <span class="hljs-string">"degraded"</span> || s === <span class="hljs-string">"down"</span>
  );

  <span class="hljs-keyword">if</span> (hasIssues) {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.analyzeWithAI(results);
  }
}

<span class="hljs-keyword">private</span> <span class="hljs-keyword">async</span> analyzeWithAI(currentResults: Record&lt;<span class="hljs-built_in">string</span>, SiteStatus&gt;) {
  <span class="hljs-comment">// Gather recent history for context</span>
  <span class="hljs-keyword">const</span> historyByUrl: Record&lt;<span class="hljs-built_in">string</span>, <span class="hljs-built_in">any</span>[]&gt; = {};
  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> url <span class="hljs-keyword">of</span> <span class="hljs-built_in">this</span>.state.monitoredUrls) {
    historyByUrl[url] = <span class="hljs-built_in">this</span>.getRecentHistory(url, <span class="hljs-number">10</span>);
  }

  <span class="hljs-keyword">const</span> { text: analysis } = <span class="hljs-keyword">await</span> generateText({
    model: openai(<span class="hljs-string">"gpt-4o-mini"</span>),
    system: <span class="hljs-string">`You are a site reliability engineer analyzing website health data.
Be concise and actionable. Focus on patterns, not individual data points.
Flag anything that suggests an emerging problem, not just current outages.`</span>,
    prompt: <span class="hljs-string">`Current check results: <span class="hljs-subst">${<span class="hljs-built_in">JSON</span>.stringify(currentResults)}</span>

Recent history (last 10 checks per URL):
<span class="hljs-subst">${<span class="hljs-built_in">JSON</span>.stringify(historyByUrl, <span class="hljs-literal">null</span>, <span class="hljs-number">2</span>)}</span>

Analyze:
1. Are there any concerning patterns (increasing latency, intermittent failures)?
2. Is this likely a transient issue or systematic problem?
3. Recommended action: MONITOR, INVESTIGATE, or ESCALATE?`</span>,
  });

  <span class="hljs-comment">// Store the analysis</span>
  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> [url, status] <span class="hljs-keyword">of</span> <span class="hljs-built_in">Object</span>.entries(currentResults)) {
    <span class="hljs-keyword">if</span> (status !== <span class="hljs-string">"healthy"</span>) {
      <span class="hljs-built_in">this</span>.sql<span class="hljs-string">`
        UPDATE check_history
        SET ai_analysis = <span class="hljs-subst">${analysis}</span>
        WHERE url = <span class="hljs-subst">${url}</span>
        AND id = (SELECT MAX(id) FROM check_history WHERE url = <span class="hljs-subst">${url}</span>)
      `</span>;
    }
  }

  <span class="hljs-comment">// If AI recommends escalation, trigger human-in-the-loop</span>
  <span class="hljs-keyword">if</span> (analysis.includes(<span class="hljs-string">"ESCALATE"</span>)) {
    <span class="hljs-built_in">this</span>.setState({
      ...<span class="hljs-built_in">this</span>.state, <span class="hljs-comment">// preserve fields not being updated</span>
      pendingEscalation: {
        url: <span class="hljs-built_in">Object</span>.entries(currentResults)
          .filter(<span class="hljs-function">(<span class="hljs-params">[, s]</span>) =&gt;</span> s !== <span class="hljs-string">"healthy"</span>)
          .map(<span class="hljs-function">(<span class="hljs-params">[u]</span>) =&gt;</span> u)
          .join(<span class="hljs-string">", "</span>),
        reason: analysis,
        timestamp: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
      },
    });

    <span class="hljs-built_in">this</span>.broadcast(<span class="hljs-built_in">JSON</span>.stringify({
      <span class="hljs-keyword">type</span>: <span class="hljs-string">"escalation_required"</span>,
      analysis,
      timestamp: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
    }));
  }
}
</code></pre>
<p>The AI doesn't just check if a site is up — it looks at <strong>patterns</strong>. Is response time gradually increasing? Are failures clustered at specific times? Is this a CDN issue or an origin server problem? These are the kinds of insights that turn raw data into actionable intelligence.</p>
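<p>A note on the <code>analysis.includes("ESCALATE")</code> check above: since the prompt explicitly asks for a "Recommended action", the detection can anchor on that phrase instead of a bare substring match. A hypothetical parser (the keyword fallback is still best-effort):</p>
<pre><code class="lang-typescript">type Recommendation = "MONITOR" | "INVESTIGATE" | "ESCALATE";

// Pull the recommendation out of the AI's free-text analysis. Prefers an
// explicit "Recommended action: X" phrase, then falls back to the most
// severe keyword mentioned, defaulting to MONITOR.
function parseRecommendation(analysis: string): Recommendation {
  const explicit = analysis.match(/recommended action:?\s*(MONITOR|INVESTIGATE|ESCALATE)/i);
  if (explicit) return explicit[1]!.toUpperCase() as Recommendation;
  if (analysis.includes("ESCALATE")) return "ESCALATE";
  if (analysis.includes("INVESTIGATE")) return "INVESTIGATE";
  return "MONITOR";
}
</code></pre>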
<hr />
<h2 id="heading-real-time-dashboard-with-useagent">Real-Time Dashboard with useAgent</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771267615139/2520cf8a-bbd7-4fec-8e73-ff7e5cd5e61f.webp" alt="Real-time monitoring dashboard with WebSocket state sync" class="image--center mx-auto" /></p>
<p>The agent handles the backend. Now let's build a React frontend that stays in sync via WebSocket.</p>
<h3 id="heading-connecting-with-useagent">Connecting with useAgent</h3>
<pre><code class="lang-tsx">// src/client.tsx
import { useAgent } from "agents/react";

function Dashboard() {
  const agent = useAgent&lt;SiteAgent, AgentState&gt;({
    agent: "site-agent",
    name: "my-sites", // Each unique name = unique agent instance
  });

  if (!agent.state) return &lt;div&gt;Connecting to agent...&lt;/div&gt;;

  return (
    &lt;div className="dashboard"&gt;
      &lt;header&gt;
        &lt;h1&gt;Site Reliability Agent&lt;/h1&gt;
        &lt;span className="last-check"&gt;
          Last check: {agent.state.lastCheckAt ?? "Never"}
        &lt;/span&gt;
      &lt;/header&gt;

      &lt;div className="status-grid"&gt;
        {agent.state.monitoredUrls.map((url) =&gt; (
          &lt;StatusCard
            key={url}
            url={url}
            status={agent.state.currentStatus[url] ?? "unknown"}
          /&gt;
        ))}
      &lt;/div&gt;

      {agent.state.pendingEscalation &amp;&amp; (
        &lt;EscalationBanner
          escalation={agent.state.pendingEscalation}
          onApprove={() =&gt; agent.stub.acknowledgeEscalation()}
          onDismiss={() =&gt; agent.stub.dismissEscalation()}
        /&gt;
      )}

      &lt;ManualControls agent={agent} /&gt;
    &lt;/div&gt;
  );
}
</code></pre>
<p>When the agent calls <code>setState()</code>, every connected dashboard updates instantly — no polling, no refetching. The <code>useAgent</code> hook handles WebSocket connection, reconnection, and state synchronization automatically.</p>
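<p>State sync covers <code>agent.state</code>, but the <code>broadcast()</code> payloads sent earlier arrive as plain WebSocket messages that need parsing and narrowing before use. A hedged sketch of a parser for the message shapes this agent emits (wire it into whatever message callback your client uses):</p>
<pre><code class="lang-typescript">type AgentMessage =
  | { type: "health_check_complete"; results: { [url: string]: string }; timestamp: string }
  | { type: "escalation_required"; analysis: string; timestamp: string }
  | { type: "escalation_resolved"; action: string; timestamp: string };

// Parse a raw WebSocket payload into one of the agent's broadcast shapes,
// returning null for anything unrecognized (e.g. internal sync frames)
function parseAgentMessage(raw: string): AgentMessage | null {
  const known = ["health_check_complete", "escalation_required", "escalation_resolved"];
  try {
    const data = JSON.parse(raw);
    if (data === null) return null;
    if (typeof data.type !== "string") return null;
    if (!known.includes(data.type)) return null;
    return data as AgentMessage;
  } catch {
    return null;
  }
}
</code></pre>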
<h3 id="heading-callable-methods-for-manual-controls">Callable Methods for Manual Controls</h3>
<p>The <code>@callable()</code> decorator exposes server-side methods that the frontend can call with full type safety:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// In src/server.ts — inside SiteAgent class</span>

<span class="hljs-meta">@callable</span>()
<span class="hljs-keyword">async</span> addUrl(url: <span class="hljs-built_in">string</span>) {
  <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.state.monitoredUrls.includes(url)) {
    <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">false</span>, error: <span class="hljs-string">"URL already monitored"</span> };
  }

  <span class="hljs-built_in">this</span>.setState({
    ...<span class="hljs-built_in">this</span>.state, <span class="hljs-comment">// preserve fields not being updated</span>
    monitoredUrls: [...this.state.monitoredUrls, url],
    currentStatus: { ...this.state.currentStatus, [url]: <span class="hljs-string">"unknown"</span> },
  });

  <span class="hljs-comment">// Start the schedule when the first URL is added</span>
  <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.state.monitoredUrls.length === <span class="hljs-number">1</span>) {
    <span class="hljs-built_in">this</span>.scheduleEvery(
      <span class="hljs-string">"runHealthChecks"</span>,
      <span class="hljs-string">`*/<span class="hljs-subst">${<span class="hljs-built_in">this</span>.state.checkIntervalMinutes}</span> * * * *`</span>
    );
  }

  <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">true</span> };
}

<span class="hljs-meta">@callable</span>()
<span class="hljs-keyword">async</span> removeUrl(url: <span class="hljs-built_in">string</span>) {
  <span class="hljs-built_in">this</span>.setState({
    ...<span class="hljs-built_in">this</span>.state,
    monitoredUrls: <span class="hljs-built_in">this</span>.state.monitoredUrls.filter(<span class="hljs-function">(<span class="hljs-params">u</span>) =&gt;</span> u !== url),
    currentStatus: <span class="hljs-built_in">Object</span>.fromEntries(
      <span class="hljs-built_in">Object</span>.entries(<span class="hljs-built_in">this</span>.state.currentStatus).filter(<span class="hljs-function">(<span class="hljs-params">[u]</span>) =&gt;</span> u !== url)
    ),
  });

  <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">true</span> };
}

<span class="hljs-meta">@callable</span>()
<span class="hljs-keyword">async</span> triggerManualCheck() {
  <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.runHealthChecks();
  <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">true</span>, checkedAt: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString() };
}
</code></pre>
<p>On the client, calling these is as simple as:</p>
<pre><code class="lang-tsx">// Type-safe RPC — no manual fetch calls needed
await agent.stub.addUrl("https://example.com");
await agent.stub.triggerManualCheck();
</code></pre>
<hr />
<h2 id="heading-human-in-the-loop-escalation-that-works">Human-in-the-Loop: Escalation That Works</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771268656337/900ae3a8-662f-42ac-8f4c-9ef9069bab7c.webp" alt="Human-in-the-loop escalation flow: AI detects pattern, agent pauses, human decides, agent resumes" class="image--center mx-auto" /></p>
<p>When the AI detects something serious, the agent doesn't just log it — it pauses and waits for human judgment:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// In SiteAgent class</span>

<span class="hljs-meta">@callable</span>()
<span class="hljs-keyword">async</span> acknowledgeEscalation() {
  <span class="hljs-keyword">const</span> escalation = <span class="hljs-built_in">this</span>.state.pendingEscalation;
  <span class="hljs-keyword">if</span> (!escalation) <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">false</span>, error: <span class="hljs-string">"No pending escalation"</span> };

  <span class="hljs-comment">// Log the acknowledgment</span>
  <span class="hljs-built_in">this</span>.sql<span class="hljs-string">`
    INSERT INTO check_history (url, status_code, response_time_ms, status, ai_analysis)
    VALUES (
      <span class="hljs-subst">${escalation.url}</span>,
      0,
      0,
      'acknowledged',
      <span class="hljs-subst">${<span class="hljs-string">'Human acknowledged: '</span> + escalation.reason}</span>
    )
  `</span>;

  <span class="hljs-comment">// Clear the escalation</span>
  <span class="hljs-built_in">this</span>.setState({ ...<span class="hljs-built_in">this</span>.state, pendingEscalation: <span class="hljs-literal">null</span> });

  <span class="hljs-built_in">this</span>.broadcast(<span class="hljs-built_in">JSON</span>.stringify({
    <span class="hljs-keyword">type</span>: <span class="hljs-string">"escalation_resolved"</span>,
    action: <span class="hljs-string">"acknowledged"</span>,
    timestamp: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
  }));

  <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">true</span> };
}

<span class="hljs-meta">@callable</span>()
<span class="hljs-keyword">async</span> dismissEscalation() {
  <span class="hljs-built_in">this</span>.setState({ ...<span class="hljs-built_in">this</span>.state, pendingEscalation: <span class="hljs-literal">null</span> });

  <span class="hljs-built_in">this</span>.broadcast(<span class="hljs-built_in">JSON</span>.stringify({
    <span class="hljs-keyword">type</span>: <span class="hljs-string">"escalation_resolved"</span>,
    action: <span class="hljs-string">"dismissed"</span>,
    timestamp: <span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>().toISOString(),
  }));

  <span class="hljs-keyword">return</span> { success: <span class="hljs-literal">true</span> };
}
</code></pre>
<p>The escalation flow works like this:</p>
<ol>
<li><p><strong>AI detects a pattern</strong> → Recommends <code>ESCALATE</code></p>
</li>
<li><p><strong>Agent updates state</strong> → <code>pendingEscalation</code> is set</p>
</li>
<li><p><strong>Dashboard shows banner</strong> → Human sees the AI's analysis and reasoning</p>
</li>
<li><p><strong>Human decides</strong> → Acknowledge (take action) or Dismiss (false alarm)</p>
</li>
<li><p><strong>Agent records the decision</strong> → Builds a history of escalations for future AI context</p>
</li>
</ol>
<p>This is the real power of stateful agents: they can <strong>pause, wait, and resume</strong> based on human input without losing their context.</p>
<hr />
<h2 id="heading-worker-entry-point">Worker Entry Point</h2>
<p>Don't forget the Worker entry that routes requests to agent instances:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// At the bottom of src/server.ts</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> {
  <span class="hljs-keyword">async</span> fetch(request: Request, env: Env) {
    <span class="hljs-comment">// Route to the correct agent instance; 404 for non-agent paths</span>
    <span class="hljs-keyword">return</span> (<span class="hljs-keyword">await</span> routeAgentRequest(request, env)) || <span class="hljs-keyword">new</span> Response(<span class="hljs-string">"Not found"</span>, { status: <span class="hljs-number">404</span> });
  },
} satisfies ExportedHandler&lt;Env&gt;;
</code></pre>
<p>The <code>routeAgentRequest</code> function dispatches requests to the right Durable Object instance based on the URL pattern: <code>/agents/site-agent/:instance-name</code>.</p>
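<p>For example, a client can derive its connection URL from that pattern. A minimal sketch, where the instance name <code>prod-monitor</code> and the local port are illustrative choices, not SDK requirements:</p>

```typescript
// The router maps /agents/:agent/:name to a Durable Object instance.
// "site-agent" is the kebab-cased agent class name; the last path segment
// selects (or lazily creates) the named instance.
const base = "http://localhost:8787";
const instance = "prod-monitor"; // illustrative instance name
const agentUrl = `${base}/agents/site-agent/${instance}`;

console.log(agentUrl); // http://localhost:8787/agents/site-agent/prod-monitor

// An HTTP request or WebSocket upgrade to agentUrl lands in that instance:
// await fetch(agentUrl);
```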
<hr />
<h2 id="heading-testing-and-deploying-to-production">Testing and Deploying to Production</h2>
<h3 id="heading-local-development">Local Development</h3>
<pre><code class="lang-bash">npx wrangler dev
</code></pre>
<p>This starts a local development server with full Durable Object support. Your agent runs with real SQLite, real WebSocket connections, and real scheduling — closely mirroring how it behaves in production.</p>
<p>Open <code>http://localhost:8787</code> to see your dashboard. Add a URL and watch the agent start monitoring.</p>
<h3 id="heading-deploy-to-cloudflare">Deploy to Cloudflare</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Set your API key as a secret</span>
npx wrangler secret put OPENAI_API_KEY

<span class="hljs-comment"># Deploy</span>
npx wrangler deploy
</code></pre>
<p>Your agent is now live on Cloudflare's global network. Each unique instance name creates an isolated agent with its own state, database, and schedule.</p>
<h3 id="heading-environment-separation">Environment Separation</h3>
<p>For staging vs production, use <a target="_blank" href="https://developers.cloudflare.com/workers/wrangler/environments/">wrangler environments</a>:</p>
<pre><code class="lang-plaintext">// wrangler.jsonc
{
  "name": "site-reliability-agent",
  "env": {
    "staging": {
      "name": "site-reliability-agent-staging",
      "vars": { "ENVIRONMENT": "staging" }
    },
    "production": {
      "name": "site-reliability-agent",
      "vars": { "ENVIRONMENT": "production" }
    }
  }
}
</code></pre>
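<p>With those environments defined, you pick the target at deploy time using wrangler's <code>--env</code> flag:</p>

```bash
# Deploys site-reliability-agent-staging with the staging vars
npx wrangler deploy --env staging

# Deploys the production worker with the production overrides
npx wrangler deploy --env production
```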
<hr />
<h2 id="heading-performance-limits-and-cost-breakdown">Performance, Limits, and Cost Breakdown</h2>
<h3 id="heading-cloudflare-agents-limits">Cloudflare Agents Limits</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Resource</td><td>Limit</td></tr>
</thead>
<tbody>
<tr>
<td>CPU time per request</td><td>30 seconds (refreshes per event)</td></tr>
<tr>
<td>Memory per instance</td><td>128 MB</td></tr>
<tr>
<td>SQLite storage</td><td>1 GB per Durable Object</td></tr>
<tr>
<td>WebSocket connections</td><td>32,768 per instance</td></tr>
<tr>
<td>Alarm precision</td><td>~1 second</td></tr>
</tbody>
</table>
</div><h3 id="heading-cost-estimate">Cost Estimate</h3>
<p>For a typical monitoring setup (100 URLs, checked every 5 minutes):</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Monthly Cost</td></tr>
</thead>
<tbody>
<tr>
<td>Worker requests (routing)</td><td>~$0.50</td></tr>
<tr>
<td>Durable Object requests</td><td>~$2.00</td></tr>
<tr>
<td>Durable Object duration</td><td>~$1.50</td></tr>
<tr>
<td>SQLite storage (1 GB)</td><td>$0.20</td></tr>
<tr>
<td>AI API calls (OpenAI)</td><td>~$5.00</td></tr>
<tr>
<td><strong>Total</strong></td><td><strong>~$9.20/month</strong></td></tr>
</tbody>
</table>
</div><p>Compare this to running the same setup on AWS (Lambda + DynamoDB + EventBridge + API Gateway), where you'd easily spend $20-30/month for equivalent functionality — plus the engineering overhead of wiring all those services together.</p>
<p>The real savings come from <strong>hibernation</strong>. Your agent only consumes resources when it's actively checking sites or serving dashboard requests. Between checks, the cost is effectively zero.</p>
<hr />
<h2 id="heading-common-pitfalls-i-learned-the-hard-way">Common Pitfalls I Learned the Hard Way</h2>
<h3 id="heading-1-the-destroy-lifecycle-trap">1. The <code>destroy()</code> Lifecycle Trap</h3>
<p>When a Durable Object is evicted from memory, it doesn't call any cleanup hooks. If you're relying on in-memory state that isn't persisted via <code>setState()</code> or SQLite, it will be lost. <strong>Always persist important data immediately</strong> — don't batch writes.</p>
<h3 id="heading-2-state-serialization-limits">2. State Serialization Limits</h3>
<p><code>setState()</code> serializes your state as JSON. This means:</p>
<ul>
<li><p>No <code>Date</code> objects (use ISO strings)</p>
</li>
<li><p>No <code>Map</code> or <code>Set</code> (use plain objects and arrays)</p>
</li>
<li><p>No circular references</p>
</li>
<li><p>Keep state reasonably small — it's synced to every connected client</p>
</li>
</ul>
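<p>A quick sketch of making state JSON-safe before handing it to <code>setState()</code> — the field names here are illustrative, not part of the SDK:</p>

```typescript
// Convert non-JSON-safe values up front, not at read time.
const lastCheck = new Date("2026-02-17T10:00:00Z");
const statusByUrl = new Map([["https://example.com", "healthy"]]);

const state = {
  lastCheckAt: lastCheck.toISOString(),          // Date -> ISO string
  statusByUrl: Object.fromEntries(statusByUrl),  // Map -> plain object
};

// A JSON round-trip is lossless for this shape, so it survives
// serialization and the sync to every connected client.
const restored = JSON.parse(JSON.stringify(state));
console.log(restored.lastCheckAt);                        // 2026-02-17T10:00:00.000Z
console.log(restored.statusByUrl["https://example.com"]); // healthy
```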
<h3 id="heading-3-alarm-retry-behavior">3. Alarm Retry Behavior</h3>
<p>If your scheduled handler throws an error, Cloudflare will retry it. This is usually good, but if your handler isn't idempotent (e.g., it sends notifications), you'll get duplicate actions. Always design handlers to be safe to retry.</p>
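<p>One pattern that helps: derive a deduplication key from the event and check it before acting. A minimal in-memory sketch — <code>notifyOnce</code> is a made-up helper, and in a real agent you would persist the key in SQLite so it survives eviction:</p>

```typescript
// Guard side effects with a dedup key so a retried handler becomes a no-op.
const sentNotifications = new Set<string>();

function notifyOnce(key: string, send: () => void): boolean {
  if (sentNotifications.has(key)) return false; // already handled: skip
  sentNotifications.add(key);
  send();
  return true;
}

let sends = 0;
const key = "outage:example.com:2026-02-17T10:00";
notifyOnce(key, () => sends++); // first run: sends the alert
notifyOnce(key, () => sends++); // retried alarm: deduplicated
console.log(sends); // 1
```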
<h3 id="heading-4-websocket-reconnection">4. WebSocket Reconnection</h3>
<p>Clients will disconnect — networks are unreliable. The <code>useAgent</code> hook handles reconnection automatically, but your UI should gracefully handle the "reconnecting" state. Always show the last known state while reconnecting, rather than a blank screen.</p>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>We built a stateful AI agent that goes well beyond chat:</p>
<ul>
<li><p><strong>Scheduled health checks</strong> that run autonomously on cron</p>
</li>
<li><p><strong>Persistent memory</strong> via built-in SQLite — no external database needed</p>
</li>
<li><p><strong>AI-powered analysis</strong> that spots patterns, not just failures</p>
</li>
<li><p><strong>Real-time dashboard</strong> with automatic WebSocket state sync</p>
</li>
<li><p><strong>Human-in-the-loop</strong> escalation for critical decisions</p>
</li>
</ul>
<p>The Cloudflare Agents SDK makes this surprisingly straightforward. The combination of Durable Objects (state + compute), built-in SQLite (persistent memory), WebSocket hibernation (zero idle cost), and scheduled alarms (autonomous execution) creates a platform where stateful agents are a first-class concept — not something you have to hack together from five different services.</p>
<h3 id="heading-whats-next">What's Next</h3>
<p>This is just the beginning. From here, you could:</p>
<ul>
<li><p><strong>Add MCP server support</strong> — Expose your agent as a Model Context Protocol server so AI assistants like Claude can interact with it</p>
</li>
<li><p><strong>Build multi-agent systems</strong> — Have specialized agents that coordinate with each other</p>
</li>
<li><p><strong>Add voice interaction</strong> — Cloudflare's roadmap includes real-time voice agent support</p>
</li>
<li><p><strong>Integrate browser automation</strong> — Use Cloudflare's Browser Rendering API for visual monitoring</p>
</li>
</ul>
<p>The full source code for this project is available on <a target="_blank" href="https://github.com/supra126">GitHub</a>. If you build something cool with the Agents SDK, I'd love to hear about it — drop a comment below or find me on <a target="_blank" href="https://github.com/supra126">GitHub</a>.</p>
<hr />
<p><em>Want to learn more about building AI-ready APIs? Check out my previous article:</em> <a target="_blank" href="https://suprahuang.cc/your-api-wasnt-built-for-ai-agents-heres-how-to-fix-it"><em>Your API Wasn't Built for AI Agents — Here's How to Fix It</em></a><em>.</em></p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Tue, 17 Feb 2026 10:00:18 GMT</pubDate>
      <category>Cloudflare Workers</category>
      <category>ai agents</category>
      <category>durable-objects</category>
      <category>JavaScript</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Your API Wasn&apos;t Built for AI Agents — Here&apos;s How to Fix It</title>
      <link>https://suprahuang.cc/your-api-wasnt-built-for-ai-agents-heres-how-to-fix-it</link>
      <guid isPermaLink="true">https://suprahuang.cc/your-api-wasnt-built-for-ai-agents-heres-how-to-fix-it</guid>
      <description>By 2026, over 30% of API traffic will come from AI agents rather than human-driven applications. That number will keep climbing.
Here&apos;s the uncomfortable truth: most APIs were designed for human developers who read documentation, interpret ambiguous ...</description>
      <content:encoded><![CDATA[<p>By 2026, over 30% of API traffic will come from AI agents rather than human-driven applications. That number will keep climbing.</p>
<p>Here's the uncomfortable truth: most APIs were designed for human developers who read documentation, interpret ambiguous responses, and manually handle edge cases. AI agents do none of that. They parse schemas, chain requests programmatically, and fail silently when your API does something unexpected.</p>
<p>I learned this firsthand while building a <a target="_blank" href="https://suprahuang.cc/cloudflare-workers-secure-email-api">zero-cost email API on Cloudflare Workers</a>. The API worked perfectly for human integrators — clear docs, sensible endpoints, proper auth. But when I started thinking about how an AI agent would consume the same API, I realized how many assumptions I'd baked in that only made sense to humans.</p>
<p>This article is the guide I wish I'd had. We'll cover the <strong>five principles of agent-ready API design</strong>, walk through a <strong>real before-and-after retrofit</strong>, tackle <strong>authentication and error handling for non-human consumers</strong>, and finish with a <strong>migration checklist</strong> you can start on tomorrow.</p>
<p>Whether you're building new APIs or maintaining existing ones, the agent era is already here. Let's make sure your APIs are ready.</p>
<hr />
<h2 id="heading-why-ai-agents-break-your-existing-apis">Why AI Agents Break Your Existing APIs</h2>
<p>The fundamental disconnect is simple: your API was designed for developers who <em>think</em>. AI agents don't think — they <em>parse</em>.</p>
<p>Here's what that means in practice:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Aspect</td><td>Human Developer</td><td>AI Agent</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Documentation</strong></td><td>Reads prose, follows tutorials</td><td>Parses OpenAPI schemas and descriptions</td></tr>
<tr>
<td><strong>Ambiguity</strong></td><td>Infers meaning from context</td><td>Needs explicit, precise definitions</td></tr>
<tr>
<td><strong>Workflow</strong></td><td>Makes isolated, manual requests</td><td>Chains multiple calls automatically</td></tr>
<tr>
<td><strong>Errors</strong></td><td>Reads error messages, checks Stack Overflow</td><td>Needs structured codes and remediation steps</td></tr>
<tr>
<td><strong>Discovery</strong></td><td>Browses docs, bookmarks endpoints</td><td>Needs programmatic schema endpoints</td></tr>
</tbody>
</table>
</div><p>When an AI agent encounters your API, it's essentially doing this:</p>
<pre><code class="lang-json"><span class="hljs-comment">// What your API returns</span>
{
  <span class="hljs-attr">"status"</span>: <span class="hljs-string">"error"</span>,
  <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Invalid request. Please check your parameters and try again."</span>
}

<span class="hljs-comment">// What the agent needs</span>
{
  <span class="hljs-attr">"status"</span>: <span class="hljs-string">"error"</span>,
  <span class="hljs-attr">"code"</span>: <span class="hljs-string">"INVALID_PARAMETER"</span>,
  <span class="hljs-attr">"message"</span>: <span class="hljs-string">"The 'email' field must be a valid email address."</span>,
  <span class="hljs-attr">"parameter"</span>: <span class="hljs-string">"email"</span>,
  <span class="hljs-attr">"received"</span>: <span class="hljs-string">"not-an-email"</span>,
  <span class="hljs-attr">"expected"</span>: <span class="hljs-string">"string (email format, RFC 5322)"</span>,
  <span class="hljs-attr">"docs"</span>: <span class="hljs-string">"https://api.example.com/docs/errors#INVALID_PARAMETER"</span>,
  <span class="hljs-attr">"remediation"</span>: <span class="hljs-string">"Validate the email format before sending. Example: user@domain.com"</span>
}
</code></pre>
<p>The first response is perfectly fine for a human who can read the message and figure out what went wrong. The second gives an agent everything it needs to <strong>self-correct and retry</strong> without human intervention.</p>
<p>This isn't just about error handling. Every layer of your API — from endpoint naming to authentication flows — carries assumptions about human consumers that break down when an agent is on the other end.</p>
<hr />
<h2 id="heading-the-5-principles-of-agent-ready-api-design">The 5 Principles of Agent-Ready API Design</h2>
<p>Through building APIs and studying how agents consume them, I've distilled the essentials into five principles. These aren't theoretical — they're the minimum bar for making your API useful to autonomous agents.</p>
<h3 id="heading-1-self-describing-let-your-api-explain-itself">1. Self-Describing: Let Your API Explain Itself</h3>
<p>The most impactful thing you can do is make your API self-describing. This means every endpoint, parameter, and response includes enough context for an agent to understand <em>what it does</em> and <em>how to use it</em> without external documentation.</p>
<p><a target="_blank" href="https://swagger.io/specification/">OpenAPI 3.0+</a> is the foundation:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Good: Rich descriptions that agents can parse</span>
<span class="hljs-attr">paths:</span>
  <span class="hljs-string">/users/{userId}/orders:</span>
    <span class="hljs-attr">get:</span>
      <span class="hljs-attr">operationId:</span> <span class="hljs-string">getUserOrders</span>
      <span class="hljs-attr">summary:</span> <span class="hljs-string">Retrieve</span> <span class="hljs-string">all</span> <span class="hljs-string">orders</span> <span class="hljs-string">for</span> <span class="hljs-string">a</span> <span class="hljs-string">specific</span> <span class="hljs-string">user</span>
      <span class="hljs-attr">description:</span> <span class="hljs-string">&gt;
        Returns a paginated list of orders placed by the specified user.
        Orders are sorted by creation date (newest first).
        Includes order items, totals, and current fulfillment status.
        Requires authentication with at least 'read:orders' scope.
</span>      <span class="hljs-attr">parameters:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">userId</span>
          <span class="hljs-attr">in:</span> <span class="hljs-string">path</span>
          <span class="hljs-attr">required:</span> <span class="hljs-literal">true</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">The</span> <span class="hljs-string">unique</span> <span class="hljs-string">identifier</span> <span class="hljs-string">of</span> <span class="hljs-string">the</span> <span class="hljs-string">user</span> <span class="hljs-string">(UUID</span> <span class="hljs-string">v4</span> <span class="hljs-string">format)</span>
          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
            <span class="hljs-attr">format:</span> <span class="hljs-string">uuid</span>
            <span class="hljs-attr">example:</span> <span class="hljs-string">"550e8400-e29b-41d4-a716-446655440000"</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">status</span>
          <span class="hljs-attr">in:</span> <span class="hljs-string">query</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">&gt;
            Filter orders by fulfillment status.
            Use 'pending' for unprocessed orders,
            'shipped' for orders in transit,
            'delivered' for completed orders.
</span>          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">string</span>
            <span class="hljs-attr">enum:</span> [<span class="hljs-string">pending</span>, <span class="hljs-string">shipped</span>, <span class="hljs-string">delivered</span>, <span class="hljs-string">cancelled</span>, <span class="hljs-string">refunded</span>]
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">limit</span>
          <span class="hljs-attr">in:</span> <span class="hljs-string">query</span>
          <span class="hljs-attr">description:</span> <span class="hljs-string">Maximum</span> <span class="hljs-string">number</span> <span class="hljs-string">of</span> <span class="hljs-string">orders</span> <span class="hljs-string">to</span> <span class="hljs-string">return</span> <span class="hljs-string">(1-100,</span> <span class="hljs-string">default</span> <span class="hljs-number">20</span><span class="hljs-string">)</span>
          <span class="hljs-attr">schema:</span>
            <span class="hljs-attr">type:</span> <span class="hljs-string">integer</span>
            <span class="hljs-attr">minimum:</span> <span class="hljs-number">1</span>
            <span class="hljs-attr">maximum:</span> <span class="hljs-number">100</span>
            <span class="hljs-attr">default:</span> <span class="hljs-number">20</span>
</code></pre>
<p>Notice the difference: every field has a description that explains not just <em>what</em> it is but <em>when and why</em> you'd use it. An agent reading this schema knows exactly what each parameter does, what values are valid, and what to expect back.</p>
<h3 id="heading-2-predictable-zero-surprises">2. Predictable: Zero Surprises</h3>
<p>Agents rely on patterns. If your API returns <code>created_at</code> in one endpoint and <code>createdAt</code> in another, an agent will either fail or require special handling for each endpoint.</p>
<p><strong>Consistency checklist:</strong></p>
<ul>
<li><p><strong>Naming</strong>: Pick one convention (snake_case or camelCase) and stick with it everywhere</p>
</li>
<li><p><strong>Response format</strong>: Every endpoint should return the same envelope structure</p>
</li>
<li><p><strong>Pagination</strong>: Use the same pagination pattern across all list endpoints</p>
</li>
<li><p><strong>Timestamps</strong>: One format everywhere (<a target="_blank" href="https://www.iso.org/iso-8601-date-and-time-format.html">ISO 8601</a>: <code>2026-02-11T05:30:00Z</code>)</p>
</li>
<li><p><strong>Null handling</strong>: Decide whether missing fields are <code>null</code>, omitted, or empty strings</p>
</li>
</ul>
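<p>One way to enforce this consistency is a single response envelope type shared by every endpoint. A sketch — the exact field names are a design choice, not a standard:</p>

```typescript
// Every endpoint returns this shape, so agents always know where the payload is.
interface ApiResponse<T> {
  data: T | null;                                  // payload on success
  error: { code: string; message: string } | null; // structured error on failure
  meta: { requestId: string; timestamp: string };  // timestamp is ISO 8601
}

const ok: ApiResponse<{ id: string }> = {
  data: { id: "user_123" },
  error: null,
  meta: { requestId: "req_1", timestamp: "2026-02-11T05:30:00Z" },
};

console.log(ok.data?.id); // user_123
```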
<h3 id="heading-3-semantic-meaning-over-syntax">3. Semantic: Meaning Over Syntax</h3>
<p>Name things for what they <em>do</em>, not how they're implemented:</p>
<pre><code class="lang-plaintext"># Bad: Implementation-leaked naming
POST /api/v2/db/insert-record
GET  /api/v2/cache/fetch?key=user_123

# Good: Intent-driven naming
POST /api/v2/users
GET  /api/v2/users/123
</code></pre>
<p>When an agent sees <code>POST /users</code>, it immediately understands the intent: create a user. When it sees <code>POST /db/insert-record</code>, it has to guess what kind of record and where it goes.</p>
<h3 id="heading-4-composable-building-blocks-not-monoliths">4. Composable: Building Blocks, Not Monoliths</h3>
<p>Design endpoints as atomic operations that chain well. An agent orchestrating a checkout flow should be able to:</p>
<ol>
<li><p><code>GET /cart</code> → Get current cart</p>
</li>
<li><p><code>POST /orders</code> → Create order from cart</p>
</li>
<li><p><code>POST /orders/{id}/payments</code> → Process payment</p>
</li>
<li><p><code>GET /orders/{id}</code> → Verify order status</p>
</li>
</ol>
<p>Each step is independent, has clear inputs/outputs, and can be retried individually if something fails.</p>
<p><strong>Avoid "god endpoints"</strong> that do multiple things:</p>
<pre><code class="lang-plaintext"># Bad: One endpoint does everything
POST /checkout
{
  "action": "process",
  "validate_inventory": true,
  "apply_discount": "SAVE10",
  "payment_method": "card",
  "send_confirmation": true
}

# Good: Composable steps
POST /orders              → Creates order
POST /orders/{id}/discounts → Applies discount
POST /orders/{id}/payments  → Processes payment
POST /orders/{id}/confirm   → Sends confirmation
</code></pre>
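<p>Composability is what makes per-step retries possible. A simplified sketch of the retry wrapper an agent might put around each step — the step function here is a stand-in for a real API call:</p>

```typescript
// Retry one atomic step without redoing the steps before it.
function withRetry<T>(step: () => T, attempts = 3): T {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (e) {
      lastError = e; // transient failure: try the same step again
    }
  }
  throw lastError;
}

// Simulate a payment step that fails once, then succeeds.
let calls = 0;
const payment = withRetry(() => {
  calls++;
  if (calls === 1) throw new Error("gateway timeout");
  return { status: "paid" };
});
console.log(payment.status, calls); // paid 2
```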
<h3 id="heading-5-discoverable-help-agents-find-you">5. Discoverable: Help Agents Find You</h3>
<p>Even the best-designed API is useless if agents can't find it. Expose your API schema at known endpoints:</p>
<ul>
<li><p><code>GET /.well-known/openapi.json</code> — Your full OpenAPI spec</p>
</li>
<li><p><code>GET /api</code> — API root with available resources and links</p>
</li>
<li><p>Response headers with <code>Link</code> pointing to related resources</p>
</li>
</ul>
<p>We'll dig deeper into discoverability with MCP and HATEOAS in a later section.</p>
<hr />
<h2 id="heading-before-amp-after-retrofitting-a-real-api">Before &amp; After: Retrofitting a Real API</h2>
<p>Let's take a concrete example — a user management API — and walk through the transformation step by step.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770781259758/2f640e1e-974a-4a64-afb3-78f2574436d2.webp" alt="Before and After API Design Comparison" class="image--center mx-auto" /></p>
<h3 id="heading-before-a-typical-rest-api">Before: A Typical REST API</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// Express.js — Traditional API endpoint</span>
app.get(<span class="hljs-string">'/api/users/:id'</span>, <span class="hljs-keyword">async</span> (req, res) =&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> user = <span class="hljs-keyword">await</span> db.users.findById(req.params.id);
    <span class="hljs-keyword">if</span> (!user) {
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">404</span>).json({
        <span class="hljs-attr">error</span>: <span class="hljs-string">'User not found'</span>
      });
    }
    res.json(user);
  } <span class="hljs-keyword">catch</span> (err) {
    res.status(<span class="hljs-number">500</span>).json({
      <span class="hljs-attr">error</span>: <span class="hljs-string">'Something went wrong'</span>
    });
  }
});
</code></pre>
<p>This works for humans. A developer gets a 404, reads "User not found," and knows to check the ID. But an agent?</p>
<ul>
<li><p>No structured error code to branch on</p>
</li>
<li><p>No indication of <em>why</em> the user wasn't found (invalid ID format? deleted? never existed?)</p>
</li>
<li><p>No hint about what to do next</p>
</li>
<li><p>No links to related resources</p>
</li>
<li><p>The success response has no schema guarantee</p>
</li>
</ul>
<h3 id="heading-after-agent-ready-api">After: Agent-Ready API</h3>
<pre><code class="lang-javascript"><span class="hljs-comment">// Express.js — Agent-ready API endpoint</span>
app.get(<span class="hljs-string">'/api/users/:id'</span>, <span class="hljs-keyword">async</span> (req, res) =&gt; {
  <span class="hljs-comment">// Validate input format first</span>
  <span class="hljs-keyword">if</span> (!isValidUUID(req.params.id)) {
    <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">400</span>).json({
      <span class="hljs-attr">error</span>: {
        <span class="hljs-attr">code</span>: <span class="hljs-string">'INVALID_PARAMETER_FORMAT'</span>,
        <span class="hljs-attr">message</span>: <span class="hljs-string">'User ID must be a valid UUID v4.'</span>,
        <span class="hljs-attr">parameter</span>: <span class="hljs-string">'id'</span>,
        <span class="hljs-attr">received</span>: req.params.id,
        <span class="hljs-attr">expected</span>: <span class="hljs-string">'UUID v4 (e.g., 550e8400-e29b-41d4-a716-446655440000)'</span>,
        <span class="hljs-attr">docs</span>: <span class="hljs-string">'https://api.example.com/docs/users#get-user'</span>
      }
    });
  }

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> user = <span class="hljs-keyword">await</span> db.users.findById(req.params.id);

    <span class="hljs-keyword">if</span> (!user) {
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">404</span>).json({
        <span class="hljs-attr">error</span>: {
          <span class="hljs-attr">code</span>: <span class="hljs-string">'RESOURCE_NOT_FOUND'</span>,
          <span class="hljs-attr">message</span>: <span class="hljs-string">`No user found with ID '<span class="hljs-subst">${req.params.id}</span>'.`</span>,
          <span class="hljs-attr">resource</span>: <span class="hljs-string">'user'</span>,
          <span class="hljs-attr">parameter</span>: <span class="hljs-string">'id'</span>,
          <span class="hljs-attr">suggestions</span>: [
            <span class="hljs-string">'Verify the user ID is correct'</span>,
            <span class="hljs-string">'Use GET /api/users?search={query} to find users'</span>
          ],
          <span class="hljs-attr">docs</span>: <span class="hljs-string">'https://api.example.com/docs/users#get-user'</span>
        }
      });
    }

    res.json({
      <span class="hljs-attr">data</span>: {
        <span class="hljs-attr">id</span>: user.id,
        <span class="hljs-attr">email</span>: user.email,
        <span class="hljs-attr">name</span>: user.name,
        <span class="hljs-attr">role</span>: user.role,
        <span class="hljs-attr">createdAt</span>: user.createdAt.toISOString(),
        <span class="hljs-attr">updatedAt</span>: user.updatedAt.toISOString()
      },
      <span class="hljs-attr">_links</span>: {
        <span class="hljs-attr">self</span>: { <span class="hljs-attr">href</span>: <span class="hljs-string">`/api/users/<span class="hljs-subst">${user.id}</span>`</span> },
        <span class="hljs-attr">orders</span>: { <span class="hljs-attr">href</span>: <span class="hljs-string">`/api/users/<span class="hljs-subst">${user.id}</span>/orders`</span> },
        <span class="hljs-attr">profile</span>: { <span class="hljs-attr">href</span>: <span class="hljs-string">`/api/users/<span class="hljs-subst">${user.id}</span>/profile`</span> }
      }
    });
  } <span class="hljs-keyword">catch</span> (err) {
    res.status(<span class="hljs-number">500</span>).json({
      <span class="hljs-attr">error</span>: {
        <span class="hljs-attr">code</span>: <span class="hljs-string">'INTERNAL_ERROR'</span>,
        <span class="hljs-attr">message</span>: <span class="hljs-string">'An unexpected error occurred while fetching the user.'</span>,
        <span class="hljs-attr">requestId</span>: req.id,
        <span class="hljs-attr">remediation</span>: <span class="hljs-string">'Retry the request. If the issue persists, contact support with the requestId.'</span>,
        <span class="hljs-attr">retryable</span>: <span class="hljs-literal">true</span>,
        <span class="hljs-attr">retryAfter</span>: <span class="hljs-number">5</span>
      }
    });
  }
});
</code></pre>
<p><strong>What changed and why:</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Change</td><td>Why It Matters for Agents</td></tr>
</thead>
<tbody>
<tr>
<td>Input validation with specific error</td><td>Agent can self-correct the ID format</td></tr>
<tr>
<td>Structured error codes (<code>INVALID_PARAMETER_FORMAT</code>)</td><td>Agent can branch logic on error type</td></tr>
<tr>
<td><code>suggestions</code> array</td><td>Agent knows alternative approaches</td></tr>
<tr>
<td><code>_links</code> in success response (HATEOAS)</td><td>Agent discovers related resources programmatically</td></tr>
<tr>
<td><code>retryable</code> + <code>retryAfter</code></td><td>Agent knows whether and when to retry</td></tr>
<tr>
<td><code>requestId</code></td><td>Agent can reference specific failures in escalation</td></tr>
<tr>
<td>Consistent <code>data</code> wrapper</td><td>Agent always knows where to find the payload</td></tr>
</tbody>
</table>
</div><p>The before version has about 15 lines. The after version has more code, but every additional line serves the agent. And here's the thing — <strong>humans benefit from these improvements too</strong>. Better error messages and discoverable links make any API easier to work with.</p>
<hr />
<h2 id="heading-authentication-for-non-human-consumers">Authentication for Non-Human Consumers</h2>
<p>Authentication is where most "agent-ready" articles get hand-wavy. Let's get specific.</p>
<h3 id="heading-the-jwt-problem">The JWT Problem</h3>
<p>Traditional JWT flows assume a human is present to log in, handle MFA, and refresh tokens. AI agents operate autonomously — there's no human in the loop to re-authenticate when a token expires at 3 AM.</p>
<p>Worse, if you pass JWTs to an LLM as part of a tool's context, you're exposing credentials in the model's context window. That's a security risk with no upside.</p>
<h3 id="heading-recommended-oauth-20-client-credentials">Recommended: OAuth 2.0 Client Credentials</h3>
<p>For agent-to-API communication, the <a target="_blank" href="https://datatracker.ietf.org/doc/html/rfc6749#section-4.4"><strong>OAuth 2.0 Client Credentials</strong></a> grant is the right choice:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Agent authenticates with client credentials</span>
<span class="hljs-keyword">const</span> tokenResponse = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">'https://auth.example.com/oauth/token'</span>, {
  <span class="hljs-attr">method</span>: <span class="hljs-string">'POST'</span>,
  <span class="hljs-comment">// RFC 6749 token requests are form-encoded, not JSON</span>
  <span class="hljs-attr">headers</span>: { <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/x-www-form-urlencoded'</span> },
  <span class="hljs-attr">body</span>: <span class="hljs-keyword">new</span> URLSearchParams({
    <span class="hljs-attr">grant_type</span>: <span class="hljs-string">'client_credentials'</span>,
    <span class="hljs-attr">client_id</span>: process.env.API_CLIENT_ID,
    <span class="hljs-attr">client_secret</span>: process.env.API_CLIENT_SECRET,
    <span class="hljs-attr">scope</span>: <span class="hljs-string">'read:users read:orders'</span>  <span class="hljs-comment">// Request only needed scopes</span>
  })
});

<span class="hljs-keyword">const</span> { access_token, expires_in } = <span class="hljs-keyword">await</span> tokenResponse.json();

<span class="hljs-comment">// Agent uses the token for API calls</span>
<span class="hljs-keyword">const</span> userResponse = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">'https://api.example.com/users/123'</span>, {
  <span class="hljs-attr">headers</span>: {
    <span class="hljs-string">'Authorization'</span>: <span class="hljs-string">`Bearer <span class="hljs-subst">${access_token}</span>`</span>,
    <span class="hljs-string">'X-Agent-Id'</span>: <span class="hljs-string">'order-processing-agent-v2'</span>,  <span class="hljs-comment">// Identify the agent</span>
    <span class="hljs-string">'X-Request-Id'</span>: crypto.randomUUID()          <span class="hljs-comment">// Trace requests</span>
  }
});
</code></pre>
<p><strong>Why this works for agents:</strong></p>
<ul>
<li><p>No human in the loop required</p>
</li>
<li><p>Scoped permissions (principle of least privilege)</p>
</li>
<li><p>Token rotation is automated</p>
</li>
<li><p>The agent never sees user credentials — only its own service credentials</p>
</li>
<li><p><code>X-Agent-Id</code> header lets your API track and rate-limit by agent</p>
</li>
</ul>
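<p>Because token expiry is routine for an autonomous agent rather than an edge case, it helps to cache the token and refresh it shortly before it expires. Here is a minimal sketch: <code>fetchToken</code> stands in for the client-credentials request above, and the 60-second skew is an arbitrary safety margin, not a standard value.</p>

```javascript
// Minimal token cache: returns the cached token while it is comfortably
// inside its lifetime, and refreshes it otherwise. `fetchToken` is assumed
// to perform the client-credentials request shown above and resolve to
// { access_token, expires_in }.
function createTokenCache(fetchToken, skewSeconds = 60) {
  let token = null;
  let expiresAt = 0; // epoch ms when the cached token expires

  return async function getToken() {
    const now = Date.now();
    // Reuse the cached token while it has more than `skewSeconds` left.
    if (token !== null) {
      if (expiresAt - skewSeconds * 1000 > now) return token;
    }
    const { access_token, expires_in } = await fetchToken();
    token = access_token;
    expiresAt = now + expires_in * 1000;
    return token;
  };
}
```

<p>With this in place, the agent calls <code>getToken()</code> before every API request and never has to special-case a 3 AM expiry.</p>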
<h3 id="heading-api-key-patterns">API Key Patterns</h3>
<p>For simpler setups, API keys work — but treat agent keys differently from keys you issue to human developers:</p>
<ul>
<li><p><strong>Separate keys per agent</strong>: Don't reuse the same key across agents with different purposes</p>
</li>
<li><p><strong>Scoped permissions</strong>: Each key should only allow the operations that specific agent needs</p>
</li>
<li><p><strong>Auto-rotation</strong>: Set expiration policies and provide a key rotation endpoint</p>
</li>
<li><p><strong>Rate limits per key</strong>: AI agents can generate bursts of requests — set appropriate limits</p>
</li>
</ul>
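<p>On the API side, those rules reduce to a lookup at request time. Here is a sketch of per-key scope enforcement; the key records, agent names, and error codes are illustrative, not a real key store.</p>

```javascript
// Illustrative key store: one key per agent, each with only the scopes
// that agent needs (principle of least privilege).
const apiKeys = {
  key_reporting_agent: { agent: 'reporting-agent', scopes: ['read:orders'] },
  key_support_agent: { agent: 'support-agent', scopes: ['read:users', 'read:orders'] },
};

// Check a key against the scope required by the endpoint being called.
function authorize(apiKey, requiredScope) {
  const record = apiKeys[apiKey];
  if (!record) {
    return { ok: false, code: 'AUTH_INVALID_KEY' };
  }
  if (!record.scopes.includes(requiredScope)) {
    return { ok: false, code: 'PERM_SCOPE_MISSING', agent: record.agent };
  }
  return { ok: true, agent: record.agent };
}
```

<p>Keeping one key per agent also makes the rate-limit and auto-rotation policies above enforceable per agent rather than per customer.</p>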
<h3 id="heading-rate-limiting-for-non-human-traffic">Rate Limiting for Non-Human Traffic</h3>
<p>AI agents behave differently than humans. A human might make 5-10 API calls during a session. An agent orchestrating a complex task might make 50-100 calls in seconds.</p>
<p>Design your rate limiting accordingly:</p>
<pre><code class="lang-plaintext"># Headers your API should return
X-RateLimit-Limit: 1000          # Requests per window
X-RateLimit-Remaining: 847       # Remaining in current window
X-RateLimit-Reset: 1707635400    # Unix timestamp when window resets
Retry-After: 30                  # Seconds to wait (on 429 response)
</code></pre>
<p>Consider <strong>tiered rate limits</strong>: a basic tier for general API keys and a higher tier for verified agent integrations that have been reviewed and approved.</p>
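<p>On the agent side, those headers turn backoff into a pure calculation. A small sketch, assuming the header names shown above (they are a common convention, not a standard every API implements):</p>

```javascript
// Decide how many seconds an agent should wait before retrying,
// based on the rate limit headers described above. `headers` is a
// plain object with lowercase header names; `nowSeconds` is the
// current Unix timestamp.
function backoffSeconds(status, headers, nowSeconds) {
  if (status !== 429) return 0; // not rate limited, no wait needed

  // Prefer the explicit Retry-After header when the API provides it.
  if (headers['retry-after'] !== undefined) {
    return Number(headers['retry-after']);
  }
  // Otherwise wait until the current window resets.
  const reset = Number(headers['x-ratelimit-reset'] || 0);
  return Math.max(0, reset - nowSeconds);
}
```
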
<hr />
<h2 id="heading-error-handling-that-agents-can-act-on">Error Handling That Agents Can Act On</h2>
<p>Here's a principle that will transform your API's agent-friendliness: <strong>every error response should tell the agent what to do next.</strong></p>
<h3 id="heading-the-error-response-contract">The Error Response Contract</h3>
<p>Define a consistent error schema that agents can rely on:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"error"</span>: {
    <span class="hljs-attr">"code"</span>: <span class="hljs-string">"RATE_LIMIT_EXCEEDED"</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"You have exceeded the rate limit for this endpoint."</span>,
    <span class="hljs-attr">"details"</span>: {
      <span class="hljs-attr">"limit"</span>: <span class="hljs-number">100</span>,
      <span class="hljs-attr">"window"</span>: <span class="hljs-string">"60s"</span>,
      <span class="hljs-attr">"current"</span>: <span class="hljs-number">103</span>
    },
    <span class="hljs-attr">"retryable"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"retryAfter"</span>: <span class="hljs-number">45</span>,
    <span class="hljs-attr">"remediation"</span>: <span class="hljs-string">"Wait 45 seconds before retrying. Consider reducing request frequency or upgrading to a higher rate limit tier."</span>,
    <span class="hljs-attr">"docs"</span>: <span class="hljs-string">"https://api.example.com/docs/rate-limits"</span>,
    <span class="hljs-attr">"requestId"</span>: <span class="hljs-string">"req_abc123def456"</span>
  }
}
</code></pre>
<p><strong>Key fields explained:</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Field</td><td>Purpose</td></tr>
</thead>
<tbody>
<tr>
<td><code>code</code></td><td>Machine-readable error type for branching logic</td></tr>
<tr>
<td><code>message</code></td><td>Human-readable explanation</td></tr>
<tr>
<td><code>details</code></td><td>Contextual data specific to this error type</td></tr>
<tr>
<td><code>retryable</code></td><td>Can the agent retry this exact request?</td></tr>
<tr>
<td><code>retryAfter</code></td><td>How long to wait (in seconds)</td></tr>
<tr>
<td><code>remediation</code></td><td>Step-by-step fix instructions</td></tr>
<tr>
<td><code>docs</code></td><td>Link to detailed documentation</td></tr>
<tr>
<td><code>requestId</code></td><td>Unique ID for debugging and support escalation</td></tr>
</tbody>
</table>
</div><h3 id="heading-error-categories">Error Categories</h3>
<p>Organize your error codes into categories that agents can use for high-level branching:</p>
<pre><code class="lang-plaintext">AUTH_*       → Authentication issues    → Re-authenticate
PERM_*       → Permission issues        → Request different scope
PARAM_*      → Parameter issues         → Fix input and retry
RATE_*       → Rate limiting            → Wait and retry
RESOURCE_*   → Resource state issues    → Check resource status
INTERNAL_*   → Server issues            → Retry with backoff
</code></pre>
<p>An agent receiving <code>AUTH_TOKEN_EXPIRED</code> knows to refresh the token and retry. An agent receiving <code>PARAM_INVALID_FORMAT</code> knows to fix the input. An agent receiving <code>INTERNAL_ERROR</code> knows to back off and retry later.</p>
<p>This categorization turns error handling from guesswork into a deterministic state machine — exactly what autonomous agents need.</p>
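<p>That state machine can be as simple as a dispatch table keyed on the code's prefix. Here is a sketch of the agent-side branching; the action names are illustrative, and the fallback for unknown codes is a design choice, not a rule:</p>

```javascript
// Map the error category prefixes above to the agent's next action.
const categoryActions = {
  AUTH: 'reauthenticate',
  PERM: 'request_scope',
  PARAM: 'fix_input',
  RATE: 'wait_and_retry',
  RESOURCE: 'check_resource',
  INTERNAL: 'retry_with_backoff',
};

// Given a structured error code like 'AUTH_TOKEN_EXPIRED', return the
// high-level action the agent should take. Unknown codes escalate.
function nextAction(errorCode) {
  const prefix = errorCode.split('_')[0];
  return categoryActions[prefix] || 'escalate_to_human';
}
```
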
<hr />
<h2 id="heading-making-your-api-discoverable-mcp-and-beyond">Making Your API Discoverable: MCP and Beyond</h2>
<p>Your API might be perfectly designed, but if agents can't <em>find</em> it, it might as well not exist.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770781275831/45d019a3-1447-427a-a50b-7e799918c22c.webp" alt="API Discoverability Architecture" class="image--center mx-auto" /></p>
<h3 id="heading-model-context-protocol-mcp">Model Context Protocol (MCP)</h3>
<p><a target="_blank" href="https://modelcontextprotocol.io/">MCP</a> is becoming the standard way for AI agents to discover and interact with APIs. Think of it as a universal adapter between AI agents and your services.</p>
<p>Instead of teaching each AI model how to use your specific API, you expose your API through an MCP server that speaks a protocol agents already understand:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// MCP server exposing your API to AI agents</span>
<span class="hljs-keyword">import</span> { McpServer } <span class="hljs-keyword">from</span> <span class="hljs-string">'@modelcontextprotocol/sdk/server/mcp.js'</span>;

<span class="hljs-keyword">const</span> server = <span class="hljs-keyword">new</span> McpServer({
  <span class="hljs-attr">name</span>: <span class="hljs-string">'user-management-api'</span>,
  <span class="hljs-attr">version</span>: <span class="hljs-string">'1.0.0'</span>,
});

<span class="hljs-comment">// Define a tool that agents can discover and use</span>
server.tool(
  <span class="hljs-string">'get_user'</span>,
  <span class="hljs-string">'Retrieve a user by their unique ID. Returns user profile including name, email, role, and account creation date.'</span>,
  {
    <span class="hljs-attr">userId</span>: {
      <span class="hljs-attr">type</span>: <span class="hljs-string">'string'</span>,
      <span class="hljs-attr">description</span>: <span class="hljs-string">'The unique UUID v4 identifier of the user to retrieve'</span>,
    }
  },
  <span class="hljs-keyword">async</span> ({ userId }) =&gt; {
    <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">`https://api.example.com/users/<span class="hljs-subst">${userId}</span>`</span>, {
      <span class="hljs-attr">headers</span>: { <span class="hljs-string">'Authorization'</span>: <span class="hljs-string">`Bearer <span class="hljs-subst">${API_TOKEN}</span>`</span> }
    });
    <span class="hljs-keyword">const</span> data = <span class="hljs-keyword">await</span> response.json();
    <span class="hljs-keyword">return</span> {
      <span class="hljs-attr">content</span>: [{ <span class="hljs-attr">type</span>: <span class="hljs-string">'text'</span>, <span class="hljs-attr">text</span>: <span class="hljs-built_in">JSON</span>.stringify(data, <span class="hljs-literal">null</span>, <span class="hljs-number">2</span>) }]
    };
  }
);
</code></pre>
<p>The key insight: MCP bridges the gap between your existing REST API and agent consumption. You don't have to rewrite your API — you wrap it in a layer that agents can discover.</p>
<h3 id="heading-hateoas-the-comeback">HATEOAS: The Comeback</h3>
<p><a target="_blank" href="https://restfulapi.net/hateoas/">HATEOAS</a> (Hypermedia as the Engine of Application State) was ahead of its time. Human developers mostly ignored it — who needs machine-navigable links when you can bookmark the docs?</p>
<p>AI agents, that's who.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"data"</span>: {
    <span class="hljs-attr">"id"</span>: <span class="hljs-string">"user_123"</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Supra Huang"</span>,
    <span class="hljs-attr">"email"</span>: <span class="hljs-string">"supra@example.com"</span>
  },
  <span class="hljs-attr">"_links"</span>: {
    <span class="hljs-attr">"self"</span>: {
      <span class="hljs-attr">"href"</span>: <span class="hljs-string">"/api/users/user_123"</span>,
      <span class="hljs-attr">"method"</span>: <span class="hljs-string">"GET"</span>
    },
    <span class="hljs-attr">"update"</span>: {
      <span class="hljs-attr">"href"</span>: <span class="hljs-string">"/api/users/user_123"</span>,
      <span class="hljs-attr">"method"</span>: <span class="hljs-string">"PATCH"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Update user profile fields"</span>
    },
    <span class="hljs-attr">"orders"</span>: {
      <span class="hljs-attr">"href"</span>: <span class="hljs-string">"/api/users/user_123/orders"</span>,
      <span class="hljs-attr">"method"</span>: <span class="hljs-string">"GET"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"List all orders for this user"</span>
    },
    <span class="hljs-attr">"deactivate"</span>: {
      <span class="hljs-attr">"href"</span>: <span class="hljs-string">"/api/users/user_123/deactivate"</span>,
      <span class="hljs-attr">"method"</span>: <span class="hljs-string">"POST"</span>,
      <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Deactivate the user account (reversible)"</span>
    }
  },
  <span class="hljs-attr">"_actions"</span>: {
    <span class="hljs-attr">"available"</span>: [<span class="hljs-string">"update"</span>, <span class="hljs-string">"deactivate"</span>, <span class="hljs-string">"orders"</span>],
    <span class="hljs-attr">"unavailable"</span>: [
      {
        <span class="hljs-attr">"action"</span>: <span class="hljs-string">"delete"</span>,
        <span class="hljs-attr">"reason"</span>: <span class="hljs-string">"User has active orders. Resolve orders before deletion."</span>,
        <span class="hljs-attr">"blockedBy"</span>: <span class="hljs-string">"/api/users/user_123/orders?status=active"</span>
      }
    ]
  }
}
</code></pre>
<p>Notice the <code>_actions</code> block. It tells the agent not just <em>what</em> it can do, but also <em>what it can't do and why</em>. An agent attempting to delete this user would know to resolve active orders first — without making a failed request and parsing an error.</p>
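<p>Consuming that block is straightforward: the agent consults <code>_actions</code> before attempting an operation instead of probing with a request that is doomed to fail. A sketch, assuming the response shape shown above:</p>

```javascript
// Check whether an action is currently possible on a HATEOAS resource.
// Returns the follow-up link when allowed, or the server's stated
// reason (and blocker) when not.
function canPerform(resource, action) {
  const actions = resource._actions || { available: [], unavailable: [] };
  if (actions.available.includes(action)) {
    return { allowed: true, link: resource._links[action] };
  }
  const blocked = actions.unavailable.find((entry) => entry.action === action);
  if (blocked) {
    return { allowed: false, reason: blocked.reason, blockedBy: blocked.blockedBy };
  }
  return { allowed: false, reason: 'Action not advertised by the API.' };
}
```
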
<h3 id="heading-schema-first-design">Schema-First Design</h3>
<p>Expose your full API schema at well-known endpoints:</p>
<ul>
<li><p><code>GET /.well-known/openapi.json</code> — Full OpenAPI specification</p>
</li>
<li><p><code>GET /.well-known/mcp.json</code> — MCP server configuration (if applicable)</p>
</li>
<li><p><code>GET /api</code> — Root endpoint listing all available resources</p>
</li>
</ul>
<p>This is the minimum for discoverability. An agent landing on your API domain can immediately understand what's available and how to use it.</p>
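<p>An agent's discovery routine can then be a simple ordered probe of those endpoints. In this sketch, <code>fetchJson</code> is a placeholder for an HTTP client that resolves to parsed JSON on success and <code>null</code> otherwise:</p>

```javascript
// Probe the well-known endpoints listed above, in order of preference,
// and return the first machine-readable description found.
const discoveryPaths = [
  '/.well-known/openapi.json',
  '/.well-known/mcp.json',
  '/api',
];

async function discover(fetchJson) {
  for (const path of discoveryPaths) {
    const doc = await fetchJson(path);
    if (doc !== null) {
      return { path, doc };
    }
  }
  return null; // nothing discoverable: the API is invisible to agents
}
```
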
<hr />
<h2 id="heading-testing-your-api-with-ai-agents">Testing Your API with AI Agents</h2>
<p>You wouldn't ship a website without testing it in a browser. Don't ship an agent-ready API without testing it with actual agents.</p>
<h3 id="heading-prompt-based-testing">Prompt-Based Testing</h3>
<p>The simplest test: give an AI agent your API docs and ask it to accomplish a task. If it struggles, your API has discoverability or usability issues.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Simple agent-based API test</span>
<span class="hljs-comment"># (load_openapi_spec, openapi_spec_to_tools, run_agent_with_tools, and</span>
<span class="hljs-comment"># assert_calls_match are project-specific helpers, sketched here)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_api_with_agent</span>():</span>
    <span class="hljs-string">"""
    Give an LLM your OpenAPI spec and see if it can
    successfully complete a multi-step workflow.
    """</span>
    openapi_spec = load_openapi_spec(<span class="hljs-string">'./openapi.json'</span>)

    test_scenarios = [
        {
            <span class="hljs-string">"task"</span>: <span class="hljs-string">"Find the user with email test@example.com and list their recent orders"</span>,
            <span class="hljs-string">"expected_calls"</span>: [<span class="hljs-string">"GET /users?email=test@example.com"</span>, <span class="hljs-string">"GET /users/{id}/orders"</span>],
            <span class="hljs-string">"expected_result"</span>: <span class="hljs-string">"Returns a list of orders"</span>
        },
        {
            <span class="hljs-string">"task"</span>: <span class="hljs-string">"Create a new user and assign them the 'editor' role"</span>,
            <span class="hljs-string">"expected_calls"</span>: [<span class="hljs-string">"POST /users"</span>, <span class="hljs-string">"PATCH /users/{id}"</span>],
            <span class="hljs-string">"expected_result"</span>: <span class="hljs-string">"User created with editor role"</span>
        }
    ]

    <span class="hljs-keyword">for</span> scenario <span class="hljs-keyword">in</span> test_scenarios:
        result = run_agent_with_tools(
            prompt=scenario[<span class="hljs-string">"task"</span>],
            tools=openapi_spec_to_tools(openapi_spec)
        )
        assert_calls_match(result.api_calls, scenario[<span class="hljs-string">"expected_calls"</span>])
        print(<span class="hljs-string">f"✅ Passed: <span class="hljs-subst">{scenario[<span class="hljs-string">'task'</span>]}</span>"</span>)
</code></pre>
<h3 id="heading-schema-validation">Schema Validation</h3>
<p>Validate that your actual API responses match your OpenAPI spec. Drift between spec and reality is the number one reason agents fail:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Use openapi-diff to catch breaking changes</span>
npx openapi-diff previous-spec.json current-spec.json

<span class="hljs-comment"># Use Spectral to lint your OpenAPI spec (https://github.com/stoplightio/spectral)</span>
npx @stoplight/spectral-cli lint openapi.json
</code></pre>
<h3 id="heading-key-metrics-to-monitor">Key Metrics to Monitor</h3>
<p>Once agents are consuming your API, track these metrics:</p>
<ul>
<li><p><strong>Agent success rate</strong>: What percentage of agent workflows complete without errors?</p>
</li>
<li><p><strong>Self-correction rate</strong>: How often do agents recover from errors without human help?</p>
</li>
<li><p><strong>Average calls per task</strong>: Are agents making efficient use of your endpoints?</p>
</li>
<li><p><strong>Error category distribution</strong>: Which error types are most common? That's where to improve.</p>
</li>
</ul>
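<p>These metrics fall out of a simple aggregation over workflow records. A sketch, with hypothetical field names (<code>completed</code>, <code>errors</code>, <code>calls</code>) standing in for whatever your logging pipeline actually emits:</p>

```javascript
// Aggregate the agent metrics above from a list of workflow records.
// Each record: { completed: bool, errors: number, calls: number }.
function agentMetrics(workflows) {
  const total = workflows.length;
  const succeeded = workflows.filter((w) => w.completed).length;
  const withErrors = workflows.filter((w) => w.errors > 0).length;
  // Self-correction: hit at least one error but still completed.
  const selfCorrected = workflows
    .filter((w) => w.completed)
    .filter((w) => w.errors > 0).length;
  const calls = workflows.reduce((sum, w) => sum + w.calls, 0);
  return {
    successRate: total === 0 ? 0 : succeeded / total,
    selfCorrectionRate: withErrors === 0 ? 0 : selfCorrected / withErrors,
    avgCallsPerTask: total === 0 ? 0 : calls / total,
  };
}
```
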
<hr />
<h2 id="heading-migration-checklist-start-tomorrow">Migration Checklist: Start Tomorrow</h2>
<p>You don't have to rewrite your API from scratch. Here's a phased approach:</p>
<h3 id="heading-quick-wins-this-week">Quick Wins (This Week)</h3>
<ul>
<li><p>[ ] Add <code>operationId</code> and rich <code>description</code> to every OpenAPI endpoint</p>
</li>
<li><p>[ ] Standardize error response format with <code>code</code>, <code>message</code>, <code>retryable</code></p>
</li>
<li><p>[ ] Add <code>X-Request-Id</code> to every response for tracing</p>
</li>
<li><p>[ ] Expose your OpenAPI spec at <code>/.well-known/openapi.json</code></p>
</li>
<li><p>[ ] Add rate limit headers to all responses</p>
</li>
</ul>
<h3 id="heading-medium-effort-next-2-weeks">Medium Effort (Next 2 Weeks)</h3>
<ul>
<li><p>[ ] Implement structured error codes with categories (<code>AUTH_*</code>, <code>PARAM_*</code>, etc.)</p>
</li>
<li><p>[ ] Add <code>_links</code> (HATEOAS) to resource responses</p>
</li>
<li><p>[ ] Set up OAuth 2.0 client credentials flow for agent auth</p>
</li>
<li><p>[ ] Create agent-specific API keys with scoped permissions</p>
</li>
<li><p>[ ] Add <code>remediation</code> field to error responses</p>
</li>
</ul>
<h3 id="heading-long-term-1-3-months">Long-Term (1-3 Months)</h3>
<ul>
<li><p>[ ] Build an MCP server wrapping your API</p>
</li>
<li><p>[ ] Implement comprehensive agent-based integration tests</p>
</li>
<li><p>[ ] Set up monitoring dashboards for agent traffic patterns</p>
</li>
<li><p>[ ] Design composable endpoints for complex workflows</p>
</li>
<li><p>[ ] Add <code>_actions</code> blocks showing available/unavailable operations</p>
</li>
</ul>
<p><strong>Start with the quick wins.</strong> Just adding rich OpenAPI descriptions and structured error codes will make a measurable difference in how well agents work with your API.</p>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>The shift from human-first to agent-first API design isn't coming — it's already here. AI agents are consuming APIs at scale, and the APIs that work well with them will get more integrations, more traffic, and more adoption.</p>
<p>The good news: agent-ready API design isn't a radical departure from good API design. Self-describing endpoints, consistent response formats, structured errors, and proper authentication are improvements that benefit <em>all</em> consumers — human and AI alike.</p>
<p>Start with what matters most: <strong>make your API self-describing</strong> (rich OpenAPI specs), <strong>make errors actionable</strong> (structured codes with remediation), and <strong>make endpoints discoverable</strong> (schema at well-known URLs).</p>
<p>Your API wasn't built for AI agents. But with the changes in this guide, it can be — starting this week.</p>
<hr />
<p><em>What's your experience building APIs that AI agents consume? Have you tried wrapping your API with MCP? I'd love to hear your approach — drop a comment below or find me on</em> <a target="_blank" href="https://github.com/supra126"><em>GitHub</em></a><em>.</em></p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Mon, 16 Feb 2026 14:07:57 GMT</pubDate>
      <category>API Design</category>
      <category>ai agents</category>
      <category>Web Development</category>
      <category>JavaScript</category>
      <category>software architecture</category>
    </item>
    <item>
      <title>When Microservices Are Wrong: A Solutions Architect&apos;s Decision Framework</title>
      <link>https://suprahuang.cc/when-microservices-are-wrong-decision-framework</link>
      <guid isPermaLink="true">https://suprahuang.cc/when-microservices-are-wrong-decision-framework</guid>
      <description>I&apos;ve been that architect. The one who spun up AWS Lambda functions and ECS clusters for every new service, convinced that microservices were the only &quot;proper&quot; way to build modern software. After years of managing distributed complexity — and eventual...</description>
      <content:encoded><![CDATA[<p>I've been that architect. The one who spun up AWS Lambda functions and ECS clusters for every new service, convinced that microservices were the only "proper" way to build modern software. After years of managing distributed complexity — and eventually migrating most of my projects to Next.js, NestJS, Vercel, Railway, and Supabase — I learned something the hard way: <strong>the best architecture is the one that matches your actual needs, not your aspirations.</strong></p>
<p>Across the industry, a growing number of organizations are consolidating their microservices back into simpler architectures. This isn't a step backward — it's the industry maturing. The microservices hype cycle has peaked, and we're finally having honest conversations about when distributed systems create more problems than they solve.</p>
<p>This article gives you a practical decision framework — backed by real cost data, case studies, and a ready-to-use checklist — so you can make architecture decisions based on evidence, not hype.</p>
<hr />
<h2 id="heading-the-microservices-hype-cycle-where-we-stand-in-2026">The Microservices Hype Cycle: Where We Stand in 2026</h2>
<p>Microservices exploded in popularity after Netflix and Amazon shared their architecture stories around 2014-2015. The message was compelling: break your monolith into small, independent services, and you'll get better scalability, faster deployments, and team autonomy.</p>
<p>What got lost in translation was context. Netflix had <strong>thousands of engineers</strong>. Amazon had <strong>hundreds of teams</strong> that needed to deploy independently. The architecture solved problems at a scale that most organizations will never reach.</p>
<p>Fast forward to 2026, and the pendulum is swinging back:</p>
<ul>
<li><p>The <strong>modular monolith</strong> pattern has emerged as the pragmatic middle ground</p>
</li>
<li><p>Major cloud providers now offer guides on <em>when not to</em> use microservices (even AWS)</p>
</li>
<li><p>High-profile teams like Amazon Prime Video have publicly moved services back to monoliths</p>
</li>
<li><p>The operational cost gap between microservices and monoliths is often <strong>3-5x</strong> when accounting for infrastructure, tooling, and platform team overhead</p>
</li>
</ul>
<p>The industry consensus is shifting from "microservices by default" to <strong>"microservices by necessity."</strong></p>
<hr />
<h2 id="heading-7-scenarios-where-microservices-are-the-wrong-choice">7 Scenarios Where Microservices Are the Wrong Choice</h2>
<p>Not every project needs a distributed architecture. Here are seven concrete scenarios where microservices will likely hurt more than help.</p>
<h3 id="heading-1-your-team-cant-staff-autonomous-teams-per-service">1. Your Team Can't Staff Autonomous Teams Per Service</h3>
<p>Microservices solve an <strong>organizational problem</strong> as much as a technical one. They allow large teams to work independently without stepping on each other's code. Each microservice ideally needs a dedicated team of 5-8 people (Amazon's "two-pizza team" concept) who can own it end-to-end.</p>
<p>If your organization isn't large enough to staff autonomous teams per service — and for most companies, that means having dozens of developers — you're adding distributed systems complexity without the organizational benefit.</p>
<p><strong>Rule of thumb:</strong> If your entire engineering team fits in one meeting room, you probably don't need microservices.</p>
<h3 id="heading-2-youre-building-an-mvp-or-early-stage-product">2. You're Building an MVP or Early-Stage Product</h3>
<p>In the early stages, your domain model is still evolving. You don't know which boundaries will be stable enough to become service boundaries. Premature decomposition means you'll spend more time refactoring service boundaries than building features.</p>
<p>As Martin Fowler <a target="_blank" href="https://martinfowler.com/bliki/MonolithFirst.html">observed</a>: "Almost all the successful microservice stories have started with a monolith that got too big and was broken up."</p>
<p><strong>What to do instead:</strong> Build a well-structured monolith with clear module boundaries. You can extract services later when you have real data about which components need independent scaling.</p>
<h3 id="heading-3-your-domain-boundaries-are-unclear">3. Your Domain Boundaries Are Unclear</h3>
<p>Microservices work best when you have well-defined bounded contexts (in Domain-Driven Design terms). If your team frequently debates where a feature "belongs," or if services constantly need to call each other for basic operations, your boundaries are wrong.</p>
<p>A <strong>distributed monolith</strong> — microservices that can't function independently — is the worst of both worlds: all the network overhead with none of the autonomy benefits.</p>
<h3 id="heading-4-your-team-lacks-devops-maturity">4. Your Team Lacks DevOps Maturity</h3>
<p>Microservices require a significant operational foundation:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Capability</td><td>Required For Microservices</td><td>Monolith Alternative</td></tr>
</thead>
<tbody>
<tr>
<td>Container orchestration (K8s)</td><td>Running dozens of services</td><td>Single deployment</td></tr>
<tr>
<td>Service mesh (Istio/Linkerd)</td><td>Service-to-service communication</td><td>Function calls</td></tr>
<tr>
<td>Distributed tracing (Jaeger)</td><td>Debugging across services</td><td>Stack traces</td></tr>
<tr>
<td>CI/CD per service</td><td>Independent deployments</td><td>One pipeline</td></tr>
<tr>
<td>Centralized logging</td><td>Correlating logs across services</td><td>grep</td></tr>
</tbody>
</table>
</div><p>If your team doesn't already have these capabilities, the <strong>infrastructure tax</strong> will consume more engineering time than feature development.</p>
<h3 id="heading-5-your-application-has-low-traffic-and-no-independent-scaling-needs">5. Your Application Has Low Traffic and No Independent Scaling Needs</h3>
<p>If all parts of your system scale together and your peak traffic can be handled by a single well-provisioned server (or a simple auto-scaling group), microservices add network latency and operational complexity for zero benefit.</p>
<p><strong>Network calls are orders of magnitude slower than in-process function calls</strong> — a typical HTTP call between services takes 1-5 milliseconds, while an in-process function call completes in microseconds or nanoseconds. Every service boundary you introduce adds latency, potential failure points, and debugging complexity.</p>
<h3 id="heading-6-you-need-strong-data-consistency">6. You Need Strong Data Consistency</h3>
<p>Microservices favor <strong>eventual consistency</strong> — each service owns its data, and changes propagate asynchronously. If your domain requires strong transactional consistency (financial systems, inventory management, booking systems), you'll need to implement distributed transactions (sagas, two-phase commit) that are notoriously difficult to get right.</p>
<p>A monolith with a single database gives you ACID transactions for free.</p>
<h3 id="heading-7-youre-a-startup-with-limited-budget">7. You're a Startup With Limited Budget</h3>
<p>The total cost of ownership for microservices is significantly higher:</p>
<ul>
<li><p><strong>Infrastructure</strong>: More containers, load balancers, service meshes</p>
</li>
<li><p><strong>Tooling</strong>: Observability platforms, API gateways, secrets management</p>
</li>
<li><p><strong>People</strong>: Platform engineers command <a target="_blank" href="https://www.glassdoor.com/Salaries/platform-engineer-salary-SRCH_KO0,17.htm">$140,000-$180,000/year salaries</a> (US average)</p>
</li>
<li><p><strong>Cognitive overhead</strong>: Every developer needs to understand distributed systems patterns</p>
</li>
</ul>
<p>For a startup, that money and engineering time are better spent on product development.</p>
<hr />
<h2 id="heading-the-real-cost-why-microservices-are-3-5x-more-expensive">The Real Cost: Why Microservices Are 3-5x More Expensive</h2>
<p>The cost gap between microservices and monoliths is wider than most teams expect. Here's where the money goes:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Cost Category</td><td>Monolith</td><td>Microservices</td><td>Why It's Higher</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Compute</strong></td><td>Single process, efficient</td><td>Dozens of containers, each with overhead</td><td>Each service needs its own resources, plus orchestration</td></tr>
<tr>
<td><strong>Networking</strong></td><td>In-process calls (free)</td><td>Cross-service HTTP/gRPC calls</td><td>Load balancers, service mesh, API gateways</td></tr>
<tr>
<td><strong>Observability</strong></td><td>Stack traces, single log stream</td><td>Distributed tracing, log correlation</td><td>Tools like Datadog/New Relic charge per host</td></tr>
<tr>
<td><strong>CI/CD</strong></td><td>One pipeline</td><td>Pipeline per service</td><td>Build times multiply, artifact storage grows</td></tr>
<tr>
<td><strong>Database</strong></td><td>One database, ACID for free</td><td>Database per service</td><td>More instances, plus eventual consistency tooling</td></tr>
<tr>
<td><strong>Platform team</strong></td><td>Not needed</td><td>2-3 dedicated engineers</td><td>Someone must maintain K8s, service mesh, pipelines</td></tr>
</tbody>
</table>
</div><p><strong>The multiplier effect is real.</strong> When the Amazon Prime Video monitoring team <a target="_blank" href="https://www.thestack.technology/amazon-prime-video-microservices-monolith/">moved back to a monolith</a>, they saw a 90% infrastructure cost reduction for that service. When <a target="_blank" href="https://grapeup.com/blog/the-hidden-cost-of-overengineering-microservices/">Grape Up reported</a> consolidating a client from 25 to 5 services, the result was an 82% cost reduction.</p>
<p>These aren't outliers — they're what happens when the architecture's complexity exceeds the problem's complexity.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770657665574/d6f9b33f-ba80-4915-a372-363b8c8f52b7.webp" alt class="image--center mx-auto" /></p>
<p>The hidden cost that most comparisons miss is the <strong>platform team</strong>. Microservices don't run themselves — someone needs to maintain the Kubernetes clusters, service mesh, deployment pipelines, and monitoring infrastructure. That's typically 2-3 dedicated platform engineers (<a target="_blank" href="https://www.glassdoor.com/Salaries/platform-engineer-salary-SRCH_KO0,17.htm">earning $140-180K/year in the US</a>) who could otherwise be building product features.</p>
<hr />
<h2 id="heading-case-studies-when-teams-reversed-course">Case Studies: When Teams Reversed Course</h2>
<h3 id="heading-amazon-prime-video-monitoring-90-cost-reduction">Amazon Prime Video Monitoring: 90% Cost Reduction</h3>
<p>In 2023, the Amazon Prime Video team published a case study (originally on primevideotech.com, now <a target="_blank" href="https://www.thestack.technology/amazon-prime-video-microservices-monolith/">covered by The Stack</a> and <a target="_blank" href="https://devclass.com/2023/05/05/reduce-costs-by-90-by-moving-from-microservices-to-monolith-amazon-internal-case-study-raises-eyebrows/">DevClass</a>) about moving their <strong>audio/video quality monitoring service</strong> from a microservices architecture (using AWS Lambda and Step Functions) back to a single-process application. The result? A <strong>90% reduction in infrastructure costs</strong> for that specific service.</p>
<p>Important context: this was one monitoring tool within Prime Video, not the entire platform. But the lesson is universal — their microservices were passing large volumes of video data between services through S3, creating enormous data transfer costs. Consolidating into a single process eliminated the inter-service communication entirely.</p>
<p>As <a target="_blank" href="https://thenewstack.io/amazon-prime-videos-microservices-move-doesnt-lead-to-a-monolith-after-all/">The New Stack noted</a>, what they built was arguably a modular monolith — a single deployable unit with well-separated internal components. Amazon's core services remain microservices-based at a scale that justifies the complexity.</p>
<h3 id="heading-grape-up-client-25-services-down-to-5">Grape Up Client: 25 Services Down to 5</h3>
<p>Consulting firm Grape Up <a target="_blank" href="https://grapeup.com/blog/the-hidden-cost-of-overengineering-microservices/">documented a client engagement</a> where they consolidated 25 microservices into 5 well-defined services. The reported results:</p>
<ul>
<li><p><strong>82% reduction</strong> in cloud infrastructure costs</p>
</li>
<li><p><strong>70% reduction</strong> in monitoring tool costs</p>
</li>
<li><p>10 databases migrated into 5</p>
</li>
<li><p>3 cache instances reduced to 1</p>
</li>
</ul>
<p>The original decomposition had been driven by the "one service per entity" anti-pattern — each database table essentially had its own service, leading to constant inter-service calls for basic operations. <em>(Note: the client is anonymous, as is typical for consulting case studies.)</em></p>
<h3 id="heading-my-own-journey-from-aws-everything-to-pragmatic-simplicity">My Own Journey: From AWS Everything to Pragmatic Simplicity</h3>
<p>I spent years building on AWS Lambda and ECS, decomposing everything into microservices because that's what "real architects" were supposed to do. Each function was independently deployable. Each service had its own database. The architecture diagrams looked impressive.</p>
<p>But the reality was different:</p>
<ul>
<li><p><strong>Cold starts</strong> on Lambda added latency that users noticed</p>
</li>
<li><p><strong>Debugging</strong> a request that touched 6 services required correlating logs across multiple CloudWatch log groups</p>
</li>
<li><p><strong>Local development</strong> was painful — you can't easily run 15 services on your laptop</p>
</li>
<li><p><strong>Deployment coordination</strong> still existed because services had implicit dependencies</p>
</li>
</ul>
<p>I gradually migrated to <strong>Next.js + NestJS</strong> deployed on <strong>Vercel, Railway, and <code>Fly.io</code></strong>. The result was a system that was simpler to develop, cheaper to run, and faster to iterate on. Not because these tools are inherently better than AWS services, but because the architecture matched my actual scale and team size.</p>
<p>The lesson: <strong>the right architecture is the one that lets you ship features, not the one that looks best on a whiteboard.</strong></p>
<hr />
<h2 id="heading-the-solutions-architects-decision-framework">The Solutions Architect's Decision Framework</h2>
<p>Instead of debating microservices vs. monolith in the abstract, use this scoring matrix to evaluate your specific situation. Rate each dimension from 1 (favors monolith) to 5 (favors microservices):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770657694824/5368adf0-15c4-492f-b6dd-8b35dceb9f99.webp" alt class="image--center mx-auto" /></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Dimension</td><td>1 (Monolith)</td><td>3 (Either)</td><td>5 (Microservices)</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Team size</strong></td><td>&lt; 15 developers</td><td>15-50</td><td>50+</td></tr>
<tr>
<td><strong>Domain maturity</strong></td><td>Exploring / pivoting</td><td>Stable core, evolving edges</td><td>Well-defined bounded contexts</td></tr>
<tr>
<td><strong>Scaling needs</strong></td><td>Uniform traffic</td><td>Some hotspots</td><td>Components scale independently</td></tr>
<tr>
<td><strong>DevOps maturity</strong></td><td>Manual deployments</td><td>CI/CD in place</td><td>K8s, service mesh, observability</td></tr>
<tr>
<td><strong>Deployment frequency</strong></td><td>Weekly / monthly</td><td>Daily</td><td>Multiple times per day per team</td></tr>
<tr>
<td><strong>Data consistency</strong></td><td>Strong ACID required</td><td>Mix of consistent and eventual</td><td>Eventual consistency acceptable</td></tr>
<tr>
<td><strong>Budget</strong></td><td>Constrained</td><td>Moderate</td><td>Significant infrastructure budget</td></tr>
<tr>
<td><strong>Organizational structure</strong></td><td>Single team</td><td>Few teams</td><td>Multiple autonomous teams</td></tr>
</tbody>
</table>
</div><h3 id="heading-how-to-interpret-your-score">How to Interpret Your Score</h3>
<ul>
<li><p><strong>8-16 points: Monolith</strong> — A well-structured monolith is your best bet. Focus on clean module boundaries and solid testing.</p>
</li>
<li><p><strong>17-28 points: Modular Monolith</strong> — You need better separation than a traditional monolith but don't need the overhead of full microservices. This is the sweet spot for most organizations.</p>
</li>
<li><p><strong>29-40 points: Microservices</strong> — You have the scale, team structure, and operational maturity to benefit from microservices. Proceed with clear domain boundaries.</p>
</li>
</ul>
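<p>To make the matrix concrete, here is the same scoring logic as a small helper. This is an illustrative sketch; the dimension names are mine, and the score bands come from the table and interpretation above:</p>
<pre><code class="lang-javascript">// Turn the eight 1-5 ratings into an architecture recommendation.
// Score bands match the interpretation above: 8-16, 17-28, 29-40.
const DIMENSIONS = [
  "teamSize", "domainMaturity", "scalingNeeds", "devopsMaturity",
  "deployFrequency", "dataConsistency", "budget", "orgStructure",
];

function recommendArchitecture(ratings) {
  const score = DIMENSIONS.reduce((sum, dim) =&gt; {
    const r = ratings[dim];
    if (!Number.isInteger(r) || r &lt; 1 || r &gt; 5) {
      throw new Error(`Rate "${dim}" from 1 (monolith) to 5 (microservices)`);
    }
    return sum + r;
  }, 0);

  if (score &lt;= 16) return { score, recommendation: "Monolith" };
  if (score &lt;= 28) return { score, recommendation: "Modular Monolith" };
  return { score, recommendation: "Microservices" };
}

// Example: a 12-person team with a stable domain and basic CI/CD
console.log(recommendArchitecture({
  teamSize: 1, domainMaturity: 3, scalingNeeds: 2, devopsMaturity: 2,
  deployFrequency: 2, dataConsistency: 1, budget: 1, orgStructure: 1,
}));
// → { score: 13, recommendation: 'Monolith' }
</code></pre>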
<h3 id="heading-decision-flowchart">Decision Flowchart</h3>
<pre><code class="lang-plaintext">Start
  │
  ├─ Team &lt; 15 people? ──── YES ──→ Monolith
  │         │
  │        NO
  │         │
  ├─ Domain boundaries clear? ── NO ──→ Monolith (define boundaries first)
  │         │
  │        YES
  │         │
  ├─ DevOps maturity high? ──── NO ──→ Modular Monolith
  │         │
  │        YES
  │         │
  ├─ Independent scaling needed? ── NO ──→ Modular Monolith
  │         │
  │        YES
  │         │
  └─ Team &gt; 50 &amp; multiple teams? ── YES ──→ Microservices
            │
           NO ──→ Modular Monolith
</code></pre>
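<p>Read top to bottom, the flowchart is just a chain of early returns. Here is a direct transcription as a function (the parameter names are mine, for illustration):</p>
<pre><code class="lang-javascript">// The decision flowchart above, one question per line.
function decideArchitecture({ teamSize, boundariesClear, devopsMature,
                              independentScaling, multipleTeams }) {
  if (teamSize &lt; 15) return "Monolith";
  if (!boundariesClear) return "Monolith (define boundaries first)";
  if (!devopsMature) return "Modular Monolith";
  if (!independentScaling) return "Modular Monolith";
  if (teamSize &gt; 50 &amp;&amp; multipleTeams) return "Microservices";
  return "Modular Monolith";
}

console.log(decideArchitecture({
  teamSize: 30, boundariesClear: true, devopsMature: true,
  independentScaling: true, multipleTeams: false,
}));
// → Modular Monolith
</code></pre>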
<hr />
<h2 id="heading-the-modular-monolith-the-third-option-most-teams-ignore">The Modular Monolith: The Third Option Most Teams Ignore</h2>
<p>If your score lands in the 17-28 range (and in my experience, most teams do), the <strong>modular monolith</strong> deserves serious consideration.</p>
<p>A modular monolith is a single deployable unit with strictly enforced module boundaries:</p>
<ul>
<li><p><strong>Each module</strong> owns its domain logic and data access</p>
</li>
<li><p><strong>Modules communicate</strong> through well-defined internal APIs (not direct database queries)</p>
</li>
<li><p><strong>Shared kernel</strong> is kept minimal — only truly cross-cutting concerns</p>
</li>
<li><p><strong>Each module</strong> can be independently tested</p>
</li>
</ul>
<p>The beauty of this approach is that it gives you a clear <strong>migration path</strong>. When (and if) a module genuinely needs to become an independent service — because it needs to scale independently, or a separate team needs to own it — you can extract it with minimal refactoring because the boundaries are already defined.</p>
<p>Frameworks like NestJS modules, Spring Boot's module system, and .NET's project structure make this pattern straightforward to implement.</p>
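<p>In code terms, the boundary rule can be sketched in plain JavaScript (a toy example; the module names and in-memory storage are illustrative, not a real persistence layer):</p>
<pre><code class="lang-javascript">// billing module: owns its data access and exposes a narrow internal API.
// Other modules call chargeOrder(); they never touch billing's storage.
const billingDb = new Map(); // stand-in for the module's private tables

const billing = {
  chargeOrder(orderId, amountCents) {
    if (!Number.isInteger(amountCents) || amountCents &lt;= 0) {
      throw new Error("amountCents must be a positive integer");
    }
    billingDb.set(orderId, { amountCents, status: "charged" });
    return { orderId, status: "charged" };
  },
};

// orders module: depends only on billing's public surface, so billing
// could later be extracted into a service without touching orders.
const orders = {
  placeOrder(orderId, amountCents) {
    const receipt = billing.chargeOrder(orderId, amountCents);
    return { orderId, paid: receipt.status === "charged" };
  },
};

console.log(orders.placeOrder("ord-1", 4200));
// → { orderId: 'ord-1', paid: true }
</code></pre>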
<p><strong>The modular monolith is not a compromise — it's the optimal architecture for teams of 10-100 engineers who need clean separation without distributed systems complexity.</strong></p>
<hr />
<h2 id="heading-pre-migration-readiness-checklist">Pre-Migration Readiness Checklist</h2>
<p>Before committing to microservices, ensure your organization can answer "yes" to these questions:</p>
<ul>
<li><p>[ ] We have <strong>automated CI/CD pipelines</strong> for every service</p>
</li>
<li><p>[ ] We have <strong>container orchestration</strong> (Kubernetes or equivalent) in production</p>
</li>
<li><p>[ ] We have <strong>centralized logging and distributed tracing</strong> across services</p>
</li>
<li><p>[ ] We have defined <strong>clear bounded contexts</strong> with minimal cross-service dependencies</p>
</li>
<li><p>[ ] We have a dedicated <strong>platform / DevOps team</strong> (or budget for one)</p>
</li>
<li><p>[ ] Each service can be <strong>deployed independently</strong> without coordinating with other teams</p>
</li>
<li><p>[ ] We have a strategy for <strong>data consistency</strong> across service boundaries</p>
</li>
<li><p>[ ] We have <strong>service-level SLAs</strong> and monitoring for each service</p>
</li>
<li><p>[ ] Our developers are comfortable with <strong>distributed systems patterns</strong> (circuit breakers, retries, sagas)</p>
</li>
<li><p>[ ] We have a plan for <strong>local development</strong> that doesn't require running all services</p>
</li>
<li><p>[ ] Our <strong>monthly infrastructure budget</strong> can absorb a 2-3x increase</p>
</li>
<li><p>[ ] We have <strong>enough developers to staff autonomous teams</strong> (5-8 people) per service</p>
</li>
</ul>
<p><strong>If you answered "no" to more than 3 of these, you're not ready for microservices.</strong> Consider a modular monolith as your next step, and revisit this checklist in 6-12 months.</p>
<hr />
<h2 id="heading-conclusion-making-the-right-call">Conclusion: Making the Right Call</h2>
<p>The microservices vs. monolith debate has always been a false binary. The real question is: <strong>what architecture gives your specific team the best chance of shipping quality software quickly?</strong></p>
<p>For most teams in 2026, the answer is somewhere between a monolith and full microservices — and that's perfectly fine. The modular monolith pattern gives you clean boundaries, testability, and a future migration path without the operational tax of distributed systems.</p>
<p>Here's what I wish someone had told me when I was spinning up my fifth Lambda function for a project with three contributors:</p>
<blockquote>
<p>Start with the simplest architecture that could work. Add complexity only when you have evidence — not speculation — that you need it.</p>
</blockquote>
<p>Architecture decisions should be driven by your team's size, your domain's maturity, your operational capabilities, and your budget. Not by conference talks, not by what FAANG companies do, and definitely not by your architecture diagram's aesthetic appeal.</p>
<p><strong>Use the decision framework in this article.</strong> Score your project honestly. And if the score says monolith or modular monolith — embrace it. You'll ship faster, spend less, and sleep better.</p>
<p>Next time someone says "we need microservices," ask them to score it first.</p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Wed, 11 Feb 2026 10:00:22 GMT</pubDate>
      <category>Microservices</category>
      <category>software architecture</category>
      <category>Web Development</category>
      <category>System Design</category>
      <category>Devops</category>
    </item>
    <item>
      <title>Building a Zero-Cost Enterprise Email API: Complete Guide to Timing Attack and Header Injection Protection</title>
      <link>https://suprahuang.cc/cloudflare-workers-secure-email-api</link>
      <guid isPermaLink="true">https://suprahuang.cc/cloudflare-workers-secure-email-api</guid>
      <description>Have you ever found yourself in this situation: your project needs to send system notifications, but SendGrid charges monthly fees, AWS SES setup is complicated, and self-hosting an email server is a maintenance nightmare?
In this article, I&apos;ll share...</description>
      <content:encoded><![CDATA[<p>Have you ever found yourself in this situation: your project needs to send system notifications, but SendGrid charges monthly fees, AWS SES setup is complicated, and self-hosting an email server is a maintenance nightmare?</p>
<p>In this article, I'll share how I built a <strong>completely free</strong> email notification API using <strong>Cloudflare Workers + Email Routing</strong>. More importantly, I'll dive deep into two often-overlooked security attacks: <strong>Timing Attacks</strong> and <strong>Email Header Injection</strong>—and how to defend against them.</p>
<p>This isn't just theory. I've open-sourced the entire project: <a target="_blank" href="https://github.com/supra126/worker-email-notifier">worker-email-notifier</a>. Feel free to use it!</p>
<hr />
<h2 id="heading-why-build-your-own-email-notification-system">🤔 Why Build Your Own Email Notification System?</h2>
<h3 id="heading-the-cost-problem-with-paid-services">The Cost Problem with Paid Services</h3>
<p>Let's look at the pricing of mainstream email services:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Service</td><td>Free Tier</td><td>Beyond Free Tier</td></tr>
</thead>
<tbody>
<tr>
<td>SendGrid</td><td>100 emails/day (60-day trial only)</td><td>Starting at $19.95/month</td></tr>
<tr>
<td>AWS SES</td><td>3,000 emails/month (12-month trial)</td><td>$0.10/1000 emails</td></tr>
<tr>
<td>Mailgun</td><td>100 emails/day</td><td>Starting at $15/month</td></tr>
<tr>
<td>Postmark</td><td>100 emails/month</td><td>Starting at $15/month</td></tr>
</tbody>
</table>
</div><p>For personal projects or small teams, these costs add up. More importantly—<strong>I just want to send a system notification. Why does it have to be this complicated?</strong></p>
<h3 id="heading-cloudflares-free-tier">Cloudflare's Free Tier</h3>
<p>Cloudflare Workers + Email Routing offers:</p>
<ul>
<li><p>✅ <strong>100,000 API requests/day</strong></p>
</li>
<li><p>✅ <strong>Generous email sending limits</strong></p>
</li>
<li><p>✅ <strong>No credit card required</strong></p>
</li>
<li><p>✅ <strong>Global edge network with ultra-low latency</strong></p>
</li>
</ul>
<p>For system notifications, monitoring alerts, and CI/CD notifications, this quota is more than enough.</p>
<h3 id="heading-use-cases-what-its-for-and-what-its-not">Use Cases: What It's For and What It's Not</h3>
<p>Before diving in, let's clarify what this system is designed for:</p>
<p><strong>✅ Good fit:</strong></p>
<ul>
<li><p>Server monitoring alerts (high CPU, service down)</p>
</li>
<li><p>Application event notifications (new orders, payment success)</p>
</li>
<li><p>CI/CD pipeline notifications (build success/failure)</p>
</li>
<li><p>IoT device alerts</p>
</li>
<li><p>Internal team notifications</p>
</li>
</ul>
<p><strong>❌ Not suitable for:</strong></p>
<ul>
<li><p>Marketing emails / newsletters (Email Routing has whitelist restrictions)</p>
</li>
<li><p>User-to-user messaging</p>
</li>
<li><p>Transactional emails to arbitrary external users</p>
</li>
</ul>
<p>Clear boundaries are important—this is a design decision, not a limitation.</p>
<hr />
<h2 id="heading-technology-stack-and-architecture-design">🏗️ Technology Stack and Architecture Design</h2>
<h3 id="heading-why-cloudflare-workers">Why Cloudflare Workers?</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Feature</td><td>Cloudflare Workers</td><td>AWS Lambda</td></tr>
</thead>
<tbody>
<tr>
<td>Cold start</td><td>Almost none</td><td>Can be seconds</td></tr>
<tr>
<td>Global deployment</td><td>Automatic (edge network)</td><td>Manual configuration</td></tr>
<tr>
<td>Free tier</td><td>100,000 req/day</td><td>1M req/month</td></tr>
<tr>
<td>Email integration</td><td>Native Email Routing</td><td>Requires SES</td></tr>
<tr>
<td>Setup complexity</td><td>Low</td><td>Medium-High</td></tr>
</tbody>
</table>
</div><p>The biggest advantage of Workers is <strong>native integration with Email Routing</strong>—no additional email service needed. Just configure DNS and you're ready to send.</p>
<h3 id="heading-system-architecture-overview">System Architecture Overview</h3>
<pre><code class="lang-mermaid">flowchart TD
    A[👤 Client] --&gt;|REST API Request| B[⚡ Cloudflare Worker]
    B --&gt; C[1. CORS Validation]
    C --&gt; D[2. API Key Check 🔒]
    D --&gt;|Timing Attack Protection| E[3. Input Validation 🔒]
    E --&gt;|Header Injection Protection| F[4. Email Sending]
    F --&gt; G[📧 Email Routing]
    G --&gt;|Recipient Whitelist| H[✅ Recipient Inbox]
</code></pre>
<p>Key design decisions:</p>
<ol>
<li><p><strong>Multi-platform isolation</strong>: Each platform has its own sender, API key, and recipient whitelist</p>
</li>
<li><p><strong>Security-first</strong>: Multiple validation layers before sending any email</p>
</li>
<li><p><strong>Flexible configuration</strong>: All settings managed via <code>wrangler.toml</code></p>
</li>
</ol>
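<p>The validation layers in the diagram can be sketched as an ordered chain of early returns in the Worker's <code>fetch</code> handler. This is a simplified sketch, not the project's actual code; the <code>env</code> names are illustrative, and a real implementation would use a constant-time key comparison (covered below):</p>
<pre><code class="lang-javascript">const jsonResponse = (obj, status) =&gt;
  new Response(JSON.stringify(obj), {
    status, headers: { "Content-Type": "application/json" },
  });

const worker = {
  async fetch(request, env) {
    // 1. CORS: reject requests from unknown origins up front
    const origin = request.headers.get("Origin");
    if (origin &amp;&amp; origin !== env.ALLOWED_ORIGIN) {
      return jsonResponse({ success: false, error: "Origin not allowed" }, 403);
    }

    // 2. API key check (real code should use a constant-time comparison)
    if (request.headers.get("X-API-Key") !== env.API_KEY) {
      return jsonResponse({ success: false, error: "Unauthorized" }, 401);
    }

    // 3. Input validation: block header injection before building the email
    const { subject = "", to = [] } = await request.json();
    if (/[\r\n]/.test(subject) || to.length === 0) {
      return jsonResponse({ success: false, error: "Invalid input" }, 400);
    }

    // 4. Hand off to Email Routing, which enforces the recipient whitelist
    return jsonResponse({ success: true, message: "queued" }, 200);
  },
};

export default worker;
</code></pre>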
<hr />
<h2 id="heading-core-implementation">💻 Core Implementation</h2>
<h3 id="heading-project-structure">Project Structure</h3>
<pre><code class="lang-plaintext">worker-email-notifier/
├── src/
│   └── index.js          # Main code (~450 lines)
├── wrangler.toml         # Workers configuration
├── wrangler.toml.example # Configuration template
└── package.json
</code></pre>
<h3 id="heading-platform-configuration-wranglertoml">Platform Configuration (wrangler.toml)</h3>
<pre><code class="lang-toml"><span class="hljs-section">[[send_email]]</span>
<span class="hljs-attr">name</span> = <span class="hljs-string">"MAILER_A"</span>
<span class="hljs-attr">destination_address</span> = <span class="hljs-string">"boss@gmail.com"</span>
<span class="hljs-attr">allowed_destination_addresses</span> = [<span class="hljs-string">"boss@gmail.com"</span>, <span class="hljs-string">"admin@company.com"</span>]

<span class="hljs-section">[vars.PLATFORMS.platform-a]</span>
<span class="hljs-attr">senderEmail</span> = <span class="hljs-string">"noreply@your-domain.com"</span>
<span class="hljs-attr">senderName</span> = <span class="hljs-string">"Platform A Notifications"</span>
<span class="hljs-attr">mailer</span> = <span class="hljs-string">"MAILER_A"</span>
</code></pre>
<p>Each platform binds to a <code>MAILER</code>, and each <code>MAILER</code> has its own whitelist—that's the key to isolation.</p>
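<p>At request time, the Worker can resolve this configuration in one step. A sketch (the <code>getPlatform</code> helper and the stubbed <code>env</code> are illustrative, not the project's actual code):</p>
<pre><code class="lang-javascript">// Resolve a platform's settings and its bound mailer from env.
// Property names mirror the wrangler.toml excerpt above.
function getPlatform(env, platformId) {
  const platform = (env.PLATFORMS || {})[platformId];
  if (!platform) {
    throw new Error(`Unknown platform: ${platformId}`);
  }
  // env[platform.mailer] is the send_email binding (e.g. MAILER_A)
  return { ...platform, binding: env[platform.mailer] };
}

// Stubbed env for illustration; in a Worker these come from wrangler.toml
const env = {
  PLATFORMS: {
    "platform-a": {
      senderEmail: "noreply@your-domain.com",
      senderName: "Platform A Notifications",
      mailer: "MAILER_A",
    },
  },
  MAILER_A: { send: async () =&gt; {} }, // stand-in for the real binding
};

console.log(getPlatform(env, "platform-a").senderEmail);
// → noreply@your-domain.com
</code></pre>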
<h3 id="heading-email-sending-logic">Email Sending Logic</h3>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createMimeMessage } <span class="hljs-keyword">from</span> <span class="hljs-string">"mimetext"</span>;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">sendEmail</span>(<span class="hljs-params">mailer, from, fromName, to, subject, content, html</span>) </span>{
  <span class="hljs-keyword">const</span> msg = createMimeMessage();
  msg.setSender({ <span class="hljs-attr">name</span>: fromName, <span class="hljs-attr">addr</span>: <span class="hljs-keyword">from</span> });
  msg.setRecipient(to);
  msg.setSubject(subject);

  <span class="hljs-comment">// Provide both plain text and HTML versions</span>
  msg.addMessage({
    <span class="hljs-attr">contentType</span>: <span class="hljs-string">"text/plain"</span>,
    <span class="hljs-attr">data</span>: content,
  });

  <span class="hljs-keyword">if</span> (html) {
    msg.addMessage({
      <span class="hljs-attr">contentType</span>: <span class="hljs-string">"text/html"</span>,
      <span class="hljs-attr">data</span>: html,
    });
  }

  <span class="hljs-keyword">const</span> message = <span class="hljs-keyword">new</span> EmailMessage(<span class="hljs-keyword">from</span>, to, msg.asRaw());
  <span class="hljs-keyword">await</span> mailer.send(message);
}
</code></pre>
<p>We use <code>mimetext</code> to build a MIME-compliant message with both plain-text and HTML parts, so mail clients can render whichever version they support.</p>
<hr />
<h2 id="heading-security-protection-part-1-timing-attack-defense">🔐 Security Protection (Part 1): Timing Attack Defense</h2>
<p>This is one of the most important sections of this article. You may never have heard of "timing attacks," but they're a silent threat to API security.</p>
<h3 id="heading-what-is-a-timing-attack">What is a Timing Attack?</h3>
<p>Imagine a combination lock: every time you get a digit right, the lock makes a subtle "click" sound. A thief can listen to the sounds and guess the combination one digit at a time.</p>
<p><strong>Timing attacks work exactly the same way</strong>—attackers measure server response times to deduce your API key.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770699678888/ae83052a-bbf0-4672-a96d-5091d45f1d04.webp" alt class="image--center mx-auto" /></p>
<h3 id="heading-why-is-not-safe">Why is <code>===</code> Not Safe?</h3>
<p>JavaScript string comparison typically short-circuits at the first differing character:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Assume the correct API key is "secret123"</span>
apiKey === <span class="hljs-string">"secret123"</span>

<span class="hljs-comment">// Comparison process:</span>
<span class="hljs-comment">// "a" vs "s" → 1st char differs, immediately returns false (very fast)</span>
<span class="hljs-comment">// "s" vs "s" → same, continue comparing</span>
<span class="hljs-comment">// "sa" vs "se" → 2nd char differs, returns false (slightly slower)</span>
<span class="hljs-comment">// "se" vs "se" → same, continue...</span>
<span class="hljs-comment">// ...and so on</span>
</code></pre>
<p><strong>What's the problem?</strong> The timings below are illustrative; real per-character differences are on the order of nanoseconds, but an attacker can average over many requests to make them measurable:</p>
<ul>
<li><p>First character wrong: comparison time ~0.1ms</p>
</li>
<li><p>First five characters correct: comparison time ~0.5ms</p>
</li>
<li><p>All correct: comparison time ~1ms</p>
</li>
</ul>
<p>Attackers can:</p>
<ol>
<li><p>Try "a000000..." → measure time</p>
</li>
<li><p>Try "b000000..." → measure time</p>
</li>
<li><p>Try "s000000..." → this one's slower! First char is "s"</p>
</li>
<li><p>Try "sa00000..." → measure time</p>
</li>
<li><p>...repeat until the entire API key is guessed</p>
</li>
</ol>
<p>This is why you should <strong>never use</strong> <code>===</code> to compare secrets.</p>
<h3 id="heading-constant-time-algorithm-implementation">Constant-Time Algorithm Implementation</h3>
<p>The solution is "constant-time comparison"—the comparison takes the same amount of time regardless of whether the strings match:</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">timingSafeEqual</span>(<span class="hljs-params">a, b</span>) </span>{
  <span class="hljs-keyword">const</span> encoder = <span class="hljs-keyword">new</span> TextEncoder();
  <span class="hljs-keyword">const</span> aBytes = encoder.encode(a);
  <span class="hljs-keyword">const</span> bBytes = encoder.encode(b);

  <span class="hljs-comment">// Even when lengths differ, perform full comparison</span>
  <span class="hljs-comment">// to avoid leaking length information</span>
  <span class="hljs-keyword">if</span> (aBytes.length !== bBytes.length) {
    <span class="hljs-comment">// Compare bBytes against itself to consume constant time</span>
    <span class="hljs-comment">// proportional to input length, then return false</span>
    <span class="hljs-keyword">let</span> result = <span class="hljs-number">1</span>;
    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; bBytes.length; i++) {
      result |= aBytes[i % aBytes.length] ^ bBytes[i];
    }
    <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
  }

  <span class="hljs-comment">// Use XOR operation, accumulate all differences</span>
  <span class="hljs-keyword">let</span> result = <span class="hljs-number">0</span>;
  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; aBytes.length; i++) {
    result |= aBytes[i] ^ bBytes[i];
  }

  <span class="hljs-comment">// result is 0 only if strings are identical</span>
  <span class="hljs-keyword">return</span> result === <span class="hljs-number">0</span>;
}
</code></pre>
<p><strong>Why does this work?</strong></p>
<ol>
<li><p><strong>XOR operation</strong>: Same = 0, different = non-zero</p>
</li>
<li><p><strong>OR accumulation</strong>: If any bit differs, result won't be 0</p>
</li>
<li><p><strong>Full iteration</strong>: Loop runs completely regardless of match</p>
</li>
<li><p><strong>Constant time</strong>: Execution time depends only on string length, not content</p>
</li>
</ol>
<h3 id="heading-practical-application">Practical Application</h3>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">validateApiKey</span>(<span class="hljs-params">providedKey, env, platformId</span>) </span>{
  <span class="hljs-comment">// Try to get platform-specific key</span>
  <span class="hljs-keyword">const</span> apiKeys = parseApiKeys(env.API_KEYS);
  <span class="hljs-keyword">const</span> platformKey = apiKeys[platformId];

  <span class="hljs-keyword">if</span> (platformKey) {
    <span class="hljs-comment">// Use constant-time comparison!</span>
    <span class="hljs-keyword">return</span> timingSafeEqual(providedKey, platformKey);
  }

  <span class="hljs-comment">// Fall back to shared key</span>
  <span class="hljs-keyword">if</span> (env.API_KEY) {
    <span class="hljs-keyword">return</span> timingSafeEqual(providedKey, env.API_KEY);
  }

  <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
}
</code></pre>
<blockquote>
<p>💡 <strong>Note</strong>: Cloudflare Workers now supports <a target="_blank" href="https://developers.cloudflare.com/workers/examples/protect-against-timing-attacks/"><code>crypto.subtle.timingSafeEqual()</code></a> natively, and also supports <code>crypto.timingSafeEqual()</code> from <code>node:crypto</code> with the <code>nodejs_compat</code> flag enabled. The custom implementation above is kept for educational purposes—in production, prefer the built-in API.</p>
</blockquote>
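<p>For example, with <code>nodejs_compat</code> enabled, the custom helper could be replaced by a few lines like these (a sketch; note that <code>timingSafeEqual</code> from <code>node:crypto</code> throws when the buffer lengths differ, so hashing both inputs to fixed-length digests first is a common way to normalize them):</p>
<pre><code class="lang-javascript">import { createHash, timingSafeEqual } from "node:crypto";

// Constant-time key check built on the runtime primitive.
// Hashing both sides yields equal-length buffers, avoiding the
// length-mismatch throw and hiding the key's length.
function safeKeyCompare(providedKey, expectedKey) {
  const a = createHash("sha256").update(providedKey).digest();
  const b = createHash("sha256").update(expectedKey).digest();
  return timingSafeEqual(a, b);
}

console.log(safeKeyCompare("secret123", "secret123")); // → true
console.log(safeKeyCompare("wrong-key", "secret123")); // → false
</code></pre>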
<hr />
<h2 id="heading-security-protection-part-2-email-header-injection-defense">🛡️ Security Protection (Part 2): Email Header Injection Defense</h2>
<p>The second attack to defend against is "email header injection"—more common but equally overlooked.</p>
<h3 id="heading-what-is-header-injection">What is Header Injection?</h3>
<p>SMTP email structure uses <code>\r\n</code> to separate different headers:</p>
<pre><code class="lang-plaintext">From: sender@example.com\r\n
To: recipient@example.com\r\n
Subject: Hello\r\n
\r\n
Email body...
</code></pre>
<p>If an attacker can inject <code>\r\n</code> into the <code>subject</code>, they can insert arbitrary headers:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Malicious input</span>
<span class="hljs-keyword">const</span> subject = <span class="hljs-string">"Hello\r\nBcc: victim1@example.com, victim2@example.com\r\n\r\nSpam content"</span>;

<span class="hljs-comment">// Actual generated email</span>
<span class="hljs-comment">/*
Subject: Hello
Bcc: victim1@example.com, victim2@example.com

Spam content
*/</span>
</code></pre>
<p>The attacker successfully added their own recipients!</p>
<h3 id="heading-impact-of-the-attack">Impact of the Attack</h3>
<ul>
<li><p>📧 <strong>Spam distribution</strong>: Send massive spam using your domain</p>
</li>
<li><p>🎭 <strong>Phishing</strong>: Forge the From field for phishing attacks</p>
</li>
<li><p>📛 <strong>Domain reputation damage</strong>: Your domain may be blacklisted</p>
</li>
<li><p>🔓 <strong>Data leakage</strong>: Secretly BCC sensitive information to attackers</p>
</li>
</ul>
<h3 id="heading-defense-strategies">Defense Strategies</h3>
<h4 id="heading-strategy-1-strict-newline-detection">Strategy 1: Strict Newline Detection</h4>
<pre><code class="lang-javascript"><span class="hljs-comment">// Check if subject contains newline characters</span>
<span class="hljs-keyword">if</span> (<span class="hljs-regexp">/[\r\n]/</span>.test(subject)) {
  <span class="hljs-keyword">return</span> jsonResponse(
    { <span class="hljs-attr">success</span>: <span class="hljs-literal">false</span>, <span class="hljs-attr">error</span>: <span class="hljs-string">"Invalid subject: contains forbidden characters"</span> },
    <span class="hljs-number">400</span>
  );
}
</code></pre>
<p>Simple but effective—reject any subject containing <code>\r</code> or <code>\n</code>.</p>
<h4 id="heading-strategy-2-strict-email-format-validation">Strategy 2: Strict Email Format Validation</h4>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">isValidEmail</span>(<span class="hljs-params">email</span>) </span>{
  <span class="hljs-comment">// Basic length check</span>
  <span class="hljs-keyword">if</span> (!email || email.length &gt; <span class="hljs-number">254</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
  }

  <span class="hljs-comment">// Prevent consecutive dots (possible path traversal)</span>
  <span class="hljs-keyword">if</span> (<span class="hljs-regexp">/\.\./</span>.test(email)) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
  }

  <span class="hljs-comment">// Prevent leading or trailing dots</span>
  <span class="hljs-keyword">if</span> (email.startsWith(<span class="hljs-string">"."</span>) || email.endsWith(<span class="hljs-string">"."</span>)) {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
  }

  <span class="hljs-comment">// Practical email validation (RFC 5322-inspired)</span>
  <span class="hljs-keyword">const</span> emailRegex = <span class="hljs-regexp">/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+$/</span>;

  <span class="hljs-keyword">return</span> emailRegex.test(email);
}
</code></pre>
<p>This validation function:</p>
<ul>
<li><p>Limits maximum length (254 characters, per RFC 5321)</p>
</li>
<li><p>Blocks consecutive dots (invalid per RFC 5322 dot-atom syntax)</p>
</li>
<li><p>Uses strict regex validation</p>
</li>
</ul>
<h4 id="heading-strategy-3-html-content-escaping-xss-prevention">Strategy 3: HTML Content Escaping (XSS Prevention)</h4>
<p>If you allow HTML email bodies, you also need to guard against XSS:</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">escapeHtml</span>(<span class="hljs-params">text</span>) </span>{
  <span class="hljs-keyword">const</span> map = {
    <span class="hljs-string">"&amp;"</span>: <span class="hljs-string">"&amp;amp;"</span>,
    <span class="hljs-string">"&lt;"</span>: <span class="hljs-string">"&amp;lt;"</span>,
    <span class="hljs-string">"&gt;"</span>: <span class="hljs-string">"&amp;gt;"</span>,
    <span class="hljs-string">'"'</span>: <span class="hljs-string">"&amp;quot;"</span>,
    <span class="hljs-string">"'"</span>: <span class="hljs-string">"&amp;#039;"</span>,
  };
  <span class="hljs-keyword">return</span> text.replace(<span class="hljs-regexp">/[&amp;&lt;&gt;"']/g</span>, <span class="hljs-function">(<span class="hljs-params">char</span>) =&gt;</span> map[char]);
}
</code></pre>
<h3 id="heading-error-message-sanitization">Error Message Sanitization</h3>
<p>Another easily overlooked point—<strong>error messages can also leak sensitive information</strong>:</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">sanitizeErrorMessage</span>(<span class="hljs-params">message</span>) </span>{
  <span class="hljs-keyword">if</span> (<span class="hljs-keyword">typeof</span> message !== <span class="hljs-string">"string"</span>) {
    <span class="hljs-keyword">return</span> <span class="hljs-string">"An error occurred"</span>;
  }

  <span class="hljs-keyword">return</span> message
    <span class="hljs-comment">// Remove stack traces</span>
    .replace(<span class="hljs-regexp">/at\s+.*:\d+:\d+/g</span>, <span class="hljs-string">""</span>)
    <span class="hljs-comment">// Remove file paths</span>
    .replace(<span class="hljs-regexp">/\/[\w/.-]+/g</span>, <span class="hljs-string">"[path]"</span>)
    <span class="hljs-comment">// Remove sensitive keywords</span>
    .replace(<span class="hljs-regexp">/password|secret|key|token/gi</span>, <span class="hljs-string">"[redacted]"</span>)
    .trim()
    .substring(<span class="hljs-number">0</span>, <span class="hljs-number">200</span>);
}
</code></pre>
<p>Never expose internal implementation details in error messages.</p>
<hr />
<h2 id="heading-deployment-and-testing">🚀 Deployment and Testing</h2>
<h3 id="heading-deployment-steps">Deployment Steps</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># 1. Install dependencies</span>
npm install

<span class="hljs-comment"># 2. Copy and modify configuration</span>
cp wrangler.toml.example wrangler.toml
<span class="hljs-comment"># Edit wrangler.toml to set your domain and platforms</span>

<span class="hljs-comment"># 3. Login to Cloudflare</span>
wrangler login

<span class="hljs-comment"># 4. Generate and set API Key</span>
npm run generate-key
wrangler secret put API_KEY

<span class="hljs-comment"># 5. Deploy</span>
npm run deploy
</code></pre>
<h3 id="heading-testing-the-api">Testing the API</h3>
<pre><code class="lang-bash">curl -X POST https://email-notifier.your-subdomain.workers.dev \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -H <span class="hljs-string">"X-API-Key: your-api-key"</span> \
  -d <span class="hljs-string">'{
    "platformId": "platform-a",
    "to": ["boss@gmail.com"],
    "subject": "🔔 System Notification",
    "content": "This is a test email",
    "html": "&lt;h1&gt;System Notification&lt;/h1&gt;&lt;p&gt;This is a test email&lt;/p&gt;"
  }'</span>
</code></pre>
<p>Success response:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"success"</span>: <span class="hljs-literal">true</span>,
  <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Email sent: 1 success, 0 failed"</span>,
  <span class="hljs-attr">"platform"</span>: <span class="hljs-string">"platform-a"</span>,
  <span class="hljs-attr">"details"</span>: [
    { <span class="hljs-attr">"to"</span>: <span class="hljs-string">"boss@gmail.com"</span>, <span class="hljs-attr">"status"</span>: <span class="hljs-string">"fulfilled"</span> }
  ]
}
</code></pre>
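<p>If you're calling the API from application code rather than curl, the same request can be wrapped in a small helper. This is a sketch, not code from the project: the endpoint URL, <code>X-API-Key</code> header, and body fields mirror the curl example above, and <code>buildNotifyRequest</code> is a hypothetical name.</p>

```javascript
// Sketch: build the same request as the curl example, as plain data.
// The URL, header name, and body shape come from the curl command above;
// buildNotifyRequest itself is a hypothetical helper, not project code.
function buildNotifyRequest(baseUrl, apiKey, payload) {
  return {
    url: baseUrl,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-API-Key": apiKey,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Usage in a browser or Worker (sketch):
// const { url, init } = buildNotifyRequest(
//   "https://email-notifier.your-subdomain.workers.dev",
//   "your-api-key",
//   { platformId: "platform-a", to: ["boss@gmail.com"],
//     subject: "Test", content: "Hello" }
// );
// const res = await fetch(url, init);
```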
<h3 id="heading-common-issues-troubleshooting">Troubleshooting Common Issues</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Issue</td><td>Possible Cause</td><td>Solution</td></tr>
</thead>
<tbody>
<tr>
<td>401 Unauthorized</td><td>Wrong API Key</td><td>Verify header name is <code>X-API-Key</code></td></tr>
<tr>
<td>400 Invalid platform</td><td>platformId doesn't exist</td><td>Check platform config in wrangler.toml</td></tr>
<tr>
<td>500 Email failed</td><td>Recipient not in whitelist</td><td>Add to <code>allowed_destination_addresses</code></td></tr>
<tr>
<td>CORS error</td><td>Origin not allowed</td><td>Set <code>ALLOWED_ORIGINS</code> environment variable</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-open-source-project-and-recommendations">📦 Open Source Project and Recommendations</h2>
<h3 id="heading-project-information">Project Information</h3>
<ul>
<li><p><strong>GitHub</strong>: <a target="_blank" href="https://github.com/supra126/worker-email-notifier">supra126/worker-email-notifier</a></p>
</li>
<li><p><strong>License</strong>: MIT License</p>
</li>
<li><p><strong>Documentation</strong>: Available in English and Traditional Chinese</p>
</li>
</ul>
<h3 id="heading-free-tier-limits">Free Tier Limits</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Item</td><td>Free Quota</td><td>Suitable For</td></tr>
</thead>
<tbody>
<tr>
<td>API requests</td><td>100,000/day</td><td>Most small-medium applications</td></tr>
<tr>
<td>Email sending</td><td>Generous daily limit</td><td>More than enough for system notifications</td></tr>
</tbody>
</table>
</div><h3 id="heading-extension-suggestions">Extension Suggestions</h3>
<p>Want to extend the functionality? Here are some directions:</p>
<ol>
<li><p><strong>Add platforms</strong>: Add new <code>[[send_email]]</code> and <code>PLATFORMS</code> config in <code>wrangler.toml</code></p>
</li>
<li><p><strong>Email templates</strong>: Store HTML templates in KV Storage</p>
</li>
<li><p><strong>Rate limiting</strong>: Integrate with Cloudflare WAF Rate Limiting rules</p>
</li>
<li><p><strong>Logging</strong>: Use Workers Analytics or Logpush</p>
</li>
</ol>
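<p>As a sketch of direction 2 (not project code): the HTML template would live in Workers KV under a name like <code>welcome</code>, and a tiny interpolator fills in placeholders. <code>TEMPLATES</code> is an assumed KV binding name and <code>renderTemplate</code> a hypothetical helper.</p>

```javascript
// Sketch: email templates stored in Workers KV.
// The interpolation is a plain function so it's easy to test;
// the KV read (assumed binding name: TEMPLATES) is shown in comments.
function renderTemplate(template, vars) {
  // Replace {{name}} placeholders; unknown keys become empty strings.
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => vars[key] ?? "");
}

// Inside a Worker handler (sketch):
// const template = await env.TEMPLATES.get("welcome"); // KV read
// const html = renderTemplate(template, { user: "Supra" });
```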
<hr />
<h2 id="heading-conclusion">📝 Conclusion</h2>
<p>This article shared how to build a zero-cost email notification system using Cloudflare Workers + Email Routing, and more importantly, explored two commonly overlooked security attacks in depth.</p>
<h3 id="heading-key-takeaways">Key Takeaways</h3>
<ol>
<li><p><strong>Zero-cost doesn't mean low-quality.</strong> Cloudflare's free tier is sufficient for most notification scenarios.</p>
</li>
<li><p><strong>Timing attacks are a hidden risk.</strong> Never use <code>===</code> to compare secrets—use a constant-time comparison instead.</p>
</li>
<li><p><strong>Input validation is fundamental.</strong> Header injection attacks are simple but dangerous—validate all inputs strictly.</p>
</li>
<li><p><strong>Security is not optional—it's essential.</strong> Even for small features, the cost of security measures is far less than the cost of post-incident remediation.</p>
</li>
</ol>
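<p>To make the second takeaway concrete, here's a hand-rolled constant-time comparison in plain JavaScript. This is an illustrative sketch: on Workers you could instead use Cloudflare's non-standard <code>crypto.subtle.timingSafeEqual</code>, and in Node.js <code>crypto.timingSafeEqual</code>.</p>

```javascript
// Constant-time comparison sketch: never short-circuits on the first
// mismatching byte, so response time doesn't reveal how many leading
// characters of the attacker's guess were correct.
function constantTimeEqual(a, b) {
  const enc = new TextEncoder();
  const ab = enc.encode(a);
  const bb = enc.encode(b);
  // Fold the length difference into the result instead of returning early.
  let diff = ab.length ^ bb.length;
  const len = Math.max(ab.length, bb.length);
  for (let i = 0; i < len; i++) {
    diff |= (ab[i] ?? 0) ^ (bb[i] ?? 0);
  }
  return diff === 0;
}
```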
<h3 id="heading-final-thoughts">Final Thoughts</h3>
<p>The full project is open-sourced on GitHub — feel free to use it, fork it, or contribute.</p>
<hr />
<p><strong>References:</strong></p>
<ul>
<li><p><a target="_blank" href="https://developers.cloudflare.com/email-routing/">Cloudflare Email Routing Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://dev.to/silentwatcher_95/timing-attacks-in-nodejs-4pmb">Timing Attacks in Node.js</a></p>
</li>
<li><p><a target="_blank" href="https://www.codecademy.com/learn/2021-owasp-top-10-injection-attacks/modules/dont-mean-to-inject-but-here-comes-email-header-injection-attacks/cheatsheet">Email Header Injection - Codecademy</a></p>
</li>
<li><p><a target="_blank" href="https://www.acunetix.com/blog/articles/email-header-injection/">What Are Email Injection Attacks - Acunetix</a></p>
</li>
</ul>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Tue, 10 Feb 2026 04:30:49 GMT</pubDate>
      <category>Cloudflare Workers</category>
      <category>email</category>
      <category>Security</category>
      <category>Timing Attack</category>
      <category>JavaScript</category>
      <category>serverless</category>
      <category>api security</category>
    </item>
    <item>
      <title>Astro 6 Beta Upgrade: Zero Code Changes in a Real-World Blog — And Why the Future Looks Different</title>
      <link>https://suprahuang.cc/astro-6-beta-upgrade-zero-code-changes-real-world-blog</link>
      <guid isPermaLink="true">https://suprahuang.cc/astro-6-beta-upgrade-zero-code-changes-real-world-blog</guid>
      <description>Two weeks ago, Cloudflare acquired Astro. Days later, Astro 6 Beta dropped with first-class Cloudflare Workers support. The timing wasn&apos;t a coincidence.
As someone who recently rewrote Hashnode&apos;s Next.js Starter Kit in Astro and has been building on ...</description>
      <content:encoded><![CDATA[<p>Two weeks ago, <a target="_blank" href="https://blog.cloudflare.com/astro-joins-cloudflare/">Cloudflare acquired Astro</a>. Days later, <a target="_blank" href="https://astro.build/blog/astro-6-beta/">Astro 6 Beta dropped</a> with first-class Cloudflare Workers support. The timing wasn't a coincidence.</p>
<p>As someone who <a target="_blank" href="https://suprahuang.cc/i-rewrote-hashnodes-nextjs-starter-kit-in-astro-from-150-kb-to-15-kb-of-client-js">recently rewrote Hashnode's Next.js Starter Kit in Astro</a> and has been building on the Cloudflare ecosystem, this felt like a natural next step: upgrade my real-world Astro 5 project to v6 Beta and see what happens.</p>
<p>The upgrade itself took minutes. Zero code changes. But the architecture shift behind Astro 6 — that's what makes this interesting.</p>
<hr />
<h2 id="heading-quick-context-the-project">Quick Context — The Project</h2>
<p>The project is <a target="_blank" href="https://github.com/supra126/astro-starter-hashnode">astro-starter-hashnode</a>: an open-source Astro-based frontend for Hashnode blogs. It replaces Hashnode's official Next.js starter kit, cutting client-side JavaScript from ~150 kB to roughly 15 kB.</p>
<p>The stack before the upgrade:</p>
<ul>
<li><p><strong>Astro</strong> v5.17.1</p>
</li>
<li><p><strong>Tailwind CSS</strong> v4 (via <code>@tailwindcss/vite</code>)</p>
</li>
<li><p><strong>GraphQL</strong> for fetching content from Hashnode's API</p>
</li>
<li><p><strong>Deployed on</strong> Vercel (fully static output)</p>
</li>
<li><p><strong>10 pages</strong>, 540K total build output</p>
</li>
</ul>
<p>This makes it a useful upgrade test case: it's a real project with real dependencies, not a starter template with hello-world complexity.</p>
<hr />
<h2 id="heading-the-upgrade-what-actually-happened">The Upgrade — What Actually Happened</h2>
<h3 id="heading-step-1-bump-the-version">Step 1: Bump the Version</h3>
<pre><code class="lang-bash">npm install astro@next
</code></pre>
<p>That's it. Astro's CLI pulled in v6.0.0-beta.9 along with Vite 7.0. The <code>@tailwindcss/vite</code> v4 adapter required no changes.</p>
<h3 id="heading-step-2-first-build">Step 2: First Build</h3>
<pre><code class="lang-bash">npm run build
</code></pre>
<p><strong>Result: BUILD SUCCESS.</strong> No errors. The only output worth noting was a single internal warning:</p>
<pre><code class="lang-plaintext">[WARN] [vite] "isRemoteAllowed", "matchHostname", "matchPathname",
"matchPort" and "matchProtocol" are imported from external module
"@astrojs/internal-helpers/remote" but never used in
"node_modules/astro/dist/assets/utils/index.js".
</code></pre>
<p>This lives inside <code>node_modules</code> — it's Astro's own housekeeping, not something you need to act on.</p>
<h3 id="heading-the-numbers">The Numbers</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>Astro v5.17.1</td><td>Astro v6 Beta</td><td>Delta</td></tr>
</thead>
<tbody>
<tr>
<td>Build Time</td><td>8.54s</td><td>~7.32s (avg of 3)</td><td>-14%</td></tr>
<tr>
<td>Output Size</td><td>540K</td><td>540K</td><td>0%</td></tr>
<tr>
<td>Pages</td><td>10</td><td>10</td><td>—</td></tr>
<tr>
<td>Build Errors</td><td>0</td><td>0</td><td>—</td></tr>
<tr>
<td>Code Changes Required</td><td>—</td><td><strong>0 files</strong></td><td>—</td></tr>
<tr>
<td>Vite</td><td>6.x</td><td>7.0</td><td>Major upgrade</td></tr>
</tbody>
</table>
</div><p>The -14% build time improvement sounds nice but is misleading. Build time in this project is dominated by Hashnode's GraphQL API latency (network I/O), not Astro's compilation. Individual runs varied from 5.87s to 9.00s depending on network conditions. The honest answer: <strong>build performance is effectively identical</strong>.</p>
<p>What didn't change is equally important: <strong>same output, same HTML, same 540K.</strong> The upgrade is transparent to end users.</p>
<hr />
<h2 id="heading-why-zero-breaking-changes-its-not-luck-its-architecture">Why Zero Breaking Changes? It's Not Luck — It's Architecture</h2>
<p>Astro 6 ships with <a target="_blank" href="https://docs.astro.build/en/guides/upgrade-to/v6/">a list of documented breaking changes</a>. None of them affected this project. Here's the checklist I ran through:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Breaking Change</td><td>Affected?</td><td>Why Not</td></tr>
</thead>
<tbody>
<tr>
<td>Node 22+ required</td><td>No</td><td>Already on v24</td></tr>
<tr>
<td>Vite 7.0</td><td>No issues</td><td>Tailwind v4 compatible</td></tr>
<tr>
<td>Zod 4 API changes</td><td>N/A</td><td>Not used</td></tr>
<tr>
<td><code>&lt;ViewTransitions /&gt;</code> removed</td><td>No</td><td>Already using <code>&lt;ClientRouter /&gt;</code></td></tr>
<tr>
<td><code>Astro.glob()</code> removed</td><td>N/A</td><td>Not used</td></tr>
<tr>
<td>Legacy Content Collections removed</td><td>N/A</td><td>Data from Hashnode API</td></tr>
<tr>
<td>Markdown heading ID algorithm</td><td>N/A</td><td>Content rendered by Hashnode</td></tr>
<tr>
<td>Script/style tag order change</td><td>No impact</td><td>—</td></tr>
<tr>
<td>Image service defaults</td><td>N/A</td><td>Using raw <code>&lt;img&gt;</code> with CDN URLs</td></tr>
<tr>
<td><code>import.meta.env</code> always inlined</td><td>No</td><td>Only <code>PUBLIC_</code> vars used</td></tr>
<tr>
<td>Experimental flags removed</td><td>N/A</td><td>None configured</td></tr>
<tr>
<td><code>redirectToDefaultLocale</code> changed</td><td>N/A</td><td>No i18n</td></tr>
<tr>
<td><code>getStaticPaths()</code> Astro access removed</td><td>N/A</td><td>Not using Astro object in paths</td></tr>
</tbody>
</table>
</div><p>This wasn't luck. The project had already adopted Astro's recommended patterns:</p>
<ul>
<li><p><code>&lt;ClientRouter /&gt;</code> instead of the deprecated <code>&lt;ViewTransitions /&gt;</code></p>
</li>
<li><p><strong>External CMS</strong> (Hashnode API) instead of Content Collections</p>
</li>
<li><p><strong>Standard</strong> <code>PUBLIC_</code> env vars instead of server-side secrets</p>
</li>
<li><p><strong>Raw image tags</strong> with CDN URLs instead of Astro's image pipeline</p>
</li>
</ul>
<p><strong>The takeaway</strong>: if you've been following Astro's best practices in v5, your upgrade path to v6 is likely smoother than you think.</p>
<hr />
<h2 id="heading-okay-but-heres-why-astro-6-actually-matters">Okay, But Here's Why Astro 6 Actually Matters</h2>
<p>The smooth upgrade is nice. But if that were the whole story, this would be a short post. What makes Astro 6 significant isn't what changed in the code — it's what changed in the architecture.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770574273901/52814b0f-3ee3-4a80-a6d0-86751d912385.webp" alt="Astro 6 architecture shift — dev server now runs your production runtime" /></p>
<h3 id="heading-the-dev-server-runs-your-production-runtime">The Dev Server Runs Your Production Runtime</h3>
<p>Before Astro 6, <code>astro dev</code> ran your project in Node.js regardless of where you'd deploy it. If your production target was Cloudflare Workers, you were developing against a simulation.</p>
<p>Astro 6 changes this fundamentally. The new dev server leverages <a target="_blank" href="https://vite.dev/guide/api-environment">Vite's Environment API</a> to run your application inside the <strong>same runtime as production</strong>. For Cloudflare Workers, that means <code>astro dev</code> now uses <a target="_blank" href="https://github.com/cloudflare/workerd">workerd</a> — the actual open-source runtime that powers Workers globally.</p>
<p>This isn't a mock. It's the real engine.</p>
<h3 id="heading-real-platform-apis-during-development">Real Platform APIs During Development</h3>
<p>With the workerd-powered dev server, you get access to real Cloudflare primitives during local development:</p>
<ul>
<li><p><strong>Durable Objects</strong> — Test stateful serverless objects locally</p>
</li>
<li><p><strong>KV Namespaces</strong> — Read/write to key-value storage in dev</p>
</li>
<li><p><strong>R2 Storage</strong> — Object storage available during development</p>
</li>
<li><p><strong>Workers Analytics Engine</strong> — All with hot module replacement</p>
</li>
</ul>
<p>No more "it works in dev but breaks in production" surprises for platform-specific APIs.</p>
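<p>Opting into the Workers runtime is an adapter change. A minimal sketch of the config, assuming the official <code>@astrojs/cloudflare</code> adapter (this project, being fully static, doesn't need it yet):</p>

```javascript
// astro.config.mjs — sketch: with the Cloudflare adapter configured,
// `astro dev` runs the app in workerd rather than plain Node.js.
import { defineConfig } from 'astro/config';
import cloudflare from '@astrojs/cloudflare';

export default defineConfig({
  adapter: cloudflare(),
});
```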
<h3 id="heading-sessions-api-with-automatic-kv">Sessions API with Automatic KV</h3>
<p>Astro's <a target="_blank" href="https://docs.astro.build/en/guides/sessions/">Sessions API</a> (stable since v5.7) stores user data between requests. When using the Cloudflare adapter, it automatically configures Workers KV for session storage. Wrangler provisions the KV namespace on deploy — zero manual setup.</p>
<h3 id="heading-live-content-collections">Live Content Collections</h3>
<p><a target="_blank" href="https://docs.astro.build/en/guides/content-collections/">Live content collections</a> — experimental since Astro 5.10 — are now stable. They allow fetching content from CMSs, APIs, and databases with a unified API, updating in real-time without requiring a rebuild. For a Hashnode-powered blog like this one, that's a compelling path forward.</p>
<h3 id="heading-built-in-content-security-policy">Built-in Content Security Policy</h3>
<p>CSP support, previously experimental, is now stable. It controls which resources can load on your pages, protecting against XSS and code injection attacks — an increasingly important baseline for any production site.</p>
<hr />
<h2 id="heading-the-bigger-picture-cloudflare-astro">The Bigger Picture — Cloudflare + Astro</h2>
<p>On January 16, 2026, <a target="_blank" href="https://www.cloudflare.com/press/press-releases/2026/cloudflare-acquires-astro-to-accelerate-the-future-of-high-performance-web-development/">Cloudflare announced</a> that the Astro team would be joining Cloudflare. This wasn't just an acqui-hire — it's a strategic bet on content-driven web development.</p>
<p><strong>Why this matters for developers:</strong></p>
<p>Astro already excels at content sites, docs, marketing pages, and hybrid sites with selective interactivity. Cloudflare Workers already excels at edge computing with global distribution. Combining them creates a "golden path" where the framework and the platform are designed to work together — similar to how Next.js and Vercel evolved.</p>
<p><strong>What it doesn't mean:</strong></p>
<p>Astro will remain open source. It will continue to deploy to Vercel, Netlify, and other platforms. This project still runs on Vercel, and that's fine. But the deepest integration, the most optimized path, will increasingly be Cloudflare Workers.</p>
<p>The <a target="_blank" href="https://astro.build/blog/joining-cloudflare/">Astro team's own announcement</a> confirms this: Astro becomes the best way to build content sites, whether you host on Cloudflare or elsewhere.</p>
<p>As someone who was already building with both Astro and Cloudflare — <a target="_blank" href="https://suprahuang.cc/building-a-zero-cost-enterprise-email-api-complete-guide-to-timing-attack-and-header-injection-protection">an email API on Workers</a> and a blog frontend in Astro — watching these two ecosystems merge feels exciting rather than surprising. The tools are converging around the same vision: fast, lightweight, edge-first.</p>
<hr />
<h2 id="heading-should-you-upgrade-now">Should You Upgrade Now?</h2>
<p><strong>If you're on Astro 5 and following best practices:</strong> the upgrade is probably easier than you expect. Check the <a target="_blank" href="https://docs.astro.build/en/guides/upgrade-to/v6/">breaking changes list</a> against your project. If you're not using Content Collections, <code>Astro.glob()</code>, or <code>&lt;ViewTransitions /&gt;</code>, you might be in the same "zero changes" boat.</p>
<p><strong>If you're evaluating frameworks for a content site:</strong> Astro 6 + Cloudflare Workers is becoming the most integrated option for edge-first content delivery. Worth serious consideration.</p>
<p><strong>If you're on Vercel or Netlify:</strong> no urgency. Astro 6 works great on these platforms too. You gain Vite 7, stable Live Content Collections, and CSP support regardless of where you deploy.</p>
<p><strong>One caveat:</strong> Astro 6 is still in beta. For production sites, it's reasonable to wait for the stable release. But for side projects or new builds, the beta is stable enough — this project built and ran without a single issue.</p>
<hr />
<h2 id="heading-whats-next">What's Next</h2>
<p>For this project, the natural next experiment is exploring a Workers deployment path — moving from Vercel static output to Cloudflare Workers with SSR. That would unlock KV caching for GraphQL responses, Sessions for user preferences, and the full edge runtime experience.</p>
<p>That's a story for another post.</p>
<p>In the meantime, the <a target="_blank" href="https://github.com/supra126/astro-starter-hashnode">astro-starter-hashnode</a> repo is open source. If you've done your own Astro 6 upgrade — smooth or rocky — drop a comment. The more data points we have from real projects, the better the community can prepare for the stable release.</p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Mon, 09 Feb 2026 04:30:38 GMT</pubDate>
      <category>Astro</category>
      <category>Cloudflare Workers</category>
      <category>Web Development</category>
      <category>JavaScript</category>
      <category>Web Perf</category>
    </item>
    <item>
      <title>I Rewrote Hashnode&apos;s Next.js Starter Kit in Astro — From 150 kB to ~15 kB of Client JS</title>
      <link>https://suprahuang.cc/astro-starter-hashnode-rewrite-nextjs-to-astro</link>
      <guid isPermaLink="true">https://suprahuang.cc/astro-starter-hashnode-rewrite-nextjs-to-astro</guid>
      <description>Your blog doesn&apos;t need 150 kB of JavaScript.
I discovered this when I started using Hashnode. Their official Next.js starter kit worked fine out of the box — but something felt off. A blog that publishes a few articles a week was loading an entire Re...</description>
      <content:encoded><![CDATA[<p>Your blog doesn't need 150 kB of JavaScript.</p>
<p>I discovered this when I started using <a target="_blank" href="https://hashnode.com">Hashnode</a>. Their <a target="_blank" href="https://github.com/Hashnode/starter-kit">official Next.js starter kit</a> worked fine out of the box — but something felt off. A blog that publishes a few articles a week was loading an entire React runtime, multiple JavaScript bundles, and a full client-side router. For what? Rendering text and images.</p>
<p>So I rewrote the entire thing in <a target="_blank" href="https://astro.build">Astro</a>. The result? A fully-featured blog frontend that ships <strong>~15 kB of client-side JavaScript</strong> — a 90% reduction. Same features. Same CMS. Dramatically less code sent to your readers' browsers.</p>
<p>Here's the story, the technical decisions, and how you can deploy your own in under 2 minutes.</p>
<h2 id="heading-the-problem-a-react-runtime-for-a-blog">The Problem: A React Runtime for a Blog</h2>
<p>Don't get me wrong — Hashnode's starter kit is well-built, and the team has done solid work with it. But there's a fundamental mismatch: <strong>a blog is mostly static content, yet the starter kit ships an entire React runtime to the browser.</strong></p>
<p>When I first deployed it, I opened DevTools and looked at what was being loaded. For a page that's essentially an article with some images, the browser was downloading:</p>
<ul>
<li>React + ReactDOM</li>
<li>The Next.js client-side router</li>
<li>Hydration logic</li>
<li>Various runtime utilities</li>
</ul>
<p>All together, <strong>150 kB+ of JavaScript</strong> — before any of my actual content loads.</p>
<p>Then I thought about what a blog post page <em>actually</em> needs to do on the client side:</p>
<ul>
<li>Render text (HTML does this natively)</li>
<li>Display images (HTML does this natively)</li>
<li>Apply syntax highlighting (CSS can handle most of this)</li>
<li>Toggle dark mode (a few lines of vanilla JS)</li>
</ul>
<p>There's also operational complexity. The starter kit uses SSR (Server-Side Rendering) or ISR (Incremental Static Regeneration), which means you need a Node.js server or a platform that supports edge functions. For a blog that publishes a few posts a week, this felt like overkill.</p>
<p><strong>There had to be a lighter way to do this.</strong></p>
<h2 id="heading-why-astro">Why Astro?</h2>
<p><a target="_blank" href="https://astro.build">Astro</a> is built around a philosophy that aligns perfectly with content-heavy sites: <strong>ship zero JavaScript by default.</strong> Every page is pre-rendered to static HTML at build time. No framework runtime. No hydration. Just HTML, CSS, and your content.</p>
<p>The key concept is <strong>Islands Architecture</strong>. Instead of hydrating the entire page with a JavaScript framework, Astro lets you create small "islands" of interactivity — only the components that genuinely need JavaScript get it. Everything else stays as static HTML.</p>
<p>For a blog, this means:</p>
<ul>
<li><strong>Article content?</strong> Static HTML. Zero JS.</li>
<li><strong>Navigation and layout?</strong> Static HTML. Zero JS.</li>
<li><strong>Dark mode toggle?</strong> A tiny island with a few lines of vanilla JS.</li>
<li><strong>Search modal?</strong> An island that loads only when triggered.</li>
</ul>
<p>This isn't a trade-off. It's the right architecture for the job.</p>
<p>Astro is also framework-agnostic. If I ever need a React or Svelte component for something complex, I can drop it in as an island. But for this project, vanilla JS in Astro components was more than enough.</p>
<h2 id="heading-key-architecture-decisions">Key Architecture Decisions</h2>
<h3 id="heading-graphql-client-lightweight-by-design">GraphQL Client: Lightweight by Design</h3>
<p>Hashnode's API is GraphQL-based. The Next.js starter kit typically pairs this with heavier clients. I chose <a target="_blank" href="https://github.com/jasonkuhrt/graphql-request"><code>graphql-request</code></a> — a minimal GraphQL client with zero unnecessary dependencies. Since it only runs at build time in a static Astro site, it adds zero bytes to the client bundle.</p>
<p>The entire client setup is 16 lines:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// src/lib/client.ts</span>
<span class="hljs-keyword">import</span> { GraphQLClient } <span class="hljs-keyword">from</span> <span class="hljs-string">'graphql-request'</span>;

<span class="hljs-keyword">const</span> GQL_ENDPOINT =
  <span class="hljs-keyword">import</span>.meta.env.PUBLIC_HASHNODE_GQL_ENDPOINT || <span class="hljs-string">'https://gql.hashnode.com'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> gqlClient = <span class="hljs-keyword">new</span> GraphQLClient(GQL_ENDPOINT, {
  headers: {
    <span class="hljs-string">'hn-trace-app'</span>: <span class="hljs-string">'astro-starter-hashnode'</span>,
  },
});

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> PUBLICATION_HOST =
  <span class="hljs-keyword">import</span>.meta.env.PUBLIC_HASHNODE_PUBLICATION_HOST || <span class="hljs-string">'engineering.hashnode.com'</span>;
</code></pre>
<p>All GraphQL queries are organized in 11 dedicated files (845 lines total), covering everything from homepage posts to RSS feeds to search.</p>
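<p>As an illustration of what one of those query modules looks like (field names here follow Hashnode's public GraphQL schema, but the variable name and query shape are hypothetical, not copied from the project):</p>

```javascript
// Sketch of a query module. Since queries run only at build time,
// none of this ships to the browser.
const recentPostsQuery = /* GraphQL */ `
  query RecentPosts($host: String!, $first: Int!) {
    publication(host: $host) {
      posts(first: $first) {
        edges {
          node { title slug brief publishedAt }
        }
      }
    }
  }
`;

// Usage at build time (sketch):
// const data = await gqlClient.request(recentPostsQuery, {
//   host: PUBLICATION_HOST,
//   first: 10,
// });
```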
<h3 id="heading-static-output-smart-prefetching">Static Output + Smart Prefetching</h3>
<p>The Astro config is intentionally minimal:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// astro.config.mjs</span>
<span class="hljs-keyword">import</span> { defineConfig } <span class="hljs-keyword">from</span> <span class="hljs-string">'astro/config'</span>;
<span class="hljs-keyword">import</span> tailwindcss <span class="hljs-keyword">from</span> <span class="hljs-string">'@tailwindcss/vite'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> defineConfig({
  <span class="hljs-attr">site</span>: siteUrl,
  <span class="hljs-attr">output</span>: <span class="hljs-string">'static'</span>,
  <span class="hljs-attr">prefetch</span>: {
    <span class="hljs-attr">prefetchAll</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">defaultStrategy</span>: <span class="hljs-string">'hover'</span>,
  },
  <span class="hljs-attr">vite</span>: {
    <span class="hljs-attr">plugins</span>: [tailwindcss()],
  },
});
</code></pre>
<p>Two things to note:</p>
<ol>
<li><strong><code>output: 'static'</code></strong> — Every page is pre-built as an HTML file. No server needed.</li>
<li><strong><code>defaultStrategy: 'hover'</code></strong> — When a user hovers over a link, Astro prefetches that page in the background. By the time they click, the page is already cached. This gives the feel of a SPA without any client-side router.</li>
</ol>
<h3 id="heading-tailwind-css-v4">Tailwind CSS v4</h3>
<p>Styling uses Tailwind CSS v4 with the <code>@tailwindcss/typography</code> plugin for beautiful article rendering. The entire CSS output compiles to a single 55.6 kB file — and that's CSS, not JavaScript. It doesn't block interactivity.</p>
<h2 id="heading-building-features-without-a-framework">Building Features Without a Framework</h2>
<p>Here's where it gets interesting. The Hashnode Next.js starter kit uses React for features like dark mode, search, and comments. I rebuilt all of them without any framework.</p>
<h3 id="heading-dark-mode-css-localstorage">Dark Mode: CSS + localStorage</h3>
<p>Dark mode doesn't need React state management. It needs a class toggle and a localStorage call:</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Inside Header.astro &lt;script&gt; tag</span>
<span class="hljs-keyword">const</span> themeToggle = <span class="hljs-built_in">document</span>.getElementById(<span class="hljs-string">'theme-toggle'</span>);
themeToggle?.addEventListener(<span class="hljs-string">'click'</span>, <span class="hljs-function">() =&gt;</span> {
  <span class="hljs-keyword">const</span> isDark = <span class="hljs-built_in">document</span>.documentElement.classList.toggle(<span class="hljs-string">'dark'</span>);
  <span class="hljs-built_in">localStorage</span>.setItem(<span class="hljs-string">'theme'</span>, isDark ? <span class="hljs-string">'dark'</span> : <span class="hljs-string">'light'</span>);
});
</code></pre>
<p>That's it. The initial theme is set by an inline script in the <code>&lt;head&gt;</code> (to prevent flash of wrong theme), and the toggle is this 4-line event listener. Tailwind's <code>dark:</code> variant handles all the styling.</p>
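<p>The decision logic of that inline <code>&lt;head&gt;</code> script can be sketched as a pure function (names here are illustrative, not the project's actual code):</p>

```javascript
// Sketch of the theme-initialization logic that runs inline in <head>.
// Kept as a pure function so the decision is easy to test; the DOM
// wiring is shown in comments.
function resolveTheme(stored, systemPrefersDark) {
  // A saved choice wins; otherwise fall back to the OS preference.
  if (stored === 'dark' || stored === 'light') return stored;
  return systemPrefersDark ? 'dark' : 'light';
}

// Inline in <head>, before first paint (sketch):
// const theme = resolveTheme(
//   localStorage.getItem('theme'),
//   window.matchMedia('(prefers-color-scheme: dark)').matches
// );
// document.documentElement.classList.toggle('dark', theme === 'dark');
```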
<h3 id="heading-search-vanilla-js-hashnodes-graphql-api">Search: Vanilla JS + Hashnode's GraphQL API</h3>
<p>The search modal was the most complex piece. In the Next.js version, this would be a React component with state management, effects, and possibly a state library. In Astro, it's a single <code>.astro</code> file with a <code>&lt;script&gt;</code> tag.</p>
<p>The implementation uses:</p>
<ul>
<li><strong>Debounced input</strong> (300ms) to avoid hammering the API</li>
<li><strong>AbortController</strong> to cancel in-flight requests when the user types again</li>
<li><strong>Keyboard shortcuts</strong> (<code>Cmd/Ctrl + K</code> to open, <code>Escape</code> to close)</li>
<li><strong>Hashnode's <code>searchPostsOfPublication</code> GraphQL query</strong> for server-side search</li>
</ul>
<pre><code class="lang-javascript"><span class="hljs-comment">// Search with debounce and abort control</span>
input?.addEventListener(<span class="hljs-string">'input'</span>, <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
  <span class="hljs-built_in">clearTimeout</span>(debounceTimer);
  <span class="hljs-keyword">const</span> query = e.target.value.trim();
  debounceTimer = <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> performSearch(query), <span class="hljs-number">300</span>);
});

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">performSearch</span>(<span class="hljs-params">query</span>) </span>{
  <span class="hljs-keyword">if</span> (searchAbort) searchAbort.abort();
  searchAbort = <span class="hljs-keyword">new</span> AbortController();

  <span class="hljs-keyword">const</span> res = <span class="hljs-keyword">await</span> fetch(gqlEndpoint, {
    <span class="hljs-attr">method</span>: <span class="hljs-string">'POST'</span>,
    <span class="hljs-attr">signal</span>: searchAbort.signal,
    <span class="hljs-attr">headers</span>: { <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span> },
    <span class="hljs-attr">body</span>: <span class="hljs-built_in">JSON</span>.stringify({
      <span class="hljs-attr">query</span>: searchQuery,
      <span class="hljs-attr">variables</span>: { <span class="hljs-attr">first</span>: <span class="hljs-number">10</span>, <span class="hljs-attr">filter</span>: { query, publicationId } },
    }),
  });
  <span class="hljs-comment">// ... render results</span>
}
</code></pre>
<p>No React. No state library. Just the DOM APIs that browsers have shipped for years.</p>
<h3 id="heading-full-feature-list">Full Feature List</h3>
<p>Every feature from the Next.js starter kit has been rebuilt — plus a few extras:</p>
<ul>
<li><strong>Dark Mode</strong> — System preference detection + manual toggle</li>
<li><strong>Search</strong> — <code>Cmd/Ctrl + K</code> shortcut, live GraphQL search</li>
<li><strong>View Transitions</strong> — SPA-like page transitions with morph animations, no full-page reloads</li>
<li><strong>Comments</strong> — Hashnode's native comment threads + optional <a target="_blank" href="https://giscus.app">Giscus</a> integration</li>
<li><strong>Newsletter</strong> — Built-in subscription form via Hashnode API</li>
<li><strong>SEO</strong> — Open Graph tags, Twitter cards, canonical URLs, structured data</li>
<li><strong>RSS Feed</strong> — Full-content RSS with <code>content:encoded</code></li>
<li><strong>Sitemap</strong> — Auto-generated XML sitemap</li>
<li><strong>Analytics</strong> — Supports GA4, GTM, Fathom, Plausible, Umami, and more</li>
<li><strong>Table of Contents</strong> — Auto-generated from post headings</li>
<li><strong>Pagination</strong> — Cursor-based with numbered pages</li>
<li><strong>Series &amp; Tags</strong> — Dedicated archive pages</li>
<li><strong>Responsive</strong> — Mobile-first with Tailwind CSS</li>
<li><strong>Accessibility</strong> — Semantic HTML, ARIA labels, keyboard navigation</li>
</ul>
<h2 id="heading-the-results">The Results</h2>
<p>Here's the comparison with real data from the built output:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Metric</td><td>Next.js Starter Kit</td><td>Astro Starter Hashnode</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Client JS</strong></td><td>~150 kB+</td><td><strong>~15 kB</strong></td></tr>
<tr>
<td><strong>JS Files</strong></td><td>Multiple React bundles</td><td><strong>A few small scripts</strong></td></tr>
<tr>
<td><strong>Build Output</strong></td><td>SSR / ISR</td><td><strong>Fully static HTML</strong></td></tr>
<tr>
<td><strong>Framework Runtime</strong></td><td>React + ReactDOM</td><td><strong>None</strong></td></tr>
<tr>
<td><strong>Server Required</strong></td><td>Yes (Node.js)</td><td><strong>No</strong> (static hosting)</td></tr>
<tr>
<td><strong>CSS</strong></td><td>CSS-in-JS + Tailwind</td><td><strong>1 file, 55.6 kB</strong> (Tailwind)</td></tr>
</tbody>
</table>
</div><p>That ~15 kB includes Astro's View Transitions (for smooth, SPA-like page navigation with morph animations) and the prefetch script. It's not React. It's not a full client-side router. It's the minimal JS needed to make the experience feel polished — and it's still <strong>10x less</strong> than what the Next.js version ships.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770538067576/ea8351e1-c9bc-43c0-b7ed-bc28a472f967.webp" alt="Performance comparison chart" /></p>
<h2 id="heading-get-started-in-2-minutes">Get Started in 2 Minutes</h2>
<p>Want to try it? You can deploy your own Hashnode blog frontend right now.</p>
<h3 id="heading-option-1-one-click-deploy">Option 1: One-Click Deploy</h3>
<p>Click one of these buttons to deploy instantly:</p>
<ul>
<li><a target="_blank" href="https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fsupra126%2Fastro-starter-hashnode&amp;env=PUBLIC_HASHNODE_PUBLICATION_HOST">Deploy with Vercel</a></li>
<li><a target="_blank" href="https://app.netlify.com/start/deploy?repository=https://github.com/supra126/astro-starter-hashnode">Deploy to Netlify</a></li>
</ul>
<p>You'll be asked for one environment variable: your Hashnode publication host (e.g., <code>yourblog.hashnode.dev</code>). That's it.</p>
<h3 id="heading-option-2-run-locally">Option 2: Run Locally</h3>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/supra126/astro-starter-hashnode.git
<span class="hljs-built_in">cd</span> astro-starter-hashnode
npm install
npm run dev
</code></pre>
<p>Open <code>http://localhost:4321</code>. If you don't set a <code>PUBLIC_HASHNODE_PUBLICATION_HOST</code>, it defaults to Hashnode's engineering blog as demo content.</p>
<h3 id="heading-multi-site-support">Multi-Site Support</h3>
<p>Since this is a statically generated frontend, you can deploy multiple instances with different <code>PUBLIC_HASHNODE_PUBLICATION_HOST</code> values. Same codebase, different blogs.</p>
<p>If you find this useful, I'd appreciate a <a target="_blank" href="https://github.com/supra126/astro-starter-hashnode">star on GitHub</a>. Found a bug or have a feature request? <a target="_blank" href="https://github.com/supra126/astro-starter-hashnode/issues">Open an issue</a>. Pull requests are welcome.</p>
<hr />
<p><strong>The web doesn't have a performance problem. It has a complexity problem.</strong> Most blogs don't need a JavaScript framework runtime. They need HTML, CSS, and a handful of interactive islands. Astro makes this the default, and the results speak for themselves: ~15 kB of JavaScript for a fully-featured blog with smooth page transitions — 10x less than the React equivalent.</p>
<p>Your readers — and their browsers — will thank you.</p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Sun, 08 Feb 2026 08:12:59 GMT</pubDate>

    </item>
    <item>
      <title>From Side Project to Open Source: Why I Built My Own URL Shortener</title>
      <link>https://suprahuang.cc/from-side-project-to-open-source-why-i-built-my-own-url-shortener</link>
      <guid isPermaLink="true">https://suprahuang.cc/from-side-project-to-open-source-why-i-built-my-own-url-shortener</guid>
      <description>Ever stared at a third-party service&apos;s pricing page and thought, &quot;I could build this myself&quot;? That&apos;s exactly how Open Short URL was born.

The Moment Everything Changed
It started with a simple realization: I was paying for something I didn&apos;t fully c...</description>
      <content:encoded><![CDATA[<blockquote>
<p>Ever stared at a third-party service's pricing page and thought, "I could build this myself"? That's exactly how Open Short URL was born.</p>
</blockquote>
<h2 id="heading-the-moment-everything-changed">The Moment Everything Changed</h2>
<p>It started with a simple realization: I was paying for something I didn't fully control.</p>
<p>Like many developers, I relied on third-party URL shorteners for everything—marketing campaigns, sharing links, tracking clicks. It worked fine until it didn't. Google URL Shortener shut down. Other services started limiting features behind expensive paywalls. And worst of all, I had no idea who was looking at my click data.</p>
<p>That's when I asked myself: <strong>What if I just built my own?</strong></p>
<h2 id="heading-the-problem-with-third-party-url-shorteners">The Problem with Third-Party URL Shorteners</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770667594032/2d7285aa-82c6-44f5-ae27-cd17510f500e.webp" alt class="image--center mx-auto" /></p>
<h3 id="heading-the-hidden-costs-of-free">The Hidden Costs of "Free"</h3>
<p>When Google finally shut down goo.gl redirects in 2025 — seven years after first announcing its deprecation — millions of links became ticking time bombs. It was a wake-up call for anyone relying on free services.</p>
<p>But even paid services come with hidden costs:</p>
<ul>
<li><p><strong>Data Privacy</strong>: Your click data—geographic locations, devices, referrers—sits on someone else's servers. Who's analyzing it? Who are they selling it to?</p>
</li>
<li><p><strong>Brand Perception</strong>: Let's be honest, a <code>bit.ly</code> link in a professional email looks... questionable. Some spam filters even flag them.</p>
</li>
<li><p><strong>Feature Limitations</strong>: Want A/B testing? Pay more. Need API access? Pay more. Custom domains? You guessed it.</p>
</li>
</ul>
<h3 id="heading-why-existing-open-source-solutions-werent-enough">Why Existing Open-Source Solutions Weren't Enough</h3>
<p>I'm not the first person to have this idea. There are solid open-source alternatives:</p>
<ul>
<li><p><strong>YOURLS</strong>: The OG of self-hosted URL shorteners. But it's PHP-based, and I'm more comfortable with TypeScript. The UI also feels dated.</p>
</li>
<li><p><strong>Shlink</strong>: Excellent API design and Docker support. But I wanted a more complete frontend experience out of the box.</p>
</li>
<li><p><strong>Kutt</strong>: Clean and simple, but missing some features I needed.</p>
</li>
</ul>
<p>None of them were <em>wrong</em>—they just weren't <em>right for me</em>. I wanted something built with a modern stack, something I could extend and customize, and honestly, something that would be fun to build.</p>
<h2 id="heading-tech-stack-decisions">Tech Stack Decisions</h2>
<p>Here's where things get interesting. Choosing a tech stack for a side project is like choosing what to cook for dinner—you could go with something safe, or you could experiment.</p>
<p>I chose to experiment.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770667605671/ce451dc8-bd64-4629-b2dd-046dbc4190e6.webp" alt class="image--center mx-auto" /></p>
<h3 id="heading-backend-nestjs-fastify">Backend: NestJS + Fastify</h3>
<p>Most Node.js projects default to Express. It's battle-tested, well-documented, and... slow.</p>
<p>I went with <strong>Fastify</strong> instead. The benchmarks don't lie—Fastify routinely handles 2-3x more requests per second than Express. For a URL shortener, where redirects need to be lightning-fast, this matters.</p>
<p><strong>NestJS</strong> on top of Fastify gives me the best of both worlds: Fastify's performance with NestJS's modular architecture. Dependency injection, decorators, and a clear project structure make the codebase maintainable as it grows.</p>
<h3 id="heading-frontend-nextjs-16-react-19">Frontend: Next.js 16 + React 19</h3>
<p>I'll admit it—part of choosing Next.js 16 and React 19 was pure curiosity. I wanted to play with the latest features.</p>
<p>But beyond the hype, the App Router and Server Components genuinely improve the developer experience. The dashboard loads fast, navigation is smooth, and the codebase is cleaner than anything I could've built with the old Pages Router.</p>
<h3 id="heading-database-postgresql-prisma">Database: PostgreSQL + Prisma</h3>
<p>NoSQL might be trendy, but for a URL shortener with complex relationships (users, URLs, clicks, bundles, webhooks), a relational database makes more sense.</p>
<p><strong>Prisma</strong> as the ORM was an easy choice. Type-safe queries, automatic migrations, and a beautiful query API. It catches errors before they happen.</p>
<h3 id="heading-the-redis-optional-design">The "Redis Optional" Design</h3>
<p>Here's a design decision I'm particularly proud of: <strong>Redis is completely optional</strong>.</p>
<p>Many self-hosted solutions require Redis, which adds complexity and cost. Open Short URL works perfectly without it—the system automatically falls back to in-memory storage for caching and rate limiting.</p>
<pre><code class="lang-plaintext">Without Redis: Perfect for &lt; 10K daily clicks
With Redis: Handles 100K+ clicks with 10-20ms redirects
</code></pre>
<p>This means you can start simple and scale up when needed. No configuration changes required—the system detects Redis automatically and adapts.</p>
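<p>The pattern can be sketched as a small cache interface with two interchangeable backends. This is a hypothetical shape, not the project's actual code; the names <code>Cache</code>, <code>MemoryCache</code>, and <code>createCache</code> are assumptions for illustration:</p>

```typescript
// Minimal sketch of the Redis-optional pattern (hypothetical, not the project's code).
// Both backends satisfy one interface; the app picks a backend once at startup.
interface Cache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory fallback: a Map with a per-key expiry timestamp.
class MemoryCache implements Cache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt < Date.now()) {
      this.store.delete(key); // expired entries are evicted lazily on read
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Startup detection: use Redis when a URL is configured, otherwise fall back.
function createCache(redisUrl?: string): Cache {
  if (redisUrl) {
    // A real Redis-backed implementation (e.g. via ioredis) would be returned here.
  }
  return new MemoryCache();
}
```

<p>Because every caller depends only on the interface, swapping in Redis later is a one-line change at startup rather than a refactor.</p>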
<h2 id="heading-features-worth-highlighting">Features Worth Highlighting</h2>
<p>Building a basic URL shortener takes a weekend. Building one with features you'd actually <em>want</em> to use? That takes longer. Here are some features I found interesting to implement.</p>
<h3 id="heading-dynamic-slug-length">Dynamic Slug Length</h3>
<p>Most URL shorteners use fixed-length slugs. Open Short URL dynamically adjusts based on how many URLs exist:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// Simplified logic</span>
<span class="hljs-keyword">if</span> (urlCount &lt; <span class="hljs-number">1000</span>) <span class="hljs-keyword">return</span> <span class="hljs-number">4</span>;      <span class="hljs-comment">// abc1</span>
<span class="hljs-keyword">if</span> (urlCount &lt; <span class="hljs-number">50000</span>) <span class="hljs-keyword">return</span> <span class="hljs-number">5</span>;     <span class="hljs-comment">// abc12</span>
<span class="hljs-keyword">if</span> (urlCount &lt; <span class="hljs-number">500000</span>) <span class="hljs-keyword">return</span> <span class="hljs-number">6</span>;    <span class="hljs-comment">// abc123</span>
<span class="hljs-keyword">return</span> <span class="hljs-number">7</span>;                           <span class="hljs-comment">// abc1234</span>
</code></pre>
<p>This keeps URLs as short as possible while ensuring uniqueness. Small detail, but it matters.</p>
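<p>A generator pairing that length logic with collision retries might look like this. It's an illustrative sketch: <code>generateSlug</code> and the <code>exists</code> lookup are hypothetical names, and a real implementation would query the database:</p>

```typescript
// Hypothetical companion to the length logic above: pick a random slug of the
// chosen length from a URL-safe alphabet, retrying a few times on collision.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

function slugLength(urlCount: number): number {
  if (urlCount < 1000) return 4;      // abc1
  if (urlCount < 50000) return 5;     // abc12
  if (urlCount < 500000) return 6;    // abc123
  return 7;                           // abc1234
}

function randomSlug(length: number): string {
  let slug = '';
  for (let i = 0; i < length; i++) {
    slug += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return slug;
}

// exists() stands in for a database uniqueness check (assumption, illustrative only).
async function generateSlug(
  urlCount: number,
  exists: (slug: string) => Promise<boolean>,
): Promise<string> {
  const length = slugLength(urlCount);
  for (let attempt = 0; attempt < 5; attempt++) {
    const slug = randomSlug(length);
    if (!(await exists(slug))) return slug;
  }
  // Repeated collisions mean the space is getting crowded: add one character.
  return randomSlug(length + 1);
}
```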
<h3 id="heading-built-in-ab-testing">Built-in A/B Testing</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770667624954/82424eeb-5aa3-45b4-b55c-6377d3d40823.webp" alt class="image--center mx-auto" /></p>
<p>Want to test which landing page converts better? Create URL variants with traffic allocation:</p>
<ul>
<li><p>Variant A: <code>example.com/page-v1</code> → 50% traffic</p>
</li>
<li><p>Variant B: <code>example.com/page-v2</code> → 50% traffic</p>
</li>
</ul>
<p>The system automatically distributes clicks and tracks conversion data for each variant. No external tools needed.</p>
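<p>The core of traffic allocation is a weighted random pick. Here's a minimal sketch of how that selection can work; the <code>Variant</code> shape and <code>pickVariant</code> name are assumptions, not the project's actual API:</p>

```typescript
// Sketch of weighted A/B variant selection (illustrative; names are assumptions).
interface Variant {
  url: string;
  weight: number; // relative traffic share, e.g. 50 for 50%
}

// roll is injectable for testing; defaults to a uniform random draw.
function pickVariant(variants: Variant[], roll: number = Math.random()): Variant {
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  let threshold = roll * total;
  for (const v of variants) {
    threshold -= v.weight;
    if (threshold < 0) return v; // roll landed inside this variant's band
  }
  return variants[variants.length - 1]; // guard against floating-point edge cases
}
```

<p>In practice the roll can also be derived from a hash of a visitor identifier, so returning visitors consistently land on the same variant.</p>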
<h3 id="heading-smart-routing">Smart Routing</h3>
<p>Conditional redirects based on:</p>
<ul>
<li><p>Device type (mobile users → app store, desktop → website)</p>
</li>
<li><p>Geographic location (US visitors → .com, Taiwan visitors → .tw)</p>
</li>
<li><p>Time of day (business hours → sales page, after hours → support)</p>
</li>
</ul>
<p>You can set up rules with priorities, and the system evaluates them in order to determine where each visitor should go.</p>
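<p>The evaluation itself is simple: sort by priority, take the first rule that matches, fall back to the default destination. A sketch under assumed names (<code>Rule</code>, <code>Visit</code>, <code>resolveTarget</code> are illustrative, not the project's API):</p>

```typescript
// Sketch of priority-ordered conditional redirects (illustrative; names are assumptions).
interface Visit {
  device: 'mobile' | 'desktop';
  country: string;
  hour: number; // local hour of day, 0-23
}

interface Rule {
  priority: number; // lower number = evaluated first
  matches(visit: Visit): boolean;
  target: string;
}

function resolveTarget(rules: Rule[], visit: Visit, fallback: string): string {
  const ordered = [...rules].sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    if (rule.matches(visit)) return rule.target; // first match wins
  }
  return fallback; // no rule matched: use the URL's default destination
}
```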
<h3 id="heading-mcp-server-ready-for-the-ai-era">MCP Server: Ready for the AI Era</h3>
<p>This one's a bit unconventional. Open Short URL includes an <strong>MCP (Model Context Protocol) server</strong> that lets AI assistants manage your URLs.</p>
<p>Install it globally:</p>
<pre><code class="lang-bash">npm install -g @open-short-url/mcp
</code></pre>
<p>Configure it in Claude Desktop or Cursor, and you can say things like:</p>
<ul>
<li><p>"Create a short URL for example.com with the slug 'promo'"</p>
</li>
<li><p>"Show me click statistics for my campaign URLs"</p>
</li>
<li><p>"Set up A/B testing with 60/40 traffic split"</p>
</li>
</ul>
<p>Is this necessary? No. Is it cool? Absolutely.</p>
<h2 id="heading-lessons-learned">Lessons Learned</h2>
<h3 id="heading-url-shortening-is-harder-than-it-looks">URL Shortening Is Harder Than It Looks</h3>
<p>The basic concept is simple: store a mapping, redirect when visited. But production-ready URL shortening involves:</p>
<ul>
<li><p><strong>Click tracking performance</strong>: Recording every click without slowing down redirects</p>
</li>
<li><p><strong>Bot detection</strong>: Googlebot, Bingbot, and countless crawlers inflate your stats</p>
</li>
<li><p><strong>Geographic parsing</strong>: IP-to-location lookups add latency if not cached</p>
</li>
<li><p><strong>Concurrent access</strong>: Multiple clicks to the same URL at the exact same millisecond</p>
</li>
</ul>
<p>Each of these took more time than I expected.</p>
<h3 id="heading-what-actually-took-the-most-time">What Actually Took the Most Time</h3>
<p>Surprisingly, it wasn't the core URL shortening logic. It was:</p>
<ol>
<li><p><strong>The webhook system</strong>: Ensuring reliable delivery with retries, HMAC signatures, and detailed logging took weeks.</p>
</li>
<li><p><strong>The analytics dashboard</strong>: Getting Recharts to display time-series data exactly how I wanted was... an adventure.</p>
</li>
<li><p><strong>Edge cases</strong>: What happens when a URL expires mid-click? When Redis goes down? When someone submits a slug that's already taken?</p>
</li>
</ol>
<p>Side projects have a way of revealing just how many edge cases exist in "simple" features.</p>
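<p>To make the HMAC-signature idea from the webhook work concrete, here is a minimal sketch using Node's built-in <code>crypto</code> module. The header name and secret format are assumptions; the key points are signing the raw body and comparing in constant time:</p>

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Sketch of HMAC-signed webhook payloads (illustrative, not the project's exact code).
// The sender signs the raw request body; the receiver recomputes the signature
// and compares it with timingSafeEqual to avoid timing side channels.
function signPayload(secret: string, body: string): string {
  return createHmac('sha256', secret).update(body).digest('hex');
}

function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), 'hex');
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

<p>The receiver must verify against the raw bytes it received, before any JSON parsing or re-serialization, since even whitespace changes produce a different signature.</p>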
<h2 id="heading-why-open-source">Why Open Source?</h2>
<p>I've benefited enormously from open source throughout my career. YOURLS taught me how URL shorteners work. NestJS documentation helped me structure the backend. Countless Stack Overflow answers saved me hours of debugging.</p>
<p>Open sourcing Open Short URL under the MIT license is my way of giving back. No commercial plans, no "open core" upsells—just a tool that I hope others find useful.</p>
<p>Open source projects also get better through community feedback. Bug reports I'd never find on my own. Feature requests I'd never think of. PRs that improve the codebase in ways I couldn't have imagined.</p>
<h2 id="heading-whats-next">What's Next</h2>
<p>The roadmap includes:</p>
<ul>
<li><p><strong>Open Graph Customization</strong>: Custom social previews for shared links</p>
</li>
<li><p><strong>Deep Linking</strong>: Native app integration (App Links / Universal Links)</p>
</li>
<li><p><strong>Retargeting Pixels</strong>: Integration with ad platforms</p>
</li>
<li><p><strong>More one-click deploy options</strong>: DigitalOcean, Vercel, Netlify templates</p>
</li>
</ul>
<p>But more than features, I'm excited to see how others use it. What workflows will people build? What integrations will they create?</p>
<hr />
<h2 id="heading-try-it-yourself">Try It Yourself</h2>
<p>Open Short URL is free, open source, and ready to deploy:</p>
<ul>
<li><p><strong>GitHub</strong>: <a target="_blank" href="https://github.com/supra126/open-short-url">github.com/supra126/open-short-url</a></p>
</li>
<li><p><strong>Documentation</strong>: <a target="_blank" href="https://supra126.github.io/open-short-url">supra126.github.io/open-short-url</a></p>
</li>
<li><p><strong>One-click deploy</strong>: Railway (more platforms coming soon)</p>
</li>
<li><p><strong>MCP Package</strong>: <a target="_blank" href="https://www.npmjs.com/package/@open-short-url/mcp">@open-short-url/mcp</a></p>
</li>
</ul>
<p>If you find it useful, a star on GitHub would mean a lot. If you find bugs, issues are welcome. And if you want to contribute, PRs are always open.</p>
<p>Sometimes the best way to solve a problem is to build the solution yourself. And if you're lucky, that solution might help others too.</p>
]]></content:encoded>
      <author>黃小黃</author>
      <pubDate>Wed, 04 Feb 2026 21:58:21 GMT</pubDate>
      <category>side project</category>
      <category>Url Shortener</category>
      <category>TypeScript</category>
      <category>nestjs</category>
      <category>Open Source</category>
    </item>
  </channel>
</rss>