AI04 · AI Engineering Practice (Concept Layer): Detailed ROADMAP

Planning document; Quartz does not render it. Back to the main roadmap → ai/ROADMAP.md

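⚡ Starting soon

Priority	Topic	Slug	Chapter location
1	AI Gateway and Agent platform selection: how to choose among LiteLLM / Portkey / OpenClaw	`06-ai-gateway-products`	F04-A #06
2	Agent authorization boundaries and data isolation: how to ensure an agent touches only a specific person's specific data	`17b-agent-authorization-and-data-isolation`	F04-F #17b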

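Material direction (#06): first-hand experience managing multiple AI providers with OpenClaw; how its role differs from a pure Gateway (LiteLLM); the decision points for self-hosted vs SaaS.

Material direction (#17b): the authorization problems most easily overlooked in multi-user agent systems; the four-layer architecture, including tool calls that carry user context, RAG filters, and memory isolation; real scenarios such as an agent that queries the CRM for you but must not see other people's customer data.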
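
The scoping idea in the #17b material can be sketched as follows. This is a minimal illustration under stated assumptions, not the planned article's design: `UserContext`, `crm_lookup`, `rag_filter`, `memory_key`, and the in-memory `CRM_ROWS` are all hypothetical names. What it tries to show is that the caller's identity comes from the authenticated session on the server side and is never taken from model-generated tool arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    """Identity resolved by the server from the authenticated session."""
    user_id: str
    tenant_id: str


# Hypothetical in-memory rows standing in for a real CRM datastore.
CRM_ROWS = [
    {"customer": "Acme", "owner_id": "alice", "tenant_id": "t1"},
    {"customer": "Globex", "owner_id": "bob", "tenant_id": "t1"},
    {"customer": "Initech", "owner_id": "alice", "tenant_id": "t2"},
]


def crm_lookup(ctx: UserContext, query: str) -> list[dict]:
    """Tool handler: the model supplies only `query`; scope comes from `ctx`."""
    return [
        row for row in CRM_ROWS
        if row["owner_id"] == ctx.user_id
        and row["tenant_id"] == ctx.tenant_id
        and query.lower() in row["customer"].lower()
    ]


def rag_filter(ctx: UserContext) -> dict:
    """Metadata filter to attach to every retrieval call for this user."""
    return {"tenant_id": ctx.tenant_id, "owner_id": ctx.user_id}


def memory_key(ctx: UserContext, session_id: str) -> str:
    """Namespace agent memory so sessions of different users never share a key."""
    return f"{ctx.tenant_id}:{ctx.user_id}:{session_id}"
```

Calling `crm_lookup(UserContext("alice", "t1"), "")` in this sketch returns only the Acme row, even though Bob's customer lives in the same tenant.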

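Chapter goals

Architecture decisions and concepts in AI engineering: Gateway selection, Observability architecture, Evaluation methodology, version management, Security design, and SLA strategy. This chapter focuses on the engineering thinking behind "how to do it / what to choose"; hands-on implementation lives in AI08.

Division of labor with other chapters:

AI08 Build Applications = hands-on building (writing RAG / API / fine-tune / local deploy from scratch)
AI09 Build Extensions = writing extensions (Skill / MCP / tool)
This chapter = concepts / architecture / selection / governance (read to decide)
backend/ai/ B20 = the backend engineer's perspective; this chapter takes the AI system perspective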

🌱 Basic introduction

#	Topic	Slug	Stage	Outline
01	The AI engineering landscape	`01-ai-engineering-landscape`	🌱	The stack of engineering decisions from API abstraction to production: Gateway / Obs / Eval / Versioning / SLA / Security

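❓ Why it is needed

#	Topic	Slug	Stage	Outline
02	Why a demo is not production	`02-why-demo-not-production`	🌱	Hallucination / cost / latency / rate limit / missing eval; problems discovered only after launch
03	Why an AI Gateway abstraction layer	`03-why-ai-gateway`	🌱	Multi-provider switching, cost control, rate limiting, centralized secrets; without a Gateway, every service integrates providers on its own, which ends in a mess
04	Why AI always needs Eval	`04-why-evaluation`	🌱	A prompt change can make one class of input regress while another improves; without eval the regression stays invisible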

🕰️ Evolution

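#	Topic	Slug	Stage	Outline
05	Drivers of AI engineering evolution	`05-engineering-evolution-drivers`	🌱	Single-point API calls hit a wall at production scale → AI Gateway; scattered prompts hit a maintenance wall → Prompt version control; black-box output hits a quality wall → LLM Observability; manual eval hits a coverage wall → automated eval framework; DIY provider integration hits a lock-in wall → LiteLLM / Portkey abstraction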

🧠 Knowledge topics

F04-A AI Gateway & abstraction layer

#	Topic	Slug	Stage	Outline
06	AI Gateway and Agent platform selection	`06-ai-gateway-products`	⚡🌱	LiteLLM / Portkey / Helicone / Kong AI Gateway / OpenRouter vs OpenClaw-style self-hosted Agent Platforms; the role difference between a pure Gateway (provider abstraction layer) and an Agent Platform (full stack including skill / session / routing); self-hosted vs SaaS; provider failover; cost aggregation; ties in with I02 Gateway S01 / I06 #36
07	Multi-provider architecture patterns	`07-multi-provider-patterns`	🌱	Primary/Fallback / Load balance / Routing by task; architecture decisions for responding to a provider outage; implementation in AI08 #07
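
The Primary/Fallback pattern named in #07 can be sketched in a few lines. This is an assumption-laden illustration, not the AI08 #07 implementation: `complete_with_fallback`, `flaky_primary`, and `stable_fallback` are hypothetical names, and the providers are plain callables rather than real SDK clients.

```python
from typing import Callable, Sequence

# A provider is modeled as any callable that maps a prompt to a completion.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Provider]],
) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, completion)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Hypothetical primary provider that simulates an outage."""
    raise TimeoutError("simulated provider outage")


def stable_fallback(prompt: str) -> str:
    """Hypothetical fallback provider that always answers."""
    return f"echo: {prompt}"
```

With `[("primary", flaky_primary), ("fallback", stable_fallback)]` as the provider list, the call falls through to the second entry. Load balancing and routing by task would replace the fixed ordering with a selection step.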

F04-B RAG architecture (concepts)

#	Topic	Slug	Stage	Outline
08	RAG architecture patterns	`08-rag-patterns`	🌱	Naive RAG / Advanced RAG (pre-retrieval + post-retrieval) / Modular RAG / GraphRAG; where each one fits; implementation in AI08 F08-B
09	Retrieval strategy decisions	`09-retrieval-strategy`	🌱	Dense / Sparse / Hybrid / Reranking / HyDE / Query expansion; linked to I04 #30 Vector DB operations

F04-C Quality management (concepts)

#	Topic	Slug	Stage	Outline
10	Prompt versioning strategy	`10-prompt-versioning-strategy`	🌱	Prompt as code vs prompt as data; version control / A/B test / rollback architecture; ties in with the AI07 #12 Prompt Iteration methodology
11	Model versioning and switching strategy	`11-model-versioning`	🌱	Breaking changes from provider upgrades; fallback strategy; cost control; the OpenAI model deprecation case
12	SLA design for AI services	`12-ai-sla-design`	🌱	Latency budget (p50 / p95 / p99); token estimation; rate limiting; provider outage impact assessment
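
As a small illustration of the latency budget in #12, the sketch below computes p50 / p95 / p99 with the nearest-rank method and compares them against a budget. The function names and the budget numbers are assumptions made for this example; a production system would normally read percentiles from its metrics backend instead.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def budget_violations(samples: list[float], budget_ms: dict[int, float]) -> dict[int, float]:
    """Return {percentile: observed latency} for every percentile over its budget."""
    observed = {p: percentile(samples, p) for p in budget_ms}
    return {p: value for p, value in observed.items() if value > budget_ms[p]}


# Hypothetical latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = [float(n) for n in range(1, 101)]

# Hypothetical budget: p50 under 60 ms, p95 under 90 ms, p99 under 120 ms.
violations = budget_violations(latencies_ms, {50: 60.0, 95: 90.0, 99: 120.0})
```

Here only the p95 target is missed, which is the typical shape of an LLM latency problem: the median looks fine while the tail does not.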

F04-D LLM Observability (a prominent field since 2024)

#	Topic	Slug	Stage	Outline
13	The LLM Observability landscape	`13-llm-observability-landscape`	🌱	Langfuse / LangSmith / Arize Phoenix / Helicone compared; trace structure (prompt → response → eval score); division of labor with I05 #28 Grafana Unified + #30 APM
14	Token usage cost monitoring	`14-token-cost-monitoring`	🌱	Per-tenant / per-user cost tracking; alert on abnormal usage; billing integration
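
Per-tenant cost tracking with a simple alert, as outlined in #14, might look like the sketch below. The model name, the prices, and the `CostTracker` class are hypothetical; real prices come from the provider's price list and real usage numbers from the API response.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens; not real provider pricing.
PRICES = {"model-a": {"input": 1.0, "output": 2.0}}


class CostTracker:
    """Accumulates spend per tenant and flags tenants above a threshold."""

    def __init__(self, alert_threshold_usd: float) -> None:
        self.alert_threshold_usd = alert_threshold_usd
        self.spend_usd: dict[str, float] = defaultdict(float)

    def record(self, tenant: str, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add one request's cost to the tenant's total and return that cost."""
        price = PRICES[model]
        cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
        self.spend_usd[tenant] += cost
        return cost

    def tenants_over_threshold(self) -> list[str]:
        """Tenants whose accumulated spend exceeds the alert threshold."""
        return sorted(t for t, s in self.spend_usd.items() if s > self.alert_threshold_usd)


tracker = CostTracker(alert_threshold_usd=1.5)
first_cost = tracker.record("tenant-1", "model-a", input_tokens=1_000_000, output_tokens=500_000)
tracker.record("tenant-2", "model-a", input_tokens=100_000, output_tokens=0)
```

Tracking per user instead of per tenant only changes the dictionary key; billing integration would read from the same accumulated totals.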

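F04-E AI Evaluation methodology (concepts)

#	Topic	Slug	Stage	Outline
15	AI Evaluation methodology	`15-ai-evaluation-methodology`	🌱	Building eval datasets; metric selection (accuracy / BLEU / relevance); regression testing; this chapter covers eval design thinking; for building an eval framework hands-on see AI08 #22, and for the eval-driven dev workflow see AI07 #11
16	Human vs LLM-as-Judge trade-offs	`16-human-vs-llm-judge`	🌱	Human eval is accurate but expensive; an LLM judge is cheap but biased; hybrid strategies; bias correction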

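F04-F AI app-layer Security (ties in with I06 F06-F)

#	Topic	Slug	Stage	Outline
17	AI application-layer Security	`17-ai-app-security`	🌱	Prompt Injection defense (input sanitization / role hardening), Output Sanitization (PII filtering, markdown injection, URL allowlist), Data leakage (do not leak secrets in prompts); three-layer split with I06 F06-F: this article is the app layer, I06 is the infra Gateway layer, AI01 a07 is the user layer
17b	Agent authorization boundaries and data isolation	`17b-agent-authorization-and-data-isolation`	🌿
17c	The Agent Guardrail tool ecosystem: "antivirus software" for agents	`17c-agent-guardrails`	🌱	Active monitoring while an agent runs: detect and block out-of-bounds behavior, context pollution, and abnormal output; Guardrails AI / NeMo Guardrails / Lakera Guard compared; division of labor with 17b: #17b defines the scope ahead of time, while this article covers real-time protection at run time; the "AI version of antivirus software" angle is one nobody is talking about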

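🔧 Notes for small hands-on exercises

(This is a concept chapter; hands-on implementation is concentrated in AI08 / AI09)

#	Topic	Slug	Stage	Outline
18	Drawing an architecture diagram for an AI app	`18-draw-ai-architecture`	🌱	Standard way to draw Gateway / Model / Vector DB / Cache / Queue; key data flows (prompt / embedding / eval)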

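💣 Anti-pattern

#	Topic	Slug	Stage	Outline
19	AI engineering concept anti-patterns	`19-ai-engineering-antipatterns`	🌱	Calling providers directly without a Gateway; hard-coding API keys; unversioned prompts; RAG without retrieval quality monitoring; no token budget control, leading to a runaway bill; allowing prompts to read system config; passing LLM output directly to eval()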
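
For the last anti-pattern in #19, one common alternative to calling `eval()` on model output is to parse it as data and validate it against an allowlist. The sketch below assumes the model was asked to reply with a JSON object; `ALLOWED_ACTIONS` and `parse_llm_action` are hypothetical names chosen for this example.

```python
import json

# Hypothetical allowlist of actions the application is willing to execute.
ALLOWED_ACTIONS = {"search", "summarize"}


def parse_llm_action(raw: str) -> dict:
    """Parse model output as JSON data; never execute it as code."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {action!r}")
    # Keep only the fields the application expects, coerced to plain strings.
    return {"action": action, "query": str(data.get("query", ""))}
```

A payload such as `__import__("os").system("ls")` would run under `eval()`, but here it fails JSON parsing and is rejected with a `ValueError`.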

🧰 Corresponding check tools

#	Topic	Slug	Stage	Outline
20	AI engineering tooling overview	`20-ai-engineering-tooling`	🌱	Gateway: LiteLLM / Portkey / Helicone / OpenRouter; RAG Framework: LangChain / LlamaIndex / Haystack; Observability: Langfuse / LangSmith / Arize Phoenix; Eval: RAGAS / DeepEval / Promptfoo / G-Eval; Security: Lakera / Rebuff / PromptArmor / NeMo Guardrails

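Chapter progress statistics

Knowledge topics: 20
🌿 growing: 0 (the 4 RAG implementation articles absorbed earlier moved to AI08)
🌱 seed: 20

Scope change for this chapter (2026-04):

The former F-A API integration / F-B RAG implementation / F-C Fine-tuning / F-D Workflow implementation moved to AI08 Build Applications
The former small hands-on exercises (Gateway+Obs stack / RAG pipeline / Eval framework building) moved to AI08
This chapter is trimmed down to concepts + selection + architecture; after reading it you know "which engineering decisions to make"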

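Cross-series links

→ AI07 Methodology (methodological thinking)
→ AI08 Build Applications (hands-on implementation of this chapter's concepts)
→ AI09 Build Extensions (a different but complementary extension direction)
→ backend/ B20 (backend perspective)
→ infra/security-governance/ I06 F06-F (AI Gateway infra gatekeeping)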
→ infra/gateway-mesh/ S01 Gateway for AI agents
→ infra/observability/ I05 #28 Grafana Unified + #30 APM

Terry Yao's Blog
