B20 · AI 輔助後端開發 + LLM Backend 詳細 ROADMAP

計畫文件，不會被 Quartz 渲染。回主 roadmap → backend/ROADMAP.md 對稱前端 → frontend/ai-assisted/ROADMAP.md

章節目標

AI 對後端工程師的影響跟前端同等大，分兩個面向：

AI 輔助後端開發——Cursor / Claude Code / Copilot 寫 API / DB migration / 測試
建 AI / LLM Backend 服務——整合 OpenAI / Anthropic API、RAG / Vector DB、Prompt 管理、Cost / Rate limit / 安全

本章不重寫前端已寫過的通用 AI 概念（CH17 有），只聚焦後端特有議題。

🌱 基本介紹

#	主題	Slug	Stage	大綱
01	AI 對後端工程師是什麼	`01-ai-for-backend-engineer`	🌱	兩個面向（輔助開發 / 建 AI service）、跟前端 AI 的分工差異

❓ 為什麼需要

#	主題	Slug	Stage	大綱
02	為什麼後端工程師也被 AI 衝擊	`02-why-ai-disrupts-backend`	🌱	「AI 只影響前端」是誤解、CRUD 樣板、SQL、測試、API 文件都被 AI 吃掉一大塊、價值往哪轉移
03	為什麼 LLM Backend 有獨特技術挑戰	`03-why-llm-backend-unique`	🌱	Token cost 按用量爆、streaming response 必要、Prompt Injection 新攻擊面、非確定性難測試

🕰️ 演進

#	主題	Slug	Stage	大綱
04	後端 AI 發展史	`04-backend-ai-evolution`	🌱	ML 後端（TF Serving 時代）→ LLM API 時代（2022+）→ Agent / MCP / Tool calling（2024+）→ 分散式 LLM inference（2025+）

🧠 知識型

F20-A AI 輔助後端開發

#	主題	Slug	Stage	大綱
05	Cursor / Claude Code / Copilot 後端場景	`05-ai-tools-for-backend`	🌱	寫 CRUD、寫 migration、寫 test、寫 API doc 的 prompt 心法
06	AI 寫 SQL 的坑	`06-ai-writes-sql-pitfalls`	🌱	LLM 幻覺 schema / 用不存在的 function / 寫出 N+1；給 schema + EXPLAIN 對策
07	AI 寫測試的最佳實踐	`07-ai-writes-tests`	🌱	讓 AI 寫 test 有用還是有害、fixture 生成、覆蓋率陷阱
08	AI 輔助 code review（後端視角）	`08-ai-code-review-backend`	🌱	跟資安掃描整合、專屬 prompt、合約測試輔助

F20-B LLM API 整合

#	主題	Slug	Stage	大綱
09	OpenAI / Anthropic / Gemini API 比較	`09-llm-api-comparison`	🌱	Latency / cost / context window / function calling / streaming 支援
10	Token usage 追蹤與計費	`10-token-usage-billing`	🌱	tiktoken / count-before-send、按 user / tenant 分帳、budget alert
11	Rate Limit 與 Retry 策略	`11-llm-rate-limit-retry`	🌱	429 處理、exponential backoff、provider fallback（Anthropic 掛時切 OpenAI）
12	Streaming Response 實作	`12-llm-streaming`	🌱	SSE / chunked response、token-by-token 處理、跟前端 streaming UI 連動
13	Structured Output（JSON mode / tool use）	`13-structured-output`	🌱	讓 LLM 回 schema 對齊 JSON、validation、retry 策略

F20-C RAG 與 Vector

#	主題	Slug	Stage	大綱
14	RAG（Retrieval Augmented Generation）基礎	`14-rag-basics`	🌱	為什麼要 RAG、chunking / embedding / retrieval / generation 四段
15	Vector DB 選型	`15-vector-db-selection`	🌱	pgvector（PG 擴充）vs Pinecone（SaaS）vs Qdrant（自架）vs Chroma（輕量）；跟現有 PG 整合
16	Embedding 策略	`16-embedding-strategy`	🌱	model 選（OpenAI / Cohere / 自訓）、維度、成本、batching
17	Chunking 策略	`17-chunking-strategy`	🌱	fixed / semantic / recursive、overlap、最佳 chunk size 實驗
18	Hybrid Search（向量 + 關鍵字）	`18-hybrid-search`	🌱	為什麼純向量不夠、reranking、BM25 + vector 混合

F20-D Prompt 管理

#	主題	Slug	Stage	大綱
19	Prompt 版本控制	`19-prompt-versioning`	🌱	把 prompt 當 code；git 管 / 環境變數 / PromptHub / LangSmith
20	Prompt Template 設計	`20-prompt-template-design`	🌱	variable substitution、few-shot example、system vs user role 分工
21	A/B Testing Prompt	`21-prompt-ab-testing`	🌱	量化評估（evaluator）、LangSmith / Braintrust / 自建

F20-E Agent / Tool Use

#	主題	Slug	Stage	大綱
22	Tool Use / Function Calling	`22-tool-use`	🌱	OpenAI function / Anthropic tool、自家工具暴露給 LLM 的 API 設計
23	MCP（Model Context Protocol）	`23-mcp`	🌱	Anthropic 2024 標準、服務暴露工具給 AI client；後端如何實作 MCP server
24	Agent Loop 設計	`24-agent-loop`	🌱	ReAct pattern、tool call 循環、max step 限制、cost control
25	Multi-agent 協作	`25-multi-agent`	🌱	Agent A 呼叫 Agent B、timeout、state 管理；複雜場景何時值得

F20-F LLM Backend 可靠性

#	主題	Slug	Stage	大綱
26	Prompt Injection 防禦	`26-prompt-injection-defense`	🌱	indirect injection（從檔案 / URL 來的）、system prompt 外洩、defense-in-depth
27	Output 驗證 / Moderation	`27-output-validation`	🌱	有害內容、PII 遮蔽、行為限制；provider moderation API vs 自建
28	Non-determinism 測試策略	`28-non-deterministic-testing`	🌱	傳統 assertion 不適用、LLM-as-judge、semantic similarity 測試
29	LLM 可觀測性	`29-llm-observability`	🌱	每次 call 的 prompt / response / latency / cost 收集；LangSmith / Helicone / 自建

F20-G 成本與架構

#	主題	Slug	Stage	大綱
30	LLM cost 優化策略	`30-llm-cost-optimization`	🌱	快取 / 小模型分流 / batching / 長 context 壓縮
31	Provider Routing	`31-provider-routing`	🌱	多 provider 切換（OpenRouter / LiteLLM）；依任務類型 route 到最省模型
32	Self-hosted LLM（Llama / Qwen）	`32-self-hosted-llm`	🌱	何時該自架、vLLM / Ollama / Text Generation Inference；cost break-even 分析
33	Edge AI（Cloudflare Workers AI / AWS Bedrock）	`33-edge-ai`	🌱	低延遲場景、模型選項、跟傳統 infra 整合

🔧 小實作注意事項

#	主題	Slug	Stage	大綱
34	從零做一個 RAG 系統（pgvector + FastAPI）	`34-rag-from-scratch`	🌱	ingestion / chunking / embedding / query、最小可跑版本
35	LLM chat backend（streaming + history）	`35-llm-chat-backend`	🌱	SSE + conversation state + context management
36	設計一個 LLM API Client Wrapper	`36-llm-client-wrapper`	🌱	retry / fallback / cost track / logging；類似 proto 的 CacheService 抽象
37	MCP Server from scratch	`37-mcp-server`	🌱	做一個給 Claude Desktop 用的 MCP server 範例

💣 Anti-pattern

#	主題	Slug	Stage	大綱
38	LLM Backend Anti-patterns	`38-llm-backend-antipatterns`	🌱	API key 寫 client / 出現在 git、沒 token cost 追蹤、沒 rate limit 被濫用、回應不 stream、信任 LLM output 直接塞 DB / SQL、沒 prompt version 控制、完全不測（「反正 LLM 不確定」）、用 GPT-4 當所有場景 default

🧰 對應檢查工具

#	主題	Slug	Stage	大綱
39	AI Backend 相關工具	`39-ai-backend-tooling`	🌱	LangChain / LlamaIndex、LangSmith / Helicone / Langfuse（observability）、LiteLLM（gateway）、vLLM / Ollama（self-host）、pgvector / Qdrant（vector）

📎 補充

#	主題	Slug	Stage	大綱
S01	AI 時代後端工程師職涯定位	`s01-ai-backend-career`	🌱	哪些技能變稀有（系統設計 / 分散式 / 資安 / 成本）、哪些變 commodity（CRUD / boilerplate）
S02	Vector DB 深入（跟 B03 連動）	`s02-vector-db-deep`	🌱	index 類型（HNSW / IVF）、量化、跟傳統 RDBMS 整合
S03	AI Gateway（LiteLLM / Portkey）	`s03-ai-gateway`	🌱	多 provider 統一接口、cost 集中、失效切換

章節進度統計

知識主題：39 + 3 補充 = 42 項
🌿 growing：0
🌱 seed：42

跨系列連結

→ frontend/ai-assisted/ CH17（前端 AI；本章是後端對稱視角）
→ backend/database/ B03 #18 Vector DB（本章 F20-C 展開）
→ backend/api-design/ B09（Streaming / SSE）
→ backend/security/ B16 S01（Prompt Injection；本章展開）
→ backend/observability/ B17（LLM observability）
→ ai/ 系列（AI 使用端、工具、prompt 寫作）