[B06][41] Structured Logging: Correlation IDs and End-to-End Log Design

The Gap Between Two Kinds of Logs

When something breaks, you search the log system and see:

Unstructured logs (hard to use):

2026-04-22 10:30:15 User login failed: invalid password
2026-04-22 10:30:16 GET /api/users 200 45ms
2026-04-22 10:30:16 Database error: connection timeout

You know there was a DB connection timeout, but not which request caused it, which user made that request, or what happened before and after.

Structured logs (easy to use):

{
  "timestamp": "2026-04-22T10:30:16.000Z",
  "level": "error",
  "message": "Database connection timeout",
  "requestId": "req-abc123",
  "userId": "user-456",
  "path": "/api/orders",
  "method": "POST",
  "duration": 3041,
  "query": "INSERT INTO orders...",
  "service": "order-service",
  "environment": "production"
}

A shared requestId ties together every log line of one request: its arrival, the DB queries, the external API calls, and the response can all be traced end to end.
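To make the "tie together by requestId" idea concrete, here is a minimal sketch that filters newline-delimited JSON logs down to one request and orders them by time. The `LogEntry` shape, the `traceRequest` helper, and the sample lines are made up for illustration; a real log system (Loki, Elasticsearch) would do this filtering in its own query language.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface LogEntry {
  timestamp: string;
  level: string;
  message: string;
  requestId?: string;
}

// Parse newline-delimited JSON and return one request's entries in time order.
// ISO 8601 UTC timestamps in the same format sort correctly as plain strings.
function traceRequest(ndjson: string, requestId: string): LogEntry[] {
  return ndjson
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as LogEntry)
    .filter((entry) => entry.requestId === requestId)
    .sort((a, b) => a.timestamp.localeCompare(b.timestamp));
}

// Illustrative sample lines, not real production output.
const sample = [
  '{"timestamp":"2026-04-22T10:30:16.000Z","level":"error","message":"Database connection timeout","requestId":"req-abc123"}',
  '{"timestamp":"2026-04-22T10:30:15.900Z","level":"info","message":"Request completed","requestId":"req-other"}',
  '{"timestamp":"2026-04-22T10:30:12.959Z","level":"info","message":"Order placement started","requestId":"req-abc123"}',
].join('\n');

const trace = traceRequest(sample, 'req-abc123');
console.log(trace.map((entry) => entry.message));
```

With unstructured lines the same question would need fragile regex matching on free text; here it is a plain field comparison.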

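Propagating the Correlation ID Through the Whole Chain

A correlation ID (also called a request ID or trace ID) is the unique identifier of one request. Generate it the moment the request enters the system, then attach it to every log line and every downstream service call.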

import { AsyncLocalStorage } from 'node:async_hooks';
import type { NextFunction, Request, Response } from 'express';
import { v4 as uuidv4 } from 'uuid';
 
interface RequestContext { requestId: string; userId?: string }
 
// AsyncLocalStorage carries the correlation ID through one async context automatically,
// so requestId does not have to be passed into every function by hand
const requestContext = new AsyncLocalStorage<RequestContext>();
 
// Middleware: read or generate the correlation ID
export const correlationMiddleware = (req: Request, res: Response, next: NextFunction) => {
  // If an upstream hop (API gateway / load balancer) already set X-Request-Id, inherit it
  const requestId = req.get('x-request-id') || uuidv4();
 
  (req as Request & { requestId?: string }).requestId = requestId;
  res.set('X-Request-Id', requestId);  // returned to the client to make debugging easier
 
  // Put the context into async storage so every later async call can read it.
  // userId stays undefined until an auth middleware sets it on the store.
  requestContext.run({ requestId }, next);
};
 
// Logger: injects the context automatically
const write = (level: 'info' | 'warn' | 'error', message: string, meta?: object) => {
  const ctx = requestContext.getStore();
  const line = JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    message,
    requestId: ctx?.requestId,
    userId: ctx?.userId,
    ...meta,
  });
  // Errors go to stderr, everything else to stdout
  if (level === 'error') console.error(line);
  else console.log(line);
};
 
export const logger = {
  info: (message: string, meta?: object) => write('info', message, meta),
  warn: (message: string, meta?: object) => write('warn', message, meta),
  error: (message: string, meta?: object) => write('error', message, meta),
};
 
// Usage: no need to pass requestId by hand
// (CreateUserDto and userRepo are defined elsewhere in the application)
class UserService {
  async createUser(dto: CreateUserDto) {
    logger.info('Creating user', { email: dto.email });  // requestId is injected automatically
 
    try {
      const user = await userRepo.create(dto);
      logger.info('User created', { userId: user.id });
      return user;
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      logger.error('Failed to create user', { error: message });
      throw error;
    }
  }
}
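The text above says the ID must also travel on every downstream service call, which the middleware alone does not do. A minimal sketch of that step, assuming the same AsyncLocalStorage pattern: `ctxStore` and `withCorrelationHeaders` are illustrative names, and `ctxStore` stands in for the shared request context so the snippet runs on its own.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Stand-in for the request-scoped store; in a real app this is the single
// shared instance that the correlation middleware writes to.
const ctxStore = new AsyncLocalStorage<{ requestId: string }>();

// Merge the current correlation ID into the headers of an outgoing call.
// Outside a request context the headers are returned unchanged.
function withCorrelationHeaders(
  headers: Record<string, string> = {},
): Record<string, string> {
  const ctx = ctxStore.getStore();
  return ctx ? { ...headers, 'X-Request-Id': ctx.requestId } : { ...headers };
}

// Demo: inside run(), the ID is still visible after an await.
const demoHeaders = ctxStore.run({ requestId: 'req-abc123' }, async () => {
  await Promise.resolve();
  return withCorrelationHeaders({ Accept: 'application/json' });
});

demoHeaders.then((headers) => console.log(headers));
```

Pass the result as the `headers` option of whatever HTTP client the service uses. The downstream service then picks the ID up through its own X-Request-Id handling instead of generating a new one.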

What to Log at Each Layer

Request entry and exit (middleware layer):

export const requestLogger = (req, res, next) => {
  const start = Date.now();
 
  res.on('finish', () => {
    logger.info('Request completed', {
      method: req.method,
      path: req.path,
      statusCode: res.statusCode,
      duration: Date.now() - start,
      userAgent: req.headers['user-agent'],
      ip: req.ip,
    });
  });
 
  next();
};

Business events in the service layer (not every function needs a log line; log only the business-critical path):

// ✅ Business events worth logging
logger.info('Order placed', { orderId: order.id, amount: order.amount, userId });
logger.info('Payment processed', { paymentId, provider: 'stripe', amount });
logger.warn('Rate limit exceeded', { userId, path: req.path });
logger.error('Payment failed', { error: error.message, userId, amount });
 
// ❌ Details not worth logging (too noisy)
logger.info('Calling findById');
logger.info('findById returned');
logger.info('Checking if user exists');

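Slow queries (database layer):

// Assumption: Sequelize hands the same options object to beforeQuery and afterQuery,
// so it can key the start time. Verify this against your Sequelize version.
const queryStart = new WeakMap<object, number>();
 
sequelize.addHook('beforeQuery', (options) => {
  queryStart.set(options, Date.now());
});
 
sequelize.addHook('afterQuery', (options) => {
  const startedAt = queryStart.get(options);
  if (startedAt === undefined) return;
  const duration = Date.now() - startedAt;
  if (duration > 1000) {  // queries slower than 1 second
    logger.warn('Slow query detected', {
      query: options.sql,
      duration,
      bind: options.bind,  // may hold sensitive values; redact if needed
    });
  }
});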
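An alternative to timing queries inside a hook is Sequelize's `benchmark: true` option, which passes the elapsed milliseconds as the second argument to the `logging` callback. The sketch below builds such a callback without importing Sequelize, so it runs on its own. `createSlowQueryLogger` and `LogFn` are illustrative names, and the commented wiring line (including `databaseUrl`) is an assumption about how it would be plugged in.

```typescript
type LogFn = (message: string, meta: Record<string, unknown>) => void;

// Returns a callback with the (sql, timingMs) shape that Sequelize passes to
// `logging` when `benchmark: true` is set.
function createSlowQueryLogger(thresholdMs: number, warn: LogFn) {
  return (sql: string, timingMs?: number): void => {
    if (typeof timingMs === 'number' && timingMs > thresholdMs) {
      warn('Slow query detected', { query: sql, duration: timingMs });
    }
  };
}

// Assumed wiring, not executed here:
// new Sequelize(databaseUrl, { benchmark: true, logging: createSlowQueryLogger(1000, logger.warn) });

// Demo with an in-memory sink instead of a real logger.
const slowQueries: Array<Record<string, unknown>> = [];
const onQuery = createSlowQueryLogger(1000, (_message, meta) => slowQueries.push(meta));
onQuery('SELECT 1', 3);                    // fast: ignored
onQuery('SELECT * FROM orders', 1500);     // slow: recorded
console.log(slowQueries);
```

Keeping the threshold as a parameter makes it easy to use a lower value in staging than in production.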

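Log Level Strategy

A common scheme, the one pino uses, has six severity levels from lowest to highest, plus SILENT to switch logging off:

TRACE  → extremely fine-grained tracing (every step of one code path; enabled almost only locally while chasing a specific bug)
DEBUG  → development information (every DB query, middleware execution, variable values)
INFO   → key business events (order created, user logged in, payment succeeded)
WARN   → noteworthy but not urgent (rate limit hit, deprecated endpoint used, slow query)
ERROR  → problems that need handling (DB query failed, external API error, payment failed)
FATAL  → the application is about to crash (cannot obtain the DB connection pool, config validation failed, OOM)
SILENT → turns off all logging (usable in test environments)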

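FATAL vs ERROR: ERROR means this request failed but the application is still alive and serving; FATAL means the whole process cannot continue and is about to exit. Note that logger.fatal() generally only writes the entry (pino also flushes the destination synchronously on fatal); it does not terminate the process, so the application must call process.exit(1) itself after logging.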
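The FATAL path can be sketched as "log, then exit explicitly", which behaves the same whether or not the logger terminates the process on its own. `dieWith` and `FatalLogger` are illustrative names, and the exit function is injectable only so the sketch can be exercised without killing the process.

```typescript
interface FatalLogger {
  fatal: (message: string, meta?: object) => void;
}

// Write one FATAL entry, then terminate with a non-zero exit code.
function dieWith(
  log: FatalLogger,
  message: string,
  error: unknown,
  exit: (code: number) => void = (code) => process.exit(code),
): void {
  const detail = error instanceof Error ? error.message : String(error);
  log.fatal(message, { error: detail });
  exit(1);
}

// Assumed wiring, not executed here:
// process.on('uncaughtException', (err) => dieWith(logger, 'Uncaught exception', err));

// Demo with a fake logger and a fake exit so nothing actually terminates.
const fatalDemo: string[] = [];
dieWith(
  { fatal: (message) => fatalDemo.push(`log:${message}`) },
  'Config validation failed',
  new Error('DATABASE_URL is missing'),
  (code) => fatalDemo.push(`exit:${code}`),
);
console.log(fatalDemo);
```

The order matters: the entry is written before exit is called, so the last line in the log explains why the process stopped.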

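Log level per environment:

development  → DEBUG (see every detail)
staging      → INFO (same behavior as prod, without DEBUG noise)
production   → INFO (keep only business events and warnings; lower the level temporarily while investigating a specific problem)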

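Both pino and winston let you change the log level at runtime by assigning to logger.level, so no server restart is needed. You wire up the trigger yourself, typically an admin HTTP endpoint or a signal handler.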
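How a level threshold and a runtime level change fit together can be shown with a tiny hand-rolled logger. This is a sketch of the mechanism, not pino's or winston's implementation: `LeveledLogger` is an illustrative name, and the numeric values are borrowed from pino's defaults.

```typescript
// Numeric severities (borrowed from pino's defaults); higher means more severe.
const LEVELS = {
  trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60, silent: Infinity,
} as const;
type LevelName = keyof typeof LEVELS;

class LeveledLogger {
  constructor(
    public level: LevelName,
    private sink: (line: string) => void = console.log,
  ) {}

  log(level: Exclude<LevelName, 'silent'>, message: string): void {
    if (LEVELS[level] < LEVELS[this.level]) return; // below the threshold: dropped
    this.sink(JSON.stringify({ level, message }));
  }
}

const appLog = new LeveledLogger('info');
appLog.log('debug', 'cache miss');  // dropped while the threshold is INFO
appLog.level = 'debug';             // e.g. set from an admin endpoint or signal handler
appLog.log('debug', 'cache miss');  // now written
```

Because the threshold is read on every call, changing `level` takes effect immediately for the whole process, which is what makes temporary DEBUG sessions in production practical.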

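Three Types of Logs: Access / Error / Audit

Most people mix all their logs together, but there are really three kinds with entirely different purposes:

Access Log

Records every incoming HTTP request, in a shape close to the Apache/Nginx access.log: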

{
  "type": "access",
  "requestId": "req-abc123",
  "method": "POST",
  "path": "/api/orders",
  "statusCode": 201,
  "duration": 45,
  "ip": "203.0.113.1",
  "userAgent": "Mozilla/5.0 ...",
  "userId": "user-456"
}

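High volume, fixed structure: a good fit for Loki, queried with LogQL, and useful for QPS statistics and slow-request analysis. Express's morgan does exactly this job, but prefer structured JSON over its predefined Apache-style format strings.

Error Log

Records unexpected exceptions and system errors, and must include the stack trace: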

{
  "type": "error",
  "requestId": "req-abc123",
  "level": "error",
  "message": "Database connection timeout",
  "stack": "Error: connect ETIMEDOUT\n    at ...",
  "userId": "user-456",
  "path": "/api/orders"
}

Error logs should feed alerting: ERROR and FATAL entries should trigger a Slack notification or a PagerDuty page. WARN depends on the situation.
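One practical detail behind the "must include the stack trace" requirement: `JSON.stringify(new Error('x'))` produces `{}`, because `message` and `stack` are not enumerable properties. A minimal sketch of an explicit serializer follows; `serializeError` and `ErrorLogFields` are illustrative names.

```typescript
interface ErrorLogFields {
  name: string;
  message: string;
  stack?: string;
}

// Copy the fields of an Error that JSON.stringify would otherwise skip.
// Non-Error throwables (strings, plain objects) are stringified instead.
function serializeError(err: unknown): ErrorLogFields {
  if (err instanceof Error) {
    return { name: err.name, message: err.message, stack: err.stack };
  }
  return { name: 'NonError', message: String(err) };
}

try {
  throw new Error('connect ETIMEDOUT');
} catch (err) {
  console.error(JSON.stringify({ type: 'error', level: 'error', ...serializeError(err) }));
}
```

Libraries such as pino ship their own error serializers; this sketch only shows why one is needed when building the JSON by hand.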

Audit Log

Records who did what to which record, for security compliance and after-the-fact investigation:

{
  "type": "audit",
  "actor": { "userId": "user-456", "email": "alice@example.com", "ip": "203.0.113.1" },
  "action": "user.update",
  "resource": { "type": "user", "id": "user-789" },
  "changes": { "role": { "before": "viewer", "after": "admin" } },
  "timestamp": "2026-04-22T10:30:00.000Z",
  "requestId": "req-abc123"
}

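Special requirements for audit logs:

They must not be deleted or modified (append-only): store them in a dedicated DB table or write-once storage
Retention is usually longer (general logs 30 days; audit logs may need 1 to 7 years, depending on compliance requirements)
They do not have to live in the log system: in many cases a DB table is easier to query

The audit.ts middleware in the Express prototype does exactly this: every write operation (POST/PUT/PATCH/DELETE) automatically records actor + action + resource.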
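The before/after `changes` field in the audit entry above can be produced with a shallow diff of the record before and after the write. The sketch below is not the prototype's audit.ts; `diffFields`, `buildAuditEntry`, and the `AuditEntry` type are illustrative names modeled on the JSON example.

```typescript
type Change = { before: unknown; after: unknown };

interface AuditEntry {
  type: 'audit';
  actor: { userId: string; ip?: string };
  action: string;
  resource: { type: string; id: string };
  changes: Record<string, Change>;
  timestamp: string;
  requestId?: string;
}

// Shallow diff: compares top-level fields with ===, so nested objects
// count as changed whenever their reference differs.
function diffFields(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): Record<string, Change> {
  const changes: Record<string, Change> = {};
  for (const key of Object.keys({ ...before, ...after })) {
    if (before[key] !== after[key]) {
      changes[key] = { before: before[key], after: after[key] };
    }
  }
  return changes;
}

function buildAuditEntry(
  input: Omit<AuditEntry, 'type' | 'changes' | 'timestamp'> & {
    before: Record<string, unknown>;
    after: Record<string, unknown>;
  },
): AuditEntry {
  const { before, after, ...rest } = input;
  return {
    type: 'audit',
    ...rest,
    changes: diffFields(before, after),
    timestamp: new Date().toISOString(),
  };
}

const auditEntry = buildAuditEntry({
  actor: { userId: 'user-456', ip: '203.0.113.1' },
  action: 'user.update',
  resource: { type: 'user', id: 'user-789' },
  before: { role: 'viewer', name: 'Alice' },
  after: { role: 'admin', name: 'Alice' },
  requestId: 'req-abc123',
});
console.log(JSON.stringify(auditEntry, null, 2));
```

Recording only the changed fields keeps entries small and avoids copying unrelated sensitive columns into the audit trail.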

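How the three types divide the work:

Type	Trigger	Main use	Retention
Access	every HTTP request	traffic analysis, debugging	7–30 days
Error	unexpected exceptions	alerting, incident investigation	30–90 days
Audit	writes, sensitive reads	compliance, security investigation	1–7 years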

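Log Sinks: Where Logs Go

Development     → stdout (read it straight from the terminal)
CI/CD           → stdout (collected by the CI system)
K8s production  → stdout → Fluent Bit → Loki / Elasticsearch

The standard practice on Kubernetes is for the application to write only to stdout and let a log collector (Fluent Bit, Filebeat) ship the logs to a central log system (Grafana Loki, the ELK Stack). The application does not handle log rotation or compression; those are infrastructure responsibilities.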

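Structured Logging Tools by Framework

Framework	Recommended tool	Notes
Express / Node.js	`pino` (fastest) or `winston`	pino advertises over 5x faster serialization than alternatives; winston is highly customizable
FastAPI	Python standard `logging` + `python-json-logger`	built-in logging paired with a JSON formatter
NestJS	built-in Logger is replaceable; pino is also supported	`nestjs-pino` integrates directly
Spring Boot	Logback (default) + Logstash encoder	`logstash-logback-encoder` outputs JSON
Laravel	Monolog (built in)	set the channel to `single` or `daily`; add Monolog's JsonFormatter for JSON output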

Terry Yao's Blog
