[B06][43] Background Jobs / Queue: Asynchronous Tasks and the Dead Letter Queue

Why Not Do Everything Inside the HTTP Request

// ❌ Sending email inside the request handler
app.post('/orders', async (req, res) => {
  const order = await orderService.create(req.body);
  await emailService.sendOrderConfirmation(order);  // if the email provider is slow, the response is slow
  await smsService.sendNotification(order);          // if the SMS API is down, the whole request fails
  res.json(order);
});

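Problems:

Response time depends on third parties: when the email provider is slow, the user waits on the entire checkout flow.
A failure means starting over: if the SMS API throws, should the user have to place the order again?
Work is lost on restart: if the server crashes right after the order is placed, the confirmation email is never sent.

The better approach: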

// ✅ Once the order is created, enqueue the follow-up work and respond immediately
app.post('/orders', async (req, res) => {
  const order = await orderService.create(req.body);

  // Enqueue and return right away
  await emailQueue.add('send-order-confirmation', { orderId: order.id });
  await smsQueue.add('send-notification', { orderId: order.id, userId: req.user.id });

  res.status(201).json(order);  // respond without waiting for the email / SMS to finish
});

// A background worker does the actual sending.
// Note: queue.process() is the older Bull API; with BullMQ, register a Worker instead.
emailQueue.process('send-order-confirmation', async (job) => {
  const order = await orderRepo.findById(job.data.orderId);
  await emailService.sendOrderConfirmation(order);
});

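Three Core Capabilities of a Queue

Retry Strategy

Network jitter or an external API that is briefly down are transient errors, and a retry is usually enough. A worker should neither retry forever nor give up after a single attempt.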

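// BullMQ (Node.js)
import { Queue, Worker } from 'bullmq';

// Retry settings are job options, so they go on the Queue (or on each add() call), not on the Worker.
const emailQueue = new Queue('email', {
  connection: redis,
  defaultJobOptions: {
    attempts: 5,                         // up to 5 attempts in total (1 initial try + 4 retries)
    backoff: {
      type: 'exponential',
      delay: 1000,                       // wait 1 s before the 1st retry, 2 s before the 2nd, ...
    },
  },
});

const worker = new Worker('email', async (job) => {
  await emailService.send(job.data);
}, {
  connection: redis,
});

// Fires on every failed attempt, not only the last one
worker.on('failed', (job, error) => {
  logger.error('Job failed', {
    jobId: job?.id,                      // job may be undefined according to the event's type
    jobName: job?.name,
    attempts: job?.attemptsMade,
    error: error.message,
  });
});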

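Exponential backoff is the standard approach: wait 1 second after the first failure, 2 seconds after the next, then 4 seconds. This gives the external service time to recover and avoids hammering it with back-to-back requests.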
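To make the schedule concrete, here is a minimal sketch of the doubling rule described above. The names `backoffDelayMs` and `retrySchedule` are illustrative and not part of any library. Real queue libraries may round the value or add jitter, so treat this as an approximation of the idea rather than a description of BullMQ's internals.

```typescript
// Delay before the n-th retry (n starts at 1): baseDelayMs * 2^(n - 1).
function backoffDelayMs(baseDelayMs: number, retryNumber: number): number {
  if (retryNumber < 1) {
    throw new RangeError('retryNumber must be >= 1');
  }
  return baseDelayMs * Math.pow(2, retryNumber - 1);
}

// With `attempts` total tries there are `attempts - 1` retries.
function retrySchedule(baseDelayMs: number, attempts: number): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(backoffDelayMs(baseDelayMs, retry));
  }
  return delays;
}

console.log(retrySchedule(1000, 5)); // [ 1000, 2000, 4000, 8000 ]
```

With `attempts: 5` and a 1-second base delay, the job waits about 15 seconds in total across its four retries before it is considered failed.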

Dead Letter Queue (DLQ)

Where does a job go once it has used up its retries and still fails?

The answer is a Dead Letter Queue: a dedicated queue that holds jobs that could not be processed.

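// Emulating a DLQ with BullMQ (deadLetterQueue is assumed to be a separate Queue instance)
const emailWorker = new Worker('email', async (job) => {
  await emailService.send(job.data);
}, {
  connection: redis,
});

emailWorker.on('failed', async (job, error) => {
  if (!job) return;  // the event's type allows job to be undefined

  // attempts defaults to 1 when it is not set on the job
  if (job.attemptsMade >= (job.opts.attempts ?? 1)) {
    // Retry limit reached: copy the payload into the DLQ
    await deadLetterQueue.add('failed-email', {
      originalJob: job.data,
      error: error.message,
      failedAt: new Date().toISOString(),
      jobId: job.id,
    });

    // Alert the engineers
    await alertService.send('Email job exhausted retries', { jobId: job.id });
  }
});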

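Jobs in the DLQ need manual intervention:

Determine whether the cause is a code bug or bad data
After the fix, re-enqueue (requeue) the job
Or mark it as permanently failed (record it and run a compensating action)

Without a DLQ, a job that exceeds its retry limit can simply disappear from view: you have no idea how many emails were never delivered or how many notifications never went out.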
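The retry-then-park flow and the requeue step can be sketched without any queue library. Everything below (`DeadLetter`, `runWithDeadLetter`, `requeueDeadLetters`) is a hypothetical in-memory model for illustration, with no backoff delay between attempts; it is not how BullMQ stores failed jobs.

```typescript
interface DeadLetter<T> {
  data: T;
  error: string;
  attemptsMade: number;
  failedAt: string;
}

// Runs `handler` up to `maxAttempts` times. If every attempt throws, the
// payload is recorded in `deadLetters` instead of being dropped.
async function runWithDeadLetter<T>(
  data: T,
  handler: (data: T) => Promise<void>,
  maxAttempts: number,
  deadLetters: DeadLetter<T>[],
): Promise<boolean> {
  let lastError = 'unknown error';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await handler(data);
      return true; // succeeded, nothing to park
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  deadLetters.push({
    data,
    error: lastError,
    attemptsMade: maxAttempts,
    failedAt: new Date().toISOString(),
  });
  return false;
}

// Requeue: after the root cause is fixed, replay each dead letter.
// Returns the entries that still fail, so they stay visible.
async function requeueDeadLetters<T>(
  deadLetters: DeadLetter<T>[],
  handler: (data: T) => Promise<void>,
  maxAttempts: number,
): Promise<DeadLetter<T>[]> {
  const stillDead: DeadLetter<T>[] = [];
  for (const letter of deadLetters) {
    await runWithDeadLetter(letter.data, handler, maxAttempts, stillDead);
  }
  return stillDead;
}
```

The point of the design is that a failed payload always ends up somewhere countable: either it succeeds, or it is in the dead-letter list with the last error attached.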

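Job Deduplication

Another common queue problem: the same job is triggered several times but should run only once.

Typical scenarios:

A user rapidly clicks the "Publish" button and triggers the send-notification job three times
A batch update touches 10 products and each one triggers rebuild-search-index, but the index only needs to be rebuilt once
A webhook sender retries, and your system puts the same event on the queue twice

Deduplicating with BullMQ's jobId: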

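// Set a jobId: a job whose jobId already exists in the queue is not added again.
// This includes finished jobs that are still retained, so consider removeOnComplete
// if the same id must be allowed to run again later.
await searchQueue.add(
  'rebuild-index',
  { category: 'electronics' },
  {
    // Fixed ID. A hyphen is used as the separator because BullMQ uses ':' in its
    // Redis keys, and some BullMQ versions reject custom ids that contain it.
    jobId: 'rebuild-index-electronics',
    // If the queue already holds this jobId, the new add() is ignored
  }
);

// When 10 products are updated in a batch, the 10 add() calls leave at most one job per category
for (const product of updatedProducts) {
  await searchQueue.add(
    'rebuild-index',
    { category: product.category },
    { jobId: `rebuild-index-${product.category}` }  // one job per category
  );
}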
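The effect of keying jobs by a fixed ID can be illustrated with a small in-memory model. `DedupingQueue` is a hypothetical stand-in written for this example, not a BullMQ class; it only mimics the "ignore an add whose ID already exists" behaviour described above.

```typescript
// In-memory stand-in for a queue that ignores add() calls whose jobId already exists.
class DedupingQueue<T> {
  private readonly jobs = new Map<string, T>();

  // Returns true if the job was added, false if the jobId was already present.
  add(jobId: string, data: T): boolean {
    if (this.jobs.has(jobId)) {
      return false;
    }
    this.jobs.set(jobId, data);
    return true;
  }

  size(): number {
    return this.jobs.size;
  }
}

const indexQueue = new DedupingQueue<{ category: string }>();
const updatedCategories = ['electronics', 'books', 'electronics', 'electronics', 'books'];
for (const category of updatedCategories) {
  indexQueue.add(`rebuild-index-${category}`, { category });
}
console.log(indexQueue.size()); // 2 (one job per distinct category)
```

Five updates across two categories leave two pending jobs, which is why a batch only collapses to a single job when every item shares the same key.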

Manual deduplication with Redis SET NX (no dependency on BullMQ's jobId):

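async function addJobOnce(queue: Queue, jobName: string, data: unknown, dedupeKey: string, ttlMs = 5000) {
  const lockKey = `job-dedupe:${dedupeKey}`;
  // Assuming redis is an ioredis client: expiry arguments come first, then NX.
  // Resolves to 'OK' if the key was set, or null if it already existed.
  const acquired = await redis.set(lockKey, '1', 'PX', ttlMs, 'NX');

  if (!acquired) {
    // A job with this key was already enqueued within the TTL window, so skip this one
    return null;
  }

  return queue.add(jobName, data);
}

// Usage
await addJobOnce(
  notificationQueue,
  'send-push',
  { userId, message },
  `push:${userId}`,
  3000  // do not trigger again within 3 seconds
);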
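The time-window behaviour of `SET ... NX` with an expiry can be modelled in memory to see exactly when a repeat trigger is accepted. `DedupeWindow` is a hypothetical single-process analogue with an explicit clock argument so the behaviour is deterministic; unlike Redis, it is not shared between server instances.

```typescript
// In-memory analogue of `SET key value PX ttl NX`: the first caller inside the
// window wins, and later callers are refused until the entry expires.
class DedupeWindow {
  private readonly expiresAt = new Map<string, number>();

  tryAcquire(key: string, ttlMs: number, nowMs: number): boolean {
    const expiry = this.expiresAt.get(key);
    if (expiry !== undefined && nowMs < expiry) {
      return false; // still inside the window
    }
    this.expiresAt.set(key, nowMs + ttlMs);
    return true;
  }
}

const pushWindow = new DedupeWindow();
console.log(pushWindow.tryAcquire('push:42', 3000, 0));    // true
console.log(pushWindow.tryAcquire('push:42', 3000, 1000)); // false
console.log(pushWindow.tryAcquire('push:42', 3000, 3000)); // true
```

The TTL is a trade-off: too short and rapid double-clicks slip through, too long and a legitimate second trigger is silently dropped.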

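Deduplication vs. idempotency: deduplication blocks duplicates before they are enqueued; idempotency guarantees at execution time that running the job again produces the same result. You need both: deduplication avoids wasted work, and idempotency keeps the result correct if deduplication ever fails.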
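The execution-time half of that pair can be sketched as a wrapper around a job handler. `makeIdempotent` and its in-memory `Set` are assumptions made for this example; a production version would more likely record the key in a database with a unique constraint, since a process-local set does not survive restarts or protect against two workers running the same job concurrently.

```typescript
type JobHandler<T> = (data: T) => Promise<void>;

// Wraps a handler so that a second execution with the same key is a no-op.
function makeIdempotent<T>(
  keyOf: (data: T) => string,
  handler: JobHandler<T>,
  processed: Set<string> = new Set<string>(),
): JobHandler<T> {
  return async (data: T): Promise<void> => {
    const key = keyOf(data);
    if (processed.has(key)) {
      return; // already handled: repeating the job changes nothing
    }
    await handler(data);
    processed.add(key); // recorded only after success, so a failed run can be retried
  };
}

// Usage: the confirmation for one order goes out once, however often the job runs.
const sentOrderIds: string[] = [];
const sendConfirmationOnce = makeIdempotent<{ orderId: string }>(
  (job) => `order-confirmation:${job.orderId}`,
  async (job) => {
    sentOrderIds.push(job.orderId); // stand-in for the real email call
  },
);
```

Recording the key after the handler succeeds favours at-least-once delivery: a crash between the send and the record can still cause one duplicate, which is usually the safer failure for notifications.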

Job Priority

Not all jobs are equally important:

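// High priority: password reset email (the user is waiting)
await emailQueue.add('password-reset', { userId }, { priority: 1 });

// Low priority: monthly report generation (not urgent)
await emailQueue.add('monthly-report', { month: '2026-03' }, { priority: 10 });

The smaller the priority value, the higher the priority (BullMQ's convention).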
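As a rough illustration of "smaller value runs first", the ordering can be written as a sort. `PendingJob` and `processingOrder` are hypothetical names, the `newsletter` job is made up for the example, and the first-in-first-out tie-break between equal priorities is an assumption of this sketch rather than a statement about BullMQ's internals.

```typescript
interface PendingJob {
  jobName: string;
  priority: number;    // smaller value = more urgent
  enqueuedSeq: number; // arrival order, used as a FIFO tie-breaker
}

// Returns the jobs in the order a single worker would pick them up.
function processingOrder(jobs: PendingJob[]): PendingJob[] {
  return jobs.slice().sort(
    (a, b) => a.priority - b.priority || a.enqueuedSeq - b.enqueuedSeq,
  );
}

const pendingJobs: PendingJob[] = [
  { jobName: 'monthly-report', priority: 10, enqueuedSeq: 1 },
  { jobName: 'password-reset', priority: 1, enqueuedSeq: 2 },
  { jobName: 'newsletter', priority: 10, enqueuedSeq: 3 },
];
console.log(processingOrder(pendingJobs).map((job) => job.jobName));
// [ 'password-reset', 'monthly-report', 'newsletter' ]
```

The password reset jumps ahead even though it arrived second, while the two priority-10 jobs keep their arrival order.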

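What Belongs in a Queue

Good fit:

Email / SMS / Push notification
Report generation, CSV export
Calls to rate-limited external APIs (the call rate must be controlled)
Image processing, video transcoding
Data synchronization (webhook forwarding, ETL)
Any operation that takes longer than 1–2 seconds

Poor fit:

Operations that must return a result immediately (the client is waiting for the response)
Simple DB operations (a queue only adds another layer of complexity)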

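Queue Tools by Framework

Language / Framework	Recommended tool	Notes
Node.js	BullMQ (Redis-based)	The most widely used; UI dashboard available (Bull Board)
Python	Celery (works with Redis / RabbitMQ)	The standard in the Python ecosystem; mature integration with Django / FastAPI
Java / Spring	Spring Batch / RabbitMQ / Kafka	Enterprise-grade; supports distributed jobs
Laravel	Laravel Queue (built in; works with Redis / SQS)	Deeply integrated with Laravel; Horizon provides a dashboard
Rails	Sidekiq (Redis-based)	The most mature community; unified interface through Active Job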

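Crontab vs Queue

Article 24 covered when to use crontab. Crontab and a queue are not competitors:

Crontab → time-triggered (generate reports every day at 2 AM)
Queue   → event-triggered (send an email after a user places an order)

In practice, many scenarios combine Cron + Queue: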

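// Cron enqueues a large batch of jobs on a schedule; multiple workers process them in parallel.
// '0 2 1 * *' = 02:00 on the 1st of each month, to match the monthly report.
cron.schedule('0 2 1 * *', async () => {
  // Fetch every user who needs a monthly report
  const users = await userRepo.findSubscribers();
  for (const user of users) {
    // Do not generate the report inside the cron callback; enqueue it instead.
    // The month is hard-coded for illustration; derive it from the run date in practice.
    await reportQueue.add('monthly-report', { userId: user.id, month: '2026-03' });
  }
});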
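The job above carries a fixed `month` value. One way to derive it from the time the cron fires is sketched below; `previousMonthLabel` is a hypothetical helper, and it computes in UTC as a simplifying assumption, whereas a real scheduler usually runs in the server's configured time zone.

```typescript
// Returns the calendar month before `now` as 'YYYY-MM', computed in UTC.
function previousMonthLabel(now: Date): string {
  // Date.UTC normalizes month -1 to December of the previous year.
  const firstOfPrev = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - 1, 1));
  const month = firstOfPrev.getUTCMonth() + 1; // getUTCMonth() is 0-based
  const mm = month < 10 ? `0${month}` : `${month}`;
  return `${firstOfPrev.getUTCFullYear()}-${mm}`;
}

console.log(previousMonthLabel(new Date(Date.UTC(2026, 3, 1, 2, 0)))); // 2026-03
console.log(previousMonthLabel(new Date(Date.UTC(2026, 0, 1, 2, 0)))); // 2025-12
```

A run on 1 April therefore enqueues reports for March, and a run on 1 January rolls back to December of the previous year.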

Terry Yao's Blog
