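Retry Mechanisms and Exponential Backoff: Building Resilient Systems
Failed network requests are the norm, not the exception. A robust system retries failed requests according to a deliberate strategy instead of giving up immediately or retrying blindly.
I. Why retry?
1.1 Transient errors
Many errors are transient: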
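| Error | Cause | Does retrying help? |
|---|---|---|
| 503 Service Unavailable | Server overloaded | ✅ Very likely |
| 504 Gateway Timeout | Upstream timed out | ✅ Possibly |
| Network timeout | Network fluctuation | ✅ Possibly |
| Connection dropped | Unstable network | ✅ Possibly |
| 429 Too Many Requests | Rate limiting | ✅ After waiting a while |
| 404 Not Found | Resource does not exist | ❌ The result will not change |
| 401 Unauthorized | Not authenticated | ❌ Authentication is required first |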
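The table can be expressed as a small predicate. This is a minimal sketch: the name `isRetryableStatus` and the exact set of status codes are assumptions taken from the table rows, not a complete policy. Network timeouts and dropped connections surface as exceptions rather than status codes, so they are not covered here.

```javascript
// Hypothetical helper: status codes the table marks as worth retrying.
const RETRYABLE_STATUSES = new Set([429, 503, 504]);

function isRetryableStatus(status) {
  return RETRYABLE_STATUSES.has(status);
}

console.log(isRetryableStatus(503)); // true
console.log(isRetryableStatus(404)); // false
```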
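1.2 Risks of retrying
> **Warning:** Careless retries can cause a thundering herd: when many clients retry a struggling service at the same moment, the retries themselves add load and can delay its recovery.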
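II. Exponential Backoff
2.1 Core concept
The wait time grows exponentially with each retry:
- Retry 1: wait 1 second
- Retry 2: wait 2 seconds
- Retry 3: wait 4 seconds
- Retry 4: wait 8 seconds
- ...

Formula: `delay = base * 2^attempt`, where `attempt` starts at 0 for the first retry.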
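As a worked example of the formula, the sketch below lists the delay before each retry and the total time spent waiting. The helper name `backoffSchedule` is hypothetical and used only for illustration.

```javascript
// Hypothetical helper: list the delay (in ms) before each retry.
// `attempt` starts at 0, so the first retry waits exactly `baseMs`.
function backoffSchedule(baseMs, retries) {
  const delays = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(baseMs * 2 ** attempt);
  }
  return delays;
}

const delays = backoffSchedule(1000, 4);
console.log(delays); // [ 1000, 2000, 4000, 8000 ]
console.log(delays.reduce((sum, d) => sum + d, 0)); // 15000
```

Four retries with a 1-second base therefore add up to 15 seconds of waiting, which is worth keeping in mind when choosing the retry count.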
2.2 Basic implementation
```javascript
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    let response;
    try {
      response = await fetch(url, options);
    } catch (error) {
      // Network-level failure: treat it as retryable
      lastError = error;
    }
    if (response) {
      if (response.ok) {
        return response;
      }
      // Retry only 5xx errors; fail immediately on anything else.
      // This throw sits outside the try block so it is not swallowed and retried.
      if (response.status < 500) {
        throw new Error(`HTTP ${response.status}`);
      }
      lastError = new Error(`HTTP ${response.status}`);
    }
    // Wait before the next attempt, but not after the final one
    if (attempt < maxRetries - 1) {
      const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s...
      console.log(`Retry ${attempt + 1}/${maxRetries - 1} in ${delay}ms`);
      await sleep(delay);
    }
  }
  throw lastError;
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```
2.3 Adding a cap
Cap the delay so that later retries do not wait too long:
```javascript
function calculateDelay(attempt, options = {}) {
  const { baseDelay = 1000, maxDelay = 30000, factor = 2 } = options;
  const delay = baseDelay * Math.pow(factor, attempt);
  return Math.min(delay, maxDelay);
}
// Result: 1s, 2s, 4s, 8s, 16s, 30s, 30s, 30s...
```
III. Jitter
3.1 Why jitter?
If every client retries at the same moment, the retries arrive as a spike of requests.
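To make the spike concrete, the sketch below simulates a group of clients that all failed at the same time and counts how many retries land in the busiest 100 ms window, with and without randomization. All names here are illustrative assumptions, and `random` is injectable only so the demo can be made reproducible.

```javascript
// Compute when each client retries (ms after the shared failure at t = 0).
function retryTimes(clients, attempt, baseMs, jitter, random = Math.random) {
  const times = [];
  for (let i = 0; i < clients; i++) {
    const exponentialDelay = baseMs * 2 ** attempt;
    times.push(jitter ? random() * exponentialDelay : exponentialDelay);
  }
  return times;
}

// Group retry times into fixed windows and return the busiest window's count.
function peakPerBucket(times, bucketMs) {
  const counts = new Map();
  for (const t of times) {
    const bucket = Math.floor(t / bucketMs);
    counts.set(bucket, (counts.get(bucket) || 0) + 1);
  }
  return Math.max(...counts.values());
}

// 1000 clients, second retry (attempt = 1), 1-second base delay.
const noJitter = retryTimes(1000, 1, 1000, false);
const withJitter = retryTimes(1000, 1, 1000, true);
console.log(peakPerBucket(noJitter, 100)); // 1000: every client retries at t = 2000 ms
console.log(peakPerBucket(withJitter, 100)); // far lower; the exact value varies per run
```

Without jitter the server sees all 1000 retries in a single window; with jitter they are spread across the whole 2-second range.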
3.2 Jitter strategies
1. Full Jitter
```javascript
function fullJitter(baseDelay, attempt) {
  const exponentialDelay = baseDelay * Math.pow(2, attempt);
  return Math.random() * exponentialDelay;
}
// Range: [0, exponentialDelay)
```
2. Equal Jitter
```javascript
function equalJitter(baseDelay, attempt) {
  const exponentialDelay = baseDelay * Math.pow(2, attempt);
  const half = exponentialDelay / 2;
  return half + Math.random() * half;
}
// Range: [exponentialDelay/2, exponentialDelay)
```
3. Decorrelated Jitter
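```javascript
// Each delay is drawn from [baseDelay, previousDelay * 3) and capped at maxDelay.
// The caller passes the previous delay back in; the first call starts from baseDelay.
function decorrelatedJitter(baseDelay, maxDelay, previousDelay = baseDelay) {
  return Math.min(
    maxDelay,
    Math.random() * (previousDelay * 3 - baseDelay) + baseDelay
  );
}

// Usage: feed each result into the next call
let delay = decorrelatedJitter(1000, 30000);
delay = decorrelatedJitter(1000, 30000, delay);
```
3.3 Recommended implementation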
```javascript
function calculateDelayWithJitter(attempt, options = {}) {
  const {
    baseDelay = 1000,
    maxDelay = 30000,
    jitter = "full", // 'none', 'full', 'equal'
  } = options;
  // Exponential backoff, capped at maxDelay
  let delay = baseDelay * Math.pow(2, attempt);
  delay = Math.min(delay, maxDelay);
  switch (jitter) {
    case "full":
      return Math.random() * delay;
    case "equal":
      return delay / 2 + Math.random() * (delay / 2);
    default:
      return delay;
  }
}
```
IV. Complete retry implementation
4.1 A configurable retrier
```javascript
class RetryableRequest {
  constructor(options = {}) {
    this.maxRetries = options.maxRetries ?? 3;
    this.baseDelay = options.baseDelay ?? 1000;
    this.maxDelay = options.maxDelay ?? 30000;
    this.jitter = options.jitter ?? "full";
    this.retryCondition = options.retryCondition ?? this.defaultRetryCondition;
    this.onRetry = options.onRetry ?? (() => {});
  }

  defaultRetryCondition(error, response) {
    // Network error (fetch rejects with a TypeError)
    if (error instanceof TypeError) return true;
    // Timeout; `error` is null when a response was received, hence the `?.`
    if (error?.name === "AbortError") return true;
    // 5xx errors
    if (response && response.status >= 500) return true;
    // 429 rate limiting
    if (response && response.status === 429) return true;
    return false;
  }

  calculateDelay(attempt) {
    let delay = this.baseDelay * Math.pow(2, attempt);
    delay = Math.min(delay, this.maxDelay);
    if (this.jitter === "full") {
      return Math.random() * delay;
    } else if (this.jitter === "equal") {
      return delay / 2 + Math.random() * (delay / 2);
    }
    return delay;
  }

  async execute(requestFn) {
    let lastError;
    let lastResponse;
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        const response = await requestFn();
        if (response.ok) {
          return response;
        }
        // Keep only the most recent outcome so a stale error is not thrown later
        lastResponse = response;
        lastError = undefined;
        if (!this.retryCondition(null, response)) {
          return response;
        }
      } catch (error) {
        lastError = error;
        if (!this.retryCondition(error, null)) {
          throw error;
        }
      }
      // Wait before the next attempt, but not after the final one
      if (attempt < this.maxRetries) {
        const delay = this.calculateDelay(attempt);
        this.onRetry(attempt + 1, delay);
        await this.sleep(delay);
      }
    }
    if (lastError) throw lastError;
    return lastResponse;
  }

  sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }
}

// Usage (top-level await requires an ES module)
const retryable = new RetryableRequest({
  maxRetries: 3,
  baseDelay: 1000,
  jitter: "full",
  onRetry: (attempt, delay) => {
    console.log(`Retry ${attempt} in ${delay}ms`);
  },
});
const response = await retryable.execute(() => fetch("/api/data"));
```
4.2 Axios retry interceptor
```javascript
import axios from "axios";

function createRetryInterceptor(options = {}) {
  return async (error) => {
    const config = error.config;
    if (!config) {
      return Promise.reject(error);
    }
    const { maxRetries = 3, baseDelay = 1000 } = options;
    config._retryCount = config._retryCount || 0;
    // Decide whether this failure should be retried
    const shouldRetry =
      config._retryCount < maxRetries &&
      (error.code === "ECONNABORTED" || // request timed out
        !error.response || // no response received (network error)
        error.response.status >= 500 ||
        error.response.status === 429);
    if (!shouldRetry) {
      return Promise.reject(error);
    }
    // Exponential backoff with full jitter: up to 1s, 2s, 4s...
    const delay = Math.random() * baseDelay * Math.pow(2, config._retryCount);
    config._retryCount++;
    await new Promise((resolve) => setTimeout(resolve, delay));
    return axios(config);
  };
}

// Usage
axios.interceptors.response.use(
  (response) => response,
  createRetryInterceptor({ maxRetries: 3 })
);
```
V. Circuit Breaker
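5.1 What is a circuit breaker?
When too many requests fail, the breaker temporarily stops sending requests so that a struggling service is not hit continuously.
5.2 The three states
| State | Behavior |
|---|---|
| Closed | Requests are sent normally |
| Open | Requests are rejected immediately (fail fast) |
| Half-Open | A small number of trial requests are allowed through |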
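The transitions between these states can be sketched as a pure lookup table. The event names below (`failure_threshold_reached`, `timeout_elapsed`, `probe_succeeded`, `probe_failed`) are assumptions made for illustration; a real breaker derives them from counters and timers.

```javascript
// Allowed transitions for each state, keyed by a hypothetical event name.
const TRANSITIONS = {
  CLOSED: { failure_threshold_reached: "OPEN" },
  OPEN: { timeout_elapsed: "HALF_OPEN" },
  HALF_OPEN: { probe_succeeded: "CLOSED", probe_failed: "OPEN" },
};

function nextState(state, event) {
  // Events that do not apply to the current state leave it unchanged.
  return TRANSITIONS[state]?.[event] ?? state;
}

console.log(nextState("CLOSED", "failure_threshold_reached")); // OPEN
console.log(nextState("OPEN", "timeout_elapsed")); // HALF_OPEN
console.log(nextState("HALF_OPEN", "probe_failed")); // OPEN
```

Note that Open never goes straight back to Closed: the breaker always passes through Half-Open so that a few trial requests can confirm the service has recovered.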
5.3 Implementation
```javascript
class CircuitBreaker {
  constructor(options = {}) {
    this.failureThreshold = options.failureThreshold ?? 5; // consecutive failures before opening
    this.successThreshold = options.successThreshold ?? 2; // trial successes before closing
    this.timeout = options.timeout ?? 30000; // ms to stay open before allowing a trial
    this.state = "CLOSED";
    this.failureCount = 0;
    this.successCount = 0;
    this.lastFailureTime = null;
  }

  async execute(requestFn) {
    if (this.state === "OPEN") {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = "HALF_OPEN";
        console.log("Circuit: HALF_OPEN");
      } else {
        throw new Error("Circuit is OPEN");
      }
    }
    try {
      const result = await requestFn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    if (this.state === "HALF_OPEN") {
      this.successCount++;
      if (this.successCount >= this.successThreshold) {
        this.state = "CLOSED";
        this.failureCount = 0;
        this.successCount = 0;
        console.log("Circuit: CLOSED");
      }
    } else {
      // A success while closed resets the consecutive-failure count
      this.failureCount = 0;
    }
  }

  onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.state === "HALF_OPEN") {
      // Any failure during a trial reopens the circuit
      this.state = "OPEN";
      this.successCount = 0;
      console.log("Circuit: OPEN (from HALF_OPEN)");
    } else if (this.failureCount >= this.failureThreshold) {
      this.state = "OPEN";
      console.log("Circuit: OPEN");
    }
  }

  getState() {
    return this.state;
  }
}

// Usage
const breaker = new CircuitBreaker({
  failureThreshold: 5,
  timeout: 30000,
});

async function fetchWithBreaker(url) {
  return breaker.execute(async () => {
    const response = await fetch(url);
    // fetch() rejects only on network errors, so surface 5xx responses as failures
    if (response.status >= 500) {
      throw new Error(`HTTP ${response.status}`);
    }
    return response;
  });
}
```
VI. The Retry-After header
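6.1 Respect the server's advice
The server may use the `Retry-After` header to tell you when to retry, either as a number of seconds or as an HTTP date:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60

HTTP/1.1 503 Service Unavailable
Retry-After: Tue, 21 Oct 2025 07:28:00 GMT
```
6.2 Handling Retry-After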
```javascript
function getRetryDelay(response, defaultDelay) {
  const retryAfter = response.headers.get("Retry-After");
  if (!retryAfter) {
    return defaultDelay;
  }
  // Seconds format, e.g. "60" (digits only, so a value like "60abc" is rejected)
  if (/^\d+$/.test(retryAfter.trim())) {
    return parseInt(retryAfter, 10) * 1000;
  }
  // Date format: wait until that time, never a negative duration
  const date = new Date(retryAfter);
  if (!isNaN(date.getTime())) {
    return Math.max(0, date.getTime() - Date.now());
  }
  return defaultDelay;
}

// Usage (inside an async retry loop; sleep() is the helper from section 2.2)
if (response.status === 429 || response.status === 503) {
  const delay = getRetryDelay(response, 5000);
  await sleep(delay);
  // retry the request
}
```
VII. Idempotency considerations
7.1 Methods that are safe to retry
| Method | Idempotent? | Safe to retry? |
|---|---|---|
| GET | ✅ | ✅ |
| HEAD | ✅ | ✅ |
| PUT | ✅ | ✅ |
| DELETE | ✅ | ✅ |
| POST | ❌ | ⚠️ Needs care |
| PATCH | ❌ | ⚠️ Needs care |
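A retry layer can consult this table before resending a request. The sketch below is one way to encode it; the name `isIdempotentMethod` is hypothetical.

```javascript
// Methods the table marks as idempotent and therefore safe to retry as-is.
const IDEMPOTENT_METHODS = new Set(["GET", "HEAD", "PUT", "DELETE"]);

function isIdempotentMethod(method) {
  // HTTP method names are case-sensitive on the wire, but callers often
  // pass lowercase strings, so normalize before the lookup.
  return IDEMPOTENT_METHODS.has(method.toUpperCase());
}

console.log(isIdempotentMethod("get")); // true
console.log(isIdempotentMethod("POST")); // false
```

For methods where this returns `false`, retrying blindly risks performing the operation twice, so an extra safeguard is needed.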
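7.2 Idempotency keys for POST requests
```javascript
async function postWithIdempotency(url, data) {
  // Generate the key once so that every retry of this request sends the same key
  const idempotencyKey = crypto.randomUUID();
  return fetchWithRetry(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Idempotency-Key": idempotencyKey,
    },
    body: JSON.stringify(data),
  });
}
```
The server should:
- Store the `Idempotency-Key` together with the corresponding response
- Return the same response for the same key
- Not execute the operation more than once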
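The three server-side rules above can be sketched with an in-memory map. This is an illustration under stated assumptions, not a production design: `handleRequest`, `createOrder`, and the `Map` store are hypothetical, and a real service would use a shared store with expiry.

```javascript
// Stored responses, keyed by Idempotency-Key.
const responsesByKey = new Map();
let ordersCreated = 0;

// Stand-in for the real, non-idempotent operation.
function createOrder(payload) {
  ordersCreated++;
  return { status: 201, body: { orderId: ordersCreated, ...payload } };
}

function handleRequest(idempotencyKey, payload) {
  // Same key: return the stored response without re-running the operation.
  if (responsesByKey.has(idempotencyKey)) {
    return responsesByKey.get(idempotencyKey);
  }
  const response = createOrder(payload);
  responsesByKey.set(idempotencyKey, response);
  return response;
}

const first = handleRequest("key-1", { item: "book" });
const retry = handleRequest("key-1", { item: "book" });
console.log(first === retry); // true: the retry received the stored response
console.log(ordersCreated); // 1: the order was created only once
```

This sketch does not handle two requests with the same key arriving concurrently; a real implementation needs to reserve the key atomically before running the operation.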
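Summary
| Strategy | Description |
|---|---|
| Exponential backoff | The delay grows exponentially with the number of retries |
| Jitter | Adds randomness to spread retries out over time |
| Maximum delay | Sets a cap so that waits do not grow too long |
| Circuit breaker | Stops sending requests after too many failures |
| Retry-After | Respects the server's advice |
| Idempotency key | Makes POST requests safe to retry |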
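Several rows of the table can be combined into a single delay calculation. The sketch below is one possible arrangement, with illustrative names: it prefers the server's `Retry-After` value when present and otherwise falls back to capped exponential backoff with full jitter. `headers` is assumed to be any object with a `get(name)` method, such as the `Headers` of a fetch `Response`.

```javascript
// Parse a Retry-After value into milliseconds, or null if absent or invalid.
function parseRetryAfterMs(value, now = Date.now()) {
  if (!value) return null;
  // Seconds format, e.g. "60"
  if (/^\d+$/.test(value.trim())) return Number(value) * 1000;
  // HTTP-date format
  const date = Date.parse(value);
  return Number.isNaN(date) ? null : Math.max(0, date - now);
}

function nextRetryDelay(attempt, headers, options = {}) {
  const { baseDelay = 1000, maxDelay = 30000, random = Math.random } = options;
  // Retry-After: respect the server's advice when it gives any.
  const serverDelay = parseRetryAfterMs(headers.get("Retry-After"));
  if (serverDelay !== null) return serverDelay;
  // Exponential backoff, capped at maxDelay.
  const capped = Math.min(maxDelay, baseDelay * 2 ** attempt);
  // Full jitter: pick a random point in [0, capped).
  return random() * capped;
}

const fakeHeaders = { get: () => "60" };
console.log(nextRetryDelay(0, fakeHeaders)); // 60000
```

Whether a server-provided delay should also be capped is a design choice left open here; an unbounded `Retry-After` can make a client wait much longer than `maxDelay`.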
> **Retry formula:**
> `delay = random() * min(maxDelay, baseDelay * 2^attempt)`

Advanced challenges
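- Implement an HTTP client with a built-in circuit breaker.
- Design a monitoring dashboard for the retry mechanism that tracks retry rate and success rate.
- Think about it: in a microservice architecture, how do you keep retries from causing cascading failures?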