Rate Limit Handling: Auto-Retry and Account Switching

What You'll Learn

Master Antigravity Auth's intelligent rate limit handling mechanism:

  • Distinguish 5 different types of 429 errors (quota exhaustion, rate limiting, capacity exhaustion, etc.)
  • Understand the exponential backoff algorithm for auto-retry
  • Master automatic account switching logic in multi-account scenarios
  • Configure whether to switch immediately on the first rate limit or to retry once and switch on the second
  • Use Gemini dual quota pool fallback to improve availability

Stop running into "all accounts rate limited" while quota remains unused.

Your Current Challenge

When using multiple Google accounts, you may encounter:

  • Frequent 429 rate limits, with no clear rule for whether to retry or switch accounts
  • Wait times that differ widely between 429 types, so you cannot tell how long to wait
  • All accounts marked as rate limited while quota remains unused, because rate limit detection is inaccurate
  • Unclear timing for switching between Gemini's two quota pools, which wastes quota

When to Use This

When you:

  • Have configured multiple accounts but frequently encounter 429 errors
  • Want to optimize request success rates in multi-account scenarios
  • Need to adjust retry strategies (e.g., immediate switch on first rate limit)
  • Use Gemini models and want to leverage dual quota pools

Core Approach

What is Rate Limit Handling

When the Antigravity Auth plugin encounters a 429 error, it automatically performs the following actions:

  1. Detect Rate Limit Type: Parse the reason or message from the response to distinguish 5 rate limit types
  2. Calculate Backoff Time: Intelligently calculate wait time based on rate limit type and failure count
  3. Execute Strategy:
    • Multi-Account: Prioritize switching to available accounts
    • Single Account: Exponential backoff retry
  4. Record State: Update the account's rate limit status for future request reference

Why Intelligent Handling?

Google enforces rate limits per account. A naive "switch on every 429" policy causes frequent switching and skips accounts that would have recovered quickly. A naive "wait and retry" policy leaves the quota of other available accounts unused. Intelligent handling balances switching against waiting.

5 Rate Limit Types

Antigravity Auth distinguishes 5 rate limit types based on the reason field or message content in API responses:

| Type | Reason | Backoff Strategy | Typical Scenario |
| --- | --- | --- | --- |
| QUOTA_EXHAUSTED | Quota exhausted (daily or monthly) | Incremental backoff: 1min → 5min → 30min → 120min | Daily quota exhausted |
| RATE_LIMIT_EXCEEDED | Requests too fast (per minute limit) | Fixed 30 seconds | Short burst of requests |
| MODEL_CAPACITY_EXHAUSTED | Model server capacity insufficient | Fixed 15 seconds | Peak hours |
| SERVER_ERROR | Server internal error (5xx) | Fixed 20 seconds | Service instability |
| UNKNOWN | Unknown cause | Fixed 60 seconds | Unparseable error |
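
To make the classification step concrete, here is a minimal sketch of how a reason field or message could be mapped to one of the five types. The type names come from the table above; the function name `classifyRateLimit` and the message keywords are assumptions made for illustration and may differ from what `parseRateLimitReason` actually does.

```typescript
// Hypothetical sketch: the five type names come from the table above, but the
// matching rules below are illustrative assumptions, not the plugin's code.
type RateLimitReason =
  | "QUOTA_EXHAUSTED"
  | "RATE_LIMIT_EXCEEDED"
  | "MODEL_CAPACITY_EXHAUSTED"
  | "SERVER_ERROR"
  | "UNKNOWN";

const KNOWN_REASONS: readonly RateLimitReason[] = [
  "QUOTA_EXHAUSTED",
  "RATE_LIMIT_EXCEEDED",
  "MODEL_CAPACITY_EXHAUSTED",
  "SERVER_ERROR",
];

function classifyRateLimit(reason?: string, message?: string): RateLimitReason {
  // 1. Prefer an explicit reason field when it names a known type.
  const upper = (reason ?? "").toUpperCase();
  const direct = KNOWN_REASONS.find((known) => known === upper);
  if (direct) return direct;

  // 2. Otherwise fall back to keyword matching on the message (assumed keywords).
  const text = (message ?? "").toLowerCase();
  if (text.includes("quota")) return "QUOTA_EXHAUSTED";
  if (text.includes("capacity")) return "MODEL_CAPACITY_EXHAUSTED";
  if (text.includes("rate limit") || text.includes("too many requests")) {
    return "RATE_LIMIT_EXCEEDED";
  }
  if (text.includes("internal error")) return "SERVER_ERROR";

  // 3. Anything that cannot be parsed is treated as UNKNOWN.
  return "UNKNOWN";
}
```

Checking the explicit reason first keeps the keyword matching as a last resort, so a well-formed response never depends on message wording.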

Backoff Time Formula (accounts.ts:51-75):

typescript
// QUOTA_EXHAUSTED: Incremental backoff (based on consecutive failures)
// Fail 1 time: 1min (60_000ms)
// Fail 2 times: 5min (300_000ms)
// Fail 3 times: 30min (1_800_000ms)
// Fail 4+ times: 120min (7_200_000ms)

// Other types: Fixed backoff
// RATE_LIMIT_EXCEEDED: 30s
// MODEL_CAPACITY_EXHAUSTED: 15s
// SERVER_ERROR: 20s
// UNKNOWN: 60s
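
The backoff table above can be sketched as a small lookup function. The millisecond values are the ones listed in this lesson; the function name `backoffMsFor` and the 1-based failure count are assumptions, and the sketch deliberately ignores the server-provided `retryAfterMs`, whose exact role in the real `calculateBackoffMs` is not described here.

```typescript
type RateLimitReason =
  | "QUOTA_EXHAUSTED"
  | "RATE_LIMIT_EXCEEDED"
  | "MODEL_CAPACITY_EXHAUSTED"
  | "SERVER_ERROR"
  | "UNKNOWN";

// Incremental steps for quota exhaustion: 1min, 5min, 30min, 120min.
const QUOTA_EXHAUSTED_BACKOFFS = [60_000, 300_000, 1_800_000, 7_200_000];

// Fixed backoff for every other type.
const FIXED_BACKOFF_MS: Record<Exclude<RateLimitReason, "QUOTA_EXHAUSTED">, number> = {
  RATE_LIMIT_EXCEEDED: 30_000,
  MODEL_CAPACITY_EXHAUSTED: 15_000,
  SERVER_ERROR: 20_000,
  UNKNOWN: 60_000,
};

// Hypothetical helper: consecutiveFailures is assumed to be 1-based
// (1 = first failure), matching the "Fail 1 time: 1min" comment above.
function backoffMsFor(reason: RateLimitReason, consecutiveFailures: number): number {
  if (reason === "QUOTA_EXHAUSTED") {
    const steps = QUOTA_EXHAUSTED_BACKOFFS.length;
    // Clamp to the valid range so that 4+ failures stay at the last step.
    const index = Math.min(Math.max(consecutiveFailures, 1), steps) - 1;
    return QUOTA_EXHAUSTED_BACKOFFS[index] ?? 7_200_000;
  }
  return FIXED_BACKOFF_MS[reason];
}
```

Only QUOTA_EXHAUSTED depends on the failure count; for the other types the count is ignored and the same fixed delay is returned every time.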

Exponential Backoff Algorithm

Antigravity Auth uses a deduplicated exponential backoff algorithm:

Core Logic (plugin.ts:509-567):

typescript
// 1. Deduplication window: Concurrent 429s within 2 seconds treated as same event
const RATE_LIMIT_DEDUP_WINDOW_MS = 2000;

// 2. State reset: Reset counter after 2 minutes without 429
const RATE_LIMIT_STATE_RESET_MS = 120_000;

// 3. Exponential backoff: baseDelay * 2^(attempt-1), max 60s
const expBackoff = Math.min(baseDelay * Math.pow(2, attempt - 1), 60000);

Why Deduplication Window?

Suppose you have 3 concurrent requests all triggering 429:

  • Without deduplication: Each request increments counter, resulting in attempt=3, backoff 4s (2^2 × 1s)
  • With deduplication: Treated as same event, attempt=1, backoff 1s

The deduplication window avoids concurrent requests excessively amplifying backoff time.
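
The interaction between the deduplication window, the state reset, and the exponential formula is easier to follow in code. The following is a minimal sketch, not the plugin's implementation: the three constants are taken from this lesson, while `createDedupBackoff`, the per-key `Map`, and the choice to measure the window from the first counted 429 are assumptions.

```typescript
// Constants quoted from this lesson; everything else is an illustrative assumption.
const RATE_LIMIT_DEDUP_WINDOW_MS = 2000;
const RATE_LIMIT_STATE_RESET_MS = 120_000;
const MAX_BACKOFF_MS = 60_000;

interface BackoffState {
  attempt: number; // consecutive 429 events counted so far
  lastAt: number; // timestamp (ms) of the last counted 429
}

interface BackoffResult {
  attempt: number;
  delayMs: number;
}

// Hypothetical factory: returns a function that tracks 429s per quota key.
// `now` is injectable so the logic can be exercised without real waiting.
function createDedupBackoff(baseDelayMs = 1000) {
  const states = new Map<string, BackoffState>();
  const delayFor = (attempt: number): number =>
    Math.min(baseDelayMs * Math.pow(2, attempt - 1), MAX_BACKOFF_MS);

  return function nextBackoff(key: string, now: number = Date.now()): BackoffResult {
    const prev = states.get(key);

    // Concurrent 429 inside the window: same event, counter left unchanged.
    if (prev && now - prev.lastAt < RATE_LIMIT_DEDUP_WINDOW_MS) {
      return { attempt: prev.attempt, delayMs: delayFor(prev.attempt) };
    }

    // First 429, or no 429 for more than 2 minutes: start over at attempt 1.
    const attempt =
      !prev || now - prev.lastAt > RATE_LIMIT_STATE_RESET_MS ? 1 : prev.attempt + 1;
    states.set(key, { attempt, lastAt: now });
    return { attempt, delayMs: delayFor(attempt) };
  };
}
```

With this sketch, three calls for the same key at 0 ms, 500 ms, and 900 ms all report attempt 1 with a 1s delay, whereas a call at 3000 ms is counted as attempt 2 with a 2s delay.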

Multi-Account Switching Logic

Antigravity Auth adopts a "prioritize switching, retry as fallback" strategy in multi-account scenarios:

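Decision Flow:

  1. On the first 429, wait 1s. If switch_on_first_rate_limit is true, switch to the next available account; otherwise retry the current account.
  2. On a second 429 for the same account, wait 5s, then switch accounts.
  3. If every account is rate limited, wait for the shortest reset time, capped at max_rate_limit_wait_seconds.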

Key Configurations (config/schema.ts:256-259):

| Config | Default | Description |
| --- | --- | --- |
| switch_on_first_rate_limit | true | Whether to immediately switch account on first rate limit (after waiting 1s) |
| max_rate_limit_wait_seconds | 300 | Maximum wait time when all accounts rate limited (5 minutes) |

Recommended Configuration:

  • Multi-Account (2+): Set switch_on_first_rate_limit: true so the plugin switches immediately instead of waiting while other accounts still have quota
  • Single Account: This option has no effect; the plugin automatically uses exponential backoff retry
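
The switching rules in this section can be summarised as a single decision function. This is a sketch under stated assumptions: the 1s and 5s delays and the single-account formula come from this lesson, while `decideOn429`, its option names, and the `RateLimitAction` type are hypothetical.

```typescript
type RateLimitAction =
  | { kind: "switch-account"; delayMs: number }
  | { kind: "retry-same-account"; delayMs: number };

interface DecisionInput {
  accountCount: number; // number of configured accounts
  switchOnFirstRateLimit: boolean; // the switch_on_first_rate_limit option
  attempt: number; // 1 = first 429 on this account, 2 = second, ...
}

// Hypothetical decision helper mirroring the behaviour described in this lesson.
function decideOn429(input: DecisionInput): RateLimitAction {
  const { accountCount, switchOnFirstRateLimit, attempt } = input;

  // Single account: nothing to switch to, so use exponential backoff (max 60s).
  if (accountCount < 2) {
    const delayMs = Math.min(1000 * Math.pow(2, attempt - 1), 60_000);
    return { kind: "retry-same-account", delayMs };
  }

  // Multi-account, second 429 on the same account: switch after 5s.
  if (attempt >= 2) return { kind: "switch-account", delayMs: 5000 };

  // Multi-account, first 429: switch or retry after 1s, depending on the option.
  return switchOnFirstRateLimit
    ? { kind: "switch-account", delayMs: 1000 }
    : { kind: "retry-same-account", delayMs: 1000 };
}
```

Returning a tagged action rather than performing the switch keeps the policy separate from the side effects, so the rules can be checked without touching real accounts.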

Gemini Dual Quota Pool Fallback

Gemini models support two independent quota pools:

  • Antigravity Quota Pool: Used first, but smaller capacity
  • Gemini CLI Quota Pool: Used as fallback, larger capacity

Fallback Logic (plugin.ts:1318-1345):

1. Make request using Antigravity quota pool
2. Encounter 429 rate limit
3. Check if other accounts' Antigravity quota available
   - Yes: Switch account, continue using Antigravity
   - No: If quota_fallback=true, switch to Gemini CLI quota pool

Configuration Option (config/schema.ts:179, default: false):

json
{
  "quota_fallback": true
}

Example:

You have 2 accounts, both encounter 429:

| Status | quota_fallback=false | quota_fallback=true |
| --- | --- | --- |
| Account 1 (Antigravity) | Rate limited | Rate limited → Try Gemini CLI |
| Account 2 (Antigravity) | Rate limited | Rate limited → Try Gemini CLI |
| Result | Wait 5 minutes then retry | Switch to Gemini CLI, no wait needed |

Advantages of Dual Quota Pool

Gemini CLI quota pool is typically larger, and fallback can significantly improve request success rates. Note:

  • Models with an explicit :antigravity suffix will not fall back
  • Fallback only happens when all accounts' Antigravity quota is exhausted
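
The fallback rules above can be condensed into one function. This is a minimal sketch, assuming the caller already knows whether another account still has Antigravity quota; the names `nextStepAfter429`, `FallbackInput`, and `NextStep` are hypothetical, and the check that the model is a Gemini model is left out for brevity.

```typescript
type NextStep = "switch-account" | "use-gemini-cli-pool" | "wait-for-reset";

interface FallbackInput {
  otherAccountHasAntigravityQuota: boolean;
  quotaFallback: boolean; // the quota_fallback option (default false)
  model: string; // e.g. "google/antigravity-gemini-3-pro:antigravity"
}

// Hypothetical helper: decides what to do after a 429 on the Antigravity pool.
function nextStepAfter429(input: FallbackInput): NextStep {
  // Another account still has Antigravity quota: switch and stay on that pool.
  if (input.otherAccountHasAntigravityQuota) return "switch-account";

  // An explicit :antigravity suffix pins the request to the Antigravity pool.
  const pinnedToAntigravity = input.model.endsWith(":antigravity");
  if (input.quotaFallback && !pinnedToAntigravity) return "use-gemini-cli-pool";

  // No fallback possible: wait for the accounts to recover.
  return "wait-for-reset";
}
```

The order of the checks matters: switching accounts is always tried before the Gemini CLI pool, which matches the rule that fallback only happens once all accounts are exhausted.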

Single Account Retry Logic

If only one account exists, Antigravity Auth uses exponential backoff retry:

Retry Formula (plugin.ts:1373-1375):

typescript
// First: 1s
// 2nd: 2s (1s × 2^1)
// 3rd: 4s (1s × 2^2)
// 4th: 8s (1s × 2^3)
// ...
// Max: 60s
const expBackoffMs = Math.min(1000 * Math.pow(2, attempt - 1), 60000);

Retry Flow:

1st: Encounter 429
  ↓ Wait 1s and retry (quick retry)
2nd: Still 429
  ↓ Wait 2s and retry
3rd: Still 429
  ↓ Wait 4s and retry
...

Difference from Multi-Account:

| Scenario | Strategy | Wait Time |
| --- | --- | --- |
| Single Account | Exponential backoff retry | 1s → 2s → 4s → 8s → ... → 60s |
| Multi-Account | Switch account | 1s (first) or 5s (2nd) |
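
As a worked example of the single-account formula, the sketch below lists the delay for each attempt. The formula is the one shown above; the helper name `retryDelaysMs` and its parameters are assumptions for illustration.

```typescript
// Hypothetical helper: delays (ms) for attempts 1..attempts,
// using baseMs * 2^(attempt - 1), capped at maxMs.
function retryDelaysMs(attempts: number, baseMs = 1000, maxMs = 60_000): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= attempts; attempt++) {
    delays.push(Math.min(baseMs * Math.pow(2, attempt - 1), maxMs));
  }
  return delays;
}

// First eight delays: 1s, 2s, 4s, 8s, 16s, 32s, then capped at 60s.
const firstEightDelays = retryDelaysMs(8);
// → [1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000]
```

The cap is reached on the 7th attempt, because 1s × 2^6 = 64s exceeds the 60s maximum.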

🎒 Preparation

Prerequisites Check

Ensure you have completed:

Follow Along

Step 1: Enable Debug Logs to Observe Rate Limits

Why: Debug logs show detailed rate limit information, which helps you understand how the plugin handles each 429.

How

Enable debug logs:

bash
export OPENCODE_ANTIGRAVITY_DEBUG=1

Trigger rate limits with concurrent requests:

bash
# Send 10 concurrent requests to provoke a 429
for i in {1..10}; do
  opencode run "Test $i" --model=google/antigravity-gemini-3-pro &
done
wait

You should see:

[RateLimit] 429 on Account 0 family=gemini retryAfterMs=60000
  message: You have exceeded the quota for this request.
  quotaResetTime: 2026-01-23T12:00:00Z
  retryDelayMs: 60000
  reason: QUOTA_EXHAUSTED

Rate limited. Quick retry in 1s... (toast notification)

Log Interpretation:

  • 429 on Account 0 family=gemini: Account 0 is rate limited for the Gemini model family
  • retryAfterMs=60000: The server suggests waiting 60 seconds
  • reason: QUOTA_EXHAUSTED: Quota is exhausted, so incremental backoff applies
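
If you want to count or filter these events, the first line of each entry can be parsed with a regular expression. This is a sketch based only on the sample output in this step; the log format is not a documented interface and may change, and `parseRateLimitLine` is a hypothetical helper.

```typescript
interface RateLimitLogEntry {
  accountIndex: number;
  family: string;
  retryAfterMs: number;
}

// Matches lines such as:
// [RateLimit] 429 on Account 0 family=gemini retryAfterMs=30000
const RATE_LIMIT_LINE = /^\[RateLimit\] 429 on Account (\d+) family=(\S+) retryAfterMs=(\d+)/;

// Hypothetical helper: returns null for lines that are not [RateLimit] headers.
function parseRateLimitLine(line: string): RateLimitLogEntry | null {
  const match = RATE_LIMIT_LINE.exec(line.trim());
  if (!match) return null;
  const [, account = "", family = "", retryAfter = ""] = match;
  return {
    accountIndex: Number(account),
    family,
    retryAfterMs: Number(retryAfter),
  };
}
```

Indented detail lines (message, quotaResetTime, reason) and toast notifications do not match the pattern and return null, so the helper can be applied to every log line.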

Step 2: Configure Immediate Switch on First Rate Limit

Why: With multiple accounts, switching immediately on the first rate limit maximizes quota utilization and avoids waiting.

How

Modify configuration file:

bash
cat > ~/.config/opencode/antigravity.json << 'EOF'
{
  "$schema": "https://raw.githubusercontent.com/NoeFabris/opencode-antigravity-auth/main/assets/antigravity.schema.json",
  "switch_on_first_rate_limit": true
}
EOF

Expected result: the file now contains the new configuration (the command prints nothing and overwrites any existing file).

Verify Configuration Takes Effect:

Send multiple requests and observe the behavior after the first rate limit:

bash
export OPENCODE_ANTIGRAVITY_DEBUG=1
for i in {1..5}; do
  opencode run "Test $i" --model=google/antigravity-gemini-3-pro &
done
wait

You should see:

[RateLimit] 429 on Account 0 family=gemini retryAfterMs=30000
Server at capacity. Switching account in 1s... (toast notification)
[AccountContext] Selected account: <account email> (index: 1)

Key Points:

  • Wait 1s after first 429
  • Automatically switch to next available account (index: 1)
  • No retry of current account

Step 3: Disable Immediate Switch on First Rate Limit

Why: If you prefer to retry the current account first and avoid frequent switching, you can disable this option.

How

Modify configuration file:

bash
cat > ~/.config/opencode/antigravity.json << 'EOF'
{
  "$schema": "https://raw.githubusercontent.com/NoeFabris/opencode-antigravity-auth/main/assets/antigravity.schema.json",
  "switch_on_first_rate_limit": false
}
EOF

Expected result: the file now contains the new configuration (the command prints nothing and overwrites any existing file).

Verify Configuration Takes Effect:

Send multiple requests again:

bash
export OPENCODE_ANTIGRAVITY_DEBUG=1
for i in {1..5}; do
  opencode run "Test $i" --model=google/antigravity-gemini-3-pro &
done
wait

You should see:

[RateLimit] 429 on Account 0 family=gemini retryAfterMs=30000
Rate limited. Quick retry in 1s... (toast notification)
[RateLimit] 429 on Account 0 family=gemini retryAfterMs=30000
Rate limited again. Switching account in 5s... (toast notification)
[AccountContext] Selected account: <account email> (index: 1)

Key Points:

  • First 429: Wait 1s and retry current account
  • 2nd 429: Wait 5s then switch account
  • If retry succeeds, continue using current account

Step 4: Enable Gemini Dual Quota Pool Fallback

Why: Gemini models support two quota pools, and enabling fallback can significantly improve request success rates.

How

Modify configuration file:

bash
cat > ~/.config/opencode/antigravity.json << 'EOF'
{
  "$schema": "https://raw.githubusercontent.com/NoeFabris/opencode-antigravity-auth/main/assets/antigravity.schema.json",
  "quota_fallback": true
}
EOF

Expected result: the file now contains the new configuration (the command prints nothing and overwrites any existing file).

Verify Configuration Takes Effect:

Send Gemini requests (enough to trigger a rate limit on the Antigravity quota pool):

bash
export OPENCODE_ANTIGRAVITY_DEBUG=1
for i in {1..5}; do
  opencode run "Test $i" --model=google/antigravity-gemini-3-pro &
done
wait

You should see:

[RateLimit] 429 on Account 0 family=gemini retryAfterMs=30000
Antigravity quota exhausted for gemini-3-pro. Switching to Gemini CLI quota... (toast notification)
[DEBUG] quota fallback: gemini-cli

Key Points:

  • Fallback triggers only after the Antigravity quota of all accounts is exhausted
  • The plugin then switches automatically to the Gemini CLI quota pool
  • The request is retried immediately, with no wait

Force Using Antigravity Quota (no fallback):

bash
# Use explicit suffix :antigravity
opencode run "Test" --model=google/antigravity-gemini-3-pro:antigravity

Step 5: Configure Maximum Wait Time

Why: When all accounts are rate limited, the plugin waits for the shortest reset time. A configured maximum wait time keeps that wait bounded.

How

Modify configuration file:

bash
cat > ~/.config/opencode/antigravity.json << 'EOF'
{
  "$schema": "https://raw.githubusercontent.com/NoeFabris/opencode-antigravity-auth/main/assets/antigravity.schema.json",
  "max_rate_limit_wait_seconds": 60
}
EOF

Expected result: the file now contains the new configuration (the command prints nothing and overwrites any existing file).

Verify Configuration Takes Effect:

Trigger rate limits on all accounts:

bash
export OPENCODE_ANTIGRAVITY_DEBUG=1
for i in {1..20}; do
  opencode run "Test $i" --model=google/antigravity-claude-opus-4.5 &
done
wait

You should see:

[RateLimit] 429 on Account 0 family=claude retryAfterMs=60000
[RateLimit] 429 on Account 1 family=claude retryAfterMs=60000
[DEBUG] All accounts rate limited. Min wait time: 60s, max wait: 60s
Rate limited. Retrying in 60s... (toast notification)

Key Points:

  • When all accounts rate limited, wait for the shortest reset time
  • If shortest reset time > max_rate_limit_wait_seconds, use the maximum value
  • Default maximum wait is 300 seconds (5 minutes)
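
The capping rule in the key points can be written as a one-line calculation. This is a minimal sketch of the behaviour as described in this step; `waitWhenAllLimitedMs` and its input shape (remaining reset times in milliseconds, one per account) are assumptions.

```typescript
// Hypothetical helper: how long to wait when every account is rate limited.
// resetInMs holds the remaining time until each account's limit resets.
function waitWhenAllLimitedMs(resetInMs: number[], maxWaitSeconds = 300): number {
  if (resetInMs.length === 0) return 0; // no rate-limited accounts, nothing to wait for
  const shortestResetMs = Math.min(...resetInMs);
  // Never wait longer than max_rate_limit_wait_seconds.
  return Math.min(shortestResetMs, maxWaitSeconds * 1000);
}
```

For example, with resets in 60s and 90s the wait is 60s regardless of a 600s maximum, whereas resets in 400s and 500s are capped at the default 300s.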

Checkpoint ✅

How to Verify Configuration Takes Effect?

  1. Check configuration file to confirm config items are correct
  2. Enable debug logs: OPENCODE_ANTIGRAVITY_DEBUG=1
  3. Observe [RateLimit] events in logs
  4. Observe account switching behavior (AccountContext logs)
  5. Check if toast notifications display as expected

Common Pitfalls

❌ Ignoring Deduplication Window, Misunderstanding Backoff Time

Incorrect Behavior:

  • Send 10 concurrent requests, all encounter 429
  • Think backoff time is 2^10 × 1s = 1024s
  • Actually it's 1s (because of deduplication window)

Correct Approach: Remember the 2-second deduplication window: concurrent 429s within it count as a single event, so the counter is not incremented multiple times.

❌ Mixing switch_on_first_rate_limit with Single Account

Incorrect Behavior:

  • Only have 1 account, but configured switch_on_first_rate_limit: true
  • Expect it to switch accounts, but no other account is available

Correct Approach: In single-account scenarios this option has no effect; the plugin automatically uses exponential backoff retry.

❌ Gemini Explicit Suffix Blocks Fallback

Incorrect Behavior:

  • Use google/antigravity-gemini-3-pro:antigravity
  • Configured quota_fallback: true
  • Expect fallback, but requests never fall back to Gemini CLI on a 429

Correct Approach: An explicit suffix forces the specified quota pool. If you need fallback, omit the suffix.

❌ Wait Time Too Long When All Accounts Rate Limited

Incorrect Behavior:

  • Configured max_rate_limit_wait_seconds: 600 (10 minutes)
  • All accounts rate limited for 60s, but waited 10 minutes

Correct Approach: max_rate_limit_wait_seconds is the maximum value. Actual wait time is the shortest reset time, never exceeding the maximum.

Summary

| Mechanism | Core Feature | Use Case |
| --- | --- | --- |
| Rate Limit Detection | Distinguish 5 types (QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, etc.) | All scenarios |
| Exponential Backoff | More failures = longer wait (1s → 2s → 4s → ... → 60s) | Single account |
| Account Switching | Multi-account prioritizes switching, single account retries | Multi-account |
| Deduplication Window | Concurrent 429s within 2 seconds treated as same event | Concurrent scenarios |
| Dual Quota Pool Fallback | Try Gemini CLI after Antigravity rate limited | Gemini models |

Key Configurations:

| Config | Default | Recommended | Description |
| --- | --- | --- | --- |
| switch_on_first_rate_limit | true | true (multi-account) | Immediate switch on first rate limit |
| quota_fallback | false | true (Gemini) | Enable dual quota pool fallback |
| max_rate_limit_wait_seconds | 300 | 300 | Maximum wait time (seconds) |
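
The defaults in this table can be modelled as a typed object merged with whatever the user sets. This is a sketch only: the three option names and default values are from this lesson, while `RateLimitConfig`, `DEFAULT_RATE_LIMIT_CONFIG`, and `resolveConfig` are hypothetical and say nothing about how the plugin actually loads antigravity.json.

```typescript
interface RateLimitConfig {
  switch_on_first_rate_limit: boolean;
  quota_fallback: boolean;
  max_rate_limit_wait_seconds: number;
}

// Defaults as listed in the table above.
const DEFAULT_RATE_LIMIT_CONFIG: RateLimitConfig = {
  switch_on_first_rate_limit: true,
  quota_fallback: false,
  max_rate_limit_wait_seconds: 300,
};

// Hypothetical helper: options missing from the user's file keep their defaults.
function resolveConfig(user: Partial<RateLimitConfig>): RateLimitConfig {
  return { ...DEFAULT_RATE_LIMIT_CONFIG, ...user };
}

// A file that only sets quota_fallback still resolves the other two options.
const geminiFriendlyConfig = resolveConfig({ quota_fallback: true });
```

This also illustrates why a configuration file that sets a single option is sufficient: any option left out simply resolves to its default.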

Debug Methods:

  • Enable debug logs: OPENCODE_ANTIGRAVITY_DEBUG=1
  • Check [RateLimit] events: Understand rate limit type and backoff time
  • Check [AccountContext] logs: Observe account switching behavior

Coming Up Next

Next lesson: Session Recovery

You'll learn:

  • How to automatically resume interrupted tool calls
  • Session recovery mechanism for Thinking models
  • Synthetic tool_result injection principles

Appendix: Source Code Reference

Last updated: 2026-01-23

| Feature | File Path | Lines |
| --- | --- | --- |
| Rate limit type definition | src/plugin/accounts.ts | 10-20 |
| Parse rate limit reason | src/plugin/accounts.ts | 29-49 |
| Calculate backoff time | src/plugin/accounts.ts | 51-75 |
| Exponential backoff algorithm | src/plugin.ts | 532-567 |
| Mark account rate limited | src/plugin/accounts.ts | 434-461 |
| Check if account rate limited | src/plugin/accounts.ts | 134-152 |
| 429 error handling | src/plugin.ts | 1260-1396 |
| Gemini dual quota pool fallback | src/plugin.ts | 1318-1345 |
| Rate limit logging | src/plugin/debug.ts | 354-396 |
| Config Schema | src/plugin/config/schema.ts | 179, 256-259 |

Key Constants:

  • QUOTA_EXHAUSTED_BACKOFFS = [60_000, 300_000, 1_800_000, 7_200_000]: Incremental backoff times for quota exhaustion (accounts.ts:22)
  • RATE_LIMIT_EXCEEDED_BACKOFF = 30_000: Fixed 30s backoff for rate limit (accounts.ts:23)
  • MODEL_CAPACITY_EXHAUSTED_BACKOFF = 15_000: Fixed 15s backoff for capacity exhaustion (accounts.ts:24)
  • SERVER_ERROR_BACKOFF = 20_000: Fixed 20s backoff for server error (accounts.ts:25)
  • RATE_LIMIT_DEDUP_WINDOW_MS = 2000: 2s deduplication window (plugin.ts:509)
  • RATE_LIMIT_STATE_RESET_MS = 120_000: 2-minute state reset (plugin.ts:510)
  • FIRST_RETRY_DELAY_MS = 1000: 1s first quick retry (plugin.ts:1304)

Key Functions:

  • parseRateLimitReason(reason?, message?): Parse rate limit reason (accounts.ts:29)
  • calculateBackoffMs(reason, consecutiveFailures, retryAfterMs?): Calculate backoff time (accounts.ts:51)
  • markRateLimitedWithReason(account, family, headerStyle, model, reason, retryAfterMs?): Mark account rate limited (accounts.ts:445)
  • isRateLimitedForHeaderStyle(account, family, headerStyle, model?): Check if account rate limited (accounts.ts:536)
  • getRateLimitBackoff(accountIndex, quotaKey, serverRetryAfterMs): Get deduplicated backoff time (plugin.ts:532)
  • resetRateLimitState(accountIndex, quotaKey): Reset rate limit state (plugin.ts:573)