Overview
Codex Multi-Auth provides runtime reliability features that keep authentication working under adverse conditions such as rate limits, network errors, and token expiry.
Live Account Sync
Reload account state without restarting your editor or process.
File System Watching: Uses fs.watch to detect account storage changes in real time, with debounced reload.
Polling Fallback: Polls the file mtime every 2 seconds on platforms where fs.watch is unreliable (such as Windows).
Zero Downtime: Reloads accounts in the background without interrupting active requests or streams.
Concurrency Safe: Prevents concurrent reloads with in-flight request queuing.
How It Works
const liveSync = new LiveAccountSync(async () => {
  // Reload callback: reloads account manager from disk
  await accountManager.reload();
});

// Start watching storage file
await liveSync.syncToPath('/path/to/accounts.json');
Debounce: Changes are debounced by 250ms to batch rapid writes.
Polling Interval: 2 seconds (configurable via pollIntervalMs).
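The debounce step can be illustrated with a small sketch. `ReloadDebouncer` is a hypothetical helper, not part of the Codex Multi-Auth API; it takes timestamps as arguments so the 250ms quiet window is easy to follow without real timers.

```typescript
// Hypothetical helper for illustration; not the library's implementation.
class ReloadDebouncer {
  private lastChangeAt: number | null = null;
  private readonly debounceMs: number;

  constructor(debounceMs: number = 250) {
    this.debounceMs = debounceMs;
  }

  // Record a change event (from fs.watch or an mtime poll) seen at nowMs.
  noteChange(nowMs: number): void {
    this.lastChangeAt = nowMs;
  }

  // True once the file has been quiet for debounceMs; resets after firing.
  shouldReload(nowMs: number): boolean {
    if (this.lastChangeAt === null) return false;
    if (nowMs - this.lastChangeAt < this.debounceMs) return false;
    this.lastChangeAt = null;
    return true;
  }
}
```

Two writes 100ms apart produce a single reload, triggered 250ms after the second write.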
Use Cases
Multi-Instance Sync: Keep multiple editor windows in sync
External CLI Updates: Reflect codex auth login changes immediately
Team Workflows: Share account updates via version control (with encrypted tokens)
CI/CD: Reload accounts after secret injection
Monitoring
const snapshot = liveSync.getSnapshot();
console.log(snapshot);
// {
//   path: '~/.codex/multi-auth/openai-codex-accounts.json',
//   running: true,
//   lastKnownMtimeMs: 1709481234567,
//   lastSyncAt: 1709481238901,
//   reloadCount: 5,
//   errorCount: 0
// }
Proactive Token Refresh
Refresh OAuth tokens before they expire to prevent mid-request failures.
Refresh Guardian
// Default: refresh 5 minutes before expiry
const DEFAULT_PROACTIVE_BUFFER_MS = 5 * 60 * 1000;

// Check if account needs proactive refresh
if (shouldRefreshProactively(account, bufferMs)) {
  await proactiveRefreshAccount(account);
}
Refresh Strategy
Buffer Window: 5-minute default (configurable via tokenRefreshSkewMs)
Parallel Refresh: Refreshes multiple accounts concurrently
Queued Deduplication: Uses refresh queue to prevent duplicate refresh requests
Failure Handling: Logs failures but doesn’t block request flow
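A minimal sketch of two of the points above: the buffer-window check and queued deduplication. The names `TokenInfo`, `isWithinRefreshBuffer`, and `RefreshDeduper` are hypothetical, and the token shape (an `expiresAt` epoch timestamp) is an assumption; the library's own refresh queue may work differently.

```typescript
// Hypothetical names; the real account shape and refresh queue may differ.
interface TokenInfo {
  expiresAt: number; // epoch milliseconds (assumed field)
}

const ASSUMED_BUFFER_MS = 5 * 60 * 1000; // mirrors the documented 5-minute default

// True when the token expires within bufferMs of nowMs (or already has).
function isWithinRefreshBuffer(
  token: TokenInfo,
  nowMs: number,
  bufferMs: number = ASSUMED_BUFFER_MS
): boolean {
  return token.expiresAt - nowMs <= bufferMs;
}

// Coalesces concurrent refreshes for the same account into one promise.
class RefreshDeduper<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(accountId: string, refresh: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(accountId);
    if (existing) return existing;

    const pending = refresh();
    this.inFlight.set(accountId, pending);
    // Remove the entry once the refresh settles, whether it succeeded or failed.
    const clear = () => {
      this.inFlight.delete(accountId);
    };
    pending.then(clear, clear);
    return pending;
  }
}
```

Callers that request a refresh for the same account while one is in flight receive the same promise, so only one token request is issued.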
Bulk Refresh
// Refresh all expiring accounts
const results = await refreshExpiringAccounts(accounts);
for (const [index, result] of results) {
  if (result.reason === 'success') {
    console.log(`Account ${index}: refreshed successfully`);
  } else if (result.reason === 'failed') {
    console.log(`Account ${index}: refresh failed`);
  }
}
Summary Logging:
Proactively refreshing 3 account(s)
Proactive refresh complete: 3 total, 2 succeeded, 1 failed
Benefits
Reduces auth failures during long-running requests
Improves UX with seamless token rotation
Works alongside reactive refresh in fetch pipeline
No configuration required (enabled by default)
Failure Policy
Unified retry and failover decisions for network errors, auth failures, and rate limits.
Policy Decision Tree
type FailureAction = 'retry' | 'rotate' | 'fail';

// Network errors → retry with backoff
if (isNetworkError(error)) {
  return { action: 'retry', backoffMs: 1000 };
}

// Auth failures → rotate to next account
if (error.statusCode === 401) {
  return { action: 'rotate', reason: 'auth-failure' };
}

// Rate limits → rotate with cooldown
if (error.statusCode === 429) {
  return {
    action: 'rotate',
    cooldownMs: parseRetryAfter(error.headers),
    reason: 'rate-limit'
  };
}

// Fatal errors → fail fast
if (error.statusCode === 400) {
  return { action: 'fail', reason: 'client-error' };
}
Retry Categories
| Error Type | Status Code | Action | Backoff |
| --- | --- | --- | --- |
| Network timeout | - | Retry | Exponential (1s, 2s, 4s) |
| Connection refused | - | Retry | Exponential (1s, 2s, 4s) |
| DNS failure | - | Retry | Exponential (1s, 2s, 4s) |
| Auth failure | 401 | Rotate | Immediate |
| Rate limit | 429 | Rotate | Parse Retry-After header |
| Server error | 5xx | Rotate | Immediate |
| Client error | 400, 403, 404 | Fail | None |
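The retry categories above can be read as a single classification function. This is a sketch, not the library's policy code: `classifyFailure` is a hypothetical name, it is meant to be called only for failed requests, and the handling of 4xx codes the categories do not list is an assumption.

```typescript
type FailureAction = 'retry' | 'rotate' | 'fail';

// Hypothetical classifier mirroring the retry categories. statusCode is
// undefined for network-level failures (timeout, connection refused, DNS).
function classifyFailure(statusCode?: number): FailureAction {
  if (statusCode === undefined) return 'retry';
  if (statusCode === 401 || statusCode === 429) return 'rotate';
  if (statusCode >= 500 && statusCode <= 599) return 'rotate';
  // Assumption: 4xx codes not listed are treated like 400, 403, and 404.
  return 'fail';
}
```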
Cooldown Management
type CooldownReason = 'auth-failure' | 'network-error' | 'rate-limit';

interface AccountMetadata {
  coolingDownUntil?: number; // Timestamp when cooldown expires
  cooldownReason?: CooldownReason;
}
Cooldown Durations:
Auth Failure: 60 seconds (hard failure cooldown)
Network Error: 30 seconds (soft retry cooldown)
Rate Limit: Parse from Retry-After header or default 60 seconds
Cooldown Behavior:
if (account.coolingDownUntil && account.coolingDownUntil > Date.now()) {
  // Skip account during selection
  continue;
}
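As a sketch of how the cooldown durations and the selection check fit together, the helpers below set a cooldown and filter out cooling accounts. `applyCooldown` and `selectableAccounts` are hypothetical names, and the current time is passed in explicitly to keep the example deterministic.

```typescript
type CooldownReason = 'auth-failure' | 'network-error' | 'rate-limit';

interface CooldownState {
  coolingDownUntil?: number; // Timestamp when cooldown expires
  cooldownReason?: CooldownReason;
}

// Durations taken from the documented cooldown list.
const COOLDOWN_MS: Record<CooldownReason, number> = {
  'auth-failure': 60_000,
  'network-error': 30_000,
  'rate-limit': 60_000, // default when no Retry-After value is available
};

// Hypothetical helper: returns a copy of the account state with a cooldown set.
function applyCooldown(
  account: CooldownState,
  reason: CooldownReason,
  nowMs: number,
  retryAfterMs?: number
): CooldownState {
  const durationMs =
    reason === 'rate-limit' && retryAfterMs !== undefined
      ? retryAfterMs
      : COOLDOWN_MS[reason];
  return { ...account, coolingDownUntil: nowMs + durationMs, cooldownReason: reason };
}

// Hypothetical helper: accounts that are eligible for selection at nowMs.
function selectableAccounts(accounts: CooldownState[], nowMs: number): CooldownState[] {
  return accounts.filter(
    (a) => !(a.coolingDownUntil !== undefined && a.coolingDownUntil > nowMs)
  );
}
```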
Rate Limit Backoff
Exponential backoff with jitter for retry attempts.
Backoff Algorithm
const baseDelay = 1000; // 1 second
const maxDelay = 32000; // 32 seconds
const jitterFactor = 0.1; // ±10% randomization

// attempt is 1-based: attempt 1 waits ~baseDelay, doubling on each later attempt
function calculateBackoff(attempt: number): number {
  const exponential = Math.min(baseDelay * Math.pow(2, attempt - 1), maxDelay);
  const jitter = exponential * jitterFactor * (Math.random() * 2 - 1);
  return Math.floor(exponential + jitter);
}
Attempt Progression:
Attempt 1: ~1000ms ± 10%
Attempt 2: ~2000ms ± 10%
Attempt 3: ~4000ms ± 10%
Attempt 4: ~8000ms ± 10%
Attempt 5: ~16000ms ± 10%
Attempt 6+: ~32000ms ± 10% (capped)
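The progression can be reproduced with a variant that takes the random source as a parameter. This is a sketch under two assumptions: attempts are numbered from 1 (so the values line up with the list), and `backoffWithJitter` is a hypothetical name rather than a library function.

```typescript
// Sketch only: same constants as the documented algorithm, with an injectable
// random source so the result is deterministic in tests.
function backoffWithJitter(attempt: number, random: () => number = Math.random): number {
  const baseDelayMs = 1000;
  const maxDelayMs = 32000;
  const jitterFactor = 0.1;
  // Assumption: attempts are 1-based, so attempt 1 maps to baseDelayMs.
  const exponential = Math.min(baseDelayMs * Math.pow(2, attempt - 1), maxDelayMs);
  const jitter = exponential * jitterFactor * (random() * 2 - 1);
  return Math.floor(exponential + jitter);
}

// With random() fixed at 0.5 the jitter term is zero.
const noJitter = [1, 2, 3, 4, 5, 6, 7].map((n) => backoffWithJitter(n, () => 0.5));
// noJitter is [1000, 2000, 4000, 8000, 16000, 32000, 32000]
```

Because jitter is applied after the cap, a capped delay can still land up to 10% above 32000ms.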
Respects server-provided retry hints:
// Parse Retry-After header
const retryAfter = response.headers.get('Retry-After');
if (retryAfter) {
  // Numeric: seconds until retry
  if (/^\d+$/.test(retryAfter.trim())) {
    return parseInt(retryAfter, 10) * 1000;
  }
  // HTTP-date: parse and calculate delta
  const retryTime = new Date(retryAfter).getTime();
  // Ignore values that are neither a number nor a valid date
  if (!Number.isNaN(retryTime)) {
    return Math.max(0, retryTime - Date.now());
  }
}
Stream Failover
Recover from stalled SSE streams with automatic failover.
Stall Detection
// Timeout if no data received for 30 seconds
const STREAM_STALL_TIMEOUT_MS = 30_000;
let lastDataTimestamp = Date.now();

stream.on('data', (chunk) => {
  lastDataTimestamp = Date.now();
  processChunk(chunk);
});

// Check every 5 seconds whether the stream has gone quiet
const stallTimer = setInterval(() => {
  if (Date.now() - lastDataTimestamp > STREAM_STALL_TIMEOUT_MS) {
    clearInterval(stallTimer); // stop checking so failover runs only once
    stream.abort();
    initiateFailover();
  }
}, 5000);
Failover Strategy
Recovery Steps
1. Detect Stall: No data received for 30 seconds
2. Abort Stream: Close stalled connection
3. Account Rotation: Switch to next healthy account
4. Resume Request: Retry from last successful chunk
5. State Reconstruction: Rebuild partial response if possible
6. Fallback: Return partial content or error if unrecoverable
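The account rotation step can be sketched as a forward scan from the failed account. `pickNextHealthy` and the `AccountHealth` shape are hypothetical; the sketch assumes health is determined by an `enabled` flag and the `coolingDownUntil` timestamp, which may be narrower than the library's own health check.

```typescript
interface AccountHealth {
  enabled: boolean;
  coolingDownUntil?: number;
}

// Hypothetical rotation helper: scans forward from the failed index and wraps
// around. Returns -1 when no other account is usable.
function pickNextHealthy(
  accounts: AccountHealth[],
  failedIndex: number,
  nowMs: number
): number {
  for (let step = 1; step < accounts.length; step++) {
    const i = (failedIndex + step) % accounts.length;
    const candidate = accounts[i];
    if (!candidate) continue;
    const coolingDown =
      candidate.coolingDownUntil !== undefined && candidate.coolingDownUntil > nowMs;
    if (candidate.enabled && !coolingDown) return i;
  }
  return -1;
}
```

A result of -1 corresponds to the fallback step: no account is left to resume with, so the caller returns partial content or an error.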
Partial Content Recovery
interface StreamState {
  chunksReceived: number;
  lastCompleteMessage: string;
  partialBuffer: string;
}

// Resume after failover
if (state.lastCompleteMessage) {
  // Continue from last complete SSE message
  yield state.lastCompleteMessage;
}
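One way to maintain such a state is to split incoming text on the blank line that terminates each SSE message and keep the unfinished tail in `partialBuffer`. The sketch below assumes `\n` or `\r\n` line endings only, and `feedChunk` is a hypothetical helper, not the library's parser.

```typescript
interface StreamState {
  chunksReceived: number;
  lastCompleteMessage: string;
  partialBuffer: string;
}

// Sketch: feed a raw text chunk into the state and return the messages it
// completed. Anything after the final blank line is kept as an incomplete tail.
function feedChunk(state: StreamState, chunk: string): string[] {
  state.chunksReceived += 1;
  const text = (state.partialBuffer + chunk).replace(/\r\n/g, '\n');
  const parts = text.split('\n\n');
  state.partialBuffer = parts.pop() ?? '';
  const complete = parts.filter((message) => message.length > 0);
  const last = complete[complete.length - 1];
  if (last !== undefined) {
    state.lastCompleteMessage = last;
  }
  return complete;
}
```

After a stall, `lastCompleteMessage` marks the resume point and `partialBuffer` holds data that must not be emitted as a finished message.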
Session Affinity
Reduce account thrash by maintaining session-to-account affinity.
Affinity Cache
const sessionAffinity = new Map<string, number>(); // sessionId → accountIndex

// Sticky account for conversation
function selectAccount(sessionId?: string): number {
  if (sessionId) {
    const affinityIndex = sessionAffinity.get(sessionId);
    if (affinityIndex !== undefined && isAccountHealthy(affinityIndex)) {
      return affinityIndex; // Reuse same account
    }
  }

  // Select new account if no affinity or account unhealthy
  const newIndex = selectHealthyAccount();
  if (sessionId) {
    sessionAffinity.set(sessionId, newIndex);
  }
  return newIndex;
}
Benefits:
Reduces auth header changes mid-conversation
Improves quota tracking accuracy
Minimizes account switching overhead
Eviction: Affinity cleared when account fails or becomes unhealthy.
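Eviction can be sketched as removing every session pinned to the failed account, so the next request for those sessions selects a fresh one. `evictAccount` is a hypothetical helper operating on a sessionId → accountIndex map like the one described above.

```typescript
// Hypothetical eviction helper: drops every session pinned to accountIndex
// and returns how many entries were removed.
function evictAccount(affinity: Map<string, number>, accountIndex: number): number {
  let removed = 0;
  affinity.forEach((index, sessionId) => {
    if (index === accountIndex) {
      affinity.delete(sessionId); // deleting during Map iteration is safe
      removed += 1;
    }
  });
  return removed;
}
```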
Circuit Breaker
Isolate failing accounts to prevent cascade failures.
Breaker States
type CircuitState = 'closed' | 'open' | 'half-open';

interface CircuitBreaker {
  state: CircuitState;
  failureCount: number;
  lastFailureTime: number;
  nextRetryTime: number;
}
State Transitions:
Closed: Normal operation, requests allowed
Open: Failure threshold exceeded, fast-fail all requests
Half-Open: Test request allowed after timeout, auto-close on success
Thresholds
const FAILURE_THRESHOLD = 5; // Open circuit after 5 failures
const OPEN_DURATION_MS = 60_000; // Stay open for 60 seconds
const HALF_OPEN_DURATION_MS = 5_000; // Test for 5 seconds before closing
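The state transitions can be sketched as a small state machine with an explicit clock. The class name `SketchCircuitBreaker` and the `canRequest` method are hypothetical, and the sketch is simplified: it does not model the 5-second half-open window and closes on the first success.

```typescript
type CircuitState = 'closed' | 'open' | 'half-open';

// Sketch of the breaker with time passed in explicitly. Thresholds mirror the
// documented values (5 failures, 60 seconds open).
class SketchCircuitBreaker {
  state: CircuitState = 'closed';
  failureCount = 0;
  nextRetryTime = 0;
  private readonly failureThreshold = 5;
  private readonly openDurationMs = 60_000;

  // Whether a request may be sent at nowMs. Moves open → half-open on timeout.
  canRequest(nowMs: number): boolean {
    if (this.state === 'open' && nowMs >= this.nextRetryTime) {
      this.state = 'half-open'; // allow a test request
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.state = 'closed';
    this.failureCount = 0;
  }

  recordFailure(nowMs: number): void {
    this.failureCount += 1;
    // A failed test request, or reaching the threshold, (re)opens the circuit.
    if (this.state === 'half-open' || this.failureCount >= this.failureThreshold) {
      this.state = 'open';
      this.nextRetryTime = nowMs + this.openDurationMs;
    }
  }
}
```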
Integration
if (circuitBreaker.state === 'open') {
  // Skip account, try next
  continue;
}

try {
  const result = await makeRequest(account);
  circuitBreaker.recordSuccess();
  return result;
} catch (error) {
  circuitBreaker.recordFailure();
  throw error;
}
Observability
Runtime telemetry for monitoring reliability features:
interface RuntimeMetrics {
  liveSync: {
    reloadCount: number;
    errorCount: number;
    lastSyncAt: number;
  };
  proactiveRefresh: {
    refreshCount: number;
    successCount: number;
    failureCount: number;
  };
  failover: {
    rotationCount: number;
    retryCount: number;
    failCount: number;
  };
  cooldowns: {
    activeCount: number;
    totalCooldownTime: number;
  };
}
Access Metrics:
const metrics = await codexManager.getMetrics();
console.log(JSON.stringify(metrics, null, 2));
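As one example of using these counters, the sketch below derives a failure ratio from the `proactiveRefresh` fields. `refreshFailureRatio` is a hypothetical helper, not part of the metrics API, and any alerting threshold applied to it is left to the caller.

```typescript
// Field names match the proactiveRefresh section of RuntimeMetrics.
interface RefreshCounters {
  refreshCount: number;
  successCount: number;
  failureCount: number;
}

// Fraction of proactive refreshes that failed; 0 when nothing has run yet.
function refreshFailureRatio(counters: RefreshCounters): number {
  return counters.refreshCount === 0 ? 0 : counters.failureCount / counters.refreshCount;
}
```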
Best Practices
Reliability Recommendations
Enable Live Sync: Keep liveAccountSync: true for multi-instance setups
Monitor Cooldowns: High cooldown rates indicate account or network issues
Proactive Refresh: Use default 5-minute buffer unless latency-sensitive
Respect Rate Limits: Don’t override cooldown timers manually
Session Affinity: Enable for conversational workloads to reduce churn
Circuit Breakers: Isolate chronically failing accounts with enabled: false
Logs: Monitor lib/logger.ts output for failure patterns
tokenRefreshSkewMs - Proactive refresh buffer (default: 5 minutes)
liveAccountSync - Enable live file watching (default: true)
maxRetryAttempts - Maximum retry attempts per request (default: 3)
cooldownDurationMs - Default cooldown duration (default: 60 seconds)
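For reference, the four options and their documented defaults can be written as a typed object. Only the option names and default values come from the list above; the `ReliabilityOptions` interface, the `withDefaults` helper, and the idea that the options live in a single object are assumptions made for illustration.

```typescript
// Hypothetical shape: only the option names and defaults are documented.
interface ReliabilityOptions {
  tokenRefreshSkewMs: number;
  liveAccountSync: boolean;
  maxRetryAttempts: number;
  cooldownDurationMs: number;
}

const DOCUMENTED_DEFAULTS: ReliabilityOptions = {
  tokenRefreshSkewMs: 5 * 60 * 1000, // 5 minutes
  liveAccountSync: true,
  maxRetryAttempts: 3,
  cooldownDurationMs: 60_000, // 60 seconds
};

// Merge user overrides over the documented defaults.
function withDefaults(overrides: Partial<ReliabilityOptions> = {}): ReliabilityOptions {
  return { ...DOCUMENTED_DEFAULTS, ...overrides };
}
```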