Building a Custom Token Manager with Circuit Breaker Pattern
Authentication token management sounds simple until it isn't. Refresh the token before it expires, redirect to login when it fails — what could go wrong? In our React Native regulated app, the answer was: everything. Race conditions during concurrent refreshes, infinite retry loops when the auth server was down, stale tokens in WebView contexts, and zombie sessions that should have expired hours ago. Here's how I built a custom TokenManager that solved all of it.
The Requirements That Broke Standard Solutions
Our token lifecycle had constraints that off-the-shelf solutions couldn't handle:
- 90% lifetime refresh: Refresh access tokens at 90% of their lifetime, not at expiry
- Circuit breaker: Stop hammering the auth server when it's down
- Mutex logout: Only one logout operation at a time, even with concurrent triggers
- 8-hour session timeout: Hard cap regardless of token validity
- 15-minute WebView grace period: WebView contexts get extra time before forced re-auth
No library I found handled all five. So I built one.
The TokenManager Architecture
```typescript
// tokenManager.ts
interface TokenState {
  accessToken: string | null;
  refreshToken: string | null;
  expiresAt: number;
  sessionStartedAt: number;
  lastActivity: number;
}

interface CircuitBreakerState {
  failures: number;
  lastFailure: number;
  state: 'closed' | 'open' | 'half-open';
}

const SESSION_TIMEOUT = 8 * 60 * 60 * 1000; // 8 hours
const WEBVIEW_GRACE = 15 * 60 * 1000; // 15 minutes
const REFRESH_THRESHOLD = 0.9; // 90% of lifetime
const CIRCUIT_BREAKER_THRESHOLD = 3;
const CIRCUIT_BREAKER_RESET = 30 * 1000; // 30 seconds

class TokenManager {
  private state: TokenState;
  private circuitBreaker: CircuitBreakerState;
  private refreshPromise: Promise<string> | null = null;
  private logoutMutex: boolean = false;

  constructor() {
    this.state = {
      accessToken: null,
      refreshToken: null,
      expiresAt: 0,
      sessionStartedAt: 0,
      lastActivity: 0,
    };
    this.circuitBreaker = {
      failures: 0,
      lastFailure: 0,
      state: 'closed',
    };
  }
}
```
The 90% Lifetime Refresh Strategy
Most implementations refresh tokens when they get a 401 response. That's reactive — and in a regulated app, it means the user sees a loading state while the token refreshes. I wanted to be proactive.
```typescript
class TokenManager {
  // ...

  shouldRefresh(): boolean {
    if (!this.state.accessToken || !this.state.expiresAt) return false;

    const now = Date.now();
    // Lifetime and elapsed time are both measured from session start
    const tokenLifetime = this.state.expiresAt - this.state.sessionStartedAt;
    const elapsed = now - this.state.sessionStartedAt;
    const lifetimeRatio = elapsed / tokenLifetime;

    return lifetimeRatio >= REFRESH_THRESHOLD;
  }

  async getValidToken(context: 'native' | 'webview' = 'native'): Promise<string> {
    // Check session timeout first
    if (this.isSessionExpired(context)) {
      await this.handleSessionExpiry();
      throw new SessionExpiredError();
    }

    // Proactive refresh at 90% lifetime
    if (this.shouldRefresh()) {
      return this.refreshWithDedup();
    }

    if (!this.state.accessToken) {
      throw new NoTokenError();
    }

    this.state.lastActivity = Date.now();
    return this.state.accessToken;
  }

  private isSessionExpired(context: 'native' | 'webview'): boolean {
    // No session has started yet, so there is nothing to expire;
    // getValidToken falls through to NoTokenError instead
    if (!this.state.sessionStartedAt) return false;

    const now = Date.now();
    const sessionAge = now - this.state.sessionStartedAt;
    const timeout = context === 'webview'
      ? SESSION_TIMEOUT + WEBVIEW_GRACE
      : SESSION_TIMEOUT;

    return sessionAge > timeout;
  }
}
```
The 90% threshold gives us a comfortable buffer. For a token with a 1-hour lifetime, we start refreshing at 54 minutes. The user never sees a token-related loading state during normal usage.
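One caveat: the check above measures both the lifetime and the elapsed time from sessionStartedAt, so it lines up with the token's own lifetime only until the first refresh. Here is a minimal sketch of a per-token version, assuming the manager also records when each token was issued. The issuedAt value and the shouldRefreshToken helper are hypothetical and not part of the TokenManager shown here.

```typescript
// Hypothetical helper: true once a token has used up `threshold` of its
// own lifetime, measured from the moment that token was issued.
function shouldRefreshToken(
  issuedAt: number,
  expiresAt: number,
  now: number,
  threshold = 0.9,
): boolean {
  const lifetime = expiresAt - issuedAt;
  if (lifetime <= 0) return true; // malformed or already-expired token
  return (now - issuedAt) / lifetime >= threshold;
}

const MINUTE = 60 * 1000;

// First token: issued at session start, valid for one hour
console.log(shouldRefreshToken(0, 60 * MINUTE, 53 * MINUTE)); // false
console.log(shouldRefreshToken(0, 60 * MINUTE, 54 * MINUTE)); // true

// Second token: issued at minute 54, valid until minute 114
console.log(shouldRefreshToken(54 * MINUTE, 114 * MINUTE, 107 * MINUTE)); // false
console.log(shouldRefreshToken(54 * MINUTE, 114 * MINUTE, 108 * MINUTE)); // true
```

With this shape, every token refreshes 54 minutes into its own hour, no matter how old the session is.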
Circuit Breaker: Stop the Bleeding
When the auth server goes down, naive retry logic creates a thundering herd. Every failed refresh triggers another attempt, which fails, which triggers another attempt. In a mobile app with background refresh, this drains battery and floods the server the moment it recovers.
```typescript
class TokenManager {
  // ...

  private canAttemptRefresh(): boolean {
    const { state, lastFailure } = this.circuitBreaker;

    switch (state) {
      case 'closed':
        return true;
      case 'open': {
        const timeSinceLastFailure = Date.now() - lastFailure;
        if (timeSinceLastFailure > CIRCUIT_BREAKER_RESET) {
          // Transition to half-open: allow one attempt
          this.circuitBreaker.state = 'half-open';
          return true;
        }
        return false;
      }
      case 'half-open':
        // Only one request allowed in half-open state.
        // refreshWithDedup enforces this by reusing the in-flight promise.
        return true;
      default:
        return false;
    }
  }

  private recordSuccess(): void {
    this.circuitBreaker = {
      failures: 0,
      lastFailure: 0,
      state: 'closed',
    };
  }

  private recordFailure(): void {
    this.circuitBreaker.failures += 1;
    this.circuitBreaker.lastFailure = Date.now();

    if (this.circuitBreaker.failures >= CIRCUIT_BREAKER_THRESHOLD) {
      this.circuitBreaker.state = 'open';
    }
  }
}
```
The states work like a real electrical circuit breaker. Closed means normal operation — requests flow through. After three consecutive failures, the breaker opens and blocks all refresh attempts for 30 seconds. After the cooldown, it enters half-open state, allowing a single test request. If that succeeds, the breaker closes. If it fails, it opens again.
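To make those transitions concrete, here is a small standalone sketch of the same state machine with an injected clock, so the walk-through can advance time without waiting. The CircuitBreaker class and its method names are illustrative and separate from the TokenManager code.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

// Standalone sketch of the breaker logic; `now` is injected for the demo.
class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: BreakerState = 'closed';
  private readonly threshold: number;
  private readonly resetMs: number;
  private readonly now: () => number;

  constructor(threshold: number, resetMs: number, now: () => number) {
    this.threshold = threshold;
    this.resetMs = resetMs;
    this.now = now;
  }

  canAttempt(): boolean {
    // After the cooldown, an open breaker lets a trial request through
    if (this.state === 'open' && this.now() - this.lastFailure > this.resetMs) {
      this.state = 'half-open';
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(): void {
    this.failures += 1;
    this.lastFailure = this.now();
    if (this.failures >= this.threshold) this.state = 'open';
  }

  current(): BreakerState {
    return this.state;
  }
}

let clock = 0;
const breaker = new CircuitBreaker(3, 30_000, () => clock);

breaker.recordFailure();
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.current()); // open
console.log(breaker.canAttempt()); // false

clock += 30_001; // cooldown has passed
console.log(breaker.canAttempt()); // true
console.log(breaker.current()); // half-open

breaker.recordFailure(); // the trial request failed
console.log(breaker.current()); // open

clock += 30_001;
breaker.canAttempt();
breaker.recordSuccess(); // the trial request succeeded
console.log(breaker.current()); // closed
```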
Request Deduplication with Promise Caching
The most insidious bug we had was concurrent token refreshes. User opens the app, five API calls fire simultaneously, all see an expired token, all try to refresh. Five refresh requests hit the server, four get invalid refresh tokens because the first one already rotated the token pair.
```typescript
class TokenManager {
  // ...

  private async refreshWithDedup(): Promise<string> {
    // If a refresh is already in flight, reuse the same promise
    if (this.refreshPromise) {
      return this.refreshPromise;
    }

    if (!this.canAttemptRefresh()) {
      throw new CircuitBreakerOpenError(
        'Token refresh circuit breaker is open. Try again later.'
      );
    }

    this.refreshPromise = this.executeRefresh();

    try {
      const token = await this.refreshPromise;
      return token;
    } finally {
      this.refreshPromise = null;
    }
  }

  private async executeRefresh(): Promise<string> {
    try {
      const response = await authApi.refreshToken({
        refreshToken: this.state.refreshToken,
      });

      this.state.accessToken = response.accessToken;
      this.state.refreshToken = response.refreshToken;
      this.state.expiresAt = Date.now() + response.expiresIn * 1000;
      this.state.lastActivity = Date.now();

      this.recordSuccess();
      return response.accessToken;
    } catch (error) {
      this.recordFailure();

      if (isRefreshTokenExpired(error)) {
        await this.handleSessionExpiry();
      }

      throw error;
    }
  }
}
```
The key insight is this.refreshPromise. By caching the in-flight promise, all concurrent callers await the same operation. This is simpler and more reliable than a mutex or semaphore because JavaScript runs this code on a single thread: there is no await between the if (this.refreshPromise) check and the assignment, so no other caller can slip in between them.
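The effect is easy to see in isolation. Below is a minimal sketch in which five callers ask for a refresh in the same tick and only one simulated network call happens. DedupedRefresher and the fake fetchToken function are stand-ins for illustration, not the production code.

```typescript
// Sketch of promise caching; `fetchToken` stands in for the real refresh call.
class DedupedRefresher {
  private inFlight: Promise<string> | null = null;
  private readonly fetchToken: () => Promise<string>;

  constructor(fetchToken: () => Promise<string>) {
    this.fetchToken = fetchToken;
  }

  refresh(): Promise<string> {
    // Check and assignment run synchronously, with no await in between
    if (this.inFlight) return this.inFlight;

    const promise = this.fetchToken().finally(() => {
      // Clear the slot only if it still holds this request
      if (this.inFlight === promise) this.inFlight = null;
    });
    this.inFlight = promise;
    return promise;
  }
}

let networkCalls = 0;
const refresher = new DedupedRefresher(async () => {
  networkCalls += 1;
  return `token-${networkCalls}`;
});

// Five callers in the same tick all receive the same promise
const callers = Array.from({ length: 5 }, () => refresher.refresh());

Promise.all(callers).then((tokens) => {
  console.log(networkCalls); // 1
  console.log(new Set(tokens).size); // 1
});
```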
Mutex Logout: One Logout to Rule Them All
A similar race condition exists with logout. A token refresh fails and the circuit breaker triggers logout, the WebView detects session expiry and triggers logout, and the user taps the logout button. Three concurrent logout operations like these can corrupt state.
```typescript
class TokenManager {
  // ...

  async logout(): Promise<void> {
    if (this.logoutMutex) {
      // Another logout is already in progress, wait for it
      return new Promise((resolve) => {
        const check = setInterval(() => {
          if (!this.logoutMutex) {
            clearInterval(check);
            resolve();
          }
        }, 100);
      });
    }

    this.logoutMutex = true;

    try {
      // Revoke tokens on the server
      if (this.state.refreshToken) {
        await authApi.revokeToken(this.state.refreshToken).catch(() => {
          // Best effort: don't block logout if revocation fails
        });
      }

      // Clear all state
      this.state = {
        accessToken: null,
        refreshToken: null,
        expiresAt: 0,
        sessionStartedAt: 0,
        lastActivity: 0,
      };

      // Reset circuit breaker
      this.circuitBreaker = {
        failures: 0,
        lastFailure: 0,
        state: 'closed',
      };

      // Notify stores
      useAuthStore.getState().clearAuth();

      // Navigate to login
      navigationRef.current?.reset({
        index: 0,
        routes: [{ name: 'Login' }],
      });
    } finally {
      this.logoutMutex = false;
    }
  }
}
```
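The polling loop works, but waiters can wake up as much as 100 ms after the logout finishes. One possible alternative is to reuse the promise caching idea from the refresh path, so late callers simply await the logout already in progress. The sketch below assumes a performLogout callback that wraps the revoke, clear, and navigate steps. LogoutCoordinator and performLogout are hypothetical names.

```typescript
// Sketch: `performLogout` is a placeholder for the revoke/clear/navigate steps.
class LogoutCoordinator {
  private logoutPromise: Promise<void> | null = null;
  private readonly performLogout: () => Promise<void>;

  constructor(performLogout: () => Promise<void>) {
    this.performLogout = performLogout;
  }

  logout(): Promise<void> {
    if (!this.logoutPromise) {
      this.logoutPromise = this.performLogout().finally(() => {
        // Allow a fresh logout once this one has settled
        this.logoutPromise = null;
      });
    }
    return this.logoutPromise;
  }
}

let logoutRuns = 0;
const coordinator = new LogoutCoordinator(async () => {
  logoutRuns += 1;
});

// Three triggers at once: failed refresh, WebView expiry, logout button
const triggers = [coordinator.logout(), coordinator.logout(), coordinator.logout()];

Promise.all(triggers).then(() => {
  console.log(logoutRuns); // 1
});
```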
The WebView Grace Period
Our app embeds React web content via WebView. When the native app's session expires, we can't immediately kill the WebView — the user might be in the middle of a form. The 15-minute grace period handles this:
```typescript
// In the WebView bridge
webViewBridge.onTokenRequest(async () => {
  try {
    // 'webview' context gets SESSION_TIMEOUT + WEBVIEW_GRACE
    const token = await tokenManager.getValidToken('webview');
    return { token, expiresIn: tokenManager.getRemainingTime() };
  } catch (error) {
    if (error instanceof SessionExpiredError) {
      webViewBridge.postMessage({
        type: 'SESSION_EXPIRED',
        gracePeriodRemaining: 0,
      });
    }
    throw error;
  }
});
```
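If the web content needs to warn the user before the grace period runs out, the bridge could also report how much grace time is left. Here is a rough sketch of that calculation using the same 8-hour and 15-minute constants. The webViewSessionPhase helper and its return shape are assumptions for illustration, not something the bridge exposes today.

```typescript
const SESSION_TIMEOUT = 8 * 60 * 60 * 1000; // 8 hours
const WEBVIEW_GRACE = 15 * 60 * 1000; // 15 minutes

type SessionPhase = 'active' | 'grace' | 'expired';

interface WebViewSessionStatus {
  phase: SessionPhase;
  graceRemainingMs: number;
}

// Hypothetical helper: classifies a WebView session by its age.
function webViewSessionPhase(
  sessionStartedAt: number,
  now: number,
): WebViewSessionStatus {
  const age = now - sessionStartedAt;

  if (age <= SESSION_TIMEOUT) {
    // Native session still valid; the full grace period is untouched
    return { phase: 'active', graceRemainingMs: WEBVIEW_GRACE };
  }
  if (age <= SESSION_TIMEOUT + WEBVIEW_GRACE) {
    return {
      phase: 'grace',
      graceRemainingMs: SESSION_TIMEOUT + WEBVIEW_GRACE - age,
    };
  }
  return { phase: 'expired', graceRemainingMs: 0 };
}

const MIN = 60 * 1000;
console.log(webViewSessionPhase(0, 7 * 60 * MIN).phase); // active
console.log(webViewSessionPhase(0, 8 * 60 * MIN + 5 * MIN)); // { phase: 'grace', graceRemainingMs: 600000 }
console.log(webViewSessionPhase(0, 8 * 60 * MIN + 16 * MIN).phase); // expired
```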
Integration with Axios Interceptors
The TokenManager plugs into our API layer via Axios interceptors:
```typescript
apiClient.interceptors.request.use(async (config) => {
  const token = await tokenManager.getValidToken();
  config.headers.Authorization = `Bearer ${token}`;
  return config;
});

apiClient.interceptors.response.use(
  (response) => response,
  async (error) => {
    if (error.response?.status === 401 && !error.config._retry) {
      error.config._retry = true;
      const token = await tokenManager.getValidToken();
      error.config.headers.Authorization = `Bearer ${token}`;
      return apiClient(error.config);
    }
    return Promise.reject(error);
  }
);
```
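One thing to watch in the response interceptor: getValidToken() returns the cached access token unless it has passed the 90% threshold, so a retry after a 401 can resend the same rejected token. Below is a minimal sketch of a retry that forces a refresh first. forceRefresh is a hypothetical public method (the refreshWithDedup shown earlier is private), and withAuthRetry and the { status: 401 } error shape are illustrative stand-ins for the Axios wiring.

```typescript
// Hypothetical token source: `forceRefresh` is assumed, not part of the
// TokenManager shown in this post.
interface TokenSource {
  getValidToken(): Promise<string>;
  forceRefresh(): Promise<string>;
}

// Treats any error object carrying `status: 401` as an auth failure.
function isUnauthorized(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { status?: number }).status === 401
  );
}

// Sends once with the cached token; on a 401, refreshes and retries once.
async function withAuthRetry<T>(
  tokens: TokenSource,
  send: (token: string) => Promise<T>,
): Promise<T> {
  const token = await tokens.getValidToken();
  try {
    return await send(token);
  } catch (error) {
    if (!isUnauthorized(error)) throw error;
    const fresh = await tokens.forceRefresh();
    return send(fresh); // a second 401 propagates to the caller
  }
}
```

Because the retry happens at most once, a server that keeps answering 401 cannot trap the client in a loop, which is the same guarantee the _retry flag provides in the interceptor.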
Results in Production
After deploying the TokenManager, concurrent refresh race conditions fell to zero (down from ~50/day), auth server load during incidents dropped 40% thanks to the circuit breaker, session-related support tickets dropped by 80%, and WebView form abandonment decreased by 35% thanks to the grace period.
Building a custom solution was the right call here. The complexity was justified by the constraints, and the result is a system that handles edge cases gracefully instead of failing silently. Sometimes the best library is the one you write yourself.