Retries
LLM service APIs are inherently unstable, with requests frequently failing due to rate limits, temporary outages, network issues, and other transient errors. ActiveAgent relies on the built-in retry mechanisms provided by each provider's underlying SDK (ruby-openai, anthropic-rb, etc.), which implement sophisticated retry strategies with exponential backoff and rate limit handling.
Provider-Native Retries
Each provider SDK includes its own retry logic that's specifically tuned for that provider's API:
When retries trigger:
- Network timeouts and connection errors
- Rate limit responses (429 status codes)
- Temporary server errors (500, 502, 503, 504)
- Socket errors and DNS failures
- Provider-specific transient errors
Retry behavior:
- Automatic exponential backoff
- Rate limit header awareness
- Jitter to prevent thundering herd
- Provider-optimized retry policies
- After max retries, an exception is raised
Configuration
OpenAI and OpenRouter
Both OpenAI and OpenRouter use the ruby-openai gem which provides comprehensive retry configuration:
# config/activeagent.yml
openai:
service: "OpenAI"
access_token: <%= Rails.application.credentials.dig(:openai, :access_token) %>
max_retries: 5 # Number of retry attempts (default: 3)
timeout: 600.0 # Total request timeout in seconds (default: 120)
initial_retry_delay: 1.0 # Initial delay between retries (default: 1.0)
max_retry_delay: 8.0 # Maximum delay between retries (default: 8.0)The ruby-openai gem automatically:
- Retries on network errors and rate limits
- Uses exponential backoff with jitter
- Respects Retry-After headers from the API
- Handles 429, 500, 502, 503, 504 status codes
Anthropic
The Anthropic provider uses the anthropic-rb gem with similar retry configuration:
# config/activeagent.yml
anthropic:
service: "Anthropic"
access_token: <%= Rails.application.credentials.dig(:anthropic, :access_token) %>
max_retries: 5 # Number of retry attempts (default: 2)
timeout: 600.0 # Total request timeout in seconds (default: 600)
initial_retry_delay: 0.5 # Initial delay between retries (default: 0.5)
max_retry_delay: 2.0 # Maximum delay between retries (default: 2.0)Ollama
Ollama runs locally and typically doesn't need extensive retry configuration, but you can still configure timeouts:
# config/activeagent.yml
ollama:
service: "Ollama"
host: "http://localhost:11434"
timeout: 300.0 # Adjust for long-running local modelsPer-Agent Configuration
You can override retry settings on a per-agent basis:
class MyAgent < ApplicationAgent
generate_with :openai,
model: "gpt-4o",
max_retries: 10, # More aggressive retries for critical agents
timeout: 1200.0
endDisabling Retries
To disable retries completely (not recommended for production):
# config/activeagent.yml
openai:
service: "OpenAI"
access_token: <%= Rails.application.credentials.dig(:openai, :access_token) %>
max_retries: 0 # Disable retriesMonitoring Retries
Use instrumentation to monitor API calls and detect retry patterns:
ActiveSupport::Notifications.subscribe("generate.active_agent") do |name, start, finish, id, payload|
duration = finish - start
# Long durations may indicate retries occurred
if duration > 30
Rails.logger.warn("Generation took #{duration}s, agent: #{payload[:agent_class]}")
end
if payload[:error]
Rails.logger.error("Generation failed: #{payload[:error]}")
end
endProvider SDKs may also log retry attempts. Enable debug logging to see detailed retry information:
# For ruby-openai debugging
ENV['OPENAI_LOG'] = 'debug'
# For general Rails logging
ActiveAgent.configure do |config|
config.logger.level = Logger::DEBUG
endAdvanced Retry Strategies
For advanced retry patterns beyond what the provider SDKs offer, consider:
Background Job Retries
Use Sidekiq's retry mechanism for job-level retries:
class GenerationJob < ApplicationJob
queue_as :default
retry_on Net::ReadTimeout, wait: :exponentially_longer, attempts: 10
retry_on SomeProvider::RateLimitError, wait: 1.minute, attempts: 5
def perform(agent_class, input)
agent_class.constantize.generate(input:)
end
endError Handling
For comprehensive error handling patterns including exception handlers and recovery strategies, see Error Handling.
Provider Documentation
Each provider SDK has detailed documentation on their retry implementation:
- ruby-openai - OpenAI and OpenRouter
- anthropic-rb - Anthropic Claude
- ruby-ollama - Ollama (local)
Related Documentation
- Configuration - Framework and provider configuration
- Rails Integration - Rails-specific configuration and setup
- Testing - Test configuration and mock providers
- Error Handling - Agent-level error handling
- Instrumentation - Monitoring and logging
- Providers - Provider-specific documentation