Generation Provider

Generation Providers are the backbone of the Active Agent framework, allowing seamless integration with various AI services. They provide a consistent interface for prompting and generating responses, making it easy to switch between different providers without changing the core logic of your application.

Available Providers

You can use the following generation providers with Active Agent:

ruby
class OpenAIAgent < ApplicationAgent
  layout "agent"
  generate_with :openai, model: "gpt-4o-mini", instructions: "You're a basic OpenAI agent."
end
ruby
class AnthropicAgent < ActiveAgent::Base
  generate_with :anthropic
end
ruby
class OpenRouterAgent < ApplicationAgent
  layout "agent"
  generate_with :open_router, model: "qwen/qwen3-30b-a3b:free", instructions: "You're a basic Open Router agent."
end
ruby
class OllamaAgent < ApplicationAgent
  layout "agent"
  generate_with :ollama, model: "gemma3:latest", instructions: "You're a basic Ollama agent."
end

Response

Generation providers handle the request-response cycle: they process the prompt context, including messages, actions, and parameters, and return the generated response.

Response Object

The ActiveAgent::GenerationProvider::Response class encapsulates the result of a generation request, providing access to both the processed response and debugging information.

Attributes

  • message - The generated response message from the AI provider
  • prompt - The complete prompt object used for generation, including updated context, messages, and parameters
  • raw_response - The unprocessed response data from the AI provider, useful for debugging and accessing provider-specific metadata

Example Usage

ruby
response = ApplicationAgent.with(message: "Hello").prompt_context.generate_now

# Access response content
content = response.message.content

# Access response role
role = response.message.role

# Access full prompt context
messages = response.prompt.messages

# Access usage statistics (if available)
usage = response.usage

The response object ensures you have full visibility into both the input prompt context and the raw provider response, making it easy to debug generation issues or access provider-specific response metadata.
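
For example, when a provider returns metadata that isn't surfaced on the message, you can read it from raw_response. A minimal sketch, assuming an OpenAI-style chat completion payload (the key names are assumptions and vary by provider):

ruby
# Inspect the unprocessed provider payload for debugging
raw = response.raw_response

raw["model"]    # exact model version that served the request (assumed key)
raw["usage"]    # provider-reported token counts (assumed key)
raw["choices"]  # provider-specific choice data (assumed key)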

Provider Configuration

You can configure generation providers with custom settings:

Model and Temperature Configuration

ruby
class AnthropicConfigAgent < ActiveAgent::Base
  generate_with :anthropic,
    model: "claude-3-5-sonnet-20241022",
    temperature: 0.7
end
ruby
class OpenRouterConfigAgent < ActiveAgent::Base
  generate_with :open_router,
    model: "anthropic/claude-3-5-sonnet",
    temperature: 0.5
end

Custom Host Configuration

For Azure OpenAI or other custom endpoints:

ruby
class CustomHostAgent < ActiveAgent::Base
  generate_with :openai,
    host: "https://your-azure-openai-resource.openai.azure.com",
    api_key: "your-api-key",
    model: "gpt-4"
end

Configuration Precedence

ActiveAgent follows a clear hierarchy for configuration parameters, ensuring that you have fine-grained control over your AI generation settings. Parameters can be configured at multiple levels, with higher-priority settings overriding lower-priority ones.

Precedence Order (Highest to Lowest)

  1. Runtime Options - Parameters passed directly to the prompt method
  2. Agent Options - Parameters defined in generate_with at the agent class level
  3. Global Configuration - Parameters in config/active_agent.yml

This hierarchy allows you to:

  • Set sensible defaults globally
  • Override them for specific agents
  • Make runtime adjustments for individual requests
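
Conceptually, the effective options behave like a layered hash merge in which later sources win and nil values are dropped before merging. A simplified sketch (illustrative only, not the framework's actual merge code):

ruby
# Illustrative precedence merge: config < agent < runtime
config_options  = { model: "config-model", temperature: 0.1, max_tokens: 100 }
agent_options   = { model: "agent-model",  temperature: 0.5 }
runtime_options = { temperature: 0.9, max_tokens: nil }

effective = config_options
  .merge(agent_options)
  .merge(runtime_options.compact)  # nil runtime values never override

effective # => { model: "agent-model", temperature: 0.9, max_tokens: 100 }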

Example: Configuration Precedence in Action

ruby
test "validates configuration precedence: runtime > agent > config" do
  # Step 1: Set up config-level options (lowest priority)
  # This would normally be in config/active_agent.yml
  config_options = {
    "service" => "OpenRouter",
    "model" => "config-model",
    "temperature" => 0.1,
    "max_tokens" => 100,
    "data_collection" => "allow"
  }

  # Create a mock provider that exposes its config for testing
  mock_provider = ActiveAgent::GenerationProvider::OpenRouterProvider.new(config_options)

  # Step 2: Create agent with generate_with options (medium priority)
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-model",
      temperature: 0.5,
      data_collection: "deny"
    # Note: max_tokens not specified here, should fall back to config
  end

  agent = agent_class.new

  # Step 3: Call prompt with runtime options (highest priority)
  prompt_context = agent.prompt(
    message: "test",
    options: {
      temperature: 0.9,  # Override both agent and config
      max_tokens: 500    # Override config (agent didn't specify)
      # Note: model not specified, should use agent-model
      # Note: data_collection not specified, should use deny from agent
    }
  )

  # Verify the merged options follow correct precedence
  merged_options = prompt_context.options

  # Runtime options win when specified
  assert_equal 0.9, merged_options[:temperature], "Runtime temperature should override agent and config"
  assert_equal 500, merged_options[:max_tokens], "Runtime max_tokens should override config"

  # Agent options win over config when runtime not specified
  assert_equal "agent-model", merged_options[:model], "Agent model should override config when runtime not specified"
  assert_equal "deny", merged_options[:data_collection], "Agent data_collection should override config when runtime not specified"
end

Data Collection Precedence Example

The data_collection parameter for OpenRouter follows the same precedence rules:

ruby
test "data_collection follows precedence rules" do
  # 1. Config level (lowest priority)
  config_with_allow = {
    "service" => "OpenRouter",
    "model" => "openai/gpt-4o",
    "data_collection" => "allow"
  }

  # 2. Agent level with generate_with (medium priority)
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "openai/gpt-4o",
      data_collection: "deny"  # Override config
  end

  agent = agent_class.new
  provider = agent.send(:generation_provider)

  # Test without runtime override - should use agent level "deny"
  prompt_without_runtime = agent.prompt(message: "test")
  provider.instance_variable_set(:@prompt, prompt_without_runtime)
  prefs = provider.send(:build_provider_preferences)
  assert_equal "deny", prefs[:data_collection], "Agent-level data_collection should override config"

  # 3. Runtime level (highest priority)
  prompt_with_runtime = agent.prompt(
    message: "test",
    options: {
      data_collection: [ "OpenAI" ]  # Override both agent and config
    }
  )
  provider.instance_variable_set(:@prompt, prompt_with_runtime)
  prefs = provider.send(:build_provider_preferences)
  assert_equal [ "OpenAI" ], prefs[:data_collection], "Runtime data_collection should override everything"
end

Key Principles

1. Runtime Always Wins

Runtime options in the prompt method override all other configurations. See the test demonstrating this behavior:

ruby
test "runtime options override everything" do
  # Create agent with all levels configured
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "gpt-4",
      temperature: 0.5,
      max_tokens: 1000,
      data_collection: "deny"
  end

  agent = agent_class.new

  # Runtime options should override everything
  prompt_context = agent.prompt(
    message: "test",
    options: {
      model: "runtime-model",
      temperature: 0.99,
      max_tokens: 2000,
      data_collection: [ "OpenAI", "Google" ]
    }
  )

  options = prompt_context.options
  assert_equal "runtime-model", options[:model]
  assert_equal 0.99, options[:temperature]
  assert_equal 2000, options[:max_tokens]
  assert_equal [ "OpenAI", "Google" ], options[:data_collection]
end

2. Nil Values Don't Override

Nil values passed at runtime don't override existing configurations:

ruby
test "nil runtime values don't override" do
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-model",
      temperature: 0.5
  end

  agent = agent_class.new

  # Pass nil values in runtime options
  prompt_context = agent.prompt(
    message: "test",
    options: {
      model: nil,
      temperature: nil,
      max_tokens: 999  # Non-nil value should work
    }
  )

  options = prompt_context.options

  # Nil values should not override
  assert_equal "agent-model", options[:model]
  assert_equal 0.5, options[:temperature]

  # Non-nil value should override
  assert_equal 999, options[:max_tokens]
end

3. Agent Configuration Overrides Global

Agent-level settings take precedence over global configuration files:

ruby
test "agent options override config options" do
  # Create agent with generate_with options
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-override-model",
      temperature: 0.7
  end

  agent = agent_class.new

  # Call prompt without runtime options
  prompt_context = agent.prompt(message: "test")

  options = prompt_context.options
  assert_equal "agent-override-model", options[:model]
  assert_equal 0.7, options[:temperature]
end

Supported Runtime Options

The following options can be overridden at runtime:

  • :model - The AI model to use
  • :temperature - Creativity/randomness (typically 0.0-1.0; some providers accept up to 2.0)
  • :max_tokens - Maximum response length
  • :stream - Enable streaming responses
  • :top_p - Nucleus sampling parameter
  • :frequency_penalty - Reduce repetition
  • :presence_penalty - Encourage topic diversity
  • :response_format - Structured output format
  • :seed - For reproducible outputs
  • :stop - Stop sequences
  • :tools_choice - Tool selection strategy
  • :data_collection - Privacy settings (OpenRouter)
  • :require_parameters - Provider parameter validation (OpenRouter)
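
As a sketch, runtime options are passed through the options hash of prompt, just as in the precedence tests above (the values here are arbitrary):

ruby
# Illustrative runtime overrides for a single request
agent = ApplicationAgent.new
prompt_context = agent.prompt(
  message: "Summarize this article",
  options: {
    temperature: 0.2,  # lower randomness for this request
    max_tokens: 300,   # cap the response length
    seed: 42           # reproducible output where the provider supports it
  }
)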

Best Practices

  1. Use Global Config for Defaults: Set organization-wide defaults in config/active_agent.yml
  2. Agent-Level for Specific Needs: Override in generate_with for agent-specific requirements
  3. Runtime for Dynamic Adjustments: Use runtime options for user preferences or conditional logic

For a complete example showing all three levels working together, see the configuration precedence test shown earlier in this section.


Embeddings Support

Generation providers support creating text embeddings for semantic search, clustering, and similarity matching. Embeddings transform text into numerical vectors that capture semantic meaning.

Generating Embeddings Synchronously

Use embed_now to generate embeddings immediately:

ruby
test "generates embeddings synchronously with embed_now" do
  VCR.use_cassette("embedding_agent_sync") do
    # Create a generation for embedding
    generation = ApplicationAgent.with(
      message: "The quick brown fox jumps over the lazy dog"
    ).prompt_context

    # Generate embedding synchronously
    response = generation.embed_now

    # Extract embedding vector
    embedding_vector = response.message.content

    assert_kind_of Array, embedding_vector
    assert embedding_vector.all? { |v| v.is_a?(Float) }
    assert_includes [ 1536, 3072 ], embedding_vector.size  # OpenAI dimensions vary by model

    # Document the example
    doc_example_output(response)

    embedding_vector
  end
end

Asynchronous Embedding Generation

Use embed_later for background processing of embeddings:

ruby
test "generates embeddings asynchronously with embed_later" do
  # Create a generation for async embedding
  generation = ApplicationAgent.with(
    message: "Artificial intelligence is transforming technology"
  ).prompt_context

  # Mock the enqueue_generation private method
  generation.instance_eval do
    def enqueue_generation(method, options = {})
      @enqueue_called = true
      @enqueue_method = method
      @enqueue_options = options
      true
    end

    def enqueue_called?
      @enqueue_called
    end

    def enqueue_method
      @enqueue_method
    end

    def enqueue_options
      @enqueue_options
    end
  end

  # Queue embedding for background processing
  result = generation.embed_later(
    priority: :low,
    queue: :embeddings
  )

  assert result
  assert generation.enqueue_called?
  assert_equal :embed_now, generation.enqueue_method
  assert_equal({ priority: :low, queue: :embeddings }, generation.enqueue_options)
end

Embedding Callbacks

Process embeddings with before and after callbacks:

ruby
test "processes embeddings with callbacks" do
  VCR.use_cassette("embedding_agent_callbacks") do
    # Create a custom agent with embedding callbacks
    custom_agent_class = Class.new(ApplicationAgent) do
      attr_accessor :before_embedding_called, :after_embedding_called

      before_embedding :track_before
      after_embedding :track_after

      def track_before
        self.before_embedding_called = true
      end

      def track_after
        self.after_embedding_called = true
      end
    end

    # Generate embedding with callbacks
    generation = custom_agent_class.with(
      message: "Testing embedding callbacks"
    ).prompt_context

    agent = generation.send(:processed_agent)
    response = generation.embed_now

    assert agent.before_embedding_called
    assert agent.after_embedding_called
    assert_not_nil response.message.content

    doc_example_output(response)
  end
end
Similarity Search

Use embeddings to find semantically similar content:

ruby
test "performs similarity search with embeddings" do
  VCR.use_cassette("embedding_similarity_search") do
    documents = [
      "The cat sat on the mat",
      "Dogs are loyal companions",
      "Machine learning is a subset of AI",
      "The feline rested on the rug"
    ]

    # Generate embeddings for all documents
    embeddings = documents.map do |doc|
      generation = ApplicationAgent.with(message: doc).prompt_context
      generation.embed_now.message.content
    end

    # Query embedding
    query = "cat on mat"
    query_generation = ApplicationAgent.with(message: query).prompt_context
    query_embedding = query_generation.embed_now.message.content

    # Calculate cosine similarities
    similarities = embeddings.map.with_index do |embedding, index|
      similarity = cosine_similarity(query_embedding, embedding)
      { document: documents[index], similarity: similarity }
    end

    # Sort by similarity
    results = similarities.sort_by { |s| -s[:similarity] }

    # Most similar should be the cat/mat documents
    assert_equal "The cat sat on the mat", results.first[:document]
    assert results.first[:similarity] > 0.5, "Similarity should be > 0.5, got #{results.first[:similarity]}"

    # Document the results
    doc_example_output(results.first(2))
  end
end
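
The test above relies on a cosine_similarity helper that isn't shown; a minimal pure-Ruby implementation might look like this:

ruby
# Cosine similarity between two equal-length numeric vectors
def cosine_similarity(a, b)
  dot       = a.zip(b).sum { |x, y| x * y }
  magnitude = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  magnitude.zero? ? 0.0 : dot / magnitude
end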

Provider-Specific Embedding Models

Different providers offer various embedding models:

  • OpenAI: text-embedding-3-large, text-embedding-3-small, text-embedding-ada-002
  • Ollama: nomic-embed-text, mxbai-embed-large, all-minilm
  • Anthropic: Does not natively support embeddings (use a dedicated embedding provider)

Configuration

Configure embedding models in your agent:

ruby
class EmbeddingAgent < ApplicationAgent
  generate_with :openai,
    model: "gpt-4",  # For text generation
    embedding_model: "text-embedding-3-large"  # For embeddings
end
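
The same pattern should extend to other providers; for example, a local Ollama setup might pair a chat model with a local embedding model (applying the embedding_model option to Ollama is an assumption based on the OpenAI example above):

ruby
class LocalEmbeddingAgent < ApplicationAgent
  generate_with :ollama,
    model: "gemma3:latest",              # for text generation
    embedding_model: "nomic-embed-text"  # assumed to work like the OpenAI option above
end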

Or in your configuration file:

yaml
development:
  openai:
    model: gpt-4
    embedding_model: text-embedding-3-large
    dimensions: 256  # Optional: reduce embedding dimensions
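
Either way, embeddings are generated through the same API shown earlier. A usage sketch with the EmbeddingAgent class defined above:

ruby
response = EmbeddingAgent.with(
  message: "Vector search relies on semantic similarity"
).prompt_context.embed_now

vector = response.message.content  # => Array of Floats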

For more details on embeddings, see the Embeddings Guide.

Provider-Specific Documentation

For detailed documentation on specific providers and their features: