Generation Provider

Generation Providers are the backbone of the Active Agent framework, allowing seamless integration with various AI services. They provide a consistent interface for prompting and generating responses, making it easy to switch between different providers without changing the core logic of your application.

Available Providers

You can use the following generation providers with Active Agent:

ruby
class OpenAIAgent < ApplicationAgent
  layout "agent"
  generate_with :openai, model: "gpt-4o-mini", instructions: "You're a basic OpenAI agent."
end
ruby
class AnthropicAgent < ActiveAgent::Base
  generate_with :anthropic
end
ruby
class OpenRouterAgent < ApplicationAgent
  layout "agent"
  generate_with :open_router, model: "qwen/qwen3-30b-a3b:free", instructions: "You're a basic Open Router agent."
end
ruby
class OllamaAgent < ApplicationAgent
  layout "agent"
  generate_with :ollama, model: "gemma3:latest", instructions: "You're a basic Ollama agent."
end

Response

Generation providers handle the request-response cycle: they process the prompt context, including messages, actions, and parameters, and return the generated response.

Response Object

The ActiveAgent::GenerationProvider::Response class encapsulates the result of a generation request, providing access to both the processed response and debugging information.

Attributes

  • message - The generated response message from the AI provider
  • prompt - The complete prompt object used for generation, including updated context, messages, and parameters
  • raw_response - The unprocessed response data from the AI provider, useful for debugging and accessing provider-specific metadata

Example Usage

ruby
response = ApplicationAgent.with(message: "Hello").prompt_context.generate_now

# Access response content
content = response.message.content

# Access response role
role = response.message.role

# Access full prompt context
messages = response.prompt.messages

# Access usage statistics (if available)
usage = response.usage

The response object ensures you have full visibility into both the input prompt context and the raw provider response, making it easy to debug generation issues or access provider-specific response metadata.
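
For example, when a provider returns metadata that isn't surfaced on the message, you can read it from raw_response. A minimal sketch, assuming an OpenAI-style chat completion payload (the key names are assumptions and vary by provider):

ruby
# Inspect the unprocessed provider payload for debugging
raw = response.raw_response

raw["model"]    # exact model version that served the request (assumed key)
raw["usage"]    # provider-reported token counts (assumed key)
raw["choices"]  # provider-specific choice data (assumed key)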

Provider Configuration

You can configure generation providers with custom settings:

Model and Temperature Configuration

ruby
class AnthropicConfigAgent < ActiveAgent::Base
  generate_with :anthropic,
    model: "claude-3-5-sonnet-20241022",
    temperature: 0.7
end
ruby
class OpenRouterConfigAgent < ActiveAgent::Base
  generate_with :open_router,
    model: "anthropic/claude-3-5-sonnet",
    temperature: 0.5
end

Custom Host Configuration

For Azure OpenAI or other custom endpoints:

ruby
class CustomHostAgent < ActiveAgent::Base
  generate_with :openai,
    host: "https://your-azure-openai-resource.openai.azure.com",
    api_key: "your-api-key",
    model: "gpt-4"
end

Configuration Precedence

ActiveAgent follows a clear hierarchy for configuration parameters, ensuring that you have fine-grained control over your AI generation settings. Parameters can be configured at multiple levels, with higher-priority settings overriding lower-priority ones.

Precedence Order (Highest to Lowest)

  1. Runtime Options - Parameters passed directly to the prompt method
  2. Agent Options - Parameters defined in generate_with at the agent class level
  3. Global Configuration - Parameters in config/active_agent.yml

This hierarchy allows you to:

  • Set sensible defaults globally
  • Override them for specific agents
  • Make runtime adjustments for individual requests
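
Conceptually, the effective options behave like a layered hash merge in which later sources win and nil values are dropped before merging. A simplified sketch (illustrative only, not the framework's actual merge code):

ruby
# Illustrative precedence merge: config < agent < runtime
config_options  = { model: "config-model", temperature: 0.1, max_tokens: 100 }
agent_options   = { model: "agent-model",  temperature: 0.5 }
runtime_options = { temperature: 0.9, max_tokens: nil }

effective = config_options
  .merge(agent_options)
  .merge(runtime_options.compact)  # nil runtime values never override

effective # => { model: "agent-model", temperature: 0.9, max_tokens: 100 }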

Example: Configuration Precedence in Action

ruby
test "validates configuration precedence: runtime > agent > config" do
  # Step 1: Set up config-level options (lowest priority)
  # This would normally be in config/active_agent.yml
  config_options = {
    "service" => "OpenRouter",
    "model" => "config-model",
    "temperature" => 0.1,
    "max_tokens" => 100,
    "data_collection" => "allow"
  }

  # Create a mock provider that exposes its config for testing
  mock_provider = ActiveAgent::GenerationProvider::OpenRouterProvider.new(config_options)

  # Step 2: Create agent with generate_with options (medium priority)
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-model",
      temperature: 0.5,
      data_collection: "deny"
    # Note: max_tokens not specified here, should fall back to config
  end

  agent = agent_class.new

  # Step 3: Call prompt with runtime options (highest priority)
  prompt_context = agent.prompt(
    message: "test",
    options: {
      temperature: 0.9,  # Override both agent and config
      max_tokens: 500    # Override config (agent didn't specify)
      # Note: model not specified, should use agent-model
      # Note: data_collection not specified, should use deny from agent
    }
  )

  # Verify the merged options follow correct precedence
  merged_options = prompt_context.options

  # Runtime options win when specified
  assert_equal 0.9, merged_options[:temperature], "Runtime temperature should override agent and config"
  assert_equal 500, merged_options[:max_tokens], "Runtime max_tokens should override config"

  # Agent options win over config when runtime not specified
  assert_equal "agent-model", merged_options[:model], "Agent model should override config when runtime not specified"
  assert_equal "deny", merged_options[:data_collection], "Agent data_collection should override config when runtime not specified"
end

Data Collection Precedence Example

The data_collection parameter for OpenRouter follows the same precedence rules:

ruby
test "data_collection follows precedence rules" do
  # 1. Config level (lowest priority)
  config_with_allow = {
    "service" => "OpenRouter",
    "model" => "openai/gpt-4o",
    "data_collection" => "allow"
  }

  # 2. Agent level with generate_with (medium priority)
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "openai/gpt-4o",
      data_collection: "deny"  # Override config
  end

  agent = agent_class.new
  provider = agent.send(:generation_provider)

  # Test without runtime override - should use agent level "deny"
  prompt_without_runtime = agent.prompt(message: "test")
  provider.instance_variable_set(:@prompt, prompt_without_runtime)
  prefs = provider.send(:build_provider_preferences)
  assert_equal "deny", prefs[:data_collection], "Agent-level data_collection should override config"

  # 3. Runtime level (highest priority)
  prompt_with_runtime = agent.prompt(
    message: "test",
    options: {
      data_collection: [ "OpenAI" ]  # Override both agent and config
    }
  )
  provider.instance_variable_set(:@prompt, prompt_with_runtime)
  prefs = provider.send(:build_provider_preferences)
  assert_equal [ "OpenAI" ], prefs[:data_collection], "Runtime data_collection should override everything"
end

Key Principles

1. Runtime Always Wins

Runtime options in the prompt method override all other configurations. See the test demonstrating this behavior:

ruby
test "runtime options override everything" do
  # Create agent with all levels configured
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "gpt-4",
      temperature: 0.5,
      max_tokens: 1000,
      data_collection: "deny"
  end

  agent = agent_class.new

  # Runtime options should override everything
  prompt_context = agent.prompt(
    message: "test",
    options: {
      model: "runtime-model",
      temperature: 0.99,
      max_tokens: 2000,
      data_collection: [ "OpenAI", "Google" ]
    }
  )

  options = prompt_context.options
  assert_equal "runtime-model", options[:model]
  assert_equal 0.99, options[:temperature]
  assert_equal 2000, options[:max_tokens]
  assert_equal [ "OpenAI", "Google" ], options[:data_collection]
end

2. Nil Values Don't Override

Nil values passed at runtime don't override existing configurations:

ruby
test "nil runtime values don't override" do
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-model",
      temperature: 0.5
  end

  agent = agent_class.new

  # Pass nil values in runtime options
  prompt_context = agent.prompt(
    message: "test",
    options: {
      model: nil,
      temperature: nil,
      max_tokens: 999  # Non-nil value should work
    }
  )

  options = prompt_context.options

  # Nil values should not override
  assert_equal "agent-model", options[:model]
  assert_equal 0.5, options[:temperature]

  # Non-nil value should override
  assert_equal 999, options[:max_tokens]
end

3. Agent Configuration Overrides Global

Agent-level settings take precedence over global configuration files:

ruby
test "agent options override config options" do
  # Create agent with generate_with options
  agent_class = Class.new(ApplicationAgent) do
    generate_with :open_router,
      model: "agent-override-model",
      temperature: 0.7
  end

  agent = agent_class.new

  # Call prompt without runtime options
  prompt_context = agent.prompt(message: "test")

  options = prompt_context.options
  assert_equal "agent-override-model", options[:model]
  assert_equal 0.7, options[:temperature]
end

Supported Runtime Options

The following options can be overridden at runtime:

  • :model - The AI model to use
  • :temperature - Creativity/randomness (typically 0.0-1.0; some providers accept up to 2.0)
  • :max_tokens - Maximum response length
  • :stream - Enable streaming responses
  • :top_p - Nucleus sampling parameter
  • :frequency_penalty - Reduce repetition
  • :presence_penalty - Encourage topic diversity
  • :response_format - Structured output format
  • :seed - For reproducible outputs
  • :stop - Stop sequences
  • :tools_choice - Tool selection strategy
  • :data_collection - Privacy settings (OpenRouter)
  • :require_parameters - Provider parameter validation (OpenRouter)
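
As a sketch, runtime options are passed through the options hash of prompt, just as in the precedence tests above (the values here are arbitrary):

ruby
# Illustrative runtime overrides for a single request
agent = ApplicationAgent.new
prompt_context = agent.prompt(
  message: "Summarize this article",
  options: {
    temperature: 0.2,  # lower randomness for this request
    max_tokens: 300,   # cap the response length
    seed: 42           # reproducible output where the provider supports it
  }
)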

Best Practices

  1. Use Global Config for Defaults: Set organization-wide defaults in config/active_agent.yml
  2. Agent-Level for Specific Needs: Override in generate_with for agent-specific requirements
  3. Runtime for Dynamic Adjustments: Use runtime options for user preferences or conditional logic

For a complete example showing all three levels working together, see the configuration precedence test shown earlier in this section.


Embeddings Support

Generation providers support creating text embeddings for semantic search, clustering, and similarity matching. Embeddings transform text into numerical vectors that capture semantic meaning.

Generating Embeddings Synchronously

Use embed_now to generate embeddings immediately:

ruby
test "generates embeddings synchronously with embed_now" do
  VCR.use_cassette("embedding_agent_sync") do
    # Create a generation for embedding
    generation = ApplicationAgent.with(
      message: "The quick brown fox jumps over the lazy dog"
    ).prompt_context

    # Generate embedding synchronously
    response = generation.embed_now

    # Extract embedding vector
    embedding_vector = response.message.content

    assert_kind_of Array, embedding_vector
    assert embedding_vector.all? { |v| v.is_a?(Float) }
    assert_includes [ 1536, 3072 ], embedding_vector.size  # OpenAI dimensions vary by model

    # Document the example
    doc_example_output(response)

    embedding_vector
  end
end

Asynchronous Embedding Generation

Use embed_later for background processing of embeddings:

ruby
test "generates embeddings asynchronously with embed_later" do
  # Create a generation for async embedding
  generation = ApplicationAgent.with(
    message: "Artificial intelligence is transforming technology"
  ).prompt_context

  # Mock the enqueue_generation private method
  generation.instance_eval do
    def enqueue_generation(method, options = {})
      @enqueue_called = true
      @enqueue_method = method
      @enqueue_options = options
      true
    end

    def enqueue_called?
      @enqueue_called
    end

    def enqueue_method
      @enqueue_method
    end

    def enqueue_options
      @enqueue_options
    end
  end

  # Queue embedding for background processing
  result = generation.embed_later(
    priority: :low,
    queue: :embeddings
  )

  assert result
  assert generation.enqueue_called?
  assert_equal :embed_now, generation.enqueue_method
  assert_equal({ priority: :low, queue: :embeddings }, generation.enqueue_options)
end

Embedding Callbacks

Process embeddings with before and after callbacks:

ruby
test "processes embeddings with callbacks" do
  VCR.use_cassette("embedding_agent_callbacks") do
    # Create a custom agent with embedding callbacks
    custom_agent_class = Class.new(ApplicationAgent) do
      attr_accessor :before_embedding_called, :after_embedding_called

      before_embedding :track_before
      after_embedding :track_after

      def track_before
        self.before_embedding_called = true
      end

      def track_after
        self.after_embedding_called = true
      end
    end

    # Generate embedding with callbacks
    generation = custom_agent_class.with(
      message: "Testing embedding callbacks"
    ).prompt_context

    agent = generation.send(:processed_agent)
    response = generation.embed_now

    assert agent.before_embedding_called
    assert agent.after_embedding_called
    assert_not_nil response.message.content

    doc_example_output(response)
  end
end
Similarity Search

Use embeddings to find semantically similar content:

ruby
test "performs similarity search with embeddings" do
  VCR.use_cassette("embedding_similarity_search") do
    documents = [
      "The cat sat on the mat",
      "Dogs are loyal companions",
      "Machine learning is a subset of AI",
      "The feline rested on the rug"
    ]

    # Generate embeddings for all documents
    embeddings = documents.map do |doc|
      generation = ApplicationAgent.with(message: doc).prompt_context
      generation.embed_now.message.content
    end

    # Query embedding
    query = "cat on mat"
    query_generation = ApplicationAgent.with(message: query).prompt_context
    query_embedding = query_generation.embed_now.message.content

    # Calculate cosine similarities
    similarities = embeddings.map.with_index do |embedding, index|
      similarity = cosine_similarity(query_embedding, embedding)
      { document: documents[index], similarity: similarity }
    end

    # Sort by similarity
    results = similarities.sort_by { |s| -s[:similarity] }

    # Most similar should be the cat/mat documents
    assert_equal "The cat sat on the mat", results.first[:document]
    assert results.first[:similarity] > 0.5, "Similarity should be > 0.5, got #{results.first[:similarity]}"

    # Document the results
    doc_example_output(results.first(2))
  end
end
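
The test above relies on a cosine_similarity helper that isn't shown; a minimal pure-Ruby implementation might look like this:

ruby
# Cosine similarity between two equal-length numeric vectors
def cosine_similarity(a, b)
  dot       = a.zip(b).sum { |x, y| x * y }
  magnitude = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  magnitude.zero? ? 0.0 : dot / magnitude
end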

Provider-Specific Embedding Models

Different providers offer various embedding models:

  • OpenAI: text-embedding-3-large, text-embedding-3-small, text-embedding-ada-002
  • Ollama: nomic-embed-text, mxbai-embed-large, all-minilm
  • Anthropic: Does not natively support embeddings (use a dedicated embedding provider)

Configuration

Configure embedding models in your agent:

ruby
class EmbeddingAgent < ApplicationAgent
  generate_with :openai,
    model: "gpt-4",  # For text generation
    embedding_model: "text-embedding-3-large"  # For embeddings
end
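
The same pattern should extend to other providers; for example, a local Ollama setup might pair a chat model with a local embedding model (applying the embedding_model option to Ollama is an assumption based on the OpenAI example above):

ruby
class LocalEmbeddingAgent < ApplicationAgent
  generate_with :ollama,
    model: "gemma3:latest",              # for text generation
    embedding_model: "nomic-embed-text"  # assumed to work like the OpenAI option above
end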

Or in your configuration file:

yaml
development:
  openai:
    model: gpt-4
    embedding_model: text-embedding-3-large
    dimensions: 256  # Optional: reduce embedding dimensions
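
Either way, embeddings are generated through the same API shown earlier. A usage sketch with the EmbeddingAgent class defined above:

ruby
response = EmbeddingAgent.with(
  message: "Vector search relies on semantic similarity"
).prompt_context.embed_now

vector = response.message.content  # => Array of Floats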

For more details on embeddings, see the Embeddings Guide.

Provider-Specific Documentation

For detailed documentation on specific providers and their features: