OpenAI Provider

The OpenAI provider enables integration with OpenAI's GPT models, including the GPT-4o family, GPT-4 Turbo, and GPT-3.5 Turbo, as well as Responses API models such as GPT-5 and the o-series reasoning models. It supports advanced features like function calling, streaming responses, and structured outputs.

Configuration

Basic Setup

Configure OpenAI in your agent:

ruby
class OpenAIAgent < ApplicationAgent
  layout "agent"
  generate_with :openai, model: "gpt-4o-mini", instructions: "You're a basic OpenAI agent."
end
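
You can then invoke the agent with the standard generation flow; a minimal sketch using the prompt_context action that also appears in the testing examples below:

ruby
response = OpenAIAgent.with(message: "Hello!").prompt_context.generate_now
response.message.content  # => the assistant's reply text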

Configuration File

Set up OpenAI credentials in config/active_agent.yml:

yaml
# Base provider settings, shared via a YAML anchor
openai: &openai
  service: "OpenAI"
  access_token: <%= Rails.application.credentials.dig(:openai, :access_token) %>

# Environment-specific overrides
development:
  openai:
    <<: *openai
    model: "gpt-4o-mini"
    temperature: 0.7

Environment Variables

Alternatively, use environment variables:

bash
OPENAI_ACCESS_TOKEN=your-api-key
OPENAI_ORGANIZATION_ID=your-org-id  # Optional
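
The configuration file can read these through ERB instead of Rails credentials; a sketch, assuming the same active_agent.yml layout as above:

yaml
openai:
  service: "OpenAI"
  access_token: <%= ENV["OPENAI_ACCESS_TOKEN"] %>
  organization_id: <%= ENV["OPENAI_ORGANIZATION_ID"] %>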

Supported Models

Chat Completions API Models

  • GPT-4o - Most capable model with vision capabilities
  • GPT-4o-mini - Smaller, faster version of GPT-4o
  • GPT-4o-search-preview - GPT-4o with built-in web search
  • GPT-4o-mini-search-preview - GPT-4o-mini with built-in web search
  • GPT-4 Turbo - GPT-4 with a 128k context window
  • GPT-4 - Original GPT-4 model
  • GPT-3.5 Turbo - Fast and cost-effective

Responses API Models

  • GPT-5 - Advanced model with support for all built-in tools
  • GPT-4.1 - Enhanced GPT-4 with tool support
  • GPT-4.1-mini - Efficient version with tool support
  • o3 - Reasoning model with advanced capabilities
  • o4-mini - Compact reasoning model

Note: Built-in tools like MCP and image generation require the Responses API and compatible models.

Features

Function Calling

OpenAI supports native function calling with automatic tool execution:

ruby
class DataAnalysisAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"
  
  def analyze_data
    @data = params[:data]
    prompt  # Will include all public methods as available tools
  end
  
  def calculate_average(numbers:)
    numbers.sum.to_f / numbers.size
  end
  
  def fetch_external_data(endpoint:)
    # Tool that OpenAI can call
    HTTParty.get(endpoint)
  end
end
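
Invoking the action follows the usual pattern; a sketch, assuming tool results are fed back to the model automatically as described above:

ruby
# The model may call calculate_average or fetch_external_data as tools.
response = DataAnalysisAgent.with(data: [1, 2, 3]).analyze_data.generate_now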

Streaming Responses

Enable real-time streaming for better user experience:

ruby
class StreamingAgent < ApplicationAgent
  generate_with :openai, stream: true
  
  on_message_chunk do |chunk|
    # Handle streaming chunks
    broadcast_to_user(chunk)
  end
  
  def chat
    prompt(message: params[:message])
  end
end
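
broadcast_to_user is not provided by ActiveAgent; one possible implementation, assuming Turbo Streams is available (stream and target names are hypothetical):

ruby
# Hypothetical helper: append each chunk to a per-user Turbo Stream.
def broadcast_to_user(chunk)
  Turbo::StreamsChannel.broadcast_append_to(
    "chat_#{params[:user_id]}",  # assumed stream name
    target: "messages",          # assumed DOM target
    html: chunk
  )
end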

Vision Capabilities

GPT-4o models support image analysis:

ruby
class VisionAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"
  
  def analyze_image
    @image_url = params[:image_url]
    prompt content_type: :text
  end
end

# In your view (analyze_image.text.erb):
# Analyze this image: <%= @image_url %>
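
Invoking it with an image URL (a sketch; the URL is illustrative):

ruby
response = VisionAgent.with(
  image_url: "https://example.com/photo.jpg"  # illustrative URL
).analyze_image.generate_now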

Structured Output

OpenAI provides native structured output support, ensuring responses conform to specified JSON schemas. Supported models are listed below.

Supported Models

Models with full structured output support:

  • GPT-4o - Vision + structured output
  • GPT-4o-mini - Vision + structured output
  • GPT-4-turbo - Structured output only (no vision)
  • GPT-3.5-turbo - Structured output only

Basic Usage

Enable JSON mode with a schema:

ruby
class StructuredAgent < ApplicationAgent
  generate_with :openai, 
    model: "gpt-4o",
    response_format: { type: "json_object" }
  
  def extract_entities
    @text = params[:text]
    prompt(
      output_schema: :entity_extraction,
      instructions: "Extract entities and return as JSON"
    )
  end
end
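
Calling the action then returns parsed JSON, as described under Response Handling below; a sketch, assuming an entity_extraction schema is defined for the agent:

ruby
response = StructuredAgent.with(
  text: "Tim Cook announced new products at Apple Park."
).extract_entities.generate_now
response.message.content  # => Hash parsed from the model's JSON output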

With Schema Generator

Use ActiveAgent's schema generator for automatic schema creation:

ruby
# frozen_string_literal: true

require "test_helper"
require "active_agent/schema_generator"

class StructuredOutputJsonParsingTest < ActiveSupport::TestCase
  class DataExtractionAgent < ApplicationAgent
    generate_with :openai

    def extract_user_data
      prompt(
        message: params[:message] || "Extract the following user data from this text: John Doe is 30 years old and his email is john@example.com",
        output_schema: params[:output_schema]
      )
    end

    def extract_with_model_schema
      prompt(
        message: "Extract user information from: Jane Smith, age 25, contact: jane.smith@email.com",
        output_schema: params[:output_schema]
      )
    end

    def extract_with_active_record_schema
      prompt(
        message: "Extract user data from: Alice Johnson, 28 years old, email: alice@example.com, bio: Software engineer",
        output_schema: params[:output_schema]
      )
    end
  end

  test "structured output sets content_type to application/json and auto-parses JSON" do
    VCR.use_cassette("structured_output_json_parsing") do
      # Create a test model class with schema generator
      test_user_model = Class.new do
        include ActiveModel::Model
        include ActiveModel::Attributes
        include ActiveModel::Validations
        include ActiveAgent::SchemaGenerator

        attribute :name, :string
        attribute :age, :integer
        attribute :email, :string

        validates :name, presence: true
        validates :age, presence: true, numericality: { greater_than: 0 }
        validates :email, presence: true, format: { with: URI::MailTo::EMAIL_REGEXP }
      end

      # Generate schema from the model using the schema generator
      schema = test_user_model.to_json_schema(strict: true, name: "user_data")

      # Generate with structured output using the .with pattern
      response = DataExtractionAgent.with(output_schema: schema).extract_user_data.generate_now

      # Verify content_type is set to application/json
      assert_equal "application/json", response.message.content_type

      # Verify content is automatically parsed as JSON
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("age")

      # Verify raw content is still available as string
      assert response.message.raw_content.is_a?(String)

      doc_example_output(response)
    end
  end

  test "integration with ActiveModel schema generator for structured output" do
    VCR.use_cassette("structured_output_with_model_schema") do
      # Create an ActiveModel class for testing
      test_model = Class.new do
        include ActiveModel::Model
        include ActiveModel::Attributes
        include ActiveAgent::SchemaGenerator

        attribute :name, :string
        attribute :age, :integer
        attribute :email, :string
      end

      # Generate schema from ActiveModel
      schema = test_model.to_json_schema(strict: true, name: "user_data")

      # Generate response using model-generated schema
      response = DataExtractionAgent.with(output_schema: schema).extract_with_model_schema.generate_now

      # Verify content_type
      assert_equal "application/json", response.message.content_type

      # Verify JSON was automatically parsed
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("age")
      assert response.message.content.key?("email")

      # Verify values make sense
      assert_equal "Jane Smith", response.message.content["name"]
      assert_equal 25, response.message.content["age"]
      assert response.message.content["email"].include?("@")

      doc_example_output(response)
    end
  end

  test "integration with ActiveRecord schema generator for structured output" do
    VCR.use_cassette("structured_output_with_active_record_schema") do
      # Use the existing User model from test/dummy
      require_relative "../dummy/app/models/user"

      # Generate schema from ActiveRecord model
      schema = User.to_json_schema(strict: true, name: "user_data")

      # Generate response using ActiveRecord-generated schema
      response = DataExtractionAgent.with(output_schema: schema).extract_with_active_record_schema.generate_now

      # Verify content_type
      assert_equal "application/json", response.message.content_type

      # Verify JSON was automatically parsed
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("email")
      assert response.message.content.key?("age")

      # Verify the data makes sense
      assert response.message.content["name"].is_a?(String)
      assert response.message.content["age"].is_a?(Integer)
      assert response.message.content["email"].include?("@")

      doc_example_output(response)
    end
  end

  test "without structured output uses text/plain content_type" do
    VCR.use_cassette("plain_text_response") do
      # Generate without structured output (no output_schema)
      response = DataExtractionAgent.with(message: "What is the capital of France?").prompt_context.generate_now

      # Verify content_type is plain text
      assert_equal "text/plain", response.message.content_type

      # Content should not be parsed as JSON
      assert response.message.content.is_a?(String)
      assert response.message.content.downcase.include?("paris")

      doc_example_output(response)
    end
  end

  test "handles invalid JSON gracefully" do
    # This test ensures that if for some reason the provider returns invalid JSON
    # with application/json content_type, we handle it gracefully

    # Create a message with invalid JSON but JSON content_type
    message = ActiveAgent::ActionPrompt::Message.new(
      content: "{invalid json}",
      content_type: "application/json",
      role: :assistant
    )

    # Should return the raw string since parsing failed
    assert_equal "{invalid json}", message.content
    assert_equal "{invalid json}", message.raw_content
  end
end

Strict Mode

OpenAI supports strict schema validation to guarantee output format:

ruby
schema = {
  name: "user_data",
  strict: true,
  schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      age: { type: "integer" },
      email: { type: "string", format: "email" }
    },
    required: ["name", "age", "email"],
    additionalProperties: false
  }
}

response = DataExtractionAgent.with(
  message: "Extract user information",
  output_schema: schema
).extract_user_data.generate_now

Response Handling

Structured output responses are automatically parsed:

ruby
response = DataExtractionAgent.with(
  message: "Extract data from: John Doe, 30, john@example.com",
  output_schema: schema
).extract_user_data.generate_now

# Automatic JSON parsing
response.message.content_type # => "application/json"
response.message.content # => {"name" => "John Doe", "age" => 30, "email" => "john@example.com"}
response.message.raw_content # => '{"name":"John Doe","age":30,"email":"john@example.com"}'

Best Practices

  1. Use strict mode for production applications requiring guaranteed format
  2. Leverage model schemas from ActiveRecord/ActiveModel for consistency
  3. Test with VCR to ensure schemas work with actual API responses
  4. Handle edge cases like empty or invalid inputs gracefully

Limitations

  • Maximum schema complexity varies by model
  • Very large schemas may impact token limits
  • Not all JSON Schema features are supported (check OpenAI docs for specifics)

See the Structured Output guide for comprehensive documentation and examples.

Built-in Tools (Responses API)

OpenAI's Responses API provides powerful built-in tools for web search, image generation, and MCP (Model Context Protocol) integration.

Web Search

Enable web search capabilities using the web_search_preview tool:

ruby
# Example agent demonstrating web search capabilities
# Works with both Chat Completions API and Responses API
class WebSearchAgent < ApplicationAgent
  # For Chat API, use the search-preview models
  # For Responses API, use regular models with web_search_preview tool
  generate_with :openai, model: "gpt-4o"

  # Action for searching current events using Chat API with web search model
  def search_current_events
    @query = params[:query]
    @location = params[:location]

    # When using gpt-4o-search-preview model, web search is automatic
    prompt(
      message: @query,
      options: chat_api_search_options
    )
  end

  # Action for searching with Responses API (more flexible)
  def search_with_tools
    @query = params[:query]
    @context_size = params[:context_size] || "medium"

    prompt(
      message: @query,
      options: {
        use_responses_api: true,  # Force Responses API
        tools: [
          {
            type: "web_search_preview",
            search_context_size: @context_size
          }
        ]
      }
    )
  end

  # Action that combines web search with image generation (Responses API only)
  def research_and_visualize
    @topic = params[:topic]

    prompt(
      message: "Research #{@topic} and create a visualization",
      options: {
        model: "gpt-5",  # Responses API model
        use_responses_api: true,
        tools: [
          { type: "web_search_preview", search_context_size: "high" },
          { type: "image_generation", size: "1024x1024", quality: "high" }
        ]
      }
    )
  end

  private

  def chat_api_search_options
    options = {
      model: "gpt-4o-search-preview"  # Special model for Chat API web search
    }

    # Add web_search_options for Chat API
    if @location
      options[:web_search] = {
        user_location: format_location(@location)
      }
    else
      options[:web_search] = {}  # Enable web search with defaults
    end

    options
  end

  def format_location(location)
    # Format location for API
    {
      country: location[:country] || "US",
      city: location[:city],
      region: location[:region],
      timezone: location[:timezone]
    }.compact
  end
end
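
A sketch of invoking the Responses API search action above:

ruby
response = WebSearchAgent.with(
  query: "latest Ruby release",
  context_size: "high"
).search_with_tools.generate_now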

For Chat Completions API with specific models, use web_search_options:

ruby
# Excerpt from WebSearchAgent above: the Chat API path selects a
# search-preview model and passes web search options alongside it.
def chat_api_search_options
  options = {
    model: "gpt-4o-search-preview"  # Special model for Chat API web search
  }

  # Add web_search_options for Chat API
  if @location
    options[:web_search] = {
      user_location: format_location(@location)
    }
  else
    options[:web_search] = {}  # Enable web search with defaults
  end

  options
end

Image Generation

Generate and edit images using the image_generation tool:

ruby
# Example agent demonstrating multimodal capabilities with built-in tools
# This agent uses the Responses API to access advanced tools
class MultimodalAgent < ApplicationAgent
  # Use default temperature for Responses API compatibility
  generate_with :openai, model: "gpt-4o", temperature: nil

  # Generate an image based on a description
  def create_image
    @description = params[:description]
    @size = params[:size] || "1024x1024"
    @quality = params[:quality] || "high"

    prompt(
      message: "Generate an image: #{@description}",
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "image_generation",
            size: @size,
            quality: @quality,
            format: "png"
          }
        ]
      }
    )
  end

  # Research a topic and create an infographic
  def create_infographic
    @topic = params[:topic]
    @style = params[:style] || "modern"

    prompt(
      message: build_infographic_prompt,
      options: {
        use_responses_api: true,
        tools: [
          { type: "web_search_preview", search_context_size: "high" },
          {
            type: "image_generation",
            size: "1024x1536",  # Tall format for infographic
            quality: "high",
            background: "opaque"
          }
        ]
      }
    )
  end

  # Analyze an image and search for related information
  def analyze_and_research
    @image_data = params[:image_data]  # Base64 encoded image
    @question = params[:question]

    prompt(
      message: @question,
      image_data: @image_data,
      options: {
        use_responses_api: true,
        tools: [
          { type: "web_search_preview" }
        ]
      }
    )
  end

  # Edit an existing image with AI
  def edit_image
    @original_image = params[:original_image]
    @instructions = params[:instructions]

    prompt(
      message: @instructions,
      image_data: @original_image,
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "image_generation",
            partial_images: 2  # Show progress during generation
          }
        ]
      }
    )
  end

  private

  def build_infographic_prompt
    <<~PROMPT
      Create a #{@style} infographic about #{@topic}.

      First, research the topic to gather accurate, up-to-date information.
      Then generate a visually appealing infographic that includes:
      - Key statistics and facts
      - Clear visual hierarchy
      - #{@style} design aesthetic
      - Easy-to-read layout

      Make it informative and visually engaging.
    PROMPT
  end
end
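
A sketch of invoking image generation with this agent:

ruby
response = MultimodalAgent.with(
  description: "a lighthouse at dawn",
  size: "1024x1024"
).create_image.generate_now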

MCP (Model Context Protocol) Integration

Connect to external services and MCP servers:

ruby
# Example agent demonstrating MCP (Model Context Protocol) integration
# MCP allows connecting to external services and tools
class McpIntegrationAgent < ApplicationAgent
  generate_with :openai, model: "gpt-5"  # Responses API required for MCP

  # Use MCP connectors for cloud storage services
  def search_cloud_storage
    @query = params[:query]
    @service = params[:service] || "dropbox"
    @auth_token = params[:auth_token]

    prompt(
      message: "Search for: #{@query}",
      options: {
        use_responses_api: true,
        tools: [ build_connector_tool(@service, @auth_token) ]
      }
    )
  end

  # Use custom MCP server for specialized functionality
  def use_custom_mcp
    @query = params[:query]
    @server_url = params[:server_url]
    @allowed_tools = params[:allowed_tools]

    prompt(
      message: @query,
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "mcp",
            server_label: "Custom MCP Server",
            server_url: @server_url,
            server_description: "Custom MCP server for specialized tasks",
            require_approval: "always",  # Require approval for safety
            allowed_tools: @allowed_tools
          }
        ]
      }
    )
  end

  # Combine multiple MCP servers for comprehensive search
  def multi_source_search
    @query = params[:query]
    @sources = params[:sources] || [ "github", "dropbox" ]
    @auth_tokens = params[:auth_tokens] || {}

    tools = @sources.map do |source|
      case source
      when "github"
        {
          type: "mcp",
          server_label: "GitHub",
          server_url: "https://api.githubcopilot.com/mcp/",
          server_description: "Search GitHub repositories",
          require_approval: "never"
        }
      when "dropbox"
        build_connector_tool("dropbox", @auth_tokens["dropbox"])
      when "google_drive"
        build_connector_tool("google_drive", @auth_tokens["google_drive"])
      end
    end.compact

    prompt(
      message: "Search across multiple sources: #{@query}",
      options: {
        use_responses_api: true,
        tools: tools
      }
    )
  end

  # Use MCP with approval workflow
  def sensitive_operation
    @operation = params[:operation]
    @mcp_config = params[:mcp_config]

    prompt(
      message: "Perform operation: #{@operation}",
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "mcp",
            server_label: @mcp_config[:label],
            server_url: @mcp_config[:url],
            authorization: @mcp_config[:auth],
            require_approval: {
              never: {
                tool_names: [ "read", "search" ]  # Safe operations
              }
            }
            # All other operations will require approval
          }
        ]
      }
    )
  end

  private

  def build_connector_tool(service, auth_token)
    connector_configs = {
      "dropbox" => {
        connector_id: "connector_dropbox",
        label: "Dropbox"
      },
      "google_drive" => {
        connector_id: "connector_googledrive",
        label: "Google Drive"
      },
      "gmail" => {
        connector_id: "connector_gmail",
        label: "Gmail"
      },
      "sharepoint" => {
        connector_id: "connector_sharepoint",
        label: "SharePoint"
      },
      "outlook" => {
        connector_id: "connector_outlookemail",
        label: "Outlook Email"
      }
    }

    config = connector_configs[service]
    return nil unless config && auth_token

    {
      type: "mcp",
      server_label: config[:label],
      connector_id: config[:connector_id],
      authorization: auth_token,
      require_approval: "never"  # Or configure based on your needs
    }
  end
end

Connect to custom MCP servers:

ruby
# Excerpt from McpIntegrationAgent above: pointing the mcp tool at a
# custom server URL with an explicit approval policy and tool allowlist.
def use_custom_mcp
  @query = params[:query]
  @server_url = params[:server_url]
  @allowed_tools = params[:allowed_tools]

  prompt(
    message: @query,
    options: {
      use_responses_api: true,
      tools: [
        {
          type: "mcp",
          server_label: "Custom MCP Server",
          server_url: @server_url,
          server_description: "Custom MCP server for specialized tasks",
          require_approval: "always",  # Require approval for safety
          allowed_tools: @allowed_tools
        }
      ]
    }
  )
end

Available MCP Connectors:

  • Dropbox - connector_dropbox
  • Gmail - connector_gmail
  • Google Calendar - connector_googlecalendar
  • Google Drive - connector_googledrive
  • Microsoft Teams - connector_microsoftteams
  • Outlook Calendar - connector_outlookcalendar
  • Outlook Email - connector_outlookemail
  • SharePoint - connector_sharepoint
  • GitHub - Use server URL: https://api.githubcopilot.com/mcp/

Combining Multiple Tools

Use multiple built-in tools together:

ruby
# Excerpt from MultimodalAgent above: combining web search and image
# generation in a single Responses API request.
def create_infographic
  @topic = params[:topic]
  @style = params[:style] || "modern"

  prompt(
    message: build_infographic_prompt,
    options: {
      use_responses_api: true,
      tools: [
        { type: "web_search_preview", search_context_size: "high" },
        {
          type: "image_generation",
          size: "1024x1536",  # Tall format for infographic
          quality: "high",
          background: "opaque"
        }
      ]
    }
  )
end

Using Concerns for Shared Tools

Create reusable tool configurations with concerns:

ruby
# Concern that provides research-related tools that work with both
# OpenAI Responses API (built-in tools) and Chat Completions API (function calling)
module ResearchTools
  extend ActiveSupport::Concern

  included do
    # Class-level configuration for built-in tools
    class_attribute :research_tools_config, default: {}
  end

  # Action methods that become function tools in Chat API
  # These are standard ActiveAgent actions that get converted to tool schemas

  def search_academic_papers
    @query = params[:query]
    @year_from = params[:year_from]
    @year_to = params[:year_to]
    @field = params[:field]

    prompt(
      message: build_academic_search_query,
      # For Responses API - add web search as built-in tool
      tools: responses_api? ? [ { type: "web_search_preview", search_context_size: "high" } ] : nil
    )
  end

  def analyze_research_data
    @data = params[:data]
    @analysis_type = params[:analysis_type]

    prompt(
      message: "Analyze the following research data:\n#{@data}\nAnalysis type: #{@analysis_type}",
      content_type: :json
    )
  end

  def generate_research_visualization
    @data = params[:data]
    @chart_type = params[:chart_type] || "bar"
    @title = params[:title]

    prompt(
      message: "Create a #{@chart_type} chart visualization for: #{@title}\nData: #{@data}",
      # For Responses API - add image generation as built-in tool
      tools: responses_api? ? [
        {
          type: "image_generation",
          size: "1024x1024",
          quality: "high"
        }
      ] : nil
    )
  end

  def search_with_mcp_sources
    @query = params[:query]
    @sources = params[:sources] || []

    # Build MCP tools configuration based on requested sources
    mcp_tools = build_mcp_tools(@sources)

    prompt(
      message: "Research query: #{@query}",
      tools: responses_api? ? mcp_tools : nil
    )
  end

  private

  def build_academic_search_query
    query_parts = [ "Academic papers search: #{@query}" ]
    query_parts << "Published between #{@year_from} and #{@year_to}" if @year_from && @year_to
    query_parts << "Field: #{@field}" if @field
    query_parts << "Include citations and abstracts"
    query_parts.join("\n")
  end

  def build_mcp_tools(sources)
    tools = []

    sources.each do |source|
      case source
      when "arxiv"
        tools << {
          type: "mcp",
          server_label: "ArXiv Papers",
          server_url: "https://arxiv-mcp.example.com/sse",
          server_description: "Search and retrieve academic papers from ArXiv",
          require_approval: "never",
          allowed_tools: [ "search_papers", "get_paper", "get_citations" ]
        }
      when "pubmed"
        tools << {
          type: "mcp",
          server_label: "PubMed",
          server_url: "https://pubmed-mcp.example.com/sse",
          server_description: "Search medical and life science literature",
          require_approval: "never"
        }
      when "github"
        tools << {
          type: "mcp",
          server_label: "GitHub Research",
          server_url: "https://api.githubcopilot.com/mcp/",
          server_description: "Search code repositories and documentation",
          require_approval: "never"
        }
      end
    end

    tools
  end

  def responses_api?
    # Check if we're using the Responses API
    # For now, we'll check if the model or options indicate Responses API usage
    false # This would be determined by the actual provider configuration
  end

  class_methods do
    # Class method to configure research tools for the agent
    def configure_research_tools(**options)
      self.research_tools_config = research_tools_config.merge(options)
    end
  end
end

Use the concern in your agents:

ruby
class ResearchAgent < ApplicationAgent
  include ResearchTools

  # Configure the agent to use OpenAI with specific settings
  generate_with :openai, model: "gpt-4o"

  # Configure research tools at the class level
  configure_research_tools(
    enable_web_search: true,
    mcp_servers: [ "arxiv", "github" ],
    default_search_context: "high"
  )

  # Agent-specific action that uses both concern tools and custom logic
  def comprehensive_research
    @topic = params[:topic]
    @depth = params[:depth] || "detailed"

    # This action combines multiple tools
    prompt(
      message: "Conduct comprehensive research on: #{@topic}",
      tools: build_comprehensive_tools
    )
  end

  def literature_review
    @topic = params[:topic]
    @sources = params[:sources] || [ "arxiv", "pubmed" ]

    # Use the concern's search_with_mcp_sources internally
    mcp_tools = build_mcp_tools(@sources)

    prompt(
      message: "Conduct a literature review on: #{@topic}\nFocus on peer-reviewed sources from the last 5 years.",
      tools: [
        { type: "web_search_preview", search_context_size: "high" },
        *mcp_tools
      ]
    )
  end

  private

  def build_comprehensive_tools
    tools = []

    # Add web search for general information
    tools << {
      type: "web_search_preview",
      search_context_size: @depth == "detailed" ? "high" : "medium"
    }

    # Add MCP servers from configuration
    if research_tools_config[:mcp_servers]
      tools.concat(build_mcp_tools(research_tools_config[:mcp_servers]))
    end

    # Add image generation for visualizations
    if @depth == "detailed"
      tools << {
        type: "image_generation",
        size: "1024x1024",
        quality: "high"
      }
    end

    tools
  end
end
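
A sketch of invoking the combined research action:

ruby
response = ResearchAgent.with(
  topic: "quantum computing",
  depth: "detailed"
).comprehensive_research.generate_now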

Tool Configuration Example

Here's how built-in tools are configured in the prompt options:

ruby
test "tool configuration in prompt options" do
  # Example showing how to configure built-in tools
  tools_config = [
    {
      type: "web_search_preview",
      search_context_size: "high",
      user_location: {
        country: "US",
        city: "San Francisco"
      }
    },
    {
      type: "image_generation",
      size: "1024x1024",
      quality: "high",
      format: "png"
    },
    {
      type: "mcp",
      server_label: "GitHub",
      server_url: "https://api.githubcopilot.com/mcp/",
      require_approval: "never"
    }
  ]

  # Show how the options would be passed to prompt
  example_options = {
    use_responses_api: true,
    model: "gpt-5",
    tools: tools_config
  }

  # Verify the configuration structure
  assert example_options[:tools].is_a?(Array)
  assert_equal 3, example_options[:tools].length
  assert_equal "web_search_preview", example_options[:tools][0][:type]
  assert_equal "image_generation", example_options[:tools][1][:type]
  assert_equal "mcp", example_options[:tools][2][:type]

  doc_example_output({
    description: "Example configuration for built-in tools in prompt options",
    options: example_options,
    tools_configured: tools_config
  })
end

Configuration Output

activeagent/test/agents/builtin_tools_doc_test.rb:108

json
{
  "description": "Example configuration for built-in tools in prompt options",
  "options": {
    "use_responses_api": true,
    "model": "gpt-5",
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "high",
        "user_location": {
          "country": "US",
          "city": "San Francisco"
        }
      },
      {
        "type": "image_generation",
        "size": "1024x1024",
        "quality": "high",
        "format": "png"
      },
      {
        "type": "mcp",
        "server_label": "GitHub",
        "server_url": "https://api.githubcopilot.com/mcp/",
        "require_approval": "never"
      }
    ]
  },
  "tools_configured": [
    {
      "type": "web_search_preview",
      "search_context_size": "high",
      "user_location": {
        "country": "US",
        "city": "San Francisco"
      }
    },
    {
      "type": "image_generation",
      "size": "1024x1024",
      "quality": "high",
      "format": "png"
    },
    {
      "type": "mcp",
      "server_label": "GitHub",
      "server_url": "https://api.githubcopilot.com/mcp/",
      "require_approval": "never"
    }
  ]
}

Embeddings

Generate high-quality text embeddings using OpenAI's embedding models. See the Embeddings Framework Documentation for comprehensive coverage.

Basic Embedding Generation

ruby
test "uses configured OpenAI embedding model" do
  VCR.use_cassette("embedding_openai_model") do
    # Create agent with specific OpenAI model configuration
    custom_agent_class = Class.new(ApplicationAgent) do
      generate_with :openai,
        model: "gpt-4o",
        embedding_model: "text-embedding-3-small"
    end

    generation = custom_agent_class.with(
      message: "Testing OpenAI embedding model configuration"
    ).prompt_context

    response = generation.embed_now
    embedding = response.message.content

    # text-embedding-3-small can have different dimensions depending on truncation
    assert_includes [ 1536, 3072 ], embedding.size
    assert embedding.all? { |v| v.is_a?(Float) }

    doc_example_output({
      model: "text-embedding-3-small",
      dimensions: embedding.size,
      sample: embedding[0..2]
    })
  end
end

Available Embedding Models

  • text-embedding-3-large - Highest quality (3072 dimensions, configurable down to 256)
  • text-embedding-3-small - Balanced performance (1536 dimensions, configurable)
  • text-embedding-ada-002 - Legacy model (1536 dimensions, fixed)

For detailed model comparisons and benchmarks, see OpenAI's Embeddings Documentation.

Similarity Search Example

ruby
test "performs similarity search with embeddings" do
  VCR.use_cassette("embedding_similarity_search") do
    documents = [
      "The cat sat on the mat",
      "Dogs are loyal companions",
      "Machine learning is a subset of AI",
      "The feline rested on the rug"
    ]

    # Generate embeddings for all documents
    embeddings = documents.map do |doc|
      generation = ApplicationAgent.with(message: doc).prompt_context
      generation.embed_now.message.content
    end

    # Query embedding
    query = "cat on mat"
    query_generation = ApplicationAgent.with(message: query).prompt_context
    query_embedding = query_generation.embed_now.message.content

    # Calculate cosine similarities
    similarities = embeddings.map.with_index do |embedding, index|
      similarity = cosine_similarity(query_embedding, embedding)
      { document: documents[index], similarity: similarity }
    end

    # Sort by similarity
    results = similarities.sort_by { |s| -s[:similarity] }

    # Most similar should be the cat/mat documents
    assert_equal "The cat sat on the mat", results.first[:document]
    assert results.first[:similarity] > 0.5, "Similarity should be > 0.5, got #{results.first[:similarity]}"

    # Document the results
    doc_example_output(results.first(2))
  end
end
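
The test above assumes a cosine_similarity helper, which is not part of ActiveAgent; a minimal sketch:

ruby
# Minimal cosine similarity between two equal-length Float arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end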

For more advanced embedding patterns, see the Embeddings Documentation.

Dimension Configuration

OpenAI's text-embedding-3 models support configurable dimensions:

ruby
test "verifies embedding dimensions for different models" do
  VCR.use_cassette("embedding_dimensions") do
    # Test with default model (usually text-embedding-3-small or ada-002)
    generation = ApplicationAgent.with(
      message: "Testing embedding dimensions"
    ).prompt_context

    response = generation.embed_now
    embedding = response.message.content

    # Most OpenAI models return 1536 dimensions by default
    assert_includes [ 1536, 3072 ], embedding.size

    doc_example_output({
      model: "default",
      dimensions: embedding.size,
      sample: embedding[0..4]
    })
  end
end

Dimension Reduction

OpenAI's text-embedding-3-large and text-embedding-3-small models support native dimension reduction by specifying a dimensions parameter. This can significantly reduce storage costs while maintaining good performance.
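
A sketch of configuring reduced dimensions, using the dimensions option listed under Provider-Specific Parameters below:

ruby
class CompactEmbeddingAgent < ApplicationAgent
  generate_with :openai,
    embedding_model: "text-embedding-3-small",
    dimensions: 512  # native reduction on the 3-large/3-small models
end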

Batch Processing

Efficiently process multiple embeddings:

ruby
test "processes multiple embeddings in batch" do
  VCR.use_cassette("embedding_batch_processing") do
    texts = [
      "First document for embedding",
      "Second document with different content",
      "Third document about technology"
    ]

    embeddings = []
    texts.each do |text|
      generation = ApplicationAgent.with(message: text).prompt_context
      embedding = generation.embed_now.message.content
      embeddings << {
        text: text[0..20] + "...",
        dimensions: embedding.size,
        sample: embedding[0..2]
      }
    end

    assert_equal 3, embeddings.size
    embeddings.each do |result|
      assert result[:dimensions] > 0
      assert result[:sample].all? { |v| v.is_a?(Float) }
    end

    doc_example_output(embeddings)
  end
end

Cost Optimization for Embeddings

Choose the right model based on your needs:

Model                  | Dimensions          | Cost per 1M tokens | Best for
text-embedding-3-large | 3072 (configurable) | $0.13              | Highest quality, semantic search
text-embedding-3-small | 1536 (configurable) | $0.02              | Good balance, most applications
text-embedding-ada-002 | 1536                | $0.10              | Legacy support

Cost Savings

  • Use text-embedding-3-small for most applications (85% cheaper than large)
  • Cache embeddings aggressively - they don't change for the same input (see the caching sketch below)
  • Consider dimension reduction for large-scale applications
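
A minimal caching sketch for the second point, assuming Rails.cache and the embed_now flow shown earlier:

ruby
# Cache embeddings keyed by a digest of the input text.
def cached_embedding(text)
  Rails.cache.fetch("embeddings/#{Digest::SHA256.hexdigest(text)}") do
    ApplicationAgent.with(message: text).prompt_context.embed_now.message.content
  end
end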

Provider-Specific Parameters

Model Parameters

  • model - Model identifier (e.g., "gpt-4o", "gpt-3.5-turbo")
  • embedding_model - Embedding model (e.g., "text-embedding-3-large")
  • dimensions - Reduced dimensions for embeddings (for 3-large and 3-small models)
  • temperature - Controls randomness (0.0 to 2.0)
  • max_tokens - Maximum tokens in response
  • top_p - Nucleus sampling parameter
  • frequency_penalty - Penalize frequent tokens (-2.0 to 2.0)
  • presence_penalty - Penalize new topics (-2.0 to 2.0)
  • seed - For deterministic outputs
  • response_format - Output format ({ type: "json_object" } or { type: "text" })

Organization Settings

  • organization_id - OpenAI organization ID
  • project_id - OpenAI project ID for usage tracking

Advanced Options

  • stream - Enable streaming responses (true/false)
  • tools - Array of built-in tools for Responses API (web_search_preview, image_generation, mcp)
  • tool_choice - Control tool usage ("auto", "required", "none", or specific tool)
  • parallel_tool_calls - Allow parallel tool execution (true/false)
  • use_responses_api - Force use of Responses API (true/false)
  • web_search - Web search configuration for Chat API with search-preview models
  • web_search_options - Alternative parameter name for web search in Chat API

Azure OpenAI

For Azure OpenAI Service, configure a custom host:

ruby
class AzureAgent < ApplicationAgent
  generate_with :openai,
    access_token: Rails.application.credentials.dig(:azure, :api_key),
    host: "https://your-resource.openai.azure.com",
    api_version: "2024-02-01",
    model: "your-deployment-name"
end

Error Handling

Handle OpenAI-specific errors:

ruby
class RobustAgent < ApplicationAgent
  generate_with :openai,
    max_retries: 3,
    request_timeout: 30
  
  rescue_from OpenAI::RateLimitError do |error|
    Rails.logger.error "Rate limit hit: #{error.message}"
    retry_with_backoff
  end
  
  rescue_from OpenAI::APIError do |error|
    Rails.logger.error "OpenAI API error: #{error.message}"
    fallback_response
  end
end
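
retry_with_backoff and fallback_response are not framework methods; a minimal sketch of what they might look like (hypothetical helpers, adapt to your retry policy):

ruby
private

# Hypothetical helper: retry the generation with exponential backoff.
def retry_with_backoff(max_attempts: 3)
  @attempts = (@attempts || 0) + 1
  raise if @attempts > max_attempts
  sleep(2**@attempts)  # exponential backoff
  generate_now         # re-run the generation (sketch; wire into your flow)
end

# Hypothetical helper: a static fallback when the API is unavailable.
def fallback_response
  "The assistant is temporarily unavailable. Please try again shortly."
end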

Testing

Use VCR for consistent tests:

ruby
require "test_helper"

class OpenAIAgentTest < ActiveAgentTestCase
  test "it renders a prompt_context generates a response" do
    VCR.use_cassette("openai_prompt_context_response") do
      message = "Show me a cat"
      prompt = OpenAIAgent.with(message: message).prompt_context
      response = prompt.generate_now
      assert_equal message, OpenAIAgent.with(message: message).prompt_context.message.content
      assert_equal 3, response.prompt.messages.size
      assert_equal :system, response.prompt.messages[0].role
      assert_equal :user, response.prompt.messages[1].role
      assert_equal :assistant, response.prompt.messages[2].role
    end
  end
end

class OpenAIClientTest < ActiveAgentTestCase
  def setup
    super
    # Configure OpenAI before tests
    OpenAI.configure do |config|
      config.access_token = "test-api-key"
      config.log_errors = Rails.env.development?
      config.request_timeout = 600
    end
  end

  test "loads configuration from environment" do
    # Use empty config to test environment-based configuration
    with_active_agent_config({}) do
      class OpenAIClientAgent < ApplicationAgent
        layout "agent"
        generate_with :openai
      end

      client = OpenAI::Client.new
      assert_equal OpenAIClientAgent.generation_provider.access_token, client.access_token
    end
  end
end

Cost Optimization

Use Appropriate Models

  • Use GPT-3.5 Turbo for simple tasks
  • Reserve GPT-4o for complex reasoning
  • Consider GPT-4o-mini for a balance

Optimize Token Usage

ruby
class EfficientAgent < ApplicationAgent
  generate_with :openai,
    model: "gpt-3.5-turbo",
    max_tokens: 500,  # Limit response length
    temperature: 0.3  # More focused responses
  
  def summarize
    @content = params[:content]
    # Truncate input if needed
    @content = @content.truncate(3000) if @content.length > 3000
    prompt
  end
end

Cache Responses

ruby
class CachedAgent < ApplicationAgent
  generate_with :openai
  
  def answer_faq
    question = params[:question]
    
    Rails.cache.fetch("faq/#{question.parameterize}", expires_in: 1.day) do
      prompt(message: question).generate_now
    end
  end
end

Best Practices

  1. Set appropriate temperature - Lower for factual tasks, higher for creative
  2. Use system messages effectively - Provide clear instructions
  3. Implement retry logic - Handle transient failures
  4. Monitor usage - Track token consumption and costs
  5. Use the latest models - They're often more capable and cost-effective
  6. Validate outputs - Especially for critical applications