OpenAI Provider

The OpenAI provider enables integration with OpenAI's GPT models, including the GPT-4o family, GPT-4 Turbo, and GPT-3.5 Turbo, as well as Responses API models such as GPT-5 and the o-series reasoning models. It supports advanced features like function calling, streaming responses, and structured outputs.

Configuration

Basic Setup

Configure OpenAI in your agent:

ruby
class OpenAIAgent < ApplicationAgent
  layout "agent"
  generate_with :openai, model: "gpt-4o-mini", instructions: "You're a basic OpenAI agent."
end
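
You can then invoke the agent with the standard generation flow; a minimal sketch using the prompt_context action that also appears in the testing examples below:

ruby
response = OpenAIAgent.with(message: "Hello!").prompt_context.generate_now
response.message.content  # => the assistant's reply text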

Configuration File

Set up OpenAI credentials in config/active_agent.yml:

yaml
# Base provider settings, shared via a YAML anchor
openai: &openai
  service: "OpenAI"
  access_token: <%= Rails.application.credentials.dig(:openai, :access_token) %>

# Environment-specific overrides
development:
  openai:
    <<: *openai
    model: "gpt-4o-mini"
    temperature: 0.7

Environment Variables

Alternatively, use environment variables:

bash
OPENAI_ACCESS_TOKEN=your-api-key
OPENAI_ORGANIZATION_ID=your-org-id  # Optional
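
The configuration file can read these through ERB instead of Rails credentials; a sketch, assuming the same active_agent.yml layout as above:

yaml
openai:
  service: "OpenAI"
  access_token: <%= ENV["OPENAI_ACCESS_TOKEN"] %>
  organization_id: <%= ENV["OPENAI_ORGANIZATION_ID"] %>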

Supported Models

Chat Completions API Models

  • GPT-4o - Most capable model with vision capabilities
  • GPT-4o-mini - Smaller, faster version of GPT-4o
  • GPT-4o-search-preview - GPT-4o with built-in web search
  • GPT-4o-mini-search-preview - GPT-4o-mini with built-in web search
  • GPT-4 Turbo - GPT-4 with a 128k context window
  • GPT-4 - Original GPT-4 model
  • GPT-3.5 Turbo - Fast and cost-effective

Responses API Models

  • GPT-5 - Advanced model with support for all built-in tools
  • GPT-4.1 - Enhanced GPT-4 with tool support
  • GPT-4.1-mini - Efficient version with tool support
  • o3 - Reasoning model with advanced capabilities
  • o4-mini - Compact reasoning model

Note: Built-in tools like MCP and image generation require the Responses API and compatible models.

Features

Function Calling

OpenAI supports native function calling with automatic tool execution:

ruby
class DataAnalysisAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"
  
  def analyze_data
    @data = params[:data]
    prompt  # Will include all public methods as available tools
  end
  
  def calculate_average(numbers:)
    numbers.sum.to_f / numbers.size
  end
  
  def fetch_external_data(endpoint:)
    # Tool that OpenAI can call
    HTTParty.get(endpoint)
  end
end
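
Invoking the action follows the usual pattern; a sketch, assuming tool results are fed back to the model automatically as described above:

ruby
# The model may call calculate_average or fetch_external_data as tools.
response = DataAnalysisAgent.with(data: [1, 2, 3]).analyze_data.generate_now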

Streaming Responses

Enable real-time streaming for better user experience:

ruby
class StreamingAgent < ApplicationAgent
  generate_with :openai, stream: true
  
  on_message_chunk do |chunk|
    # Handle streaming chunks
    broadcast_to_user(chunk)
  end
  
  def chat
    prompt(message: params[:message])
  end
end
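
broadcast_to_user is not provided by ActiveAgent; one possible implementation, assuming Turbo Streams is available (stream and target names are hypothetical):

ruby
# Hypothetical helper: append each chunk to a per-user Turbo Stream.
def broadcast_to_user(chunk)
  Turbo::StreamsChannel.broadcast_append_to(
    "chat_#{params[:user_id]}",  # assumed stream name
    target: "messages",          # assumed DOM target
    html: chunk
  )
end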

Vision Capabilities

GPT-4o models support image analysis:

ruby
class VisionAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"
  
  def analyze_image
    @image_url = params[:image_url]
    prompt content_type: :text
  end
end

# In your view (analyze_image.text.erb):
# Analyze this image: <%= @image_url %>
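
Invoking it with an image URL (a sketch; the URL is illustrative):

ruby
response = VisionAgent.with(
  image_url: "https://example.com/photo.jpg"  # illustrative URL
).analyze_image.generate_now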

Structured Output

OpenAI provides native structured output support, ensuring responses conform to specified JSON schemas. Supported models are listed below.

Supported Models

Models with full structured output support:

  • GPT-4o - Vision + structured output
  • GPT-4o-mini - Vision + structured output
  • GPT-4-turbo - Structured output only (no vision)
  • GPT-3.5-turbo - Structured output only

Basic Usage

Enable JSON mode with a schema:

ruby
class StructuredAgent < ApplicationAgent
  generate_with :openai, 
    model: "gpt-4o",
    response_format: { type: "json_object" }
  
  def extract_entities
    @text = params[:text]
    prompt(
      output_schema: :entity_extraction,
      instructions: "Extract entities and return as JSON"
    )
  end
end
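
Calling the action then returns parsed JSON, as described under Response Handling below; a sketch, assuming an entity_extraction schema is defined for the agent:

ruby
response = StructuredAgent.with(
  text: "Tim Cook announced new products at Apple Park."
).extract_entities.generate_now
response.message.content  # => Hash parsed from the model's JSON output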

With Schema Generator

Use ActiveAgent's schema generator for automatic schema creation:

ruby
# frozen_string_literal: true

require "test_helper"
require "active_agent/schema_generator"

class StructuredOutputJsonParsingTest < ActiveSupport::TestCase
  class DataExtractionAgent < ApplicationAgent
    generate_with :openai

    def extract_user_data
      prompt(
        message: params[:message] || "Extract the following user data from this text: John Doe is 30 years old and his email is john@example.com",
        output_schema: params[:output_schema]
      )
    end

    def extract_with_model_schema
      prompt(
        message: "Extract user information from: Jane Smith, age 25, contact: jane.smith@email.com",
        output_schema: params[:output_schema]
      )
    end

    def extract_with_active_record_schema
      prompt(
        message: "Extract user data from: Alice Johnson, 28 years old, email: alice@example.com, bio: Software engineer",
        output_schema: params[:output_schema]
      )
    end
  end

  test "structured output sets content_type to application/json and auto-parses JSON" do
    VCR.use_cassette("structured_output_json_parsing") do
      # Create a test model class with schema generator
      test_user_model = Class.new do
        include ActiveModel::Model
        include ActiveModel::Attributes
        include ActiveModel::Validations
        include ActiveAgent::SchemaGenerator

        attribute :name, :string
        attribute :age, :integer
        attribute :email, :string

        validates :name, presence: true
        validates :age, presence: true, numericality: { greater_than: 0 }
        validates :email, presence: true, format: { with: URI::MailTo::EMAIL_REGEXP }
      end

      # Generate schema from the model using the schema generator
      schema = test_user_model.to_json_schema(strict: true, name: "user_data")

      # Generate with structured output using the .with pattern
      response = DataExtractionAgent.with(output_schema: schema).extract_user_data.generate_now

      # Verify content_type is set to application/json
      assert_equal "application/json", response.message.content_type

      # Verify content is automatically parsed as JSON
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("age")

      # Verify raw content is still available as string
      assert response.message.raw_content.is_a?(String)

      doc_example_output(response)
    end
  end

  test "integration with ActiveModel schema generator for structured output" do
    VCR.use_cassette("structured_output_with_model_schema") do
      # Create an ActiveModel class for testing
      test_model = Class.new do
        include ActiveModel::Model
        include ActiveModel::Attributes
        include ActiveAgent::SchemaGenerator

        attribute :name, :string
        attribute :age, :integer
        attribute :email, :string
      end

      # Generate schema from ActiveModel
      schema = test_model.to_json_schema(strict: true, name: "user_data")

      # Generate response using model-generated schema
      response = DataExtractionAgent.with(output_schema: schema).extract_with_model_schema.generate_now

      # Verify content_type
      assert_equal "application/json", response.message.content_type

      # Verify JSON was automatically parsed
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("age")
      assert response.message.content.key?("email")

      # Verify values make sense
      assert_equal "Jane Smith", response.message.content["name"]
      assert_equal 25, response.message.content["age"]
      assert response.message.content["email"].include?("@")

      doc_example_output(response)
    end
  end

  test "integration with ActiveRecord schema generator for structured output" do
    VCR.use_cassette("structured_output_with_active_record_schema") do
      # Use the existing User model from test/dummy
      require_relative "../dummy/app/models/user"

      # Generate schema from ActiveRecord model
      schema = User.to_json_schema(strict: true, name: "user_data")

      # Generate response using ActiveRecord-generated schema
      response = DataExtractionAgent.with(output_schema: schema).extract_with_active_record_schema.generate_now

      # Verify content_type
      assert_equal "application/json", response.message.content_type

      # Verify JSON was automatically parsed
      assert response.message.content.is_a?(Hash)
      assert response.message.content.key?("name")
      assert response.message.content.key?("email")
      assert response.message.content.key?("age")

      # Verify the data makes sense
      assert response.message.content["name"].is_a?(String)
      assert response.message.content["age"].is_a?(Integer)
      assert response.message.content["email"].include?("@")

      doc_example_output(response)
    end
  end

  test "without structured output uses text/plain content_type" do
    VCR.use_cassette("plain_text_response") do
      # Generate without structured output (no output_schema)
      response = DataExtractionAgent.with(message: "What is the capital of France?").prompt_context.generate_now

      # Verify content_type is plain text
      assert_equal "text/plain", response.message.content_type

      # Content should not be parsed as JSON
      assert response.message.content.is_a?(String)
      assert response.message.content.downcase.include?("paris")

      doc_example_output(response)
    end
  end

  test "handles invalid JSON gracefully" do
    # This test ensures that if for some reason the provider returns invalid JSON
    # with application/json content_type, we handle it gracefully

    # Create a message with invalid JSON but JSON content_type
    message = ActiveAgent::ActionPrompt::Message.new(
      content: "{invalid json}",
      content_type: "application/json",
      role: :assistant
    )

    # Should return the raw string since parsing failed
    assert_equal "{invalid json}", message.content
    assert_equal "{invalid json}", message.raw_content
  end
end

Strict Mode

OpenAI supports strict schema validation to guarantee output format:

ruby
schema = {
  name: "user_data",
  strict: true,
  schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      age: { type: "integer" },
      email: { type: "string", format: "email" }
    },
    required: ["name", "age", "email"],
    additionalProperties: false
  }
}

response = DataExtractionAgent.with(
  message: "Extract user information",
  output_schema: schema
).extract_user_data.generate_now

Response Handling

Structured output responses are automatically parsed:

ruby
response = DataExtractionAgent.with(
  message: "Extract data from: John Doe, 30, john@example.com",
  output_schema: schema
).extract_user_data.generate_now

# Automatic JSON parsing
response.message.content_type # => "application/json"
response.message.content # => {"name" => "John Doe", "age" => 30, "email" => "john@example.com"}
response.message.raw_content # => '{"name":"John Doe","age":30,"email":"john@example.com"}'

Best Practices

  1. Use strict mode for production applications requiring guaranteed format
  2. Leverage model schemas from ActiveRecord/ActiveModel for consistency
  3. Test with VCR to ensure schemas work with actual API responses
  4. Handle edge cases like empty or invalid inputs gracefully

Limitations

  • Maximum schema complexity varies by model
  • Very large schemas may impact token limits
  • Not all JSON Schema features are supported (check OpenAI docs for specifics)

See the Structured Output guide for comprehensive documentation and examples.

Built-in Tools (Responses API)

OpenAI's Responses API provides powerful built-in tools for web search, image generation, and MCP (Model Context Protocol) integration.

Web Search

Enable web search capabilities using the web_search_preview tool:

ruby
# Example agent demonstrating web search capabilities
# Works with both Chat Completions API and Responses API
class WebSearchAgent < ApplicationAgent
  # For Chat API, use the search-preview models
  # For Responses API, use regular models with web_search_preview tool
  generate_with :openai, model: "gpt-4o"

  # Action for searching current events using Chat API with web search model
  def search_current_events
    @query = params[:query]
    @location = params[:location]

    # When using gpt-4o-search-preview model, web search is automatic
    prompt(
      message: @query,
      options: chat_api_search_options
    )
  end

  # Action for searching with Responses API (more flexible)
  def search_with_tools
    @query = params[:query]
    @context_size = params[:context_size] || "medium"

    prompt(
      message: @query,
      options: {
        use_responses_api: true,  # Force Responses API
        tools: [
          {
            type: "web_search_preview",
            search_context_size: @context_size
          }
        ]
      }
    )
  end

  # Action that combines web search with image generation (Responses API only)
  def research_and_visualize
    @topic = params[:topic]

    prompt(
      message: "Research #{@topic} and create a visualization",
      options: {
        model: "gpt-5",  # Responses API model
        use_responses_api: true,
        tools: [
          { type: "web_search_preview", search_context_size: "high" },
          { type: "image_generation", size: "1024x1024", quality: "high" }
        ]
      }
    )
  end

  private

  def chat_api_search_options
    options = {
      model: "gpt-4o-search-preview"  # Special model for Chat API web search
    }

    # Add web_search_options for Chat API
    if @location
      options[:web_search] = {
        user_location: format_location(@location)
      }
    else
      options[:web_search] = {}  # Enable web search with defaults
    end

    options
  end

  def format_location(location)
    # Format location for API
    {
      country: location[:country] || "US",
      city: location[:city],
      region: location[:region],
      timezone: location[:timezone]
    }.compact
  end
end
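
A sketch of invoking the Responses API search action above:

ruby
response = WebSearchAgent.with(
  query: "latest Ruby release",
  context_size: "high"
).search_with_tools.generate_now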

For Chat Completions API with specific models, use web_search_options:

ruby
# Excerpt from WebSearchAgent above: the Chat API path selects a
# search-preview model and passes web search options alongside it.
def chat_api_search_options
  options = {
    model: "gpt-4o-search-preview"  # Special model for Chat API web search
  }

  # Add web_search_options for Chat API
  if @location
    options[:web_search] = {
      user_location: format_location(@location)
    }
  else
    options[:web_search] = {}  # Enable web search with defaults
  end

  options
end

Image Generation

Generate and edit images using the image_generation tool:

ruby
# Example agent demonstrating multimodal capabilities with built-in tools
# This agent uses the Responses API to access advanced tools
class MultimodalAgent < ApplicationAgent
  # Use default temperature for Responses API compatibility
  generate_with :openai, model: "gpt-4o", temperature: nil

  # Generate an image based on a description
  def create_image
    @description = params[:description]
    @size = params[:size] || "1024x1024"
    @quality = params[:quality] || "high"

    prompt(
      message: "Generate an image: #{@description}",
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "image_generation",
            size: @size,
            quality: @quality,
            format: "png"
          }
        ]
      }
    )
  end

  # Research a topic and create an infographic
  def create_infographic
    @topic = params[:topic]
    @style = params[:style] || "modern"

    prompt(
      message: build_infographic_prompt,
      options: {
        use_responses_api: true,
        tools: [
          { type: "web_search_preview", search_context_size: "high" },
          {
            type: "image_generation",
            size: "1024x1536",  # Tall format for infographic
            quality: "high",
            background: "opaque"
          }
        ]
      }
    )
  end

  # Analyze an image and search for related information
  def analyze_and_research
    @image_data = params[:image_data]  # Base64 encoded image
    @question = params[:question]

    prompt(
      message: @question,
      image_data: @image_data,
      options: {
        use_responses_api: true,
        tools: [
          { type: "web_search_preview" }
        ]
      }
    )
  end

  # Edit an existing image with AI
  def edit_image
    @original_image = params[:original_image]
    @instructions = params[:instructions]

    prompt(
      message: @instructions,
      image_data: @original_image,
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "image_generation",
            partial_images: 2  # Show progress during generation
          }
        ]
      }
    )
  end

  private

  def build_infographic_prompt
    <<~PROMPT
      Create a #{@style} infographic about #{@topic}.

      First, research the topic to gather accurate, up-to-date information.
      Then generate a visually appealing infographic that includes:
      - Key statistics and facts
      - Clear visual hierarchy
      - #{@style} design aesthetic
      - Easy-to-read layout

      Make it informative and visually engaging.
    PROMPT
  end
end
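
A sketch of invoking image generation with this agent:

ruby
response = MultimodalAgent.with(
  description: "a lighthouse at dawn",
  size: "1024x1024"
).create_image.generate_now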

MCP (Model Context Protocol) Integration

Connect to external services and MCP servers:

ruby
# Example agent demonstrating MCP (Model Context Protocol) integration
# MCP allows connecting to external services and tools
class McpIntegrationAgent < ApplicationAgent
  generate_with :openai, model: "gpt-5"  # Responses API required for MCP

  # Use MCP connectors for cloud storage services
  def search_cloud_storage
    @query = params[:query]
    @service = params[:service] || "dropbox"
    @auth_token = params[:auth_token]

    prompt(
      message: "Search for: #{@query}",
      options: {
        use_responses_api: true,
        tools: [ build_connector_tool(@service, @auth_token) ]
      }
    )
  end

  # Use custom MCP server for specialized functionality
  def use_custom_mcp
    @query = params[:query]
    @server_url = params[:server_url]
    @allowed_tools = params[:allowed_tools]

    prompt(
      message: @query,
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "mcp",
            server_label: "Custom MCP Server",
            server_url: @server_url,
            server_description: "Custom MCP server for specialized tasks",
            require_approval: "always",  # Require approval for safety
            allowed_tools: @allowed_tools
          }
        ]
      }
    )
  end

  # Combine multiple MCP servers for comprehensive search
  def multi_source_search
    @query = params[:query]
    @sources = params[:sources] || [ "github", "dropbox" ]
    @auth_tokens = params[:auth_tokens] || {}

    tools = @sources.map do |source|
      case source
      when "github"
        {
          type: "mcp",
          server_label: "GitHub",
          server_url: "https://api.githubcopilot.com/mcp/",
          server_description: "Search GitHub repositories",
          require_approval: "never"
        }
      when "dropbox"
        build_connector_tool("dropbox", @auth_tokens["dropbox"])
      when "google_drive"
        build_connector_tool("google_drive", @auth_tokens["google_drive"])
      end
    end.compact

    prompt(
      message: "Search across multiple sources: #{@query}",
      options: {
        use_responses_api: true,
        tools: tools
      }
    )
  end

  # Use MCP with approval workflow
  def sensitive_operation
    @operation = params[:operation]
    @mcp_config = params[:mcp_config]

    prompt(
      message: "Perform operation: #{@operation}",
      options: {
        use_responses_api: true,
        tools: [
          {
            type: "mcp",
            server_label: @mcp_config[:label],
            server_url: @mcp_config[:url],
            authorization: @mcp_config[:auth],
            require_approval: {
              never: {
                tool_names: [ "read", "search" ]  # Safe operations
              }
            }
            # All other operations will require approval
          }
        ]
      }
    )
  end

  private

  def build_connector_tool(service, auth_token)
    connector_configs = {
      "dropbox" => {
        connector_id: "connector_dropbox",
        label: "Dropbox"
      },
      "google_drive" => {
        connector_id: "connector_googledrive",
        label: "Google Drive"
      },
      "gmail" => {
        connector_id: "connector_gmail",
        label: "Gmail"
      },
      "sharepoint" => {
        connector_id: "connector_sharepoint",
        label: "SharePoint"
      },
      "outlook" => {
        connector_id: "connector_outlookemail",
        label: "Outlook Email"
      }
    }

    config = connector_configs[service]
    return nil unless config && auth_token

    {
      type: "mcp",
      server_label: config[:label],
      connector_id: config[:connector_id],
      authorization: auth_token,
      require_approval: "never"  # Or configure based on your needs
    }
  end
end

Connect to custom MCP servers:

ruby
# Excerpt from McpIntegrationAgent above: pointing the mcp tool at a
# custom server URL with an explicit approval policy and tool allowlist.
def use_custom_mcp
  @query = params[:query]
  @server_url = params[:server_url]
  @allowed_tools = params[:allowed_tools]

  prompt(
    message: @query,
    options: {
      use_responses_api: true,
      tools: [
        {
          type: "mcp",
          server_label: "Custom MCP Server",
          server_url: @server_url,
          server_description: "Custom MCP server for specialized tasks",
          require_approval: "always",  # Require approval for safety
          allowed_tools: @allowed_tools
        }
      ]
    }
  )
end

Available MCP Connectors:

  • Dropbox - connector_dropbox
  • Gmail - connector_gmail
  • Google Calendar - connector_googlecalendar
  • Google Drive - connector_googledrive
  • Microsoft Teams - connector_microsoftteams
  • Outlook Calendar - connector_outlookcalendar
  • Outlook Email - connector_outlookemail
  • SharePoint - connector_sharepoint
  • GitHub - Use server URL: https://api.githubcopilot.com/mcp/

Combining Multiple Tools

Use multiple built-in tools together:

ruby
# Excerpt from MultimodalAgent above: combining web search and image
# generation in a single Responses API request.
def create_infographic
  @topic = params[:topic]
  @style = params[:style] || "modern"

  prompt(
    message: build_infographic_prompt,
    options: {
      use_responses_api: true,
      tools: [
        { type: "web_search_preview", search_context_size: "high" },
        {
          type: "image_generation",
          size: "1024x1536",  # Tall format for infographic
          quality: "high",
          background: "opaque"
        }
      ]
    }
  )
end

Using Concerns for Shared Tools

Create reusable tool configurations with concerns:

ruby
# Concern that provides research-related tools that work with both
# OpenAI Responses API (built-in tools) and Chat Completions API (function calling)
module ResearchTools
  extend ActiveSupport::Concern

  included do
    # Class-level configuration for built-in tools
    class_attribute :research_tools_config, default: {}
  end

  # Action methods that become function tools in Chat API
  # These are standard ActiveAgent actions that get converted to tool schemas

  def search_academic_papers
    @query = params[:query]
    @year_from = params[:year_from]
    @year_to = params[:year_to]
    @field = params[:field]

    prompt(
      message: build_academic_search_query,
      # For Responses API - add web search as built-in tool
      tools: responses_api? ? [ { type: "web_search_preview", search_context_size: "high" } ] : nil
    )
  end

  def analyze_research_data
    @data = params[:data]
    @analysis_type = params[:analysis_type]

    prompt(
      message: "Analyze the following research data:\n#{@data}\nAnalysis type: #{@analysis_type}",
      content_type: :json
    )
  end

  def generate_research_visualization
    @data = params[:data]
    @chart_type = params[:chart_type] || "bar"
    @title = params[:title]

    prompt(
      message: "Create a #{@chart_type} chart visualization for: #{@title}\nData: #{@data}",
      # For Responses API - add image generation as built-in tool
      tools: responses_api? ? [
        {
          type: "image_generation",
          size: "1024x1024",
          quality: "high"
        }
      ] : nil
    )
  end

  def search_with_mcp_sources
    @query = params[:query]
    @sources = params[:sources] || []

    # Build MCP tools configuration based on requested sources
    mcp_tools = build_mcp_tools(@sources)

    prompt(
      message: "Research query: #{@query}",
      tools: responses_api? ? mcp_tools : nil
    )
  end

  private

  def build_academic_search_query
    query_parts = [ "Academic papers search: #{@query}" ]
    query_parts << "Published between #{@year_from} and #{@year_to}" if @year_from && @year_to
    query_parts << "Field: #{@field}" if @field
    query_parts << "Include citations and abstracts"
    query_parts.join("\n")
  end

  def build_mcp_tools(sources)
    tools = []

    sources.each do |source|
      case source
      when "arxiv"
        tools << {
          type: "mcp",
          server_label: "ArXiv Papers",
          server_url: "https://arxiv-mcp.example.com/sse",
          server_description: "Search and retrieve academic papers from ArXiv",
          require_approval: "never",
          allowed_tools: [ "search_papers", "get_paper", "get_citations" ]
        }
      when "pubmed"
        tools << {
          type: "mcp",
          server_label: "PubMed",
          server_url: "https://pubmed-mcp.example.com/sse",
          server_description: "Search medical and life science literature",
          require_approval: "never"
        }
      when "github"
        tools << {
          type: "mcp",
          server_label: "GitHub Research",
          server_url: "https://api.githubcopilot.com/mcp/",
          server_description: "Search code repositories and documentation",
          require_approval: "never"
        }
      end
    end

    tools
  end

  def responses_api?
    # Check if we're using the Responses API
    # For now, we'll check if the model or options indicate Responses API usage
    false # This would be determined by the actual provider configuration
  end

  class_methods do
    # Class method to configure research tools for the agent
    def configure_research_tools(**options)
      self.research_tools_config = research_tools_config.merge(options)
    end
  end
end

Use the concern in your agents:

ruby
class ResearchAgent < ApplicationAgent
  include ResearchTools

  # Configure the agent to use OpenAI with specific settings
  generate_with :openai, model: "gpt-4o"

  # Configure research tools at the class level
  configure_research_tools(
    enable_web_search: true,
    mcp_servers: [ "arxiv", "github" ],
    default_search_context: "high"
  )

  # Agent-specific action that uses both concern tools and custom logic
  def comprehensive_research
    @topic = params[:topic]
    @depth = params[:depth] || "detailed"

    # This action combines multiple tools
    prompt(
      message: "Conduct comprehensive research on: #{@topic}",
      tools: build_comprehensive_tools
    )
  end

  def literature_review
    @topic = params[:topic]
    @sources = params[:sources] || [ "arxiv", "pubmed" ]

    # Use the concern's search_with_mcp_sources internally
    mcp_tools = build_mcp_tools(@sources)

    prompt(
      message: "Conduct a literature review on: #{@topic}\nFocus on peer-reviewed sources from the last 5 years.",
      tools: [
        { type: "web_search_preview", search_context_size: "high" },
        *mcp_tools
      ]
    )
  end

  private

  def build_comprehensive_tools
    tools = []

    # Add web search for general information
    tools << {
      type: "web_search_preview",
      search_context_size: @depth == "detailed" ? "high" : "medium"
    }

    # Add MCP servers from configuration
    if research_tools_config[:mcp_servers]
      tools.concat(build_mcp_tools(research_tools_config[:mcp_servers]))
    end

    # Add image generation for visualizations
    if @depth == "detailed"
      tools << {
        type: "image_generation",
        size: "1024x1024",
        quality: "high"
      }
    end

    tools
  end
end
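
A sketch of invoking the combined research action:

ruby
response = ResearchAgent.with(
  topic: "quantum computing",
  depth: "detailed"
).comprehensive_research.generate_now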

Tool Configuration Example

Here's how built-in tools are configured in the prompt options:

ruby
test "tool configuration in prompt options" do
  # Example showing how to configure built-in tools
  tools_config = [
    {
      type: "web_search_preview",
      search_context_size: "high",
      user_location: {
        country: "US",
        city: "San Francisco"
      }
    },
    {
      type: "image_generation",
      size: "1024x1024",
      quality: "high",
      format: "png"
    },
    {
      type: "mcp",
      server_label: "GitHub",
      server_url: "https://api.githubcopilot.com/mcp/",
      require_approval: "never"
    }
  ]

  # Show how the options would be passed to prompt
  example_options = {
    use_responses_api: true,
    model: "gpt-5",
    tools: tools_config
  }

  # Verify the configuration structure
  assert example_options[:tools].is_a?(Array)
  assert_equal 3, example_options[:tools].length
  assert_equal "web_search_preview", example_options[:tools][0][:type]
  assert_equal "image_generation", example_options[:tools][1][:type]
  assert_equal "mcp", example_options[:tools][2][:type]

  doc_example_output({
    description: "Example configuration for built-in tools in prompt options",
    options: example_options,
    tools_configured: tools_config
  })
end

Configuration Output

activeagent/test/agents/builtin_tools_doc_test.rb:108

json
{
  "description": "Example configuration for built-in tools in prompt options",
  "options": {
    "use_responses_api": true,
    "model": "gpt-5",
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "high",
        "user_location": {
          "country": "US",
          "city": "San Francisco"
        }
      },
      {
        "type": "image_generation",
        "size": "1024x1024",
        "quality": "high",
        "format": "png"
      },
      {
        "type": "mcp",
        "server_label": "GitHub",
        "server_url": "https://api.githubcopilot.com/mcp/",
        "require_approval": "never"
      }
    ]
  },
  "tools_configured": [
    {
      "type": "web_search_preview",
      "search_context_size": "high",
      "user_location": {
        "country": "US",
        "city": "San Francisco"
      }
    },
    {
      "type": "image_generation",
      "size": "1024x1024",
      "quality": "high",
      "format": "png"
    },
    {
      "type": "mcp",
      "server_label": "GitHub",
      "server_url": "https://api.githubcopilot.com/mcp/",
      "require_approval": "never"
    }
  ]
}

Embeddings

Generate high-quality text embeddings using OpenAI's embedding models. See the Embeddings Framework Documentation for comprehensive coverage.

Basic Embedding Generation

ruby
test "uses configured OpenAI embedding model" do
  VCR.use_cassette("embedding_openai_model") do
    # Create agent with specific OpenAI model configuration
    custom_agent_class = Class.new(ApplicationAgent) do
      generate_with :openai,
        model: "gpt-4o",
        embedding_model: "text-embedding-3-small"
    end

    generation = custom_agent_class.with(
      message: "Testing OpenAI embedding model configuration"
    ).prompt_context

    response = generation.embed_now
    embedding = response.message.content

    # text-embedding-3-small can have different dimensions depending on truncation
    assert_includes [ 1536, 3072 ], embedding.size
    assert embedding.all? { |v| v.is_a?(Float) }

    doc_example_output({
      model: "text-embedding-3-small",
      dimensions: embedding.size,
      sample: embedding[0..2]
    })
  end
end

Available Embedding Models

  • text-embedding-3-large - Highest quality (3072 dimensions, configurable down to 256)
  • text-embedding-3-small - Balanced performance (1536 dimensions, configurable)
  • text-embedding-ada-002 - Legacy model (1536 dimensions, fixed)

For detailed model comparisons and benchmarks, see OpenAI's Embeddings Documentation.

Similarity Search Example

ruby
test "performs similarity search with embeddings" do
  VCR.use_cassette("embedding_similarity_search") do
    documents = [
      "The cat sat on the mat",
      "Dogs are loyal companions",
      "Machine learning is a subset of AI",
      "The feline rested on the rug"
    ]

    # Generate embeddings for all documents
    embeddings = documents.map do |doc|
      generation = ApplicationAgent.with(message: doc).prompt_context
      generation.embed_now.message.content
    end

    # Query embedding
    query = "cat on mat"
    query_generation = ApplicationAgent.with(message: query).prompt_context
    query_embedding = query_generation.embed_now.message.content

    # Calculate cosine similarities
    similarities = embeddings.map.with_index do |embedding, index|
      similarity = cosine_similarity(query_embedding, embedding)
      { document: documents[index], similarity: similarity }
    end

    # Sort by similarity
    results = similarities.sort_by { |s| -s[:similarity] }

    # Most similar should be the cat/mat documents
    assert_equal "The cat sat on the mat", results.first[:document]
    assert results.first[:similarity] > 0.5, "Similarity should be > 0.5, got #{results.first[:similarity]}"

    # Document the results
    doc_example_output(results.first(2))
  end
end
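
The test above assumes a cosine_similarity helper, which is not part of ActiveAgent; a minimal sketch:

ruby
# Minimal cosine similarity between two equal-length Float arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end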

For more advanced embedding patterns, see the Embeddings Documentation.

Dimension Configuration

OpenAI's text-embedding-3 models support configurable dimensions:

ruby
test "verifies embedding dimensions for different models" do
  VCR.use_cassette("embedding_dimensions") do
    # Test with default model (usually text-embedding-3-small or ada-002)
    generation = ApplicationAgent.with(
      message: "Testing embedding dimensions"
    ).prompt_context

    response = generation.embed_now
    embedding = response.message.content

    # Most OpenAI models return 1536 dimensions by default
    assert_includes [ 1536, 3072 ], embedding.size

    doc_example_output({
      model: "default",
      dimensions: embedding.size,
      sample: embedding[0..4]
    })
  end
end

Dimension Reduction

OpenAI's text-embedding-3-large and text-embedding-3-small models support native dimension reduction by specifying a dimensions parameter. This can significantly reduce storage costs while maintaining good performance.
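
A sketch of configuring reduced dimensions, using the dimensions option listed under Provider-Specific Parameters below:

ruby
class CompactEmbeddingAgent < ApplicationAgent
  generate_with :openai,
    embedding_model: "text-embedding-3-small",
    dimensions: 512  # native reduction on the 3-large/3-small models
end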

Batch Processing

Efficiently process multiple embeddings:

ruby
test "processes multiple embeddings in batch" do
  VCR.use_cassette("embedding_batch_processing") do
    texts = [
      "First document for embedding",
      "Second document with different content",
      "Third document about technology"
    ]

    embeddings = []
    texts.each do |text|
      generation = ApplicationAgent.with(message: text).prompt_context
      embedding = generation.embed_now.message.content
      embeddings << {
        text: text[0..20] + "...",
        dimensions: embedding.size,
        sample: embedding[0..2]
      }
    end

    assert_equal 3, embeddings.size
    embeddings.each do |result|
      assert result[:dimensions] > 0
      assert result[:sample].all? { |v| v.is_a?(Float) }
    end

    doc_example_output(embeddings)
  end
end

Cost Optimization for Embeddings

Choose the right model based on your needs:

Model                  | Dimensions          | Cost per 1M tokens | Best for
text-embedding-3-large | 3072 (configurable) | $0.13              | Highest quality, semantic search
text-embedding-3-small | 1536 (configurable) | $0.02              | Good balance, most applications
text-embedding-ada-002 | 1536                | $0.10              | Legacy support

Cost Savings

  • Use text-embedding-3-small for most applications (85% cheaper than large)
  • Cache embeddings aggressively - they don't change for the same input (see the caching sketch below)
  • Consider dimension reduction for large-scale applications
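
A minimal caching sketch for the second point, assuming Rails.cache and the embed_now flow shown earlier:

ruby
# Cache embeddings keyed by a digest of the input text.
def cached_embedding(text)
  Rails.cache.fetch("embeddings/#{Digest::SHA256.hexdigest(text)}") do
    ApplicationAgent.with(message: text).prompt_context.embed_now.message.content
  end
end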

Provider-Specific Parameters

Model Parameters

  • model - Model identifier (e.g., "gpt-4o", "gpt-3.5-turbo")
  • embedding_model - Embedding model (e.g., "text-embedding-3-large")
  • dimensions - Reduced dimensions for embeddings (for 3-large and 3-small models)
  • temperature - Controls randomness (0.0 to 2.0)
  • max_tokens - Maximum tokens in response
  • top_p - Nucleus sampling parameter
  • frequency_penalty - Penalize frequent tokens (-2.0 to 2.0)
  • presence_penalty - Penalize new topics (-2.0 to 2.0)
  • seed - For deterministic outputs
  • response_format - Output format ({ type: "json_object" } or { type: "text" })

Organization Settings

  • organization_id - OpenAI organization ID
  • project_id - OpenAI project ID for usage tracking

Advanced Options

  • stream - Enable streaming responses (true/false)
  • tools - Array of built-in tools for Responses API (web_search_preview, image_generation, mcp)
  • tool_choice - Control tool usage ("auto", "required", "none", or specific tool)
  • parallel_tool_calls - Allow parallel tool execution (true/false)
  • use_responses_api - Force use of Responses API (true/false)
  • web_search - Web search configuration for Chat API with search-preview models
  • web_search_options - Alternative parameter name for web search in Chat API

Azure OpenAI

For Azure OpenAI Service, configure a custom host:

ruby
class AzureAgent < ApplicationAgent
  generate_with :openai,
    access_token: Rails.application.credentials.dig(:azure, :api_key),
    host: "https://your-resource.openai.azure.com",
    api_version: "2024-02-01",
    model: "your-deployment-name"
end

Error Handling

Handle OpenAI-specific errors:

ruby
class RobustAgent < ApplicationAgent
  generate_with :openai,
    max_retries: 3,
    request_timeout: 30
  
  rescue_from OpenAI::RateLimitError do |error|
    Rails.logger.error "Rate limit hit: #{error.message}"
    retry_with_backoff
  end
  
  rescue_from OpenAI::APIError do |error|
    Rails.logger.error "OpenAI API error: #{error.message}"
    fallback_response
  end
end
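
retry_with_backoff and fallback_response are not framework methods; a minimal sketch of what they might look like (hypothetical helpers, adapt to your retry policy):

ruby
private

# Hypothetical helper: retry the generation with exponential backoff.
def retry_with_backoff(max_attempts: 3)
  @attempts = (@attempts || 0) + 1
  raise if @attempts > max_attempts
  sleep(2**@attempts)  # exponential backoff
  generate_now         # re-run the generation (sketch; wire into your flow)
end

# Hypothetical helper: a static fallback when the API is unavailable.
def fallback_response
  "The assistant is temporarily unavailable. Please try again shortly."
end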

Testing

Use VCR for consistent tests:

ruby
require "test_helper"

class OpenAIAgentTest < ActiveAgentTestCase
  test "it renders a prompt_context generates a response" do
    VCR.use_cassette("openai_prompt_context_response") do
      message = "Show me a cat"
      prompt = OpenAIAgent.with(message: message).prompt_context
      response = prompt.generate_now
      assert_equal message, OpenAIAgent.with(message: message).prompt_context.message.content
      assert_equal 3, response.prompt.messages.size
      assert_equal :system, response.prompt.messages[0].role
      assert_equal :user, response.prompt.messages[1].role
      assert_equal :assistant, response.prompt.messages[2].role
    end
  end
end

class OpenAIClientTest < ActiveAgentTestCase
  def setup
    super
    # Configure OpenAI before tests
    OpenAI.configure do |config|
      config.access_token = "test-api-key"
      config.log_errors = Rails.env.development?
      config.request_timeout = 600
    end
  end

  test "loads configuration from environment" do
    # Use empty config to test environment-based configuration
    with_active_agent_config({}) do
      class OpenAIClientAgent < ApplicationAgent
        layout "agent"
        generate_with :openai
      end

      client = OpenAI::Client.new
      assert_equal OpenAIClientAgent.generation_provider.access_token, client.access_token
    end
  end
end

Cost Optimization

Use Appropriate Models

  • Use GPT-3.5 Turbo for simple tasks
  • Reserve GPT-4o for complex reasoning
  • Consider GPT-4o-mini for a balance

Optimize Token Usage

ruby
class EfficientAgent < ApplicationAgent
  generate_with :openai,
    model: "gpt-3.5-turbo",
    max_tokens: 500,  # Limit response length
    temperature: 0.3  # More focused responses
  
  def summarize
    @content = params[:content]
    # Truncate input if needed
    @content = @content.truncate(3000) if @content.length > 3000
    prompt
  end
end

Cache Responses

ruby
class CachedAgent < ApplicationAgent
  generate_with :openai
  
  def answer_faq
    question = params[:question]
    
    Rails.cache.fetch("faq/#{question.parameterize}", expires_in: 1.day) do
      prompt(message: question).generate_now
    end
  end
end

Best Practices

  1. Set appropriate temperature - Lower for factual tasks, higher for creative
  2. Use system messages effectively - Provide clear instructions
  3. Implement retry logic - Handle transient failures
  4. Monitor usage - Track token consumption and costs
  5. Use the latest models - They're often more capable and cost-effective
  6. Validate outputs - Especially for critical applications