Actions

Actions are public methods in your agent that define specific AI behaviors. Each action calls prompt() to generate text or embed() to create vector embeddings.

Think of actions like controller actions in Rails: they define what your agent can do and how it responds to different requests.

Quick Example

Define an action by creating a method that calls prompt():

ruby
class SummaryAgent < ApplicationAgent
  def summarize
    prompt(
      instructions: "Summarize in 2-3 sentences",
      message: params[:text],
      temperature: 0.3
    )
  end
end
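
Then invoke the action on the agent class:

ruby
# `text:` is Ruby 3.1+ shorthand for `text: text`
# Synchronous execution
SummaryAgent.with(text:).summarize.generate_now
# Queue the generation for asynchronous execution
SummaryAgent.with(text:).summarize.generate_later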

Action Capabilities

Actions can use these capabilities to build sophisticated AI interactions:

Messages

Control conversation context with text, images, and documents:

ruby
response = ApplicationAgent.prompt(
  "Analyze this image", image: "https://picsum.photos/200"
).generate_now

Tools

Let AI call Ruby methods during generation:

ruby
class WeatherAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"

  def weather_update
    prompt(
      input: "What's the weather in Boston?",
      tools: [ {
        type: "function",
        name: "get_current_weather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA"
            },
            unit: {
              type: "string",
              enum: [ "celsius", "fahrenheit" ]
            }
          },
          required: [ "location" ]
        }
      } ]
    )
  end

  def get_current_weather(location:, unit: "fahrenheit")
    { location: location, unit: unit, temperature: "22", conditions: "sunny" }
  end
end
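
To make the idea of "AI calling Ruby methods" concrete, here is a minimal, framework-independent sketch of how a tool call could be routed to a method. It is not the framework's internal implementation. The `WeatherTools` class, the `dispatch_tool` helper, and the shape of the `tool_call` hash are all assumptions made for illustration; real providers return tool calls in their own formats.

```ruby
require "json"

# Plain object holding the same tool method as the agent above.
class WeatherTools
  def get_current_weather(location:, unit: "fahrenheit")
    { location: location, unit: unit, temperature: "22", conditions: "sunny" }
  end
end

# Hypothetical helper: looks up the requested tool by name, parses its
# JSON arguments into keyword arguments, and calls the matching method.
# Only names in the allow-list can be invoked.
def dispatch_tool(target, tool_call, allowed:)
  name = tool_call[:name]
  raise ArgumentError, "unknown tool: #{name}" unless allowed.include?(name)

  args = JSON.parse(tool_call[:arguments], symbolize_names: true)
  target.public_send(name, **args)
end

# Assumed shape of a tool call requested by the model.
tool_call = { name: "get_current_weather", arguments: '{"location":"Boston, MA"}' }

result = dispatch_tool(WeatherTools.new, tool_call, allowed: ["get_current_weather"])
puts result[:conditions]
```

The allow-list matters in any design like this: the method name comes from model output, so it should never be passed to `public_send` unchecked.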

Structured Output

Enforce JSON responses with schemas:

ruby
class DataExtractionAgent < ApplicationAgent
  generate_with :openai, model: "gpt-4o"

  def parse_resume
    prompt(
      message: "Extract resume data: #{params[:file_data]}",
      # Loads views/agents/data_extraction/parse_resume/schema.json
      response_format: :json_schema
    )
  end
end
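
Once a schema-constrained response comes back, the caller typically parses it as JSON. The sketch below shows one defensive way to do that in plain Ruby. The `parse_structured` helper, the `REQUIRED_KEYS` list, and the sample payload are hypothetical and do not reflect the contents of the actual schema file.

```ruby
require "json"

# Hypothetical required fields, for illustration only.
REQUIRED_KEYS = %w[name email].freeze

# Parses JSON text into a Hash and raises if a required key is missing.
def parse_structured(json_text, required: REQUIRED_KEYS)
  data = JSON.parse(json_text)
  missing = required - data.keys
  raise KeyError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  data
end

# Made-up sample of what a structured response body might look like.
sample = '{"name": "Ada Lovelace", "email": "ada@example.com", "skills": ["math"]}'
resume = parse_structured(sample)
puts resume["name"]
```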

Embeddings

Generate vectors for semantic search:

ruby
class MyAgent < ApplicationAgent
  embed_with :openai, model: "text-embedding-3-small"

  def vectorize
    embed(input: params[:text])
  end
end

response = MyAgent.with(text: "Hello world").vectorize.embed_now
vector = response.data.first[:embedding]  # => [0.123, -0.456, ...]
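
For semantic search, embedding vectors are usually compared with cosine similarity. The following is a small plain-Ruby sketch of that comparison. The `cosine_similarity` helper and the two-dimensional toy vectors are illustrative assumptions; real embeddings have hundreds or thousands of dimensions and are normally stored in a vector index.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns 0.0 when either vector has zero magnitude.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy vectors standing in for stored document embeddings.
documents = { "greeting" => [1.0, 0.0], "farewell" => [0.0, 1.0] }
query = [0.9, 0.1]

# Pick the document whose vector points most nearly in the query's direction.
best = documents.max_by { |_, vec| cosine_similarity(query, vec) }.first
puts best  # => "greeting"
```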

Usage Statistics

Track token consumption and costs:

ruby
agent = SummaryAgent.with(text: "Long article text...")
response = agent.summarize.generate_now
response.usage.total_tokens  # => 125

Common Patterns

Multi-Capability Actions

Combine multiple capabilities in a single action for complex behaviors.

Use this pattern when you need the AI to:

  • Search for information AND structure the results
  • Process data with tools AND validate the output format
  • Combine multimodal inputs (text + images) with structured responses
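
As a rough sketch, an action that searches and then structures the results might pass both `tools:` and `response_format:` to one `prompt` call. `ResearchAgent`, `SEARCH_TOOL`, and `search_documents` are hypothetical names. The small `ApplicationAgent` stand-in exists only so the snippet runs on its own; it returns the prompt options instead of a generation object. Whether a given provider accepts tools and a JSON schema in the same request is an assumption to verify.

```ruby
# Stand-in base class so this sketch runs outside an application.
# In a real app the framework provides ApplicationAgent.
unless defined?(ApplicationAgent)
  class ApplicationAgent
    attr_reader :params

    def self.with(**params)
      new(params)
    end

    def initialize(params = {})
      @params = params
    end

    # Returns the options unchanged so they can be inspected.
    def prompt(**options)
      options
    end
  end
end

class ResearchAgent < ApplicationAgent
  # Hypothetical tool definition, same shape as the weather tool in this document.
  SEARCH_TOOL = {
    type: "function",
    name: "search_documents",
    description: "Search the knowledge base for relevant passages",
    parameters: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"]
    }
  }.freeze

  def research
    prompt(
      message: params[:question],
      tools: [SEARCH_TOOL],          # capability 1: tool calling
      response_format: :json_schema  # capability 2: structured output
    )
  end

  # Hypothetical tool implementation returning canned data.
  def search_documents(query:)
    [{ title: "Example result", snippet: "Matched: #{query}" }]
  end
end

options = ResearchAgent.with(question: "What changed in the last release?").research
puts options.keys.inspect
```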

Chaining Generations

Build multi-step workflows by passing previous responses as conversation history.

This approach works well for:

  • Multi-turn conversations where context matters
  • Iterative refinement (generate → critique → improve)
  • Workflows where each step builds on previous results
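
The generate, critique, improve loop can be sketched without any framework code. In the example below, `run_chain` and `fake_generate` are hypothetical: the lambda stands in for a real generation call, and the `role`/`content` message shape is an assumption rather than a documented format.

```ruby
# Hypothetical stand-in for a real generation call. It reports which reply
# this is and which message it answers, so the flow is visible offline.
fake_generate = lambda do |messages|
  turn = messages.count { |m| m[:role] == "assistant" } + 1
  "reply ##{turn} to: #{messages.last[:content]}"
end

# Runs each step in order. Every call receives the full history so far,
# and each reply is appended so later steps can build on it.
def run_chain(steps, generate)
  history = []
  steps.each do |step|
    history << { role: "user", content: step }
    reply = generate.call(history)
    history << { role: "assistant", content: reply }
  end
  history
end

history = run_chain(
  ["Draft a summary", "Critique the draft", "Improve the draft using the critique"],
  fake_generate
)
puts history.length  # => 6
```

Passing the whole history on every call keeps each step aware of earlier output, at the cost of a growing token count per request.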

Multiple Actions Per Agent

Define multiple actions in a single agent for related behaviors.
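
For instance, one agent could expose several article-related actions that share the same parameters. `ArticleAgent` and its action names are hypothetical. The `ApplicationAgent` stand-in is included only so the sketch runs on its own; it returns the prompt options instead of a generation object.

```ruby
# Stand-in base class so this sketch runs outside an application.
# In a real app the framework provides ApplicationAgent.
unless defined?(ApplicationAgent)
  class ApplicationAgent
    attr_reader :params

    def self.with(**params)
      new(params)
    end

    def initialize(params = {})
      @params = params
    end

    # Returns the options unchanged so they can be inspected.
    def prompt(**options)
      options
    end
  end
end

class ArticleAgent < ApplicationAgent
  # Each public method is one action with its own instructions.
  def summarize
    prompt(instructions: "Summarize in 2-3 sentences", message: params[:text])
  end

  def suggest_title
    prompt(instructions: "Suggest one concise title", message: params[:text])
  end
end

article = ArticleAgent.with(text: "Ruby 3.4 was released with several improvements.")
puts article.summarize[:instructions]
puts article.suggest_title[:instructions]
```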

Related Documentation

  • Agents - Understanding the agent lifecycle and invocation
  • Generation - Synchronous and asynchronous execution
  • Callbacks - Hooks for before/after action execution
  • Instructions - System prompts that guide agent behavior
  • Streaming - Real-time response updates
  • Configuration - Configure action behavior across environments
  • Testing - Test action patterns and behaviors