The Inference class is the primary facade for making requests to LLM providers in Polyglot. It provides a unified interface for configuring providers, building requests, and executing inference operations.

Architecture Overview

The Inference class combines functionality through traits (sketched below):
  • HandlesLLMProvider: Provider configuration and driver management
  • HandlesRequestBuilder: Request construction and configuration
  • HandlesInvocation: Request execution and PendingInference creation
  • HandlesShortcuts: Convenient methods for common response formats
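
Conceptually, the class is a thin shell over these traits. A simplified sketch (trait namespaces omitted, method lists abbreviated to those documented below; this is illustrative, not the actual source):

// Simplified composition sketch -- not the actual source
class Inference
{
    use HandlesLLMProvider;     // using(), withConfig(), withDriver(), ...
    use HandlesRequestBuilder;  // withMessages(), withModel(), withTools(), ...
    use HandlesInvocation;      // with(), create(), withRequest()
    use HandlesShortcuts;       // get(), response(), asJson(), stream()
}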

Basic Usage

<?php
use Cognesy\Polyglot\Inference\Inference;

// Simple text completion
$response = (new Inference())
    ->withMessages('What is the capital of France?')
    ->get();

// Using a specific provider
$response = (new Inference())
    ->using('openai')
    ->withMessages('Explain quantum physics')
    ->get();

LLM Provider Configuration Methods

Configure the underlying LLM provider:
// Provider selection and configuration
$inference->using('openai');                           // Use preset configuration
$inference->withDsn('openai://model=gpt-4');          // Configure via DSN
$inference->withConfig($customConfig);                 // Explicit configuration
$inference->withConfigProvider($configProvider);      // Custom config provider
$inference->withLLMConfigOverrides(['temperature' => 0.7]);

// HTTP client configuration
$inference->withHttpClient($customHttpClient);        // Custom HTTP client
$inference->withHttpClientPreset('debug');           // HTTP client preset
$inference->withDebugPreset('verbose');              // Debug configuration

// Driver management
$inference->withDriver($customDriver);               // Custom inference driver
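
These configuration calls return the instance, so they chain fluently, as the basic usage example above shows. A minimal sketch combining a DSN, a debug preset, and config overrides (the DSN and preset values are illustrative):

// Sketch: chaining provider configuration before a request
$response = (new Inference())
    ->withDsn('openai://model=gpt-4')
    ->withDebugPreset('verbose')
    ->withLLMConfigOverrides(['temperature' => 0.7])
    ->withMessages('Summarize the plot of Hamlet in one sentence.')
    ->get();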

Request Building Methods

Configure the inference request:
// Message configuration
$inference->withMessages('Hello, world!');           // String message
$inference->withMessages([['role' => 'user', 'content' => 'Hello']]); // Array format
$inference->withMessages($messageObject);            // Message object

// Model and generation parameters
$inference->withModel('gpt-4');                      // Specific model
$inference->withMaxTokens(100);                      // Token limit
$inference->withOutputMode($outputMode);             // Response format mode

// Tool usage
$inference->withTools($toolDefinitions);             // Available tools
$inference->withToolChoice('auto');                  // Tool selection strategy

// Response formatting
$inference->withResponseFormat(['type' => 'json']); // Response format
$inference->withOptions(['temperature' => 0.5]);    // Additional options

// Advanced features
$inference->withStreaming(true);                     // Enable streaming
$inference->withCachedContext($messages, $tools);   // Context caching
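
These builders compose into a single tool-enabled request. The tool definition below assumes the common OpenAI-style function schema; treat its exact shape as an assumption to verify against your provider:

// Sketch: tool definition assumes OpenAI-style function schema
$weatherTool = [
    'type' => 'function',
    'function' => [
        'name' => 'get_weather',
        'description' => 'Get the current weather for a city',
        'parameters' => [
            'type' => 'object',
            'properties' => ['city' => ['type' => 'string']],
            'required' => ['city'],
        ],
    ],
];

$response = (new Inference())
    ->using('openai')
    ->withMessages('What is the weather in Paris?')
    ->withTools([$weatherTool])
    ->withToolChoice('auto')
    ->withMaxTokens(200)
    ->response();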

Invocation Methods

Execute inference requests:
// Flexible configuration method
$inference->with(
    messages: 'Hello',
    model: 'gpt-4',
    tools: [],
    toolChoice: 'auto',
    responseFormat: ['type' => 'text'],
    options: ['temperature' => 0.7],
    mode: OutputMode::Text
);

// Create pending inference for advanced handling
$pending = $inference->create();

// Direct request execution
$inference->withRequest($existingRequest);           // Supply a pre-built request object
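
Putting these together: with() sets the whole request in one call, and create() defers execution until the result is actually needed. The response() call on the pending object mirrors the shortcut documented below and is an assumption about the PendingInference API:

// Sketch: deferred execution; the PendingInference accessor is assumed
$pending = (new Inference())
    ->using('openai')
    ->with(
        messages: 'Hello',
        options: ['temperature' => 0.7],
    )
    ->create();

// ... later, when the result is needed:
$response = $pending->response(); // assumed accessor on PendingInference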

Response Shortcuts

Get responses in different formats:
// Text responses
$text = $inference->get();                           // Plain text
$response = $inference->response();                  // Full InferenceResponse object

// JSON responses  
$json = $inference->asJson();                        // JSON string
$data = $inference->asJsonData();                    // Parsed array

// Streaming
$stream = $inference->stream();                      // InferenceStream object
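
Streamed output is consumed incrementally. The loop below assumes InferenceStream exposes a generator of partial responses; the responses() method and contentDelta property are assumptions to check against the actual API:

// Sketch: consuming a stream; method/property names are assumptions
$stream = (new Inference())
    ->using('openai')
    ->withMessages('Write a haiku about the sea.')
    ->withStreaming(true)
    ->stream();

foreach ($stream->responses() as $partial) {
    echo $partial->contentDelta; // hypothetical: per-chunk text fragment
}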

Driver Registration

Register custom drivers for new providers:
// Register with class name
Inference::registerDriver('custom-provider', CustomDriver::class);

// Register with factory callable
Inference::registerDriver('custom-provider', function($config, $httpClient) {
    return new CustomDriver($config, $httpClient);
});
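
Once registered, the driver can be used like any other. A sketch, assuming a configuration whose driver field references the registered name:

// Sketch: assumes $customConfig selects the 'custom-provider' driver
Inference::registerDriver('custom-provider', CustomDriver::class);

$response = (new Inference())
    ->withConfig($customConfig)   // config points at the registered driver
    ->withMessages('Ping')
    ->get();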