LLM Integration

The current extension does not use a FastAPI backend, LangGraph, or SSE. It uses an OpenAI-compatible client that sends requests directly to the configured provider.

OpenAI-compatible providers are configured as provider profiles in extension storage.

Provider profiles are stored in chrome.storage.local under:

llmConfig
llmProfiles
activeProfileId
advancedConfig
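As an illustration of how the stored keys fit together, the sketch below picks the active profile out of a storage snapshot. The `LlmProfile` fields, the `StorageSnapshot` type, the `resolveActiveProfile` helper, and the fallback to the first profile are all assumptions made for this example, not the extension's actual shapes or behavior.

```typescript
// Hypothetical profile shape; the real stored fields may differ.
interface LlmProfile {
  id: string;
  name: string;
  baseURL: string;
  model: string;
  apiKey?: string;
}

// Subset of the keys kept in chrome.storage.local (assumed value types).
interface StorageSnapshot {
  llmProfiles?: LlmProfile[];
  activeProfileId?: string;
}

// Return the profile whose id matches activeProfileId.
// Falling back to the first stored profile is an assumption of this sketch.
function resolveActiveProfile(snapshot: StorageSnapshot): LlmProfile | undefined {
  const profiles = snapshot.llmProfiles ?? [];
  return profiles.find((p) => p.id === snapshot.activeProfileId) ?? profiles[0];
}
```

Inside the extension, the snapshot would presumably come from `chrome.storage.local.get(["llmProfiles", "activeProfileId"])`.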

The default testing profile points to a hosted demo endpoint and uses qwen3.5-plus.

The provider registry includes OpenAI, Anthropic via OpenAI-compatible API, Gemini, Groq, DeepSeek, Mistral, OpenRouter, xAI, Together AI, Fireworks, Cerebras, Perplexity, NVIDIA NIM, Cloudflare Workers AI, Ollama, and Custom.

All model calls use:

POST {baseURL}/chat/completions

Request body:

{
  "model": "gpt-4o-mini",
  "temperature": 0,
  "messages": [
    { "role": "system", "content": "..." },
    { "role": "user", "content": "..." }
  ],
  "tools": [],
  "parallel_tool_calls": false,
  "tool_choice": {
    "type": "function",
    "function": { "name": "AgentOutput" }
  }
}

When disableNamedToolChoice is enabled, the client sends tool_choice: "required" instead of forcing AgentOutput.
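The request construction described above can be sketched as follows. The `buildChatRequest` helper, its option names, and the trailing-slash handling on `baseURL` are assumptions for illustration; only the body fields and the `tool_choice` switch come from the description.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ToolDefinition {
  type: "function";
  function: { name: string; description?: string; parameters: Record<string, unknown> };
}

type ToolChoice = "required" | { type: "function"; function: { name: string } };

interface ChatRequestBody {
  model: string;
  temperature: number;
  messages: ChatMessage[];
  tools: ToolDefinition[];
  parallel_tool_calls: boolean;
  tool_choice: ToolChoice;
}

// Hypothetical option bag; names other than disableNamedToolChoice are assumed.
interface RequestOptions {
  baseURL: string;
  model: string;
  messages: ChatMessage[];
  tools: ToolDefinition[];
  disableNamedToolChoice?: boolean;
}

function buildChatRequest(opts: RequestOptions): { url: string; body: ChatRequestBody } {
  // Force AgentOutput by name unless the profile opts out of named tool choice.
  const toolChoice: ToolChoice = opts.disableNamedToolChoice
    ? "required"
    : { type: "function", function: { name: "AgentOutput" } };

  return {
    // Stripping trailing slashes is an assumption of this sketch.
    url: `${opts.baseURL.replace(/\/+$/, "")}/chat/completions`,
    body: {
      model: opts.model,
      temperature: 0,
      messages: opts.messages,
      tools: opts.tools,
      parallel_tool_calls: false,
      tool_choice: toolChoice,
    },
  };
}
```

The returned `url` and `body` could then be passed to `fetch` with `method: "POST"` and a JSON-serialized body.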

AgentOutput wraps reflection fields plus exactly one action object.

Click action:

{
  "evaluation_previous_goal": "The page is visible.",
  "memory": "The checkout button is index 8.",
  "next_goal": "Click the checkout button.",
  "action": {
    "click_element_by_index": {
      "index": 8
    }
  }
}

Input action:

{
  "evaluation_previous_goal": "The search field was focused.",
  "memory": "The search field is index 4.",
  "next_goal": "Enter the query.",
  "action": {
    "input_text": {
      "index": 4,
      "text": "wireless headphones"
    }
  }
}

The client accepts finish_reason values tool_calls, function_call, or stop. It parses message.tool_calls[0].function.arguments as JSON and validates it with the Zod schema for the selected tool.

If validation fails, execution fails before the DOM action is called.
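The parse-then-validate step can be sketched as below. To stay dependency-free, this example replaces the Zod schema with hand-written checks; the `parseAgentOutput` helper, the type names, and the error messages are hypothetical.

```typescript
interface ToolCall {
  function: { name: string; arguments: string };
}

interface Choice {
  finish_reason: string;
  message: { tool_calls?: ToolCall[] };
}

interface AgentOutput {
  evaluation_previous_goal: string;
  memory: string;
  next_goal: string;
  action: Record<string, unknown>;
}

const ACCEPTED_FINISH_REASONS = new Set(["tool_calls", "function_call", "stop"]);

// Throws before any action can run if the response is not a valid AgentOutput.
function parseAgentOutput(choice: Choice): AgentOutput {
  if (!ACCEPTED_FINISH_REASONS.has(choice.finish_reason)) {
    throw new Error(`Unexpected finish_reason: ${choice.finish_reason}`);
  }
  const call = choice.message.tool_calls?.[0];
  if (!call) throw new Error("Response contained no tool call");

  // The arguments field is a JSON string, not an object.
  const parsed: unknown = JSON.parse(call.function.arguments);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Tool arguments must be a JSON object");
  }
  const candidate = parsed as Record<string, unknown>;

  // Stand-in for the Zod schema: three string fields plus exactly one action.
  for (const field of ["evaluation_previous_goal", "memory", "next_goal"]) {
    if (typeof candidate[field] !== "string") {
      throw new Error(`Field ${field} must be a string`);
    }
  }
  const action = candidate.action;
  if (
    typeof action !== "object" || action === null || Array.isArray(action) ||
    Object.keys(action).length !== 1
  ) {
    throw new Error("action must contain exactly one action object");
  }
  return candidate as unknown as AgentOutput;
}
```

With Zod in place, the manual checks would typically collapse into a single `schema.safeParse(parsed)` call.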

The LLM wrapper retries retryable failures up to maxRetries times. It reports progress to the UI through these activity events:

{ type: 'thinking' }
{ type: 'executing', tool, input }
{ type: 'executed', tool, input, output, duration }
{ type: 'retrying', attempt, maxAttempts }
{ type: 'error', message }

The side panel renders these with copy such as Thinking..., Executing {{tool}}..., Done: {{tool}}, and Retrying ({{attempt}}/{{max}})....
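A minimal sketch of how the retry loop and the activity events might fit together is shown below. The `withRetries` and `describeActivity` helpers, the `isRetryable` predicate, the attempt numbering, and the choice to show the raw message for `error` events are assumptions of this example; only the event shapes and the UI copy are taken from the text above.

```typescript
type ActivityEvent =
  | { type: "thinking" }
  | { type: "executing"; tool: string; input: unknown }
  | { type: "executed"; tool: string; input: unknown; output: unknown; duration: number }
  | { type: "retrying"; attempt: number; maxAttempts: number }
  | { type: "error"; message: string };

// Map an event to side-panel copy. The error copy is assumed.
function describeActivity(event: ActivityEvent): string {
  switch (event.type) {
    case "thinking":
      return "Thinking...";
    case "executing":
      return `Executing ${event.tool}...`;
    case "executed":
      return `Done: ${event.tool}`;
    case "retrying":
      return `Retrying (${event.attempt}/${event.maxAttempts})...`;
    case "error":
      return event.message;
  }
}

// Run `call`, retrying retryable failures up to maxRetries times.
async function withRetries<T>(
  call: () => Promise<T>,
  maxRetries: number,
  isRetryable: (err: unknown) => boolean,
  emit: (event: ActivityEvent) => void,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent.
      if (attempt >= maxRetries || !isRetryable(err)) {
        emit({ type: "error", message: err instanceof Error ? err.message : String(err) });
        throw err;
      }
      emit({ type: "retrying", attempt: attempt + 1, maxAttempts: maxRetries });
    }
  }
}
```

A caller would pass the model request as `call` and forward each emitted event to the side panel.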

There are no SSE event names in the current source.