Architecture Overview
Navvy is implemented as a WXT Manifest V3 extension. The source directory is packages/extension/src, with WXT entrypoints under src/entrypoints.
WXT configuration
The extension config lives in packages/extension/wxt.config.js.
| Field | Current value |
|---|---|
| srcDir | src |
| WXT modules | @wxt-dev/module-react |
| Manifest version | MV3 in generated manifest |
| Permissions | tabs, tabGroups, sidePanel, storage |
| Host permissions | <all_urls> |
| Content script matches | <all_urls> |
| Content script timing | document_end |
| Side panel | sidepanel.html |
| Options page | settings.html |
| Externally connectable | http://localhost/* |
The generated manifest name is "Navvy" and the description is "AI-powered browser automation assistant. Control web pages with natural language."
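For orientation, the table above can be read as roughly the following generated manifest, written here as a plain TypeScript object. This is a sketch, not the actual build output: the content script file path and the choice of options_ui (rather than options_page) are assumptions, and the real manifest may contain further fields.

```typescript
// Sketch of the generated MV3 manifest implied by the table above.
// ASSUMPTIONS: the content script path and the `options_ui` key are
// illustrative guesses; the real build output may differ.
interface ManifestSketch {
  manifest_version: 3;
  name: string;
  description: string;
  permissions: string[];
  host_permissions: string[];
  content_scripts: { matches: string[]; run_at: string; js: string[] }[];
  side_panel: { default_path: string };
  options_ui: { page: string };
  externally_connectable: { matches: string[] };
}

const manifestSketch: ManifestSketch = {
  manifest_version: 3,
  name: "Navvy",
  description:
    "AI-powered browser automation assistant. Control web pages with natural language.",
  permissions: ["tabs", "tabGroups", "sidePanel", "storage"],
  host_permissions: ["<all_urls>"],
  content_scripts: [
    {
      matches: ["<all_urls>"],
      run_at: "document_end",
      js: ["content-scripts/content.js"], // hypothetical output path
    },
  ],
  side_panel: { default_path: "sidepanel.html" },
  options_ui: { page: "settings.html" },
  externally_connectable: { matches: ["http://localhost/*"] },
};

// Returns the permissions from `required` that the manifest does not declare.
function missingPermissions(manifest: ManifestSketch, required: string[]): string[] {
  return required.filter((p) => !manifest.permissions.includes(p));
}
```

A helper like missingPermissions can be handy in a build check, for example to confirm that sidePanel and tabGroups are still declared after a config change.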
Entrypoints
- entrypoints/sidepanel/App.tsx: React command UI. It renders StatusDot, Composer, HistoryList, EventCard, and ActivityCard.
- entrypoints/settings/App.tsx: settings UI for General, Providers, Skills, Advanced, and About tabs.
- entrypoints/background.ts: service worker. It creates the extension auth token, routes messages, opens Hub tabs, and enables side panel behavior.
- entrypoints/content.ts: content script. It initializes the remote page controller and can expose the main-world bridge when the page token matches the extension token.
- entrypoints/main-world.ts: unlisted script injected into the page world. It exposes window.PAGE_AGENT_EXT and window.PAGE_AGENT_EXT_VERSION.
- entrypoints/hub/App.tsx and entrypoints/hub/hub-ws.ts: Hub UI and WebSocket client for external local callers.
Runtime message flow

    sequenceDiagram
        participant SidePanel as "sidepanel/App.tsx"
        participant Core as "Multi-page agent"
        participant Background as "background.ts"
        participant Content as "content.ts"
        participant Page as "PageController"
        participant Provider as "OpenAI-compatible API"

        SidePanel->>Core: execute(task)
        Core->>Background: TAB_CONTROL get_active_tab
        Background-->>Core: { success, tab }
        Core->>Background: PAGE_CONTROL get_browser_state
        Background->>Content: PAGE_CONTROL get_browser_state
        Content->>Page: updateTree()
        Page-->>Content: BrowserState
        Content-->>Background: BrowserState
        Background-->>Core: BrowserState
        Core->>Provider: POST {baseURL}/chat/completions
        Provider-->>Core: tool call AgentOutput
        Core->>Background: PAGE_CONTROL click_element / input_text / scroll
        Background->>Content: PAGE_CONTROL action
        Content->>Page: indexed DOM action
        Page-->>Content: ActionResult
        Content-->>Core: ActionResult
        Core-->>SidePanel: historychange / activity / statuschange

Message types
Chrome runtime messages
| Type | Direction | Action names | Payload |
|---|---|---|---|
| TAB_CONTROL | agent environment to background.ts | get_active_tab, get_tab_info, open_new_tab, create_tab_group, update_tab_group, add_tab_to_group, close_tab, get_window_tabs | Depends on action, for example { tabId }, { url }, { tabIds, windowId }, { groupId, properties } |
| PAGE_CONTROL | agent environment to background.ts to content.ts | get_my_tab_id, get_last_update_time, get_browser_state, update_tree, clean_up_highlights, click_element, input_text, select_option, scroll, scroll_horizontally, execute_javascript | { targetTabId, payload }; DOM actions pass positional arguments in payload |
| OPEN_HUB | external localhost page to background.ts | Opens or focuses hub.html?ws={wsPort} | { type: "OPEN_HUB", wsPort } |
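The three message types in the table can be modeled as a discriminated union, which makes the routing decision explicit. The sketch below is illustrative only: the action names come from the table, but the envelope field names (action, targetTabId, payload), the Route type, and routeMessage are hypothetical and not taken from the Navvy source.

```typescript
// Action names copied from the table above.
type TabAction =
  | "get_active_tab" | "get_tab_info" | "open_new_tab" | "create_tab_group"
  | "update_tab_group" | "add_tab_to_group" | "close_tab" | "get_window_tabs";

type PageAction =
  | "get_my_tab_id" | "get_last_update_time" | "get_browser_state" | "update_tree"
  | "clean_up_highlights" | "click_element" | "input_text" | "select_option"
  | "scroll" | "scroll_horizontally" | "execute_javascript";

// ASSUMPTION: the exact envelope shape is a guess for illustration.
type RuntimeMessage =
  | { type: "TAB_CONTROL"; action: TabAction; payload?: unknown }
  | { type: "PAGE_CONTROL"; action: PageAction; targetTabId: number; payload?: unknown[] }
  | { type: "OPEN_HUB"; wsPort: number };

// Hypothetical description of where a message ends up.
type Route =
  | { handler: "background" }
  | { handler: "content"; tabId: number }
  | { handler: "hub"; url: string };

// Decide which component handles a message, following the Direction column.
function routeMessage(msg: RuntimeMessage): Route {
  switch (msg.type) {
    case "TAB_CONTROL":
      // Handled entirely in background.ts.
      return { handler: "background" };
    case "PAGE_CONTROL":
      // Forwarded by background.ts to the content script in the target tab.
      return { handler: "content", tabId: msg.targetTabId };
    case "OPEN_HUB":
      // Opens or focuses the Hub page for the given WebSocket port.
      return { handler: "hub", url: `hub.html?ws=${msg.wsPort}` };
  }
}
```

For example, routeMessage({ type: "PAGE_CONTROL", action: "click_element", targetTabId: 7, payload: [3] }) yields a content route for tab 7, with the positional DOM arguments carried in payload.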
Tab event port
The agent connects with chrome.runtime.connect({ name: 'tab-events' }). The background broadcasts:

    { action: 'created', payload: { tab } }
    { action: 'removed', payload: { tabId, removeInfo } }
    { action: 'updated', payload: { tabId, changeInfo, tab } }

Main-world bridge
When the page’s local token matches the extension token, main-world.ts exposes:

    window.PAGE_AGENT_EXT.execute(task, config)
    window.PAGE_AGENT_EXT.stop()

The page world and content script exchange window.postMessage events:

    { channel: 'PAGE_AGENT_EXT_REQUEST', id, action: 'execute', payload: { task, config } }
    { channel: 'PAGE_AGENT_EXT_REQUEST', id, action: 'stop' }
    { channel: 'PAGE_AGENT_EXT_RESPONSE', id, action: 'status_change_event', payload }
    { channel: 'PAGE_AGENT_EXT_RESPONSE', id, action: 'activity_event', payload }
    { channel: 'PAGE_AGENT_EXT_RESPONSE', id, action: 'history_change_event', payload }
    { channel: 'PAGE_AGENT_EXT_RESPONSE', id, action: 'execute_result', payload?, error? }

Hub WebSocket
hub.html?ws=PORT connects to ws://localhost:{PORT} as a WebSocket client.
Inbound caller messages:

    { type: 'execute', task: string, config?: object }
    { type: 'stop' }

Outbound Hub messages:

    { type: 'ready' }
    { type: 'result', success: boolean, data: string }
    { type: 'error', message: string }

There is no FastAPI backend or SSE transport in the current monorepo.
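As a rough illustration of the Hub message shapes listed above, the sketch below types the inbound and outbound messages and validates a raw frame before acting on it. It assumes the messages travel as JSON text frames, which the text above does not state; decodeInbound and its error strings are hypothetical and not part of hub-ws.ts.

```typescript
// Message shapes copied from the lists above.
type InboundMessage =
  | { type: "execute"; task: string; config?: object }
  | { type: "stop" };

type OutboundMessage =
  | { type: "ready" }
  | { type: "result"; success: boolean; data: string }
  | { type: "error"; message: string };

type ErrorMessage = Extract<OutboundMessage, { type: "error" }>;

// ASSUMPTION: frames are JSON text. Returns the validated inbound message,
// or an outbound error message that could be sent back to the caller.
function decodeInbound(raw: string): InboundMessage | ErrorMessage {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { type: "error", message: "invalid JSON" };
  }
  if (typeof value !== "object" || value === null) {
    return { type: "error", message: "expected an object" };
  }
  const { type, task, config } = value as {
    type?: unknown;
    task?: unknown;
    config?: unknown;
  };
  if (type === "stop") {
    return { type: "stop" };
  }
  if (type === "execute" && typeof task === "string") {
    // Keep config only when it is a non-null object.
    return typeof config === "object" && config !== null
      ? { type: "execute", task, config }
      : { type: "execute", task };
  }
  return { type: "error", message: "unrecognized message" };
}
```

Validating at the boundary keeps malformed frames from local callers out of the agent loop, and the error branch reuses the outbound error shape so it can be serialized and returned as-is.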