Skip to main content
Enigma is real-time browser automation infrastructure for AI agents. Give it a task in plain English, and an AI agent executes it in a real browserβ€”clicking, typing, navigating, and extracting data autonomously. Key capabilities:
  • Sub-100ms response times via hybrid CNN-LLM architecture
  • Live video streaming to observe sessions in real-time
  • Human-in-the-loop with guardrails and manual takeover
  • Flexible integration via REST API, WebSocket, MCP, or OpenAI-compatible endpoints

Get Started


What Can Enigma Do?

Research & Data Extraction

Gather information from multiple sources, extract structured data, and compile research reports automatically. Example: β€œSearch LinkedIn for engineering managers in San Francisco, extract their profiles, and compile contact information into a structured list.”

Form Automation

Fill out complex forms, handle multi-step workflows, and submit applications with conditional logic. Example: β€œGo to this insurance quote form, fill it out using the customer data I provide, and return the final quote.”

E-commerce Operations

Search for products, compare prices, add items to cart, and even complete checkout flows with human approval. Example: β€œFind the top 3 wireless keyboards on Amazon under $50, add the best-rated one to cart, and show me the checkout page.”

Dynamic Testing

Test web applications with natural language instructions, adapting to UI changes without brittle selectors. Example: β€œNavigate through the signup flow, try to register with invalid data, and report any validation errors you encounter.”

How It Works

Sessions & Tasks

A session is an isolated browser instance controlled by an AI agent. A task is a single objective for the agent to complete within that session.
Session (sessionId: "sess_abc")
 β”œβ”€β”€ Task 1 β†’ completed
 β”œβ”€β”€ Task 2 β†’ completed
 └── Task 3 β†’ guardrail triggered
Sessions persist until you terminate them or they time out (max 5 minutes). One session can run multiple tasks sequentially. Learn more about Sessions | Learn more about Tasks

Response Model

Most browser tasks complete in 10-40 seconds. Enigma waits up to 50 seconds for your task to finishβ€”meaning you typically get results inline, in a single request.
POST /start/run-task
     ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Task completes in < 50s?          β”‚
β”‚  β”œβ”€β”€ Yes β†’ Result returned inline  β”‚
β”‚  └── No  β†’ pollUrl returned        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
This gives you the simplicity of synchronous APIs for typical tasks, with the reliability of async for complex multi-step operations. Learn more about Response Model

Guardrails

When the agent needs human inputβ€”credentials, clarification, approvalβ€”it triggers a guardrail and pauses. Your application detects this and provides the input. Common triggers: Login forms, purchase confirmations, CAPTCHAs, ambiguous instructions. Learn more about Guardrails

Choose Your Integration

Which endpoint should I use?

Need browser automation?
β”œβ”€β”€ Single task, don't care about session? β†’ POST /start/run-task
β”œβ”€β”€ Multiple tasks in sequence? β†’ POST /start/start-session + /send-message
β”œβ”€β”€ Using LangChain/existing OpenAI code? β†’ POST /v1/chat/completions
└── Using Claude Desktop/MCP client? β†’ MCP Server

REST vs WebSocket?

  • REST: Simpler. Good enough for 90% of use cases. Poll for results.
  • WebSocket: Only if you need live agent thoughts or sub-second event handling.

Integration Methods

MethodBest ForReal-time Events
REST APISimple integrations, serverless, stateless workflowsPoll for updates
WebSocketLive dashboards, interactive UIs, real-time agent thoughtsYes
OpenAI-CompatibleLangChain, LlamaIndex, Vercel AI SDK, existing OpenAI toolingPoll or stream
MCP ServerClaude Desktop, Cline, any MCP-compatible AI assistantNo
Workflowsn8n, Make.com, ZapierPoll for updates

Enigma vs. Traditional Automation

EnigmaPlaywright/Puppeteer
InputNatural languageCode
AdaptabilityAI agent adapts to UI changesScripts break on changes
MaintenanceSelf-healingManual updates required
LatencySub-100ms decisions~50ms per action
Best forDynamic tasks, scraping, form-fillingRegression testing, CI/CD

Next Steps