For two years the credible browser-using AI agents — the ones that can actually log into a vendor portal, click through a form, read the confirmation page, and write the result back to a record of work — have been a closed set: OpenAI Operator, Anthropic Computer Use, and Google Gemini 2.5 Computer Use. All three were proprietary, all three cloud-only. For mid-market firms wanting to deploy a browser-using AI Employee against the dozens of carrier portals, county-recorder portals, ERP web consoles, and EHR-vendor portals that make up the surface area of mid-market work, the proprietary-cloud constraint was binding. The credential never wanted to leave the building. The architecture wouldn't let it.
That just changed. According to MarkTechPost's 2026-05-22 reporting on Microsoft's Fara1.5 release, Microsoft Research has published a family of three browser-computer-use models — 4B, 9B, and 27B parameters — that outscore both OpenAI Operator and Gemini 2.5 Computer Use on the Online-Mind2Web benchmark. The headline numbers: Fara1.5-27B reaches 72% task success on Online-Mind2Web; OpenAI Operator sits at 58.3%; Gemini 2.5 Computer Use sits at 57.3%. The 9B variant lands at 63.4% — still ahead of both proprietary baselines. The 4B variant is small enough to run on a single workstation GPU. The architectural constraint that kept browser CUAs locked inside the vendor clouds for two years has, in the span of a week, evaporated.
Key Takeaways
- Microsoft Fara1.5 is the first open-source browser computer-use agent family that beats the proprietary baselines from OpenAI and Google on Online-Mind2Web — 27B at 72%, 9B at 63.4%, both ahead of OpenAI Operator (58.3%) and Gemini 2.5 Computer Use (57.3%).
- The 4B and 9B variants are small enough to run on commodity workstation GPUs, unlocking the on-premise deployment pattern that mid-market firms have been waiting for.
- The 5-Question Browser-Agent Readiness Test helps a mid-market operator decide whether a specific portal-driven workflow is ready to move from “human clicks the portal” to “AI Employee drives the portal.”
- The Secure AI Gateway is the natural host for a Fara-class browser agent — it vaults the credentials, injects them at the network layer, enforces the per-portal authorization scope, writes the portal-action audit log as a side-effect, and rate-limits the agent against the carrier's terms of service.
- The single most important architectural rule is: the credential never enters the agent's working memory. The gateway injects it into the browser session at the network layer, on a per-action authorization scope, with an audit record written before the action runs.
- Browser-acting agents are uniquely exposed to prompt-injection attacks embedded in portal DOM content. The OWASP LLM Top 10 controls and the confused-deputy framing are the right defenses to wire in from day one.
What is a browser computer-use agent, and what just changed?
A browser computer-use agent (CUA) is an AI agent whose primary interface is a real web browser. It observes pages through screenshots, plans through a language-model planner, and acts through mouse and keyboard primitives — click at coordinates, type into a field, scroll, navigate, wait for the page to settle, screenshot, repeat. The architecture matters because most mid-market business work happens inside vendor web portals, not inside APIs the firm has integration access to. An AI Employee that can drive a browser can reach any portal a human can reach. An API-only AI Employee is locked out of anything the vendor has not chosen to expose.
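The observe, plan, act cycle described above can be sketched as a small loop. This is a minimal illustration only: `Action`, `run_episode`, and the stub functions are hypothetical names, not part of Fara1.5 or any vendor SDK, and a real deployment would back `observe` and `act` with a browser driver and `plan` with the model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str            # "click", "type", "scroll", "navigate", "wait", or "done"
    argument: str = ""   # coordinates, text, or URL, depending on kind


def run_episode(
    observe: Callable[[], bytes],
    plan: Callable[[bytes, List[Action]], Action],
    act: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat observe -> plan -> act until the planner returns "done" or the step budget runs out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = observe()             # capture the current page
        action = plan(screenshot, history) # ask the planner for the next primitive
        history.append(action)
        if action.kind == "done":
            break
        act(action)                        # perform the mouse or keyboard primitive
    return history


# Stub wiring so the loop can be exercised without a browser or a model.
def fake_observe() -> bytes:
    return b"fake-screenshot"


def fake_plan(screenshot: bytes, history: List[Action]) -> Action:
    script = [Action("click", "120,340"), Action("type", "policy 123"), Action("done")]
    return script[len(history)]


performed: List[Action] = []
trace = run_episode(fake_observe, fake_plan, performed.append)
```

The step budget (`max_steps`) is an assumption added here so a confused planner cannot loop forever; the source does not specify one.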
Until last week, three proprietary CUA stacks defined the category: OpenAI's Operator, Anthropic's Computer Use, and Google's Gemini 2.5 Computer Use. Each is a credible production system. None is deployable on premises. None lets the buyer choose where the browser session runs. For workflows where the buyer cannot send portal credentials to a third-party cloud, whether for regulatory, contractual, or vendor-ToS reasons, the proprietary CUAs were architecturally unavailable.
Microsoft Fara1.5 changes that. The family is built on the Qwen3.5 base and ships in 4B, 9B, and 27B parameter sizes. The training corpus is roughly two million samples — 60% web trajectories, with smaller slices for synthetic environments, form-filling, grounding, visual question answering, and safety data. The reported benchmarks include Online-Mind2Web (27B at 72%, 9B at 63.4%), WebVoyager (27B at 88.6%, 9B at 86.6%, 4B at 80.8%), and WebTailBench v1.5 process success at 64.5% for the 9B variant. The 27B model is a major jump over the predecessor Fara-7B, which scored 34.1% on the same Online-Mind2Web benchmark.
Two other Fara1.5 features matter for the mid-market deployment picture. First, the model is trained to stop and ask the user in three situations: when personal information is needed but missing, when the task is ambiguous, and when an action is irreversible. That trio is the practical instantiation of a human-in-the-loop gate inside a browser session. Second, all actions are logged inside the MagenticLite sandboxed browser interface — the audit record is produced by the runtime, not by a separate compliance workstream. Both are architectural fits for the buyer-owned pattern.
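The three stop-and-ask situations can also be mirrored on the orchestration side as a defensive check. The sketch below is an assumption about how a buyer might wrap the agent, not a description of how Fara1.5 implements the behavior internally (the source says only that the model is trained to pause). `ProposedStep` and `pause_reason` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedStep:
    description: str
    missing_personal_info: bool = False  # personal information is needed but not supplied
    ambiguous: bool = False              # the task can be read more than one way
    irreversible: bool = False           # the action cannot be undone at the portal


def pause_reason(step: ProposedStep) -> Optional[str]:
    """Return why the session should stop and ask the user, or None to proceed."""
    if step.missing_personal_info:
        return "personal information needed but missing"
    if step.ambiguous:
        return "task is ambiguous"
    if step.irreversible:
        return "action is irreversible"
    return None
```

A caller would check `pause_reason(step)` before dispatching each step and route any non-`None` result to a human reviewer.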
The capability trajectory matches the broader pattern the Stanford HAI 2026 AI Index Report tracks across benchmark categories where the agent has to act on a digital environment, not just produce text. Leaders change quarter to quarter, the open-weight tier is closing the gap with the closed-weight tier, and the choice between “vendor cloud” and “buyer-owned” is now a real choice for most mid-market workloads.

The 5-Question Browser-Agent Readiness Test
Not every portal-driven workflow is ready to move from “human clicks the portal” to “AI Employee drives the portal.” The five questions below are the structural diligence test. Each question has a clean buyer-side acceptable answer, and the combination of answers determines whether the workflow is ready, partially ready, or not yet a candidate for browser-CUA deployment.
1. Does this workflow involve at least three portal logins per ticket, quote, claim, or matter?
The acceptable answer is yes. Browser CUAs are slower per-action than APIs and slower per-action than a fast human typist. The economic case for deploying one only closes when the workflow already costs the human enough portal-driving time per unit of work to amortize the slower per-action latency. Three portal logins per unit is a useful threshold. A workflow that touches one portal twice a quarter is not the right place to start. A workflow that touches six portals per claim, every claim, is.
2. Are the portal credentials currently shared, or are they per-user?
The acceptable answer is either, but you need to know which. Shared credentials (a single service-account login used by multiple humans) are the easier case for a browser CUA — the gateway already vaults a credential that no individual owns. Per-user credentials are more complex because the agent has to act under a specific human's identity, which requires a delegated-authority pattern at the gateway and an authorization record written before each portal action runs. Both are workable. Neither is workable without the gateway.
3. Is the workflow's success state machine-checkable?
The acceptable answer is yes. The agent needs to know whether a step succeeded — that is, whether the portal returned a confirmation page, a record number, an updated status, or whatever the vendor's success signal is. Workflows where success is “the portal probably did the thing” are not yet ready. Workflows where success is “the portal returns a quote number ending in -2026” are ready. Machine-checkable success is what lets the agent decide whether to advance, retry, or escalate.
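A machine-checkable success state can be as simple as a pattern match on the confirmation page. The sketch below assumes a made-up quote-number format (`Q-` plus six digits plus `-2026`) and a retry budget of three attempts; both are illustrative assumptions, and a real portal's success signal would replace the pattern.

```python
import re
from enum import Enum


class Outcome(Enum):
    ADVANCE = "advance"
    RETRY = "retry"
    ESCALATE = "escalate"


# Hypothetical success signal: a quote number such as "Q-004512-2026".
QUOTE_NUMBER = re.compile(r"\bQ-\d{6}-2026\b")


def check_step(page_text: str, attempts: int, max_attempts: int = 3) -> Outcome:
    """Decide whether to advance, retry, or escalate based on the page text."""
    if QUOTE_NUMBER.search(page_text):
        return Outcome.ADVANCE
    if attempts < max_attempts:
        return Outcome.RETRY
    return Outcome.ESCALATE
```

The point of the design is that the decision depends only on text the portal actually returned, never on the agent's belief that the step "probably" worked.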
4. Is there a regulatory audit obligation that requires preserving every action taken inside the portal?
The acceptable answer is yes — and this is a strong yes for browser-CUA deployment, not a complication. A workflow with an audit obligation is a workflow where the browser-side audit log is already a requirement on the human side; deploying a CUA whose audit log is produced as a structural side-effect of the runtime actually improves compliance posture rather than degrading it. We described the runtime pattern in the Fort Wayne AI agent authorization and audit playbook; the per-action authorization decision point and the audit record live at the same gateway.
5. Is a two-second per-action latency acceptable in the workflow?
The acceptable answer is yes. Browser CUAs operate at the speed of the page — observe, plan, act, wait, observe again. Per-action latency is in the seconds, not the milliseconds. Workflows where two seconds per click is fine — back-office processing, overnight batch runs, asynchronous portal sweeps — are good candidates. Real-time customer-facing interactions where the human is waiting on the other end of the conversation are not.
The decision matrix is straightforward. A workflow that answers yes to questions 1, 3, 4, and 5, and has a defensible answer to question 2, is ready to be the firm's first browser-CUA deployment. A workflow that fails question 1 or question 5 is not yet ready, regardless of how well it does on the other questions. A workflow that fails question 3 needs a state-machine retrofit before any agent is deployed against it. A workflow that fails question 4 is fine for browser-CUA in principle but probably is not the first place to focus.
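The decision matrix can be written down as a short function. This is a sketch of the rules as stated in the text; the field names and the returned labels are illustrative, and the "resolve credential model first" label for an unanswered question 2 is an assumption, since the text only requires that the answer be defensible.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    three_plus_logins: bool          # Q1: at least three portal logins per unit of work
    credential_model_known: bool     # Q2: the firm knows whether credentials are shared or per-user
    success_machine_checkable: bool  # Q3: the success state can be checked by a machine
    audit_obligation: bool           # Q4: a regulatory audit obligation exists
    two_second_latency_ok: bool      # Q5: two seconds per action is acceptable


def readiness(a: Answers) -> str:
    """Map the five answers to a readiness verdict, following the order in the text."""
    if not a.three_plus_logins or not a.two_second_latency_ok:
        return "not yet ready"
    if not a.success_machine_checkable:
        return "needs state-machine retrofit"
    if not a.credential_model_known:
        return "resolve credential model first"
    if not a.audit_obligation:
        return "eligible, but not the first place to focus"
    return "ready for first deployment"
```

For example, `readiness(Answers(True, True, True, True, True))` returns `"ready for first deployment"`, while flipping only the latency answer returns `"not yet ready"`.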

How do the major browser-CUA deployment patterns compare?
The matrix below is the comparison we walk new mid-market clients through when they're scoping their first browser-CUA workflow.
| Deployment pattern | Credential handling | Audit-log architecture | Portal coverage breadth | Mid-market fit verdict |
|---|---|---|---|---|
| OpenAI Operator (cloud) | Credentials transit to OpenAI cloud session | Vendor-side log; export and join required for buyer's pipeline | Broad — any public web portal | Good for low-sensitivity portals; bad for any portal whose ToS or regulation forbids third-party credential transit |
| Gemini 2.5 Computer Use (cloud) | Credentials transit to Google cloud session | Vendor-side log; export and join required | Broad — any public web portal | Same shape as Operator — broad capability, narrow eligible-data footprint |
| Anthropic Computer Use (cloud) | Credentials transit to Anthropic cloud session | Vendor-side log; export and join required | Broad — any public web portal | Same shape — depends on what the firm is allowed to send to a third-party cloud |
| Fara1.5 on-premise (buyer host) | Credentials vaulted on buyer infrastructure | Buyer-owned log; written as runtime side-effect | Broad — any portal the buyer's network can reach | Strong fit where regulation, contract, or risk policy forbids third-party credential transit; requires buyer to operate the runtime |
| Fara1.5 inside the Secure AI Gateway | Credentials vaulted at gateway; injected at network layer | Buyer-owned log; written as runtime side-effect; gated per-action authorization | Broad — any portal the gateway's egress allow-list permits | Strongest fit for mid-market — buyer owns the credentials, the audit, the policy, and the egress, without operating the model runtime in-house |
The difference between rows four and five is operational burden. Running Fara1.5 on an on-premise workstation GPU is feasible (that is the headline benefit of the model being small), but it requires the firm to own the model deployment, the workspace, the credential vault, the authorization decision point, the audit pipeline, and the rate-limit enforcement. The gateway pattern hands model deployment and the operational layers around it to Cloud Radix, while leaving credentials, audit, policies, and egress in the buyer's control. For most mid-market firms, the gateway pattern is the lower-friction path to the same outcome.
The lineage is the same one we argued in the buyer-owned AI agent harness and persistent memory architecture post: the durable thing the buyer is buying is not the model and not the agent — it is the runtime layer where credentials, policies, and audit live. Fara1.5 makes that runtime layer cheaper to populate with a real browser agent than it has ever been.

How does the Secure AI Gateway host a Fara-class browser agent safely?
The single most important architectural rule for a buyer-owned browser CUA: the credential never enters the agent's working memory. The agent does not see the portal password. The agent does not hold the OAuth token in its context. The agent issues an action — “log into the carrier portal under the firm's broker identity” — and the gateway injects the credential into the browser session at the network layer, scoped to that single action, identity, and time window. If the agent's working memory is exfiltrated tomorrow through a prompt-injection attack in the carrier's DOM, the attacker gets the conversation. They do not get a usable credential.
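The rule can be illustrated with a toy gateway in which the agent only ever handles an opaque credential reference. Everything here is a simplified assumption: `LoginAction`, `Gateway`, the vault dictionary, and the example secret are hypothetical, and the `injected` list merely stands in for the browser session that a real gateway would write to at the network layer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LoginAction:
    portal: str
    credential_ref: str  # opaque handle into the vault, never the secret itself


class Gateway:
    def __init__(self, vault: Dict[str, str]) -> None:
        self._vault = vault
        self.injected: List[str] = []  # stand-in for the browser session

    def execute(self, action: LoginAction) -> str:
        secret = self._vault[action.credential_ref]  # resolved inside the gateway only
        self.injected.append(secret)                 # placeholder for session injection
        return f"logged in to {action.portal}"       # the only value returned to the agent


vault = {"carrier-a/broker": "s3cr3t-example"}
gateway = Gateway(vault)

# Everything the agent sees or says goes into its context; the secret never does.
agent_context: List[str] = []
action = LoginAction(portal="carrier-a", credential_ref="carrier-a/broker")
agent_context.append(repr(action))
agent_context.append(gateway.execute(action))
```

If `agent_context` were leaked, it would contain the reference `carrier-a/broker` and the result string, but no usable credential.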
That single rule is the difference between a browser CUA that is a security asset and one that is a structural liability. The rule maps onto the OWASP Top 10 for LLM Applications 2025 entries that matter most for execution-capable agents: LLM01 (Prompt Injection) — because the agent is reading attacker-controllable DOM content and treating it as instructions; LLM02 (Sensitive Information Disclosure) — because credentials and PII are visible on every portal page; and LLM06 (Excessive Agency) — because the agent's reach is bounded only by the gateway's egress allow-list.
Around that rule, the gateway adds five structural defenses for a browser CUA:
Per-portal authorization scope. Each portal the agent is allowed to drive is its own authorization grant. The grant has a credential reference, a list of allowed action verbs, a working-hours window, a rate-limit policy, and an audit record that ties every action to the human or service identity that initiated it. The carrier-portal grant does not authorize actions against the court-docket portal.
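A per-portal grant of this shape might be modeled as a small immutable record. The sketch below is one possible layout under assumed names (`PortalGrant`, `permits`, the example hosts and verbs); it covers the credential reference, allowed verbs, working-hours window, and rate-limit policy, and leaves the audit record to the runtime.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet


@dataclass(frozen=True)
class PortalGrant:
    portal_host: str
    credential_ref: str
    allowed_verbs: FrozenSet[str]
    window_start: time
    window_end: time
    max_actions_per_minute: int

    def permits(self, host: str, verb: str, at: time) -> bool:
        """True only for this portal, an allowed verb, and a time inside the window."""
        return (
            host == self.portal_host
            and verb in self.allowed_verbs
            and self.window_start <= at <= self.window_end
        )


carrier_grant = PortalGrant(
    portal_host="portal.carrier.example",
    credential_ref="carrier-a/broker",
    allowed_verbs=frozenset({"login", "run_quote", "download_summary"}),
    window_start=time(7, 0),
    window_end=time(19, 0),
    max_actions_per_minute=20,
)
```

Because the host is part of the check, the carrier grant cannot be reused against a different portal, which is the property the paragraph above describes.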
Network egress allow-list at the gateway. The agent cannot navigate to an arbitrary URL just because the LLM decided it should. The gateway enforces the allow-list at the network boundary. New endpoints require explicit policy admission. The agent that has been induced by a prompt-injection attack to try to navigate to an exfiltration endpoint simply cannot reach it.
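An allow-list check at the boundary can be sketched with the standard library's URL parser. The host names are placeholders, and requiring `https` is an added assumption rather than something the text specifies; a production gateway would enforce this at the proxy or firewall, not in agent code.

```python
from urllib.parse import urlsplit

# Placeholder hosts; a real policy would be loaded from the gateway's configuration.
ALLOWED_HOSTS = frozenset({"portal.carrier.example", "permits.county.example"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Permit navigation only to exact, pre-admitted hosts over HTTPS."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

Matching on the parsed host name rather than on a substring of the URL is deliberate: look-alike addresses such as `https://portal.carrier.example.evil.test/` or `https://portal.carrier.example@evil.test/` resolve to a different host and are rejected.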
Action audit before execution. Every portal action — every click, every form submission, every file upload — is written into the buyer's audit log before the action is executed at the portal. If the action is reversible, it runs. If it is high-tier — initiating a payment, submitting a binder, signing on behalf — it pauses for the human-approval gate Fara1.5 is trained to surface. The audit record is the artifact, not the deliverable.
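The ordering constraint (record first, act second, pause on high-tier verbs) might look like the sketch below. The verb names, the `HIGH_TIER` set, and the in-memory list used as an audit log are illustrative assumptions; a real gateway would write to a durable, append-only store.

```python
from typing import Callable, Dict, List

# Hypothetical high-tier verbs, taken from the examples in the text.
HIGH_TIER = frozenset({"initiate_payment", "submit_binder", "sign_on_behalf"})


def run_action(
    verb: str,
    identity: str,
    execute: Callable[[], None],
    audit_log: List[Dict[str, str]],
    approved: bool = False,
) -> str:
    """Write the audit record first, then execute or pause for human approval."""
    audit_log.append({"verb": verb, "identity": identity, "event": "requested"})
    if verb in HIGH_TIER and not approved:
        audit_log.append({"verb": verb, "identity": identity, "event": "paused_for_approval"})
        return "paused"
    execute()  # only reached after the "requested" record exists
    audit_log.append({"verb": verb, "identity": identity, "event": "executed"})
    return "executed"
```

Appending a second record for the outcome, rather than editing the first, keeps the log append-only, so a crash between the two writes still leaves evidence that the action was requested.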
Rate-limit enforcement against vendor terms of service. Every carrier, every court system, every EHR vendor has a rate limit (sometimes written, sometimes implicit) for portal traffic from a single account. The gateway enforces the buyer's policy against those limits — partly as a vendor-relationship discipline, partly because hammering a carrier portal with an agent looks like the kind of behavior that gets the firm's account flagged.
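One common way to enforce a per-account limit is a sliding-window counter. The sketch below is a generic technique, not a description of the gateway's actual limiter; the class name and the example limit of two actions per sixty seconds are assumptions, and the caller passes the clock value explicitly so the behavior is easy to test.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most max_actions within any rolling window of window_seconds."""

    def __init__(self, max_actions: int, window_seconds: float) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_actions:
            self._stamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_actions=2, window_seconds=60.0)
decisions = [limiter.allow(t) for t in (0.0, 10.0, 20.0, 61.0)]
```

In this run the third request is refused because two actions already fall inside the window, and the fourth is allowed once the first has aged out.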
Confused-deputy containment. A browser CUA is the textbook confused-deputy: the agent has the firm's authority, the portal trusts the agent's identity, and a prompt injection in the portal's content can attempt to redirect the agent's authority toward an attacker's goal. We described the audit shape for this risk in the confused-deputy AI agents audit matrix; the same matrix applies to browser CUAs as much as it does to any other execution-capable agent. The gateway containment is the per-action authorization scope plus the egress allow-list plus the audit log; together they are how the agent gets bounded.
This is also the place to note the vendor caveat. Microsoft Fara1.5 is a Microsoft model, and the prompt-injection risk surface for browser-acting Microsoft systems has been a live conversation in mid-market IT; we wrote about the broader pattern in Fort Wayne Microsoft Copilot prompt injection risk. The same defenses apply. The model identity does not change the architecture; what defines it is the gateway and the four boundaries it keeps in the buyer's control: credentials, audit, policies, and egress.

What does a Fara-class browser CUA look like across Northeast Indiana?
Three named scenarios make the architecture concrete. Each is a real shape of mid-market work in Allen, DeKalb, and Whitley County firms.
The Allen County independent insurance brokerage running quote retrievals across six carrier portals. Today, the brokerage's CSR logs into each carrier portal in sequence, runs the quote, downloads the binder summary, attaches it to the client file in the agency-management system, and repeats. Six portals at three minutes each is eighteen minutes per quote, every quote. A Fara1.5-9B agent inside the Secure AI Gateway runs the same sequence with the credentials vaulted at the gateway, the per-portal authorization scope enforced at each step, and an audit record written before each portal action. The CSR initiates the work, reviews the results, and approves the binding step. Eighteen minutes drops to three or four minutes of human attention per quote — the rest is the audited agent session.
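The per-quote arithmetic in this scenario is simple enough to check directly. The function name is illustrative; the inputs are the figures quoted above (six portals, three minutes each, three to four minutes of remaining human attention).

```python
def attention_saved_per_quote(
    portals: int, minutes_per_portal: float, review_minutes: float
) -> float:
    """Human minutes saved per quote: manual portal time minus remaining review time."""
    return portals * minutes_per_portal - review_minutes


manual_minutes = 6 * 3                                # eighteen minutes of manual portal work
saved_low = attention_saved_per_quote(6, 3, 4)        # when review takes four minutes
saved_high = attention_saved_per_quote(6, 3, 3)       # when review takes three minutes
```

On those figures the CSR gets back roughly fourteen to fifteen minutes of attention per quote.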
The upstream context for the workflow comes from the AI Employees and conversational context capture architecture layer: the agent needs to know which carriers to quote, which client identity to act under, which prior policies to compare against. The capture layer feeds the browser CUA the context; the gateway enforces the boundaries. Both layers run on the buyer's infrastructure.
The DeKalb County manufacturer pulling permit statuses from county-government portals. The plant manager checks four county permit portals every Monday — environmental, building, zoning, and stormwater — to track open permit applications and renewal deadlines. The work is low-criticality per check but cumulative across the year. A Fara1.5-4B agent on a workstation GPU runs the four-portal sweep nightly, writes the status into a single dashboard, and surfaces only the deltas to the plant manager. The browser CUA's audit log is the answer to the regulator's question, when it comes, about how the firm has been monitoring its own permit posture.
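Surfacing only the deltas from a nightly sweep amounts to comparing tonight's statuses with the previous snapshot. A minimal sketch, assuming statuses are kept as a dictionary keyed by a permit identifier (the identifiers and status strings below are made up):

```python
from typing import Dict, Optional, Tuple


def status_deltas(
    previous: Dict[str, str], current: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return permits whose status is new or changed, as (old_status, new_status)."""
    changes: Dict[str, Tuple[Optional[str], str]] = {}
    for permit_id, status in current.items():
        old = previous.get(permit_id)  # None when the permit was not seen before
        if old != status:
            changes[permit_id] = (old, status)
    return changes


last_night = {"ENV-1": "under review", "BLD-7": "approved"}
tonight = {"ENV-1": "approved", "BLD-7": "approved", "ZON-3": "submitted"}
deltas = status_deltas(last_night, tonight)
```

Here `deltas` contains only `ENV-1` (changed) and `ZON-3` (new); the unchanged `BLD-7` is not surfaced. Permits that disappear from a portal are not reported by this sketch and would need a separate check.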
The Fort Wayne law firm's paralegals running daily court-docket sweeps. The firm's paralegals log into the Allen County e-filing system, the U.S. District Court system, and the Indiana state courts every morning to check for filings, hearing notices, and orders on active matters. The sweep is repetitive, the failure cost of missing a filing is real, and the work is exactly the kind of “audit-required, machine-checkable, multi-portal” workflow the readiness test was built to identify. A Fara-class agent runs the sweep, writes the results to the matter file, queues the paralegal's review of any filing that touched an active matter, and provides the audit log the firm's malpractice carrier will eventually ask to see.
In all three scenarios, the runtime that hosts the browser session is the self-hosted Kubernetes AI agent runtime we have written about elsewhere, the credentials live in the gateway, and the audit log is written into the firm's logging pipeline as a side-effect of the work. The browser CUA is the action layer. The gateway is the runtime where the action layer is bounded.

What does this mean for NE Indiana mid-market buyers right now?
For mid-market operations and IT leaders across Northeast Indiana (firms in cities such as Auburn and Fort Wayne, and across DeKalb, Allen, Whitley, and Noble Counties) evaluating where to put AI Employee investment this quarter, the practical move is to run the 5-Question Browser-Agent Readiness Test against the five most repetitive portal-driven workflows in the building and pick the strongest candidate as the first deployment. The right first workflow scores yes on questions 1, 3, 4, and 5, has a defensible answer to question 2, and is owned by a single operator willing to be the design partner on the rollout. One workflow done well beats five workflows in pilot.
The choice between “send credentials to a third-party cloud” and “keep credentials in a buyer-owned gateway and run a Fara-class model behind it” is now a real choice, not a hypothetical one. Cloud CUAs are good systems and the right answer for some workloads — public-web research, low-sensitivity portals, prototyping. The buyer-owned pattern is the right answer for any workflow whose credentials, audit obligations, or vendor terms of service forbid third-party transit. The decision framework should map onto the NIST AI Risk Management Framework Govern/Map/Measure/Manage functions — Govern says what the firm's policy is, Map inventories the data the portal sees, Measure produces the audit record, Manage enforces the runtime controls.
Cloud Radix's Secure AI Gateway is the buyer-owned runtime layer we stand up for mid-market firms that want to deploy browser-using AI Employees against their actual portal-driven workflows without sending the credentials to a third-party cloud. The gateway vaults the credentials, injects them at the network layer, enforces the per-portal authorization scope, writes the audit log as a side-effect of the runtime, and rate-limits the agent against vendor terms of service. Our AI Consulting practice runs the 30-day Browser-Agent Workflow Pilot as a fixed-scope engagement: we identify the buyer's single highest-leverage portal-driven workflow, deploy a Fara-class browser agent inside the gateway, and report back full audit data, ROI math, and a 90-day rollout plan for the next two workflows. The deliverable is the working pilot, the audit record, and the architecture diagram — not a slide deck.

Frequently Asked Questions
Q1. What is Microsoft Fara1.5?
Microsoft Fara1.5 is a family of three open-source browser computer-use agent models — 4B, 9B, and 27B parameters — released by Microsoft Research and reported by MarkTechPost on 2026-05-22. On the Online-Mind2Web benchmark, the 27B variant scores 72% task success and the 9B variant scores 63.4%, both above OpenAI Operator (58.3%) and Gemini 2.5 Computer Use (57.3%). The 4B variant runs on a single workstation GPU.
Q2. What is a browser computer-use agent (CUA)?
A browser computer-use agent is an AI agent whose primary interface is a real web browser. It observes pages through screenshots, plans the next action through a language-model planner, and acts through mouse and keyboard primitives. CUAs can drive any web portal a human can drive, which covers the large share of mid-market business work that has no integration API.
Q3. Why does an on-premise browser CUA matter for Fort Wayne and NE Indiana mid-market firms?
Many Northeast Indiana mid-market workflows touch portals whose credentials cannot — for regulatory, contractual, or vendor-ToS reasons — be transmitted to a third-party cloud. Carrier portals, Allen County e-filing systems, EHR portals, and DeKalb/Allen/Whitley county-government portals are typical examples. An on-premise or gateway-hosted CUA keeps the credentials in the buyer's infrastructure.
Q4. What is the 5-Question Browser-Agent Readiness Test?
A structural diligence checklist for deciding whether a portal-driven workflow is ready for browser-CUA deployment. The five questions cover portal-login density, credential handling, machine-checkable success state, audit obligation, and per-action latency tolerance. Workflows that score well on questions 1, 3, 4, and 5 — with a defensible answer to question 2 — are the right candidates.
Q5. How does the Secure AI Gateway protect a browser CUA from prompt injection?
The gateway enforces the rule that the credential never enters the agent's working memory, the per-portal authorization scope, the network egress allow-list, the action audit before execution, and the rate-limit policy. A prompt injection in portal DOM content cannot exfiltrate credentials the agent does not hold, cannot reach URLs outside the allow-list, and cannot execute high-tier actions without the human-approval gate. The OWASP entries that matter are LLM01, LLM02, and LLM06.
Q6. Is a browser CUA the same as an RPA bot?
No. RPA bots record a fixed click-path and replay it; when the portal's HTML changes, they break. A browser CUA uses a vision-language model to understand the page and plan the action, so it absorbs minor changes without re-scripting and handles workflows the developer did not pre-wire. CUAs offer more capability than RPA bots but also expose a larger security surface area, which is why gateway containment matters.
Q7. How long does the Browser-Agent Workflow Pilot take?
Thirty days, fixed scope. The pilot covers the buyer's single highest-leverage portal-driven workflow — the one identified by the 5-Question Readiness Test as the strongest candidate. The deliverable is a working Fara-class agent inside the Secure AI Gateway, the full audit record, ROI math against the human baseline, and a 90-day rollout plan for the next two workflows.
Sources & Further Reading
- MarkTechPost: marktechpost.com/2026/05/22/microsoft-releases-fara1-5 — 2026-05-22 reporting on Microsoft Fara1.5 with benchmark comparisons against OpenAI Operator and Gemini 2.5 Computer Use.
- Microsoft Research: microsoft.com/en-us/research — Source of the Fara1.5 model family release notes and the MagenticLite sandboxed browser interface description.
- OpenAI — Operator product: openai.com — Reference for the proprietary cloud baseline.
- Anthropic Computer Use documentation: docs.anthropic.com — Reference for the Anthropic Computer Use proprietary cloud baseline.
- NIST AI Risk Management Framework: nist.gov/itl/ai-risk-management-framework — Govern/Map/Measure/Manage functions referenced as the decision framework for cloud-vs-buyer-owned CUA deployments.
- OWASP Top 10 for LLM Applications 2025: genai.owasp.org/llm-top-10 — LLM01, LLM02, and LLM06 controls referenced as the right defenses for browser-acting agents.
- Stanford HAI 2026 AI Index Report: hai.stanford.edu/ai-index/2026-ai-index-report — The benchmark-trajectory evidence for the closing gap between open-weight and closed-weight computer-use agents.



