Webglide - Computer-Using Agents, 2025

Explorations in a universal interface for AI to interact with the web.

Role: Design lead, PM.
Context:
Confidential · Details available upon request.
Built on IBM’s CUGA framework, ranked #1 on the AppWorld benchmark and #2 on WebArena at the time.

My focus was not on the benchmark performance itself, but on the interface layer required to make agent behaviour legible, steerable and trustworthy.

Overview

This project explored what a “universal interface” might look like when AI agents interact directly with the digital world: navigating web environments, executing tasks, and reasoning across applications autonomously.

The work was built on IBM’s CUGA framework (ranked #1 on the AppWorld benchmark and #2 on WebArena at the time), which provided a strong technical foundation for agent performance across complex, real-world environments.

This project is confidential. Details available upon request.

About CUGA

CUGA (ConfigUrable Generalist Agent) is an open-source generalist agent framework from IBM Research, purpose-built for enterprise automation. Designed for developers, CUGA combines and improves on the best foundational agentic patterns, such as ReAct, CodeAct, and Planner-Executor, in a modular architecture that enables trustworthy, policy-aware, and composable automation across web interfaces, APIs, and custom enterprise systems.
CUGA achieves state-of-the-art performance on leading benchmarks:
→🥇 #1 on AppWorld — a benchmark with 750 real-world tasks across 457 APIs
→🥈 #2 on WebArena — a complex benchmark for autonomous web agents across application domains.

Key features
Complex task execution: State-of-the-art results across web and API tasks.
Flexible tool integrations: CUGA works across REST APIs via OpenAPI specs, MCP servers, and custom connectors.
Composable agent architecture: CUGA itself can be exposed as a tool to other agents, enabling nested reasoning and multi-agent collaboration.
Configurable reasoning modes: Choose between fast heuristics or deep planning depending on your task’s complexity and latency needs.
Policy-aware instructions (Experimental): CUGA components can be configured with policy-aware instructions to improve alignment of the agent behavior.
Save & Reuse (Experimental): CUGA captures and reuses successful execution paths, enabling consistent and faster behavior across repeated tasks.
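
To make the Save & Reuse idea concrete, here is a minimal Python sketch of caching a successful execution path and replaying it on a repeated task. It is not CUGA's implementation or API: PathCache, run_task, and the plan and execute callables are hypothetical names chosen for illustration, and the task normalisation is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple


class PathCache:
    """Hypothetical store of execution paths that succeeded before."""

    def __init__(self) -> None:
        self._paths: Dict[str, List[str]] = {}

    @staticmethod
    def _key(task: str) -> str:
        # Assumption: tasks match if they differ only in case or spacing.
        return " ".join(task.lower().split())

    def get(self, task: str) -> Optional[List[str]]:
        return self._paths.get(self._key(task))

    def save(self, task: str, steps: List[str]) -> None:
        self._paths[self._key(task)] = list(steps)


def run_task(
    task: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[List[str]], bool],
    cache: PathCache,
) -> Tuple[bool, bool]:
    """Return (succeeded, reused). Plans only when no saved path exists."""
    steps = cache.get(task)
    reused = steps is not None
    if not reused:
        steps = plan(task)
    ok = execute(steps)
    if ok and not reused:
        # Only successful paths are stored for later reuse.
        cache.save(task, steps)
    return ok, reused


# Stand-ins for a real planner and executor (illustrative only).
plan_calls: List[str] = []


def plan(task: str) -> List[str]:
    plan_calls.append(task)
    return ["open inbox", "search 'invoice'", "download attachment"]


def execute(steps: List[str]) -> bool:
    return len(steps) > 0


cache = PathCache()
first = run_task("Download the latest invoice", plan, execute, cache)
second = run_task("download the  latest invoice", plan, execute, cache)
print(first, second, len(plan_calls))
```

In this sketch the second run skips planning entirely, which is where the consistency and speed gains described above would come from. A real system would also need to decide when a saved path has gone stale.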

ReAct Agents

ReAct (Reasoning and Acting) is a framework for AI agents that enhances a Large Language Model’s (LLM) ability to handle complex, multi-step tasks by integrating logical reasoning (Thought) with external action (Action) in a continuous, interleaved loop. The core idea is to prompt the LLM to alternate between articulating its chain of thought to plan and track its progress, and executing actions such as using a search engine, calling an API, or interacting with a database.

This continuous cycle of Thought → Action → Observation allows the agent to dynamically gather external information, adjust its plan based on the results of its actions, reduce issues such as factual hallucination, and ultimately reach a more accurate and reliable final answer than reasoning-only or action-only approaches.
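
The Thought → Action → Observation cycle can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CUGA's code: scripted_model is a hypothetical stand-in for an LLM call, lookup_population is a made-up tool with made-up data, and the "finish" action name is a convention chosen for this example.

```python
from typing import Callable, Dict, List, Tuple


def lookup_population(city: str) -> str:
    # Hypothetical tool with illustrative data, not real figures.
    data = {"Paris": "2100000", "Rome": "2800000"}
    return data.get(city, "unknown")


TOOLS: Dict[str, Callable[[str], str]] = {"lookup_population": lookup_population}


def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: returns (thought, action, argument)."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return ("I need the population of Paris.", "lookup_population", "Paris")
    latest = observations[-1].split(": ", 1)[1]
    return ("I have the figure, so I can answer.", "finish", latest)


def react_loop(
    question: str,
    model: Callable[[List[str]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        # Thought: the model reasons over everything seen so far.
        thought, action, arg = model(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg, history
        # Action: call the chosen tool with the model's argument.
        history.append(f"Action: {action}[{arg}]")
        tool = tools.get(action)
        observation = tool(arg) if tool else f"unknown tool {action}"
        # Observation: feed the result back into the next thought.
        history.append(f"Observation: {observation}")
    return "no answer within step budget", history


answer, trace = react_loop("What is the population of Paris?", scripted_model, TOOLS)
for line in trace:
    print(line)
print("Answer:", answer)
```

The printed trace is the kind of record an interface layer can surface to make agent behaviour legible: each thought, action, and observation is a separate, inspectable step.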

More here: https://research.ibm.com/blog/cuga-agent-framework

Benjamin Woodmansee is an AI product designer working on frontier AI systems including agent interfaces, computer-using agents, generative AI tools, developer platforms, and human-AI interaction design.