Agent S: an open agentic framework that uses computers like a human
Updated Oct 31, 2025 - Python
GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
ScaleCUA is an open-source computer use agent that can operate in cross-platform environments (Windows, macOS, Ubuntu, Android).
[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents
[AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615
Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
[CVPR 2025] Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
Curated resources about automated GUI computer use via LLMs. Highly opinionated; the focus is on quality over quantity.
[ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
A Gradio-based demonstration for the Microsoft Fara-7B model, designed as a computer use agent. Users upload UI screenshots (e.g., desktop or app interfaces), provide task instructions (e.g., "Click on the search bar"), and receive parsed actions with visualized indicators overlaid on the image.
A GUI agent application based on UI-TARS (a vision-language model) that allows you to control your computer using natural language.
💻 Control AI agents to automate tasks on computers, enabling true autonomy with browser, terminal, and desktop interaction. Perfect for developers.