    Repositories list

    • GUI interaction capture -- production-ready event streams with time-aligned media
      Python
      0201 Updated Feb 6, 2026
    • Evaluation infrastructure for GUI agent benchmarks
      Python
      0030 Updated Feb 6, 2026
    • OpenAdapt’s open-source ML toolkit for training and evaluating general multimodal GUI-action models.
      Python
      0211 Updated Feb 6, 2026
    • Multimodal demo retrieval for GUI automation
      Python
      0000 Updated Jan 29, 2026
    • Temporal smoothing for UI element detection with OmniParser integration
      Python
      0000 Updated Jan 29, 2026
    • PII/PHI detection and redaction for GUI automation data (text, images, dicts)
      Python
      2100 Updated Jan 29, 2026
    • HTML viewer components for ML dashboards and benchmarks
      Python
      0000 Updated Jan 29, 2026
    • System tray application for OpenAdapt
      Python
      0010 Updated Jan 29, 2026
    • JavaScript
      12752 Updated Jan 29, 2026
    • Self-hosting infrastructure for OpenAdapt recursive development
      Python
      0050 Updated Jan 18, 2026
    • .github

      Public
      0000 Updated Jan 18, 2026
    • OpenAdapt

      Public
      Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] /…
      Python
      2171.5k00 Updated Jan 17, 2026
    • Production execution engine for OpenAdapt GUI automation agents. Wraps trained models with safety gates, human-in-the-loop confirmation, session management, and…
      Python
      0000 Updated Jan 17, 2026
    • Unified error tracking and usage analytics for OpenAdapt packages. GlitchTip/Sentry SDK integration with privacy filtering (PII scrubbing, path sanitization), o…
      Python
      0000 Updated Jan 17, 2026
    • OpenCUA

      Public
      OpenCUA: Open Foundations for Computer-Use Agents
      Python
      86100 Updated Aug 18, 2025
    • OmniMCP

      Public
      OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction capabilities.
      Python
      1569124 Updated Apr 8, 2025
    • A simple library to document Pydantic models for structured LLM outputs using standard Python docstrings.
      Python
      0601 Updated Apr 6, 2025
    • OmniMCP.web

      Public archive
      Web interface for OmniMCP.
      JavaScript
      12000 Updated Mar 22, 2025
    • OpenAdapter

      Public archive
      Effortless Deployment and Integration for SOTA Screenshot Parsing and Action Models
      0131 Updated Feb 18, 2025
    • Jupyter Notebook
      2.1k411 Updated Feb 14, 2025
    • R1-V

      Public archive
      Witness the aha moment of VLM with less than $3.
      Python
      287000 Updated Feb 4, 2025
    • Qwen2.5-VL

      Public archive
      Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
      Jupyter Notebook
      1.6k000 Updated Jan 28, 2025
    • open-r1-multimodal

      Public archive
      A fork to add multimodal model training to open-r1
      Python
      70000 Updated Jan 28, 2025
    • Janus

      Public archive
      Janus-Series: Unified Multimodal Understanding and Generation Models
      Python
      2.2k000 Updated Jan 27, 2025
    • A GUI agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
      TypeScript
      2.7k100 Updated Jan 21, 2025
    • app

      Public archive
      A desktop application that enables end-users to automate their workflows with OpenAdapt
      3000 Updated Jan 11, 2025
    • omniparser-api

      Public archive
      Self-hosted version of Microsoft's OmniParser Image-to-text model
      Python
      24000 Updated Nov 26, 2024
    • OpenReflector links the Anthropic Computer Use container to a Windows or Mac desktop, using OpenAdapt and WebSockets for real-time, two-way mirroring of actions…
      0200 Updated Oct 31, 2024
    • A privacy-focused module for detecting and scrubbing PII/PHI from screen data and user actions.
      0500 Updated Oct 31, 2024
    • OpenAdaptVault

      Public archive
      Archival snapshot of OpenAdapt: Open Source Generative Process Automation (Generative RPA) with foundational AI models ([Language (LLMs) / Action (LAMs) / Multi…
      Python
      217200 Updated Oct 30, 2024