
OpenHands Architecture

This directory contains the core components of OpenHands.

This diagram provides an overview of the roles of each component and how they communicate and collaborate.

[Figure: OpenHands System Architecture Diagram (July 4, 2024)]

Classes

The key classes in OpenHands are:

  • LLM: brokers all interactions with large language models. Works with any underlying completion model, thanks to LiteLLM.
  • Agent: responsible for looking at the current State, and producing an Action that moves one step closer toward the end-goal.
  • AgentController: initializes the Agent, manages State, and drives the main loop that pushes the Agent forward, step by step.
  • State: represents the current state of the Agent's task. Includes things like the current step, a history of recent events, the Agent's long-term plan, etc.
  • EventStream: a central hub for Events, where any component can publish Events, or listen for Events published by other components.
    • Event: an Action or Observation.
      • Action: represents a request to e.g. edit a file, run a command, or send a message.
      • Observation: represents information collected from the environment, e.g. file contents or command output.
  • Runtime: responsible for performing Actions, and sending back Observations.
    • Sandbox: the part of the runtime responsible for running commands, e.g. inside Docker.
  • Server: brokers OpenHands sessions over HTTP, e.g. to drive the frontend.
    • Session: holds a single EventStream, a single AgentController, and a single Runtime. Generally represents a single task, which may include several user prompts.
    • SessionManager: keeps a list of active sessions, and ensures requests are routed to the correct Session.
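
To make the relationship between Event, Action, Observation, and EventStream concrete, here is a minimal sketch. It is not the actual OpenHands code: the field names (`source`, `command`, `content`) and the `subscribe`/`publish` method names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    """Base type for everything that flows through the stream."""
    source: str  # hypothetical field: which component published the event


@dataclass
class Action(Event):
    """A request, e.g. to run a command."""
    command: str  # hypothetical field


@dataclass
class Observation(Event):
    """Information collected from the environment."""
    content: str  # hypothetical field


class EventStream:
    """Central hub: any component can publish Events or listen for them."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []
        self.history: List[Event] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: Event) -> None:
        # Record the event, then notify every listener in subscription order.
        self.history.append(event)
        for callback in list(self._subscribers):
            callback(event)


def demo_event_stream() -> List[str]:
    stream = EventStream()
    seen: List[str] = []
    stream.subscribe(lambda event: seen.append(type(event).__name__))
    stream.publish(Action(source="agent", command="ls"))
    stream.publish(Observation(source="runtime", content="README.md"))
    return seen
```

Calling `demo_event_stream()` publishes one Action and one Observation and returns the type names the subscriber saw, in order.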

Control Flow

Here's the basic loop (in pseudocode) that drives agents.

while True:
  prompt = agent.generate_prompt(state)
  response = llm.completion(prompt)
  action = agent.parse_response(response)
  observation = runtime.run(action)
  state = state.update(action, observation)
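
The pseudocode above can be turned into a small runnable sketch by supplying stand-in components. Everything here other than the loop body is an assumption made for illustration: `ScriptedLLM`, `ToyAgent`, `ToyRuntime`, the `"finish"` sentinel, and the `max_steps` cap are hypothetical and do not mirror the real OpenHands classes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class State:
    step: int = 0
    history: List[Tuple[str, str]] = field(default_factory=list)

    def update(self, action: str, observation: str) -> "State":
        # Return a new State rather than mutating the old one.
        return State(self.step + 1, self.history + [(action, observation)])


class ScriptedLLM:
    """Stand-in for a real model: replays canned replies, then says 'finish'."""

    def __init__(self, replies: List[str]) -> None:
        self._replies = list(replies)

    def completion(self, prompt: str) -> str:
        return self._replies.pop(0) if self._replies else "finish"


class ToyAgent:
    def generate_prompt(self, state: State) -> str:
        return f"step={state.step} history={state.history}"

    def parse_response(self, response: str) -> str:
        # Strip the hypothetical "run: " prefix; "finish" passes through as-is.
        return response.split("run: ", 1)[-1]


class ToyRuntime:
    def run(self, action: str) -> str:
        return f"ran: {action}"


def run_loop(agent: ToyAgent, llm: ScriptedLLM, runtime: ToyRuntime,
             max_steps: int = 10) -> State:
    state = State()
    for _ in range(max_steps):  # bounded, unlike the `while True` pseudocode
        prompt = agent.generate_prompt(state)
        response = llm.completion(prompt)
        action = agent.parse_response(response)
        if action == "finish":
            break
        observation = runtime.run(action)
        state = state.update(action, observation)
    return state
```

For example, `run_loop(ToyAgent(), ScriptedLLM(["run: echo hello", "run: ls"]), ToyRuntime())` takes two steps and then stops when the scripted model runs out of replies.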

In reality, most of this is achieved through message passing, via the EventStream. The EventStream serves as the backbone for all communication in OpenHands.

flowchart LR
  Agent--Actions-->AgentController
  AgentController--State-->Agent
  AgentController--Actions-->EventStream
  EventStream--Observations-->AgentController
  Runtime--Observations-->EventStream
  EventStream--Actions-->Runtime
  Frontend--Actions-->EventStream
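
The flowchart can be approximated in code as components that only talk to each other through the stream. The sketch below is a simplified, synchronous model under stated assumptions: the queue-based `drain` method, the `kind`/`payload` fields, the fixed `plan`, and the `"finish"` sentinel are all hypothetical, the Agent is folded into the controller, and the Frontend is omitted for brevity.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Event:
    kind: str     # "action" or "observation"
    payload: str
    source: str


class EventStream:
    def __init__(self) -> None:
        self._queue: Deque[Event] = deque()
        self._subscribers: List[Callable[[Event], None]] = []
        self.log: List[Event] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: Event) -> None:
        self._queue.append(event)

    def drain(self) -> None:
        # Deliver queued events one at a time; handlers may publish more.
        while self._queue:
            event = self._queue.popleft()
            self.log.append(event)
            for callback in list(self._subscribers):
                callback(event)


class Runtime:
    """Listens for Actions and publishes Observations."""

    def __init__(self, stream: EventStream) -> None:
        self.stream = stream
        stream.subscribe(self.on_event)

    def on_event(self, event: Event) -> None:
        if event.kind == "action" and event.payload != "finish":
            result = f"ran: {event.payload}"
            self.stream.publish(Event("observation", result, "runtime"))


class AgentController:
    """Publishes the next Action each time an Observation arrives."""

    def __init__(self, stream: EventStream, plan: List[str]) -> None:
        self.stream = stream
        self._plan = deque(plan)  # stands in for the Agent's decisions
        stream.subscribe(self.on_event)

    def start(self) -> None:
        self._step()

    def on_event(self, event: Event) -> None:
        if event.kind == "observation":
            self._step()

    def _step(self) -> None:
        next_action = self._plan.popleft() if self._plan else "finish"
        self.stream.publish(Event("action", next_action, "agent"))


def demo_message_passing() -> List[str]:
    stream = EventStream()
    Runtime(stream)
    controller = AgentController(stream, ["ls", "cat README.md"])
    controller.start()
    stream.drain()
    return [f"{e.source}:{e.kind}:{e.payload}" for e in stream.log]
```

Neither the controller nor the runtime holds a reference to the other; each reacts only to events it receives from the stream, which is the property the flowchart is meant to convey.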

Runtime

Please refer to the documentation to learn more about Runtime.