|
@@ -4,6 +4,64 @@ sidebar_position: 3
|
|
|
|
|
|
|
|
# 🧠 Agents and Capabilities
|
|
# 🧠 Agents and Capabilities
|
|
|
|
|
|
|
|
|
|
+## CodeAct Agent
|
|
|
|
|
+
|
|
|
|
|
+### Description
|
|
|
|
|
+
|
|
|
|
|
+This agent implements the CodeAct idea ([paper](https://arxiv.org/abs/2402.13463), [tweet](https://twitter.com/xingyaow_/status/1754556835703751087)) that consolidates LLM agents’ **act**ions into a unified **code** action space for both *simplicity* and *performance* (see paper for more details).
|
|
|
|
|
+
|
|
|
|
|
+The conceptual idea is illustrated below. At each turn, the agent can:
|
|
|
|
|
+
|
|
|
|
|
+1. **Converse**: Communicate with humans in natural language to ask for clarification, confirmation, etc.
|
|
|
|
|
+2. **CodeAct**: Choose to perform the task by executing code
|
|
|
|
|
+- Execute any valid Linux `bash` command
|
|
|
|
|
+- Execute any valid `Python` code with [an interactive Python interpreter](https://ipython.org/). This is simulated through `bash` command, see plugin system below for more details.
|
|
|
|
|
+
|
|
|
|
|
+
|
|
|
|
|
+
|
|
|
|
|
+### Plugin System
|
|
|
|
|
+
|
|
|
|
|
+To make the CodeAct agent more powerful with only access to `bash` action space, CodeAct agent leverages OpenDevin's plugin system:
|
|
|
|
|
+- [Jupyter plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/jupyter): for IPython execution via bash command
|
|
|
|
|
+- [SWE-agent tool plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/swe_agent_commands): Powerful bash command line tools for software development tasks introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent).
|
|
|
|
|
+
|
|
|
|
|
+### Demo
|
|
|
|
|
+
|
|
|
|
|
+https://github.com/OpenDevin/OpenDevin/assets/38853559/f592a192-e86c-4f48-ad31-d69282d5f6ac
|
|
|
|
|
+
|
|
|
|
|
+*Example of CodeActAgent with `gpt-4-turbo-2024-04-09` performing a data science task (linear regression)*
|
|
|
|
|
+
|
|
|
|
|
+
|
|
|
|
|
+### Actions
|
|
|
|
|
+
|
|
|
|
|
+`Action`,
|
|
|
|
|
+`CmdRunAction`,
|
|
|
|
|
+`IPythonRunCellAction`,
|
|
|
|
|
+`AgentEchoAction`,
|
|
|
|
|
+`AgentFinishAction`,
|
|
|
|
|
+`AgentTalkAction`
|
|
|
|
|
+
|
|
|
|
|
+### Observations
|
|
|
|
|
+
|
|
|
|
|
+`CmdOutputObservation`,
|
|
|
|
|
+`IPythonRunCellObservation`,
|
|
|
|
|
+`AgentMessageObservation`,
|
|
|
|
|
+`UserMessageObservation`
|
|
|
|
|
+
|
|
|
|
|
+### Methods
|
|
|
|
|
+
|
|
|
|
|
+| Method | Description |
|
|
|
|
|
+| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
|
|
|
+| `__init__` | Initializes an agent with `llm` and a list of messages `List[Mapping[str, str]]` |
|
|
|
|
|
+| `step` | Performs one step using the CodeAct Agent. This includes gathering info on previous steps and prompting the model to make a command to execute. |
|
|
|
|
|
+| `search_memory` | Not yet implemented |
|
|
|
|
|
+
|
|
|
|
|
+### Work-in-progress & Next step
|
|
|
|
|
+
|
|
|
|
|
+[] Support web-browsing
|
|
|
|
|
+[] Complete the workflow for CodeAct agent to submit Github PRs
|
|
|
|
|
+
|
|
|
|
|
+
|
|
|
## Monologue Agent
|
|
## Monologue Agent
|
|
|
|
|
|
|
|
### Description
|
|
### Description
|
|
@@ -82,29 +140,3 @@ The agent is given its previous action-observation pairs, current task, and hint
|
|
|
| `__init__` | Initializes an agent with `llm` |
|
|
| `__init__` | Initializes an agent with `llm` |
|
|
|
| `step` | Checks to see if current step is completed, returns `AgentFinishAction` if True. Otherwise, creates a plan prompt and sends to model for inference, adding the result as the next action. |
|
|
| `step` | Checks to see if current step is completed, returns `AgentFinishAction` if True. Otherwise, creates a plan prompt and sends to model for inference, adding the result as the next action. |
|
|
|
| `search_memory` | Not yet implemented |
|
|
| `search_memory` | Not yet implemented |
|
|
|
-
|
|
|
|
|
-## CodeAct Agent
|
|
|
|
|
-
|
|
|
|
|
-### Description
|
|
|
|
|
-
|
|
|
|
|
-The Code Act Agent is a minimalist agent. The agent works by passing the model a list of action-observation pairs and prompting the model to take the next step.
|
|
|
|
|
-
|
|
|
|
|
-### Actions
|
|
|
|
|
-
|
|
|
|
|
-`Action`,
|
|
|
|
|
-`CmdRunAction`,
|
|
|
|
|
-`AgentEchoAction`,
|
|
|
|
|
-`AgentFinishAction`,
|
|
|
|
|
-
|
|
|
|
|
-### Observations
|
|
|
|
|
-
|
|
|
|
|
-`CmdOutputObservation`,
|
|
|
|
|
-`AgentMessageObservation`,
|
|
|
|
|
-
|
|
|
|
|
-### Methods
|
|
|
|
|
-
|
|
|
|
|
-| Method | Description |
|
|
|
|
|
-| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
|
|
|
-| `__init__` | Initializes an agent with `llm` and a list of messages `List[Mapping[str, str]]` |
|
|
|
|
|
-| `step` | First, gets messages from state and then compiles them into a list for context. Next, pass the context list with the prompt to get the next command to execute. Finally, Execute command if valid, else return `AgentEchoAction(INVALID_INPUT_MESSAGE)` |
|
|
|
|
|
-| `search_memory` | Not yet implemented |
|
|
|