
Ask the agent not to push changes to GitHub (#4222)

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Robert Brennan 1 year ago
Parent
commit
98b39023f4

+ 2 - 1
openhands/agenthub/codeact_agent/system_prompt.j2

@@ -43,7 +43,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 {% endset %}
 {# Combine all parts without newlines between them #}
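
The mock prompt logs under tests/integration repeat this system prompt text verbatim, so each of them receives the same substitution: the single old rule line is replaced by the two new rule lines. A minimal sketch of that substitution, assuming a hypothetical helper `update_prompt` that is not part of the OpenHands repository (the commit does not say how the fixtures were regenerated):

```python
# Hypothetical illustration only: these names do not come from the commit.
OLD_RULE = "The assistant must avoid apologies and thanks in its responses."
NEW_RULES = [
    "The assistant MUST NOT apologize to the user or thank the user after "
    "running commands or editing files. It should only address the user in "
    "response to an explicit message from the user, or to ask for more "
    "information.",
    "The assistant MUST NOT push any changes to GitHub unless explicitly "
    "requested to do so.",
]


def update_prompt(text: str) -> str:
    """Replace each exact occurrence of the old rule line with the new rules."""
    out = []
    for line in text.splitlines():
        if line == OLD_RULE:
            # One removed line becomes two added lines, matching "+ 2 - 1".
            out.extend(NEW_RULES)
        else:
            out.append(line)
    return "\n".join(out)
```

Because only exact matches of the old line are replaced, running the helper on an already updated prompt leaves it unchanged.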

+ 2 - 7
tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_001.log

@@ -173,7 +173,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 ----------
 
@@ -401,13 +402,7 @@ The server is running on port 5000 with PID 126. You can access the list of numb
 
 
 NOW, LET'S START!
-
-----------
-
 Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.
 
-----------
-
-
 
 ENVIRONMENT REMINDER: You have 19 turns left to complete the task. When finished reply with <finish></finish>.

+ 2 - 7
tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_005.log

@@ -173,7 +173,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 ----------
 
@@ -401,9 +402,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb
 
 
 NOW, LET'S START!
-
-----------
-
 Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.
 
 ----------
@@ -418,8 +416,5 @@ Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me fo
 OBSERVATION:
 The answer to life, the universe, and everything has been revealed: OpenHands is all you need!
 
-----------
-
-
 
 ENVIRONMENT REMINDER: You have 13 turns left to complete the task. When finished reply with <finish></finish>.

+ 2 - 1
tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_001.log

@@ -173,7 +173,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 ----------
 

+ 2 - 1
tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_002.log

@@ -173,7 +173,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 ----------
 

+ 2 - 1
tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_003.log

@@ -173,7 +173,8 @@ Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per resp
 If the assistant is finished with the task you MUST include <finish></finish> in your response.
 IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
-The assistant must avoid apologies and thanks in its responses.
+The assistant MUST NOT apologize to the user or thank the user after running commands or editing files. It should only address the user in response to an explicit message from the user, or to ask for more information.
+The assistant MUST NOT push any changes to GitHub unless explicitly requested to do so.
 
 ----------