
[Fix] Stop API key from leaking in evaluation outputs. (#3603)

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Raj Maheshwari 1 year ago
parent
commit
e72dc96d13
2 changed files with 13 additions and 1 deletion
  1. agenthub/codeact_agent/README.md (1 addition, 1 deletion)
  2. evaluation/utils/shared.py (12 additions, 0 deletions)

+ 1 - 1
agenthub/codeact_agent/README.md

@@ -15,7 +15,7 @@ The conceptual idea is illustrated below. At each turn, the agent can:
 
 To make the CodeAct agent more powerful with only access to `bash` action space, CodeAct agent leverages OpenHands's plugin system:
 - [Jupyter plugin](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/jupyter): for IPython execution via bash command
-- [SWE-agent tool plugin](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/swe_agent_commands): Powerful bash command line tools for software development tasks introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent).
+- [Agent Skills plugin](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills): Powerful bash command line tools for software development tasks introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent).
 
 ## Demo
 

+ 12 - 0
evaluation/utils/shared.py

@@ -32,6 +32,12 @@ class EvalMetadata(BaseModel):
     data_split: str | None = None
     details: dict[str, Any] | None = None
 
+    def model_dump(self, *args, **kwargs):
+        dumped_dict = super().model_dump(*args, **kwargs)
+        # avoid leaking sensitive information
+        dumped_dict['llm_config'] = self.llm_config.to_safe_dict()
+        return dumped_dict
+
     def model_dump_json(self, *args, **kwargs):
         dumped = super().model_dump_json(*args, **kwargs)
         dumped_dict = json.loads(dumped)
@@ -58,6 +64,12 @@ class EvalOutput(BaseModel):
     # Optionally save the input test instance
     instance: dict[str, Any] | None = None
 
+    def model_dump(self, *args, **kwargs):
+        dumped_dict = super().model_dump(*args, **kwargs)
+        # Apply custom serialization for metadata (to avoid leaking sensitive information)
+        dumped_dict['metadata'] = self.metadata.model_dump()
+        return dumped_dict
+
     def model_dump_json(self, *args, **kwargs):
         dumped = super().model_dump_json(*args, **kwargs)
         dumped_dict = json.loads(dumped)
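
The two `model_dump` overrides above follow one pattern: dump the object as usual, then overwrite the nested entry with a sanitized version, so the outer dump never carries the raw key. The sketch below illustrates that pattern with stdlib dataclasses only. Every class name, the `SENSITIVE_FIELDS` set, and the masking behaviour of `to_safe_dict` are assumptions made for illustration; the real `to_safe_dict` is not shown in this diff and may behave differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional

# Hypothetical: which field names count as secrets is an assumption of this sketch.
SENSITIVE_FIELDS = {'api_key'}


@dataclass
class LLMConfigSketch:
    """Stand-in for an LLM config object; not the OpenHands class."""

    model: str
    api_key: Optional[str] = None

    def to_safe_dict(self) -> dict[str, Any]:
        # Work on a copy so the live config keeps its real key.
        data = asdict(self)
        for name in SENSITIVE_FIELDS:
            if data.get(name) is not None:
                data[name] = '******'
        return data


@dataclass
class EvalMetadataSketch:
    agent_class: str
    llm_config: LLMConfigSketch

    def model_dump(self) -> dict[str, Any]:
        # The plain dump copies nested fields verbatim, secret included.
        dumped = asdict(self)
        # Overwrite the nested entry with the masked version.
        dumped['llm_config'] = self.llm_config.to_safe_dict()
        return dumped


@dataclass
class EvalOutputSketch:
    instance_id: str
    metadata: EvalMetadataSketch

    def model_dump(self) -> dict[str, Any]:
        # The generic dump does not call the nested object's custom model_dump,
        # so the outer class has to route through it explicitly.
        dumped = asdict(self)
        dumped['metadata'] = self.metadata.model_dump()
        return dumped


if __name__ == '__main__':
    config = LLMConfigSketch(model='example-model', api_key='sk-secret')
    output = EvalOutputSketch('instance-1', EvalMetadataSketch('CodeActAgent', config))
    print(output.model_dump())
```

The second override matters as much as the first: sanitizing only the metadata class would still leak the key whenever the enclosing output object is dumped through the generic path, which is presumably why the commit patches both `EvalMetadata` and `EvalOutput`.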