
Fix issue #4629: [Bug]: Replace claude-3-5-sonnet-20240620 with claude-3-5-sonnet-20241022 (#4631)

Co-authored-by: Graham Neubig <neubig@gmail.com>
OpenHands 1 year ago
parent
commit
866ba6e3b2

+ 1 - 1
docs/i18n/fr/docusaurus-plugin-content-docs/current/usage/how-to/cli-mode.md

@@ -37,7 +37,7 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. Définissez `LLM_MODEL` sur le modèle que vous souhaitez utiliser :
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
 ```
 
 3. Définissez `LLM_API_KEY` sur votre clé API :

+ 3 - 1
docs/i18n/fr/docusaurus-plugin-content-docs/current/usage/how-to/evaluation-harness.md

@@ -14,7 +14,8 @@ Voici un exemple de fichier de configuration que vous pouvez utiliser pour défi
 ```toml
 [llm]
 # IMPORTANT : ajoutez votre clé API ici et définissez le modèle que vous souhaitez évaluer
-model = "claude-3-5-sonnet-20240620"
+model = "claude-3-5-sonnet-20241022"
+
 api_key = "sk-XXX"
 
 [llm.eval_gpt4_1106_preview_llm]
@@ -278,3 +279,4 @@ Cette fonction fait ce qui suit :
 3. Si l'agent a fait plusieurs tentatives, il lui donne la possibilité d'abandonner
 
 En utilisant cette fonction, vous pouvez garantir un comportement cohérent sur plusieurs exécutions d'évaluation et empêcher l'agent de rester bloqué en attendant une entrée humaine.
+

+ 1 - 1
docs/i18n/fr/docusaurus-plugin-content-docs/current/usage/how-to/headless-mode.md

@@ -31,7 +31,7 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. Définissez `LLM_MODEL` sur le modèle que vous voulez utiliser :
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
 ```
 
 3. Définissez `LLM_API_KEY` sur votre clé API :

+ 1 - 1
docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/how-to/cli-mode.md

@@ -36,7 +36,7 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. 将 `LLM_MODEL` 设置为你要使用的模型:
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
 ```
 
 3. 将 `LLM_API_KEY` 设置为你的 API 密钥:

+ 1 - 1
docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/how-to/evaluation-harness.md

@@ -12,7 +12,7 @@
 ```toml
 [llm]
 # 重要:在此处添加您的 API 密钥,并将模型设置为您要评估的模型
-model = "claude-3-5-sonnet-20240620"
+model = "claude-3-5-sonnet-20241022"
 api_key = "sk-XXX"
 
 [llm.eval_gpt4_1106_preview_llm]

+ 3 - 1
docs/i18n/zh-Hans/docusaurus-plugin-content-docs/current/usage/how-to/headless-mode.md

@@ -32,7 +32,8 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. 将 `LLM_MODEL` 设置为你要使用的模型:
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
+
 ```
 
 3. 将 `LLM_API_KEY` 设置为你的 API 密钥:
@@ -57,3 +58,4 @@ docker run -it \
     ghcr.io/all-hands-ai/openhands:0.11 \
     python -m openhands.core.main -t "write a bash script that prints hi"
 ```
+

+ 3 - 1
docs/modules/usage/how-to/cli-mode.md

@@ -35,7 +35,8 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. Set `LLM_MODEL` to the model you want to use:
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
+
 ```
 
 3. Set `LLM_API_KEY` to your API key:
@@ -106,3 +107,4 @@ Expected Output:
 ```bash
 🤖 An error occurred. Please try again.
 ```
+

+ 1 - 1
docs/modules/usage/how-to/evaluation-harness.md

@@ -12,7 +12,7 @@ Here's an example configuration file you can use to define and use multiple LLMs
 ```toml
 [llm]
 # IMPORTANT: add your API key here, and set the model to the one you want to evaluate
-model = "claude-3-5-sonnet-20240620"
+model = "claude-3-5-sonnet-20241022"
 api_key = "sk-XXX"
 
 [llm.eval_gpt4_1106_preview_llm]

+ 3 - 1
docs/modules/usage/how-to/headless-mode.md

@@ -29,7 +29,8 @@ WORKSPACE_BASE=$(pwd)/workspace
 2. Set `LLM_MODEL` to the model you want to use:
 
 ```bash
-LLM_MODEL="anthropic/claude-3-5-sonnet-20240620"
+LLM_MODEL="anthropic/claude-3-5-sonnet-20241022"
+
 ```
 
 3. Set `LLM_API_KEY` to your API key:
@@ -54,3 +55,4 @@ docker run -it \
     docker.all-hands.dev/all-hands-ai/openhands:0.11 \
     python -m openhands.core.main -t "write a bash script that prints hi"
 ```
+

+ 3 - 2
frontend/__tests__/utils/extractModelAndProvider.test.ts

@@ -59,9 +59,9 @@ describe("extractModelAndProvider", () => {
       separator: "/",
     });
 
-    expect(extractModelAndProvider("claude-3-5-sonnet-20240620")).toEqual({
+    expect(extractModelAndProvider("claude-3-5-sonnet-20241022")).toEqual({
       provider: "anthropic",
-      model: "claude-3-5-sonnet-20240620",
+      model: "claude-3-5-sonnet-20241022",
       separator: "/",
     });
 
@@ -78,3 +78,4 @@ describe("extractModelAndProvider", () => {
     });
   });
 });
+
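The frontend tests above exercise `extractModelAndProvider`, which maps a bare model id such as `claude-3-5-sonnet-20241022` to its provider. As a rough illustration of the behavior those assertions expect, here is a minimal Python sketch — it is not the project's actual TypeScript implementation, and the prefix table here is an assumption for the example:

```python
# Illustrative prefix-to-provider table (assumption; the real mapping lives
# in the frontend's TypeScript utilities, not in this sketch).
KNOWN_PROVIDER_PREFIXES = {
    "claude": "anthropic",
    "gpt": "openai",
}


def extract_model_and_provider(model_id: str) -> dict:
    """Infer the provider for a model id, mirroring the shape the tests check."""
    # Ids already qualified as "provider/model" split directly.
    if "/" in model_id:
        provider, model = model_id.split("/", 1)
        return {"provider": provider, "model": model, "separator": "/"}
    # Otherwise fall back to a known prefix, since some providers return
    # bare ids like "claude-3-5-sonnet-20241022" without the provider part.
    for prefix, provider in KNOWN_PROVIDER_PREFIXES.items():
        if model_id.startswith(prefix):
            return {"provider": provider, "model": model_id, "separator": "/"}
    return {"provider": "", "model": model_id, "separator": ""}


result = extract_model_and_provider("claude-3-5-sonnet-20241022")
```

With this sketch, the updated model id resolves to provider `anthropic` with separator `/`, which is exactly the tuple the updated test asserts.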

+ 3 - 2
frontend/__tests__/utils/organizeModelsAndProviders.test.ts

@@ -15,7 +15,7 @@ test("organizeModelsAndProviders", () => {
     "gpt-4o",
     "together-ai-21.1b-41b",
     "gpt-4o-mini",
-    "claude-3-5-sonnet-20240620",
+    "claude-3-5-sonnet-20241022",
     "claude-3-haiku-20240307",
     "claude-2",
     "claude-2.1",
@@ -51,7 +51,7 @@ test("organizeModelsAndProviders", () => {
     anthropic: {
       separator: "/",
       models: [
-        "claude-3-5-sonnet-20240620",
+        "claude-3-5-sonnet-20241022",
         "claude-3-haiku-20240307",
         "claude-2",
         "claude-2.1",
@@ -63,3 +63,4 @@ test("organizeModelsAndProviders", () => {
     },
   });
 });
+

+ 6 - 2
frontend/src/utils/verified-models.ts

@@ -1,6 +1,10 @@
 // Here are the list of verified models and providers that we know work well with OpenHands.
 export const VERIFIED_PROVIDERS = ["openai", "azure", "anthropic"];
-export const VERIFIED_MODELS = ["gpt-4o", "claude-3-5-sonnet-20240620"];
+export const VERIFIED_MODELS = [
+  "gpt-4o",
+  "claude-3-5-sonnet-20240620",
+  "claude-3-5-sonnet-20241022",
+];
 
 // LiteLLM does not return OpenAI models with the provider, so we list them here to set them ourselves for consistency
 // (e.g., they return `gpt-4o` instead of `openai/gpt-4o`)
@@ -15,7 +19,7 @@ export const VERIFIED_OPENAI_MODELS = [
 ];
 
 // LiteLLM does not return the compatible Anthropic models with the provider, so we list them here to set them ourselves
-// (e.g., they return `claude-3-5-sonnet-20240620` instead of `anthropic/claude-3-5-sonnet-20240620`)
+// (e.g., they return `claude-3-5-sonnet-20241022` instead of `anthropic/claude-3-5-sonnet-20241022`)
 export const VERIFIED_ANTHROPIC_MODELS = [
   "claude-2",
   "claude-2.1",

+ 1 - 1
openhands/llm/llm.py

@@ -47,8 +47,8 @@ LLM_RETRY_EXCEPTIONS: tuple[type[Exception], ...] = (
 # cache prompt supporting models
 # remove this when we gemini and deepseek are supported
 CACHE_PROMPT_SUPPORTED_MODELS = [
-    'claude-3-5-sonnet-20240620',
     'claude-3-5-sonnet-20241022',
+    'claude-3-5-sonnet-20240620',
     'claude-3-haiku-20240307',
     'claude-3-opus-20240229',
 ]
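The hunk above reorders `CACHE_PROMPT_SUPPORTED_MODELS` so the newer snapshot comes first. To show how such a list typically gates prompt caching, here is a minimal Python sketch; the helper name echoes `is_caching_prompt_active` from the test fixture below, but this body is an assumption for illustration, not the project's actual logic:

```python
# Mirror of the list from the diff (content as committed).
CACHE_PROMPT_SUPPORTED_MODELS = [
    'claude-3-5-sonnet-20241022',
    'claude-3-5-sonnet-20240620',
    'claude-3-haiku-20240307',
    'claude-3-opus-20240229',
]


def is_caching_prompt_active(model: str, caching_prompt: bool) -> bool:
    # Sketch: prompt caching applies only when it is both enabled in the
    # config AND the configured model appears in the supported list.
    return caching_prompt and model in CACHE_PROMPT_SUPPORTED_MODELS


active = is_caching_prompt_active('claude-3-5-sonnet-20241022', True)  # True
```

Because both dated snapshots remain in the list, configs still pinned to `claude-3-5-sonnet-20240620` keep prompt caching after this change.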

+ 1 - 1
tests/unit/test_prompt_caching.py

@@ -14,7 +14,7 @@ from openhands.storage import get_file_store
 @pytest.fixture
 def mock_llm():
     llm = Mock(spec=LLM)
-    llm.config = LLMConfig(model='claude-3-5-sonnet-20240620', caching_prompt=True)
+    llm.config = LLMConfig(model='claude-3-5-sonnet-20241022', caching_prompt=True)
     llm.is_caching_prompt_active.return_value = True
     return llm