Boxuan Li 5ed80b5c32 [doc] Fix link in TheAgentCompany benchmark's README.md (#5848) 1 year ago
..
EDA 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
agent_bench 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
aider_bench 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
biocoder 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
bird 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
browsing_delegation 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
commit0_bench bfb191b5c7 Fix issue #5739: [Bug]: Move ./evaluation/swe_bench/scripts/cleanup_remote_runtime.sh to general eval utils (#5740) 1 year ago
discoverybench 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
gaia 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
gorilla 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
gpqa 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
humanevalfix 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
logic_reasoning 21948fa81b Fix issue #5735: [Bug]: Inconsistent command line arguments in evaluation directory (#5736) 1 year ago
miniwob 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
mint 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
ml_bench 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago
scienceagentbench 21948fa81b Fix issue #5735: [Bug]: Inconsistent command line arguments in evaluation directory (#5736) 1 year ago
swe_bench 8975fcd714 Fix issue #5748: Rename "Ran a Jupyter Command" to "Ran a Python Command" in UI (#5749) 1 year ago
the_agent_company 5ed80b5c32 [doc] Fix link in TheAgentCompany benchmark's README.md (#5848) 1 year ago
toolqa 21948fa81b Fix issue #5735: [Bug]: Inconsistent command line arguments in evaluation directory (#5736) 1 year ago
webarena 3297e4d5a8 Use litellm's modify params (#5636) 1 year ago