Ketan Ramaneti
|
852c90f64a
[fix eval] Fix issues with miniwob remote runtime evaluation (#5001)
|
1 year ago |
Ketan Ramaneti
|
42b49e6c43
[fix eval] Fix issues with aider_bench remote runtime evaluation (#5000)
|
1 year ago |
Robert Brennan
|
17f4c6e1a9
Refactor sessions a bit, and fix issue where runtimes get killed (#4900)
|
1 year ago |
Engel Nyst
|
eeb2342509
Refactor history/event stream (#3808)
|
1 year ago |
Xingyao Wang
|
966da7b7c8
feat(agent, CodeAct 2.2): native CodeAct support for Browsing (#4667)
|
1 year ago |
Xingyao Wang
|
1f23dc89b6
fix(eval): add runtime.connect to all eval harness (#4565)
|
1 year ago |
Xingyao Wang
|
2d5b360505
refactor: re-organize different runtime implementations into an impl folder (#4346)
|
1 year ago |
Xingyao Wang
|
b23c7aab5a
[eval] stop set sid in eval (#4311)
|
1 year ago |
Aditya Bharat Soni
|
0809d26f4d
fix: Allow evaluation benchmarks to pass image urls in run_controller() instead of simply passing strings (#4100)
|
1 year ago |
Xingyao Wang
|
090c911a50
(refactor) Make `Runtime` class synchronous (#3661)
|
1 year ago |
Graham Neubig
|
f9088766e8
Allow setting of runtime container image (#3573)
|
1 year ago |
Robert Brennan
|
01ae22ef57
Rename OpenDevin to OpenHands (#3472)
|
1 year ago |
Xingyao Wang
|
b30a2dd87a
completely remove update_source_code (#3280)
|
1 year ago |
Xingyao Wang
|
31b244f95e
[Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230)
|
1 year ago |
Xingyao Wang
|
001195a3ea
reduce the duplication in run_controller (#3217)
|
1 year ago |
Xingyao Wang
|
4f0a454ed6
[Arch] Support integration tests using EventStream Runtime (#3184)
|
1 year ago |
tobitege
|
70dd705418
Fix: apply config arguments for miniwob get_sandbox() from loaded config (#3198)
|
1 year ago |
Graham Neubig
|
275ea706cf
Remove remaining global config (#3099)
|
1 year ago |
Xingyao Wang
|
da17665cab
fix: make max_budget_per_task optional in `run_agent_controller` (#3071)
|
1 year ago |
Xingyao Wang
|
cf910dfa9d
fix eval api_key leak in metadata; fix llm config in run infer (#2998)
|
1 year ago |
Engel Nyst
|
d37b2973b2
Refactoring: event stream based agent history (#2709)
|
1 year ago |
Graham Neubig
|
d0384cafdd
Two fixes to swe bench eval (#2831)
|
1 year ago |
Xingyao Wang
|
f6dc89b41a
[Evaluation] Simplify eval & and multi-processing related fixes (#2810)
|
1 year ago |
Graham Neubig
|
a081935fd8
Simplify eval code (#2775)
|
1 year ago |
Graham Neubig
|
ffd3c7144c
Remove global args (#2760)
|
1 year ago |
Engel Nyst
|
2d9bb56763
Add ability to restore the cli session (optional) (#2699)
|
1 year ago |
Engel Nyst
|
874b4c9075
CLI concurrency (#2695)
|
1 year ago |
RainRat
|
745ae42a72
fix typos (#2352)
|
1 year ago |
Frank Xu
|
48151bdbb0
[feat] WebArena benchmark, MiniWoB++ benchmark and related arch changes (#2170)
|
1 year ago |