Robert Brennan
|
01ae22ef57
Rename OpenDevin to OpenHands (#3472)
|
1 year ago |
Xingyao Wang
|
31b244f95e
[Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime (#3230)
|
1 year ago |
Xingyao Wang
|
6b16a5da0b
[Eval,Arch] Update GPTQ eval and add `headless_mode` for Controller (#2994)
|
1 year ago |
Xingyao Wang
|
ff6ddc831f
fix: runtime test for mac (#3005)
|
1 year ago |
Boxuan Li
|
c68478f470
Customize LLM config per agent (#2756)
|
1 year ago |
Jiayi Pan
|
917d96e06f
Fix doc error in evals (#2654)
|
1 year ago |
Boxuan Li
|
6f235937cf
Evaluation time travel: allow evaluation on a specific version (#2356)
|
1 year ago |
Jaskirat Singh
|
e8307608c2
Support gpqa benchmark evaluation (#2080)
|
1 year ago |