| File | Commit | Message | Updated |
| --- | --- | --- | --- |
| compare_outputs.py | da548d308c | [agent] LLM-based editing (#3985) | 1 year ago |
| convert_oh_folder_to_swebench_submission.sh | ae13171194 | feat(agent): CodeAct with function calling (#4537) | 1 year ago |
| convert_oh_output_to_md.py | 4dfc7a7ef0 | [Eval] Add a more lightweight / easier-to-use SWE-Bench output visualizer (#4360) | 1 year ago |
| convert_oh_output_to_swe_json.py | b13ed017d8 | [eval] add git patch post-processing for SWE-Bench eval_infer (#3980) | 1 year ago |
| download_gold_patch.py | 5d7f2fd4ae | [eval] Allow evaluation of SWE-Bench patches on `RemoteRuntime` (#3927) | 1 year ago |
| summarize_outputs.py | 6d19c93d19 | [eval] add evaluation workflow (#4489) | 1 year ago |
| update_output_with_eval.py | 245334e89d | [eval] improve update output script for swe-bench (#4180) | 1 year ago |