Xingyao Wang 6d19c93d19 [eval] add evaluation workflow (#4489) 1 year ago
compare_outputs.py                           da548d308c  [agent] LLM-based editing (#3985)  1 year ago
convert_oh_folder_to_swebench_submission.sh  ae13171194  feat(agent): CodeAct with function calling (#4537)  1 year ago
convert_oh_output_to_md.py                   4dfc7a7ef0  [Eval] Add a more lightweight / easier-to-use SWE-Bench output visualizer (#4360)  1 year ago
convert_oh_output_to_swe_json.py             b13ed017d8  [eval] add git patch post-processing for SWE-Bench eval_infer (#3980)  1 year ago
download_gold_patch.py                       5d7f2fd4ae  [eval] Allow evaluation of SWE-Bench patches on `RemoteRuntime` (#3927)  1 year ago
summarize_outputs.py                         6d19c93d19  [eval] add evaluation workflow (#4489)  1 year ago
update_output_with_eval.py                   245334e89d  [eval] improve update output script for swe-bench (#4180)  1 year ago