Commit History

Author SHA1 Message Date
  Boxuan Li 6f235937cf Evaluation time travel: allow evaluation on a specific version (#2356) 1 year ago
  Xingyao Wang 11a2d1682d Minor SWE-Bench inference config tweak (#2381) 1 year ago
  Xingyao Wang a6ba6c5277 Add SWEBench-docker eval (#2085) 1 year ago
  tobitege 5776474dcf Fix SWE-Bench README typos (#2250) 1 year ago
  Boxuan Li 4d14b44a9a SWE-bench: Add summarise utility script to view passed/failed task IDs (#2137) 1 year ago
  Xingyao Wang 2c0a2dbc61 fix yet another swe_bench issue (#2069) 1 year ago
  Xingyao Wang 5114230e53 Some SWE-Bench infer fixes and improvements (#2065) 1 year ago
  Xingyao Wang 6ff50ed369 Fix SWE-Bench evaluation due to setuptools version (#1995) 1 year ago
  Boxuan Li 4add8a5595 SWE-bench: Allow selection of tasks (#1935) 1 year ago
  Boxuan Li b845a38169 Small improvements & fixes to SWE-Bench (#1874) 1 year ago
  Xingyao Wang b2fdb963b6 Add detailed tutorial for adding new evaluation benchmarks (#1827) 1 year ago
  Boxuan Li a57a213c7c Turn off auto linting by default, and on for swe_bench (#1861) 1 year ago
  Xingyao Wang 0fdbe1ee93 Update README.md (#1825) 1 year ago
  Xingyao Wang 2406b901df feat(SWE-Bench environment) integrate SWE-Bench sandbox (#1468) 1 year ago