This folder contains code and resources to run experiments and evaluations.
To better organize the evaluation folder, we should follow the rules below:
- `evaluation/swe_bench` should contain all the preprocessing/evaluation/analysis scripts.

Check this huggingface space for visualization of existing experimental results.
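As a rough illustration of the folder convention above, the sketch below lists the subfolders of an evaluation directory and flags any that hold no script files. It is not part of this repo: the function names, the set of script suffixes, and the assumption that every non-hidden subfolder is a benchmark are all hypothetical.

```python
from pathlib import Path

# Hypothetical choice of what counts as a "script"; adjust to taste.
SCRIPT_SUFFIXES = {".py", ".sh", ".ipynb"}


def benchmark_folders(evaluation_root):
    """Return the names of subfolders, skipping hidden and dunder folders."""
    root = Path(evaluation_root)
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir() and not p.name.startswith((".", "_"))
    )


def folders_without_scripts(evaluation_root):
    """Return subfolders that contain no file with a script suffix."""
    root = Path(evaluation_root)
    missing = []
    for name in benchmark_folders(root):
        has_script = any(
            f.is_file() and f.suffix in SCRIPT_SUFFIXES
            for f in (root / name).rglob("*")
        )
        if not has_script:
            missing.append(name)
    return missing
```

Calling `folders_without_scripts("evaluation")` from the repository root would return the subfolders that do not yet follow the convention, under the assumptions stated above.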
You can fork our huggingface evaluation outputs and submit your evaluation results as a PR to our hosted huggingface repo, following the guide here.
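For readers who prefer scripting the upload, here is a minimal sketch that opens a PR with the `huggingface_hub` Python client. This is not the workflow from the linked guide, and it opens a PR directly instead of going through a fork. The helper names, the default `repo_type`, and the choice to upload under a folder named after the results directory are assumptions; the repository id is deliberately left as a parameter because it is not given here.

```python
from pathlib import Path


def pr_upload_kwargs(results_dir, repo_id, repo_type="space"):
    """Build keyword arguments for HfApi.upload_folder (hypothetical helper)."""
    folder = Path(results_dir)
    return {
        "folder_path": str(folder),
        # Assumption: place the results under a folder named after the run.
        "path_in_repo": folder.name,
        "repo_id": repo_id,
        "repo_type": repo_type,
        # Ask the Hub to open a pull request instead of committing directly.
        "create_pr": True,
        "commit_message": f"Add evaluation results: {folder.name}",
    }


def upload_results_as_pr(results_dir, repo_id, repo_type="space"):
    """Upload a results folder as a PR; needs huggingface_hub and a login token."""
    from huggingface_hub import HfApi  # third-party, imported lazily

    return HfApi().upload_folder(**pr_upload_kwargs(results_dir, repo_id, repo_type))
```

Before calling `upload_results_as_pr`, authenticate with `huggingface-cli login` and pass the id of the hosted repo as `repo_id`.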