First, install Ollama by running the following command in a conda env with CUDA etc. set up.
Linux:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
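To confirm the install worked, you can print the version (guarded here in case `ollama` is not on your PATH yet):

```bash
# print the installed Ollama version to confirm the install worked
if command -v ollama >/dev/null 2>&1; then
  ollama --version
else
  echo "ollama not found on PATH"
fi
```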
Windows or macOS: download the installer from https://ollama.com.
Ollama model names can be found in the model library at https://ollama.com/library (see example below).
Once you have found the model you want to use, copy the command and run it in your conda env.
Example of llama2 q4 quantized:

```bash
conda activate <env_name>
ollama run llama2:13b-chat-q4_K_M
```
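If you just want to download the weights without opening an interactive chat, `ollama pull` takes the same model name (guarded here so it is a no-op when ollama is not installed):

```bash
# fetch the model weights without starting a chat session
if command -v ollama >/dev/null 2>&1; then
  ollama pull llama2:13b-chat-q4_K_M
else
  echo "ollama not installed"
fi
```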
You can check which models you have downloaded like this:

```bash
~$ ollama list
NAME                              ID              SIZE    MODIFIED
llama2:latest                     78e26419b446    3.8 GB  6 weeks ago
mistral:7b-instruct-v0.2-q4_K_M   eb14864c7427    4.4 GB  2 weeks ago
starcoder2:latest                 f67ae0f64584    1.7 GB  19 hours ago
```
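These models take up a lot of disk space, so if you pulled one you no longer need, `ollama rm` deletes it (the model name below is just the one from the listing above; guarded in case ollama is absent):

```bash
# delete a downloaded model to free disk space
if command -v ollama >/dev/null 2>&1; then
  ollama rm llama2:latest 2>&1 || true
else
  echo "ollama not installed"
fi
```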
This command starts up the Ollama server, which listens on port 11434, and shows incoming requests in the CLI:

```bash
conda activate <env_name>
ollama serve
```
or, to run it in the background with no output:

```bash
sudo systemctl start ollama
```
If you see something like this:

```
Error: listen tcp 127.0.0.1:11434: bind: address already in use
```

it is not actually an error; it just means the server is already running.
To stop the server use:

```bash
sudo systemctl stop ollama
```
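Whichever way you started it, you can check that the server is up by querying the Ollama HTTP API; `/api/tags` returns the installed models as JSON:

```bash
# prints the installed models as JSON if the server is running,
# or a fallback message if nothing is listening on 11434
curl -s --max-time 3 http://localhost:11434/api/tags || echo "Ollama server not reachable"
```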
For more info, see the Ollama documentation.
Next, clone the OpenDevin repository:

```bash
git clone git@github.com:OpenDevin/OpenDevin.git
```

or, if you have forked it:

```bash
git clone git@github.com:<YOUR-USERNAME>/OpenDevin.git
```

then

```bash
cd OpenDevin
make build
make setup-config
```
After running `make setup-config` you will see a generated file `OpenDevin/config.toml`. Open this file and modify it to your needs based on this template:
```toml
LLM_API_KEY="ollama"
LLM_MODEL="ollama/<model_name>"
LLM_EMBEDDING_MODEL="local"
LLM_BASE_URL="http://localhost:<port_number>"
WORKSPACE_DIR="./workspace"
```
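For example, with the llama2 model from earlier and Ollama on its default port, the file might look like this (the model name here is just the one from the example above; use whatever `ollama list` shows for you):

```toml
LLM_API_KEY="ollama"
LLM_MODEL="ollama/llama2:13b-chat-q4_K_M"
LLM_EMBEDDING_MODEL="local"
LLM_BASE_URL="http://localhost:11434"
WORKSPACE_DIR="./workspace"
```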
Notes:
- `LLM_API_KEY` can be a placeholder value such as "ollama", since the local server does not need a real key.
- `LLM_BASE_URL` uses port 11434 unless you set Ollama to listen elsewhere.
- `<model_name>` needs to be the entire model name, e.g. `LLM_MODEL="ollama/llama2:13b-chat-q4_K_M"`.

At this point everything should be set up and working properly.
With `make build` done, run the following in your terminal from `~/OpenDevin/`:

```bash
make run
```

or start the backend and frontend separately:

```bash
make start-backend
make start-frontend
```

Then open http://localhost:3001/ with your local model running!
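If OpenDevin cannot reach the model, a quick way to sanity-check the connection is to send a one-off generation request straight to the Ollama API (swap in whatever model name `ollama list` shows for you):

```bash
# ask the local model for a single completion; prints an error message
# if the server is down or the model name is wrong
curl -s --max-time 60 http://localhost:11434/api/generate \
  -d '{"model": "llama2:13b-chat-q4_K_M", "prompt": "Say hello", "stream": false}' \
  || echo "request failed: is 'ollama serve' running?"
```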