I was excited to see that LLaVA support is being merged into llama.cpp, thanks to the excellent work by monatis.
I wanted to experiment with this myself, so I used the following process on my Apple M1 (32 GB).
First, build llama.cpp with LLaVA support:
git clone https://github.com/ggerganov/llama.cpp.git && cd llama.cpp
git checkout llava
mkdir build && cd build
cmake .. && cmake --build . --config Release
mkdir -p ~/.ai/bin/llava
cp bin/llava bin/ggml-metal.metal ~/.ai/bin/llava
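The ggml-metal.metal file is copied next to the binary because the Metal backend loads it at runtime. As a quick sanity check (a minimal sketch, nothing llama.cpp-specific), you can confirm both files made it into place:

# both files should sit side by side so the Metal backend can find the shader
ls -lh ~/.ai/bin/llava/llava ~/.ai/bin/llava/ggml-metal.metal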
Then download the LLaVA models from Hugging Face. If the f16 model is too big, download a quant that suits your system (see the example after these download commands).
mkdir -p ~/.ai/models/llava && cd ~/.ai/models/llava
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/ggml-model-f16.gguf
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/mmproj-model-f16.gguf
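If you want a smaller model instead of the f16 weights, the same repo also hosts quantized variants. A sketch of what that download would look like (the exact filename is an assumption on my part; check the repo listing linked below for what is actually available, and note the mmproj file above is still needed either way):

# hypothetical quant filename - verify against the repo's file listing first
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/ggml-model-q4_k.gguf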
Finally, I run it to describe an image called input-picture.jpg:
cd ~/.ai/models/llava
~/.ai/bin/llava/llava ggml-model-f16.gguf mmproj-model-f16.gguf ~/Desktop/input-picture.jpg
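To avoid retyping the paths, a small wrapper helps. This is just a convenience sketch around the exact command above (the describe_image name is mine, not part of llama.cpp), using absolute paths so no cd is needed:

# hypothetical helper: wraps the command above so only the image path changes
describe_image() {
  ~/.ai/bin/llava/llava \
    ~/.ai/models/llava/ggml-model-f16.gguf \
    ~/.ai/models/llava/mmproj-model-f16.gguf \
    "$1"
}
describe_image ~/Desktop/input-picture.jpg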
The model repo on Hugging Face is here: https://huggingface.co/mys/ggml_llava-v1.5-7b/tree/main
The llava branch on llama.cpp is here: https://github.com/ggerganov/llama.cpp/tree/llava
This reddit thread got me started down this rabbit hole.
This worked on my system. If you normally build llama.cpp with a different process, just compile it as you usually would. For example, a CUDA system won't need the Metal code, so adjust the build and copy steps accordingly (see the sketch below).
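As a rough sketch of the CUDA case (the LLAMA_CUBLAS flag matches llama.cpp's cmake option at the time, but check the README for your checkout), the build and copy steps would look like this, with no ggml-metal.metal to copy:

# assumes an NVIDIA GPU with the CUDA toolkit installed
cmake .. -DLLAMA_CUBLAS=ON && cmake --build . --config Release
mkdir -p ~/.ai/bin/llava
cp bin/llava ~/.ai/bin/llava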