I was excited to see that LLaVA support is being merged into llama.cpp, thanks to the excellent work by monatis.
I wanted to experiment with this myself, so I used the following process on my Apple M1 (32 GB).
First, build llama.cpp with LLaVA support:
git clone https://github.com/ggerganov/llama.cpp.git && cd llama.cpp
git checkout llava
mkdir build && cd build
cmake .. && cmake --build . --config Release
mkdir -p ~/.ai/bin/llava
cp bin/llava bin/ggml-metal.metal ~/.ai/bin/llava
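The ggml-metal.metal file is copied next to the binary because the Metal backend loads it at runtime. As a quick sanity check (a minimal sketch, nothing llama.cpp-specific), you can confirm both files made it into place:

# both files should sit side by side so the Metal backend can find the shader
ls -lh ~/.ai/bin/llava/llava ~/.ai/bin/llava/ggml-metal.metal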
Then download the LLaVA models from Hugging Face. If the f16 model is too big, download a quant that suits your system (see the example after these download commands).
mkdir -p ~/.ai/models/llava && cd ~/.ai/models/llava
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/ggml-model-f16.gguf
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/mmproj-model-f16.gguf
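If you want a smaller model instead of the f16 weights, the same repo also hosts quantized variants. A sketch of what that download would look like (the exact filename is an assumption on my part; check the repo listing linked below for what is actually available, and note the mmproj file above is still needed either way):

# hypothetical quant filename - verify against the repo's file listing first
wget https://huggingface.co/mys/ggml_llava-v1.5-7b/resolve/main/ggml-model-q4_k.gguf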
Finally, I run it to describe an image called input-picture.jpg:
cd ~/.ai/models/llava
~/.ai/bin/llava/llava ggml-model-f16.gguf mmproj-model-f16.gguf ~/Desktop/input-picture.jpg
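To avoid retyping the paths, a small wrapper helps. This is just a convenience sketch around the exact command above (the describe_image name is mine, not part of llama.cpp), using absolute paths so no cd is needed:

# hypothetical helper: wraps the command above so only the image path changes
describe_image() {
  ~/.ai/bin/llava/llava \
    ~/.ai/models/llava/ggml-model-f16.gguf \
    ~/.ai/models/llava/mmproj-model-f16.gguf \
    "$1"
}
describe_image ~/Desktop/input-picture.jpg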
The model repo on Hugging Face is here: https://huggingface.co/mys/ggml_llava-v1.5-7b/tree/main
The llava branch on llama.cpp is here: https://github.com/ggerganov/llama.cpp/tree/llava
This reddit thread got me started down this rabbit hole.
This worked on my system. If you normally build llama.cpp with a different process, just compile it as you usually would. For example, a CUDA system won't need the Metal code, so adjust the build and copy steps accordingly (see the sketch below).
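As a rough sketch of the CUDA case (the LLAMA_CUBLAS flag matches llama.cpp's cmake option at the time, but check the README for your checkout), the build and copy steps would look like this, with no ggml-metal.metal to copy:

# assumes an NVIDIA GPU with the CUDA toolkit installed
cmake .. -DLLAMA_CUBLAS=ON && cmake --build . --config Release
mkdir -p ~/.ai/bin/llava
cp bin/llava ~/.ai/bin/llava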