KoboldCpp 1.67 released - Integrated whisper.cpp and quantized KV cache

Please watch with sound

KoboldCpp 1.67 now integrates whisper.cpp, providing two new Speech-To-Text endpoints: `/api/extra/transcribe`, used by KoboldCpp, and `/v1/audio/transcriptions`, an OpenAI-compatible drop-in. Both endpoints accept payloads as .wav file uploads (max 32MB) or as base64-encoded wave data.
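For anyone who wants to try the base64 route from a script, here is a rough sketch using only the Python standard library. Treat the details as assumptions rather than the documented API: the JSON field name `audio_data`, the `http://localhost:5001` base URL, and the helper names `make_silent_wav`, `build_transcribe_payload` and `transcribe` are all illustrative, so check the KoboldCpp API docs for the exact request and response shape. The 16 kHz mono format is chosen because that is what whisper models operate on.

```python
import base64
import io
import json
import urllib.request
import wave


def make_silent_wav(seconds: float = 1.0, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit PCM WAV of silence in memory (stand-in for real audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def build_transcribe_payload(wav_bytes: bytes) -> dict:
    """Wrap WAV bytes as base64 text inside a JSON-serializable dict."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    # ASSUMPTION: "audio_data" is an illustrative field name, not confirmed
    # by the post; verify it against the KoboldCpp API documentation.
    return {"audio_data": encoded}


def transcribe(wav_bytes: bytes, base_url: str = "http://localhost:5001") -> dict:
    """POST the payload to the transcription endpoint and return the parsed JSON.

    ASSUMPTION: base_url is a placeholder for wherever KoboldCpp is listening.
    """
    body = json.dumps(build_transcribe_payload(wav_bytes)).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/api/extra/transcribe",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a KoboldCpp instance running and a whisper model loaded, something like `print(transcribe(open("clip.wav", "rb").read()))` should show whatever JSON the server returns. `make_silent_wav` is only there so the payload builder can be tried without a recording.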

Kobold Lite can now also use the microphone when it is enabled in the settings panel. You can use Push-To-Talk (PTT) or automatic Voice Activity Detection (VAD), aka Hands Free Mode. Everything runs locally within your browser, including resampling and wav format conversion, and interfaces directly with the KoboldCpp transcription endpoint.
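To illustrate what "resampling and wav format conversion" involves, here is a minimal Python sketch of the idea. It is not Kobold Lite's actual code (that runs as JavaScript in the browser), and the function names `resample_linear` and `float_to_pcm16` are made up for this example. It uses plain linear interpolation with no low-pass filter, so it is only a rough approximation of what a proper resampler does.

```python
import struct


def resample_linear(samples, src_rate, dst_rate):
    """Resample a list of float samples by linear interpolation.

    Illustrative only: a real resampler would low-pass filter before
    downsampling to avoid aliasing.
    """
    if not samples:
        return []
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        lo = min(int(pos), last)           # sample at or before pos
        hi = min(lo + 1, last)             # sample after pos (clamped)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))         # clamp out-of-range values
        ints.append(int(round(s * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)
```

For example, `float_to_pcm16(resample_linear(mic_samples, 48000, 16000))` would turn 48 kHz float samples (a hypothetical `mic_samples` list) into 16 kHz PCM bytes that can be written into a WAV container.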

Special thanks to ggerganov and all the developers of whisper.cpp, without whom none of this would have been possible.

Additionally, the Quantized KV Cache enhancements from llama.cpp have been merged and can now be used in KoboldCpp. Note that the quantized KV option requires flash attention to be enabled and context shift to be disabled.
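To give a sense of why the quantized KV option can be worth those restrictions, here is a back-of-the-envelope estimate of KV cache size. The numbers are assumptions for illustration: a Mistral-7B-style shape (32 layers, 8 KV heads, head dimension 128), an 8192-token context, and the ggml `q8_0` and `q4_0` block formats as the quantized cache types. Which types KoboldCpp actually exposes is covered in the changelog, not here.

```python
# Bytes per stored element for three ggml tensor types:
# f16 is 2 bytes per value; q8_0 packs 32 values into 34 bytes;
# q4_0 packs 32 values into 18 bytes.
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, cache_type):
    """Estimate KV cache size: one K and one V vector per layer per token."""
    elements = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return int(elements * BYTES_PER_ELEMENT[cache_type])


if __name__ == "__main__":
    # ASSUMPTION: Mistral-7B-style dimensions and an 8192-token context.
    for cache_type in BYTES_PER_ELEMENT:
        size = kv_cache_bytes(32, 8, 128, 8192, cache_type)
        print(f"{cache_type}: {size / 2**30:.2f} GiB")
```

Under these assumptions the f16 cache comes to exactly 1 GiB, and the quantized formats take roughly half and roughly a quarter of that.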

The setup shown in the video can be run fully offline on a single device.

Text Generation = MistRP 7B (KoboldCpp)
Image Generation = SD 1.5 PicX Real (KoboldCpp)
Speech To Text = whisper-base.en-q5_1 (KoboldCpp)
Image Recognition = mistral-7b-mmproj-v1.5-Q4_1 (KoboldCpp)
Text To Speech = XTTSv2 with custom sample (XTTS API Server)

See full changelog here: https://github.com/LostRuins/koboldcpp/releases/latest

Author

HadesThrowaway

Subreddit

r/LocalLLaMA

Posted: 5 months ago