Embeddings models? #43
Comments
I'm interested in this too. Meanwhile I'm using Facebook StarSpace for sentence embeddings. It's very fast and easy to use. The official version, however, has some issues which are fixed here: https://github.com/rekola/StarSpace
See the work here: ggml-org/llama.cpp#282
That's cool! But does LLaMA have a tiny version similar to OpenAI Ada to avoid wasting resources? I don't think most use cases need anything more than BERT, and BERT inference would be quite cool to see in GGML.
Sounds interesting. Does StarSpace have a pretrained general model? One use case I had was converting Whisper VTT files into paragraph-split transcripts. It worked by comparing the similarity of each sentence to the previous one and inserting two line breaks whenever a threshold was met. Maybe this could even become an official Whisper demo if an embedding model were available in GGML.
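For anyone following along, the splitting idea described above could be sketched roughly like this. This is only an illustration, not the script mentioned in the comment: the function names, the use of cosine similarity, and the default threshold of 0.5 are all assumptions, and the embeddings are expected to come from whatever model you use (StarSpace, BERT, etc.).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_paragraphs(sentences, embeddings, threshold=0.5):
    """Join sentences into paragraphs separated by two line breaks.

    A new paragraph starts whenever a sentence's similarity to the
    previous sentence falls below `threshold` (hypothetical default).
    """
    if len(sentences) != len(embeddings):
        raise ValueError("need exactly one embedding per sentence")
    if not sentences:
        return ""
    paragraphs = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) < threshold:
            paragraphs.append([sentences[i]])  # topic shift: start a new paragraph
        else:
            paragraphs[-1].append(sentences[i])  # same topic: keep appending
    return "\n\n".join(" ".join(p) for p in paragraphs)
```

The threshold would need tuning per model, since different embedding models produce differently distributed similarity scores.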
It doesn't. So far, I've trained a Finnish model using social media data, and I will be testing a multi-lingual model next.
That's interesting. I'll have to try that when I need paragraph vectors.
I've implemented BERT in ggml here: https://github.com/skeskinen/bert.cpp
It works amazingly! Since it's such a tiny model, I've always wondered whether it was so sluggish on my laptop just because of a ton of Python bloat. Turns out that guess was indeed correct! bert.cpp could go through every sentence in a full 15-minute video transcript in less than 3 seconds, while the Python version spent that amount of time on a single sentence. Thanks a lot! Perhaps it could be integrated into whisper.cpp? I was planning on using it in my tiny Python script that transcribes a video through Whisper, generates vector embeddings, compares each pair of adjacent sentences, and, if they're different enough, splits them into paragraphs. Perhaps something like this could become an official example? Either using this naive method or this more accurate one that I've found in a Medium article. I think going from a video to a formatted blog post without sending a byte of data into the cloud could help a ton of people and would make for a cool demo.
Is it possible to use GGML to compute sentence embeddings faster and more portably? That might make for a useful offline text search tool.
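To make the search idea concrete, here is a minimal sketch of how ranking might work once embeddings are available. It assumes the query and documents have already been embedded by some model; `rank_documents` and `top_k` are hypothetical names for illustration and are not part of GGML or any library mentioned in this thread.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, doc_embeddings, top_k=3):
    """Return up to `top_k` (document_index, score) pairs, best match first."""
    scored = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(doc_embeddings)
    ]
    # Sort by similarity score, highest first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A brute-force scan like this is probably fine for a few thousand documents; larger collections would likely want an approximate nearest-neighbour index instead.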