
ttok adds a 20-second delay when using llm-gpt4all offline #10

Open

learning4life opened this issue Nov 20, 2023 · 1 comment

ttok adds a 20-second delay when used with llm-gpt4all offline compared to online, tested with mistral-7b-instruct-v0.

This is easily seen by looking at the CPU usage after asking the model a question, as shown in this video:

llm-gpt4all.offline.and.no.CPU.usage.before.20.seconds.after.question.asked.to.mistral-7b-instruct-v0.webm

In the video, CPU usage spikes when the model is asked a question through a custom Datasette plugin that also uses ttok to log the tokens used; after that, there is no CPU activity until the 20-second mark.

This delay is not visible when online.
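For context on the ttok call mentioned above: ttok (which wraps OpenAI's tiktoken tokenizer) is invoked to count tokens. A minimal sketch of that kind of call; the exact invocation in my plugin differs:

# Print the number of tokens in the text piped on stdin
echo 'Fun fact about AI?' | ttok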

Code used:
Dockerfile:

FROM python:3.11
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
# Download mistral-7b-instruct-v0 
RUN llm -m mistral-7b-instruct-v0 "Fun fact about AI?" --no-log --no-stream  
# Set default model
RUN llm models default mistral-7b-instruct-v0
# Fix no internet bug using https://github.com/simonw/llm-gpt4all/pull/18
COPY llm_gpt4all.py /usr/local/lib/python3.11/site-packages/

requirements.txt:

datasette
llm 
llm-gpt4all
ttok

ttok and llm-gpt4all are both at version 0.2.
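One way to reproduce the timing difference is to time the same prompt with and without network access. A sketch, assuming the image built from the Dockerfile above is tagged llm-gpt4all-test (a hypothetical tag):

# Build the image from the Dockerfile above
docker build -t llm-gpt4all-test .
# Online run
time docker run --rm llm-gpt4all-test llm "Fun fact about AI?" --no-stream
# Offline run: the response arrives roughly 20 seconds later
time docker run --rm --network none llm-gpt4all-test llm "Fun fact about AI?" --no-stream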

@learning4life (Author)

I am unable to determine the cause of this bug and am not sure whether it relates to ttok or llm-gpt4all.
