In the video, the CPU usage spikes when the model is asked a question through a custom Datasette plugin that also uses ttok to log the tokens used; after that spike there is no CPU activity until after the 20-second mark.
This delay does not occur when online.
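To put a number on the delay rather than eyeballing CPU graphs, a minimal sketch of a timing wrapper can be used. The `timed_run` helper and the commented-out `llm` invocation below are illustrative, not part of the actual plugin:

```python
import subprocess
import time

def timed_run(cmd, stdin_text=""):
    """Run a command and return (stdout, elapsed_seconds).

    Illustrative helper for measuring the gap between asking the
    model a question and the response arriving, online vs offline.
    """
    start = time.perf_counter()
    result = subprocess.run(
        cmd, input=stdin_text, capture_output=True, text=True
    )
    elapsed = time.perf_counter() - start
    return result.stdout.strip(), elapsed

# Hypothetical usage: run the same prompt with and without network
# access and compare the elapsed times.
# output, seconds = timed_run(["llm", "Fun fact about AI?", "--no-stream"])
```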
Code used:
Dockerfile:
FROM python:3.11
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
# Download mistral-7b-instruct-v0
RUN llm -m mistral-7b-instruct-v0 "Fun fact about AI?" --no-log --no-stream
# Set default model
RUN llm models default mistral-7b-instruct-v0
# Fix no internet bug using https://github.com/simonw/llm-gpt4all/pull/18
COPY llm_gpt4all.py /usr/local/lib/python3.11/site-packages/
requirements.txt:
datasette
llm
llm-gpt4all
ttok
ttok and llm-gpt4all are both at version 0.2.
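For context on the token logging mentioned above: a plugin can count tokens by shelling out to the ttok CLI, which reads text on stdin and prints a token count. The `count_tokens` helper below is a hypothetical sketch, not the actual plugin code; the `cmd` parameter is introduced here only so the counter can be swapped out:

```python
import subprocess

def count_tokens(text, cmd=("ttok",)):
    """Count tokens in `text` by piping it to a counting command.

    Defaults to the ttok CLI, which prints the token count for
    whatever it receives on stdin. `cmd` is parameterised so a
    different counter can be substituted for testing.
    """
    result = subprocess.run(
        list(cmd), input=text, capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())
```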
In summary: ttok adds a 20-second delay when using llm-gpt4all offline compared to online, tested on mistral-7b-instruct-v0.
This is easily seen by watching the CPU usage after the model is asked a question, as shown in this video:
llm-gpt4all.offline.and.no.CPU.usage.before.20.seconds.after.question.asked.to.mistral-7b-instruct-v0.webm