Improve the Chat Mode with some tricks and considerations #353
Comments
I'm not sure it's a good idea to remove this. When we use llama.cpp on a front end, we could use this `\end{code}` as a flag to close a panel, a bit like ChatGPT actually.
@BadisG Yeah, I agree that's an issue. However, LLaMA misuses that flag in conversations that have nothing to do with code, and after printing the flag it starts outputting nonsense. It happens very often (roughly every 8 or 10 dialog lines).
I have made the same observation since the update from PR #252.
I think this might be related to the recent discussions we've been having about the EOS token being generated and messing up the output (#333)? In any case, I don't think it's a good idea to detect specific strings in a hardcoded way (if that's what's being suggested). But it would be nice to be able to configure the sampler to detect certain words in order to trigger arbitrary actions, for example via a flag. And more in line with what @Belluxx suggested, perhaps it could be useful to have a flag that lets the sampler throw away certain sequences we don't want generated, as in the sketch below.
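A minimal sketch of both ideas, assuming the running output and candidate tokens are available as plain `std::string` / `std::vector` data. The names `ends_with_stop_word`, `filter_banned_tokens`, and `token_text` are hypothetical and not part of the llama.cpp API:

```cpp
#include <functional>
#include <string>
#include <vector>

// Stop generation when the running output ends with any configured stop word.
bool ends_with_stop_word(const std::string &output,
                         const std::vector<std::string> &stop_words) {
    for (const auto &w : stop_words) {
        if (output.size() >= w.size() &&
            output.compare(output.size() - w.size(), w.size(), w) == 0) {
            return true;
        }
    }
    return false;
}

// Drop candidate tokens whose decoded text would extend the output into a
// banned sequence such as "\end{code}". `token_text` is a placeholder for
// whatever maps a token id to its decoded string.
std::vector<int> filter_banned_tokens(
        const std::vector<int> &candidates,
        const std::string &output_tail,
        const std::vector<std::string> &banned,
        const std::function<std::string(int)> &token_text) {
    std::vector<int> kept;
    for (int id : candidates) {
        const std::string next = output_tail + token_text(id);
        bool bad = false;
        for (const auto &b : banned) {
            if (next.find(b) != std::string::npos) { bad = true; break; }
        }
        if (!bad) {
            kept.push_back(id);
        }
    }
    return kept;
}
```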
Would it be possible to make the message appear only once it's ready, rather than gradually as the AI generates tokens? Kind of like how you get the full message when someone messages you on Discord, rather than seeing it as they write it.
The problem is that sometimes the sampler gets stuck in a loop and you would never see any text until it ran out of memory. If that weren't a problem, this could be done by printing the newly generated text whenever we are about to ask for user input. Instead of printing the newly generated words immediately, they would be saved in a string. The condition for this behavior would be that we have both interactive mode enabled and a reverse prompt, roughly as in the sketch below.
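A minimal, self-contained sketch of that buffering idea. The canned `pieces` list stands in for decoded token text from the model, so none of this is llama.cpp's actual main loop:

```cpp
#include <iostream>
#include <string>
#include <vector>

int main() {
    const std::string reverse_prompt = "User:";

    // Canned pieces standing in for decoded token text from the model.
    const std::vector<std::string> pieces = {
        "Bob:", " Sure,", " the", " capital", " is", " Paris.", "\n", "User:"
    };

    std::string pending; // text generated since the last flush

    for (const auto &piece : pieces) {
        pending += piece; // buffer instead of printing token by token

        // Flush only when the reverse prompt shows up, i.e. right before
        // control would return to the user in interactive mode.
        if (pending.size() >= reverse_prompt.size() &&
            pending.compare(pending.size() - reverse_prompt.size(),
                            reverse_prompt.size(), reverse_prompt) == 0) {
            std::cout << pending << std::flush;
            pending.clear();
            // ...the real loop would read the user's input here...
        }
    }
    return 0;
}
```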
I think I'm going to throw together a personal hack that throws out certain characters.
Haha. This is a bit overkill, but it works.
To be fair, this isn't foolproof, because there are token ids for when characters are placed next to each other and tokenized together.
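To illustrate that caveat: banning a single token id (say, the id for `\`) misses merged tokens whose decoded text still contains that character. A more robust check (hypothetical helper, not the llama.cpp API) inspects the decoded text of each candidate token instead:

```cpp
#include <string>

// Returns true if the decoded text of a candidate token contains the banned
// character, regardless of how the tokenizer merged it with its neighbours.
bool piece_contains_banned_char(const std::string &decoded_piece,
                                char banned = '\\') {
    return decoded_piece.find(banned) != std::string::npos;
}
```

For example, `piece_contains_banned_char("\\end")` is true even though that piece is not the single-character `\` token.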
That's amazing, thanks @rabidcopy, I will test it ASAP. Another issue that I noticed is that sometimes LLaMA stops mid-generation and you need to press Enter to make it continue, however it adds a
This issue was closed because it has been inactive for 14 days since being marked as stale.
I noticed that the interactive mode (used as a chat with, for example, the `chat-with-bob.txt` initial prompt) often fails due to LLaMA trying to escape the chat, mainly with the expression `\end{code}`.

To avoid that, it is possible to pass the argument `-r "\end{code}"`, but since the expression doesn't get removed from the chat, LLaMA interprets it as the end of the conversation: all the previous dialog context (including what's inside `chat-with-bob.txt`) gets lost and LLaMA starts to behave weirdly.

So it would be cool to have a `--chat` option that detects expressions like `\end{code}`, removes them from the context, and forcefully appends `User:` at the end of the chat so that it can continue without losing context, roughly as in the sketch below.
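A rough sketch of that `--chat` idea, assuming the conversation is held as a plain `std::string`. The real context is a token buffer, so this only illustrates the intended behavior and is not an implementation against the llama.cpp API:

```cpp
#include <string>

// True if `s` ends with `suffix`.
static bool ends_with(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Scrub escape expressions like "\end{code}" from the chat context and make
// sure the dialog continues with a "User:" turn, so earlier context is kept.
void scrub_and_continue(std::string &context,
                        const std::string &escape = "\\end{code}",
                        const std::string &user_tag = "User:") {
    // Remove every occurrence of the escape expression instead of treating
    // it as the end of the conversation.
    for (auto pos = context.find(escape); pos != std::string::npos;
         pos = context.find(escape, pos)) {
        context.erase(pos, escape.size());
    }
    // Force the dialog format to continue.
    if (!ends_with(context, user_tag)) {
        context += "\n" + user_tag + " ";
    }
}
```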