Improve the Chat Mode with some tricks and considerations #353

Closed
Belluxx opened this issue Mar 21, 2023 · 13 comments
Labels: enhancement (New feature or request), stale

Comments

Belluxx commented Mar 21, 2023

I noticed that the interactive mode (used as a chat with, for example, the chat-with-bob.txt initial prompt) often fails because LLaMA tries to escape the chat (mainly with the expression \end{code}).

To avoid that it is possible to pass the argument -r "\end{code}", but since the expression doesn't get removed from the chat, LLaMA interprets it as the end of the conversation; all the previous dialog context (including what's inside chat-with-bob.txt) gets lost and LLaMA starts to behave weirdly.

So it would be cool to have a --chat option that detects expressions like \end{code}, removes them from the context, and forcefully appends User: at the end of the chat so that the conversation can continue without losing context.
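A minimal sketch of the stripping logic such a --chat option could apply (the function name and the hard-coded strings below are illustrative only, not existing llama.cpp code):

```cpp
#include <iostream>
#include <string>

// Sketch: drop a trailing escape sequence from the accumulated chat text and
// re-inject the user prefix so the dialog can continue without losing context.
static void strip_and_reinject(std::string & chat,
                               const std::string & stop_str,
                               const std::string & user_prefix) {
    const size_t pos = chat.rfind(stop_str);
    if (pos != std::string::npos) {
        chat.erase(pos);                 // remove the escape sequence and anything after it
    }
    if (chat.empty() || chat.back() != '\n') {
        chat += '\n';
    }
    chat += user_prefix;                 // hand the turn back to the user
}

int main() {
    std::string chat = "Bob: Sure, here you go.\n\\end{code}";
    strip_and_reinject(chat, "\\end{code}", "User:");
    std::cout << chat << std::endl;      // prints "Bob: Sure, here you go." then "User:"
}
```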

BadisG commented Mar 21, 2023

I'm not sure it's a good idea to remove this. When we use llama.cpp in a front end, we could use this \end{code} as a flag to close a panel.

For example:

llama writes the code
\end{code} -> would serve as a flag to close the panel

A bit like ChatGPT, actually.

Belluxx commented Mar 21, 2023

@BadisG Yeah, I agree that's an issue; however, LLaMA misuses that flag in conversations that have nothing to do with code, and after printing the flag it starts outputting nonsense.

It happens very often (roughly every 8 or 10 dialog lines).

PriNova commented Mar 21, 2023

> I noticed that the interactive mode (used as a chat with, for example, the chat-with-bob.txt initial prompt) often fails because LLaMA tries to escape the chat (mainly with the expression \end{code}).
>
> To avoid that it is possible to pass the argument -r "\end{code}", but since the expression doesn't get removed from the chat, LLaMA interprets it as the end of the conversation; all the previous dialog context (including what's inside chat-with-bob.txt) gets lost and LLaMA starts to behave weirdly.
>
> So it would be cool to have a --chat option that detects expressions like \end{code}, removes them from the context, and forcefully appends User: at the end of the chat so that the conversation can continue without losing context.

This is the same observation I have had since PR #252 was merged.
I don't know why this happens, but I suspect there is an issue with that PR's implementation.

gjmulder added the enhancement (New feature or request) label Mar 21, 2023
Belluxx commented Mar 21, 2023

@PriNova So you didn't have the issue before #252?
Have you tried an older version of the repo to see if the issue disappears?

tjohnman (Contributor) commented:

I think this might be related to the recent discussions we've been having about the EOS token being generated and messing up the output (#333).

In any case, I don't think it's a good idea to detect specific strings in a hardcoded way (if that's what's being suggested). But it would be nice to be able to configure the sampler to detect certain words in order to do arbitrary actions.

For example, using a flag, like: --extract "\end{code}" "<CLI command that signals some other piece of software to do something>". This, in combination with clever prompting, could be used to "teach" a chatbot to do things with real side effects. The output of that command could even be piped back into the model so that it can access real world data.

And more in line with what @Belluxx suggested, perhaps it could be useful to have a flag that allows the sampler to throw away certain sequences we don't want generated, like: --ignore "\end{code}".
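A rough sketch of what the hypothetical --extract "<trigger>" "<command>" behavior could look like, assuming it simply scans the accumulated output and shells out when the trigger appears (the struct, function, and use of std::system are illustrative, not an existing llama.cpp feature):

```cpp
#include <cstdlib>
#include <string>

// Sketch of a hypothetical --extract "<trigger>" "<command>" option: watch the
// growing output for the trigger string and run an external command once per match.
struct trigger_action {
    std::string trigger;     // e.g. "\\end{code}"
    std::string command;     // e.g. "./notify-frontend.sh close-panel"
    size_t      scanned = 0; // how far into the output we have already searched
};

// Call this after appending newly generated text to `output`.
static void check_trigger(trigger_action & ta, const std::string & output) {
    const size_t pos = output.find(ta.trigger, ta.scanned);
    if (pos == std::string::npos) {
        // keep a small overlap so a trigger split across two token batches is still found
        ta.scanned = output.size() > ta.trigger.size() ? output.size() - ta.trigger.size() : 0;
        return;
    }
    std::system(ta.command.c_str());          // the requested side effect
    ta.scanned = pos + ta.trigger.size();     // don't fire twice on the same match
}

int main() {
    trigger_action ta{"\\end{code}", "echo panel-closed"};
    std::string output;
    output += "Here is the function you asked for:\n";
    check_trigger(ta, output);                // no trigger yet, nothing happens
    output += "int add(int a, int b) { return a + b; }\n\\end{code}";
    check_trigger(ta, output);                // runs `echo panel-closed`
}
```

The command's output could also be captured (e.g. via a pipe) and fed back into the context, which is the "access real world data" part of the idea above.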

PriNova commented Mar 21, 2023

> @PriNova So you didn't have the issue before #252? Have you tried an older version of the repo to see if the issue disappears?

No, it works like a charm, even with the --ignore-eos flag on the older models. The conversations did not spit out this }[/end of code] thing or anything else.

x02Sylvie commented:

Would it be possible to make the message appear once it's ready, rather than gradually as the AI generates tokens?

Kind of like how you get the full message when someone messages you on Discord, rather than seeing it as they write it.

tjohnman commented Mar 21, 2023

> Would it be possible to make the message appear once it's ready, rather than gradually as the AI generates tokens?
>
> Kind of like how you get the full message when someone messages you on Discord, rather than seeing it as they write it.

The problem is that sometimes the sampler gets stuck in a loop and you would never see any text until it ran out of memory.

If that weren't a problem, this could be done by printing the newly generated text only when we are about to ask for user input: instead of printing newly generated words immediately, they would be saved in a string buffer. The condition for this behavior would be that interactive mode is enabled and a reverse prompt is set.
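A small self-contained sketch of that buffering idea, assuming interactive mode and a reverse prompt are set; the loop and the hard-coded pieces below merely stand in for the real generation loop:

```cpp
#include <iostream>
#include <string>

int main() {
    const bool interactive = true;                   // --interactive
    const std::string reverse_prompt = "User:";      // -r "User:"
    const bool buffer_output = interactive && !reverse_prompt.empty();

    std::string pending;                             // text generated since the last flush
    // stand-in for the generation loop: pretend these are detokenized pieces
    const char * pieces[] = { "Bob:", " Sure", ",", " 2+2", " is", " 4.", "\n", "User:" };

    for (const char * piece : pieces) {
        if (buffer_output) {
            pending += piece;
            // flush only once the reverse prompt shows up, i.e. it's the user's turn
            if (pending.size() >= reverse_prompt.size() &&
                pending.compare(pending.size() - reverse_prompt.size(),
                                reverse_prompt.size(), reverse_prompt) == 0) {
                std::cout << pending << std::flush;
                pending.clear();
            }
        } else {
            std::cout << piece << std::flush;        // current behavior: stream token by token
        }
    }
}
```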

rabidcopy commented Mar 21, 2023

I think I'm going to throw together a personal hack that throws characters like #()\/[]{} out of sampling by piggybacking off the --ignore-eos code. I've been getting a lot more random code garbage in my sessions.

rabidcopy commented Mar 21, 2023

Haha. This is a bit overkill but it works.

You:Hi
Assistant: Do you need anything?
You: Can you type a backslash?
Assistant: Sure thing! 
You: A forward slash?
Assistant: No problem! 
You: A hashtag?
Assistant: Of course! 
You: can you repeat [debug]?
Assistant: Absolutely! 
You:how about wrap a quote in paranthesis?
Assistant: With pleasure! 
You:how about a short 4 word quote about a dog?
Assistant: No worries! 
You:no paranthesis
Assistant: What would you like me to do now?
You: give me a short 4 word quote about a dog, wrap it in quotation marks
Assistant: "My Dog is so cute!" 
You: That was 5 words but thank you.
static const int EOS_TOKEN_ID = 2;
static const int HASH_TOKEN_ID = 396;
static const int FORWARDSLASH_TOKEN_ID = 847;
static const int BACKSLASH_TOKEN_ID = 320;
static const int LEFTBRACKET_TOKEN_ID = 518;
static const int RIGHTBRACKET_TOKEN_ID = 4514;
static const int LEFTCURLY_TOKEN_ID = 426;
static const int RIGHTCURLY_TOKEN_ID = 500;
static const int LEFT_TOKEN_ID = 313;
static const int RIGHT_TOKEN_ID = 1723;
                if (params.ignore_eos) {
                    // zero the logits of the EOS token and of these code-punctuation
                    // tokens so they are (effectively) never sampled
                    logits[logits.size() - n_vocab + EOS_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + HASH_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFT_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHT_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFTCURLY_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHTCURLY_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFTBRACKET_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHTBRACKET_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + BACKSLASH_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + FORWARDSLASH_TOKEN_ID] = 0;
                }

To be fair, this isn't foolproof, because there are also token IDs for cases where these characters sit next to each other and get tokenized together.
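One way around both the hardcoded IDs and the merged-character tokens could be to scan the whole vocabulary once for any token whose text contains a banned character. The sketch below works over a plain id-to-string table, since the exact vocabulary accessor in llama.cpp has changed between versions:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Sketch: collect every token ID whose text contains one of the banned characters,
// instead of hardcoding individual IDs. This also catches merged tokens like "]}".
// The id_to_token table is a stand-in for whatever vocabulary accessor is available.
static std::vector<int> find_banned_tokens(const std::vector<std::string> & id_to_token,
                                           const std::string & banned_chars) {
    std::vector<int> banned;
    for (int id = 0; id < (int) id_to_token.size(); ++id) {
        if (id_to_token[id].find_first_of(banned_chars) != std::string::npos) {
            banned.push_back(id);
        }
    }
    return banned;
}

// In the sampling loop, the result would then be applied the same way as above:
//
//   for (const int id : banned) {
//       logits[logits.size() - n_vocab + id] = 0;
//   }

int main() {
    const std::vector<std::string> vocab = { "Hello", "#", "]}", " the", "\\end" };
    for (const int id : find_banned_tokens(vocab, "#()\\/[]{}")) {
        std::printf("banned token id: %d\n", id);   // prints 1, 2 and 4
    }
}
```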

Belluxx commented Mar 21, 2023

That's amazing, thanks @rabidcopy, I will test it ASAP.

Another issue I noticed is that sometimes LLaMA stops mid-generation and you need to press Enter to make it continue; however, this adds a \n to the chat that confuses the network and makes it completely exit the chat and produce garbage. Did it happen to you?

renatocron commented:

That is happening to me as well, but I thought it was some kind of hallucination, because the answer was hilarious, somewhat like how a troll would respond.

github-actions bot added the stale label Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
