Improve the Chat Mode with some tricks and considerations #353

Closed
Belluxx opened this issue Mar 21, 2023 · 13 comments
Labels: enhancement (New feature or request), stale

Comments

Belluxx commented Mar 21, 2023

I noticed that the interactive mode (used as a chat with, for example, the chat-with-bob.txt initial prompt) often fails because LLaMA tries to escape the chat (mainly with the expression \end{code}).

To avoid that it is possible to pass the argument -r "\end{code}", but since the expression doesn't get removed from the chat, LLaMA interprets it as the end of the conversation; all the previous dialog context (including what's inside chat-with-bob.txt) gets lost and LLaMA starts to behave weirdly.

So it would be cool to have a --chat option that detects expressions like \end{code}, removes them from the context, and forcefully appends User: at the end of the chat so that the conversation can continue without losing context.
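A minimal sketch of the stripping logic such a --chat option could apply (the function name and the hard-coded strings below are illustrative only, not existing llama.cpp code):

```cpp
#include <iostream>
#include <string>

// Sketch: drop a trailing escape sequence from the accumulated chat text and
// re-inject the user prefix so the dialog can continue without losing context.
static void strip_and_reinject(std::string & chat,
                               const std::string & stop_str,
                               const std::string & user_prefix) {
    const size_t pos = chat.rfind(stop_str);
    if (pos != std::string::npos) {
        chat.erase(pos);                 // remove the escape sequence and anything after it
    }
    if (chat.empty() || chat.back() != '\n') {
        chat += '\n';
    }
    chat += user_prefix;                 // hand the turn back to the user
}

int main() {
    std::string chat = "Bob: Sure, here you go.\n\\end{code}";
    strip_and_reinject(chat, "\\end{code}", "User:");
    std::cout << chat << std::endl;      // prints "Bob: Sure, here you go." then "User:"
}
```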

BadisG commented Mar 21, 2023

I'm not sure it's a good idea to remove this. When we use llama.cpp in a front end, we could use this \end{code} as a flag to close a panel.

For example:

llama writes the code
\end{code} -> would serve as a flag to close the panel

A bit like ChatGPT, actually.

Belluxx commented Mar 21, 2023

@BadisG Yeah, I agree that's an issue; however, LLaMA misuses that flag in conversations that have nothing to do with code, and after printing the flag it starts outputting nonsense.

It happens very often (roughly every 8 or 10 dialog lines).

PriNova commented Mar 21, 2023

> I noticed that the interactive mode (used as a chat with, for example, the chat-with-bob.txt initial prompt) often fails because LLaMA tries to escape the chat (mainly with the expression \end{code}).
>
> To avoid that it is possible to pass the argument -r "\end{code}", but since the expression doesn't get removed from the chat, LLaMA interprets it as the end of the conversation; all the previous dialog context (including what's inside chat-with-bob.txt) gets lost and LLaMA starts to behave weirdly.
>
> So it would be cool to have a --chat option that detects expressions like \end{code}, removes them from the context, and forcefully appends User: at the end of the chat so that the conversation can continue without losing context.

This is the same observation I have had since PR #252 was merged.
I don't know why this happens, but I suspect there is an issue with that PR's implementation.

gjmulder added the enhancement (New feature or request) label Mar 21, 2023
Belluxx commented Mar 21, 2023

@PriNova So you didn't have the issue before #252?
Have you tried an older version of the repo to see if the issue disappears?

tjohnman (Contributor) commented:

I think this might be related to the recent discussions we've been having about the EOS token being generated and messing up the output (#333).

In any case, I don't think it's a good idea to detect specific strings in a hardcoded way (if that's what's being suggested). But it would be nice to be able to configure the sampler to detect certain words in order to do arbitrary actions.

For example, using a flag, like: --extract "\end{code}" "<CLI command that signals some other piece of software to do something>". This, in combination with clever prompting, could be used to "teach" a chatbot to do things with real side effects. The output of that command could even be piped back into the model so that it can access real world data.

And more in line with what @Belluxx suggested, perhaps it could be useful to have a flag that allows the sampler to throw away certain sequences we don't want generated, like: --ignore "\end{code}".
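A rough sketch of what the hypothetical --extract "<trigger>" "<command>" behavior could look like, assuming it simply scans the accumulated output and shells out when the trigger appears (the struct, function, and use of std::system are illustrative, not an existing llama.cpp feature):

```cpp
#include <cstdlib>
#include <string>

// Sketch of a hypothetical --extract "<trigger>" "<command>" option: watch the
// growing output for the trigger string and run an external command once per match.
struct trigger_action {
    std::string trigger;     // e.g. "\\end{code}"
    std::string command;     // e.g. "./notify-frontend.sh close-panel"
    size_t      scanned = 0; // how far into the output we have already searched
};

// Call this after appending newly generated text to `output`.
static void check_trigger(trigger_action & ta, const std::string & output) {
    const size_t pos = output.find(ta.trigger, ta.scanned);
    if (pos == std::string::npos) {
        // keep a small overlap so a trigger split across two token batches is still found
        ta.scanned = output.size() > ta.trigger.size() ? output.size() - ta.trigger.size() : 0;
        return;
    }
    std::system(ta.command.c_str());          // the requested side effect
    ta.scanned = pos + ta.trigger.size();     // don't fire twice on the same match
}

int main() {
    trigger_action ta{"\\end{code}", "echo panel-closed"};
    std::string output;
    output += "Here is the function you asked for:\n";
    check_trigger(ta, output);                // no trigger yet, nothing happens
    output += "int add(int a, int b) { return a + b; }\n\\end{code}";
    check_trigger(ta, output);                // runs `echo panel-closed`
}
```

The command's output could also be captured (e.g. via a pipe) and fed back into the context, which is the "access real world data" part of the idea above.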

PriNova commented Mar 21, 2023

> @PriNova So you didn't have the issue before #252? Have you tried an older version of the repo to see if the issue disappears?

No, it works like a charm, even with the --ignore-eos flag on the older models. The conversations did not spit out this }[/end of code] thing or anything else.

x02Sylvie commented:

Would it be possible to make the message appear once it's ready, rather than gradually as the AI generates tokens?

Kind of like how you get the full message when someone messages you on Discord, rather than seeing it as they write it.

tjohnman commented Mar 21, 2023

> Would it be possible to make the message appear once it's ready, rather than gradually as the AI generates tokens?
>
> Kind of like how you get the full message when someone messages you on Discord, rather than seeing it as they write it.

The problem is that sometimes the sampler gets stuck in a loop and you would never see any text until it ran out of memory.

If that weren't a problem, this could be done by printing the newly generated text only when we are about to ask for user input: instead of printing newly generated words immediately, they would be saved in a string buffer. The condition for this behavior would be that interactive mode is enabled and a reverse prompt is set.
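A small self-contained sketch of that buffering idea, assuming interactive mode and a reverse prompt are set; the loop and the hard-coded pieces below merely stand in for the real generation loop:

```cpp
#include <iostream>
#include <string>

int main() {
    const bool interactive = true;                   // --interactive
    const std::string reverse_prompt = "User:";      // -r "User:"
    const bool buffer_output = interactive && !reverse_prompt.empty();

    std::string pending;                             // text generated since the last flush
    // stand-in for the generation loop: pretend these are detokenized pieces
    const char * pieces[] = { "Bob:", " Sure", ",", " 2+2", " is", " 4.", "\n", "User:" };

    for (const char * piece : pieces) {
        if (buffer_output) {
            pending += piece;
            // flush only once the reverse prompt shows up, i.e. it's the user's turn
            if (pending.size() >= reverse_prompt.size() &&
                pending.compare(pending.size() - reverse_prompt.size(),
                                reverse_prompt.size(), reverse_prompt) == 0) {
                std::cout << pending << std::flush;
                pending.clear();
            }
        } else {
            std::cout << piece << std::flush;        // current behavior: stream token by token
        }
    }
}
```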

rabidcopy commented Mar 21, 2023

I think I'm going to throw together a personal hack that throws characters like #()\/[]{} out of sampling by piggybacking off the --ignore-eos code. I've been getting a lot more random code garbage in my sessions.

rabidcopy commented Mar 21, 2023

Haha. This is a bit overkill but it works.

You:Hi
Assistant: Do you need anything?
You: Can you type a backslash?
Assistant: Sure thing! 
You: A forward slash?
Assistant: No problem! 
You: A hashtag?
Assistant: Of course! 
You: can you repeat [debug]?
Assistant: Absolutely! 
You:how about wrap a quote in paranthesis?
Assistant: With pleasure! 
You:how about a short 4 word quote about a dog?
Assistant: No worries! 
You:no paranthesis
Assistant: What would you like me to do now?
You: give me a short 4 word quote about a dog, wrap it in quotation marks
Assistant: "My Dog is so cute!" 
You: That was 5 words but thank you.
static const int EOS_TOKEN_ID = 2;
static const int HASH_TOKEN_ID = 396;
static const int FORWARDSLASH_TOKEN_ID = 847;
static const int BACKSLASH_TOKEN_ID = 320;
static const int LEFTBRACKET_TOKEN_ID = 518;
static const int RIGHTBRACKET_TOKEN_ID = 4514;
static const int LEFTCURLY_TOKEN_ID = 426;
static const int RIGHTCURLY_TOKEN_ID = 500;
static const int LEFT_TOKEN_ID = 313;
static const int RIGHT_TOKEN_ID = 1723;
                if (params.ignore_eos) {
                    // zero the logits of the EOS token and of these code-punctuation
                    // tokens so they are (effectively) never sampled
                    logits[logits.size() - n_vocab + EOS_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + HASH_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFT_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHT_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFTCURLY_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHTCURLY_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + LEFTBRACKET_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + RIGHTBRACKET_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + BACKSLASH_TOKEN_ID] = 0;
                    logits[logits.size() - n_vocab + FORWARDSLASH_TOKEN_ID] = 0;
                }

To be fair, this isn't foolproof, because there are also token IDs for cases where these characters sit next to each other and get tokenized together.
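One way around both the hardcoded IDs and the merged-character tokens could be to scan the whole vocabulary once for any token whose text contains a banned character. The sketch below works over a plain id-to-string table, since the exact vocabulary accessor in llama.cpp has changed between versions:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Sketch: collect every token ID whose text contains one of the banned characters,
// instead of hardcoding individual IDs. This also catches merged tokens like "]}".
// The id_to_token table is a stand-in for whatever vocabulary accessor is available.
static std::vector<int> find_banned_tokens(const std::vector<std::string> & id_to_token,
                                           const std::string & banned_chars) {
    std::vector<int> banned;
    for (int id = 0; id < (int) id_to_token.size(); ++id) {
        if (id_to_token[id].find_first_of(banned_chars) != std::string::npos) {
            banned.push_back(id);
        }
    }
    return banned;
}

// In the sampling loop, the result would then be applied the same way as above:
//
//   for (const int id : banned) {
//       logits[logits.size() - n_vocab + id] = 0;
//   }

int main() {
    const std::vector<std::string> vocab = { "Hello", "#", "]}", " the", "\\end" };
    for (const int id : find_banned_tokens(vocab, "#()\\/[]{}")) {
        std::printf("banned token id: %d\n", id);   // prints 1, 2 and 4
    }
}
```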

Belluxx commented Mar 21, 2023

That's amazing, thanks @rabidcopy, I will test it ASAP.

Another issue I noticed is that sometimes LLaMA stops mid-generation and you need to press Enter to make it continue; however, this adds a \n to the chat that confuses the network and makes it completely exit the chat and produce garbage. Did it happen to you?

renatocron commented:

That is happening to me as well, but I thought it was some kind of hallucination, because the answer was hilarious, somewhat like how a troll would respond.

github-actions bot added the stale label Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
