Agent should send interrupt on push_aggregation or `EmulateUserStartedSpeakingFrame` ? #1258

seanmuirhead · 2025-02-20T19:21:56Z

We have some cases where the agent says the same thing twice in a row without the user getting a chance to speak. I've been able to reproduce it, and I think it is because we are not pushing a `StartInterruptionFrame` when we push an `EmulateUserStartedSpeakingFrame` or when we push an aggregation from the `LLMUserContextAggregator`. Is that right?


aconchillo · 2025-02-21T06:42:27Z

> Have some cases where the agent is saying the same thing twice in a row without the user getting the chance to speak. I've been able to replicate, and I think it is because we are not pushing StartInterruptionFrames when we push EmulateUserStartedSpeakingFrame and when we are pushing an aggregation from the LLMUserContextAggregator?

In theory, if an `EmulateUserStartedSpeakingFrame` is pushed upstream, it will immediately cause an interruption: `StartInterruptionFrame` + `UserStartedSpeakingFrame`.
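To make the expected flow concrete, here is a toy model in plain Python. It does not import or reproduce Pipecat code: `Frame`, `ToyInput`, and `handle_upstream` are hypothetical names, the frame order simply follows the sentence above, and the behavior with `allow_interruptions=False` is an assumption.

```python
from dataclasses import dataclass, field


# Toy stand-ins that only borrow the frame names used in this thread.
# These are NOT the Pipecat classes.
@dataclass(frozen=True)
class Frame:
    name: str


EMULATE_STARTED = Frame("EmulateUserStartedSpeakingFrame")
START_INTERRUPTION = Frame("StartInterruptionFrame")
USER_STARTED = Frame("UserStartedSpeakingFrame")


@dataclass
class ToyInput:
    """Hypothetical model of the input end of a pipeline."""

    allow_interruptions: bool = True
    downstream: list = field(default_factory=list)

    def handle_upstream(self, frame: Frame) -> None:
        # Assumption: an emulated "user started speaking" event is treated
        # like real user speech, so it yields an interruption followed by
        # the user-started-speaking frame.
        if frame == EMULATE_STARTED:
            if self.allow_interruptions:
                self.downstream.append(START_INTERRUPTION)
            self.downstream.append(USER_STARTED)


toy_input = ToyInput(allow_interruptions=True)
toy_input.handle_upstream(EMULATE_STARTED)
print([f.name for f in toy_input.downstream])
```

If a processor between the input and the TTS drops either of those two frames, the TTS would never see the interruption, which is one reason the list of processors in the pipeline matters here.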

Can you share your Pipecat version and which processors you have in your pipeline? For example, do you use an `STTMuteFilter`?

seanmuirhead · 2025-02-21T18:07:24Z

> In theory, if an EmulateUserStartedSpeakingFrame is pushed upstream this will immediately cause an interruption: StartInterruptionFrame + UserStartedSpeakingFrame.

Ah, that makes sense.

> Can you describe Pipecat version and what processors do you have in your pipeline? For example, do you use a STTMuteFilter?

Version: 0.0.57

Yes, we do use the `STTMuteFilter`. Here is our pipeline:

pipeline = Pipeline(
    [
        transport.input(),
        stt_mute_filter,  # Should always be before STT
        stt,
        context_aggregator.user(),
        llm,
        tts,
        custom_frame_processor,
        transport.output(),
        context_aggregator.assistant(),
    ],
)

task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))

The `custom_frame_processor` sometimes controls the `stt_mute_filter`, depending on a variety of TTS events.

aconchillo linked a pull request Feb 21, 2025 that will close this issue

services: fix some TTS websocket service interruption handling #1272

Open
