Agent should send interrupt on push_aggregation or EmulateUserStartedSpeakingFrame ? #1258

Open
seanmuirhead opened this issue Feb 20, 2025 · 2 comments · May be fixed by #1272
Comments

@seanmuirhead

We have some cases where the agent says the same thing twice in a row without the user getting the chance to speak. I've been able to replicate it, and I think it is because we are not pushing StartInterruptionFrames when we push EmulateUserStartedSpeakingFrame and when we push an aggregation from the LLMUserContextAggregator?

@aconchillo
Contributor

> We have some cases where the agent says the same thing twice in a row without the user getting the chance to speak. I've been able to replicate it, and I think it is because we are not pushing StartInterruptionFrames when we push EmulateUserStartedSpeakingFrame and when we push an aggregation from the LLMUserContextAggregator?

In theory, if an EmulateUserStartedSpeakingFrame is pushed upstream, this will immediately cause an interruption: StartInterruptionFrame + UserStartedSpeakingFrame.

Can you share your Pipecat version and which processors you have in your pipeline? For example, do you use an STTMuteFilter?
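
The expected behavior described in this comment can be sketched with a small toy model. This is not Pipecat code: `Frame` and `handle_upstream_at_input` are hypothetical names made up for illustration, and only the frame names come from the thread. The order of the two emitted frames simply follows the order in which the comment lists them and has not been checked against the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Toy stand-in for a pipeline frame; only the name matters here."""

    name: str


def handle_upstream_at_input(frame: Frame) -> List[Frame]:
    """Toy model of the pipeline input reacting to a frame pushed upstream.

    Returns the frames that would be sent back downstream.
    """
    if frame.name == "EmulateUserStartedSpeakingFrame":
        # Assumption: emit an interruption plus a "user started speaking" frame,
        # in the order the comment above lists them.
        return [Frame("StartInterruptionFrame"), Frame("UserStartedSpeakingFrame")]
    # Any other upstream frame triggers nothing in this toy model.
    return []


emitted = handle_upstream_at_input(Frame("EmulateUserStartedSpeakingFrame"))
print([f.name for f in emitted])
```

If the agent still repeats itself, comparing the frames actually seen downstream against this expectation may help show where the interruption is lost.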

@seanmuirhead
Author

> In theory, if an EmulateUserStartedSpeakingFrame is pushed upstream, this will immediately cause an interruption: StartInterruptionFrame + UserStartedSpeakingFrame.

Ah, makes sense.

> Can you share your Pipecat version and which processors you have in your pipeline? For example, do you use an STTMuteFilter?

Version: 0.0.57

Yes, we do use the STTMuteFilter. Here is our pipeline:

pipeline = Pipeline(
    [
        transport.input(),
        stt_mute_filter,  # Should always be before STT
        stt,
        context_aggregator.user(),
        llm,
        tts,
        custom_frame_processor,
        transport.output(),
        context_aggregator.assistant(),
    ],
)

task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))

The custom_frame_processor sometimes controls the stt_mute_filter, depending on various TTS events.
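
One interaction worth ruling out, given the question about STTMuteFilter, is a mute filter swallowing the interruption while it is muted. The sketch below is a toy model under a loud assumption: that a muted filter drops interruption-related frames. `ToyMuteFilter` and `run` are hypothetical names, not Pipecat's API, `TranscriptionFrame` is used only as an example of a frame that is not interruption-related, and nothing in this thread confirms that this is the cause.

```python
from typing import List, Optional

# Frame names treated as interruption-related in this toy model.
INTERRUPTION_FRAMES = {"StartInterruptionFrame", "UserStartedSpeakingFrame"}


class ToyMuteFilter:
    """Hypothetical stand-in for a mute filter, not Pipecat's STTMuteFilter."""

    def __init__(self) -> None:
        self.muted = False

    def process(self, frame_name: str) -> Optional[str]:
        # Assumption: while muted, interruption-related frames are dropped.
        if self.muted and frame_name in INTERRUPTION_FRAMES:
            return None
        return frame_name


def run(mute_filter: ToyMuteFilter, frame_names: List[str]) -> List[str]:
    """Pass frame names through the filter and keep the ones that survive."""
    passed = []
    for name in frame_names:
        result = mute_filter.process(name)
        if result is not None:
            passed.append(result)
    return passed


frames = ["StartInterruptionFrame", "UserStartedSpeakingFrame", "TranscriptionFrame"]

mute_filter = ToyMuteFilter()
unmuted_result = run(mute_filter, frames)

# Hypothetical: a downstream processor mutes the filter on some TTS event.
mute_filter.muted = True
muted_result = run(mute_filter, frames)

print(unmuted_result)
print(muted_result)
```

In this model, an interruption that arrives while the filter is muted never reaches the LLM or TTS, so logging the mute state alongside the frames passing through the filter would be a cheap way to check whether something similar happens in the real pipeline.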
