🐛Webserver: exclusive/non-exclusive RabbitMQ consumers are deleting each other, and also probably replacing each other #5415

@sanderegg sanderegg commented Mar 5, 2024

What do these changes do?

Graylog analysis showed that the exception below occurs quite often. The cause: both _rabbitmq_nonexclusive_queue_consumers and _rabbitmq_exclusive_queue_consumers stored their respective consumers under the same app[APP_RABBITMQ_CONSUMERS_KEY]. On initialization, the module set up last therefore overwrote the entry of the first, and the same collision happened on deletion, which explains the errors shown in Graylog. It is not yet clear whether this also contributes to the lost-logs issues.
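The collision described above can be reproduced with a minimal sketch. It assumes a plain dict as a stand-in for the aiohttp application state. The key values, the two derived key names, and the helpers register, unregister and run are hypothetical and only serve the illustration. Separate keys are one possible remedy; the actual change in this PR may differ.

```python
from typing import Any

# Hypothetical values; only the name APP_RABBITMQ_CONSUMERS_KEY appears in the PR.
APP_RABBITMQ_CONSUMERS_KEY = "rabbitmq_consumers"
EXCLUSIVE_KEY = f"{APP_RABBITMQ_CONSUMERS_KEY}.exclusive"
NONEXCLUSIVE_KEY = f"{APP_RABBITMQ_CONSUMERS_KEY}.nonexclusive"


def register(app: dict[str, Any], key: str, consumers: dict[str, str]) -> None:
    """Store a module's queue-name -> consumer-tag mapping in the app state."""
    app[key] = consumers


def unregister(app: dict[str, Any], key: str) -> dict[str, str]:
    """Remove and return what is stored under key (what cleanup would unsubscribe)."""
    return app.pop(key, {})


def run(exclusive_key: str, nonexclusive_key: str) -> tuple[dict, dict]:
    """Set up both modules, then let each one clean up what it finds under its key."""
    app: dict[str, Any] = {}  # stand-in for the aiohttp application state
    register(app, nonexclusive_key, {"queue.nonexclusive": "tag-a"})
    register(app, exclusive_key, {"queue.exclusive": "tag-b"})
    seen_by_exclusive = unregister(app, exclusive_key)
    seen_by_nonexclusive = unregister(app, nonexclusive_key)
    return seen_by_exclusive, seen_by_nonexclusive


# Shared key: the second registration overwrites the first, and the
# non-exclusive module finds nothing left to clean up.
buggy = run(APP_RABBITMQ_CONSUMERS_KEY, APP_RABBITMQ_CONSUMERS_KEY)

# Distinct keys: each module sees exactly its own consumers.
fixed = run(EXCLUSIVE_KEY, NONEXCLUSIVE_KEY)

print(buggy)
print(fixed)
```

With the shared key, the non-exclusive consumers are silently dropped from the app state at setup, so they are never unsubscribed by their own module.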

A second issue was identified that occurs when a user logs out (it is also visible in the e2e tests). Additional log output was added to help diagnose it.

Unhandled exception:
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/channel.py", line 182, in rpc
    result = await countdown(self.rpc_frames.get())
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/tools.py", line 95, in __call__
    return await coro
  File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get
    await getter
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/abc.py", line 44, in __inner
    return await self.task
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/servicelib/logging_utils.py", line 232, in log_catch
    yield
  File "/home/scu/.venv/lib/python3.10/site-packages/simcore_service_webserver/notifications/_rabbitmq_exclusive_queue_consumers.py", line 204, in _unsubscribe_from_rabbitmq
    await logged_gather(
  File "/home/scu/.venv/lib/python3.10/site-packages/servicelib/utils.py", line 141, in logged_gather
    raise error
  File "/home/scu/.venv/lib/python3.10/site-packages/servicelib/rabbitmq/_client.py", line 298, in unsubscribe
    queue = await channel.get_queue(queue_name)
  File "/home/scu/.venv/lib/python3.10/site-packages/aio_pika/channel.py", line 375, in get_queue
    return await self.declare_queue(name=name, passive=True)
  File "/home/scu/.venv/lib/python3.10/site-packages/aio_pika/robust_channel.py", line 233, in declare_queue
    queue: RobustQueue = await super().declare_queue(   # type: ignore
  File "/home/scu/.venv/lib/python3.10/site-packages/aio_pika/channel.py", line 350, in declare_queue
    await queue.declare(timeout=timeout)
  File "/home/scu/.venv/lib/python3.10/site-packages/aio_pika/queue.py", line 86, in declare
    self.declaration_result = await channel.queue_declare(
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/channel.py", line 858, in queue_declare
    return await self.rpc(
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/base.py", line 164, in wrap
    return await self.create_task(func(self, *args, **kwargs))
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/abc.py", line 46, in __inner
    raise self._exception from e
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 650, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/abc.py", line 46, in __inner
    raise self._exception from e
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/channel.py", line 441, in _reader
    await hook(frame)
  File "/home/scu/.venv/lib/python3.10/site-packages/aiormq/channel.py", line 410, in _on_close_frame
    raise exc
aiormq.exceptions.ChannelNotFoundEntity: NOT_FOUND - no queue 'simcore.services.instrumentation' in vhost '/'

Related issue/s

How to test

Dev Checklist

DevOps Checklist

@sanderegg sanderegg added the a:webserver issue related to the webserver service label Mar 5, 2024
@sanderegg sanderegg added this to the Schoggilebe milestone Mar 5, 2024
@sanderegg sanderegg self-assigned this Mar 5, 2024
@sanderegg sanderegg requested a review from bisgaard-itis March 5, 2024 21:33

sonarqubecloud bot commented Mar 5, 2024

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@pcrespov pcrespov left a comment

great catch!

@matusdrobuliak66 matusdrobuliak66 left a comment

Thanks! Might this also be connected to the logs that sometimes go missing?

@sanderegg sanderegg merged commit 0041075 into ITISFoundation:master Mar 6, 2024
50 checks passed
@sanderegg sanderegg deleted the bugfix/rabbitmq_consumers_get_deleted branch March 6, 2024 06:46