This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Failed to handle stream account_data: TypeError: '<' not supported between instances of 'str' and 'NoneType' #7617

Closed
richvdh opened this issue Jun 2, 2020 · 2 comments · Fixed by #7656
Assignees
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication)

Comments

@richvdh
Member

richvdh commented Jun 2, 2020

Possibly this happens if we have global account data and per-room account data with the same stream_id?

TypeError: '<' not supported between instances of 'str' and 'NoneType'
  File "synapse/metrics/background_process_metrics.py", line 213, in run
    return (yield result)
  File "twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "synapse/replication/tcp/resource.py", line 146, in _run_notifier_loop
    updates, current_token, limited = await stream.get_updates()
  File "synapse/replication/tcp/streams/_base.py", line 161, in get_updates
    self.local_instance_name, self.last_token, current_token
  File "synapse/replication/tcp/streams/_base.py", line 186, in get_updates_since
    instance_name, from_token, upto_token, _STREAM_UPDATE_TARGET_ROW_COUNT,
  File "synapse/replication/tcp/streams/_base.py", line 604, in _update_function
    updates = list(heapq.merge(room_rows, global_rows))
  File "heapq.py", line 353, in merge
    _heapify(h)
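The failure above can be reproduced with a minimal sketch, assuming rows of the form `(stream_id, room_id)` with `None` in the room-ID position for global account data (the values here are illustrative, not real Synapse data):

```python
import heapq

# Illustrative rows: (stream_id, room_id) for room account data and
# (stream_id, None) for global account data.
room_rows = [(534, "!room:example.com")]
global_rows = [(534, None)]

error = ""
try:
    # Plain tuple comparison: equal stream IDs fall through to comparing
    # "!room:example.com" with None, which Python 3 rejects.
    list(heapq.merge(room_rows, global_rows))
except TypeError as exc:
    error = str(exc)
print(error)
```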
@cremesk
Contributor

cremesk commented Jun 4, 2020

I see the same issue in my Synapse log.

@sndrsmnk

sndrsmnk commented Jun 5, 2020

Yes. Seeing the same.

Some more logging for the Googles:

2020-06-05 23:41:25,607 - synapse.access.http.9093 - 311 - INFO - POST-141761 - - - 9093 - {None} Processed request: 0.010sec/0.001sec (0.001sec, 0.000sec) (0.001sec/0.004sec/1) 4B 200 "POST /_synapse/replication/fed_send_edu/m.presence/YFkxZcrwmx HTTP/1.1" "Synapse/1.14.0" [0 dbevts]
2020-06-05 23:41:25,611 - synapse.replication.tcp.resource - 158 - INFO - replication_notifier-31487 - Streaming: presence -> 30072754
2020-06-05 23:41:25,619 - synapse.replication.tcp.resource - 149 - INFO - replication_notifier-31487 - Failed to handle stream account_data
2020-06-05 23:41:25,619 - synapse.metrics.background_process_metrics - 215 - ERROR - replication_notifier-31487 - Background process 'replication_notifier' threw an exception
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: [ .. participants, m.fully_read data .. ]

During handling of the above exception, another exception occurred:

[ .. and then the trace shown above .. ]

@erikjohnston erikjohnston self-assigned this Jun 8, 2020
erikjohnston added a commit that referenced this issue Jun 8, 2020
If the same stream ID was used in both global and room account data then
getting updates for the replication stream would fail due to
`heapq.merge(..)` trying to compare a `str` with a `None`. (This is
because you'd have two rows like `(534, '!room')` and `(534, None)` from
the room and global account data tables).

Fix is just to order by stream ID, since we don't rely on the ordering
beyond that. The bug where stream IDs can be reused should be fixed now,
so this case shouldn't happen going forward.

Fixes #7617
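The "order by stream ID" fix described above can be sketched with the `key` argument of `heapq.merge` (the rows here are illustrative, not the actual Synapse row types):

```python
import heapq

# Illustrative rows with a colliding stream ID (534) across the two streams.
room_rows = [(534, "!room:example.com"), (535, "!other:example.com")]
global_rows = [(534, None), (536, None)]

# Ordering by stream ID alone means the room-ID column is never compared,
# so mixed str/None rows merge without a TypeError.
merged = list(heapq.merge(room_rows, global_rows, key=lambda row: row[0]))
print(merged)
```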
erikjohnston added a commit that referenced this issue Jun 9, 2020
* Ensure account data stream IDs are unique.

The account data stream is shared between three tables, and the maximum
allocated ID was tracked in a dedicated table. Updating the max ID
happened outside the transaction that allocated the ID, leading to a
race where if the server was restarted then the same ID could be
allocated but the max ID failed to be updated, leading it to be reused.

The ID generators have support for tracking across multiple tables, so
we may as well use that instead of a dedicated table.

* Fix bug in account data replication stream.

If the same stream ID was used in both global and room account data then
getting updates for the replication stream would fail due to
`heapq.merge(..)` trying to compare a `str` with a `None`. (This is
because you'd have two rows like `(534, '!room')` and `(534, None)` from
the room and global account data tables).

Fix is just to order by stream ID, since we don't rely on the ordering
beyond that. The bug where stream IDs can be reused should be fixed now,
so this case shouldn't happen going forward.

Fixes #7617
phil-flex pushed a commit to phil-flex/synapse that referenced this issue Jun 16, 2020
@richvdh richvdh added the A-Workers Problems related to running Synapse in Worker Mode (or replication) label Feb 16, 2022
4 participants