-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v4 - requests against connections with unexpectedly closed sockets don't get fulfilled after reconnect #1789
Comments
This comment has been minimized.
This comment has been minimized.
@dhensby which version of npm are you using? because the project uses npm workspaces, which were introduced in npm v7 Regarding the issue itself - "When a socket closed unexpectedly, all the commands that were already sent will reject as they might have been executed on the server. The rest will remain queued in memory until a new socket is established. If the client is closed—either by returning an error from reconnectStrategy or by manually calling .disconnect()—they will be rejected." (https://github.com/redis/node-redis/blob/master/docs/FAQ.md#what-happens-when-the-network-goes-down) |
@leibale - thanks for the pointer, I upgraded NPM and it has built the tools now. I thought I had done this (as I noticed the github workflow doing it) but clearly not... Regarding the issue:
If what you cite is intended behaviour, then it's a pretty big regression from v3 and the logic that the request shouldn't be sent to the server because it may have already been fulfilled doesn't make sense if the socket was already closed before the request was sent and especially makes no sense for read operations (read operations could not have been fulfilled by the server, at leat not as far as the client/application is concerned, if the connection is down). |
My bad, I've missed the "hanging" part... 🤦♂️ This is not the intended behavior, and I was thinking that this test covers this, but I guess I'm missing something.. do you have a reproduction of that bug? |
I'm not sure I've been battling a little to get the test suite running locally, but I can run it enough to try to create a test now. I just need a way to stub/force the socket to emit an error event as needed 🤔 |
Hi @cit68. @leibale has written a subsequent comment indicating that the quote does not apply here. This is a bug and there are some changes in behaviour since v3, I'm not clear on if they are intentional. The main changes for when the connection has been reset (disconnected due to server-side timeout) then:
This issue isn't pertaining to those changes as they could well be intentional (though my personal preference would be that there was no error event and there was a reconnecting event emitted, because I don't want to have to filter "false" errors from the error listener). The issue here is that the request that triggers the reconnection logic in these circumstances never resolves. |
@leibale Sorry!!! The reason is that I had put a wrong label in the "reconnect" handler so I thought it was not possible to have a successful pong and write / read command without any reconnection and created the previous comment. But really there is an automatic reconnection before these commands, so my comment made no sense to exist. Thanks for the reply. But reading your point n°2 |
Ah, ok - Maybe I'm mis-remembering. I haven't looked at this for a couple of weeks. |
I've recently upgraded from v3 to v4 and noticed that we're seeing
client.get()
calls that are hanging. I've not managed to pin-point the exact area where the problem is originating (compiled typescript doesn't make this easy) but we are noticing this in conjunction with connections that are being closed (presumably by the sever) after inactivity.Cursory investigations appear to point to connections where the socket has been closed by the server (after ~10 minutes of inactivity) which are then resulting in a re-connection attempts (which is ultimately successful) but the queued
get
request is never being fulfilled.I've remedied this by adding a
ping
keep-alive (as suggested here, though implemented in a slightly less naive way) and the problem has gone away (we're no longer seeing closed connections and no longer seeing unfulfilled requests).This was not a problem in v3, the connection may timeout but a reconnect was made and the
get
call fulfilled successfully.Possibly related issues:
I'll spend a bit of time trying to reproduce this in a test, but ultimately I've papered over this issue by implementing a ping-based keep-alive for my client.
Other comments:
In v3 this situation (attempting to make a request when the underlying stream was closed) resulted in a
reconnecting
event and noerror
event. In v4 we are seeing noreconnecting
event and we are seeing anerror
event. I'm not sure if this is an intended change, but it makes discerning genuine errors and errors that can be handled by the built in re-connection strategy harder.Environment:
The text was updated successfully, but these errors were encountered: