Socket closed unexpectedly #2276

richardsun2021 · 2022-09-29T07:26:23Z

 this.redisClient = redis.createClient({
			url: `redis://${redisHost}:${redisPort}`,
			socket: {
				connectTimeout: 60000,
				keepAlive: 60000,
				reconnectStrategy: (attempts) => {
					logger.log(`Redis reconnecting attempt ${attempts}`);
					if (attempts == 1) {
						console.log(`${this.constructor.name} failed to connect to ${redisHost}:${redisPort}. Reconnecting...`);
					}
					return 500;
				},
			},
			password: ******
		});
		this.redisClient.on('error', err => logger.error(err, "Redis v4 client error"));
		this.redisClient.on('connect', () => logger.log('Redis v4 is connect'));
		this.redisClient.on('reconnecting', () => logger.log('Redis v4 is reconnecting'));
		this.redisClient.on('ready', () => logger.log('Redis v4 is ready'));
		this.redisClient.connect();

between my app service and redis service, there is an aws NLB-network load balancer.
I have four redis client created, and every 2 minutes will get this error:
"error: Socket closed unexpectedly"

Environment:

Node.js Version: "16.16"
Redis Server Version: "6.2"
Node Redis Version: "4.3.1"
Platform: aws Ecs

The text was updated successfully, but these errors were encountered:

abhi12299 · 2022-10-02T15:03:55Z

Same issue here.. running Azure Redis Cache with TLS results in random "Socket closed unexpectedly" errors every 5-10mins after starting my application (on AKS). This error also occurs when running the app locally on my machine (no proxies/load balancers).

schmod · 2022-10-04T13:54:14Z

Try setting connectTimeout: false

I think this library is mis-using Node's Socket.setTimeout() method, and is tearing the connection down after any idle period of 60 seconds or more.

yaroslavputria · 2022-10-06T12:37:44Z

Try setting connectTimeout: false

@schmod, do you mean connectTimeout: 0? - socket.connectTimeout is number.

richardsun2021 · 2022-10-10T08:38:18Z

it works fine for me locally, pre-production test found this issue.

Same issue here.. running Azure Redis Cache with TLS results in random "Socket closed unexpectedly" errors every 5-10mins after starting my application (on AKS). This error also occurs when running the app locally on my machine (no proxies/load balancers).

abhi12299 · 2022-10-13T06:52:00Z

Switched to ioredis for now.. this library is not suitable for production use until the issue is fixed.

richardsun2021 · 2022-10-17T02:14:21Z

Switched to ioredis for now.. this library is not suitable for production use until the issue is fixed.

while 3.* is stable, but no bloom module, for me would like to switch to 3.* and implement bloom by other lib.

shawnlknight · 2022-10-25T19:13:00Z

I am also seeing this issue and will not upgrade to v4 until it is resolved. Any word on this?!?

leibale · 2022-10-25T19:39:25Z

Hi everyone, sorry for the dealy, I was on vacation..

I wasn't able to reproduce the error with "docker run redis/redis-stack-server", can you please help me reproduce the issue so I'll be able to fix it?

schmod · 2022-10-25T19:42:24Z

I'll reiterate my previous comment about connectTimeout – the current implementation calls:

if (this.#options.connectTimeout) {
   socket.setTimeout(this.#options.connectTimeout, () => socket.destroy(new ConnectionTimeoutError()));
}

However, socket.setTimeout isn't a connection timeout – it's a timeout for any activity on the socket. Let it sit idle for too long, and it'll self-destruct.

From the Node docs:

Sets the socket to timeout after timeout milliseconds of inactivity on the socket. By default net.Socket do not have a timeout.

When an idle timeout is triggered the socket will receive a 'timeout' event but the connection will not be severed. The user must manually call socket.end() or socket.destroy() to end the connection.

I'm not sure if this is the cause of the issues described in this thread, but it's almost certainly a bug.

leibale · 2022-10-25T20:15:36Z

@schmod the timeout is reset to 0 after the connect event, see this line

abhi12299 · 2022-10-26T05:37:18Z

@schmod for some context, I tried to ping the redis server in my health check endpoint, so that there's no inactivity (the health check endpoint is called every 5s). But the connection still terminates after a few minutes (~5-10mins).
This only happened when I connected to Azure Redis Cache and not when connecting to local redis instance.

PS, I also talked to Microsoft support team for the same but there wasn't any issue with the connection options and setup.

leibale · 2022-10-27T18:30:21Z

see #1598

abhi12299 · 2022-10-27T19:23:25Z

@leibale I had already tried the ping approach, but this still doesn't work.. I guess there's some issue with the library because I've migrated to ioredis and it works perfectly fine. No drops whatsoever.

leibale · 2022-11-01T22:40:21Z

I've tried to reproduce the issue with "Azure Cache for Redis", but didn't manage to... if one of you can make a call with me and show me the issue that will be very helpful :)

edit:
I managed to reproduce it, will keep you posted

leibale · 2022-11-01T23:22:14Z

Might be a duplicate of #1598, see https://learn.microsoft.com/en-us/azure/azure-cache-for-redis/cache-best-practices-connection#idle-timeout

josh803316 · 2022-11-03T16:51:23Z

We are running into this too:

We have an Elasticache instance in AWS (Engine: 6.2.6, Instance Size: cache.t3.micro)
We are connecting to it via a Fargate node 18 NestJS service using "cache-manager": "^5.1.1","cache-manager-redis-store": "^3.0.1", and "redis": "^4.3.1"
For the most part the service runs fine, but every couple of days it crashes with the Socket closed unexpectedly error but then the worst part is that it seems to not recover and we have to restart the service to get it working properly again.
In our case the crash/socket hangup seems related to accessing a large JSON value (800KB) which could def be a big part of the issue, but would still like to understand how to make this resilient.

@leibale I'm not sure if it's related to the same problem, we have all the default client settings (i.e. None are set outside of the URL). What would you recommend for recovery/reconnect? Does it silently stop and this pingInterval fix will help?

richardsun2021 · 2022-11-07T16:48:34Z

@leibale could you please release this bug fix ASAP? we are pending for this feature.

leibale · 2022-11-08T08:25:05Z

@richardsun2021 in the meantime you can use this code to do (almost) the exact thing:

setInterval(client => {
  client.ping((err) => {
    if (err) console.error('Redis keepalive error', err);
  });
}, 9*1000*60);
// azure is closing the socket after 10m of idle time, pinging every 9m seems like a reasonable choice

@josh803316 it should reconnect after this error, do you have a listener to error (see #2302)?
AFAIK AWS honors TCP keep-alive, I don't think pingInteval will help in your case..

josh803316 · 2022-11-08T16:56:56Z

@leibale Yes, I do have an error listener which was how (I presume) I saw the error in the logs, but I've upgraded the instance size of the elasticache node and set listeners as follows and haven't seen the crash/issue since the change. My assumption is that the request for the large (almost 1MB) json key/value pair was taking more than the 5 second default timeout (on occasion) and so it would hit the socket timeout but then not recover... just don't know why it wouldn't recover/reconnect as you mentioned:

    const client = cacheManager.store.getClient();
    // Only had this before
    client.on('error', (error) => {
      this.logger.error(error);
    });
    ////////////////////////////////////////
    client.on('connect', () => {
      this.logger.error('client is connected');
    });
    client.on('reconnecting', () => {
      this.logger.error('client is reconnecting');
    });
    client.on('ready', () => {
      this.logger.error('client is ready');
    });

* fix #1598 fix #2276 - add `pingInterval` to client config * setPingTimer on ready (instead of on connect) * use isReady (instead of isOpen) and fix test * Update client-configuration.md

thorep · 2023-06-18T15:44:57Z

I am facing this exact issue now.. any solution?

abhi12299 · 2023-06-18T15:51:58Z

@thorep if the solutions discussed above don't work, just use ioredis. Setting connection timeout to 0 didn't work for me, so I migrated to ioredis and it just works.

leibale · 2023-06-18T18:43:01Z

@thorep try to use pingInterval and set it to less than 5*60*1000

emanuellarini · 2023-11-10T03:30:08Z

@thorep try to use pingInterval and set it to less than 5*60*1000

I am using redis with elasticache from AWS
Your solution worked for me although I set it to 10 * 1000
When running locally it disconnected me completely after 20s so to be safer I think 10s it's fine, right?
Btw I also tried use ioredis and I didn't need to configure anything, just worked fine
Is there any consequences on adjusting that value? if not, i'll stick with node-redis, otherwise I think its best to move to ioredis

leibale · 2023-11-10T03:58:14Z

@emanuellarini pingInterval will send a PING command to the server to make sure the server will not close the socket. For example, pingInterval of 10 * 1000 will ping the server every 10s. The only consequence is more pings to the server..

You should check the timeout option (see here) you have set on your instance and then set the pingInterval to a value that is smaller than that by at least a few seconds.

If you want, we can have a quick call.. :)

abdelhalim97 · 2023-12-28T18:51:08Z

@leibale thank u i can confirm: using redis in azure and setting pingInterval to 9 minutes fixes the problem
but i have a couple of questions for u if u dont mind

why the azure kills the connection if i m already using some redis operations on the database, shouldnt the 10 minutes cooldown get reseted every redis operation?

also whats the difference between pingInterval and keepAlive?
because i spent too much using keepAlive and wondering why it still crushes

laocoi · 2024-01-04T10:32:10Z

I got the same issue (Socket Closed Unexpectedly) but only with the remote Redis servers, so I figured out that maybe there is a problem with the TLS. So I tried to use redis:// instead of rediss://, and yes it was working stable.

iampato · 2024-11-26T14:05:11Z

Shorter ping interval did the trick:

 pingInterval: 10 * 1000,

richardsun2021 added the Bug label Sep 29, 2022

ivanpepelko mentioned this issue Oct 21, 2022

Reconnect doesn't work without on('error') listener #2302

Closed

leibale added a commit to leibale/node-redis that referenced this issue Nov 2, 2022

fix redis#1598 fix redis#2276 - add pingInterval to client config

b76035b

leibale mentioned this issue Nov 10, 2022

Ping interval #2321

Merged

leibale closed this as completed in #2321 Nov 10, 2022

leibale added a commit that referenced this issue Nov 10, 2022

Ping interval (#2321)

3c2f7ab

* fix #1598 fix #2276 - add `pingInterval` to client config * setPingTimer on ready (instead of on connect) * use isReady (instead of isOpen) and fix test * Update client-configuration.md

LouisFettet mentioned this issue Aug 1, 2023

Render Contentful assets la-ldaf/ldaf-site#279

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Socket closed unexpectedly #2276

Socket closed unexpectedly #2276

richardsun2021 commented Sep 29, 2022 •

edited

Loading

abhi12299 commented Oct 2, 2022

schmod commented Oct 4, 2022

yaroslavputria commented Oct 6, 2022

richardsun2021 commented Oct 10, 2022 •

edited

Loading

abhi12299 commented Oct 13, 2022

richardsun2021 commented Oct 17, 2022

shawnlknight commented Oct 25, 2022

leibale commented Oct 25, 2022

schmod commented Oct 25, 2022 •

edited

Loading

leibale commented Oct 25, 2022

abhi12299 commented Oct 26, 2022 •

edited

Loading

leibale commented Oct 27, 2022

abhi12299 commented Oct 27, 2022

leibale commented Nov 1, 2022 •

edited

Loading

leibale commented Nov 1, 2022

josh803316 commented Nov 3, 2022

richardsun2021 commented Nov 7, 2022

leibale commented Nov 8, 2022 •

edited

Loading

josh803316 commented Nov 8, 2022 •

edited

Loading

thorep commented Jun 18, 2023

abhi12299 commented Jun 18, 2023

leibale commented Jun 18, 2023

emanuellarini commented Nov 10, 2023

leibale commented Nov 10, 2023 •

edited

Loading

abdelhalim97 commented Dec 28, 2023

laocoi commented Jan 4, 2024

iampato commented Nov 26, 2024

Socket closed unexpectedly #2276

Socket closed unexpectedly #2276

Comments

richardsun2021 commented Sep 29, 2022 • edited Loading

abhi12299 commented Oct 2, 2022

schmod commented Oct 4, 2022

yaroslavputria commented Oct 6, 2022

richardsun2021 commented Oct 10, 2022 • edited Loading

abhi12299 commented Oct 13, 2022

richardsun2021 commented Oct 17, 2022

shawnlknight commented Oct 25, 2022

leibale commented Oct 25, 2022

schmod commented Oct 25, 2022 • edited Loading

leibale commented Oct 25, 2022

abhi12299 commented Oct 26, 2022 • edited Loading

leibale commented Oct 27, 2022

abhi12299 commented Oct 27, 2022

leibale commented Nov 1, 2022 • edited Loading

leibale commented Nov 1, 2022

josh803316 commented Nov 3, 2022

richardsun2021 commented Nov 7, 2022

leibale commented Nov 8, 2022 • edited Loading

josh803316 commented Nov 8, 2022 • edited Loading

thorep commented Jun 18, 2023

abhi12299 commented Jun 18, 2023

leibale commented Jun 18, 2023

emanuellarini commented Nov 10, 2023

leibale commented Nov 10, 2023 • edited Loading

abdelhalim97 commented Dec 28, 2023

laocoi commented Jan 4, 2024

iampato commented Nov 26, 2024

richardsun2021 commented Sep 29, 2022 •

edited

Loading

richardsun2021 commented Oct 10, 2022 •

edited

Loading

schmod commented Oct 25, 2022 •

edited

Loading

abhi12299 commented Oct 26, 2022 •

edited

Loading

leibale commented Nov 1, 2022 •

edited

Loading

leibale commented Nov 8, 2022 •

edited

Loading

josh803316 commented Nov 8, 2022 •

edited

Loading

leibale commented Nov 10, 2023 •

edited

Loading