com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException: The leader is not available. #298
Comments
The server side has switched leaders; you need to check the server-side logs.
Under what circumstances does the server side switch leaders?
I started three nodes inside one application, just for testing. Is that related? After a few dozen scan queries the leader node goes down?
From this log line, there are three nodes in total and two of them are down. Once more than half of the nodes are down the cluster is unavailable, so you need to find out why the other two died.
I suggest you watch the GC behavior.
deadNodes=127.0.0.1:7182,127.0.0.1:7183 — are the processes for these two nodes still alive? If so, run 'kill -s SIGUSR2 pid' and post the state files that jraft generates (file names prefixed with node_describe.log, node_metrics.log and rheakv_metrics.log; three files per process).
10-17 13:43:57.457 WARN [com.alipay.sofa.jraft.core.NodeImpl] - Node <rhea_mqtt--1/127.0.0.1:7181> PreVote to 127.0.0.1:7182 error: Status[ENOENT<1012>: Peer id not found: 127.0.0.1:7182].
7181:
-- rheakv -- Counters / Histograms / Meters / Timers: (empty)
nodeId: <mqttRpcGroup/127.0.0.1:7181>
-- <rhea_mqtt--1/127.0.0.1:7181> 10/17/19 1:50:40 PM -- Gauges / Histograms / Timers: (empty)

7182:
-- rheakv -- Histograms / Meters / Timers: (empty)
nodeId: <mqttRpcGroup/127.0.0.1:7182>
-- <rhea_mqtt--1/127.0.0.1:7182> 10/17/19 1:48:18 PM -- Gauges / Histograms / Timers: (empty)

7183:
-- rheakv -- Histograms / Meters / Timers: (empty)
nodeId: <mqttRpcGroup/127.0.0.1:7183>
-- <rhea_mqtt--1/127.0.0.1:7183> 10/17/19 1:48:23 PM -- Gauges / Histograms / Timers: (empty)
The processes on those ports are still alive, but the services on them seem to be dead.
nodeId: <mqttRpcGroup/127.0.0.1:7181> — why are there two raft groups? Did you start two jraft services on each node?
Yes, I started one for my own business logic. It doesn't seem related; I suspected it earlier too, but the error still occurs without it. I'll remove it, test again, and see.
@zhangjun050754 Is it a port conflict?
Removed it; no effect, it still goes down. With a small data set there is no problem, but with a large data set it dies after a few queries.
Is the range between start and end very large? Then memory is surely blowing up: a Full GC makes the node unresponsive, heartbeats time out, and of course it steps down. I suggest you observe the GC; take a look with jstat.
For scans over a large data range, use iterator instead.
It queries in a streaming fashion, whereas scan fetches all key/value pairs before returning (see the sketch below).
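A minimal sketch of the streaming approach, assuming the `RheaKVStore.iterator(startKey, endKey, bufSize)` overload and an already-initialized store; the class name, buffer size, and the printing inside the loop are illustrative, not from this issue:

```java
import com.alipay.sofa.jraft.rhea.client.RheaIterator;
import com.alipay.sofa.jraft.rhea.client.RheaKVStore;
import com.alipay.sofa.jraft.rhea.storage.KVEntry;

public final class RangeReadSketch {

    // Streams the range in pages (bufSize entries per round trip) instead of
    // materializing every key/value of [startKey, endKey) in memory at once,
    // which is what makes a large scan blow up the heap.
    static void streamRange(final RheaKVStore rheaKVStore,
                            final byte[] startKey, final byte[] endKey) {
        final RheaIterator<KVEntry> it = rheaKVStore.iterator(startKey, endKey, 500);
        while (it.hasNext()) {
            final KVEntry entry = it.next();
            // Replace with real processing; keys and values are byte[].
            System.out.println(entry.getKey().length + " bytes key, "
                    + entry.getValue().length + " bytes value");
        }
    }
}
```

Because only one page of entries is resident at a time, memory use stays bounded by the buffer size rather than by the size of the range.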
You scanned a very large range, right? Check the GC; large-range queries can be switched to iterator.
I can't see anything wrong. Let me try iterator first.
95% <= 245185.71 milliseconds — your scan p95 is already 245 seconds. Check the GC, and also tell us how large a range the scan covers.
You can't play it this way with a heap of less than 700 MB; you need to look at the GC log.
No other issues, so closing for now.
CompletableFuture<List<KVEntry>> scan = rheaKVStore.scan(startKey, endKey, readOnlySafe, returnValue);
Concurrent queries throw errors...
Caused by: com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException: The leader is not available.
java.util.concurrent.ExecutionException: com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException: The leader is not available.
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at cn.recallcode.iot.mqtt.server.store.cache.SubscribeNotWildcardCache.searchFutureList(SubscribeNotWildcardCache.java:78)
at cn.recallcode.iot.mqtt.server.store.service.impl.SubscribeStoreService.sendPublishMessage(SubscribeStoreService.java:108)
at cn.recallcode.iot.mqtt.server.broker.protocol.Publish.sendPublishMessage(Publish.java:111)
at cn.recallcode.iot.mqtt.server.broker.protocol.Publish.processPublish(Publish.java:65)
at cn.recallcode.iot.mqtt.server.broker.handler.BrokerHandler.channelRead0(BrokerHandler.java:44)
at cn.recallcode.iot.mqtt.server.broker.handler.BrokerHandler.channelRead0(BrokerHandler.java:1)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.handler.codec.http.websocketx.Utf8FrameValidator.channelRead(Utf8FrameValidator.java:77)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.http.websocketx.WebSocketServerProtocolHandler$1.channelRead(WebSocketServerProtocolHandler.java:211)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:628)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:563)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException: The leader is not available.
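For reference, a hedged sketch of the caller side of that trace: blocking on the scan future rethrows the LeaderNotAvailableException wrapped in an ExecutionException, so the cause has to be unwrapped before deciding what to do. The class name and the rethrow-as-IllegalStateException policy are illustrative, not part of rheakv:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

import com.alipay.sofa.jraft.rhea.client.RheaKVStore;
import com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException;
import com.alipay.sofa.jraft.rhea.storage.KVEntry;

public final class ScanCallerSketch {

    // get() wraps any failure in ExecutionException; the LeaderNotAvailableException
    // in the trace above is its cause, raised while the group has no usable leader
    // (here: the leader stalled in Full GC and stepped down).
    static List<KVEntry> scanOnce(final RheaKVStore store,
                                  final byte[] startKey, final byte[] endKey)
            throws InterruptedException {
        final CompletableFuture<List<KVEntry>> future = store.scan(startKey, endKey, true, true);
        try {
            return future.get();
        } catch (final ExecutionException e) {
            if (e.getCause() instanceof LeaderNotAvailableException) {
                // Transient: retry after a short backoff once a leader is re-elected.
                throw new IllegalStateException("no leader available, retry later", e.getCause());
            }
            throw new IllegalStateException(e.getCause());
        }
    }
}
```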