Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(feat) add thread pool metric report #339

Merged
merged 16 commits into from
Nov 23, 2019

Conversation

fengjiachun
Copy link
Contributor

@fengjiachun fengjiachun commented Nov 14, 2019

Motivation:

add thread pool metric report on kill -s SIGUSR2 $pid

Modification:

Result:

thread pool's metrics example:

`

19-11-14 10:40:12 ==============================================================

-- Timers ----------------------------------------------------------------------
threadPool.JRAFT_CLOSURE_EXECUTOR
count = 1574
mean rate = 17.69 calls/second
1-minute rate = 16.24 calls/second
5-minute rate = 11.43 calls/second
15-minute rate = 10.13 calls/second
min = 0.06 milliseconds
max = 2.63 milliseconds
mean = 0.21 milliseconds
stddev = 0.13 milliseconds
median = 0.17 milliseconds
75% <= 0.22 milliseconds
95% <= 0.40 milliseconds
98% <= 0.48 milliseconds
99% <= 0.61 milliseconds
99.9% <= 1.60 milliseconds
threadPool.JRaft-RPC-Processor
count = 2
mean rate = 0.02 calls/second
1-minute rate = 0.11 calls/second
5-minute rate = 0.31 calls/second
15-minute rate = 0.37 calls/second
min = 21.48 milliseconds
max = 31.71 milliseconds
mean = 26.59 milliseconds
stddev = 5.12 milliseconds
median = 31.71 milliseconds
75% <= 31.71 milliseconds
95% <= 31.71 milliseconds
98% <= 31.71 milliseconds
99% <= 31.71 milliseconds
99.9% <= 31.71 milliseconds
threadPool.rheakv-cli-rpc-executor
count = 2
mean rate = 0.18 calls/second
1-minute rate = 0.20 calls/second
5-minute rate = 0.20 calls/second
15-minute rate = 0.20 calls/second
min = 0.18 milliseconds
max = 2.56 milliseconds
mean = 1.30 milliseconds
stddev = 1.19 milliseconds
median = 0.18 milliseconds
75% <= 2.56 milliseconds
95% <= 2.56 milliseconds
98% <= 2.56 milliseconds
99% <= 2.56 milliseconds
99.9% <= 2.56 milliseconds
threadPool.rheakv-kv-store-rpc-executor
count = 36
mean rate = 3.18 calls/second
1-minute rate = 2.93 calls/second
5-minute rate = 2.83 calls/second
15-minute rate = 2.81 calls/second
min = 0.09 milliseconds
max = 82.06 milliseconds
mean = 3.10 milliseconds
stddev = 13.20 milliseconds
median = 0.15 milliseconds
75% <= 0.32 milliseconds
95% <= 15.53 milliseconds
98% <= 82.06 milliseconds
99% <= 82.06 milliseconds
99.9% <= 82.06 milliseconds
`

@fengjiachun fengjiachun requested a review from masaimu November 14, 2019 02:47
@fengjiachun
Copy link
Contributor Author

@killme2008 可以看下这个 pr,除了 ci 超过 50 分钟,暂时没再发现 ci hang 住的情况


LOG.info("Printing thread pools metrics with signal: {} to file: {}.", signalName, file);

try (final PrintStream out = new PrintStream(new FileOutputStream(file, true))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new FileOutputStream(file, true) 也应该抽成局部变量吧,保证关闭,用逗号隔开

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个没问题的,PrintStream 在关闭时也会关闭它内部的 outputstream,会顺带把 FileOutputStream 关掉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不应该依赖实现的,从代码角度,应该确保每个打开的都关闭

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

算是利用了一个java IO Stream 的一个规范,在其他场景,我不应该这么做。

java 的 IO Stream 设计为装饰着模式,一层包一层,外层的 stream 有责任且必须关闭自己以及任何和此流相关的所有系统资源

以下内容来自 OutputStream close 接口的注释说明:
Closes this output stream and releases any system resources associated with this stream.

Copy link
Contributor Author

@fengjiachun fengjiachun Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这样做有一个好处,不用考虑各个被嵌套的流的关闭顺序,比如不用考虑先 flush BufferedOutputStream 再 close FileOutputStream 这种先后逻辑

@@ -354,6 +354,10 @@ public synchronized void join() throws InterruptedException {
if (this.shutdownLatch != null) {
this.shutdownLatch.await();
this.disruptor.shutdown();
if (this.afterShutdown != null) {
this.afterShutdown.run(Status.OK());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个位置为什么调整? 需要放到 thread 里异步跑吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afterShutdown 原本在 fsm 的 doShutdown 里面,这有一个问题,就是 doShutdown 被调用后 fsm 实质上并没有真的被关闭,afterShutdown 里负责关闭 rocksdb,这样 rocksdb 会被提前关掉,从而导致 fsm 在还有往 rocksdb 写入的情况下 rocksdb 关掉导致的异常

@fengjiachun fengjiachun merged commit 44eccb7 into master Nov 23, 2019
@fengjiachun fengjiachun deleted the feat/thread_pool_metric_report branch November 23, 2019 18:43
luozhiyun993 pushed a commit to luozhiyun993/sofa-jraft that referenced this pull request Nov 25, 2019
* 'master' of https://github.com/sofastack/sofa-jraft:
  (feat) Remove multiple of 10ms for windows platform. (sofastack#351)
  (feat) add thread pool metric report (sofastack#339)
  [ISSUE#347]upgrade the com.fasterxml.jackson.core:jackson-databind to version 2.9.10.1. (sofastack#352)
  (feat) Support readIndex from learner (sofastack#343)
  (feat) counter example (sofastack#318)
  minor fixes (sofastack#336)
  Impl learner for jraft, sofastack#8 (sofastack#312)
  (fix) Declare rpcClient to be volatile, fix sofastack#319 (sofastack#323)
  (feat) add current leader id to node describe (sofastack#322)
  (fix) fix remove metric failed on replicator destroy (sofastack#325)
  (fix) raft options copy (sofastack#321)
  (feat) contains key api (sofastack#302)
  (feat) add rocksdb table_format_config and statistics (sofastack#295)
  feat/async snapshot (sofastack#287)
  fix/state listener (sofastack#285)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants