-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: unsigned int overflow when master-slave sync data #2746
fix: unsigned int overflow when master-slave sync data #2746
Conversation
WalkthroughThe recent update introduces error handling enhancements and synchronization improvements for the Rsync process in the database replication system. A new method, Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (5)
- include/pika_rm.h (1 hunks)
- include/rsync_client.h (2 hunks)
- src/pika_rm.cc (1 hunks)
- src/pika_server.cc (1 hunks)
- src/rsync_client.cc (4 hunks)
Additional comments not posted (7)
include/rsync_client.h (2)
65-67
: The implementation ofIsErrorStopped()
correctly uses atomic operations to ensure thread safety. Good job adhering to concurrency standards.
98-98
: Initialization oferror_stopped_
tofalse
is appropriate for setting the default state of error handling. Well done.include/pika_rm.h (1)
120-120
: Replacement ofIsRsyncRunning()
withIsRsyncErrorStopped()
enhances the error handling capabilities of the system. This change should help in better management of Rsync process failures.src/rsync_client.cc (3)
51-53
: Conditional logging based onerror_stopped_
in theCopy
method is a good practice as it avoids unnecessary log entries when the system is in an error state.
63-63
: Explicitly settingerror_stopped_
tofalse
in theInit
method is crucial for correctly resetting the error state during initialization. This ensures that any previous error states do not carry over undesirably.
149-151
: Consistent use of conditional logging based onerror_stopped_
in theThreadMain
method aligns with the approach in theCopy
method, optimizing performance during error states.src/pika_server.cc (1)
810-810
: Ensure correct type casting and comparison logic inTryDBSync
.The use of
static_cast<int32_t>
ensures that the comparison betweentop
andbgsave_info.offset.b_offset.filenum
adheres to the correct data types, preventing potential bugs related to integer overflow or underflow. Good update for robustness.
当初实现rsync_client时,master在给某一个slave同步数据的过程中,生成了新的dump,期望的执行流程如下:
期望的流程现在有几个问题需要修改:
|
When rsync_client was first implemented, the master generated a new dump during the process of synchronizing data to a slave. The expected execution process is as follows:
The desired process now has several issues that need to be modified:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/pika_server.cc (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/pika_server.cc
OK, 状态流转另提PR处理 |
OK, status transfer will be dealt with separately by PR. |
55de8b3
into
OpenAtomFoundation:unstable
…e master within a short time (#2746) * use int64_t instead of int32_t --------- Co-authored-by: cjh <[email protected]>
…e master within a short time (#2746) * use int64_t instead of int32_t --------- Co-authored-by: cjh <[email protected]>
…e master within a short time (OpenAtomFoundation#2746) * use int64_t instead of int32_t --------- Co-authored-by: cjh <[email protected]>
…e master within a short time (OpenAtomFoundation#2746) * use int64_t instead of int32_t --------- Co-authored-by: cjh <[email protected]>
…e master within a short time (OpenAtomFoundation#2746) * use int64_t instead of int32_t --------- Co-authored-by: cjh <[email protected]>
这个PR修复了Issue #2742
Issue 问题表述:线上针对一个Pika实例进行扩容时,短时间内让多个slave节点连接到该实例的Master节点上时,出现了从节点没有做全量同步,数据还没拉过来,后面就成功建立增量连接的情况。
具体梳理:主节点日志上发现这几次建联请求触发了多次bgSave,从节点日志梳理发现在Slave侧的RsyncClient在拉取文件的时候拉到一半会出现主返回的RsyncResp的code为非kOK,且从节点会出现提示Master端有了新snapshot uuid的WARNING日志:W20240618` 14:57:06.052378 19221 rsync_client.cc:218] receive newer dump, reset state to STOP...
反常的地方:
该PR中进行的修复: 这里计算时都转为int64_t。
This PR fixes Issue #2742
Issue Description: When expanding a Pika instance online, multiple slave nodes are connected to the Master node of the instance in a short period of time, resulting in the slave nodes not performing a full synchronization. The data is not fully pulled over before an incremental connection is successfully established.
Detailed Analysis: The master node logs show that these connection requests triggered multiple bgSaves. The slave node logs revealed that the RsyncClient on the slave side failed to pull files correctly, with the RsyncResp code from the master being non-kOK halfway through the process. Additionally, there were WARNING logs indicating that a new snapshot uuid had appeared on the master:
W20240618 14:57:06.052378 19221 rsync_client.cc:218] receive newer dump, reset state to STOP...
Abnormal Points:
Fix: Convert both values to int64_t during the calculation.
Summary by CodeRabbit
New Features
Improvements