Skip to content
This repository has been archived by the owner on Mar 22, 2023. It is now read-only.

Commit/persist batch processing #245

Merged
merged 2 commits into from
Aug 3, 2022
Merged

Conversation

igchor
Copy link
Contributor

@igchor igchor commented Jul 13, 2022

TODO:

  • benchmark this for different batch sizes
  • more tests?

This change is Reviewable

@igchor igchor mentioned this pull request Jul 13, 2022
5 tasks
@igchor igchor force-pushed the optimization branch 2 times, most recently from 8b36364 to 219babb Compare July 13, 2022 12:39
@codecov-commenter
Copy link

Codecov Report

Merging #245 (219babb) into master (52282bc) will decrease coverage by 0.90%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #245      +/-   ##
==========================================
- Coverage   89.48%   88.58%   -0.91%     
==========================================
  Files           9        9              
  Lines         742      762      +20     
  Branches      134      141       +7     
==========================================
+ Hits          664      675      +11     
- Misses         47       51       +4     
- Partials       31       36       +5     
Flag Coverage Δ
tests_gcc_debug_cpp17 88.58% <ø> (-0.91%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pmemstream/src/libpmemstream.c 86.81% <0.00%> (-1.36%) ⬇️
pmemstream/src/iterator.c 91.81% <0.00%> (-0.91%) ⬇️
pmemstream/src/libpmemstream_internal.h 100.00% <0.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Copy link
Member

@lukaszstolarczuk lukaszstolarczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @igchor)

a discussion (no related file):
can we add a test (without a custom data mover) to test it? E.g., start 1000 async appends and then call wait_* after a various number of entries (quite like your latest benchmark update 😉 )



src/libpmemstream.c line 723 at r3 (raw file):

				 (void *)num_committed_timestamps, 0);
	assert(ret == 0 || ret == ENOMEM);
	if (ret) {

can we somehow, easily enforce tests to enter this path...?


src/libpmemstream.c line 740 at r3 (raw file):

	void *value = critnib_remove(stream->ready_timestamps, *committed_timestamp);
	while (value && *committed_timestamp + (uint64_t)value <= upto_timestamp) {
		/* This is done only by a single thread (the one which manged to remove the item from critnib). */

managed (misspell)


src/libpmemstream.c line 743 at r3 (raw file):

		pmemstream_increase_committed_timestamp(stream, (uint64_t)value);

		value = critnib_remove(stream->ready_timestamps, *committed_timestamp);

don't we try to remove the same key from the critnib here?


src/libpmemstream.c line 782 at r3 (raw file):

		/* If we depend on some other threads see if they already finished and if yes, increase
		 * committed_timestamp. */
		pmemstream_process_ready_timestamp_batches(data->stream, &committed_timestamp, data->last_timestamp);

do we really check if they already finished...?

Copy link
Contributor Author

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @igchor and @lukaszstolarczuk)


src/libpmemstream.c line 723 at r3 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

can we somehow, easily enforce tests to enter this path...?

Not without fault injection I think


src/libpmemstream.c line 740 at r3 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

managed (misspell)

Done.


src/libpmemstream.c line 743 at r3 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

don't we try to remove the same key from the critnib here?

Of course, good catch


src/libpmemstream.c line 782 at r3 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

do we really check if they already finished...?

Yes, if they finished they either increased the committed_timestamp already (first branch in the if below) or inserted the batch description to critnib (second branch). If second option happened, we will increase the committed_timestamp (in proper order). If first option happened, nothing to do for us.

Copy link
Contributor Author

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 3 files reviewed, 5 unresolved discussions (waiting on @lukaszstolarczuk)

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

can we add a test (without a custom data mover) to test it? E.g., start 1000 async appends and then call wait_* after a various number of entries (quite like your latest benchmark update 😉 )

Done.


Copy link
Member

@lukaszstolarczuk lukaszstolarczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4, all commit messages.
Reviewable status: 2 of 3 files reviewed, 4 unresolved discussions (waiting on @igchor and @lukaszstolarczuk)


tests/unittest/concurrent_async_wait.cpp line 64 at r4 (raw file):

				for (auto &future_sequence : futures) {
					std::mt19937_64 g(*rc::gen::arbitrary<size_t>());

you can use init_random from random_helpers.hpp instead

// while generating seed with RC is nice, it'd be harder to reproduce an issue...? not sure how shrinking will work with this approach...


tests/unittest/concurrent_async_wait.cpp line 77 at r4 (raw file):

				std::vector<std::string> all_extra_data;
				all_extra_data.reserve(extra_data_size);
				for (auto &entry_sequency : extra_data) {

entry_sequency -> entry_sequence

Copy link
Member

@lukaszstolarczuk lukaszstolarczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @igchor)

Copy link
Contributor Author

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @igchor and @lukaszstolarczuk)


tests/unittest/concurrent_async_wait.cpp line 64 at r4 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

you can use init_random from random_helpers.hpp instead

// while generating seed with RC is nice, it'd be harder to reproduce an issue...? not sure how shrinking will work with this approach...

Why it won't work? RC provides you with it's own seed to reproduce the results.

igchor added 2 commits August 1, 2022 06:40
to only acquire PMEMSTREAM_TIMESTAMP_PROCESSING_BATCH
timestamps at once. This allows other threads to help
process remaining timestamps, increasing concurrency.
Append different data in each thread. Also, prepare multiple
futures per thread and poll them in random order.
@igchor igchor changed the title WIP: commit/persist batch processing Commit/persist batch processing Aug 1, 2022
@igchor igchor marked this pull request as ready for review August 1, 2022 12:39
Copy link
Member

@lukaszstolarczuk lukaszstolarczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @igchor)

Copy link
Contributor

@KFilipek KFilipek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @igchor)

Copy link

@karczex karczex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @igchor)

@lukaszstolarczuk lukaszstolarczuk merged commit b13197c into pmem:master Aug 3, 2022
@igchor igchor deleted the optimization branch August 3, 2022 12:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants