Non-naive implementation of VecDeque.append
#52553
Conversation
r? @shepmaster (rust_highfive has picked a reviewer for you, use r? to override)

Which benchmarks did you use? Are there some existing or did you write your own? If you wrote new ones, it seems like those should be added (but I'm not 100% sure on that). Either way, it'd be good to see the benchmarks (and the raw data) to check that they are valid.
src/liballoc/tests/vec_deque.rs (outdated)

```rust
a.drain(a_pop_front..);
b.drain(..b_pop_back);
b.drain(b_pop_front..);
let checked = a.iter().chain(b.iter()).map(|&x| x).collect::<Vec<usize>>();
```
`.map(|&x| x)` -> `.cloned()`
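For reference, a minimal, runnable sketch of the suggested change (the deque contents here are invented for illustration; only the iterator adapters mirror the test):

```rust
use std::collections::VecDeque;

fn main() {
    let a: VecDeque<usize> = (0..4).collect();
    let b: VecDeque<usize> = (4..8).collect();

    // Form currently in the test:
    let with_map: Vec<usize> = a.iter().chain(b.iter()).map(|&x| x).collect();
    // Suggested form; for iterators over references to `Clone` items the two are equivalent:
    let with_cloned: Vec<usize> = a.iter().chain(b.iter()).cloned().collect();

    assert_eq!(with_map, with_cloned);
}
```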
src/liballoc/tests/vec_deque.rs (outdated)

```rust
@@ -928,6 +928,60 @@ fn test_append() {
    assert_eq!(a.iter().cloned().collect::<Vec<_>>(), []);
}

#[test]
fn test_append_advanced() {
    fn check(
```
Check what? Tests are a form of documentation. All tests "check" something, that's rather their purpose.
src/liballoc/tests/vec_deque.rs (outdated)

```rust
@@ -928,6 +928,60 @@ fn test_append() {
    assert_eq!(a.iter().cloned().collect::<Vec<_>>(), []);
}

#[test]
fn test_append_advanced() {
```
"advanced" isn't really a great descriptor. What makes it advanced?
src/liballoc/tests/vec_deque.rs (outdated)

```rust
a_push_front: usize,
a_pop_front: usize,
b_push_front: usize,
b_pop_front: usize
```
This type of argument list is about at the top of the list for "easiest to pass the wrong arguments". There's a huge mass of arguments, subtly intertwined, all of the same type.
Instead, WDYT about

```rust
#[derive(Debug, Default)]
struct SomeUsefulName {
    a_push_back: usize,
    a_pop_back: usize,
    b_push_back: usize,
    b_pop_back: usize,
    a_push_front: usize,
    a_pop_front: usize,
    b_push_front: usize,
    b_pop_front: usize,
}

check(SomeUsefulName { a_push, a_pop, b_push, b_pop, ..SomeUsefulName::default() });
```
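A runnable variant of that pattern, with placeholder names (`AppendCase`, `check`) standing in for whatever the test would actually use:

```rust
// Hypothetical names for illustration; the point is the `..Default::default()` pattern.
#[derive(Debug, Default)]
struct AppendCase {
    a_push_back: usize,
    a_pop_back: usize,
    b_push_back: usize,
    b_pop_back: usize,
    a_push_front: usize,
    a_pop_front: usize,
    b_push_front: usize,
    b_pop_front: usize,
}

fn check(case: AppendCase) {
    // A real test would drive two `VecDeque`s from these counts.
    println!("{:?}", case);
}

fn main() {
    // Unmentioned fields fall back to zero, and every argument is named at the
    // call site, which avoids the "huge mass of same-typed arguments" problem.
    check(AppendCase { a_push_back: 8, b_push_back: 4, ..Default::default() });
}
```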
src/liballoc/tests/vec_deque.rs (outdated)

```rust
    assert_eq!(a, checked);
    assert!(b.is_empty());
}
for a_push in 0..17 {
```
Why 17? Preferably there's a constant with an evocative name.
Maybe even `0..=SOME_CONSTANT` if 16 happens to be the reason...
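A hedged sketch of what that could look like (the constant name and the value 16 are assumptions for illustration):

```rust
// Name the bound once instead of scattering a magic 17/16 through the loops.
const MAX_OPS: usize = 16;

fn main() {
    for a_push in 0..=MAX_OPS {
        for b_push in 0..=MAX_OPS {
            // A real test would push `a_push` / `b_push` elements here.
            let _ = (a_push, b_push);
        }
    }
}
```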
```rust
// When minimizing the amount of calls to `copy_part`, there are
// 6 different cases to handle. Whether src and/or dst wrap are 4
// combinations and there are 3 distinct cases when they both wrap.
```
Reading this the first time, I assumed you mathed wrong. Having the number 4 and then the number 3 made me think that there would be 7 cases. On rereading it, I see my mistake. I don't have a suggestion for how to reword, but maybe someone will.
That being said, what benefit does the math in this comment bring?
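To spell out the arithmetic being discussed: src and dst each either wrap or not (2 × 2 = 4 combinations), and the both-wrap combination splits into 3 sub-cases, so 1 + 1 + 1 + 3 = 6. A runnable sketch of that shape, with made-up sub-case labels rather than the PR's exact distinctions:

```rust
// Illustrative only: shows how 4 wrap combinations become 6 handled cases.
fn case_label(src_wraps: bool, dst_wraps: bool, both_wrap_subcase: u8) -> &'static str {
    match (src_wraps, dst_wraps) {
        (false, false) => "case 1: neither wraps",
        (true, false) => "case 2: only src wraps",
        (false, true) => "case 3: only dst wraps",
        // The both-wrap combination needs three distinct copy strategies.
        (true, true) => match both_wrap_subcase {
            0 => "case 4: both wrap (sub-case A)",
            1 => "case 5: both wrap (sub-case B)",
            _ => "case 6: both wrap (sub-case C)",
        },
    }
}

fn main() {
    assert_eq!(case_label(false, false, 0), "case 1: neither wraps");
    assert_eq!(case_label(true, true, 2), "case 6: both wrap (sub-case C)");
}
```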
```rust
// 3 3 H 1 1 1 2
let src_2 = dst_before_wrap - src_before_wrap;
let dst_start_2 = dst_start_1 + src_before_wrap;
let src_3 = src_total - dst_before_wrap;
```
Not a fan of the `_1`, `_2`, `_3` suffixes. How do I know to use `_1` vs `_2`? It's nice that there's a picture, but it's a shame that the code cannot stand on its own without it, especially knowing how code and comments frequently diverge.
```rust
self.extend(other.drain(..));
// Copy from src[i1..i1 + len] to dst[i2..i2 + len].
// Does not check if the ranges are valid.
unsafe fn copy_part<T>(i1: usize, i2: usize, len: usize, src: &[T], dst: &mut [T]) {
```
I'm missing why this part needs to be unsafe. Can't we build slices and use `[T]::copy_from_slice`?
Ah, because `T` doesn't necessarily implement `Copy`, of course. However, it feels odd to "reimplement" a half-slice on top of a slice due to the `i1` / `i2`.

`i1` and `i2` are actually... offsets(?) into `src` and `dst` respectively? Those should be renamed to at least have `src` and `dst` in there.
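For illustration, a version of the helper with the renaming the review asks for might look like the following. This is a sketch, not the PR's code; the parameter names are assumptions:

```rust
use std::ptr;

/// Copies `len` values from `src[src_offset..]` to `dst[dst_offset..]`.
/// Unsafe because `T` need not be `Copy`: the caller must ensure the ranges
/// are in bounds and that the duplicated values are never dropped twice.
unsafe fn copy_part<T>(src: &[T], src_offset: usize, dst: &mut [T], dst_offset: usize, len: usize) {
    ptr::copy_nonoverlapping(
        src.as_ptr().add(src_offset),
        dst.as_mut_ptr().add(dst_offset),
        len,
    );
}

fn main() {
    let src = [1u32, 2, 3, 4];
    let mut dst = [0u32; 4];
    // `u32: Copy`, so having the values in both arrays afterwards is fine.
    unsafe { copy_part(&src, 1, &mut dst, 0, 2) };
    assert_eq!(dst, [2, 3, 0, 0]);
}
```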
```rust
// 6 different cases to handle. Whether src and/or dst wrap are 4
// combinations and there are 3 distinct cases when they both wrap.
// 6 = 3 + 1 + 1 + 1
match (src_wraps, dst_wraps) {
```
Pure speculation, but I wonder if there could be a method to encapsulate this logic, something along the lines of

```rust
fn source_parts(&self) -> (&[T], &[T]); // might exist as `as_slices`
unsafe fn destination_parts(&mut self) -> (&mut [T], &mut [T]);
```

This code will still need to exist and handle the same cases to minimize the copies.
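`as_slices` does exist on `VecDeque` and already exposes the two halves of the ring buffer, which is the kind of accessor being alluded to; a quick demonstration with invented values:

```rust
use std::collections::VecDeque;

fn main() {
    let mut d: VecDeque<u32> = VecDeque::new();
    d.push_back(1);
    d.push_back(2);
    d.push_front(0); // typically makes the internal buffer wrap

    // The two returned slices together are the deque's contents in order.
    let (front, back) = d.as_slices();
    let rebuilt: Vec<u32> = front.iter().chain(back).copied().collect();
    assert_eq!(rebuilt, [0, 1, 2]);
}
```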
```rust
        }
    }
};
other.clear();
```
Presumably, until this point we are in a Bad State where a given element actually exists twice, once in both collections, right? I believe that means the panic safety of this function needs to be documented to help prevent the next person touching it from shooting themselves in the foot.

If that's important, there should be comments in the test showcasing which part(s) of the test exercise which conditions, to avoid losing that coverage.
OK, I've pushed some changes that address most things you commented on. Instead of working with offsets in `copy_part`, the code now copies between slices directly.
This one. I didn't use the standard bench harness as the VecDeques have to be cloned in the loop and that would force the bench to measure the cloning as well.
Ping from triage @shepmaster! This PR needs your review.

Thanks for putting up with my feedback! Now that I've done what I can, I'll pass it off to a better member of the team to actually make a decision! r? @SimonSapin
I'll need more time than I have right now to properly review the code (especially because of the `unsafe` code), but in the meantime please add the benchmark to the repository. There is an existing section like this:

```toml
[[bench]]
name = "collectionsbenches"
path = "../liballoc/benches/lib.rs"
```

Below it, add this:

```toml
[[bench]]
name = "vec_deque_append_bench"
path = "../liballoc/benches/vec_deque_append.rs"
harness = false
```

(`harness = false` since the benchmark doesn't use the standard `#[bench]` harness.) Then you should be able to run it with something like `./x.py bench`.

@rust-lang/infra Is there any automation that parses the output of benchmarks?
Nothing today parses the benchmark output.

@SimonSapin The benchmark has been added. The only changes I made were to make it more similar to the default bench harness (use the median instead of the mean and force the use of nanoseconds).
```rust
    let (before, after) = buf.split_at_mut(head);
    (after, before)
} else {
    RingSlices::ring_slices(buf, tail, head)
```
This is worth a comment explaining why using `ring_slices` with the arguments swapped provides the desired behavior.
```rust
// naive impl
self.extend(other.drain(..));
// Copies all values from `src_slice` to the start of `dst_slice`.
unsafe fn copy_whole_slice<T>(src_slice: &[T], dst_slice: &mut [T]) {
```
So this is like `[T]::copy_from_slice` but without `T: Copy`? Then it is up to callers to be careful to not "duplicate" items and have for example some heap memory be double-freed.

… in fact this code ends up calling `other.clear()` which will incorrectly drop the items that have just been copied. Instead it should probably do something like `other.tail = 0; other.head = 0` or `other.tail = other.head`.

Consider modifying the test to have not just `T = usize`, but a type with a `Drop` impl that does something non-trivial, ideally detect double drops. Perhaps `T = Box<usize>` is enough, if you can verify that CI runs tests with ASAN.
A new test that checks for double drops similar to the Vec tests has been added.
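For illustration, a drop-counting element type along these lines (not the PR's actual test code) is one way such a test can catch double drops or leaks:

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

// Each dropped `Counted` bumps the counter exactly once.
#[allow(dead_code)]
struct Counted(usize);

impl Drop for Counted {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let mut a: VecDeque<Counted> = (0..4).map(Counted).collect();
    let mut b: VecDeque<Counted> = (4..8).map(Counted).collect();

    a.append(&mut b);
    drop(a);
    drop(b);

    // Every element must be dropped exactly once. An `append` that copies the
    // elements of `b` and then also drops them (e.g. via `clear`) would push
    // this count above 8; a leak would leave it below 8.
    assert_eq!(DROPS.load(Ordering::SeqCst), 8);
}
```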
```rust
// 6. `src` and `dst` are discontiguous
//    + dst_high is the same size as src_high
let src_contiguous = src_low.is_empty();
let dst_contiguous = dst_high.len() >= src_total;
```
This took me a while to figure out.

The wording of the comment above suggests that "`dst` is contiguous" is a property of `dst` alone (namely that the pair of empty/unused slices are in fact one), similar to "`src` is contiguous". However the definition of this `dst_contiguous` value is not that. Instead, it is "we will only need to write into a contiguous portion of `dst`". This does not match the comment or the name of the variable. Consider rewording both.
```rust
// it is important we clear the old values from `other`...
other.clear();
// and that we update `head` as the last step, making the values accessible in `self`.
self.head = new_head;
```
With `clear()` replaced per the above comment there is no need to delay the write to `self.head`, since there is no arbitrary destructor code being called, only shallow copies (moves). Inlining it in each of the 6 cases might be clearer. (The reset of `other` can also be moved before the copies.)
The assignment of `self.head` and the reset of `other` both need to be done after `dst` and `src` are retrieved, and the borrow checker complains if I do the assignments while they are in scope, as `self` and `other` are borrowed.
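A minimal, standalone illustration of that borrow-checker constraint (nothing here is the PR's code; `Ring` and `parts` are invented stand-ins for the deque and its slice accessors):

```rust
struct Ring {
    buf: Vec<u32>,
    head: usize,
}

impl Ring {
    // Borrows the whole `Ring` mutably, like the deque's internal accessors do.
    fn parts(&mut self) -> (&mut [u32], &mut [u32]) {
        self.buf.split_at_mut(self.head)
    }
}

fn main() {
    let mut ring = Ring { buf: vec![1, 2, 3, 4], head: 1 };

    let new_head;
    {
        let (low, high) = ring.parts();
        // While `low`/`high` are alive, all of `ring` counts as borrowed, so a
        // `ring.head = ...` assignment here would be rejected by the borrow checker.
        high[0] += low[0];
        new_head = low.len() + high.len();
    } // Borrow ends here.

    ring.head = new_head;
    assert_eq!(ring.head, 4);
    assert_eq!(ring.buf, [1, 3, 3, 4]);
}
```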
💔 Test failed - status-travis

Your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.
@bors retry

⌛ Testing commit b063bd4 with merge 4ceb23dfca0457fb399a72b3f7526b54a83b7476...

💔 Test failed - status-travis
Non-naive implementation of `VecDeque.append`

Replaces the old, simple implementation with a more manual (and **unsafe** 😱) one. I've added 1 more test and verified that it covers all 6 code paths in the function. This new implementation was about 60% faster than the old naive one when I tried benchmarking it.
☀️ Test successful - status-appveyor, status-travis
…onSapin" This partially reverts commit d5b6b95, reversing changes made to 6b1ff19. Fixes rust-lang#53529. Cc: rust-lang#53564.
Reoptimize VecDeque::append

~Unfortunately, I don't know if these changes fix the unsoundness mentioned in #53529, so it is still a WIP. This is also completely untested. The VecDeque code contains other unsound code: one example is [reading uninitialized memory](https://play.rust-lang.org/?gist=6ff47551769af61fd8adc45c44010887&version=nightly&mode=release&edition=2015) (detected by MIRI), so I think this code will need a bigger refactor to make it clearer and safer.~ Note: this is based on #53571. r? @SimonSapin Cc: #53529 #52553 @yorickpeterse @jonas-schievink @Pazzaz @shepmaster.
Optimize VecDeque::append

Optimize `VecDeque::append` to do an unsafe copy rather than iterating through each element. On my `Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz`, the benchmark shows 37% improvements:

```
Master:
custom-bench vec_deque_append 583164 ns/iter
custom-bench vec_deque_append 550040 ns/iter

Patched:
custom-bench vec_deque_append 349204 ns/iter
custom-bench vec_deque_append 368164 ns/iter
```

Additional notes on the context: this is the third attempt to implement a non-trivial version of `VecDeque::append`; the last two were reverted due to unsoundness or regression, see:

- rust-lang#52553, reverted in rust-lang#53571
- rust-lang#53564, reverted in rust-lang#54851

Both cases are covered by existing tests.

Signed-off-by: tabokie <[email protected]>