Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display of integers without raw pointers and without overflowing_literals #135265

Merged
merged 2 commits into from
Feb 4, 2025

Conversation

pascaldekloe
Copy link
Contributor

@pascaldekloe pascaldekloe commented Jan 8, 2025

The benchmarks as is measure formatting speed of literals. The first commit black_box-es input to simulate runtime speed instead.

The second commit replaces unsafe pointer optimizations with plain array indices. The performance is equivalent on Apple M1. Needs peer review on Intel.

Happy to do the 128-bit version too if such change is welcome.

@rustbot
Copy link
Collaborator

rustbot commented Jan 8, 2025

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Noratrieb (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 8, 2025
@pascaldekloe pascaldekloe changed the title Fmt int speed Display of integers without unsafe pointer Jan 8, 2025
@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

r? tgross35 since I'm doing some work in this area anyway

You should cd src/etc/test-float-parse and run cargo +stage0 run --release, those tests are technically for parsing but do invoke the print routines quite a bit. I'll take a closer look at the changes soon.

@rustbot rustbot assigned tgross35 and unassigned Noratrieb Jan 8, 2025
@pascaldekloe
Copy link
Contributor Author

No problem if this one gets rejected in favour of anything else @tgross35. I was just pleased to see the compiler doing this well.

@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

You should cd src/etc/test-float-parse and run cargo +stage0 run --release, those tests are technically for parsing but do invoke the print routines quite a bit. I'll take a closer look at the changes soon.

Actually ignore this, I was thinking this was in flt2dec and not integers.

No problem if this one gets rejected in favour of anything else @tgross35. I was just pleased to see the compiler doing this well.

No conflict :) Replacing unsafe code is always nice if there are no drawbacks.

@tgross35
Copy link
Contributor

tgross35 commented Jan 8, 2025

It looks like there may be some small regressions here, the new code doesn't elide a couple calls to panic_bounds_check https://rust.godbolt.org/z/hjqhqsnbb. You could probably play around on godbolt a bit to figure out where this is coming from, a call to hint::assert_unchecked in the right place would likely make this disappear.

Soon it will be possible to write_copy_of_slice on the MaybeUninit slices, I don't expect this to do anything for optimization but it would allow keeping the "write two bytes at a time" feature from the original #129259.

@tgross35 tgross35 changed the title Display of integers without unsafe pointer Display of integers without raw pointer Jan 10, 2025
@tgross35 tgross35 changed the title Display of integers without raw pointer Display of integers without raw pointers Jan 10, 2025
@pascaldekloe
Copy link
Contributor Author

What a gem that Compiler Explorer is @tgross35. I think the assembly looks good now.

B.t.w., I tried bit shifting entries from the lookup table into a 64-bit register and write per 8 digits at once. The compiler "optimizes" such intermediate register away, and it goes for 16-bit copies instead. Maybe those small writes are actually faster. I don't know. 😂

@pascaldekloe
Copy link
Contributor Author

I found a way without compiler hinting which is even faster than the unsafe code as is. It requires buffers of an even byte-size. Needs a little more work for all integer sizes.

https://github.com/pascaldekloe/b10/blob/e7d9813c1f74dc9137216f1b271303fb056e90ad/src/lib.rs#L1723

@tgross35
Copy link
Contributor

What a gem that Compiler Explorer is @tgross35.

I am by no means a perf expert but I still can't imagine chasing down optimizations without it :)

I found a way without compiler hinting which is even faster than the unsafe code as is. It requires buffers of an even byte-size. Needs a little more work for all integer sizes.

https://github.com/pascaldekloe/b10/blob/e7d9813c1f74dc9137216f1b271303fb056e90ad/src/lib.rs#L1723

Are you planning to update the PR with something like this? At a surface level the patch seems pretty reasonable but I didn't look too deep yet, I'll hold off on reviewing if you have something faster coming.

@pascaldekloe
Copy link
Contributor Author

Yes, will give it a try @tgross35. Same surprise here as you're having. 🙂

@tgross35
Copy link
Contributor

@rustbot author

(just comment @rustbot review to update the labels when this is ready for a look)

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 13, 2025
@pascaldekloe
Copy link
Contributor Author

The speed gain is because (1) the loop decision no longer depends on the digit calculation, (2) it produces fewer instructions, and (3) it no longer needs a check for special case zero, which has a "significant" leading zero.

Unfortunately the results are less optimal for the two benchmark cases in the core library: either zero or maximum digits in a loop. Apparently the branch prediction benefit does not outweight the per-4-digit tweak in such scenario.

So now we have the explicit assertions as hint assertions too. Can you have a look @tgross35? @rustbot review

https://rust.godbolt.org/z/h97MacKcc

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 13, 2025
@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 14, 2025
@rustbot

This comment has been minimized.

@rustbot rustbot removed has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 14, 2025
@bors
Copy link
Contributor

bors commented Feb 3, 2025

⌛ Testing commit 821bc70 with merge 60f48c4...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 3, 2025

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 3, 2025
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 3, 2025
@tgross35
Copy link
Contributor

tgross35 commented Feb 3, 2025

@bors r=tgross35,ChrisDenton

@bors
Copy link
Contributor

bors commented Feb 3, 2025

📌 Commit ebeaf2e has been approved by tgross35,ChrisDenton

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 3, 2025
@workingjubilee workingjubilee added the CI-spurious-fail-mingw CI spurious failure: target env mingw label Feb 3, 2025
@bors
Copy link
Contributor

bors commented Feb 4, 2025

⌛ Testing commit ebeaf2e with merge 019fc4d...

@bors
Copy link
Contributor

bors commented Feb 4, 2025

☀️ Test successful - checks-actions
Approved by: tgross35,ChrisDenton
Pushing 019fc4d to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 4, 2025
@bors bors merged commit 019fc4d into rust-lang:master Feb 4, 2025
7 checks passed
@rustbot rustbot added this to the 1.86.0 milestone Feb 4, 2025
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (019fc4d): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.5% [-0.6%, -0.5%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.5% [-0.6%, -0.5%] 2

Max RSS (memory usage)

Results (secondary 2.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.7% [2.7%, 2.7%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Cycles

Results (primary 1.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.0% [1.0%, 1.0%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.0% [1.0%, 1.0%] 1

Binary size

Results (primary -0.2%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.2% [-0.4%, -0.0%] 12
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.1%] 38
All ❌✅ (primary) -0.2% [-0.4%, -0.0%] 12

Bootstrap: 778.131s -> 778.858s (0.09%)
Artifact size: 328.78 MiB -> 328.82 MiB (0.01%)

carolynzech added a commit to carolynzech/kani that referenced this pull request Feb 10, 2025
- Regression for delayed UB for slices: rust-lang/rust#135265
carolynzech added a commit to carolynzech/kani that referenced this pull request Feb 10, 2025
- Regression for delayed UB for slices: rust-lang/rust#135265
github-merge-queue bot pushed a commit to model-checking/kani that referenced this pull request Feb 11, 2025
Upgrade toolchain to 2/10.

I **highly recommend** reviewing this PR commit-by-commit. The
description in each commit message links to the upstream PRs that
prompted those particular changes.

## Callouts
- 2/1 had a lot of formatting changes. I split the commits for that day
into formatting changes and functionality changes accordingly.
- 2/5 introduced a regression in our delayed UB instrumentation, so I
made a new fixme test. See #3881 for details.


## Culprit PRs:
rust-lang/rust#134424 
rust-lang/rust#130514
rust-lang/rust#135748
rust-lang/rust#136590
rust-lang/rust#135318
rust-lang/rust#135265

rust-lang/rust@bcb8565
rust-lang/rust#136471
rust-lang/rust#136645

Resolves #3863

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-spurious-fail-mingw CI spurious failure: target env mingw CI-spurious-fail-msvc CI spurious failure: target env msvc merged-by-bors This PR was explicitly merged by bors. rla-silenced Silences rust-log-analyzer postings to the PR it's added on. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants