
Optimize varint encoding #767

Open — wants to merge 1 commit into master
Conversation

MelonShooter

This optimizes varint encoding by turning the loop into a bounded `for` loop, which lets the compiler see that it can unroll the loop.
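For illustration, a minimal sketch of the technique being discussed (this is not prost's actual implementation; `encode_varint` and its signature here are hypothetical). A `u64` LEB128 varint occupies at most 10 bytes, so bounding the loop at 10 iterations gives the optimizer a known trip count it can unroll:

```rust
// Hypothetical sketch of bounded-loop LEB128 varint encoding.
// Each output byte holds 7 value bits; the high bit flags continuation.
fn encode_varint(mut value: u64, buf: &mut Vec<u8>) {
    // A u64 needs at most ceil(64 / 7) = 10 varint bytes, so the
    // compiler can fully unroll this loop.
    for _ in 0..10 {
        if value < 0x80 {
            // Fits in 7 bits: final byte, continuation bit clear.
            buf.push(value as u8);
            return;
        }
        // Emit low 7 bits with the continuation bit set, then shift.
        buf.push(((value & 0x7F) | 0x80) as u8);
        value >>= 7;
    }
}

fn main() {
    let mut buf = Vec::new();
    encode_varint(300, &mut buf);
    // 300 encodes as [0xAC, 0x02] per the protobuf wire format.
    assert_eq!(buf, vec![0xAC, 0x02]);
}
```

An unbounded `while value >= 0x80` loop computes the same bytes, but the compiler cannot prove an iteration limit, so it typically emits a branchy rolled loop instead.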

Bench:

varint/small/encode     time:   [61.577 ns 61.786 ns 62.016 ns]
                        change: [-18.793% -17.983% -17.024%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

varint/medium/encode    time:   [312.28 ns 313.01 ns 313.78 ns]
                        change: [-13.850% -13.002% -12.209%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

varint/large/encode     time:   [533.79 ns 535.11 ns 536.43 ns]
                        change: [-26.542% -25.921% -25.275%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe

varint/mixed/encode     time:   [308.34 ns 309.11 ns 309.87 ns]
                        change: [-23.418% -22.569% -21.718%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

@LucioFranco
Member

I am not seeing the same improvements on my laptop

varint/small/encode     time:   [254.51 ns 255.08 ns 255.78 ns]
                        change: [+4.9943% +5.3430% +5.6646%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe

varint/small/decode     time:   [230.34 ns 230.96 ns 231.62 ns]
                        change: [+0.0577% +0.4059% +0.7424%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

varint/small/encoded_len
                        time:   [57.304 ns 57.385 ns 57.477 ns]
                        change: [-0.3950% -0.1263% +0.1243%] (p = 0.35 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe

varint/medium/encode    time:   [1.2231 us 1.2262 us 1.2301 us]
                        change: [+5.3350% +5.6709% +6.0007%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

varint/medium/decode    time:   [240.14 ns 240.56 ns 241.05 ns]
                        change: [-0.8120% -0.5223% -0.2217%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

varint/medium/encoded_len
                        time:   [57.344 ns 57.443 ns 57.556 ns]
                        change: [-0.3119% -0.0467% +0.2367%] (p = 0.73 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

varint/large/encode     time:   [2.3195 us 2.3250 us 2.3310 us]
                        change: [-1.1808% -0.8771% -0.5842%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild

varint/large/decode     time:   [352.08 ns 352.89 ns 353.81 ns]
                        change: [-0.3626% -0.0670% +0.2361%] (p = 0.66 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

varint/large/encoded_len
                        time:   [57.386 ns 57.487 ns 57.598 ns]
                        change: [-0.5971% -0.2721% +0.0337%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild

varint/mixed/encode     time:   [1.4749 us 1.4777 us 1.4808 us]
                        change: [-2.2308% -1.9438% -1.6452%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

varint/mixed/decode     time:   [286.76 ns 287.35 ns 288.00 ns]
                        change: [-0.4404% -0.1303% +0.1851%] (p = 0.41 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  11 (11.00%) high mild
  1 (1.00%) high severe

varint/mixed/encoded_len
                        time:   [57.407 ns 57.537 ns 57.689 ns]
                        change: [-0.5136% -0.1899% +0.1612%] (p = 0.29 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

@LucioFranco
Member

Could you explain more what system you ran those benchmarks on?

@MelonShooter
Author

MelonShooter commented Aug 12, 2023

Apologies for the late response. I ran this on an Intel i5-10300H on Ubuntu 22.04 through WSL. I don't remember what rustc version I ran the original benchmarks on, but I ran it again just now on rustc 1.71.1 and got similar results. What compiler version and CPU did you use to run those benchmarks?

varint/small/encode     time:   [68.863 ns 69.823 ns 70.857 ns]                                
                        change: [-22.839% -20.072% -16.939%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

varint/small/decode     time:   [160.91 ns 162.97 ns 165.16 ns]                                
                        change: [-1.9325% +1.1842% +4.5143%] (p = 0.48 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

varint/small/encoded_len                                                                            
                        time:   [86.179 ns 88.462 ns 90.912 ns]
                        change: [+0.7213% +4.2934% +7.8878%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

varint/medium/encode    time:   [358.18 ns 364.79 ns 372.55 ns]                                 
                        change: [-19.784% -16.565% -13.286%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

varint/medium/decode    time:   [237.39 ns 241.08 ns 245.41 ns]                                 
                        change: [-21.082% -15.509% -9.7628%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  9 (9.00%) high mild
  2 (2.00%) high severe

varint/medium/encoded_len                                                                            
                        time:   [82.511 ns 83.446 ns 84.489 ns]
                        change: [+0.2021% +3.5902% +7.4506%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

varint/large/encode     time:   [636.18 ns 647.51 ns 659.99 ns]                                 
                        change: [-26.210% -23.732% -20.976%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

varint/large/decode     time:   [357.62 ns 363.08 ns 369.10 ns]                                
                        change: [-4.2180% -0.3207% +3.2406%] (p = 0.87 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

varint/large/encoded_len                                                                            
                        time:   [81.459 ns 82.667 ns 83.991 ns]
                        change: [-4.8438% -1.2792% +2.3932%] (p = 0.49 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

varint/mixed/encode     time:   [348.02 ns 352.78 ns 358.36 ns]                                
                        change: [-26.256% -23.085% -20.160%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

varint/mixed/decode     time:   [237.17 ns 241.66 ns 246.70 ns]                                
                        change: [-4.2677% -1.0296% +2.5542%] (p = 0.58 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

varint/mixed/encoded_len                                                                            
                        time:   [83.815 ns 84.976 ns 86.126 ns]
                        change: [-5.0328% -1.2148% +2.4944%] (p = 0.53 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

@caspermeijn
Collaborator

The current master branch now uses `for _ in 0..10 {` to tell the compiler there is a maximum of 10 iterations. Can you rerun your benchmark to see if this PR is still an improvement?

3 participants