Make generic over a Field trait #40

Merged: 3 commits from rh-wrap-8 into master on Jan 11, 2019

Conversation

@rphmeier (Contributor) commented Jan 8, 2019

And implement it for galois_8.
After #38 I will add an implementation for galois_16 as well.
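
For context, a minimal sketch of the shape such a trait can take. The names here (Elem, mul_slice, mul_slice_xor) follow the discussion below, but they are assumptions for illustration, not necessarily the exact API this PR adds:

```rust
use std::fmt::Debug;

/// Illustrative finite-field abstraction; names and bounds are assumptions.
pub trait Field {
    /// Element type of the field, e.g. u8 for GF(2^8).
    type Elem: Copy + PartialEq + Debug + Default;

    fn zero() -> Self::Elem;
    fn one() -> Self::Elem;
    fn add(a: Self::Elem, b: Self::Elem) -> Self::Elem;
    fn mul(a: Self::Elem, b: Self::Elem) -> Self::Elem;

    /// out[i] = c * input[i]; backends may override with table/SIMD routines.
    fn mul_slice(c: Self::Elem, input: &[Self::Elem], out: &mut [Self::Elem]) {
        for (i, o) in input.iter().zip(out.iter_mut()) {
            *o = Self::mul(c, *i);
        }
    }

    /// out[i] = out[i] + c * input[i] (in GF(2^n), addition is XOR).
    fn mul_slice_xor(c: Self::Elem, input: &[Self::Elem], out: &mut [Self::Elem]) {
        for (i, o) in input.iter().zip(out.iter_mut()) {
            *o = Self::add(*o, Self::mul(c, *i));
        }
    }
}
```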

@coveralls commented Jan 8, 2019

Coverage increased (+0.04%) to 96.3% when pulling 938c394 on paritytech:rh-wrap-8 into 0890891 on darrenldl:master.

@codecov (bot) commented Jan 8, 2019

Codecov Report

Merging #40 into master will increase coverage by 0.07%.
The diff coverage is 94.11%.

@@            Coverage Diff            @@
##           master     #40      +/-   ##
=========================================
+ Coverage   96.22%   96.3%   +0.07%     
=========================================
  Files           7       7              
  Lines        2624    2649      +25     
=========================================
+ Hits         2525    2551      +26     
+ Misses         99      98       -1
Impacted Files Coverage Δ
src/tests/mod.rs 98% <100%> (ø) ⬆️
src/matrix.rs 98.73% <100%> (ø) ⬆️
src/inversion_tree.rs 91.66% <75%> (ø) ⬆️
src/galois_8.rs 98.99% <92.59%> (-0.64%) ⬇️
src/lib.rs 95% <93.75%> (+0.88%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0890891...938c394.

@rphmeier (Contributor, Author) commented Jan 9, 2019

I'd be interested to see if there is a performance hit for GF(2^8) with the generification.

@darrenldl (Contributor)

I can try benchmarking it later.

If there is a significant performance hit, though, what would your suggestion be?

@darrenldl (Contributor) commented Jan 10, 2019

Sorry, I haven't quite gotten to setting up benchmarking due to time constraints, but I went through the changes and it should be fine, as the galois_8::Field implementations of mul_slice and mul_slice_xor use the lower-level routines rather than the higher-level defaults in lib.rs.

The majority of the execution time during encoding is spent in Galois coding, specifically mul_slice, so even if other parts carry a slight overhead, it shouldn't impact the encoding phase much, if it's noticeable at all.

Matrix is pretty simple to begin with, and matrix generation occupies a relatively small portion of the execution time in general, since InversionTree caches matrices. So unless the generification causes InversionTree to cache a lot more slowly for some reason, overall performance should be almost the same as before.
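
To illustrate the point about mul_slice, here is a hedged sketch continuing the trait shape from the PR description above (not the crate's actual code): the GF(2^8) implementation overrides mul_slice and mul_slice_xor so the encoding hot path runs the specialized slice routines rather than the generic element-by-element defaults. slow_mul stands in for the real table-driven multiplication in galois_8.

```rust
/// Marker type for GF(2^8); continues the `Field` trait sketch above.
pub struct Gf8;

/// Slow but correct carry-less multiply modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d),
/// used here only as a stand-in for the optimized table-based multiplication.
fn slow_mul(a: u8, b: u8) -> u8 {
    let (mut a, mut b, mut p) = (a as u16, b, 0u16);
    while b != 0 {
        if b & 1 != 0 { p ^= a; }
        a <<= 1;
        if a & 0x100 != 0 { a ^= 0x11d; }
        b >>= 1;
    }
    p as u8
}

impl Field for Gf8 {
    type Elem = u8;

    fn zero() -> u8 { 0 }
    fn one() -> u8 { 1 }
    fn add(a: u8, b: u8) -> u8 { a ^ b } // addition in GF(2^8) is XOR
    fn mul(a: u8, b: u8) -> u8 { slow_mul(a, b) }

    // Override the defaults so encoding's hot path uses the specialized slice
    // routines (in the real crate, the lower-level galois_8 functions).
    fn mul_slice(c: u8, input: &[u8], out: &mut [u8]) {
        for (i, o) in input.iter().zip(out.iter_mut()) {
            *o = slow_mul(c, *i); // imagine a table/SIMD kernel here
        }
    }

    fn mul_slice_xor(c: u8, input: &[u8], out: &mut [u8]) {
        for (i, o) in input.iter().zip(out.iter_mut()) {
            *o ^= slow_mul(c, *i);
        }
    }
}
```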

@rphmeier (Contributor, Author) commented Jan 11, 2019

The benchmarks show this branch as much slower, but that is mostly because of the removal of Rayon. I wouldn't expect this change itself to make things slower except in the case of some real failure of monomorphization.

@darrenldl (Contributor)

Huh, so rayon did speed things up a lot.

Do you have the specific numbers?

@rphmeier (Contributor, Author)

Old:

encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 1024
    time taken       : 0.418300462
    byte count       : 5242880000
    MB/s             : 11953.130474907293
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 4096
    time taken       : 0.387275401
    byte count       : 5242880000
    MB/s             : 12910.708986652111
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 8192
    time taken       : 0.387310016
    byte count       : 5242880000
    MB/s             : 12909.555119793236
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 16384
    time taken       : 0.366356906
    byte count       : 5242880000
    MB/s             : 13647.893401523595
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 32768
    time taken       : 0.350995964
    byte count       : 5242880000
    MB/s             : 14245.178044269478
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 65536
    time taken       : 0.354697661
    byte count       : 5242880000
    MB/s             : 14096.512466147895

New:

encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 1024
    time taken       : 5.654385699
    byte count       : 5242880000
    MB/s             : 884.269355888522
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 4096
    time taken       : 5.768529836
    byte count       : 5242880000
    MB/s             : 866.7719752086933
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 8192
    time taken       : 5.745482032
    byte count       : 5242880000
    MB/s             : 870.2490012416769
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 16384
    time taken       : 5.824110237
    byte count       : 5242880000
    MB/s             : 858.500233775709
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 32768
    time taken       : 5.703026328
    byte count       : 5242880000
    MB/s             : 876.7274973730404
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 65536
    time taken       : 5.708100441
    byte count       : 5242880000
    MB/s             : 875.9481462670359

@rphmeier (Contributor, Author)

Totally makes sense that rayon speeds things up IMO.

@darrenldl (Contributor) commented Jan 11, 2019

Thanks for benchmarking!

Okay, yeah, that makes a lot of sense now that I think about it: independent slice coding does allow a speedup linear in the (parity) shard count.

But yeah, the generification itself seems fine. If we really want, we can benchmark with rayon restricted to a single thread; that should show the overhead of the generification, if any exists at all. But I personally don't feel this is too useful.
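
For reference, pinning rayon's global pool to one thread would be enough for that kind of measurement. A minimal sketch, assuming the benchmark binary can configure rayon itself:

```rust
// Minimal sketch: force rayon's global thread pool to a single worker so the
// benchmark measures per-thread overhead rather than parallel speedup.
fn main() {
    rayon::ThreadPoolBuilder::new()
        .num_threads(1)
        .build_global()
        .expect("global thread pool already initialized");

    // ... run the encode benchmark as usual here ...
}
```

Setting RAYON_NUM_THREADS=1 in the environment achieves the same effect without code changes.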

@rphmeier (Contributor, Author)

(I was accidentally benchmarking without SIMD acceleration due to the feature change.)

Old without rayon:

encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 1024
    time taken       : 0.745146417
    byte count       : 5242880000
    MB/s             : 6710.090642494494
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 4096
    time taken       : 0.746232383
    byte count       : 5242880000
    MB/s             : 6700.325681256317
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 8192
    time taken       : 0.794534702
    byte count       : 5242880000
    MB/s             : 6292.991341239114
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 16384
    time taken       : 0.762313775
    byte count       : 5242880000
    MB/s             : 6558.978945382431
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 32768
    time taken       : 0.883574129
    byte count       : 5242880000
    MB/s             : 5658.834766539436
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 65536
    time taken       : 0.784516589
    byte count       : 5242880000
    MB/s             : 6373.351526413676

New with simd-accel:

encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 1024
    time taken       : 0.814699297
    byte count       : 5242880000
    MB/s             : 6137.233723426179
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 4096
    time taken       : 0.829256454
    byte count       : 5242880000
    MB/s             : 6029.497842171753
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 8192
    time taken       : 0.736849525
    byte count       : 5242880000
    MB/s             : 6785.645956682947
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 16384
    time taken       : 0.733428098
    byte count       : 5242880000
    MB/s             : 6817.30085557753
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 32768
    time taken       : 0.740080925
    byte count       : 5242880000
    MB/s             : 6756.017931417433
encode :
    shards           : 10 / 2
    shard length     : 1048576
    bytes per encode : 65536
    time taken       : 0.744777619
    byte count       : 5242880000
    MB/s             : 6713.413336337112

@rphmeier (Contributor, Author) commented Jan 11, 2019

In short, it seems about the same. Rayon can roughly double throughput in this configuration.
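
That roughly 2x with two parity shards matches parallelizing over parity rows: each parity shard is an independent linear combination of the data shards. A hedged sketch of that structure follows; it is not the crate's actual encode loop, and gf_mul_slice_xor and the signatures here are assumptions.

```rust
use rayon::prelude::*;

/// Sketch of the parallel structure: each parity shard is an independent
/// linear combination of the data shards, so rows can be computed in parallel.
/// `gf_mul_slice_xor` stands in for the field's out[i] ^= c * in[i] routine.
fn encode_parity(
    matrix_rows: &[Vec<u8>],  // one coding-matrix row per parity shard
    data: &[Vec<u8>],         // data shards
    parity: &mut [Vec<u8>],   // parity shards, pre-zeroed
    gf_mul_slice_xor: fn(u8, &[u8], &mut [u8]),
) {
    parity
        .par_iter_mut()
        .zip(matrix_rows.par_iter())
        .for_each(|(out, row)| {
            for (coeff, shard) in row.iter().zip(data.iter()) {
                gf_mul_slice_xor(*coeff, shard, out);
            }
        });
}
```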

@darrenldl (Contributor)

Sounds about right - 2 parity shards are used in the benchmarks.

So I think this PR looks good, as nothing important is changed. I'm not sure there's a very good way to test the impl crate::Field for Field { ... } part to make sure the right underlying functions are used, other than duplicating all the existing tests, so I'll just open an issue and see if any new ideas pop up.

I'll merge this if you think it's okay.

@rphmeier (Contributor, Author)

yes please :)

@darrenldl merged commit ba2389a into rust-rse:master on Jan 11, 2019
@rphmeier deleted the rh-wrap-8 branch on January 11, 2019