Another Compiler Refactor: Performance & Cleanup #2282

ferd · 2020-05-13T01:48:39Z

This is a big PR, more easily reviewed by commit.

Changeset

- Extract artifact tracking to DAG module: I cleaned up a bad abstraction and relocated artifact tracking into the DAG module. This was needed for a parallelism attempt that ended up not yielding good results in practice. It also fixes a bug where erl_first_file entries weren't properly tracked in the DAG.
- Extract & Generalize parallel queue from compiler: I took @max-au's code that makes module-level compiling parallel and moved it out to a standalone module. The logic to run as many tasks as there are schedulers is worth reusing more internally (it was useful in my failed attempts), and it could be useful in further optimization work down the road. That commit also fixes a few type issues and a latent bug in a bad pattern match that some plugins could have tripped on.
- A few low-hanging fruits for a speedup by shifting common code out of a tight loop. That could give me ~5-10% speedups on some builds.
- Finally there is this big boy of a commit that fixes DAG pruning for extra_src_dirs in large repos. This one is tricky so I'll expand on it over several paragraphs below.
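To illustrate the "as many tasks as there are schedulers" idea behind the parallel queue, here is a minimal sketch. It is written in Python purely for illustration and is not the extracted Erlang module: `run_bounded`, `fake_compile`, and the use of the CPU count as a stand-in for the scheduler count are all assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, worker, max_workers=None):
    """Apply `worker` to every task with at most `max_workers` running at once."""
    # Assumption: the CPU count stands in for the number of schedulers.
    limit = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(worker, tasks))


def fake_compile(module_name):
    # Hypothetical placeholder for compiling a single module.
    return module_name + ".beam"


if __name__ == "__main__":
    print(run_bounded(["a", "b", "c"], fake_compile))
```

Keeping the queue generic over `worker` is what makes it reusable outside of the compiler: the caller only supplies the task list and the function to run.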

We run the analysis for an extra_src_dir by making a fake app for it, but send that fake app in alone for analysis.

In the current master, the DAG pruning routine looks at the DAG and the files submitted and goes "this right here is a project with 99% of its apps deleted". It then tries to prune the whole DAG except for some test files. The code "works" simply because there's a check against false positives that looks for the file on disk before removing it from the DAG.

Of course this is a lot costlier than just checking the graph, and it ends up making extra dir runs where ~80% of the time is spent double-checking the false positives for file deletions.

This commit fixes this by merging all extra_src fake apps and making them run in a single analysis phase, meaning we only pay the cost of the DAG pruning once for the whole project, which makes things faster for any sparse repo.
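A toy model can show why per-app analysis is so much costlier than a single pass. This is a sketch in Python under loud assumptions, not rebar3's code: the DAG is a plain dict, the disk is a set of paths, and `prune` and `count_checks` are hypothetical names. It only counts how many on-disk checks the pruning step triggers.

```python
def prune(dag, submitted, on_disk):
    """Prune vertices for files that look deleted; return the disk-check count.

    A vertex looks deleted when its file was not submitted for analysis.
    Before dropping it, we look on "disk" (a set here) to rule out a
    false positive.
    """
    checks = 0
    for path in list(dag):
        if path in submitted:
            continue
        checks += 1  # candidate deletion, so we pay for a disk lookup
        if path not in on_disk:
            del dag[path]
    return checks


def count_checks(n_apps):
    """Return (per_app_checks, single_pass_checks) for n_apps fake apps."""
    # Hypothetical layout: one test suite per fake extra_src app.
    apps = {
        f"app_{i}": {f"apps/app_{i}/test/app_{i}_SUITE.erl"}
        for i in range(1, n_apps + 1)
    }
    all_files = set().union(*apps.values())
    dag = {path: set() for path in all_files}

    # Per-app analysis: every other app's files look deleted each time.
    per_app = sum(prune(dag, files, all_files) for files in apps.values())
    # Single pass: everything is submitted at once, nothing looks deleted.
    single_pass = prune(dag, all_files, all_files)
    return per_app, single_pass


if __name__ == "__main__":
    print(count_checks(300))  # (89700, 0)
```

In this model the per-app approach performs 300 × 299 disk checks, all of them false positives, while the merged pass performs none.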

There's also a small patch needed for the root-level extra src dirs; it turns out that since the context handling in rebar_compiler uses a map to store content, running single-pass analysis clobbered entries for a given app if it had more than one extra_src_dir in there.
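The clobbering problem is the usual one with keyed storage. Here is a minimal Python sketch: the `(app, dir)` pairs and both function names are hypothetical, and accumulating into a list is just one possible way to avoid the overwrite, not necessarily what the patch does.

```python
def store_clobbering(entries):
    """Key the context by app only: a second extra dir overwrites the first."""
    ctx = {}
    for app, extra_dir in entries:
        ctx[app] = extra_dir
    return ctx


def store_accumulating(entries):
    """Keep every extra dir for an app by collecting them in a list."""
    ctx = {}
    for app, extra_dir in entries:
        ctx.setdefault(app, []).append(extra_dir)
    return ctx


if __name__ == "__main__":
    # Hypothetical root-level app with two extra source dirs.
    entries = [("root", "test"), ("root", "scripts")]
    print(store_clobbering(entries))    # {'root': 'scripts'}
    print(store_accumulating(entries))  # {'root': ['test', 'scripts']}
```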

I also took the time to clean up the ordering of that file.

Benchmarks

I replicated the troublesome monorepos of larger corporations that we have received reports about. The benchmark mostly focuses on "null" builds, where the app was already built and we just re-compile it a second time. This lets us evaluate the cost of just scanning for changes, without accounting for externalities like hooks or the Erlang compiler.

The problem mentioned was specifically with `rebar3 as test compile` being really dog slow due to extra_src_dirs.

I set up an experiment by doing the following, which creates a project with over 300 apps, even if they contain only 2-3 modules each:

→ rebar3 new umbrella extrapps
→ cd extrapps/apps/
→ for i in {1..300}
  do
    rebar3 new app app_${i}
    mkdir app_${i}/test
    echo "-module(app_${i}_SUITE). " > app_${i}/test/app_${i}_SUITE.erl
  done
→ cd ../
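The "null" build measurement described above (build once, then time an identical second run) could be scripted roughly as follows. This is a sketch in Python and not how the numbers in this PR were collected; `time_null_build` is a hypothetical helper, and it measures wall-clock time only.

```python
import subprocess
import time


def time_null_build(cmd, cwd=None):
    """Run `cmd` once to build, then time an identical second run."""
    # First run: make sure everything is already compiled.
    subprocess.run(cmd, cwd=cwd, check=True)
    # Second run: only the cost of scanning for changes should remain.
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start
```

For example, `time_null_build(["rebar3", "as", "test", "compile"], cwd="extrapps")` would time the slow case, assuming `rebar3` is on the `PATH`.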

Here are my results as run on my small VPS with 2 cores.

Master:

- rebar3 compile: 7.32s user 1.93s system 165% cpu 5.582 total
- rebar3 as test compile: 44.55s user 10.19s system 149% cpu 36.674 total

This PR:

- rebar3 compile: 7.05s user 2.29s system 160% cpu 5.824 total
- rebar3 as test compile: 10.47s user 2.10s system 128% cpu 9.819 total

All runs showed about 1 second of jitter either way. There is no clear change either way on `rebar3 compile`, but with `rebar3 as test compile` it's obvious that we're now going around 4x faster in that case. The cost on master isn't fixed though; it grows worse as you add more apps with more modules.
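The speedup can be derived from the wall-clock "total" fields reported above. A small Python sketch of that arithmetic follows; `total_seconds` and `speedup` are hypothetical helper names, and the parser also accepts the `m:ss.xx` form that `time` prints for longer runs.

```python
def total_seconds(field):
    """Convert a `time` total such as '36.674' or '1:28.75' to seconds."""
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(before, after):
    """Ratio of wall-clock totals; values above 1 mean `after` is faster."""
    return total_seconds(before) / total_seconds(after)


if __name__ == "__main__":
    # Totals copied from the master and PR runs above.
    print(f"compile: {speedup('5.582', '5.824'):.1f}x")
    print(f"as test compile: {speedup('36.674', '9.819'):.1f}x")
```

The `as test compile` ratio comes out at roughly 3.7x, while the plain `compile` ratio stays close to 1, within the observed jitter.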

ferd · 2020-05-13T14:53:23Z

Benchmark with the last stable rebar3 (3.13.2), which had a faster initial build but was worse at incremental builds:

- ./rebar3 compile: 136.16s user 33.45s system 184% cpu 1:28.75 total
- ./rebar3 as test compile: 153.52s user 37.93s system 182% cpu 1:45.16 total

The current branch is therefore ~20x faster than the last stable release on a larger sparse repo for basic incremental builds. Obviously these improvements aren't as visible on regular, smaller repos, but that's still a meaningful win since we don't have significant regressions there either.

That's a decent boost for the small price to pay on the first build.

ferd added 4 commits May 12, 2020 15:27

Extract artifact tracking to DAG module

90bead2

Cleans up some annoying broken abstraction. Also fix erl_first_file artifact tracking, which will avoid seeking to disk by finding the opts in the DAG

Extract & Generalize parallel queue from compiler

235543d

Into a standalone module where it can be reused for optimisation work later on.

Low hanging fruits for speedups

9a2898a

ferd requested a review from tsloughter May 13, 2020 01:48

tsloughter approved these changes May 16, 2020

src/rebar_prv_compile.erl (outdated, resolved)

src/rebar_compiler.erl (resolved)

formatting output

0e30351

ferd merged commit b6e5f1b into erlang:master May 16, 2020

ferd mentioned this pull request Feb 15, 2022

Maybe wrong pre_hooks order when compiling katt #2682

Closed
