-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Weighted cutflow #1272
feat: Weighted cutflow #1272
Conversation
|
Aye, those failures are the attempt to use fill_flattened in dask which doesn't currently work, so it will be rolled-back to manual flattening for now |
…edSelection class Adjust result object to not break unpacking of cutflow result, restoring one test; need dask-contrib/dask-awkward#531 Remove unnecessary return Cutflow Add boolean_masks_to_categorical_integers for optimization of histogram fill in cutflow WIP add the optimized version of filling cutflows, with a testable v2 variable temprorarily, and new categorical option for additional axis (needs documentation and example) Rename slice to slc WIP weighted cutflow with commonmask, categorical axes, plot_vars still needs much work and cleanup Untested plot_vars function with commonmask and categorical support, using broadcast_arrays func Fixes for original tests to pass WIP - fixes from tests, expanded print header revert change to masks in compute for npz Fix commonmask handling of initial weighted event counts Some logic fixes for weights in npz file Fix for v2 eager mode Handle delayed cutflow to_npz pathway, Weights does not have a compute method black src/coffea/analysis_tools.py
…low from PackedSelection class weights testing okay, histos next weighted and commonmask tests in place, categorical next stricter tests for weights, passing Try v2 yieldhist categorical tests added for yieldhist Added categorical tests, need one more specific to histograms then cleanup Added cutflow_extended_dak tests black tests/test_analysis_tools.py
377f358
to
44048c3
Compare
…ficant performance difference, fix delayed commonmasked path, remove debug print statement, add scale for all paths of yieldhist, plot_vars; reduce LOC and make more consistent
for more information, see https://pre-commit.ci
…rmitting parallel testing by pytest-xdist; 1 core speed: '52 passed, 729287 warnings in 487.82s (0:08:07)', 4 core speed: '52 passed, 732599 warnings in 148.59s (0:02:28)'
for more information, see https://pre-commit.ci
I'll convert this to ready for review, presuming all the tests (very, very slowly) pass. I tacked on a change to the npz tests to write to temporary directories, this should allow running them in parallel (verified to work on my laptop), cross-reference #1278 The extended cutflow tests technically cover the non-extended tests when all non-dask options are False, but I left the original tests in-place out of paranoia. But, importantly, the dak tests call compute a lot... several dozen times, and so an individual test can be O(1 minute). These tests might benefit from refactoring to follow the ideal dask pattern a bit more: setup graph, execute/compute once. Now the opinionated part:
Fin |
Have we got any idea about the really large time taken for the new tests? |
Could you please mark the tests that use dask, as was done in the recent pr from Andrzej? |
…s by building task graph up-front
From my understanding, marking these tests as with-dask would prevent them running in parallel, and these ones are closer to ideal scaling than not. Let me know if I've misunderstood. Some refactoring has eliminated pretty much all the extraneous computes. There's about a 7x improvement (read reduction) in the dak tests added for this PR, and times for just that test and the whole analysis_tools test suite follow: d98a039==a184c35 corresponds to optimized tests
|
Thanks for the refactoring Nick! It was a mistake when I wrote thoses tests like that (I didn't know the impact back then) and I just never fixed it afterwards. |
While anti-patterning a bit, it's a lot clearer to write the tests with all those computes, though (to play devil's advocate). Note that I've left the original ('non-extended') test alone. On my M4Pro with 1 core, it takes 50s to run both variations of it (optimization on/off), whereas the one I added based on it, the extended_dak test, has 2x2x2x as many variations, hence the painfully long additional time on the github runners. While that's an optimization opportunity, it'll take a similar amount of shuffling around as for the extended_dak test |
It may also make sense to not check both optimizations options at some point. Those were finding some issues in earlier stages where optimization was messing something up but they haven’t found something in a while. I would argue that it’s generally bad to remove tests but maybe it’s fine to always have optimization on in the future. |
Honestly for weights you may even only need to check that it builds the typetracer and the graph without failing. |
Any feedback/change requests on the implementation? For the new tests, I could disable things now (just optimization off?), though at about 19s to run all variations multicore on my Mac (maybe 2x worse on the github runners?) there won't be huge savings there. Disabling it on the older test could save some similar amount of time. Just let me know. This also seems like it could be compiled into a dedicated test optimization pass to assist in improving the scikit-hep integration testing. |
I think it's in a reasonable state at this point, tests are a bit long but we can whittle on this. Forking off the optimization on/off tests into their own jobs is a good idea, but that can also come in another PR. Thanks for the addition! |
@NJManganelli Would you like at some point to extend this to N-1 hists as well? If you ever get some time of course. It will be mostly copy pasting stuff cause the original code is pretty much the same. |
Hi @ikrommyd Yeah, it should be extended there too. As tempting as it is to 'balance things out,' I'll need to wait a while and work on some Upgrade, Analysis, and R&D works first. Thanks @lgray , should I make an issue or two to track these to-dos? Test time reductions, refactorization of analysis tools, extension of N-1... |
Yes, please make a tracking issue! Thanks again! |
This PR adds an (opinionated) take on adding weighted cutflows to coffea, permitting the cutflow to be instantiated with a Weights object and a weights modifier (i.e. a weights variation name, as in
Weights.weights("jesUp")
)Edit: Equivalent in NMinusOne would also be highly desirable, but this PR has become large enough that it should be split off (and incorporate any changes requested in review)