-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rdataframe to awkward #1448
rdataframe to awkward #1448
Conversation
Codecov Report
|
@jpivarski - would the following guarantee that the result does not go away? It's in ROOT src/bindings/pyroot_legacy/_rdf_utils.py import numpy
class ndarray(numpy.ndarray):
"""
A wrapper class that inherits from numpy.ndarray and allows to attach the
result pointer of the `Take` action in an `RDataFrame` event loop to the
collection of values returned by that action. See
https://docs.scipy.org/doc/numpy/user/basics.subclassing.html for more
information on subclassing numpy arrays.
"""
def __new__(cls, numpy_array, result_ptr):
"""
Dunder method invoked at the creation of an instance of this class. It
creates a numpy array with an `RResultPtr` as an additional
attribute.
"""
obj = numpy.asarray(numpy_array).view(cls)
obj.result_ptr = result_ptr
return obj
def __array_finalize__(self, obj):
"""
Dunder method that fills in the instance default `result_ptr` value.
"""
if obj is None: return
self.result_ptr = getattr(obj, "result_ptr", None) |
That class doesn't do anything to attach itself to a RDataSource by itself, but presumably NumPy arrays require special handling because But you're right that RNumpyDS is going to have exactly the same problem that we do: the data is owned by a NumPy object, so that NumPy object has to be kept in scope as long as the RNumpyDS is. Here's the pointer: and maybe that The RNumpyDS does There are no other references to If this is right, then I think that means that ROOT functions that might create or delete RNumpyDS instances are holding the Python GIL. That surprises me, since it would prevent speed-ups in parallel processing. Maybe it's okay because the parallel processing only happens in a fork-join pattern in which the RNumpyDS is never created or destroyed?
|
692b33d
to
9e1ac32
Compare
@jpivarski - it looks like the failing pre-commit is unrelated to this PR; flake8...................................................................Failed
- hook id: flake8
- exit code: 1
src/awkward/config.py:99:5: T201 print found.
src/awkward/_v2/_connect/numba/arrayview.py:16:9: T201 print found.
src/awkward/_v2/_connect/numba/arrayview.py:17:9: T201 print found.
src/awkward/_connect/_numba/arrayview.py:20:9: T201 print found.
src/awkward/_connect/_numba/arrayview.py:21:9: T201 print found. |
|
I think, print here is not ignored anymore: def code_to_function(code, function_name, externals=None, debug=False):
if debug:
print("################### " + function_name) # noqa: T001
print(code) # noqa: T001 |
Apparently, what happened is that the number changed from T001 to T201. I would have thought that those numbers wouldn't change—why else would they be short codes like that? Oh well. Since this might be run under different versions of flake8-print, it's probably safest to ignore both: # noqa: T001,T201 I don't know what the new "T001" is, but we're only losing the protection of this test on a few isolated lines, so it's probably fine. |
…it-hep/awkward into ianna/rdataframe-to-awkward
replaced by #1474 |
No description provided.