-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[primitives] Add Selector primitive system #22635
base: master
Are you sure you want to change the base?
Conversation
31df8fc
to
f69c280
Compare
f69c280
to
c3f780b
Compare
+@sherm1 could I interest you in feature review here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Moving slowly today though.
Reviewable status: LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @jwnimmer-tri)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkpoint.
Reviewed 3 of 4 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: 13 unresolved discussions, LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @jwnimmer-tri)
systems/primitives/selector.h
line 32 at r4 (raw file):
std::vector<int> input_sizes; /** Helper struct for `output_selections`. */
BTW this has a default constructor that produces a legitimate (but odd) value of (0,0). Consider deleting that -- it doesn't seem useful. Or make it something ugly like (-1,-1)?
systems/primitives/selector.h
line 59 at r4 (raw file):
j = output_selections[m][i].input_offset y_m[i] = u_n[j] ``` */
BTW is it required that every output port element is sourced from some input port element? Consider mentioning the policy.
systems/primitives/selector.h
line 74 at r4 (raw file):
/** This system combines multiple vector-valued inputs into multiple vector- valued outputs. The inputs to this system directly feed through to its outputs.
BTW consider mentioning here the very cool feature that this is not just shuffling input ports to output ports but can shuffle individual elements of inputs to individual elements of outputs. One can infer that by reading element documentation for SelectorParams but I think users will start here at Selector doxygen and not get the big picture that would motivate them to dig in to the Params.
systems/primitives/selector.h
line 86 at r4 (raw file):
- y0 - ... - y(M-1)
nit: you used lower case "n" and "m" in the Params doc; should match here. (I like lower case better myself)
systems/primitives/selector.h
line 120 at r4 (raw file):
const SelectorParams params_; std::shared_ptr<const void> compiled_program_;
BTW is this actually intended for sharing? If it is a hack to make Pybind work it should say so.
systems/primitives/selector.h
line 120 at r4 (raw file):
const SelectorParams params_; std::shared_ptr<const void> compiled_program_;
BTW also is void
here better than a forward-declared CompiledProgram
? I'm imagining doing this way makes debugging clunky.
systems/primitives/selector.cc
line 15 at r4 (raw file):
struct CompiledProgram { struct Element { auto operator<=>(const Element&) const = default;
BTW Yay -- spaceship!
systems/primitives/selector.cc
line 25 at r4 (raw file):
using PerOutput = absl::InlinedVector< std::pair<InputPortIndex, absl::InlinedVector<Element, 5>>, 1>; static_assert(sizeof(PerOutput) == 64);
BTW cute, but this seems like premature optimization at first glance. Is there a reason to believe that a plain std::vector of std::vectors wouldn't be adequate? I went down an absl-code-reading rathole trying to see why the size should be 64 but eventually gave up. I (and future readers of this code) wouldn't have felt compelled to do that with straightforward vector of vector.
A few heap allocations at setup doesn't seem like it would be a problem? And cache locality for a single output port would seem to matter more for larger outputs, but for those InlinedVector just turns effectively into std::vector.
systems/primitives/selector.cc
line 93 at r4 (raw file):
params_.output_selections[i]; int output_offset = 0; for (const auto& [input_port_index, input_offset] : output_specs) {
BTW I'm not seeing anything that requires that output_specs covers every element of the output port. Did you want to require that every element has an input source? Or maybe set open outputs to NaN or zero? In any case the chosen policy should be documented somewhere.
Consider also mentioning that this is not a mere shuffling of elements but that any given input element could appear as any number of output elements, and not every input element needs to appear in the output.
systems/primitives/selector.cc
line 101 at r4 (raw file):
} for (auto& [_, elements] : compiled_outputs[i]) { std::sort(elements.begin(), elements.end());
minor: this needs a comment. I think what you're doing here is ordering the elements so that they are grouped by source element order, rather than being in output element order as constructed above, perhaps due to your stated concern about cache locality above. But it's not clear why favoring the input port order over the output order is a good thing. Some clarification is needed. I'd be inclined to document the best practice in the CompiledProgram doc also.
manipulation/schunk_wsg/schunk_wsg_plain_controller.cc
line 127 at r4 (raw file):
params.input_sizes.push_back(2); params.input_names.push_back("desired_grip_state"); params.input_sizes.push_back(2);
BTW this would be prettier and less error prone if it looked more like
params.input_port("desired_mean_finger_state", 2);
or
params.input_port(
{
.name = "desired_mean_finger_state",
.size = 2,
});
manipulation/schunk_wsg/schunk_wsg_plain_controller.cc
line 146 at r4 (raw file):
.input_offset = 1, }, });
BTW that's nice -- much better than a selector matrix!
systems/primitives/test/selector_test.cc
line 24 at r4 (raw file):
} // This gives y0 = u0 where each port is a 1-element vector.
nit:
Suggestion:
the only port
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it! Feature review pass done. Since I made some design suggestions I'll hold off on the stamp in case you like any of them or want to discuss.
Reviewed 4 of 7 files at r1, 1 of 4 files at r3.
Reviewable status: 17 unresolved discussions, LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @jwnimmer-tri)
manipulation/schunk_wsg/schunk_wsg_plain_controller.cc
line 146 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW that's nice -- much better than a selector matrix!
I'm wondering now whether we could get the port name and element info together also, like
params.output_selections.push_back(
"x_tilde_desired",
{
{
.input_port_index = 0,
.input_offset = 0,
}
};
systems/primitives/test/selector_test.cc
line 99 at r4 (raw file):
{ // y0 = <u0[0], u1[0]> OutputSelection{
BTW you could leave out the explicit struct name here like you did in the Schunk example. Maybe better do that here also to prevent cargo culters from assuming it's required? Also tidier and less vertical space consumed.
systems/primitives/test/selector_test.cc
line 192 at r4 (raw file):
TEST_F(SelectorTest, BadInputSize) { // Request one input port but mistakenly provide two input port names.
nit: this comment is from a different test
systems/primitives/test/selector_test.cc
line 266 at r4 (raw file):
OutputSelection{ .input_port_index = 0, },
BTW I'm not sure it's a plus that you can leave the offset unspecified to get a 0.
bindings/pydrake/systems/test/primitives_test.py
line 443 at r4 (raw file):
def test_selector(self, T): dut = Selector_[T](params=self._make_selector_params()) dut.CreateDefaultContext()
BTW just want to confirm that a "doesn't blow up" test is sufficient here. It wasn't obvious to me that we would known whether the Python output_selections data structure is conveyed correctly in to C++. But I'll take your word for it!
c3f780b
to
a75e192
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good ideas, here's the major changes:
(1) Re-jigger the params to prefer helper structs over parallel arrays.
(2) Incorporate a count
into the compiled program, so that spans of contiguous copying don't need so much memory in the CompiledProgram representation (and remove InlinedVector.)
(3) Memorize the input port pointers to avoid repeated checking for out-of-bounds and deprecated ports during every Calc.
(4) Use Eigen::VectorBlock<VectorX<T>> output_block = ...
in the calc function. Calling BasicVector::operator[]
is awful in the object code.
Reviewed 5 of 7 files at r1, 2 of 2 files at r2, 4 of 4 files at r3, 1 of 1 files at r4, 8 of 8 files at r5, all commit messages.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers
systems/primitives/selector.h
line 32 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW this has a default constructor that produces a legitimate (but odd) value of (0,0). Consider deleting that -- it doesn't seem useful. Or make it something ugly like (-1,-1)?
Configuration objects (and IMO most value-semantics objects) should have a default constructor. Deleting it prevents convenient and valid uses cases like instantiating it first and then filling it in later.
Defaulting the input_port_index to be the first input port is an important feature. It lets allows users to not mention it in common case of only a single input port.
That leaves the only question as what the default value of input_offset
should be. I don't really want to use -1
sentinel in case we decide to allow python indexing where negative indices count backwards from the end. We could use an optional<>
to detect uninitialized, but I'm not sure it's a win. If default value of 0 wasn't what the user wanted, then they should have written something more specific. A compiler error is too big of a gun to use for this kind of small logic mistake.
systems/primitives/selector.h
line 59 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW is it required that every output port element is sourced from some input port element? Consider mentioning the policy.
Done
systems/primitives/selector.h
line 86 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
nit: you used lower case "n" and "m" in the Params doc; should match here. (I like lower case better myself)
My mental model is that M and N capitalized are the constants for how many ports exist, and m and n are loop variables that range e.g. m ∈ [0, M).
systems/primitives/selector.cc
line 25 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW cute, but this seems like premature optimization at first glance. Is there a reason to believe that a plain std::vector of std::vectors wouldn't be adequate? I went down an absl-code-reading rathole trying to see why the size should be 64 but eventually gave up. I (and future readers of this code) wouldn't have felt compelled to do that with straightforward vector of vector.
A few heap allocations at setup doesn't seem like it would be a problem? And cache locality for a single output port would seem to matter more for larger outputs, but for those InlinedVector just turns effectively into std::vector.
Yeah, this halfway measure probably isn't worthwhile. I've put it back to normal vectors now. The right answer is to switch them all to be pmr::vector using a shared allocator to obtain the linear layout in memory.
systems/primitives/selector.cc
line 93 at r4 (raw file):
I'm not seeing anything that requires that output_specs covers every element of the output port.
Given how the Params struct works, it's impossible to declare an output without saying which input it came from. Per the other discussion I already added some Doxygen text about this.
We could of course add more cross-checking logic here with DRAKE_DEMAND that our compilation step didn't screw things up, but that didn't seem necessary to me.
systems/primitives/test/selector_test.cc
line 24 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
nit:
There are two ports in total (one input, one output).
systems/primitives/test/selector_test.cc
line 99 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW you could leave out the explicit struct name here like you did in the Schunk example. Maybe better do that here also to prevent cargo culters from assuming it's required? Also tidier and less vertical space consumed.
I actually think putting the names is a better example for users. Having a giant pile of curly braces with no typename hints is optimizing for ease of writing instead of ease of reading. We want skimming the params to be easy without needing to open the definitions to infer what everything is.
systems/primitives/test/selector_test.cc
line 266 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW I'm not sure it's a plus that you can leave the offset unspecified to get a 0.
Meh. The cure is worse than the disease. Preventing errors with the compiler is a brutal tool, and not particularly portable to our users' primary language (Python).
bindings/pydrake/systems/test/primitives_test.py
line 443 at r4 (raw file):
Previously, sherm1 (Michael Sherman) wrote…
BTW just want to confirm that a "doesn't blow up" test is sufficient here. It wasn't obvious to me that we would known whether the Python output_selections data structure is conveyed correctly in to C++. But I'll take your word for it!
This particular test case is fine. It's not possible for the constructor binding to exist but not take the params.
The only question would be if the test_selector_params
test case sufficiently probed the Params object for correctness. I've shored that up by adding a more specific test of repr.
a75e192
to
81af03c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful, thanks! I love the consolidation optimization. Feature
+@sammy-tri for platform review per rotation, please
Reviewed 6 of 8 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status: 2 unresolved discussions, LGTM missing from assignee sammy-tri(platform)
systems/primitives/selector_internal.h
line 35 at r6 (raw file):
}; /* The group off all recipe elements for a single pair of input and output
typo: off -> of
systems/primitives/selector.cc
line 167 at r6 (raw file):
for (const SelectorProgram::Element& element : elements) { for (int j = 0; j < element.count; ++j) { output_block[element.output_start + j] = input[element.input_start + j];
BTW this is a nice "best practices" optimization! Copying contiguous source into contiguous destination is always the right thing to do and gets better as the vectors get longer.
81af03c
to
5aeece4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: LGTM missing from assignee sammy-tri(platform)
Is your feature request related to a problem? Please describe.
When wiring up all of the various MultibodyPlant input and output ports to controllers, it's often the case that we need to multiplex and/or split up the data on those ports. Sometimes we need to stack per-model q, v, or x together (e.g., arm model + gripper model), sometimes we need to slice out actuated dofs from unactuated dofs, sometimes we need to interleave q with v, sometimes we need to switch between velocity-based indexing and actuator-based indexing, or any number of other permutations.
Each time this comes up, of course it's possible to write a custom LeafSystem to do the exact rearrangement that's needed, or put lots of Multiplexer and Demultiplexer systems in series, or use a SparseMatrixGain to formulate the selection as a matrix product (for each necessary output port, one at a time), but all of that is extremely awkward.
Describe the solution you'd like
We should offer a primitive system that can perform arbitrary vector slicing / selection, with multiple input ports, multiple output ports, and any element of any output port can be mapped to any element of any input port.
Note that we do not need scaling. If there is scaling required, a Gain or MatrixGain or SparseMatrixGain system can be used in front of or behind the Selector.
With a bow to Simulink, we should call this system a
Selector
.Describe alternatives you've considered
Writing copy-paste systems for the acute purpose every time (and binding them in Python, and testing them, and optimizing them for performance one by one, and etc.).
Additional context
Also, by making the constructor
...Params
a serializable first-class object, downstream users can still write customized and easily-tested recipes for preparing the selection recipe they need (only operating on the params object), and then later hand it off to Selector system for efficient execution at runtime. This neatly separates the concerns of "carefully specify and choose the recipe" from "efficiently execute the recipe at runtime". If we find ways to speed up the selection implementation down the road, all users will benefit without needing to copy-paste the fix into their custom code.This change isdata:image/s3,"s3://crabby-images/d0bb7/d0bb7f7625ca5bf5c3cf7a2b7a514cf841ab8395" alt="Reviewable"