Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removal of btf.Program #522

Closed
ti-mo opened this issue Dec 9, 2021 · 2 comments
Closed

Removal of btf.Program #522

ti-mo opened this issue Dec 9, 2021 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@ti-mo
Copy link
Collaborator

ti-mo commented Dec 9, 2021

Currently, a program's BTF info is kept in a separate structure, pointed to by ebpf.ProgramSpec's BTF field. Now the use case for BTF has grown as substantially as it did, it's time to merge this data structure into the core types for ease of maintenance.

Having this in a separate structure is both convenient (checking BTF presence is checking a single value) and inconvenient in some ways, but overall it's a burden to have the data spread out between two structs, especially when it comes to marshaling BTF info back into binary form.

My initial idea is to move things around as follows:

  • FuncInfo: remove in favor of a btf.TypeID field in ebpf.ProgramSpec
  • LineInfos: resolve file and lineinfo into strings, move to ebpf.ProgramSpec, exported (?)
  • CoreRelos: not quite sure what to do with this yet, could be moved and unexported initially

#508 already made some small steps to remove the need for FuncInfo and LineInfo to resemble their wire formats so closely. Ultimately, we should end up with something like []*ProgramSpec.MarshalFuncInfos().

More suggestions:

@lmb
Copy link
Collaborator

lmb commented Dec 13, 2021

We could solve many of these with flexible metadata in asm.Instruction, then concatenation of asm.Instructions does the right thing by default. There is a proof of concept for per insn metadata in 483ca4a, but I think a real implementation needs to be more flexible, instructionMeta should probably be a map[interface{}]interface{} and work similarly to context.Value. We'd also need copy on write semantics. With that in place:

  • LineInfos could become metadata on asm.Instruction pretty easily.
  • FuncInfo could also become per insn metadata, but this is less straighforward.
  • CoreRelos could also become per insn metadata, but we'd still have to collect all relos into a slice, see coreRelocate.
  • We could solve Map file descriptor is closed after RewriteMaps #515 and LoadAndAssign fails after RewriteMaps #517 by adding metadata that references a *Map. This could also allow NewProgram(CollectionSpec.Programs["foo"]) to resolve maps, which currently returns an error.

We shouldn't support the user creating FuncInfo etc. metadata to start with. FuncInfo, LineInfo is basically kernel wire format, and we shouldn't really be exposing that.

The downside to this schema is that per insn metadata only works well if most instructions don't have metadata, to keep the overhead low. It's also pretty pointer / indirection heavy. However stuffing things like uint64 into an interface{} is probably pretty cheap since we don't need memory beyond the two words for the interface. And for certain things like *Map we can't avoid the pointer in the first place, since we need it to prevent the finalizer from running. Put another way: in terms of efficiency we can't get much better than []FuncInfo, etc. along side asm.Instructions, but that comes at a code complexity cost since we need to keep track of function offsets manually.

@lmb
Copy link
Collaborator

lmb commented Dec 17, 2021

I've put my WIP metadata implementation here: https://github.com/lmb/ebpf/commits/issue-515-instruction-meta

@lmb lmb self-assigned this Mar 31, 2022
@lmb lmb added this to the Export `btf` milestone Apr 26, 2022
lmb added a commit to lmb/ebpf that referenced this issue May 4, 2022
Use per-instruction metadata to store func info, line info and CO-RE relocations.
As a result, simply concatenating two Instruction slices preserves the ext_info
without extra code to keep track of offsets. The simplest way to achieve is to
assign ext_infos per section, before we split into individual functions. This
also avoids having to split ext_infos in the first place.

Storing ext_infos in metadata in turn allows / requires removing code that relies
on stable offsets, the most notable being applying COREFixups. Instead of tracking
which offset a fixup should be applied to we change coreRelocate to instead return
results in the same order as CORERelocations are passed. In the caller we remember
which instruction originated the relocation and apply the fixup directly instead of
iterating all instructions once more.

Instead of storing ExtInfos in Spec we split it off into a separate exported type.
This makes more sense since every BTF ELF has a Spec, but ExtInfos are optional. It
turns out that only the ELF loader doesn't care about ExtInfos in the first place,
so we allow skipping ExtInfos altogether by introducing LoadSpecAndExtInfosFromReader.

Finally we can remove btf.Program since we don't need an intermediate type to hold
metadata anymore.

Fixes cilium#522
lmb added a commit to lmb/ebpf that referenced this issue May 4, 2022
Use per-instruction metadata to store func info, line info and CO-RE relocations.
As a result, simply concatenating two Instruction slices preserves the ext_info
without extra code to keep track of offsets. The simplest way to achieve is to
assign ext_infos per section, before we split into individual functions. This
also avoids having to split ext_infos in the first place.

Storing ext_infos in metadata in turn allows / requires removing code that relies
on stable offsets, the most notable being applying COREFixups. Instead of tracking
which offset a fixup should be applied to we change coreRelocate to instead return
results in the same order as CORERelocations are passed. In the caller we remember
which instruction originated the relocation and apply the fixup directly instead of
iterating all instructions once more.

Instead of storing ExtInfos in Spec we split it off into a separate exported type.
This makes more sense since every BTF ELF has a Spec, but ExtInfos are optional. It
turns out that only the ELF loader doesn't care about ExtInfos in the first place,
so we allow skipping ExtInfos altogether by introducing LoadSpecAndExtInfosFromReader.

Finally we can remove btf.Program since we don't need an intermediate type to hold
metadata anymore.

Fixes cilium#522
lmb added a commit to lmb/ebpf that referenced this issue May 4, 2022
Use per-instruction metadata to store func info, line info and CO-RE relocations.
As a result, simply concatenating two Instruction slices preserves the ext_info
without extra code to keep track of offsets. The simplest way to achieve is to
assign ext_infos per section, before we split into individual functions. This
also avoids having to split ext_infos in the first place.

Storing ext_infos in metadata in turn allows / requires removing code that relies
on stable offsets, the most notable being applying COREFixups. Instead of tracking
which offset a fixup should be applied to we change coreRelocate to instead return
results in the same order as CORERelocations are passed. In the caller we remember
which instruction originated the relocation and apply the fixup directly instead of
iterating all instructions once more.

Instead of storing ExtInfos in Spec we split it off into a separate exported type.
This makes more sense since every BTF ELF has a Spec, but ExtInfos are optional. It
turns out that only the ELF loader doesn't care about ExtInfos in the first place,
so we allow skipping ExtInfos altogether by introducing LoadSpecAndExtInfosFromReader.

Finally we can remove btf.Program since we don't need an intermediate type to hold
metadata anymore.

Fixes cilium#522
@lmb lmb closed this as completed in 8c29b7a May 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants