-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-5343] Isolate profiling native extension native symbols #2003
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When Ruby loads a native extension (using `require`), it uses `dlopen(..., RTLD_LAZY | RTLD_GLOBAL)` (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362). This means that every symbol exposed directly or indirectly by that native extension becomes visible to every other extension in the Ruby process. This can cause issues, see rubyjs/mini_racer#179. Ruby's extension loading mechanism is not configurable -- there's no way to tell it to use different flags when calling `dlopen`. To get around this, this commit introduces introduces another extension (profiling loader) which has only a single responsibility: mimic Ruby's extension loading mechanism, but when calling `dlopen` use a different set of flags. This idea was shamelessly stolen from @lloeki's work in rubyjs/mini_racer#179, big thanks! Note that, that we know of, the profiling native extension only exposes one potentially problematic symbol: `rust_eh_personality` (coming from libddprof/libdatadog). Future versions of Rust have been patched not to expose this (see rust-lang/rust#95604 (comment)) so we may want to revisit the need for this loader in the future, and perhaps delete it if we no longer require its services :) Testing the `ddtrace_profiling_loader.c`'s failure handling paths is a bit non-trivial, so here's how I did it manually (some logs abridged): ```bash # 1️⃣ Validate everything's working fine: $ ruby -v ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin20] $ bundle rake clean compile ... $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil' D, [2022-05-06T12:17:41.554770 #19686] DEBUG -- ddtrace: [ddtrace] (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:340:in `startup!') Profiling started' # 2️⃣ Force library load failure: # (Delete the library version for the Ruby I'm using) $ rm lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil' W, [2022-05-06T12:19:23.891711 #21159] WARN -- ddtrace: [ddtrace] (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to dlopen(/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/../../ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle, 5): image not found' at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>'' # 3️⃣ Force a library incompatibility: # (Copy library compiled with a different Ruby version) $ cp lib/ddtrace_profiling_native_extension.3.2.0_x86_64-darwin20.bundle lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil' W, [2022-05-06T12:22:24.036956 #21773] WARN -- ddtrace: [ddtrace] (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to library was compiled and linked to a different Ruby version' at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>' # 4️⃣ Force an initialization failure: # (Restore library clobbered in previous steps) $ bundle rake clean compile ... # (Rename expected initialization function) $ git diff -U0 lib/datadog/profiling/load_native_extension.rb diff --git a/lib/datadog/profiling/load_native_extension.rb b/lib/datadog/profiling/load_native_extension.rb index 5ea4bb32c..af964bade 100644 --- a/lib/datadog/profiling/load_native_extension.rb +++ b/lib/datadog/profiling/load_native_extension.rb @@ -18 +18 @@ full_file_path = "#{__dir__}/../../#{extension_name}.#{RbConfig::CONFIG['DLEXT'] -init_function_name = "Init_#{extension_name.split('.').first}" +init_function_name = "Init_#{extension_name.split('.').first}_kaboom $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil' W, [2022-05-06T12:27:55.706221 #23036] WARN -- ddtrace: [ddtrace] (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to dlsym(0x7fe144151630, Init_ddtrace_profiling_native_extension_kaboom): symbol not found' at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>'' ```
marcotc
approved these changes
May 6, 2022
lloeki
approved these changes
May 9, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
ivoanjo
added a commit
that referenced
this pull request
Feb 6, 2025
**What does this PR do?** This PR removes the `datadog_profiling_loader` native extension, as it's no longer needed. The profiling native loader was added in #2003 . Quoting from that PR's description: > When Ruby loads a native extension (using `require`), it uses > `dlopen(..., RTLD_LAZY | RTLD_GLOBAL)` > (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362). > > This means that every symbol exposed directly or indirectly by that > native extension becomes visible to every other extension in the > Ruby process. This can cause issues, see > rubyjs/mini_racer#179 . > > Ruby's extension loading mechanism is not configurable -- there's > no way to tell it to use different flags when calling `dlopen`. > To get around this, this commit introduces introduces another > extension (profiling loader) which has only a single responsibility: > mimic Ruby's extension loading mechanism, but when calling > `dlopen` use a different set of flags. > > This idea was shamelessly stolen from @lloeki's work in > rubyjs/mini_racer#179, big thanks! ...and importantly ends with: > Note that, that we know of, the profiling native extension only > exposes one potentially problematic symbol: > `rust_eh_personality` (coming from libddprof/libdatadog). > > Future versions of Rust have been patched not to expose this > (see rust-lang/rust#95604 (comment)) > so we may want to revisit the need for this loader in the future, > and perhaps delete it if we no longer require its services :) And we have reached the situation predicted in that description: 1. Nowadays libdatadog no longer exposes `rust_eh_personality` 2. We have a test that validates that only expected symbols are exported by the libdatadog library (see DataDog/libdatadog#573 ). Any new symbols that show up would break shipping new libdatadog versions to rubygems.org until we review them. 3. The `libdatadog_api` extension, which we've been shipping for customers since release 2.3.0 back in July 2024 has always been loaded directly without a loader without issues. Thus, I think it's the right time to get rid of the loader. **Motivation:** Having the loader around is not zero cost; we've run into/caused a few issues ( #2250 and #3582 come to mind). It also adds overhead to development: every time we need to rebuild the extensions, it's an extra extension that needs to be prepared, rebuilt, copied, etc. I've been meaning to get rid of the loader for some time now, and this came up again this week so I decided it was time to do it. **Additional Notes:** In the future, we can always ressurrect this approach if we figure out we need it again. We've also discussed internally about proposing upstream to the Ruby VM to maybe add an API to do what the loader was doing, so that we wouldn't need the weird workaround. We haven't yet decided if we're going to do that (help welcome!). **How to test the change?** The loader was responsible for loading the rest of profiling. Thus, any test that uses profiling was also validating the loader and now will validate that we're doing fine without it. TL;DR green CI is good.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When Ruby loads a native extension (using
require
), it usesdlopen(..., RTLD_LAZY | RTLD_GLOBAL)
(https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).This means that every symbol exposed directly or indirectly by that native extension becomes visible to every other extension in the Ruby process. This can cause issues, see rubyjs/mini_racer#179.
Ruby's extension loading mechanism is not configurable -- there's no way to tell it to use different flags when calling
dlopen
. To get around this, this commit introduces introduces another extension (profiling loader) which has only a single responsibility: mimic Ruby's extension loading mechanism, but when callingdlopen
use a different set of flags.This idea was shamelessly stolen from @lloeki's work in rubyjs/mini_racer#179, big thanks!
Note that, that we know of, the profiling native extension only exposes one potentially problematic symbol:
rust_eh_personality
(coming from libddprof/libdatadog).Future versions of Rust have been patched not to expose this (see rust-lang/rust#95604 (comment)) so we may want to revisit the need for this loader in the future, and perhaps delete it if we no longer require its services :)
Testing the
ddtrace_profiling_loader.c
's failure handling paths is a bit non-trivial, so here's how I did it manually (some logs abridged):