Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-5343] Isolate profiling native extension native symbols #2003

Merged
merged 1 commit into from
May 9, 2022

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented May 6, 2022

When Ruby loads a native extension (using require), it uses dlopen(..., RTLD_LAZY | RTLD_GLOBAL) (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).

This means that every symbol exposed directly or indirectly by that native extension becomes visible to every other extension in the Ruby process. This can cause issues, see rubyjs/mini_racer#179.

Ruby's extension loading mechanism is not configurable -- there's no way to tell it to use different flags when calling dlopen. To get around this, this commit introduces introduces another extension (profiling loader) which has only a single responsibility: mimic Ruby's extension loading mechanism, but when calling dlopen use a different set of flags.

This idea was shamelessly stolen from @lloeki's work in rubyjs/mini_racer#179, big thanks!

Note that, that we know of, the profiling native extension only exposes one potentially problematic symbol: rust_eh_personality (coming from libddprof/libdatadog).

Future versions of Rust have been patched not to expose this (see rust-lang/rust#95604 (comment)) so we may want to revisit the need for this loader in the future, and perhaps delete it if we no longer require its services :)

Testing the ddtrace_profiling_loader.c's failure handling paths is a bit non-trivial, so here's how I did it manually (some logs abridged):

 # 1️⃣  Validate everything's working fine:

 $ ruby -v
 ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin20]
 $ bundle rake clean compile
 ...
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 D, [2022-05-06T12:17:41.554770 #19686] DEBUG -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:340:in `startup!') Profiling started'

 # 2️⃣  Force library load failure:

 # (Delete the library version for the Ruby I'm using)
 $ rm lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:19:23.891711 #21159]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to
 dlopen(/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/../../ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle, 5):
 image not found' at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''

 # 3️⃣  Force a library incompatibility:

 # (Copy library compiled with a different Ruby version)
 $ cp lib/ddtrace_profiling_native_extension.3.2.0_x86_64-darwin20.bundle lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:22:24.036956 #21773]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to library was compiled and linked to a different Ruby version'
 at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>'

 # 4️⃣  Force an initialization failure:

 # (Restore library clobbered in previous steps)
 $ bundle rake clean compile
 ...
 # (Rename expected initialization function)
 $ git diff -U0 lib/datadog/profiling/load_native_extension.rb
 diff --git a/lib/datadog/profiling/load_native_extension.rb b/lib/datadog/profiling/load_native_extension.rb
 index 5ea4bb32c..af964bade 100644
 --- a/lib/datadog/profiling/load_native_extension.rb
 +++ b/lib/datadog/profiling/load_native_extension.rb
 @@ -18 +18 @@ full_file_path = "#{__dir__}/../../#{extension_name}.#{RbConfig::CONFIG['DLEXT']
 -init_function_name = "Init_#{extension_name.split('.').first}"
 +init_function_name = "Init_#{extension_name.split('.').first}_kaboom
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:27:55.706221 #23036]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to dlsym(0x7fe144151630,
   Init_ddtrace_profiling_native_extension_kaboom): symbol not found' at
 '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''

When Ruby loads a native extension (using `require`), it uses
`dlopen(..., RTLD_LAZY | RTLD_GLOBAL)`
(https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).

This means that every symbol exposed directly or indirectly by that native
extension becomes visible to every other extension in the Ruby process. This can
cause issues, see rubyjs/mini_racer#179.

Ruby's extension loading mechanism is not configurable -- there's no way to tell
it to use different flags when calling `dlopen`. To get around this, this
commit introduces introduces another extension (profiling loader) which has only
a single responsibility: mimic Ruby's extension loading mechanism, but when
calling  `dlopen` use a different set of flags.

This idea was shamelessly stolen from @lloeki's work in
rubyjs/mini_racer#179, big thanks!

Note that, that we know of, the profiling native extension only exposes one
potentially problematic symbol: `rust_eh_personality`
(coming from libddprof/libdatadog).

Future versions of Rust have been patched not to expose this
(see rust-lang/rust#95604 (comment)) so we
may want to revisit the need for this loader in the future, and perhaps delete
it if we no longer require its services :)

Testing the `ddtrace_profiling_loader.c`'s failure handling paths is a bit
non-trivial, so here's how I did it manually (some logs abridged):

```bash
 # 1️⃣  Validate everything's working fine:

 $ ruby -v
 ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin20]
 $ bundle rake clean compile
 ...
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 D, [2022-05-06T12:17:41.554770 #19686] DEBUG -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:340:in `startup!') Profiling started'

 # 2️⃣  Force library load failure:

 # (Delete the library version for the Ruby I'm using)
 $ rm lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:19:23.891711 #21159]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to
 dlopen(/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/../../ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle, 5):
 image not found' at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''

 # 3️⃣  Force a library incompatibility:

 # (Copy library compiled with a different Ruby version)
 $ cp lib/ddtrace_profiling_native_extension.3.2.0_x86_64-darwin20.bundle lib/ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20.bundle
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:22:24.036956 #21773]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to library was compiled and linked to a different Ruby version'
 at '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>'

 # 4️⃣  Force an initialization failure:

 # (Restore library clobbered in previous steps)
 $ bundle rake clean compile
 ...
 # (Rename expected initialization function)
 $ git diff -U0 lib/datadog/profiling/load_native_extension.rb
 diff --git a/lib/datadog/profiling/load_native_extension.rb b/lib/datadog/profiling/load_native_extension.rb
 index 5ea4bb32c..af964bade 100644
 --- a/lib/datadog/profiling/load_native_extension.rb
 +++ b/lib/datadog/profiling/load_native_extension.rb
 @@ -18 +18 @@ full_file_path = "#{__dir__}/../../#{extension_name}.#{RbConfig::CONFIG['DLEXT']
 -init_function_name = "Init_#{extension_name.split('.').first}"
 +init_function_name = "Init_#{extension_name.split('.').first}_kaboom
 $ DD_TRACE_DEBUG=true DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e 'nil'
 W, [2022-05-06T12:27:55.706221 #23036]  WARN -- ddtrace: [ddtrace]
 (/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/core/configuration/components.rb:345:in `startup!')
 Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to
 'Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-darwin20 due to dlsym(0x7fe144151630,
   Init_ddtrace_profiling_native_extension_kaboom): symbol not found' at
 '/Users/ivo.anjo/datadog/dd-trace-rb/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''
```
@ivoanjo ivoanjo requested review from a team, lloeki and luhenry May 6, 2022 11:55
Copy link
Member

@lloeki lloeki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ivoanjo ivoanjo merged commit 11adfff into master May 9, 2022
@ivoanjo ivoanjo deleted the ivoanjo/prof-5343-isolate-symbols-v2 branch May 9, 2022 13:07
@github-actions github-actions bot added this to the 1.1.0 milestone May 9, 2022
ivoanjo added a commit that referenced this pull request Feb 6, 2025
**What does this PR do?**

This PR removes the `datadog_profiling_loader` native extension, as
it's no longer needed.

The profiling native loader was added in
#2003 .

Quoting from that PR's description:

> When Ruby loads a native extension (using `require`), it uses
> `dlopen(..., RTLD_LAZY | RTLD_GLOBAL)`
> (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).
>
> This means that every symbol exposed directly or indirectly by that
> native extension becomes visible to every other extension in the
> Ruby process. This can cause issues, see
> rubyjs/mini_racer#179 .
>
> Ruby's extension loading mechanism is not configurable -- there's
> no way to tell it to use different flags when calling `dlopen`.
> To get around this, this commit introduces introduces another
> extension (profiling loader) which has only a single responsibility:
> mimic Ruby's extension loading mechanism, but when calling
> `dlopen` use a different set of flags.
>
> This idea was shamelessly stolen from @lloeki's work in
> rubyjs/mini_racer#179, big thanks!

...and importantly ends with:

> Note that, that we know of, the profiling native extension only
> exposes one potentially problematic symbol:
> `rust_eh_personality` (coming from libddprof/libdatadog).
>
> Future versions of Rust have been patched not to expose this
> (see rust-lang/rust#95604 (comment))
> so we may want to revisit the need for this loader in the future,
> and perhaps delete it if we no longer require its services :)

And we have reached the situation predicted in that description:

1. Nowadays libdatadog no longer exposes `rust_eh_personality`

2. We have a test that validates that only expected symbols are
   exported by the libdatadog library
   (see DataDog/libdatadog#573 ).

   Any new symbols that show up would break shipping new
   libdatadog versions to rubygems.org until we review them.

3. The `libdatadog_api` extension, which we've been shipping for
   customers since release 2.3.0 back in July 2024 has always been
   loaded directly without a loader without issues.

Thus, I think it's the right time to get rid of the loader.

**Motivation:**

Having the loader around is not zero cost; we've run into/caused
a few issues ( #2250
and #3582 come to mind).

It also adds overhead to development: every time we need to rebuild
the extensions, it's an extra extension that needs to be prepared,
rebuilt, copied, etc.

I've been meaning to get rid of the loader for some time now,
and this came up again this week so I decided it was time to
do it.

**Additional Notes:**

In the future, we can always ressurrect this approach if we
figure out we need it again.

We've also discussed internally about proposing upstream
to the Ruby VM to maybe add an API to do what the loader was
doing, so that we wouldn't need the weird workaround.

We haven't yet decided if we're going to do that (help welcome!).

**How to test the change?**

The loader was responsible for loading the rest of profiling.

Thus, any test that uses profiling was also validating the loader
and now will validate that we're doing fine without it.

TL;DR green CI is good.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants