Upgrade Telemetry to V2 #3551

marcotc · 2024-03-21T23:49:56Z

This PR upgrades ddtrace to send Telemetry in the V2 schema version, instead of V1.

Also, this PR cleans up a lot of the existing telemetry code, simplifying its implementation and increasing type checking.

Done

System tests updated and passing: Upgrade Ruby to telemetry v2 system-tests#2266
Update unit tests.

Additional Notes:

How to test the change?

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.

lib/datadog/core/telemetry/event.rb

marcotc · 2024-03-26T22:11:33Z

lib/datadog/core/configuration.rb

              components
            end
          )
        end

+        components.telemetry.started! if start_telemetry


Telemetry doesn't have to start inside the global. In fact, while refactoring it, I triggered a deadlock a few times when trying to read global information for telemetry purposes.
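A minimal sketch of how this kind of deadlock can arise. It assumes, which this PR does not confirm, that the global state is guarded by a non-reentrant `Mutex`; all names below are hypothetical.

```ruby
# Hypothetical lock standing in for whatever guards the global state.
SKETCH_LOCK = Mutex.new

# Any reader of global information takes the lock.
def sketch_read_global_setting
  SKETCH_LOCK.synchronize { :some_setting }
end

# Starting telemetry while still holding the lock: the reader tries to take
# the same lock again, and Ruby's Mutex raises ThreadError on recursive locking.
def sketch_start_inside_lock
  SKETCH_LOCK.synchronize { sketch_read_global_setting }
end

# Starting telemetry after the lock is released: the reader acquires it cleanly.
def sketch_start_outside_lock
  SKETCH_LOCK.synchronize { :build_components }
  sketch_read_global_setting
end

inside_error = nil
begin
  sketch_start_inside_lock
rescue ThreadError => e
  inside_error = e
end

outside_result = sketch_start_outside_lock
```

With two threads and a blocking resource the same mistake can hang rather than raise, which is why moving the start call out of the guarded block is the safer arrangement.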

marcotc · 2024-03-26T22:12:24Z

lib/datadog/core/configuration/option.rb

@@ -303,10 +303,6 @@ def skip_validation?
          ['true', '1'].include?(ENV.fetch('DD_EXPERIMENTAL_SKIP_CONFIGURATION_VALIDATION', '').strip)
        end

-        # Used for testing


The time has come for someone besides tests to use this data!

marcotc · 2024-03-26T22:13:41Z

integration/apps/rails-five/app/controllers/health_controller.rb

@@ -16,7 +16,6 @@ def detailed_check
      profiler_threads: Thread.list.map(&:name).select { |it| it && it.include?('Profiling') },
      telemetry_enabled: Datadog.configuration.telemetry.enabled,
      telemetry_client_enabled: Datadog.send(:components).telemetry.enabled,
-      telemetry_worker_enabled: Datadog.send(:components).telemetry.worker.enabled?


The internal worker instance is now private: it's an implementation concern.

lib/datadog/core/configuration/settings.rb

delner

Added some questions.

delner · 2024-03-27T14:34:58Z

lib/datadog/core/configuration/option.rb

@@ -8,7 +8,7 @@ module Configuration
      # Represents an instance of an integration configuration option
      # @public_api
      class Option
-        attr_reader :definition
+        attr_reader :definition, :precedence_set


What is precedence_set?

This is the effective precedence of the value this option is currently set to.
For example, if this option was never set, the precedence will be "default".
If the same option is then set by programmatic configuration (Datadog.configure{}), it will change to "programmatic".
Then, again, if it is overwritten by remote configuration, it will change to "remote configuration".

Now, the interesting part is that if that same option is then set again by programmatic configuration, it will not change value, given remote configuration has a higher precedence.
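A minimal sketch of that behavior, assuming a simplified model. `SketchOption`, its precedence names, and their numeric ranks are hypothetical and are not the actual `Option` implementation.

```ruby
# Hypothetical, simplified model of option precedence.
class SketchOption
  # Higher rank wins. The names and ranks are assumptions for illustration.
  PRECEDENCE = { default: 0, programmatic: 1, remote_configuration: 2 }.freeze

  attr_reader :value, :precedence_set

  def initialize(default)
    @value = default
    @precedence_set = :default
  end

  # Applies the new value only if its source ranks equal to or higher than
  # the source of the value currently in effect.
  def set(value, precedence:)
    return @value if PRECEDENCE.fetch(precedence) < PRECEDENCE.fetch(@precedence_set)

    @precedence_set = precedence
    @value = value
  end
end

option = SketchOption.new(1.0)                       # precedence_set: :default
option.set(0.5, precedence: :programmatic)           # applied
option.set(0.1, precedence: :remote_configuration)   # applied
option.set(0.9, precedence: :programmatic)           # ignored: lower precedence
```

After the last call, `option.precedence_set` still reports `:remote_configuration`, which is the information telemetry can use to report where a value came from.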

Let me know if this makes sense. Also, I expanded the in-line documentation of this class regarding precedence.

On another topic, it probably makes sense to simply rename it to precedence, what do you think? I would probably do that in a separate PR, as it will touch many files unrelated to this PR, but it's a very simple rename to do as a follow-up.

Gotcha. Was just curious. No strong opinions at this point, so if you're confident on this, I'm interested to see how it works.

delner · 2024-03-27T14:37:22Z

lib/datadog/core/telemetry/client.rb

@@ -37,21 +42,24 @@ def disable!
        def started!
          return if !@enabled || forked?

-          res = @emitter.request(:'app-started')
+          res = @emitter.request(Event::AppStarted.new)


I like the use of a class. Nit, but does it make sense to initialize an object?

This pattern of initializing an object makes sense for the more complex telemetry events, like AppClientConfigurationChange:

dd-trace-rb/lib/datadog/core/telemetry/client.rb

Line 82 in fb2ca4d

@emitter.request(Event::AppClientConfigurationChange.new(changes, 'remote_config'))

AppClientConfigurationChange needs information to be provided in two different contexts: first it needs to know what has changed and the source of that change (changes, 'remote_config'), then later it needs to receive the auto-incrementing seq_id:

dd-trace-rb/lib/datadog/core/telemetry/request.rb

Line 20 in fb2ca4d

payload: event.payload(seq_id),

Because these two sets of information exist in different contexts, I need a way to store the first set (changes, 'remote_config') and then merge it with the second set (seq_id).
So in this case, it makes sense to initialize the class and then call a method accepting a parameter with the remaining information.

That being said, most events are simple, like you mention for AppStarted, and don't actually need two sets of data to build the event, but in order to keep the API consistent, I had to add this flexibility to all events.
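A rough sketch of the two-step shape described above. The class names, the request type strings, and the payload field layout are assumptions for illustration, not the Telemetry V2 schema or the code in this PR.

```ruby
# Hypothetical event that carries state from the call site.
class SketchConfigurationChange
  # First context: what changed and where the change came from.
  def initialize(changes, origin)
    @changes = changes
    @origin = origin
  end

  def type
    'app-client-configuration-change'
  end

  # Second context: the sequence ID is only known when the request is built.
  def payload(seq_id)
    {
      configuration: @changes.map { |name, value|
        { name: name, value: value, origin: @origin, seq_id: seq_id }
      }
    }
  end
end

# Hypothetical simple event: no state, but the same interface.
class SketchAppStarted
  def type
    'app-started'
  end

  def payload(_seq_id)
    {}
  end
end

# Hypothetical request builder: treats every event the same way.
def sketch_build_request(event, seq_id)
  { request_type: event.type, seq_id: seq_id, payload: event.payload(seq_id) }
end

change = SketchConfigurationChange.new({ 'some.option' => 0.5 }, 'remote_config')
change_request = sketch_build_request(change, 7)
started_request = sketch_build_request(SketchAppStarted.new, 8)
```

Because both events answer `type` and `payload(seq_id)`, the request builder needs no special case for events that happen to carry no state.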

Let me know if this makes sense or if you have further suggestions.

I see: sometimes it has state attached. I suppose most of these events don't fire often, but if they ever fired as often as traces (a crazy possibility, I know), then I imagine this would apply more pressure to memory.

Still I do like the event as a parameter from a flexibility point of view.

Feedback is not blocking.

delner · 2024-03-27T14:38:59Z

lib/datadog/core/telemetry/event.rb

      class Event
-        include Telemetry::Collector
+        # Base class for all Telemetry V2 events.
+        class Base


Should this be a class? Or could it be a module? (Given it doesn't introduce state.)

I think the hierarchy here is that events are true children of a base class: an event is a subtype of a Base event.

It can technically work as a module, but I think that it would create the misleading impression that events just so happen to respect the Base interface, when in fact all events are a subtype of a Base event.
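For comparison, a small sketch of the two options being discussed. All names are hypothetical, and neither variant is the code in this PR.

```ruby
# Option 1: a base class. Every event is a subtype of SketchBase.
class SketchBase
  def type
    raise NotImplementedError, "#{self.class} must implement #type"
  end

  def payload(_seq_id)
    {}
  end
end

class SketchHeartbeat < SketchBase
  def type
    'app-heartbeat'
  end
end

# Option 2: a module. The event mixes in the shared interface instead.
module SketchEventInterface
  def payload(_seq_id)
    {}
  end
end

class SketchClosing
  include SketchEventInterface

  def type
    'app-closing'
  end
end

heartbeat = SketchHeartbeat.new
closing = SketchClosing.new
```

Both variants give callers the same `type` and `payload(seq_id)` interface, and in both cases Ruby places the shared definition in the event's ancestor chain. The difference is mostly in what the code communicates: `SketchHeartbeat.superclass` is `SketchBase`, while `SketchClosing.superclass` is `Object`.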

I see, this got me thinking about Sandi Metz's take on inheritance/composition. Some notes surrounding her discourse on the subject that I could find:

Deciding Between Inheritance and Composition

Inheritance gives you message delegation for free at the cost of maintaining a class hierarchy. Composition allows objects to have structural independence at the cost of explicit message delegation.

Composition contains far fewer built-in dependencies than inheritance; it is very often the best choice.

With inheritance, a correctly modeled hierarchy will give you the benefit of propagating changes in the base class to all subclasses. This can also be a disadvantage when the hierarchy is modeled incorrectly, as a dramatic change to the base class due to a change in requirements will break subclasses. Inheritance = built-in dependencies.

Enormous, broad-reaching changes of behavior can be achieved with very small changes in code. This is true, for better or for worse, whether you come to regret it or not.

Avoid writing frameworks that require users of your code to subclass your objects in order to gain your behavior. Their application's objects may already be arranged in a hierarchy; inheriting from your framework may not be possible.

On that very last point: is there a case where we might want to have independent trees of event behavior? Not immediately, I don't think, but that would be a strong case for composition over inheritance. To me, it is more a matter of keeping our options open, in accordance with the principle of least harm: composition would be less restraining.

Another way I think about the reason to do one over the other is state (e.g. instance variables).

Does the base class define or maintain state? (Internal or otherwise?)

Does it need some initialization (constructor or otherwise)?

If yes to either, then probably should be inheritance, because modules are not great at maintaining state. Otherwise consider using a module.

Some food for thought. Ultimately, we can always refactor this if it doesn't constitute a breaking change, so no blocker on my side if we can always change later without much pain.

Deciding Between Inheritance and Composition

It seems like the composition that she is discussing is about having a delegator object. And I'm not sure that applies here, because the need is for a consistent interface. There's almost no behaviour being delegated to the base class; it exists for subclasses to respect the interface.

Both classes and modules would fall under inheritance, based on her definition above I believe.

Also, this whole thing is internal and constrained to a single file, so refactoring is trivial, and will likely not even affect unit tests.

delner · 2024-03-27T14:39:47Z

lib/datadog/core/telemetry/event.rb


-        API_VERSION = 'v1'


Is there any versioning of the API within our code?

Yes, it's here:

dd-trace-rb/lib/datadog/core/telemetry/http/ext.rb

Line 18 in fb2ca4d

API_VERSION = 'v2'

delner · 2024-03-27T14:40:54Z

lib/datadog/core/telemetry/http/transport.rb

+              Ext::HEADER_DD_TELEMETRY_API_VERSION => api_version,
+              Ext::HEADER_DD_TELEMETRY_REQUEST_TYPE => request_type,
+              Ext::HEADER_CLIENT_LIBRARY_LANGUAGE => Core::Environment::Ext::LANG,
+              Ext::HEADER_CLIENT_LIBRARY_VERSION => DDTrace::VERSION::STRING,


Just FYI, when ported to 2.x, this will need to be updated.
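For reference, a sketch of how a header hash like the one in this diff could be assembled. The module name, the header string values, the language string, and the version placeholder are all assumptions for illustration; the real values come from `Ext`, `Core::Environment::Ext::LANG`, and `DDTrace::VERSION::STRING`.

```ruby
# Hypothetical stand-ins for the Ext constants; the string values are assumed.
module SketchExt
  HEADER_DD_TELEMETRY_API_VERSION = 'DD-Telemetry-API-Version'
  HEADER_DD_TELEMETRY_REQUEST_TYPE = 'DD-Telemetry-Request-Type'
  HEADER_CLIENT_LIBRARY_LANGUAGE = 'DD-Client-Library-Language'
  HEADER_CLIENT_LIBRARY_VERSION = 'DD-Client-Library-Version'
  API_VERSION = 'v2'
end

# Placeholder for the library version constant referenced in the diff.
SKETCH_LIBRARY_VERSION = '0.0.0'

# Builds the per-request headers; request_type varies per telemetry event.
def sketch_headers(request_type:, api_version: SketchExt::API_VERSION)
  {
    SketchExt::HEADER_DD_TELEMETRY_API_VERSION => api_version,
    SketchExt::HEADER_DD_TELEMETRY_REQUEST_TYPE => request_type,
    SketchExt::HEADER_CLIENT_LIBRARY_LANGUAGE => 'ruby',
    SketchExt::HEADER_CLIENT_LIBRARY_VERSION => SKETCH_LIBRARY_VERSION,
  }
end

headers = sketch_headers(request_type: 'app-started')
```

In this sketch the version is read from a single constant, so a port to another major version would only need to change that one reference.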

marcotc · 2024-03-27T21:52:59Z

~~Taking a look at the system test failures right now.~~ Done!

marcotc · 2024-03-27T22:24:26Z

System tests should now pass.

codecov-commenter · 2024-03-27T22:55:08Z

Codecov Report

Attention: Patch coverage is 96.79012% with 13 lines in your changes are missing coverage. Please review.

Project coverage is 98.23%. Comparing base (c097f90) to head (a2db9d6).
Report is 13 commits behind head on master.

❗ Current head a2db9d6 differs from pull request most recent head 0872d29. Consider uploading reports for the commit 0872d29 to get more accurate results

Files	Patch %	Lines
lib/datadog/core/telemetry/event.rb	92.80%	9 Missing ⚠️
spec/datadog/core/telemetry/client_spec.rb	94.02%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3551      +/-   ##
==========================================
- Coverage   98.27%   98.23%   -0.04%     
==========================================
  Files        1274     1254      -20     
  Lines       75157    74260     -897     
  Branches     3539     3521      -18     
==========================================
- Hits        73857    72952     -905     
- Misses       1300     1308       +8

marcotc · 2024-03-27T23:45:51Z

The last failing system test (test_app_dependencies_loaded_not_sent) will be fixed when DataDog/system-tests#2275 is merged.

github-actions bot added the core Involves Datadog core libraries label Mar 21, 2024

anmarchenko reviewed Mar 22, 2024

lib/datadog/core/telemetry/event.rb (resolved)

marcotc force-pushed the telemetry-2.0 branch from 94977fd to 939509f Compare March 26, 2024 19:20

marcotc mentioned this pull request Mar 26, 2024

Upgrade Ruby to telemetry v2 DataDog/system-tests#2266

Merged

9 tasks

marcotc changed the title ~~Migrate telemetry to 2.0~~ Upgrade Telemetry to V2 Mar 26, 2024

marcotc commented Mar 26, 2024

Upgrade Telemetry to V2

fb2ca4d

marcotc force-pushed the telemetry-2.0 branch from ce3df39 to fb2ca4d Compare March 26, 2024 22:34

marcotc marked this pull request as ready for review March 26, 2024 22:35

marcotc requested a review from a team as a code owner March 26, 2024 22:35

anmarchenko reviewed Mar 27, 2024

lib/datadog/core/configuration/settings.rb (outdated, resolved)

delner reviewed Mar 27, 2024

marcotc added 2 commits March 27, 2024 14:24

Address comments

e73065c

Fix Ruby 2.1 test

f775db5

marcotc added 4 commits March 27, 2024 15:09

Fix integration error type

50bde6b

Remove empty values from telemetry

0112964

Make integration.enabled always populated

85f1bf5

Fix test to match schema changes

a2db9d6

Fix Hash#reject! returning nil

0872d29

marcotc requested a review from delner March 27, 2024 23:46

Merge branch 'master' into telemetry-2.0

5f59ba3

delner approved these changes Apr 3, 2024

marcotc merged commit 65eabdb into master Apr 3, 2024
207 checks passed

marcotc deleted the telemetry-2.0 branch April 3, 2024 18:54

github-actions bot added this to the 1.22.0 milestone Apr 3, 2024

TonyCTHsu mentioned this pull request Apr 16, 2024

Bump to version 1.22.0 #3591

Merged
