Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add interceptor for collecting client metrics #4021

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

landonxjames
Copy link
Contributor

Motivation and Context

Description

Testing

Checklist

  • For changes to the smithy-rs codegen or runtime crates, I have created a changelog entry Markdown file in the .changelog directory, specifying "client," "server," or both in the applies_to key.
  • For changes to the AWS SDK, generated SDK code, or SDK runtime crates, I have created a changelog entry Markdown file in the .changelog directory, specifying "aws-sdk-rust" in the applies_to key.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

Remove use of SystemTime.elapsed(), fix private type in pub docs, and
style updates
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@@ -53,3 +53,6 @@ pub mod sdk_feature;

/// Smithy support-code for code generated waiters.
pub mod waiters;

/// Interceptor for collecting client metrics.
pub mod metrics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the metrics module need to be pub (maybe not in this PR but in the future)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale on why it's public?

Copy link
Contributor

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start but I have some correctness/approach concerns.

Also unrelated to metrics, (and this can be a follow on PR) we probably want to try to align some of our spans with the SRA as well before we ship all this.

e.g. the root operation span for all smithy services should include attributes rpc.service and rpc.method (AWS SDK services would also set rpc.system to aws-api which would need to happen in codegen)

@@ -53,3 +53,6 @@ pub mod sdk_feature;

/// Smithy support-code for code generated waiters.
pub mod waiters;

/// Interceptor for collecting client metrics.
pub mod metrics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale on why it's public?

/// Struct to hold metric data in the ConfigBag
#[derive(Debug, Clone)]
struct MeasurementsContainer {
call_start: SystemTime,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably use a monotonic clock (i.e. Instant)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see SharedTimeSource returns SystemTime...that is unfortunate. 🤔

@@ -275,20 +300,24 @@ pub fn default_plugins(
let behavior_version = params
.behavior_version
.unwrap_or_else(BehaviorVersion::latest);
let service_name = params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is the service name is probably not what we want even though it's what the default is and likely to work (for now anyway).

default_sleep_impl_plugin(),
default_time_source_plugin(),
default_timeout_config_plugin(),
enforce_content_length_runtime_plugin(),
default_stalled_stream_protection_config_plugin_v2(behavior_version),
metrics_runtime_plugin(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to quite work the way we want.

The OperationBuilder isn't used in generated code and we apply default plugins when we create the client (see also here).

At that point we won't have the operation name.

We also want to be very careful to create the instruments and have them live on the client (i.e. we don't want them created per/operation, the attributes (dimensions) recorded with the metric will differentiate operations/services/etc). Default runtime plugins sort of gets us there (though you have to understand it's used differently by codegen than it is say by the runtime (e.g. IMDS cred provider in aws-config).

pub(crate) fn new() -> Result<Self, ObservabilityError> {
let meter = get_telemetry_provider()?
.meter_provider()
.get_meter("MetricsInterceptor", None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scope is wrong, it should be something like the service name

e.g. in Kotlin when the client is instantiated:

    private val telemetryScope = "aws.sdk.kotlin.services.s3"
    private val opMetrics = OperationMetrics(telemetryScope, config.telemetryProvider)

SRA doesn't talk about this too much but we should probably pick a scope that identifies it's the Rust SDK + the service so aws_sdk_s3::Client or something, we can discuss what makes sense though.


impl MetricsInterceptor {
pub(crate) fn new(
service_name: impl Into<Cow<'static, str>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should just make service and operation name always available to the interceptor context so that any interceptor can pull them out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants