-
Notifications
You must be signed in to change notification settings - Fork 11.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#620: Add structured logging to basic transaction flow (Authority/Client) #706
Conversation
…n authority_aggregator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs look great, Evan! I left some optional suggestions. Otherwise, LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, but the doc part is meant to be internal developer doc, i.e. it's for people developing Sui itself rather than on Sui. Is docs
the right place?
/cc @Clay-Mysten
Hi Francois, I ran this past Sam, and he thought it appropriate for the Contribute section of the site aimed at folks looking to augment Sui. I updated our Dev Portal sitemap outline to reflect this: Let me know if that doesn't make sense to you. |
Thanks for the feedback @Clay-Mysten ! Also just added ability to dump span events (start, end, duration) as JSON. Sample output:
|
I agree with not co-mingling these with external developer docs. Maybe a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this! Probably worth also adding a note on how practically one can run an authority and observe its logs (or are they just printing on stderr by default?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super dope! I would pay good money to sit down and watch a demo of this.
Here's the main missing things to me:
- I think the doc section on
## Ways to View Logs, Traces, Metrics
should be expanded. - I don't think the markdown should live in our dev portal, except perhaps in a "for Sui core developers" section (/cc @Clay-Mysten ).
@Clay-Mysten I've moved the doc to a new Thanks everyone for the feedback. I'm planning to restructure the "ways to view logs, traces, etc." section and write some stuff on the basic architecture of it, but leave the actual implementation of subscriber/viewing options to a later PR, in favour of getting this in faster. |
Just noticed |
Ok I just saw this in the doc
most of the output of sui start and genesis are informative and are info level logging, do you think we should print INFO logs by default? |
@patrickkuo Let me fix it in an upcoming PR, am looking into it |
Context: The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic. The proper fix us to use the `BlockWaiter`. The issue: We would like a mitigation to be deployed and effective sooner. The fix: We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
) We operate an executor with a bound on the concurrent number of messages (see #463, #559, #706). PR #472 added logging for the bound being hit. We expect the executors to operate for a long time at this limit (e.g. in recovery situation). The spammy logging is not usfeful This removes the logging of the concurrency bound being hit. Fixes #759
Context: The `BatchLoader` is establishing a primary->worker communication using a public address, which is adding latency and competing with outbound traffic. The proper fix us to use the `BlockWaiter`. The issue: We would like a mitigation to be deployed and effective sooner. The fix: We inspect the worker addresses used by the `BatchLoader`, and rewrite their hostname to localhost when it matches the local primary.
…ystenLabs#763) We operate an executor with a bound on the concurrent number of messages (see MystenLabs#463, MystenLabs#559, MystenLabs#706). PR MystenLabs#472 added logging for the bound being hit. We expect the executors to operate for a long time at this limit (e.g. in recovery situation). The spammy logging is not usfeful This removes the logging of the concurrency bound being hit. Fixes MystenLabs#759
Initial PR for #620
The doc explains the goals - using structured logging so we can have proper context for entire transaction flows and be able to easily trace a transaction across processes, components, and async threads. Plus using proper key-value tagging to be able to easily filter logs.
Example of logging showing tracing a single TX across
authority_aggregator
on gateway/client side andauthority
:Log initialization is fixed to take in log levels from the environment
RUST_LOG
variable.This is just a start - much more to come, but I think it gives everyone the basic idea and template to do more. TODOs (either this PR or subsequent ones):