Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphman new command: analyze #3170

Merged
merged 8 commits into from
Jan 21, 2022
Merged

Conversation

tilacog
Copy link
Contributor

@tilacog tilacog commented Jan 18, 2022

Introduces a new Graphman analyze command that calls SQL ANALYZE for the given entity table.

Closes #3152

@tilacog tilacog requested a review from lutter January 18, 2022 19:08
@tilacog tilacog force-pushed the tiago/graphman-command-analyze branch from 25c1ef1 to fa66713 Compare January 18, 2022 20:08
@@ -179,6 +179,13 @@ pub enum Command {
Chain(ChainCommand),
/// Manipulate internal subgraph statistics
Stats(StatsCommand),
/// Perform a SQL ANALYZE in a Entity table
Analyze {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the command could accept a verbose argument?
I would even argue we want verbose as default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree .. it might be a little tricky to get the command output out of diesel, but if it's not too much work, would be nice to just dump whatever Postgres says onto the tty.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this command should be graphman stats analyze Qm.. Entity, i.e., the analyze should be a subcommand of stats

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find a way to capture PostgreSQL output from diesel. Not sure if this is supported at all.

println!("Running ANALYZE for {entity_name} entity");
let entity_type = EntityType::new(entity_name);
let deployment_hash =
DeploymentHash::new(hash).expect("Subgraph hash must be a valid IPFS hash");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could .context()? from anyhow provide better error info for graphman users? 🤔

let layout = store.layout(conn, site)?;
let table = layout.table_for_entity(&entity_type)?;
let table_name = &table.qualified_name;
let sql = format!("analyze {table_name}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if diesel provided analyze in their pg module, perhaps we can add there someday 🙂

Copy link
Collaborator

@lutter lutter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good!

@@ -179,6 +179,13 @@ pub enum Command {
Chain(ChainCommand),
/// Manipulate internal subgraph statistics
Stats(StatsCommand),
/// Perform a SQL ANALYZE in a Entity table
Analyze {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree .. it might be a little tricky to get the command output out of diesel, but if it's not too much work, would be nice to just dump whatever Postgres says onto the tty.

@@ -179,6 +179,13 @@ pub enum Command {
Chain(ChainCommand),
/// Manipulate internal subgraph statistics
Stats(StatsCommand),
/// Perform a SQL ANALYZE in a Entity table
Analyze {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this command should be graphman stats analyze Qm.. Entity, i.e., the analyze should be a subcommand of stats

@lutter
Copy link
Collaborator

lutter commented Jan 19, 2022

As follow-on work (i.e., in a separate PR since I think having the command in its current form in use is more important than adding these bells and whistles) these things would be nice:

  • allow specifying the deployment also with sgdNNN (have a look at how the info command resolves names and hashes, I think you can just reuse that code)
  • allow specifying either the entity name or the table name on the command line

with that, graphman stats analyze sgd321 pair_hour_data should also be legal. That form is often more useful when you see a specific query take a long time.

And while we're at it, it would also be neat if the command could print when the table was last analyzed before running analyze. You can get that information with select greatest(last_analyze, last_autoanalyze) from pg_stat_user_tables where relname = 'transaction' and schemaname = 'sgd52520';

@tilacog tilacog requested a review from lutter January 19, 2022 19:52
@tilacog tilacog merged commit c20a630 into master Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

graphman command to run analyze entity tables
3 participants