Refactor GraphQL execution #3005

lutter · 2021-11-24T01:57:22Z

Refactor how we execute GraphQL queries by transforming the AST we get from graphql_parser into a form that is easier to execute. The preprocessing step that translates from graphql_parser's AST to our AST does the following:

interpolate variables
coerce argument values to their correct type
expand fragments so that our selection sets only contain fields
resolve interfaces (and unions) into their constituent object types
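
For readers who want a feel for the shape of such a preprocessing pass, here is a minimal sketch covering only two of the steps above (variable interpolation and fragment expansion). The `Value`, `Selection` and `preprocess` names are hypothetical and heavily simplified; they are not the types this PR introduces, and argument coercion and interface resolution are left out.

```rust
use std::collections::HashMap;

// Hypothetical, simplified query AST; not graph-node's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Variable(String),
    Null,
}

#[derive(Debug, Clone, PartialEq)]
enum Selection {
    Field {
        name: String,
        args: Vec<(String, Value)>,
        children: Vec<Selection>,
    },
    FragmentSpread(String),
}

/// Replace variable references with concrete values and inline fragment
/// spreads, so that the returned selections contain only fields.
/// Assumes fragments are acyclic (no cycle detection is done here).
fn preprocess(
    selections: &[Selection],
    vars: &HashMap<String, Value>,
    fragments: &HashMap<String, Vec<Selection>>,
) -> Result<Vec<Selection>, String> {
    let mut out = Vec::new();
    for sel in selections {
        match sel {
            Selection::Field { name, args, children } => {
                let args = args
                    .iter()
                    .map(|(key, value)| {
                        let value = match value {
                            // A missing variable becomes null in this sketch.
                            Value::Variable(var) => {
                                vars.get(var).cloned().unwrap_or(Value::Null)
                            }
                            other => other.clone(),
                        };
                        (key.clone(), value)
                    })
                    .collect();
                out.push(Selection::Field {
                    name: name.clone(),
                    args,
                    children: preprocess(children, vars, fragments)?,
                });
            }
            Selection::FragmentSpread(name) => {
                let body = fragments
                    .get(name)
                    .ok_or_else(|| format!("unknown fragment `{}`", name))?;
                // Splice the fragment's fields in where the spread appeared.
                out.extend(preprocess(body, vars, fragments)?);
            }
        }
    }
    Ok(out)
}
```

After this pass, code that walks the result never has to look at a variable map or a fragment table again, which is the kind of simplification the description refers to.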

This removes a lot of hard-to-follow code in query execution, especially in prefetch.

Besides these simplifications, this PR also fixes a few bugs, and changes the order in which fields appear in the response to what the GraphQL spec mandates. This is a breaking change for the network, as it changes the hash for semantically equivalent query responses (cc @That3Percent on rolling this out on the network)

The refactored code also makes a few weaknesses of the current query execution visible. In particular, field merges (from users specifying the same field multiple times, or from fragment expansion) continue to use a yolo strategy that does not fully conform to the GraphQL spec. This needs to be addressed in a separate PR.

Fixes #2961.

graphql/src/execution/ast.rs

graphql/src/execution/query.rs

kamilkisiela · 2021-11-24T11:25:46Z

@lutter Hi, I started a Pull Request for graphql_parser that introduces Document Transformer graphql-rust/graphql-parser#56

Not sure it's useful but with such trait, we could have one API for transformations and split the transformation logic into multiple pieces. What do you think?

leoyvens

Small comments from reading ast.rs.


leoyvens · 2021-11-24T11:31:06Z

graphql/src/execution/ast.rs

+pub struct SelectionSet {
+    // Map object types to the list of fields that should be selected for
+    // them
+    items: Vec<(String, Vec<Field>)>,


Is items.len() > 1 only when the selection is over a union or interface? If so it would be clarifying to note that in the comment.

It would be nice to use a type for strings that represent object names (or type names in general, whatever makes more sense).

I've wanted to use something like &s::ObjectType here, but that's not possible for various reasons. I've now resolved to do the next best thing, which is keeping a copy of all object types in ApiSchema, and wrapping them in an Arc. With that, a SelectionSet now has basically items: Vec<(Arc<s::ObjectType>, Vec<Field>)> (in reality a little more complicated to work around other shortcomings of s::ObjectType)

I am curious what you think of that approach.

Looks good! I see that saved some calls to get_object_type_definition.

Added a comment to clarify when items.len() > 1
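
As an illustration of the approach discussed in this thread, a selection set keyed by `Arc`'d object types could look roughly like the sketch below. `ObjectType` and `Field` are hypothetical stand-ins (the real `s::ObjectType` comes from `graphql_parser`, and the actual struct is described above as "a little more complicated"), and the `new`, `push` and `fields_for` methods are invented for the example.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the schema and query types.
#[derive(Debug, PartialEq)]
struct ObjectType {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
}

/// One entry per object type the selection can apply to, together with
/// the fields selected for that type.
struct SelectionSet {
    items: Vec<(Arc<ObjectType>, Vec<Field>)>,
}

impl SelectionSet {
    fn new(types: Vec<Arc<ObjectType>>) -> Self {
        SelectionSet {
            items: types.into_iter().map(|t| (t, Vec::new())).collect(),
        }
    }

    /// Add `field` to the selection for every object type in the set.
    fn push(&mut self, field: &Field) {
        for (_, fields) in &mut self.items {
            fields.push(field.clone());
        }
    }

    /// Find the fields for `obj` by pointer identity, so no by-name
    /// lookup in the schema is needed.
    fn fields_for(&self, obj: &Arc<ObjectType>) -> Option<&[Field]> {
        self.items
            .iter()
            .find(|(t, _)| Arc::ptr_eq(t, obj))
            .map(|(_, fields)| fields.as_slice())
    }
}
```

Comparing with `Arc::ptr_eq` only works if every `Arc` is cloned from the single copy kept in the schema, which is presumably why the object types need to be stored in `ApiSchema`.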

graphql/src/execution/ast.rs

That3Percent · 2021-11-24T15:56:37Z

Our arbitration charter allows for flexibility around bugfixes, so we are good to go to push this to The Network @lutter.

lutter · 2021-11-24T16:11:22Z

@lutter Hi, I started a Pull Request for graphql_parser that introduces Document Transformer graphql-rust/graphql-parser#56

Not sure it's useful but with such trait, we could have one API for transformations and split the transformation logic into multiple pieces. What do you think?

I saw that PR and was sad that it was just a PR (I even went as far as trying to implement a visitor, but then decided it's too much work for what we need). Now that I've been through this refactor, I actually think it would be better if graphql_parser was more opinionated than just offering transformers, and instead had something along the lines of what this PR does, i.e., a preprocessing step for GraphQL queries. A lot of this PR is very generic, and useful for anyone who wants to execute GraphQL queries, but by necessity closely tied to our codebase. The things that I think should live upstream are:

the graph::data::graphql::ext module which is just an attempt to make the graphql_parser API more ergonomic (in graphql_parser, these could just be methods on its types, rather than the awkward dance with extension traits)
the graphql::execution::ast module and the transformation to turn a graphql_parser query into that form in graphql::execution::query
the IntrospectionResolver and the code that merges the introspection schema into the user's schema (APISchema.schema in our case). Keeping the introspection schema and the user schema separate led to a lot of awkward code which is now gone, though even having to switch resolvers is still a bit awkward.

The biggest question around doing that to me is how values should be handled, since they are user-specific, and the graphql_parser API for the above would need to be templated in some form around the concrete notion of Value, and, for example, how to coerce a q::Value to the user's notion of Value.

Besides these, one big feature request I'd have for graphql_parser is to use more Arcs in the schema AST, especially for types, since the current data structures make it hard to keep pointers to types around, and you end up referencing them with strings that need to be looked up in the schema all the time.

I have no idea what the upstream devs would think about these changes, but it would make our lives a lot easier.

lutter · 2021-11-29T21:53:59Z

Addressed all the comments so far

lutter · 2022-01-19T19:46:19Z

Rebased to latest master and added a few commits to replicate the behavior of the current GraphQL execution for some queries that are really invalid

lutter · 2022-01-21T00:44:19Z

This PR is now ready for review (and hopefully merging/rollout soon)

I've tested this branch against all the queries we get in the hosted service, and the only difference in responses comes from treating the (invalid) construct of not having a selection on a field with an object type slightly differently: the current implementation includes an empty object (or list of empty objects, depending on the field type) in the response, with this PR such fields do not appear in the response.

In other words, for a type like

type Person @entity {
  id: ID!
  spouse: Person!
  friends: [Person!]!
}

and the query

{ persons { id spouse friends } }

With current master, the response has { "id": "..", "spouse": {}, "friends": [ {}, {}, {} ] } for each person in it (with one empty object in the list for each friend). With this PR, the response would contain neither a spouse nor a friends field. The impact on apps should be negligible since they've been getting no real data all along.

Query preprocessing eliminated all variable references

This change should not change any behavior, and is purely mechanical in replacing one set of structs with another set of identical structs that we control

That makes it possible to use the same schema for data and for introspection queries

The old name wasn't matching what we use it for any more

No cows were harmed in making this change

That is demanded by the GraphQL spec. We need to have a predictable order to ensure attestations remain stable, though this change breaks attestations since it changes the order in which fields appear in the output from alphabetical (however a `BTreeMap` orders string keys) to the order in which fields appear in a query. It also allows us to replace `BTreeMaps`, which are fairly memory intensive, with cheaper `Vec`. The test changes all reflect the changed output behavior; they only reorder fields in the expected output but do not otherwise alter the tests. Fixes #2943
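
To make the ordering difference concrete, here is a small sketch comparing the key order of a `BTreeMap` with that of a `Vec` of pairs. The helper names `btree_order` and `vec_order` are made up for the example and do not appear in the codebase.

```rust
use std::collections::BTreeMap;

/// Keys as a `BTreeMap` yields them: sorted, regardless of insertion order.
fn btree_order(fields: &[(&str, &str)]) -> Vec<String> {
    let map: BTreeMap<&str, &str> = fields.iter().copied().collect();
    map.keys().map(|k| k.to_string()).collect()
}

/// Keys as a `Vec` of pairs yields them: in insertion order, which for a
/// response object is the order of the fields in the query.
fn vec_order(fields: &[(&str, &str)]) -> Vec<String> {
    let object: Vec<(String, String)> = fields
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    object.iter().map(|(k, _)| k.clone()).collect()
}
```

For fields queried as `name`, `id`, `age`, `btree_order` yields `age`, `id`, `name`, while `vec_order` keeps `name`, `id`, `age`. That reordering is what changes the serialized response and therefore its hash.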

Fixes #2720

Rather than use a string name, use the actual object type to identify types. It's not possible to do this with plain references, for example, because we pass a reference to a SelectionSet to graph::spawn_blocking, so we do the next best thing and use an Arc. Unfortunately, because `graphql_parser` doesn't wrap its object types in an Arc, that means we need to keep a copy of all of them in ApiSchema.

This just shuffles some code around, but doesn't change anything else, in preparation for representing the schema in a way that's more useful to us.

Set ENABLE_GRAPHQL_VALIDATIONS to any value in the environment to enable validations, rather than enabling them by default and disabling them on demand
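
A check with "any value" semantics can be sketched as below. Only the variable name comes from the commit message; the `enabled_by` and `validations_enabled` helpers are hypothetical, and the actual code may read the variable differently.

```rust
use std::env;
use std::ffi::OsString;

/// Presence alone enables validations; the value itself is ignored, so
/// even `""` or `"0"` turns them on.
fn enabled_by(value: Option<OsString>) -> bool {
    value.is_some()
}

/// Validations are off unless ENABLE_GRAPHQL_VALIDATIONS is set.
fn validations_enabled() -> bool {
    enabled_by(env::var_os("ENABLE_GRAPHQL_VALIDATIONS"))
}
```

Note that with this reading, `ENABLE_GRAPHQL_VALIDATIONS=false` would still enable validations; unsetting the variable is the way to turn them off.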

Make sure that we handle queries that have a selection from a scalar field by ignoring the selection or that have no selection for a non-scalar field by ignoring that field. The latter differs from the previous behavior where the result would contain an entry for such a field, but the data for the field would be an empty object or a list of empty objects.
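
The two rules in this commit message can be sketched as a small normalization pass. The `Field` struct and its `is_scalar` flag are hypothetical (in practice the field's type would be looked up in the schema), and the sketch makes no claim about what happens when a field's selection only becomes empty after its children were dropped.

```rust
// Hypothetical, simplified field type; `is_scalar` stands in for a schema lookup.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    is_scalar: bool,
    selection: Vec<Field>,
}

/// Apply the two leniency rules:
/// - a selection on a scalar field is ignored
/// - a non-scalar field without a selection is dropped entirely
fn normalize(fields: Vec<Field>) -> Vec<Field> {
    fields
        .into_iter()
        .filter_map(|mut field| {
            if field.is_scalar {
                field.selection.clear();
                Some(field)
            } else if field.selection.is_empty() {
                None
            } else {
                field.selection = normalize(std::mem::take(&mut field.selection));
                Some(field)
            }
        })
        .collect()
}
```

Applied to `{ persons { id spouse friends } }` from the example earlier in the thread, only `persons { id }` survives, which matches the described response that has neither a `spouse` nor a `friends` entry.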

Instead of immediately reporting an error, treat missing variables as nulls and let the rest of the execution logic deal with nulls

leoyvens reviewed Nov 24, 2021

graphql/src/execution/ast.rs

leoyvens reviewed Nov 24, 2021

graphql/src/execution/query.rs

leoyvens reviewed Nov 24, 2021


dotansimha mentioned this pull request Nov 28, 2021

GraphQL: Improve documents validation before running execution flow #3013

Closed

dotansimha mentioned this pull request Dec 14, 2021

graph, graphql: introduce GraphQL spec-compliant validation phase and rules #3057

Merged


lutter force-pushed the lutter/gql branch from aedabf6 to 3cdb841 on December 14, 2021 19:23

lutter force-pushed the lutter/gql branch from 2076936 to c98485d on January 19, 2022 19:45

lutter force-pushed the lutter/gql branch 2 times, most recently from 3727335 to 4c4a261 on January 20, 2022 20:15

tilacog approved these changes Jan 24, 2022


lutter added 14 commits January 24, 2022 15:38

graph, graphql, server: Encapsulate maps in Value in a new type

5302b94

graphql: Move complexity check/field validation into intermediate struct

fc78a18

graph, graphql: Resolve variables before executing query

be2f07a

graphql: Do not pass variables around during execution

8b3ccee

Query preprocessing eliminated all variable references

graph, graphql: Use our own structs to represent the query AST

d489ca2

This change should not change any behavior, and is purely mechanical in replacing one set of structs with another set of identical structs that we control

graphql: Restrict construction of selection sets

69d5386

graphql: Make SelectionSet.items private

1e43f7f

graphql: Move ObjectCondition to graphql::schema::ast

a6482de

graphql: Mix the introspection schema into the API schema

334f322

That makes it possible to use the same schema for data and for introspection queries

graphql: Simplify constructing an ExecutionContext for introspection

455218b

graphql: Transform the query before executing it

f9d51fc

graphql: Clean up imports in schema::ast

6645e3f

graphql: Rename ObjectCondition to ObjectType

479e7dd

The old name wasn't matching what we use it for any more

graphql: Simplify CollectedAttributeNames and rename it

8ff30cb

lutter added 16 commits January 24, 2022 15:38

graphql: Remove unused CollectedResponseKey

93729aa

graphql: Simplify Join type in prefetch a little

fc58734

graphql: Pass a schema, not the query, to coerce_argument_values

bc4e200

graphql: Coerce arguments when constructing the query

af14ff7

graphql: Keep original field order when grouping by block constraint

d20ad48

core: Do not rely on debug output in interface tests

4e9eff9

No cows were harmed in making this change

graphql: Verify that root fragments are expanded

b5336d1

Fixes #2720

graph, graphql: Encapsulate access to the underlying schema in ApiSchema

22e0f21

This just shuffles some code around, but doesn't change anything else, in preparation for representing the schema in a way that's more useful to us.

graphql: Remove unused functions

12aba47

graphql: Clarify/fix some comments

6014541

graphql: Make sure that undefined arguments are ignored

383d5a6

graphql: Change the default for running validations

33db777

Set ENABLE_GRAPHQL_VALIDATIONS to any value in the environment to enable validations, rather than enabling them by default and disabling them on demand

graphql: Treat missing variables as null values

9e6435b

Instead of immediately reporting an error, treat missing variables as nulls and let the rest of the execution logic deal with nulls

lutter force-pushed the lutter/gql branch from 4c4a261 to 9e6435b on January 24, 2022 23:42

lutter merged commit 9e6435b into master Jan 24, 2022

lutter deleted the lutter/gql branch January 24, 2022 23:43
