Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using OPENJSON for our JSON scalar access #30981

Open
roji opened this issue May 27, 2023 · 3 comments
Open

Consider using OPENJSON for our JSON scalar access #30981

roji opened this issue May 27, 2023 · 3 comments

Comments

@roji
Copy link
Member

roji commented May 27, 2023

When using owned JSON entities, we currently use JSON_VALUE to extract scalars out of them. Since JSON_VALUE always returns nvarchar(4000), we apply a cast to the results based on the model type. For example:

_ = await ctx.Blogs.Where(b => b.Details.Foo == 8).ToArrayAsync();

... translates to:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
WHERE CAST(JSON_VALUE([b].[Details],'$.Foo') AS int) = 8

There are issues with this translation:

  • Binary data (varbinary) cannot be extracted in this way: the standard JSON representation for binary data is base64, but a regular relational CAST doesn't do that (see comment). OPENJSON with WITH does decode base64 data, since it applies a JSON-specific conversion. Binary data specifically is covered by #33435.
    • On the other hand, spatial data cannot be converted with OPENJSON with WITH, only with a regular cast. The same may be true of hierarchyid.
    • We need to go through all supported SQL Server types
  • JSON_VALUE returns null for strings larger than 4000 (or can throw in strict mode) (see #29477).
  • There's good reason to believe that this translation is inefficient compared to OPENJSON with WITH, where the query supplies more information to SQL Server.
    • See this and this. Both these comments are about OPENJSON without WITH (and not about JSON_VALUE), and are in the context of primitive collections; but the same principles should hold here as well - needs to be confirmed.

The alternative translation would be to use OPENJSON with WITH instead:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
CROSS APPLY OPENJSON([b].[Details]) WITH ([Foo] int '$.Foo') AS [d]
WHERE [d].[Foo] = 8

In other words, every scalar access adds a CROSS APPLY to an OPENJSON invocation, applying the JSON conversion with WITH.

When the property being extracted is known to be a short string in the model (fitting in nvarchar(4000)), we could keep the current JSON_VALUE translation. When the type being extracted is incompatible with OPENJSON with WITH (e.g. geometry), we could do OPENJSON without WITH and then apply a relational cast (we should avoid JSON_VALUE still it truncates).

Note: consider the indexability of this technique, compared to JSON_VALUE (with computed columns)

@ajcvickers
Copy link
Contributor

ajcvickers commented Jun 7, 2023

@roji To check sargability.

@ajcvickers
Copy link
Contributor

Consider only using openjson for types that are going to be truncated (nvarchar(max) and varbinary(max) or where we need the conversion to/from base64 (binarty and varbinary(x)).

@roji
Copy link
Member Author

roji commented Apr 27, 2024

An improvement to SQL Server's JSON_VALUE() may be coming, where a RETURNING clause would allow specifying the returned type, with the conversion happening inside JSON_VALUE(). This would be both more efficient and resolve the above issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants