Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DCR]: Unnecessary 401 Unauthorized api calls made during package restore #11600

Closed
chris-smith-zocdoc opened this issue Feb 16, 2022 · 13 comments
Assignees
Labels
Area:HttpCommunication Area:Plugin V2 plugin w/ cross platform support Functionality:Restore Priority:2 Issues for the current backlog. Type:DCR Design Change Request

Comments

@chris-smith-zocdoc
Copy link

NuGet Product(s) Affected

dotnet.exe

Current Behavior

We host an authenticated nuget feed for our organization using Artifactory. We noticed in the logs an excessive amount of 401 Unauthorized responses originating from dotnet restore throughout the org. It appears that every api call is being made twice, first without authentication, then again with authentication. This doubles the number of api calls make during a package restore.

Desired Behavior

I'd like to force nuget to always send credentials to our feed, there is no need to make the api calls twice. This slows down the restore process, places additional load on the supporting infrastructure and makes monitoring more difficult.

Additional Context

Other package management tools will always send credentials if configured to do so, they do not 401 first.

@chris-smith-zocdoc chris-smith-zocdoc changed the title [DCR]: [DCR]: Unnecessary 401 Unauthorized api calls made during package restore Feb 16, 2022
@kartheekp-ms
Copy link
Contributor

kartheekp-ms commented Feb 17, 2022

@chris-smith-zocdoc - From my initial investigation regarding this request, there is a way to pass credentials on the first request itself without two requests (one request without credentials followed by second request with credentials).

The passwords are stored in encrypted form on windows & on Mono for non-Windows environment. The catch here is that encrypted passwords can only be decrypted when used on the same machine and via the same user as the original encryption. This can be a challenge for CI machines which are running under a different user context. One way to address this problem is adding package source credentials on CI using dotnet nuget update source command before running dotnet restore command. The doc says Set the --valid-authentication-types to basic if the server advertises NTLM or Negotiate and your credentials must be sent using the Basic mechanism, for instance when using a PAT with on-premises Azure DevOps Server.

  • If package source credentials are added to the NuGet.Config file then the following code will pass credentials on the first request.

https://github.com/NuGet/NuGet.Client/blob/4ed05177c6222d32ad55d612d2202aa9c6a1f2a6/src/NuGet.Core/NuGet.Protocol/HttpSource/HttpSourceAuthenticationHandler.cs#L59

@kartheekp-ms kartheekp-ms added Area:HttpCommunication Area:Plugin V2 plugin w/ cross platform support Functionality:Restore WaitingForCustomer Applied when a NuGet triage person needs more info from the OP and removed Triage:Untriaged labels Feb 17, 2022
@chris-smith-zocdoc
Copy link
Author

Thats not the behavior I'm seeing @kartheekp-ms

Please note: I'm not using Azure DevOps private feed I'm using JFrog's Artifactory

Here is how my nuget.config is setup

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="Artifactory" value="[removed]/artifactory/api/nuget/v3/nuget" protocolVersion="3"/>
  </packageSources>
  <packageSourceCredentials>
    <Artifactory>
      <add key="Username" value="[removed]" />
      <add key="ClearTextPassword" value="[removed]" />
    </Artifactory>
  </packageSourceCredentials>
</configuration>

Example logs

GET /artifactory/api/nuget/v3/nuget HTTP/1.1 401 
GET /artifactory/api/nuget/v3/nuget HTTP/1.1 200
GET /artifactory/api/nuget/v3/nuget/registration-semver2/awssdk.core/index.json HTTP/1.1 401
GET /artifactory/api/nuget/v3/nuget/registration-semver2/awssdk.core/index.json HTTP/1.1 200
GET /artifactory/api/nuget/v3/nuget/registration-semver2/Download/awssdk.core/3.3.25.1 HTTP/1.1 401
GET /artifactory/api/nuget/v3/nuget/registration-semver2/Download/awssdk.core/3.3.25.1 HTTP/1.1 200

I verified this on OSX, but I believe this is happening on Linux also. I haven't checked windows

dotnet --version
5.0.400

@ghost ghost added WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. and removed WaitingForCustomer Applied when a NuGet triage person needs more info from the OP labels Feb 17, 2022
@kartheekp-ms kartheekp-ms removed the WaitingForClientTeam Customer replied, needs attention from client team. Do not apply this label manually. label Feb 18, 2022
@kartheekp-ms
Copy link
Contributor

I didn't explore httpclienthandler.preauthenticate yet but looks like a potential solution to this problem.

@kartheekp-ms
Copy link
Contributor

https://docs.microsoft.com/en-us/dotnet/api/system.net.httpwebrequest.preauthenticate?view=net-6.0#remarks clearly explains the behavior of PreAuthenticate when it set to true or false.

@nkolev92 nkolev92 added the Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. label Mar 10, 2022
@zivkan zivkan added Priority:1 High priority issues that must be resolved in the current sprint. and removed Priority:3 Issues under consideration. With enough upvotes, will be reconsidered to be added to the backlog. Pipeline:Icebox labels Sep 30, 2022
@zivkan
Copy link
Member

zivkan commented Feb 23, 2023

In case someone else takes up this issue, here's my finding from the PR I started back in October:

Currently NuGet sets HttpClientHandler.Credentials to a custom credential provider, which appears to make it possible to change credentials. There's no way to tell HttpClient to always send the credential. Possibly because the way it's implemented is to wait for a 401 response which tells HttpClientHandler which "scheme" to send auth for (does the server want BASIC, DIGEST, NEGOTIATE?). Based on that response, it then converts the ICredential to an appropriate request Authorization: header with the scheme from the 401 response.

In practise it might be that only BASIC auth is only ever used, but in theory DIGEST or NEGOTIATE might be used too, and there's no way for us to know. If we want to only support BASIC, perhaps we can add an environment variable to opt-out of the feature, so customers can report bugs to us and we'll need to redesign how we implement it.

Anyway, the second problem is from the docs that Kartheek linked to above. By default, HttpClientHandler will ALWAYS first make an unauthenticated request, and then send the auth header when it gets an HTTP 401 response. Preauthenticate will only automatically use the creds if a "cached path" has previously received a HTTP 401 on an unauthenticated request. The problem for us is that NuGet's v3 protocol uses lowercase_package_id/ in URLs, so every single package will be a different path that HttpClientHandler checks for cached credentials. Similar to how there's no way to tell HttpClientHandler to send an Authorization header on the first request, there's no way to tell it to "use credentials from a different path". Every time there's a URL with a "new" (first time HttpClient has seen it) path up to the last /, it will always make an unauthenticated request.

The only solution to this that I could find is to set HttpClientHandler.DefaultsHeaders.Authorization. However, the code that converts an ICredential to to appropriate header value is internal to the BCL, which means that NuGet would have to duplicate the code. Hence why I above mentioned possibly only supporting BASIC auth until we get evidence that NEGOTIATE is actually used (I have doubts about DIGEST being used, but I might be wrong).

edit: I also never investigated how credential providers work, so I don't know if the Azure Artifacts cred provider require us to send a BEARER token, or if it uses BASIC. This will obviously be very important to test, even if we only want to support BASIC for username + passwords saved in nuget.config files.

@nkolev92
Copy link
Member

In #12451, I suggested that we can add a way to mark certain sources as auth required/recommended.

@jmyersmsft
Copy link

The Negotiate auth scheme is definitely in use with on-premises Azure DevOps Server.
The Azure DevOps credential provider (which is only used with the hosted product) provides credentials for Basic.
I'm pretty sure HttpClientHandler's remembering of whether it's been challenged for credentials is on a per-host basis, not a per-url basis, but IIRC NuGet is creating a new one for each request (or at least it was like 5 years ago), so that information is thrown away.

@zivkan
Copy link
Member

zivkan commented Mar 14, 2023

The Negotiate auth scheme is definitely in use with on-premises Azure DevOps Server

In that case, my understanding of how Negotiate works means it's impossible to implement @nkolev92's idea to have a source that pre-authenticates (without being prompted by an HTTP 401) is impossible as a general case. That (sub) feature will need to be limited to HTTP Basic and cred-provider auth.

I'm pretty sure HttpClientHandler's remembering of whether it's been challenged for credentials is on a per-host basis, not a per-url basis

The docs on HttpClientHandler.PreAuthenticate state that:

After a client request to a specific Uri is successfully authenticated, if PreAuthenticate is true and credentials are supplied, the Authorization header is sent with each request to any Uri that matches the specific Uri up to the last forward slash.

From my own testing, I similarly could not get HttpClient to re-use an ICredential on different URLs on the same host where the path up to the last slash was different. If you're aware of how to do it and can share a code sample, this feature should become much easier. At least, the one from the original author, where NuGet should only send 1 unauthenticated request. Pre-authenticating without an HTTP 401 response first will still be up to what HttpClient allows.

Without PreAuthenticate set to true, we see the behaviour that the original issue author reported, where every single unique URL has two requests, the first unauthenticated and gets an HTTP 401 reponse.

The only solution I could find is to not use ICredential, and to set the Authorization header "manually" (which allows both caching credentials per host, or per source, and also send auth without an HTTP 401 response first), but that will require us to rewrite how credentials are saved, stored, and updated, making it a significantly bigger effort, since there is no API to convert from ICredential to HttpRequestMessage.Header.Authorization`.

edit: my previous testing apparently was wrong. Despite the docs sounding like the full URL up to the last slash must match a previously authenticated must match (think string.Equals), it's actually working like a prefix (think string.StartsWith). This still isn't a perfect solution, but it allows servers that implement all their resources as "subdirectories" of the service index's directory.

IIRC NuGet is creating a new one for each request (or at least it was like 5 years ago), so that information is thrown away.

NuGet.Protocol's SourceRepository type has a kind of "dependency injection" system, and I thought it caches the HttpClient and HttpClientHandler within some of the resources, but now I'm doubting myself. It's the end of my day, so I'm not going to validate right away, but an "easy" way to validate should be to set maxHttpRequestsPerSource to a non-zero value, do a restore on a solution with only a single package source configured, but make sure it downloads multiple packages (so needs multiple HTTP requests), and immediately after restore check netstat and see if there are more than maxHttpRequestsPerSource TCP connections open to the remote host. If NuGet is re-using a single HttpClientHandler, then TCP connections will be re-used. If NuGet is creating new instances every request, each request will use a different TCP connection and there will be many ports in the "closing" state.

@kartheekp-ms
Copy link
Contributor

kartheekp-ms commented Mar 14, 2023

@jmyersmsft - Can you please take a look at NuGet/NuGet.Client#4842 (comment) comment and let us know if it possible to authenticate with VssBaseUrl/AzureDevOpsProjectId/_packaging/VssFeedId/nuget? If yes, then all the subsequent requests to any resource matching the URL prefix will be automatically authenticated when HttpClientHandler.PreAuthenticate is set to true.

I tried to authenticate with https://pkgs.dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_packaging/45bacae2-5efb-47c8-91e5-8ec20c22b4f8/nuget/v3 but got an error in the browser saying there is no resource found. I was hoping if it prompts for authentication in the browser.

I am recommending a dummy call to the registration base (or other) resources so that future requests to those resources will send credentials when PreAuthenticate is set to true.

@kartheekp-ms
Copy link
Contributor

kartheekp-ms commented Mar 14, 2023

Sorry, I didn't read all the comments on this issue, @zivkan it looks like currently preauth is supported only for basic credential scheme.

https://github.com/dotnet/runtime/blob/a78f9bde2edd384f5d26c9d394eb7fcc5f48a664/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/AuthenticationHelper.cs#L215-L240

@zivkan
Copy link
Member

zivkan commented Mar 28, 2023

So it turns out my previous testing of HttpClientHandler.PreAuthenticate was wrong. Testing again, once a URL with a shorter "prefix" (url to the last forward slash) has been authenticated, any url that matches the "prefix" (think string.StartsWith) will also no longer send unauthenticated requests.

Therefore, if a server puts all resources in the service index as subdirectories under the service index, then it will work, except if the service index is cacled. So, nuget.org's URL pattern, where it puts search and package upload on different hostnames, won't work with PreAuthenticate. nuget.org doesn't support private feeds, I'm just trying to point out that nuget.org's service index will not be a good precedent for other server implementors to copy. Similar, Azure Artifacts' current implementation where service index entries use GUIDs in their URL when the customer's configured feed uses "human friendly" names, it also wouldn't work. But the artifactory examples used earlier in this issue would work.

As mentioned in the paragraph above, caching causes another problem. NuGet saves all successful responses to the filesystem, and when this filesystem is less than 30-40 minutes old, it will avoid sending an HTTP request. While this is normally good for perf, if we avoid making the HTTP request to the service index, than PreAuthenticate won't work since there won't be a prefix URL in the credential cache. An example scenario I'm thinking of is a customer tries dotnet add package Invalid, the service index will be downloaded and cached, but they get an error message because the package ID has a typo or something. Then then run dotnet add package again, but with the correct package ID. Now every package in the graph that gets downloaded will still send unauthenticated requests because the service index was never requested over HTTP since the file in the http cache was sufficiently fresh.

Another example, which won't be obvious to customers, is projects using MSBuild SDKs that need to be downloaded (a very common scenario for CI builds, as CI agents are frequently cleaned between jobs). The NuGetMSBuildSdkResolver will create SourceRepository instances to download the MSBuild SDK nuget package(s), allow the SourceRepository object to be GCed, and then when restore runs, it will create new SourceRepository instances which means new HttpClient instances, and therefore the service index data in the http cache is fresh before the majority of the packages are actually downloaded.

While we could add special logic into NuGet to basically say "make an HTTP request to the service index once, before using the cached file", if some action would happen to ONLY use files from the HTTP cache and not need to make any HTTP requests, it would be unfortunate to make a request only to the service index URL to prepopulate HttpClient's credential cache, when it wasn't needed. Especially if it triggers an interactive popup to the customer to authenticate.

@zivkan zivkan added Priority:2 Issues for the current backlog. and removed Priority:1 High priority issues that must be resolved in the current sprint. labels Mar 28, 2023
@zivkan
Copy link
Member

zivkan commented Mar 30, 2023

Rather than hijacking this issue and changing what the customer wrote in order to make this an epic, I created this other tracking issue instead:

Therefore, this issue is now a duplicate of the epic issue.

@zivkan zivkan closed this as completed Mar 30, 2023
@zivkan
Copy link
Member

zivkan commented Oct 23, 2023

Unfortunately I discovered that .NET (both Core and Framework) have a bug where HttpHandler.PreAuthenticate prevents refreshing credentials, which NuGet's credential provider architecture allows (short lived OAuth2 tokens, for example)

Therefore, I'm going to revert the previous fix, which is going to regress this bug, from the server's point of view, until a different solution is implemented. I'm not going to reopen this issue though, since the other epic issue is still tracking a bunch of related changes. I just wanted to post this notification in case people are watching this issue, but not the HTTP 401 epic issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area:HttpCommunication Area:Plugin V2 plugin w/ cross platform support Functionality:Restore Priority:2 Issues for the current backlog. Type:DCR Design Change Request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants