Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debugger returns empty string as URI when paused inside a macro-generated script #54824

Closed
DanTup opened this issue Feb 5, 2024 · 76 comments
Closed
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. cfe-feature-macros Implement macros features in the CFE front-end-kernel vm-debugger

Comments

@DanTup
Copy link
Collaborator

DanTup commented Feb 5, 2024

If I step into a macro-generated script with the debugger, the uri or the script is an empty string:

"topFrame": {
	"type": "Frame",
	"kind": "Regular",
	"location": {
		"type": "SourceLocation",
		"script": {
			"type": "@Script",
			"fixedId": true,
			"id": "libraries\/@19252696\/scripts\/\/18d790e6a87",
			"uri": ""
		},
		"tokenPos": 117
	},

We use these uris in the debug adapter to map back to source code to display to the user. It's not clear to me if this is a) a bug, b) just a known issue/incomplete or c) expected and the debug adapter will need to handle it.

@jakemac53 @bkonyi

Edit: I think actually we definitely needs these URIs, because it's the only way we can map back to the source code the user already has in the editor from the analyzer (otherwise we'd have to download the source from the VM which we generally don't want for VS Code because it looks like a duplicated file to the user).

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 5, 2024

Unsure if related, but the String source is also missing if I send a getObject() request for that script:

{
	"jsonrpc": "2.0",
	"result": {
		"type": "Script",
		"class": { /* ... */ },
		"size": 80,
		"fixedId": true,
		"id": "libraries\/@19252696\/scripts\/\/18d792687ca",
		"uri": "",
		"_kind": "kernel",
		"_loadTime": 1707134584778,
		"library": {
			"type": "@Library",
			"fixedId": true,
			"id": "libraries\/@19252696",
			"name": "",
			"uri": "file:\/\/\/C:\/Dev\/Google\/dart-language\/working\/macros\/example\/bin\/observable_main.dart"
		},
		"lineOffset": 0,
		"columnOffset": 0
	},
	"id": "38"
}

(Also unsure if there should be two slashes in the ID scripts\/\/18d792687ca?)

@lrhn lrhn added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. vm-debugger labels Feb 5, 2024
copybara-service bot pushed a commit that referenced this issue Feb 5, 2024
This non-null assertion is an error. If any URIs are returned from the VM that are not `file://` URIs and also return `false` from `isResolvableUri` then this would fail, even though the return values allow `null`s and should be used.

I can't currently write a test for this because in theory there aren't any cases where would actually happen, but I triggered it due to another bug (#54824) and think it's better removed.

Also added some TODOs about some behaviour that may need to change as macro support progresses (it's not far enough for me to make these changes yet).

Change-Id: I4108963b6b88bbbff4bb47f69909391900b5d3ba
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/350300
Reviewed-by: Ben Konyi <[email protected]>
Reviewed-by: Helin Shiah <[email protected]>
Commit-Queue: Ben Konyi <[email protected]>
@bkonyi bkonyi added area-front-end Use area-front-end for front end / CFE / kernel format related issues. front-end-kernel and removed area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. labels Feb 5, 2024
@bkonyi
Copy link
Contributor

bkonyi commented Feb 5, 2024

If the Script responses from the service are missing source and uri properties, it's likely that they're not present in the loaded kernel. @johnniwinther I think the CFE isn't providing these details for macros, can you confirm?

Also unsure if there should be two slashes in the ID scripts//18d792687ca?

The script ID takes the form of libraries/$libId/scripts/$encodedScriptUri/$loadedTimestamp, so the double slash is due to the missing URI.

@johnniwinther
Copy link
Member

The CFE doesn't currently generate the merged augmentation uri with the correct dart-macro+ uri. A CL is in progress: https://dart-review.googlesource.com/c/sdk/+/349864

@johnniwinther johnniwinther added the cfe-feature-macros Implement macros features in the CFE label Feb 7, 2024
@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

I'm seeing URIs with that change, thanks!

"topFrame": {
	"type": "Frame",
	"kind": "Regular",
	"location": {
		"type": "SourceLocation",
		"script": {
			"type": "@Script",
			"fixedId": true,
			"id": "libraries/@19252696/scripts/dart-macro%2Bfile%3A%2F%2F%2FC%3A%2FDev%2FGoogle%2Fdart-language%2Fworking%2Fmacros%2Fexample%2Fbin%2Fobservable_main.dart/18d843c57a3",
			"uri": "dart-macro+file:///C:/Dev/Google/dart-language/working/macros/example/bin/observable_main.dart"
		},
		"tokenPos": 117,
		"line": 5,
		"column": 21
	},

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

Although, lookupResolvedPackageUris isn't returning what I'd expect. I would expect it to map dart-macro+package URIs into dart-macro+file in the same way it maps package: URIs into file: URIs.

Here's what I currently see:

==> {
	"jsonrpc": "2.0",
	"id": "19",
	"method": "lookupResolvedPackageUris",
	"params": {
		"isolateId": "isolates/2646600707904379",
		"uris": [
			"dart-macro+package:macro_proposal/observable_lib.dart"
		],
		"local": true
	}
}

<== {
	"jsonrpc": "2.0",
	"result": {
		"type": "UriList",
		"uris": [
			// This is not what I expected, I expected the `package:` part to be resolved to a `file:`
			"dart-macro+package:macro_proposal/observable_lib.dart"
		]
	},
	"id": "19"
}

@bkonyi is my expectation correct, and if so, is it related to this issue or should I file another? (I'm unfamiliar with where the data comes from that's used for these mappings).

@bkonyi
Copy link
Contributor

bkonyi commented Feb 7, 2024

The data for the mappings comes from the kernel's SourceInfo for the script, so it's likely something else that needs to be updated in the CFE.

If we expect that dart-macro+package: URIs map to dart-macro+file: URIS (I'm not sure where we landed on this), uriUtf8Bytes should be different from importUriUtf8Bytes in the SourceInfo for the script (see the kernel spec for details).

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

If we expect that dart-macro+package: URIs map to dart-macro+file: URIS (I'm not sure where we landed on this),

I don't know if this has been specifically agreed, but I believe that we need this for the behaviour we're targetting.

The IDEs are unable to handle the mapping of package:s to file:s which is why we have lookupResolvedPackageUris. The debug adapter takes URIs from the VM (that may be package:) and maps them back to absolute paths that the editor is using for files.

The dart-macro+ prefix needs the same functionality.. the VM is using dart-macro+package: for things inside lib/ but we still need to map them to a dart-macro+file: equiv to match what the editor (and LSP) are using.

Without this, files outside of lib/ when debugged will use the analyzer-provided version but those inside lib/ would have to be downloaded from the VM (which is inconsistent and leads to confusion because the user can see multiple versions of the file - the one already in the editor and another from the debug session).

Should I file a separate issue about this (perhaps it needs some further discussion?).

@jakemac53
Copy link
Contributor

jakemac53 commented Feb 7, 2024

It does sound to me like package:package_config and possibly the related spec need to be updated to handle dart-macro+ schemes.

This goes both ways, resolving a dart-macro+package: uri should give the resolved uri for the original library, with the dart-macro+ prefix. And, asking to look up the package for a dart-macro+<resolved-uri> should also give back the package of the <resolved-uri>.

cc @lrhn

@bkonyi
Copy link
Contributor

bkonyi commented Feb 7, 2024

The IDEs are unable to handle the mapping of package:s to file:s which is why we have lookupResolvedPackageUris. The debug adapter takes URIs from the VM (that may be package:) and maps them back to absolute paths that the editor is using for files.

The dart-macro+ prefix needs the same functionality.. the VM is using dart-macro+package: for things inside lib/ but we still need to map them to a dart-macro+file: equiv to match what the editor (and LSP) are using.

We definitely should be able to support this with the lookupResolvedPackageUris API, I just wasn't sure if we had decided on having both package and file URIs for generated macro scripts. If that's something we need to do, the CFE will need to provide both in the generated kernel and then it should just work with the lookupResolvedPackageUris API.

Without this, files outside of lib/ when debugged will use the analyzer-provided version but those inside lib/ would have to be downloaded from the VM (which is inconsistent and leads to confusion because the user can see multiple versions of the file - the one already in the editor and another from the debug session).

I was under the impression that both the analyzer generated code should be identical to the code generated by the CFE. Is that not the case?

Should I file a separate issue about this (perhaps it needs some further discussion?).

Maybe? I do think it's related to this issue and probably only requires a small change somewhere in this CFE CL (probably at this line) to support. I'll leave a comment.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

We definitely should be able to support this with the lookupResolvedPackageUris API, I just wasn't sure if we had decided on having both package and file URIs for generated macro scripts.

Oh, I see. I assumed that they'd both be needed for the same reason as the originals, but my knowledge of this area is poor :-)

I'm proceeding for now on the assumption that we will have both, and will wait for @johnniwinther's response. If he agrees and thinks it should be included in his fix, no need for another issue. If it's not clear that's the right direction, I'll file an issue for further discussion.

I was under the impression that both the analyzer generated code should be identical to the code generated by the CFE. Is that not the case?

Yep, that's my expectation (although I just discovered it's not currently - I filed #54848 about that). However if we don't make these URIs match, VS Code doesn't know that they're the same file - it just knows it has an analyzer-backed file with one URI, and a VM-downloaded file with another URI, and will show them to the user as if they are distinct files. We need both DAP and LSP to be using the same URIs for this to work correctly (and for LSP we must use unique file URIs, because package: URIs are not globally unique because the workspace can contain multiple versions of the same packages).

@lrhn
Copy link
Member

lrhn commented Feb 7, 2024

I'd definitely prefer if we didn't need to resolve dart-macro+package:foo/bar.dart to anything. If we even can.

The URI is an identifier (I in URI) of generated code, it's not a locator for it (L in URL). It doesn't say where to find the code.
If we resolve dart-macro+package:foo/bar.dart to dart-macro+file:///users/me/dart/foo/lib/bar.dart, then we still haven't given a file location. It's still an opaque resource identifier that doesn't point to the actual file (the actual bar.dart file isn't it, that's the source file the macro augmentation was created for).
So we're replacing one resource identifier with another, without actually bringing anyone closer to finding a file. That sounds like a waste of work.

If we need to find "the macro augmentation for file:///.../bar.dart", I'd rather go the other way and convert bar.dart to a package: URI first, if possible, then add the dart-macro+ prefix to its scheme.

You can still have macro-augmentations for files outside of lib/, but each Dart library, identified by its URI, will only have one macro augmentation file URI. Which is not a package: URI.

I think everyone would be happiest if the code of an augmentation doesn't need to exist on disk at all. Only write it out if some tool is unable to work if it doesn't have physical files. And those files should not be inside the lib/ directory anyway, so they won't really have a package: URI.
(That does mean that package_config needs to be updated to recognize dart-macro+package:foo/ to be in the foo package, because otherwise the file won't have a default language version.)

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

To be clear, the reason we need this is not because we need to find files on disk, or even that we need to tie this script back to the original source file. We need it because it happens to be the unique URI that the editor is using to identify this content and we need the debugger and the editor to use the same reference.

So, it would be fine if we used dart-macro+file: everywhere IMO. I assumed that wouldn't be possible for the same reason that the VM is using package: URIs (and not file: URIs) today, but I don't know if that's accurate.

However it's not fine for us to only have dart-macro+package:foo/foo.dart because in the editor/analyzer, we might have multiple versions of package:foo (because you can have multiple projects open) and therefore this URI is not unique (whereas having a URI that is based on the file is, because the filesystem is).

That said, I don't know how (or if) this interacts with package_config/etc.. these URIs from my perspective are only required for mapping in the debugger (eg. via lookupResolvedPackageUris) to translate between what the client/editor is using and what the VM is using.

I guess in theory, I could drop the prefix from the scheme, use the existing mappings, and then put the prefix back.. however this feels like an assumption we might not want to bake into lots of places.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

these URIs from my perspective are only required for mapping in the debugger

Actually, it occurs to me that they may also be printed in stack traces in stderr.. Ignoring the ambiguity of the IDE wants to linkify this output (if there are multiple versions of a package in the workspace), the package: variants would be shorter than the file variants, so probably that's another reason to not use file variants everywhere (and therefore be able to map back and forth).

@lrhn
Copy link
Member

lrhn commented Feb 7, 2024

I've implemented the minimum support for dart-macro+ in package:package_config, which is to recognize that dart-macro+uri belongs to the same package as uri. That's what's necessary to get a language version.
(dart-lang/package_config#148)

I can allow resolve to "resolve" a dart-macro+package:foo/bar to dart-macro+file:///somewhere/foo/lib/bar and let toPackageUri go in the other direction.
But it doesn't feel right, because toPackageUri is defined as returning a package: URI, and dart-macro+package: is not a package URI. Similarly resolve accepts only package URIs, and that's not a package URI.

I don't want to start claiming that dart-macro+package is a valid package URI, because it's not.

I would actually prefer if we don't change package_config, and let the tools that actually know what a dart-macro+package URI is be the ones to remove the dart-macro+ before asking something about the related Dart file.
The package_config package has one job: Read and understand .dart_tool/package_config.json, and that file has no relation to dart-macro+ URIs.
It should be handled at another level, by tools which do know what those URIs mean.

@jakemac53
Copy link
Contributor

jakemac53 commented Feb 7, 2024

It should be handled at another level, by tools which do know what those URIs mean.

The main thing is that we want all the tools to do this consistently, and if we handle it in this package we get a consistent implementation that handles any URI any tool might encounter. Otherwise should we publish another package that just removes the prefix before calling this package, and then adds it back? That doesn't seem right either.

I am still a bit confused about the exact issue that is happening here though, it does feel wrong to me for tools to be resolving the dart-macro+package uris in this way. It seems like more just a quirk of the fact that IDEs really want to deal in file URIs. So maybe it really is only the editor plugins that have to deal with this, and the logic should live there?

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

I agree with most of the above.. I don't think this really affects package_config. I also think that ideally no tools should should be removing and re-adding these scheme prefixes. I think that you could easily get from dart-macro+file:///foo/bar.dart back to /foo/bar.dart is just an implementation detail that should be assumed only by the code that's generating these scripts.

It seems like more just a quirk of the fact that IDEs really want to deal in file URIs.

To be clear, the issue here is not about getting to file URIs (or back to the original script that caused this one to be generated), we just need to get to a unique URI for the script that is the same both inside the analyzer and the debug adapter. It doesn't matter whether this looks like a file URI or not.

When you click "Go to Definition" in your IDE (when there is no running app) you will get to a copy of the generated macro code. When you add a breakpoint in this file, we need to be able to send that breakpoint to the runtime when you start debugging, and that means we need a unique URI to refer to it by.

Today, we are working on the basis that this URI will be dart-macro+file://path/to/source/file.dart.. it doesn't particularly matter what it is, as long as the analyzer can generate it, and the debug adapter can translate back and forth between it and whatever the runtime is using. To me, these feels exactly like what we have with file paths and package: URIs (the runtime uses package:, but the editor/analysis server use file paths.. the debug adapter translates between them using lookupResolvedUris and lookupResolvedPackageUris).

@lrhn
Copy link
Member

lrhn commented Feb 7, 2024

think that you could easily get from dart-macro+file:///foo/bar.dart back to /foo/bar.dart is just an implementation detail that should be assumed only by the code that's generating these scripts.

With one, sadly rather significant, exception. The semantics of

// library URI: package:foo/bar.dart
augment 'dart-macro+package:foo/bar.dart'; // Inserted by macro generation
// ...

which is the semantics that we want the post-macro-generation program to have, depends on dart-macro+package:foo/bar.dart to have the same language version as package:foo/bar.dart.
The language version is determined by the package that dart-macro+package:foo/bar.dart belongs to.
The package it belongs to is defined as:

  • If it's a package URI, the package with that name in the active package configuration.
  • If not, the (most deeply nested) package in the package configuration, whose root URI the URI is inside.

This is not a package URI, and it's not inside any file: URI, so it will not belong to any package.

The hack, which I have implemented, is to allow packageOf in package:package_config to recognize dart-macro+ and ask for the package of th rest of the URI.
But it is a hack, it's hard-coding a convention into resolution. We can do that (it's just adding a "URI is dart-macro+scheme:rest, then belongs to same package as scheme:rest" rule above), but it is exposing the intermediate files of macro generation.

(I'd rather allow package URIs with queries, like package:foo/bar.dart?macro-generated as the way to define a URI for a macro-generated file. That's still a package: URI, not one which is currently valid, but we could change that - and then not allow users to write such URIs in imports anyway. And it can survives resolve and toPackageUri if we want it to. A query is part of the URI identifier, unlike a fragment, so it is a differnt URI than the one without the query. We just generally don't allow queries, so the syntactic space is free.)

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 7, 2024

I'd rather allow package URIs with queries, like package:foo/bar.dart?macro-generated

I don't personally have any opinion on these URIs, but for the URIs we'll use to communicate with the editor, we couldn't do the same thing (eg. we can't use file:///foo/bar?macro-generated) because in order to provide these "virtual files" in the editor, they need to be a custom scheme. Anything starting file:/// is handled by the editor and expected to be on disk, whereas a custom scheme we can provide the implementation of.

There's certainly no reason we couldn't map between dart-macro+file:/// and package:...?macro-generated, although there is something nice and symmetrical with the prefix.

Whatever the specifics of the URIs, we do need some way to map between them in the debug adapter. Using lookupResolveUris and lookupResolvePackageUris feels natural to me (because it already handles package:<->file:/ mappings and this feels like the same kind of thing), but if there's a reason that won't work and we're happy to encode these rules into the debug adapter, that will probably work (and I'll probably do this locally for now to let me continue testing debugging).

@lrhn
Copy link
Member

lrhn commented Feb 8, 2024

I do like the idea of using a query suffix instead of a scheme prefix.
My one worry is that valid file URIs cannot contain queries.

Our Uri class can build strings starting with file: and ending with ?foo, but that doesn't make them valid file: URLs, and it may give us problems if such an URL ever flows into a tool that assumes, or enforces, valid file URLs.

I think the WhatWG URL spec does accept a query component on a file URL, though, which was my first worry. We do want to move Uri to a WhatWG URL compliant behavior, so we can use the built-in functionality in the browser.
And probably stay there, so if that spec allows file URIs with queries, I wouldn't worry about using them.

Basically someUri?dart-macro is a URI that is adjacent to someUri, but different from it. Whatever we do to read such a file needs to be aware of the difference. Which may be a problem if an IDE sees a file has URI file:///..foo.dart?dart-macro and thinks "I can read that from disk!".
If that's an issue, using a file: URI with a query suffix is probably not viable. (That's the "tool expecting valid file: URIs to be valid" case again, rather than throwing because it's invalid, it ignores the part that is invalid and misinterprets the URI.)

All in all, I think we can choose to say that file:///users/me/dart/pkgs/foo/tool/tool.dart?dart-macro is a valid Dart file-reference URI, but that users are not allowed to specify queries in import/export/part/augment URIs.
The biggest issue is how other tools react to such an URI.

It would have the added advantage that an import inside a generated file will be resolved relative to the original file, because a base URI of package:foo/src/bar.dart?dart-macro used to resolve a reference of ../../baz.dart (notice one .. too many) directly gives package:foo/baz.dart, because resolving a relative URI against a base URI will discard the base query, and because it's resolved as a known package URI with all the necessary special casing we have made for package URIs.
Without that, our compilers will need to recognize the base URI of a macro-generated augmentation file specially, and remove the dart-macro+ prefix before resolving against it. Otherwise a macro file dart-macro+package:foo/src/bar.dart with an import of ../baz.dart would resolve to dart-macro+package:foo/baz.dart, which is a macro-generated file, not the file we tried to import. (I assume that's already happening, but it means one more place where the compilers need to be aware that macro-generated files with weird URIs exist, and need special casing. It can't just use the normal semantics for anything related to the URIs of such files.

We could also use a fragment, #dart-macro. Those have the advantage that they apply to any URI, because they are not really a part of it. They're metadata-on-the-end of a URI>
It's a little more iffy to distinguish import 'foo.dart'; and import 'foo.dart#dart-macro';, because at the URI level they identify the same resource, and the #dart-macro is an internal reference into that resource, which is not the case here.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 8, 2024

For the URI that is used to communicate between LSP and the editor, and the editor and the debug adapter, we definitely cannot use file:///. It's not just that the IDE might try to read from disk, we also have no way to support the contents of files unless they are for a custom URI scheme that we have registered.

So whatever is being used inside the runtime, we must ultimately convert this to a custom-scheme URI in the debug adapter to go to the client (and the LSP server must also use those same custom-scheme URIs to provide content and functionality like navigation/diagnostics).

With one, sadly rather significant, exception.

It occurs to me we probably have another exception here. In the debug adapter, we have settings to allow the user to debug "My code", "My code + external packages" or "My code + external packages + sdk". In order to determine what is "My code", we look at whether the resolved file URI is inside a set of workspace paths.

To handle the same for macros, we'll need to strip the dart-macro+ prefix from the custom-file URI to be able to do this same kind of check.

@lrhn
Copy link
Member

lrhn commented Feb 8, 2024

So, generally, we need the dart-macro+url to behave like url in a number of cases:

  • Is it inside the workspace.
  • Is it inside a particular package in a package_config.
  • Resolving file references inside it (import/export/part/augment)

Any time it matters where a Dart file is, a dart-macro+url should claim to be at the position of the url.

If we actually want to read the file, then that's not where it is (there is a completely different file there).

The question is then who should be stripping the dart-macro+ before asking?
It has to be someone who knows the URIs exist. And we don't want everybody to have to know that.

The obvious choices are the tools that macro-compilation is integrated into: Compilers and the analyzer. Maybe the LSP.

The compilers and analyzer are the ones that need to ask about the package of a dart-macro+ URI. They can strip the prefix before asking, knowing that they are assuming that the macro-generated file has the same answer as the corresponding non-prefixed-URI file. (Which the macro engine should then ensure.) They are also the ones to resolve import URIs.
I think those tools they should be the ones removing prefixes, at the point where they need to ask a question depending on location on a macro-prefixed URI. Downstream tools and packages shouldn't need to know.

To see whether the file is inside a specific workspace, the debug adapter may need to know to strip the prefix, and ask about another file, and again assume the answer to be the same. That's more annoying, but unless the analysis server starts providing a service to check whether a URI is included below another URI, so it can do the prefix-twiddling, the code that actually knows what the the workspace location is, will have to know about macro-URIs too, if it wants them to be included.
(Exposing a general "Is this URI morally inside that URI" functionlity, which can account for things like resolving package URIs or removing macro-prefixes, might be an idea. The "does this file have a package URI" functionality is an example of this, which is language and package-config dependent.)

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 8, 2024

The obvious choices are the tools that macro-compilation is integrated into: Compilers and the analyzer. Maybe the LSP.

I don't think the LSP server needs to know this directly (it wraps the analyzer which should take care of most of this, minus some custom mapping because the analyzer currently uses paths and not URIs in many APIs). The debug adapter does need to know, only for the "is in workspace" check.

One thing that might be helpful in these discussions is defining some terms for these different kinds of URIs. Originally in the debug adapter, we just had two things:

  1. File paths - these are what the editor always used. When the debugger breaks at a breakpoint, we had to tell the editor a file path (something physically on disk) for it to open that file (ignoring a slight niggle of downloading source code from the VM, but that uses internal incrementing integer references unrelated to any paths/URIs from the VM)
  2. URIs - these were used in the VM (for example we might see package:foo/foo.dart or dart:io)

We ended up with some helpers to convert between Paths and URIs and it seemed fairly simple. Then we started mapping SDK sources and started dealing with org-dartlang-sdk URIs. Today, file paths are really going away entirely and instead we need to use URIs everywhere (because the client is no longer restricted to physical files).

So now we have at least three "kinds" of URIs in the debug adapter. I don't know if these already have names, but I think it would be useful to define them (so we can name APIs the same across tools and be on the same page in discussions). For example:

  1. The URIs in the VM and used for importing
    • dart:io
    • package:foo/foo.dart
    • (maybe file:/// for things outside of lib/??)
    • dart-macro+package:/// ?
  2. The URIs that the VM returns when we ask it to resolve those URIs (via lookupResolvedPackageUris).. for non-SDK sources, these are generally file:/// URIs, but for SDK sources they are org-dartlang-sdk URIs
    • org-dartlang-sdk:///...
    • file:///...
    • dart-macro+file:/// ?
  3. The URIs we can send to the client. In the debug adapter I'm currently referring to these as "file-like URIs". Previously they would be Strings and called "paths", but that doesn't work now. These are generally the same as (2) but for SDK sources the org-dartlang-sdk URI is converted to a real file path for the source inside the SDK (the same place the user would end up if they navigated into SDK sources via the analyzer)
    • file:///my/dart-sdk/core/iterable.dart
    • file:///my/projects/foo/lib/foo.dart
    • dart-macro+file:///my/projects/foo/lib/foo.dart

(1) is what we see in the VM and will show up in things like call stacks in the debugger view. In the debug adapter, we will convert them to (2) using lookupResolvedPackageUris and then to (3) using some mappings in the debug adapter. Those are what we'll use to communicate with the editor so it can open those files.

In the opposite direction, the editor will send us (3) (for example when the user adds a breakpoint), we'll convert it to (2) with the built-in mappings in the debug adapter, and then use lookupPackageUris to convert back to (1) to send to the VM (having "package" in the name of that API is a bit confusing, but it is what it is).

The check the debug adapter does for whether a URI is "my code" will take (3) and just do the (unfortunate, but simple and probably necessary) stripping of the URI prefix and then compare the result against the workspace roots it knows about.

@lrhn
Copy link
Member

lrhn commented Feb 12, 2024

I don't thing we need to "resolve" a dart-macro+package:foo/bar to dart-macro+file://home/me/dart/pkg/foo/lib/bar.

It's a non-package: URI, so you can just use it directly. If the macro generated files are never saved to disk, then it doesn't matter whether you ask the source for dart-macro+package:... or dart-macro+file:...`, they are both opaque URIs, and if anything, things might get harder for ourselves if we try to give them two URIs instead of just one.

In a way, dart-macro+package:... and dart-macro+file:... are magical URIs just like org-dartlang-sdk:.... They only make sense when passed used with tools that understand what they mean, and in this case, only if the tools actually know how to generate the file content, if it hasn't already. If we're asking the analyzer about the URI that a front-end compiler generated, the analyzer will just have to generate the same content.

There is no benefit in converting dart-macro+package:... to dart-macro+file:... if the corresponding file doesn't exist anyway. There will be dart-macro+file:... URIs too, from non-lib/-dir libraries.

I think the main reason we have the org-dartlang-sdk: URIs is so we can give a URI to part files.
If we could name those as dart:io/file_system_entity.dart, and just disallow references to URIs of that form from non-platform libraries, I don't think we need the extra org-dartlang-sdk: scheme.

(If a macro generated file contains a macro application, will it get a dart-macro+dart-macro+package:... URI? I guess the plan is to flatten macro generated files for a main library into one level. That requires the files to be entirely generated with all imports uniquely prefixed, otherwise we can't flatten augment files with different imports.)_

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 21, 2024

Note that the lib folder assumptions do not even necessarily hold today either.

Yep, when I said "in lib"/"not in lib" above, I just meant things that would/wouldn't resolve to package: URIs and not that we should check the URIs in that way. We should use lookupResolvedPackageUris/lookupResolvedUris for this translation.

@bkonyi what are your thoughts in putting logic like described above into DDS?

(although that makes me wonder if today we already rely on / guarantee DDS is available?)

@lrhn
Copy link
Member

lrhn commented Feb 21, 2024

So the problem is that:

  • We need to assign a globally unique URI to each file open in an editor, which identifies the source. This is why we assign a file: URI to package: URIs, because package: URIs are not globally unique. (And the IDEs are file editors, so they love file: URIs.)
  • So we can't use dart-macro+package:foo/bar.dart to identify an editor resource, because that's not globally unique.
  • We can then choose to use dart-macro++resolve(package:foo/bar.dart), which is a dart-macro+file: URI.
  • But that's not necessarily globally unique either.
    • A file: URI is globally unique because it identifies a single source of truth about which source code it represents.
    • A dart-macro+file: URI could come either from macro-generating code from a file: URI identified library, or
      from converting a dart-macro+package:... URI to a dart-macro+file:... URI, by "resolving beneath the dart-macro+.
    • And those two can have different source code (possibly must have, if they use an absolute augmentation of declaration).
  • You generally shouldn't use the same file through both a package: and a file: URI, and if you don't, there is no problem.
  • But we haven't prohibited it, or allowed compilers to canonicalize to the package: URI if they can. (We could.)
    • I do hope we have analyzer warnings if someone imports a file in lib/ through a file: URI.

I still think a better uniquification would be to append the location of the package_config.json file that the package: URI is resolved by.
So the URI for package:foo/bar.dart's macro generated code could be dart-macro+package:foo/bar.dart?pkgcfg=file://home/me/libs/foo/.dart_tool/package_config.json.

That's globally unique and it's defined entirely in terms of the original library's identifying URI, plus, if it's a package URI, an identification of the package configuration that package: URI is defined in terms of.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 21, 2024

So the problem is that:

Yes, this is an accurate summary :)

I still think a better uniquification would be to append the location of the package_config.json file that the package: URI is resolved by.

I assume in this world, dart-macro+file: would be used for files imported through file: URIs and/or outside of packages ("outside lib folders")?

I was not a fan of this idea originally, but now knowing what I do, I feel like this might make more sense.

But, I suspect this could also be a big analyzer change. Currently it assumes only a single generated file, and it uses synthetic paths (foo.macro.dart for the content generated for foo.dart). However, if this was done - it feels like it would solve all of the problems - if a user did generate two different sources from the same file being imported as package: and file:, I believe we'd be able to serve them both up and everything would be sound.

(fyi @scheglov @bwilkerson.. this is a long thread, but it's summarised well in @lrhn's comment above this one.. I'd be interested to hear your thoughts)

(Maybe we should move this into #53950 or #54323 because we're probably a bit off-topic for the current title of this issue, with the original (described) issue somewhat resolved)

@scheglov
Copy link
Contributor

The analyzer does canonicalize file: URIs that can be converted into package: URIs.
So, it does not matter if you used file:, you will get package: anyway.

  solo_test_X() async {
    var a = newFile('$testPackageLibPath/a.dart', '');

    await assertNoErrorsInCode('''
// ignore:unused_import
import 'file://${a.posixPath}';
''');

    var uri = result.libraryElement.importedLibraries.first.source.uri;
    expect('$uri', 'package:test/a.dart');
  }

@stereotype441
Copy link
Member

My personal preference would be to resolve this by having the common front end canonicalize file: URIs into package: URIs, just as the analyzer does. That would ensure that each file on the user's file system has a single unique URI.

@jakemac53
Copy link
Contributor

jakemac53 commented Feb 22, 2024

My personal preference would be to resolve this by having the common front end canonicalize file: URIs into package: URIs, just as the analyzer does. That would ensure that each file on the user's file system has a single unique URI.

Just to play devils advocate - while this situation is increasingly less likely - such canonicalization does rely on there being a package config file which knows about the package these files belong to.

Technically, you could still have such a script loaded as a file URI in a context where it has either no package config or a package config with no entry for its containing package. And then also loaded separately with a package config which does know about its package. In which case, you still end up with two separate libraries pointing at the same underlying "file".

This is only valid for the IDE context, where multiple workspaces might exist at once. But that is the primary context we are trying to support here also.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 22, 2024

Technically, you could still have such a script loaded as a file URI in a context where it has either no package config or a package config with no entry for its containing package. And then also loaded separately with a package config which does know about its package. In which case, you still end up with two separate libraries pointing at the same underlying "file".

Can this happen within a single VM though? I don't think this is a problem in the editor itself because the editor is always using the file-like URIs and doesn't care at all about the package versions.

The real blocker here is getting to the point where we can give one of these file-like URIs to a VM and have it map it concretely to the loaded script and back (using lookupResolvedPackageUris/lookupResolvedUris).

For non-macros, we can give file:///foo/bar.dart and get back package:foo/bar.dart (or null/the original file URI if there is no package: version). And we can do the same in reverse.

Based on discussions above, I understand we cannot do this today with dart-macro+file:/ because this mapping is not the same (we might have both a dart-macro+file:/ and dart-macro+package:/ in the runtime for that same file and they can have different contents). So the question is, what do we need to do so that we can reliably convert the file-like URI from the editor to the right thing in the runtime (so - for example - we can add breakpoints into the right script).

If all scripts were canonicalized as Paul suggests, I think this might allow populating importUri and fileUri to support this mapping unambiguously (or at least as unambiguously as we have with file/package today, which presumably is good enough).

The other (sound) alternative I think is to come up with a new format for the unique URIs to go into fileUri in the kernel files and be used by the editors/LSP server. Lasse made some suggestions above (encoding the package config) which I think would work, but I think would also be much more complex (and may look odd to users in the places where these may show up in the editor UI).

There are some slightly less-sound options (such as mapping this in DDS using the rules I noted above) but I really worry that this will be difficult to ensure is always accurate because we're using the same URIs on both sides of the debug adapter to mean different things and we'd need to ensure all code operating on a URI correctly handles it for the given context.

@jakemac53
Copy link
Contributor

Can this happen within a single VM though? I don't think this is a problem in the editor itself because the editor is always using the file-like URIs and doesn't care at all about the package versions.

No, not within a single VM afaik (always one package config). But I don't think that matters. The issue of different file contents for the augmentations based on the import URI of the library is still present, and so using the file URIs as the URI in the editor will always have the possibility of this conflict. Very likely, it just doesn't matter. But canonicalizing things when possible doesn't fully solve the issue.

A concrete example of this would be if you created a subdir of a package, lets say example, which doesn't depend on the package in its parent dir, but instead imports via relative (or absolute) path the files under lib of that parent package. When opening up the parent directory in the IDE, there are now two valid versions of the augmentation libraries for all the libraries under lib/ of that package, depending on if you are viewing it from the context of the parent package or the example package.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 22, 2024

That's true.

That particular setup feels like an edge case though and would only impacts users doing that (and there's a way to avoid it). The current issue is that we're unable to support even the projects that are doing everything "correctly", because we can't map between these URIs (due to fileUri is not populated in the kernel file with the URI the editor is using).

There are some trade-offs here and I don't know who is best placed to decide on a solution (my preferences would be Paul's canonicalization *, followed by Lasse's package-config in URIs *, followed by some mapping function in DAP/DDS). I'd prefer not try to map these in DAP unless it's clear the other two are no-gos because I think it's an overall worse solution.

* I am assuming that with either of these, that fileUri can be populated with the URI the editor will use.. if there are reasons that might not be possible, maybe they're not the right solution.

(I'm available to chat during UK-ish hours if I can help move this decision along, though I think you have a good understanding of the issue)

@jakemac53
Copy link
Contributor

jakemac53 commented Feb 22, 2024

I am completely OK with just accepting that a niche edge case like this has some broken behavior, FWIW. As long as the core assumptions aren't baked into the language spec, and ideally not the compiler/SDK. It is much easier later on to make changes at the tooling layer than the language/SDK layers.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 22, 2024

Does "As long as the core assumptions aren't baked into the language spec, and ideally not the compiler/SDK" mean that Paul's suggestion won't work? (eg., even with that canonicalization, the fileUri for dart-macro+package:foo/bar.dart could not be the equiv dart-macro+file:/ URI the editor is using?)

If so, then I think the next best option may be Lasse's suggestion - first agreeing on a URI format (again 😄) that could go into fileUri? This one may have some knock-on effects with the analyzer because it would change the way we generate these URIs (we might need more context to do so).

@jakemac53
Copy link
Contributor

Does "As long as the core assumptions aren't baked into the language spec, and ideally not the compiler/SDK" mean that Paul's suggestion won't work?

Canonicalizing the URIs (to the package: equivalents) is I think a good, and safe thing to do. We just can't guarantee canonicalization always happens. It is dependent on the package config being set up properly.

(eg., even with that canonicalization, the fileUri for dart-macro+package:foo/bar.dart could not be the equiv dart-macro+file:/ URI the editor is using?)

This would also still be safe to do, within a single compile. I am less convinced it is the right thing to do, for the compiler. It is outside the spec (but this entire concept of "import URI" versus "file URI" is outside the spec afaik). It might be the easiest pragmatic choice though. But it might be hard to roll back if we regret it later.

If so, then I think the next best option may be Lasse's suggestion - first agreeing on a URI format (again 😄) that could go into fileUri?

I would really love to avoid this if we could, but if we need to then yes including the package config path or something in the URI (via query param etc) would resolve this issue too. I am not sure how this would actually end up looking like in the IDE though?

@jakemac53
Copy link
Contributor

jakemac53 commented Feb 28, 2024

Decision reached

In the language meeting today we reached a decision on this 🚀 , assuming the CFE work is feasible:

  • We should change the spec to say that library URIs should be canonicalized to their package uri equivalents where they can be. cc @lrhn I think had some specific wording in mind
    • This is technicially a breaking change, but a very small one. But we do need to go through that process probably.
    • We likely eventually want to make file imports which would be canonicalized to a package uri an error.
      • with a corresponding dart fix
      • this has always been a bad thing to do with very weird behavior and debugging issues
    • Specifically for the CFE:
      • The fileUri field should be populated with a dart-macro+file URI, as soon as possible - mapping dart-macro+package uris in the same way as normal package URIs would be mapped.
      • Actual work to do the canonicalization should wait to be implemented until we figure out which release we want to ship it in. This very likely would not happen until after the next beta cut.

cc @johnniwinther @scheglov - I believe this is a no-op for the analyzer because it already does this canonicalization.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 28, 2024

The fileUri field should be populated with a dart-macro+file URI, as soon as possible - mapping dart-macro+package uris in the same way as normal package URIs would be mapped.

🎉

Thank you for letting me hijack your meeting today, it was great to get to a decision 😄 Should we file specific issues for this work and close this one, because it's very long/noisy and probably not the best way to track what's left.

(I'm also keen to start testing once the change above is available and can build the SDK locally, so I'm interested in being CC'd on the CL if it might give me a head-start if landing the change might take a little longer).

@lrhn
Copy link
Member

lrhn commented Feb 28, 2024

We have several options here on how to specify this change. I'd probably go for:

Every file of a Dart program is identified by a URI. A URI to identify a file can be provided an import, export, part, augmentation, augmentation of and part of URI string, or provided as an entry point to tools. Before such a URI is used, it will now be "package-canonicalized". If the given URI (resolved against the relevant base if relative) corresponds to a package: URI, as determined by a package configuration maintained by the compiler, then the URI is canonicalized to (replaced by) the corresponding package: URI before being used. If there is no corresponding package: URI for a URI, the canonicalization is the original URI. The canonicalization of a package: URI is always itself.

This means that if a file: URI denoting a library file is considered as corresponding to a package: URI, there is no way to import the that library with an identifying URI which is a file: URI.

Our imports already do a bunch of normalization (fx case-, escape-, and path-normalization), but this is not just one more. The other normalizations depend only on the URI syntax, this one depends on external metadata, a "package configuration", which we will keep completely abstract in the spec. A package configuration is just, in this context, a partial mapping from URIs to package: URIs. (The actual mapping has some nice properties, like nearby files having nearby package: URIs, but for canonicalization, that doesn't need to matter.)

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 29, 2024

@johnniwinther thank you for https://dart-review.googlesource.com/c/sdk/+/355140 - I just tested and it seems to be working exactly as I hoped, this is great :-)

I'll leave it to others to decide whether to close this issue and open a more specific one about the canonicalization (which may need announcing as a breaking change).

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 29, 2024

@johnniwinther actually, my testing was bad - I was testing in the bin folder. I'm actually seeing empty strings for URIs again when the macro-generated sources are inside the lib folder. Here's a repro (using a bleeding edge build at 08e056a).

bin/main.dart

import 'package:macro_proposal/foo.dart' as foo;

void main() {
  foo.main();
}

lib/foo.dart

import 'package:macro_proposal/with_hello.dart';

void main() {
  A().hello();
}

@WithHello()
class A {}

lib/with_hello.dart

// There is no public API exposed yet, the in-progress API lives here.
import 'package:_fe_analyzer_shared/src/macros/api.dart';

macro class WithHello implements ClassDeclarationsMacro {
  const WithHello();

  @override
  Future<void> buildDeclarationsForClass(
    ClassDeclaration clazz,
    MemberDeclarationBuilder builder,
  ) async {
    builder.declareInType(DeclarationCode.fromString('''
  void hello() {
    throw 'a';
  }
'''));
  }
}

If I run this and pause on the exception, I see an empty string URI for the top frame's script. If I run it without the debugger, the printed stack trace from the exception also shows no URI:

PS C:\Dev\Test Projects\macro-sample> dart --version
Dart SDK version: 3.4.0-edge.08e056a9649ea7bce74e67d5f6e0f14f426c49e3 (main) (Thu Feb 29 11:59:20 2024 +0000) on "windows_x64"
PS C:\Dev\Test Projects\macro-sample> dart --enable-experiment=macros .\bin\danny.dart
Unhandled exception:
a
#0      A.hello ()                          <-- URI in parens is empty here
#1      main (package:macro_proposal/foo.dart:4:7)
#2      main (file:///C:/Dev/Test%20Projects/macro-sample/bin/danny.dart:4:7)
#3      _delayEntrypointInvocation.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:297:19)
#4      _RawReceivePort._handleMessage (dart:isolate-patch/isolate_patch.dart:184:12)
PS C:\Dev\Test Projects\macro-sample> 

@johnniwinther
Copy link
Member

@bkonyi Do you know if some plumbing is needed for the new file uri?

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 29, 2024

@johnniwinther I noticed in your CL augmentationUri changed to augmentationFileUri here:

component.uriToSource[augmentationFileUri] = new Source(

Most of the other instances of augmentationUri changed to augmentationImportUri. Is this possibly an error? I have zero knowledge of this code and don't know the implications of changing it, but I tried it locally and it appeared to fix the empty string:

PS C:\Dev\Test Projects\macro-sample> dart --enable-experiment=macros .\bin\danny.dart
Unhandled exception:
a
#0      A.hello (dart-macro+package:macro_proposal/foo.dart:5:5)            <-- This looks better
#1      main (package:macro_proposal/foo.dart:4:7)
#2      main (file:///C:/Dev/Test%20Projects/macro-sample/bin/danny.dart:4:7)
#3      _delayEntrypointInvocation.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:297:19)
#4      _RawReceivePort._handleMessage (dart:isolate-patch/isolate_patch.dart:184:12)
PS C:\Dev\Test Projects\macro-sample> 

Edit: Not certain it fixes all the cases though, my tests still fail with empty strings, so maybe that's not the right fix (or not the complete fix).

@johnniwinther
Copy link
Member

@DanTup @bkonyi This is exactly where the model breaks down (and where there might be some plumbing needed in the VM); the uriToSource uses file URIs as keys, so to be consistent with that, I used it here as well.

@DanTup
Copy link
Collaborator Author

DanTup commented Feb 29, 2024

Ah, gotcha. I'll wait for Ben's input then, thanks!

@bkonyi
Copy link
Contributor

bkonyi commented Feb 29, 2024

I'll have to dig around in the VM to understand what's happening here. I wouldn't expect to need any extra plumbing for this, though, as long as we're still populating these URIs in uriUtf8Bytes and importUriUtf8Bytes in the kernel.

@DanTup
Copy link
Collaborator Author

DanTup commented Mar 4, 2024

@bkonyi thanks! Let me know if anything isn't clear or you can't repro it from the above. This comment and using the cli is probably the easiest way to repro the issue of empty strings, but DAP's use is a bit more complex. If you have a proposed fix I'm happy to build it locally and test with my in-progress DAP changes.

@bkonyi
Copy link
Contributor

bkonyi commented Mar 7, 2024

Some other stuff came up, so @derekxu16 is going to look into this for me.

@derekxu16
Copy link
Member

derekxu16 commented Mar 7, 2024

The data for the mappings comes from the kernel's SourceInfo for the script, so it's likely something else that needs to be updated in the CFE.

It looks like this is true, when uri_string is "dart-macro+package:macros_sample/foo.dart" (I'm using Danny's repro steps but I named my project "macros_sample" instead of "macro_proposal") on line 1931 here

const String& uri_string = helper_.SourceTableUriFor(index);
const String& import_uri_string = helper_.SourceTableImportUriFor(index);

Stepping into the SourceTableImportUriFor call reveals that a size of 0 is being read on line 3094.

String& KernelReaderHelper::SourceTableImportUriFor(intptr_t index) {
AlternativeReadingScope alt(&reader_);
SetOffset(GetOffsetForSourceInfo(index));
SkipBytes(ReadUInt()); // skip uri.
SkipBytes(ReadUInt()); // skip source.
const intptr_t line_start_count = ReadUInt(); // read number of line start
// entries.
for (intptr_t i = 0; i < line_start_count; ++i) {
ReadUInt();
}
intptr_t size = ReadUInt(); // read import uri List<byte> size.
return H.DartString(reader_.BufferAt(ReaderOffset()), size, Heap::kOld);
}

@jakemac53
Copy link
Contributor

cc @johnniwinther ?

@johnniwinther
Copy link
Member

@DanTup
Copy link
Collaborator Author

DanTup commented Mar 11, 2024

@johnniwinther I built that locally and my new tests are now passing, so it looks good to me - thank you! 🙂

@DanTup
Copy link
Collaborator Author

DanTup commented Mar 12, 2024

Considering this fixed by 0a68ca9, I'll open more specific issues if I hit any other issues. Thanks :)

@DanTup DanTup closed this as completed Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. cfe-feature-macros Implement macros features in the CFE front-end-kernel vm-debugger
Projects
None yet
Development

No branches or pull requests

8 participants