Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce ref assemblies from command-line and msbuild #17558

Merged
merged 14 commits into from
Mar 21, 2017

Conversation

jcouv
Copy link
Member

@jcouv jcouv commented Mar 5, 2017

Relates to #2184

There are two command-line parameters (/refout: and /refonly) which map to CommonCommandLineArguments.OutputRefFileName and EmitOptions.EmitMetadataOnly.

If /refonly is specified, the ref assembly will be output as the primary output (to the /out path) and no /refout path is allowed. (This maintins the previous behavior for the EmitOptions.EmitMetadataOnly flag)
Also, if /refonly is specific, the no PDB will be generated (despite any command-line parameters to the contrary).

If /refout is specified, then the assembly is output as the primary output (to /out path) as usual, and the ref assembly is output to /refout path.

Background:

The compiler already supports emitting "metadata-only" assemblies. This is used by the IDE for cross-language dependencies. Those include private members, but do not include any method bodies.

We will be introducing a second concept, which is "reference assemblies" (also called skeleton assemblies). Those are a further stripped down versions metadata-only assemblies (with some caveats, they won't include private types or members, for instance). They will be used for build scenarios. Producing ref assemblies will be driven by the /refonly and /refout parameters.
The guiding principle of ref assemblies is that they should have the minimum amount of stuff to preserve the compile-time behavior of consumers.

But, as a starting point (to unblock work on msbuild), we are using metadata-only assemblies as a close approximation of ref assemblies. That gives us many of the benefits of ref assemblies at much lower cost. As we narrow down what is included in ref assemblies (see #17612), the two concepts will diverge. When they do, we will likely need specific emit or compilation options (TBD) to indicate that we're producing ref assemblies.

Filed #17612 for refining what changes and diagnostics should affect ref assemblies.

@jcouv jcouv self-assigned this Mar 5, 2017
@jcouv jcouv force-pushed the msbuild-refdll branch 7 times, most recently from 6a721ff to 349610e Compare March 7, 2017 00:42
@gafter gafter added the PR For Personal Review Only The PR doesn’t require anyone other than the developer to review it. label Mar 7, 2017
@jcouv jcouv mentioned this pull request Mar 7, 2017
34 tasks
@jcouv jcouv force-pushed the msbuild-refdll branch 4 times, most recently from f8a0f4a to 4529afb Compare March 9, 2017 17:58
default:
return handle.Kind.ToString();
}
}

// PROTOTYPE it would be better to shared the implementation with PdbToXml
Copy link
Member Author

@jcouv jcouv Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will not check this in as-is. I'll try to factor this common code. #Resolved

@jcouv jcouv changed the title [Not ready for review] Produce ref assemblies from command-line and msbuild Produce ref assemblies from command-line and msbuild Mar 9, 2017
@jcouv jcouv removed the PR For Personal Review Only The PR doesn’t require anyone other than the developer to review it. label Mar 9, 2017
if (!CompileAndEmitRefAssembly(compilation, emitOptions.WithEmitMetadataOnly(true),
finalRefPeFilePath, touchedFilesLogger, diagnosticBag, cancellationToken))
{
return Failed;
Copy link
Member Author

@jcouv jcouv Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should still write touched files (see below) even if some part this failed? What do you think? #Closed

@jcouv
Copy link
Member Author

jcouv commented Mar 9, 2017

@dotnet/roslyn-compiler Please take a look at your convenience.
If possible, let's focus on tests, validation and scenarios first.
I'll port the remaining tests to VB once we have agreement on C# side. #Resolved

@jcouv jcouv changed the base branch from master to features/refout March 9, 2017 18:28
@AlekseyTs
Copy link
Contributor

AlekseyTs commented Mar 9, 2017

It looks like there are some legitimate test failures. #Closed

@jcouv
Copy link
Member Author

jcouv commented Mar 9, 2017

Sorry about that. Fixed now. #Resolved

@@ -86,6 +86,11 @@ public abstract class CommandLineArguments
public string OutputFileName { get; internal set; }

/// <summary>
/// Path of the output ref assembly or null if not specified.
/// </summary>
public string OutputRefFilePath { get; internal set; }
Copy link
Contributor

@AlekseyTs AlekseyTs Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OutputRefFilePath [](start = 22, length = 17)

We are not adding RefOnly in this PR? #Closed

Copy link
Member Author

@jcouv jcouv Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefOnly is in EmitOptions. #Resolved

Copy link
Member Author

@jcouv jcouv Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using EmitOptions.EmitMetadataOnly. #Resolved

if (finalRefPeFilePath != null)
{
if (!CompileAndEmitRefAssembly(compilation, emitOptions.WithEmitMetadataOnly(true),
finalRefPeFilePath, touchedFilesLogger, diagnosticBag, cancellationToken))
Copy link
Contributor

@AlekseyTs AlekseyTs Mar 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finalRefPeFilePath, touchedFilesLogger, diagnosticBag, cancellationToken)) [](start = 28, length = 74)

Consider shifting this line to the right. #Closed

@jcouv
Copy link
Member Author

jcouv commented Mar 21, 2017

FYI @AlekseyTs we had a test that was trying to generate netmodules with metadata only. Per our discussion, this is not useful and it is explicitly disallowed now (both from the command-line and the API). Updated the test to avoid that case.

@jcouv jcouv merged commit 235acb3 into dotnet:features/refout Mar 21, 2017
@jcouv jcouv deleted the msbuild-refdll branch March 21, 2017 18:55
@@ -2008,6 +2036,7 @@ internal void EnsureAnonymousTypeTemplates(CancellationToken cancellationToken)
IMethodSymbol debugEntryPoint = null,
Stream sourceLinkStream = null,
IEnumerable<EmbeddedText> embeddedTexts = null,
Stream metadataPeStream = null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadataPeStream [](start = 19, length = 16)

'e' should be upper case: metadataPEStream

I'd suggest a different name though. Something like contractAssemblyStream, or referenceAssemblyStream.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll capitalize.
As for the name, ref/contract assemblies are only one kind of metadata-only assemblies. That's why I used this name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I don't understand. Why would we capitalize PE, but not PDB?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PE is two letters, PDB is three. See .NET Design Guidelines

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other kinds of metadata-only assemblies are there? Perhaps metadataOnlyAssemblyStream?

@@ -2331,6 +2387,7 @@ internal void EnsureAnonymousTypeTemplates(CancellationToken cancellationToken)
DiagnosticBag metadataDiagnostics = null;
DiagnosticBag pdbBag = null;
Stream peStream = null;
Stream refPeStream = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refPeStream [](start = 19, length = 11)

referenceAssemblyStream

@@ -2318,6 +2373,7 @@ internal void EnsureAnonymousTypeTemplates(CancellationToken cancellationToken)
internal bool SerializeToPeStream(
CommonPEModuleBuilder moduleBeingBuilt,
EmitStreamProvider peStreamProvider,
EmitStreamProvider metadataPeStreamProvider,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadataPeStreamProvider [](start = 31, length = 24)

refrenceAssemblyStreamProvider

}
}

if (metadataPeStream != null && options?.EmitMetadataOnly == true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if [](start = 12, length = 2)

We should also disallow Embedded PDBs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in that case pdbStream will be null since the PDB is emitted to peStream)


In reply to: 107311874 [](ancestors = 107311874)

// If so, use the RVA of the already serialized one.
// Note that we don't need to rewrite the fake tokens in the body before looking it up.

// Don't do small body method caching during deterministic builds until this issue is fixed
// https://github.com/dotnet/roslyn/issues/7595
int bodyOffset;
if (!_deterministic && isSmallBody && _smallMethodBodies.TryGetValue(methodBody.IL, out bodyOffset))
if (!_deterministic && isSmallBody && _smallMethodBodies.TryGetValue(il, out bodyOffset))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_deterministic [](start = 17, length = 14)

This issue doesn't apply to "throw null" bodies, so perhaps we should actually reuse these bodies in reference assemblies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How confident are you that's the case? Aleksey wasn't sure we understood the root cause of this problem, in which case it is safer to apply the same workaround.

Copy link
Member

@tmat tmat Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100%. The problem with determinism and body haring is related to string literals. There are no string literals in the "throw null" body.


// A map of method body before token translation to RVA. Used for deduplication of small bodies.
private readonly Dictionary<ImmutableArray<byte>, int> _smallMethodBodies;

private static readonly ImmutableArray<byte> ThrowNullIL = ImmutableArray.Create((byte)ILOpCode.Ldnull, (byte)ILOpCode.Throw);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThrowNullIL [](start = 53, length = 11)

Instead of hardcoding just the method IL, i'd suggest to hardocde the entire encoded method body byte sequence. It's gonna be just a single byte or two more and we can avoid some extra code in SerializeMethodBody.

(methodBody.LocalsAreZeroed ? MethodBodyAttributes.InitLocals : MethodBodyAttributes.None),
ref mvidStringHandle,
ref mvidStringFixup);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid all this clutter if we hardcode encoded "throw null" body instead of just the IL.

{
return false;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to move this to the caller that potentially produces uses both outputs? I think that would be just a single caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I see the advantage. There is a single caller, but why have two methods with same name spread across two classes? Especially since this one is a thin wrapper on top of the existing one.

PdbPath = pdbPath,
EmitPdb = emitPdb,
EmitPdb = emitPdb && !metadataOnly,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emitPdb && !metadataOnly [](start = 26, length = 24)

Code above checks the value of emitPdb. So if we are gonna change it here the check above won't take the metadataOnly flag into account.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct and that's by design. The idea is that you can just add /refonly on a command-line and it will work, without having to remove the parameters related to emitting PDBs. For instance, you may have source links and we won't complain about it (one of the checks above).
Does that seem ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'd then add a comment.

/// Asserts that all the method bodies in the PE are empty or `throw null`, if expectEmptyOrThrowNull is true.
/// Asserts that none of the methods bodies are empty or `throw null`, if expectEmptyOrThrowNull is false.
/// </summary>
internal static void VerifyMethodBodies(ImmutableArray<byte> peImage, bool expectEmptyOrThrowNull)
Copy link
Member

@tmat tmat Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerifyMethodBodies [](start = 29, length = 18)

This is an odd helper. Perhaps change to VerifyMethodBodies(peImage, Action<byte[]> ilValidator)


public string GetTypeFromDefinition(MetadataReader reader, TypeDefinitionHandle handle, byte rawTypeKind)
{
var typeDef = reader.GetTypeDefinition(handle);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var typeDef = reader.GetTypeDefinition(handle) [](start = 16, length = 46)

Doesn't capture nested types.


public string GetTypeFromReference(MetadataReader reader, TypeReferenceHandle handle, byte rawTypeKind)
{
var typeRef = reader.GetTypeReference(handle);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var typeRef = reader.GetTypeReference(handle); [](start = 16, length = 46)

Doesn't capture nested types.

Copy link
Member

@tmat tmat Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

public string GetSZArrayType(string elementType)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps make all these methods expression-bodied.

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment that might need addressing in the build tasks. That part is very subtle and mistakes there can mean problems for the IDE, so definitely think through it.

@@ -705,6 +718,8 @@ protected internal virtual void AddResponseFileCommands(CommandLineBuilderExtens
commandLine.AppendPlusOrMinusSwitch("/optimize", _store, nameof(Optimize));
commandLine.AppendSwitchIfNotNull("/pathmap:", PathMap);
commandLine.AppendSwitchIfNotNull("/out:", OutputAssembly);
commandLine.AppendSwitchIfNotNull("/refout:", OutputRefAssembly);
commandLine.AppendWhenTrue("/refonly", _store, nameof(RefOnly));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you intend this to be user-settable in projects, for example if I have some project with manual type definitions so I can make a reference assembly that way? If so, then you should consider moving this to AddResponseFileCommandsForSwitchesSinceInitialReleaseThatAreNeededByTheHost.

var readStream = OpenFile(readFilesPath, consoleOutput, mode: FileMode.OpenOrCreate);
var writtenStream = OpenFile(writtenFilesPath, consoleOutput, mode: FileMode.OpenOrCreate);

if (readStream == null || writtenStream == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if one is created and the other isn't we leak the stream?

jcouv added a commit to jcouv/roslyn that referenced this pull request Mar 23, 2017
Microsoft.CodeAnalysis.CommandLineArguments.SourceLink.get -> string
Microsoft.CodeAnalysis.Compilation.CreateAnonymousTypeSymbol(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ITypeSymbol> memberTypes, System.Collections.Immutable.ImmutableArray<string> memberNames, System.Collections.Immutable.ImmutableArray<bool> memberIsReadOnly = default(System.Collections.Immutable.ImmutableArray<bool>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location> memberLocations = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location>)) -> Microsoft.CodeAnalysis.INamedTypeSymbol
Microsoft.CodeAnalysis.Compilation.CreateErrorNamespaceSymbol(Microsoft.CodeAnalysis.INamespaceSymbol container, string name) -> Microsoft.CodeAnalysis.INamespaceSymbol
Microsoft.CodeAnalysis.Compilation.CreateTupleTypeSymbol(Microsoft.CodeAnalysis.INamedTypeSymbol underlyingType, System.Collections.Immutable.ImmutableArray<string> elementNames = default(System.Collections.Immutable.ImmutableArray<string>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location> elementLocations = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location>)) -> Microsoft.CodeAnalysis.INamedTypeSymbol
Microsoft.CodeAnalysis.Compilation.CreateTupleTypeSymbol(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ITypeSymbol> elementTypes, System.Collections.Immutable.ImmutableArray<string> elementNames = default(System.Collections.Immutable.ImmutableArray<string>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location> elementLocations = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Location>)) -> Microsoft.CodeAnalysis.INamedTypeSymbol
Microsoft.CodeAnalysis.Compilation.Emit(System.IO.Stream peStream, System.IO.Stream pdbStream = null, System.IO.Stream xmlDocumentationStream = null, System.IO.Stream win32Resources = null, System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.ResourceDescription> manifestResources = null, Microsoft.CodeAnalysis.Emit.EmitOptions options = null, Microsoft.CodeAnalysis.IMethodSymbol debugEntryPoint = null, System.IO.Stream sourceLinkStream = null, System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.EmbeddedText> embeddedTexts = null, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) -> Microsoft.CodeAnalysis.Emit.EmitResult
Microsoft.CodeAnalysis.Compilation.Emit(System.IO.Stream peStream, System.IO.Stream pdbStream = null, System.IO.Stream xmlDocumentationStream = null, System.IO.Stream win32Resources = null, System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.ResourceDescription> manifestResources = null, Microsoft.CodeAnalysis.Emit.EmitOptions options = null, Microsoft.CodeAnalysis.IMethodSymbol debugEntryPoint = null, System.IO.Stream sourceLinkStream = null, System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.EmbeddedText> embeddedTexts = null, System.IO.Stream metadataPeStream = null, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) -> Microsoft.CodeAnalysis.Emit.EmitResult
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to ping @jaredpar and @gafter on public API change in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants