Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set external linkage on Rust functions defined with foreign ABIs. #9945

Merged
merged 0 commits into from
Nov 2, 2013

Conversation

miselin
Copy link
Contributor

@miselin miselin commented Oct 19, 2013

This fixes an issue where extern "C" functions would have internal linkage, making them invisible to anything that might be linked against the resulting object. In my use case, a Rust function is being called from assembly, and because the Rust symbol did not have the right linkage, I ran into linker errors.

This has also now been applied to bare Rust functions as well as statics that are publicly visible; that is, those which have a fully public path.

@alexcrichton
Copy link
Member

This could currently be happening for one of two reasons:

  1. You're building a binary. All functions are flagged as internal in this case (or at least rust functions, not sure if all functions are caught).
  2. You're building a library and this function isn't exported in the sense that another crate can use it.

I'm hesitant to land this without any tests. Linkage is something which is becoming quite tricky for us with lots of various cases here and there. I would expose it under a symbol with a particular name and then have that in an extern block in another test/file to make sure that it links right.

For the test, also be sure that it fails on current master and then passes afterwards. No use in committing tests that always work!

I think that this location for setting the linkage is fine for now. I think in general this needs to be cleaned up (because linkage is set in so many different places), but that's another problem for another time.

@miselin
Copy link
Contributor Author

miselin commented Oct 19, 2013

Another use case is in building a library that needs to use symbols in the main binary it is linked against. For example, a plugin opened with dlopen might need to find symbols in the main binary. (edit: I just read this again and realised it's basically a roundabout way of saying your first reason up there...)

I have a test that shows the broken behaviour on untouched master and I am just confirming now that this changeset does in fact solve the problem.

In the longer term ideally there'd be a function attribute for this purpose that can define the linkage of the function, such as #[linkage(weak)], #[linkage(global)], and so on.

I'll bump the next commit through once I'm happy with the test.

@thestinger
Copy link
Contributor

@pcmattman: Are you compiling the Rust code with --lib? If you're linking in extra code, you need to be, even if it defines main.

@@ -376,6 +376,9 @@ pub fn register_rust_fn_with_foreign_abi(ccx: @mut CrateContext,
node_id,
lib::llvm::CCallConv,
llfn_ty);
unsafe {
llvm::LLVMSetLinkage(llfn, lib::llvm::ExternalLinkage as u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this, and let's actually move this to base::get_item_val and you don't have to use the raw LLVM function (there's a helper lib::SetLinkage I think)

We currently have linkage sprawling throughout, and good to try to keep it in one place!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - much nicer in base::get_item_val. Using the helper now as well.

PR updated and squashed into one commit as well.

@alexcrichton
Copy link
Member

After thinking about this, I'm not convinced that this is the correct approach to take. I think that you should be able to declare internal extern "C" functions, which this does not allow. On the other hand, I think that our visibility system is an excellent method of exporting functions from a library (or keeping them hidden).

I think that the real bug in this situation is that when you're compiling an executable all functions are flagged as internal except for main. I think that in this case normally exported extern functions are incorrectly being flagged as internal.

@thestinger
Copy link
Contributor

@alexcrichton: clang has the same behaviour when compiling with -flto, extern is about the ABI not anything to do with visibility/linkage

@brson
Copy link
Contributor

brson commented Oct 21, 2013

Agree we should not be setting linkage based on the extern keyword.

@miselin
Copy link
Contributor Author

miselin commented Oct 21, 2013

Sure, no worries, I'm happy to consider alternative approaches. In the meantime I can solve the problem in my own project with objcopy --globalize-symbol, so there's no rush :-).

With respect to the visibility system, with this change applied linkage is overridden to internal (implicitly, eek!) if the function is not visible. That is, if a foreign ABI function is not visible to the "parent" of the crate root, it will be emitted with internal linkage.

An example (bin crate):

main.rs
pub mod foreign1;
pub mod foreign2;
mod foreign3;

foreign1.rs
extern "C" foreignabi1() -> int {
     5
}

foreign2.rs
pub extern "C" foreignabi2() -> int {
     5
}

foreign3.rs
pub extern "C" foreignabi3() -> int {
     5
}

Only foreign2::foreignabi2 will be linked with external linkage. foreign1::foreignabi1 will be linked with internal linkage as it is not pub, even though it is extern "C". foreign3::foreignabi3 will be linked with internal linkage as it is not visible to main's parent, even though the function is defined pub extern "C".

However, as mentioned this internal linkage override is implicit - this patch does not actually explicitly check function visibility before setting external linkage. It would probably be best to make that behaviour explicit for foreign ABI functions, if it ends up being desirable.

@alexcrichton
Copy link
Member

For now I think that our visibility system is sufficient enough to encode what should be exported and what should be kept internal. I thought that the code example you had compiles today, but if it doesn't then that's definitely a bug!

@miselin
Copy link
Contributor Author

miselin commented Oct 22, 2013

Apologies, my previous comment was not incredibly clear. To clarify what I was trying to describe slightly (also noting that some of the syntax in my previous comment is not correct - more psuedocode-y), without this pull request applied, nm run on the output of rustc -o main main.rs gives (on OSX):

0000000100000c00 t _foreignabi1
0000000100000c80 t _foreignabi2
0000000100000d00 t _foreignabi3
0000000100000dc0 T _main

The #[no_mangle] attribute was applied to this example before building. Note the lowercase t indicating private.

Now, with this PR applied, nm outputs:

0000000100000c00 t _foreignabi1
0000000100000c80 T _foreignabi2
0000000100000d00 t _foreignabi3
0000000100000dc0 T _main

So everything still compiles with or without - no bug there :-) - just an issue with symbol linkage.

@thestinger
Copy link
Contributor

The extern marker isn't associated with anything special in terms of linkage. If you want the functions to not be marked private, you can use rustc --lib -c and then link together an executable with the other parts of the crate. Rust is either building a library crate (many exposed symbols) or a executable crate (just an entry point exposed) - all of the options are still available through that.

If it's private when building a library but the path is fully public, that's a bug we should fix (without making it exposed from an executable).

@miselin
Copy link
Contributor Author

miselin commented Oct 22, 2013

Good point. No, this is not with building a library, so all behaviour is as-expected and there is not a bug.

I presume foreign ABI functions in a bin crate would be expected to be used as parameters to functions in C libraries as the like, rather than called by plugins or other external entities?

@thestinger
Copy link
Contributor

@pcmattman: Yeah, it's just about the ABI. It's confusing that we use the same keyword for declaring blocks of external functions.

We desperately need to offer other forms of building crates (static linking, whole program optimization), but I think exposing a symbol from an executable is always going to be an edge case where you probably end up using external compiler/linker calls anyway to link in whatever is going to call that function.

@eddyb
Copy link
Member

eddyb commented Oct 22, 2013

There is a case where building a binary requires externalizing a symbol, without compiling with -c and linking the object file against some other object file importing Rust functions.

That is, if you don't link against a library providing the mem* functions (memcpy, memmove, memset, etc.), LLVM can still generate calls to those functions.
You can implement them in Rust, but if the symbol is not external, it will either get removed or it won't be recognized by LLVM as the target of those calls.

This is also the only case that can't be solved by moving everything to Rust (to remove the need for exporting Rust symbols to C or assembly).

@thestinger
Copy link
Contributor

I don't really understand why it would have to be external for that case. Why wouldn't the calls to memcpy be able to find an internal definition of it? As a workaround, they could be defined in a crate once we have static linking support.

The current flagging of all functions as internal in a binary could be replaced by reachability-based visibility if it took into account the differences between executables and libraries. In a library, it would have to treat any calls from #[inline] functions as exposing those functions but not in an executable.

@miselin
Copy link
Contributor Author

miselin commented Oct 22, 2013

I also don't understand - I'm a bit confused now. It shouldn't be necessary to externalise a symbol such as memcpy that is used only within the crate as the linker should still be able to use internal symbols at that stage. The root issue that led me to create this PR was making foreign ABI functions available externally in a bin crate - symbol references inside a crate shouldn't be a problem.

Is there a code example that shows a binary failing to build because another symbol inside the crate does not have external linkage?

@miselin
Copy link
Contributor Author

miselin commented Oct 22, 2013

Also, FWIW objcopy --globalize-symbol doesn't work to make a symbol external after compiling an object file if it's been removed as 'dead, uncalled, unreferenced' code, because the symbol doesn't exist.

  • External linkage is enough to label the code as 'possibly called/referenced', but only makes sense on functions where it is the desired outcome for them to be exported and visible outside the bin crate.
  • clang/gcc offer __attribute__((used)) for this purpose, which does not affect linkage. This emits a hidden reference (with no side-effects) to the symbol, allowing a function/variable to be maintained and emitted even if it doesn't seem to be referenced, without affecting its visibility.

Ahh, the wonderful rabbit hole of linkages, visibility, and reachability. :-)

@eddyb
Copy link
Member

eddyb commented Oct 22, 2013

Relevant function from clang's CodeGen.
Would something like #[used], with an equivalent implementation, be accepted in rustc?
(I guess we should ask questions first and shoot later, unlike my last PR)

@thestinger
Copy link
Contributor

I don't think #[used] is the right solution here either. Fully implemented reachability-based visibility with knowledge of the difference between executables and libraries for dealing with #[inline] seems best.

@eddyb
Copy link
Member

eddyb commented Oct 22, 2013

mem* wouldn't be reachable without a hack, like what clang does for __attribute__((used)), because they would be used solely by intrinsic codegen, which comes after the function can get removed.
A similar thing happens when calling a function from inline assembly (asm!("call foobar"); on x86), which is the main use case mentioned in the GCC documentation for __attribute__((used)).

EDIT: Also see this test from my last PR, it should also work with #[used] instead of #[linkage(external)].

@thestinger
Copy link
Contributor

I'm suggesting making fully public paths in an executable external again.

@alexcrichton
Copy link
Member

I thought that this was going to make fully-public extern items external in an executable build? You can't technically interface with a bare fn that's not extern because the ABI isn't necessarily stable (I think?)

@thestinger
Copy link
Contributor

It's making non-fully-private items external too right now. Until the reachability check actually works and takes into account that #[inline] in an executable isn't exposed, it takes away the ability to keep symbols internal.

I really don't understand why visibility would be tied to something exposing a C ABI rather than a Rust ABI. All normal crates are exposing Rust ABI functions.

There's nothing stopping interfacing with a Rust ABI function, we just don't guarantee that the ABI remains the same between Rust versions. A library exposing a stable ABI has to go out of the way to make sure none of the types is the signature is ever changed, such as never adding a private field to one.

@miselin
Copy link
Contributor Author

miselin commented Oct 22, 2013

@thestinger what do you mean by non-fully-private in this context?

@thestinger
Copy link
Contributor

As in this patch results in functions with paths that are not fully public being exposed as external symbols. It removes the ability to expose only the entry point.

@miselin
Copy link
Contributor Author

miselin commented Oct 23, 2013

@thestinger is this in reference to the example in my past comment (at #9945 (comment)), or are there additional cases I haven't considered where functions that are not fully public are exposed by this patch?

@thestinger
Copy link
Contributor

mod foo {
    pub extern "C" fn bar() {}
}

pub extern "C" fn baz() {
    foo::bar()
}

fn main() {}

@miselin
Copy link
Contributor Author

miselin commented Oct 23, 2013

Yup, you're right - after applying #[no_mangle] to make reading the output from nm easier:

0000000100000cd0 T _bar
0000000100000d50 T _baz
0000000100000e10 T _main

bar now has external linkage and it shouldn't.

An example of possible desired output would be:

0000000100000cd0 t _bar
0000000100000d50 T _baz
0000000100000e10 T _main

(as bar is not fully public)

@miselin
Copy link
Contributor Author

miselin commented Oct 23, 2013

Okay, I broke the PR with my last push (... figuring that out now... - update: fixed!), but I now have the following output from nm:

0000000000400840 t bar
00000000004008c0 T baz
0000000000400980 T main

(different addresses as this was done on a different host)

@miselin
Copy link
Contributor Author

miselin commented Oct 24, 2013

The latest commit adds a compile-fail test and updates the run-pass test to handle the addition of bare functions and statics to the set of things that get external linkage when they are publicly visible.

@miselin
Copy link
Contributor Author

miselin commented Oct 24, 2013

Fixed broken test - looks like my testing of the tests used a stale file? Full testsuite (make check) is confirmed to pass locally now.

@miselin
Copy link
Contributor Author

miselin commented Oct 25, 2013

Tests failed as OSX doesn't successfully link libraries unless it can resolve all symbols at library creation time (or it has been passed -undefined dynamic_lookup or -U <symbolname>).

The point of failure is in building the auxiliary library rather than actually executing the test proper.

@miselin
Copy link
Contributor Author

miselin commented Oct 25, 2013

I have removed the compile-fail test as it seems to only be reliable on Linux (on OSX it breaks in building the auxiliary library rather than in trying to link against it).

It could be a run-fail test, doing a proper fail!() in a signal handler for the SIGTRAP the dynamic linker gives? It does this on OSX, not sure what ld.so does on Linux when this happens.

The run-pass test remains, and will fail if linkage breaks (in the case of OSX, it won't fail if linkage breaks until #10062 is resolved).

@miselin
Copy link
Contributor Author

miselin commented Oct 28, 2013

Most recent failure due to an undefined reference to __rust_crate_map_toplevel12743 when building stdtest on OSX.

@alexcrichton you mentioned yesterday in IRC this happens when crate_map() is called twice - is that still correct and relevant here, or is the undefined reference an indicator of an issue with this PR applied?

@alexcrichton
Copy link
Member

Once by changes land for extracting libuv to another crate, this should be ready for another go (although it will probably need a rebase).

@miselin
Copy link
Contributor Author

miselin commented Oct 30, 2013

This is now freshly rebased to latest master.

@alexcrichton
Copy link
Member

OK after thinking about this, I don't think that this is the right way to do things, and I don't think that the right way to do things will involve this sort of management regardless. I don't think that this should be merged now because I believe that this is going in the opposite direction from where it should.

The results of privacy should not be at all used inside of codegen. The privacy of a function is a difficult idea and with this change you would be forced to decide whether privacy or reachability is your visibility constraint. I still believe that we should have a unified "set the linkage of this LLVM value" function, but that may be very far off. For now, I believe that all of these changes should be moved inside of the reachability phase of the compiler instead of the codegen phase. Additionally, I think the check to see if the compiler is building a library and if so mark everything as internal should be moved to reachability as well. The best way to begin unifying how linkage visibility is determined is to begin using one source of information for visibility, and that source should be reachability.

So to merge this, I would rather see code removed from trans and moved to reachability (with appropriate checks for building a library or not) and then trans should "do the right thing".

Additionally, I'd like to see some more comprehensive tests no matter what comes out of this. I don't think that the way you've set up the tests right now is the best way to do it. It involves weird linker hacks on OSX, won't work at all on windows, and only works by luck on linux. Instead, I would be much more comfortable using the unstable::dynamic_lib library for determining whether a symbol was exposed or not. This should return null for all internalized symbols and non-null for any symbols which made it through as external.

Does that sound OK with you? I can open an issue for this as well if you don't have time to get around to this. I believe though that this is more future-proof along with moving the code in the compiler to where it should be anyway.

@miselin
Copy link
Contributor Author

miselin commented Oct 30, 2013

Sounds OK to me - now's the time to get it right, rather than try and fix it later.

I should be able to get around to this, assuming there's no significant urgency (~1-2 weeks worst-case, maybe?). Happy to release to somebody else though, if time is of the essence.

@alexcrichton
Copy link
Member

You can certainly claim it! We're certainly in no rush for this, and if you need any help feel free to ping me!

@bors bors merged commit e42e378 into rust-lang:master Nov 2, 2013
@alexcrichton
Copy link
Member

Hm, so github thinks that this pull request has an infinite number of commits. It sounds like this may need to get reopened. (not sure why it's closed in the first place, but something looks awry).

@miselin
Copy link
Contributor Author

miselin commented Nov 2, 2013

Oooer that does not look right. I must've accidentally pushed origin/master into the remote. I'll have a look at figuring out why this is broken when I pick it back up again.

Should never have done the work on master in the first place :-)

flip1995 pushed a commit to flip1995/rust that referenced this pull request Dec 1, 2022
…ogiq

Re-enable `uninlined_format_args` on multiline `format!`

fix rust-lang/rust-clippy#9719

There was an issue with the code suggestion which can be sometimes completely broken (fortunately when applied it's valid), so we do not show it.

changelog: [`uninlined_format_args`] re-enable for multiline format expression, but do not show the literal code suggestion in those cases
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants