vkEnumeratePhysicalDevices should return the primary GPU first #153

Closed
ShabbyX opened this issue Feb 27, 2019 · 28 comments

@ShabbyX
Contributor

ShabbyX commented Feb 27, 2019

In systems with multiple GPUs present, it's often the case that the Vulkan application picks the first physical device as reported by vkEnumeratePhysicalDevices (more often than not because multi-GPU systems are rare, and the choice is not necessarily obvious).

On my PC, I just enabled IGD MultiMonitor in the BIOS, while keeping PEG as the primary graphics output. That means my primary output is my Nvidia card (and that's what my actual desktop uses), but I can still use the Intel integrated GPU if I want.

However, on Linux (not tested elsewhere) vkEnumeratePhysicalDevices orders the physical devices with the integrated GPU first, and the Nvidia card second. While it's difficult to assign any order to the system GPUs in general, I believe the primary GPU used by the system should appear first in the list of physical devices.

@lenny-lunarg
Contributor

You're not the first person to suggest this, but it's difficult to implement on all platforms. We definitely have some plans to order physical devices on Windows, but we don't have plans for this on Linux.

The main question on Linux is how do you determine the order? On newer Windows systems, there's a query that will give you sorted DXGI adapters, and then you can use a Vulkan query to associate the adapters with the Vulkan physical devices. But on Linux, we need not only a way to get the preferred GPU from the OS, but also a way to correlate that handle with the Vulkan physical device. I'm not aware of any such queries.

Ultimately, unless you can point me to some way of determining the proper order, my response is going to be "sorry, but we don't have the ability to do that". If you do know a way to fetch a good order, then I would definitely consider doing this at the same time we start sorting physical devices on Windows.

@lenny-lunarg lenny-lunarg self-assigned this Feb 27, 2019
@ShabbyX
Contributor Author

ShabbyX commented Feb 27, 2019

Thanks for the explanation. I'll see if I can figure something out.

I don't understand this part though: a way to correlate that handle with the Vulkan physical device. Isn't the Vulkan physical device made from such handles? Or is Vulkan trying out devices based on .so files without asking the OS for a list of them?

@lenny-lunarg
Contributor

I don't understand this part though: a way to correlate that handle with the Vulkan physical device.

The key here is that the loader and the driver are two separate components that can only communicate in specific ways. The loader usually dispatches Vulkan calls to the drivers, and in the case of vkEnumeratePhysicalDevices (and other calls) it's responsible for aggregating the results from multiple drivers. But it almost never implements an API call by itself. As a result, the only handle that the loader has for a physical device is whatever the driver gives it. In practice I believe that this is usually a pointer to a driver structure (and that structure probably does contain the handles), but there are no guarantees that this is the case, and every driver does it a little differently. The only way for the loader to figure out what that handle actually is, is to use the same Vulkan API queries that an application can use.

In the Windows case, the loader can call vkGetPhysicalDeviceProperties2 with a VkPhysicalDeviceIDProperties struct in the pNext chain so that it can figure out the adapter for each physical device. I suspect that a different field in the same struct would let us correlate a VkPhysicalDevice to the system handles, but I'd need to know what we get from a query before I can guarantee this.

Since I anticipate that getting the order will be the bigger problem, I'd worry about that first. If we really needed to, we could probably also write a new version of the loader/driver interface so that we can get the underlying platform handles. I'd prefer to avoid that, since it makes this a much bigger change: we'd need to get agreement with the various vendors, and then they'd need to implement a corresponding change.

@ShabbyX
Contributor Author

ShabbyX commented Feb 28, 2019

Interesting. I was thinking along the same lines that vkEnumeratePhysicalDevices could use vkGetPhysicalDeviceProperties2 to get the vendor/product id of the GPUs and use that to match what the operating system reports as the primary GPU.

And regarding just that, I haven't yet found a succinct solution, but as glxinfo can tell which GPU is primary, digging deep there could reveal a solution. If that involves talking to the window system, would that be a blocker?

@KarenGhavam-lunarG KarenGhavam-lunarG added the enhancement New feature or request label Mar 28, 2019
@misyltoad
Contributor

misyltoad commented Jul 24, 2019

On Windows:

The order doesn't match that returned by DXGI nor what the user would want/expect as their default GPU.

I have two GPUs, a Vega 56 and an RTX 2060 Super.

Despite the Vega 56 being in the first PCIe slot, being set as my High Performance and Default GPU in Windows, and being the first one enumerated by DXGI->EnumAdapters... In Vulkan, the first adapter enumerated is always the 2060 Super, no matter what I change.

@jjulianoatnv
Contributor

There is an existing issue in the Khronos-private gitlab issue tracker, issue 1414, that addresses the Vulkan loader sorting the Windows' list of vkPhysicalDevices such that the OS's "preferred display adapter" is first in the list of vkPhysicalDevices. This issue aspires to make the Vulkan loader sort the preferred vkPhysicalDevice first, similar to how DXGI sorts the preferred adapter first. (vkPhysicalDevice is analogous to DX adapter)

While waiting for the loader to behave in this manner, an application could do the following to find the vkPhysicalDevice that matches the OS' preferred GPU:

This is how the aforementioned issue proposes that the Vulkan loader discover which vkPhysicalDevice to sort first in the list of vkPhysicalDevices. Until this has been implemented, sorting is done by a different technique that predates support within the Windows OS for selection of preferred adapter, and that is why you are seeing a discrepancy between DXGI and Vulkan on the latest version of Windows 10.

That's for Windows. For Linux there is no equivalent to EnumAdapterByGpuPreference driven by an OS-owned way to select the preferred GPU. Something else will have to be done for Linux, TBD.
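The matching step described in this comment can be sketched in plain C. This is only an illustration under stated assumptions, not the loader's actual code: `DeviceRecord` and `move_preferred_first` are hypothetical names, and the record stands in for data that real code would copy from `VkPhysicalDeviceIDProperties` (`deviceLUID`, `deviceLUIDValid`). The preferred LUID would come from the DXGI side (the AdapterLUID of the adapter that EnumAdapterByGpuPreference reports first).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LUID_SIZE 8 /* same size as VK_LUID_SIZE in the Vulkan headers */

/* Hypothetical stand-in for per-device data. In real code, `luid` and
 * `luid_valid` would be copied from VkPhysicalDeviceIDProperties. */
typedef struct {
    const char *name;
    unsigned char luid[LUID_SIZE];
    bool luid_valid;
} DeviceRecord;

/* Moves the device whose LUID equals `preferred` to index 0 and keeps the
 * relative order of all other devices. Returns true if a match was found. */
static bool move_preferred_first(DeviceRecord *devs, size_t count,
                                 const unsigned char preferred[LUID_SIZE])
{
    for (size_t i = 0; i < count; ++i) {
        if (devs[i].luid_valid &&
            memcmp(devs[i].luid, preferred, LUID_SIZE) == 0) {
            DeviceRecord match = devs[i];
            /* Shift devs[0..i-1] one slot to the right. */
            memmove(&devs[1], &devs[0], i * sizeof devs[0]);
            devs[0] = match;
            return true;
        }
    }
    return false; /* no match: leave the driver-reported order untouched */
}
```

Devices that do not report a valid LUID are never matched, so they keep whatever position the drivers gave them.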

@jjulianoatnv
Contributor

@Joshua-Ashton, your 2060 Super is sorted first based on GPU preference in the NVIDIA control panel. As mentioned in my previous comment, the sorting technique being used on your computer predates the newer OS-owned GPU selection mechanism. Sorting will switch to the new technique after an improved Vulkan loader has been published, where the Vulkan loader takes responsibility for sorting based on the OS' preference. The OS changes shipped first, so you are now exposed to the discrepancy.

If your driver still has the "optimus" GPU selection UI in the NVIDIA control panel, you can influence sorting by changing that setting, like is the case on older OS versions. (I'm unclear which driver versions combined with which OS versions do or do not have the legacy GPU sorting thing in the control panel.)

@ShabbyX
Contributor Author

ShabbyX commented Jul 26, 2019

On Linux, when you start an OpenGL application, how does it know which device to use? Couldn't Vulkan use the same mechanism to put that device first? There could be an answer in mesa?

@misyltoad
Contributor

misyltoad commented Jul 26, 2019

While waiting for the loader to behave in this manner, an application could do the following to find the vkPhysicalDevice that matches the OS' preferred GPU: [...] Until this has been implemented

If you do end up using this as the solution, please use LoadLibrary and ensure you load the system DXGI to do this, otherwise you will break DXVK!

https://github.com/doitsujin/dxvk

@Plagman

Plagman commented Jul 26, 2019

@lenny-lunarg comment above is valid, has work already started down that path?

@jjulianoatnv
Contributor

I see your concern about use of DXGI. We have to use DXGI to find the preferred adapter according to the new OS support, because AFAIK that's the only way the OS exposes this data.

Not talking about DXGI swapchain. Talking about a DXGI query to discover AdapterLUID of the preferred adapter.

@misyltoad
Contributor

misyltoad commented Jul 26, 2019

Yes but we replace the entire dxgi.dll so you cannot link to dxgi.lib directly. Use LoadLibrary to get the system DXGI (C:\Windows\System32\dxgi.dll [remember to escape the \'s]) and then GetProcAddress CreateDXGIFactory1 from that.

Otherwise whenever the vk loader gets called we're going to go vk->dxgi->vk->dxgi->... until a crash happens.

@lenny-lunarg
Contributor

I will mention that I was already planning on using LoadLibrary to get dxgi and getting the entry points through GetProcAddress. This is being done to provide support for older systems where the new functions may not be present. It sounds from @Joshua-Ashton's comment that this should work fine as long as we follow that pattern. Is that correct?

@misyltoad
Contributor

Yes, it should do. Thanks for the clarification. 👍

@misyltoad
Contributor

... as long as you don't just query dxgi.dll and instead query C:\Windows\System32\dxgi.dll

@lenny-lunarg
Contributor

Alright, I have been simply querying dxgi.dll, but I don't see any reason why I can't change it to use the full path. I'll make sure I do that.

@misyltoad
Contributor

Cheers

@jjulianoatnv
Contributor

jjulianoatnv commented Jul 26, 2019

The Vulkan loader is middleware that is loaded into a process .exe that is controlled by someone else. For security, middleware should pass absolute pathnames to LoadLibrary. Otherwise it is subject to a DLL injection vulnerability. This is another reason to use an absolute pathname. We already fixed a related issue for the Linux build of the Vulkan loader. We don't want to re-add on Windows a type of vulnerability that has been fixed previously.

You need to query the OS for the path to system32. Don't hard-code a string literal because the path is configurable by the user.

@misyltoad
Contributor

Yeah, you can use GetSystemDirectory for this.
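A rough sketch of the pattern discussed in the last few comments: ask the OS for the system directory, build an absolute path, and pass that to LoadLibrary. The helper names `build_system_dll_path` and `load_system_dxgi` are made up for this example and are not taken from the loader source. The Win32 part is guarded by `_WIN32`, so the path-joining helper can be compiled and checked anywhere.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Writes "<sys_dir>\<dll_name>" into `out`. Returns 0 on success, or -1 if
 * the result would not fit in `out_size` bytes. */
static int build_system_dll_path(const char *sys_dir, const char *dll_name,
                                 char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s\\%s", sys_dir, dll_name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

#ifdef _WIN32
/* Loads dxgi.dll from the system directory reported by the OS instead of
 * relying on the default DLL search order. Returns NULL on failure. */
static HMODULE load_system_dxgi(void)
{
    char sys_dir[MAX_PATH];
    char path[MAX_PATH];
    UINT len = GetSystemDirectoryA(sys_dir, MAX_PATH);
    if (len == 0 || len >= MAX_PATH)
        return NULL; /* call failed or buffer too small */
    if (build_system_dll_path(sys_dir, "dxgi.dll", path, sizeof path) != 0)
        return NULL;
    return LoadLibraryA(path);
}
#endif
```

With the module handle in hand, the entry point mentioned earlier in the thread (CreateDXGIFactory1) would then be looked up with GetProcAddress, which also lets the caller fall back gracefully on systems where a newer function is missing.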

@jicama

jicama commented Aug 13, 2019

For Optimus systems, I'd think choosing a Discrete GPU over the Integrated GPU would do the trick.
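On the application side, this heuristic could look roughly like the sketch below. `DeviceType` is a stand-in enum whose values mirror `VkPhysicalDeviceType`, and `pick_discrete_first` is a hypothetical helper. In a real application the types would come from `VkPhysicalDeviceProperties::deviceType` for each enumerated device.

```c
#include <stddef.h>

/* Stand-in for VkPhysicalDeviceType; the values mirror the Vulkan enum. */
typedef enum {
    DEVICE_TYPE_OTHER = 0,
    DEVICE_TYPE_INTEGRATED_GPU = 1,
    DEVICE_TYPE_DISCRETE_GPU = 2,
    DEVICE_TYPE_VIRTUAL_GPU = 3,
    DEVICE_TYPE_CPU = 4
} DeviceType;

/* Returns the index of the first discrete GPU in enumeration order, or 0
 * (the first enumerated device) if no discrete GPU is present. */
static size_t pick_discrete_first(const DeviceType *types, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (types[i] == DEVICE_TYPE_DISCRETE_GPU)
            return i;
    }
    return 0;
}
```

Note that this only separates integrated from discrete devices. It does not help choose between several discrete GPUs, and it may not match what the user configured at the OS level.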

@kwizart

kwizart commented Dec 2, 2019

I think libdrm should be the tool to do that. But I'm not sure libdrm knows how to deal with Optimus. libglvnd might be more appropriate for this...

@kbrenneman

To answer the earlier question about OpenGL on Linux, that's what libglvnd does. For GLX, libglvnd asks the X server for the name of the driver to load for each X screen, and then the driver can figure out internally which device it should use. For EGL, libglvnd just goes through each driver until it finds one that can hand back a valid EGLDisplay handle.

In both cases, libglvnd and the driver know what window system they're talking to.

For Vulkan, vkEnumeratePhysicalDevices is independent of any windowing system. You might have no window system at all. Or you might have multiple window systems running at the same time, each with a different notion of a "primary" device.

@kbrenneman

Also, I've been working on adding something to libglvnd to make it use an alternate driver and device for particular applications (based on environment variables and/or a config file) specifically for Optimus-style systems.

Ideally, it should be possible for Vulkan to follow the same configuration so that a user can just configure an application without having to care which API the app uses. I'm not sure how best to make that happen, though.

Maybe an extension to let an application ask for a preferred VkPhysicalDevice handle for a given display connection? And then sorting or filtering devices to deal with older applications?

@lenny-lunarg lenny-lunarg added this to the P2 milestone Apr 7, 2020
@sylveon

sylveon commented Aug 4, 2020

This is super edge-case but I've been encountering this issue too.

My configuration is the following: 2x Vega FE (in Crossfire) and a separate Vega 64. The display output (I only have 1 monitor) is connected to the first Vega FE and no other card has any display connected to it.

The Vega 64 is sorted as GPU 0 in Vulkan, and because of X570 chipset restriction, can only have a PCIe x4 link to the system.

This means that Vulkan games defaulted to the Vega 64, and took a significant performance hit due to the required copies to the main GPU and generally smaller bandwidth bus.


With some games like Doom Eternal I was able to work around the issue by adding +r_physicalDeviceIndex 1 to the launch arguments, but other games did not have this option or their built-in GPU selector was not working (for example No Man's Sky always used the Vega 64 despite me selecting a Vega FE in the in-game settings).

Games that used DirectX all correctly defaulted to my primary Vega FE.

It's obviously not a perfect solution, but if a system has a single monitor, putting the GPU which is responsible for output on that monitor first would fix most people's issues.

@ShabbyX
Contributor Author

ShabbyX commented Aug 5, 2020

For Vulkan, vkEnumeratePhysicalDevices is independent of any windowing system. You might have no window system at all. Or you might have multiple window systems running at the same time, each with a different notion of a "primary" device.

I see your point. I still think there's value if the Vulkan loader tried things out, like "pick any running window system, ask for primary device". That should cover the majority of Linux users.

Alternatively, instead of asking the X server what device it prefers, why not take the code that X uses to determine the primary device and do the same in the Vulkan loader?

@kbrenneman

Guessing a default for a windowing system isn't necessarily a bad idea. Looking at the $DISPLAY environment variable if it's set and at the default Wayland socket if it exists would probably cover the vast majority of graphical applications. In both cases, you'd need to define a protocol to query the display servers.
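The guessing step alone (without the display-server query, which would still need a protocol) might look like the sketch below. Everything here is an assumption made for illustration: the function and enum names are hypothetical, "wayland-0" is used as the default socket name, the socket is looked up under `XDG_RUNTIME_DIR`, and Wayland is checked before X11. The values are passed in as parameters so the logic can be exercised without touching the process environment.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

typedef enum { WS_NONE, WS_X11, WS_WAYLAND } WindowSystemGuess;

/* Guesses which window system a graphical application would most likely
 * talk to. Any argument may be NULL, meaning "variable not set". */
static WindowSystemGuess guess_window_system(const char *display,
                                             const char *runtime_dir,
                                             const char *wayland_display)
{
    char path[4096];

    if (runtime_dir != NULL && runtime_dir[0] != '\0') {
        const char *sock = (wayland_display != NULL && wayland_display[0] != '\0')
                               ? wayland_display
                               : "wayland-0"; /* assumed default socket name */
        int n = snprintf(path, sizeof path, "%s/%s", runtime_dir, sock);
        /* Only report Wayland if the socket path actually exists. */
        if (n > 0 && (size_t)n < sizeof path && access(path, F_OK) == 0)
            return WS_WAYLAND;
    }
    if (display != NULL && display[0] != '\0')
        return WS_X11;
    return WS_NONE;
}
```

A caller would typically pass `getenv("DISPLAY")`, `getenv("XDG_RUNTIME_DIR")` and `getenv("WAYLAND_DISPLAY")`. A result of `WS_NONE` corresponds to the headless case raised earlier in the thread, where no ordering hint is available.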

That still seems like something that would make more sense in a layer, though. There's enough fuzziness in defining the "correct" answer that putting it in something modular like a layer seems like a good idea.

@lenny-lunarg
Contributor

Sorry for not replying to this issue at the time, but this is now supported (admittedly with some limitations) on Windows. It does require driver support, so only physical devices whose driver can support the new interface will be sorted, but we can't sort all of them without the new API. As for Linux, we just don't have a reliable Linux API to get the sorting, so I can't promise anything now or in the future. We may be able to add it, but I'm making no promises. As such I'm closing the issue.

@jl452

jl452 commented Oct 10, 2020

@lenny-lunarg can you add a filter string for choosing the video card (on Linux), like DXVK does with "DXVK_FILTER_DEVICE_NAME=PITCAIRN"?
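Until something like this exists in the loader, an application can apply a similar filter itself. The sketch below is an assumption-laden illustration, not loader or DXVK code: `select_device_by_name` is a hypothetical helper, and treating the filter as a plain substring match is a choice made for this example. The names would come from `VkPhysicalDeviceProperties::deviceName` for each enumerated device.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first device whose name contains `filter`.
 * With no filter (NULL or empty), returns 0 so the default order is kept.
 * Returns -1 if there are no devices or nothing matches. */
static int select_device_by_name(const char *const *names, int count,
                                 const char *filter)
{
    if (filter == NULL || filter[0] == '\0')
        return count > 0 ? 0 : -1;
    for (int i = 0; i < count; ++i) {
        if (strstr(names[i], filter) != NULL)
            return i;
    }
    return -1;
}
```

The filter string could be read from an application-defined environment variable with `getenv`. Returning -1 on a failed match lets the caller decide whether to fall back to the first device or to report an error.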
