Dear Einstein developers,
as the new NVIDIA Pascal generation was being reviewed, I stumbled across the fact that ever since Maxwell all of their GPUs support Hyper-Q, a feature that in the Kepler generation was limited to the highest-end GPUs. By now the feature should therefore be broadly available in the DC community. Could it be of use here?
I imagine the following: the CUDA app detects whether the GPU supports Hyper-Q (easy to check via the compute capability). If not, nothing changes. If so, Hyper-Q is used to submit the same Einstein jobs as before into one of the 32 hardware work queues the feature provides.
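A minimal sketch of what such a detection-and-dispatch path could look like (this is purely hypothetical illustration, not the Einstein@Home code; the kernel and buffer sizes are made up):

```cuda
// Hypothetical sketch: query the compute capability and, if Hyper-Q is
// available (CC >= 3.5), launch work in separate CUDA streams so the
// hardware can schedule the kernels into distinct work queues.
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for a real analysis kernel.
__global__ void dummyKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    // Hyper-Q arrived with compute capability 3.5 (GK110);
    // Maxwell (5.x) and later support it across the whole line-up.
    bool hasHyperQ = (prop.major > 3) || (prop.major == 3 && prop.minor >= 5);
    printf("Compute capability %d.%d, Hyper-Q: %s\n",
           prop.major, prop.minor, hasHyperQ ? "yes" : "no");

    const int n = 1 << 20;
    float *buf;
    cudaMalloc(&buf, n * sizeof(float));

    if (hasHyperQ) {
        // Each job gets its own stream; with Hyper-Q each stream can map
        // to its own hardware queue instead of being funneled into one.
        cudaStream_t s[2];
        for (int k = 0; k < 2; ++k) cudaStreamCreate(&s[k]);
        for (int k = 0; k < 2; ++k)
            dummyKernel<<<(n + 255) / 256, 256, 0, s[k]>>>(buf, n);
        cudaDeviceSynchronize();
        for (int k = 0; k < 2; ++k) cudaStreamDestroy(s[k]);
    } else {
        // Fallback: default stream, behaviour unchanged.
        dummyKernel<<<(n + 255) / 256, 256>>>(buf, n);
        cudaDeviceSynchronize();
    }
    cudaFree(buf);
    return 0;
}
```

One caveat: BOINC WUs run as separate processes, so sharing Hyper-Q queues across WUs would likely also involve NVIDIA's Multi-Process Service rather than streams alone.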
What's the benefit? Better load balancing and ultimately better resource utilization of the GPUs. Should a user decide to run more than one concurrent Einstein WU, those WUs wouldn't task-switch in round-robin fashion but rather truly run simultaneously, albeit on dynamically assigned different SMs. I expect this makes a higher peak GPU utilization possible while needing fewer concurrent WUs to achieve it. And if other projects adopt the scheme, one might be able to mix different projects more effectively. E.g. Einstein or SETI gobble up most of the GPU memory bandwidth they can get, whereas a project like POEM (which hardly needs any bandwidth by GPU standards) could utilize the shaders that the first project can't use due to its bandwidth limitation.
What do the programming experts say? Am I missing something here or could this work as easily as I imagine?
MrS
Scanning for our furry friends since Jan 2002
Not going to lie, if such a feature were to improve GPU utilization I'd be quite happy about it... Doubly so if I could run Einstein@Home as well as POEM@Home simultaneously on a single GPU in the same rig, and not get any errors because of it!
I'm just a dreamer that's good at breaking things, and occasionally fixing them though ^_^;;;
Is the option just to fill the graphics queue with compute work? Some of my GPUs don't even have a monitor attached or drive any graphics, so I see no benefit there. Or was that just one example from Anand? Some projects simply utilize a GPU more efficiently/completely than others due to the type of work or the coding. FAH and Collatz peg GPU utilization, while some other projects need 2-3 concurrent units just to reach 90%.
I can run multiple E@H WUs at once. If there are errors running different projects at once I would guess it's due to clock frequency or different code sets causing conflicts rather than work actually interfering in the compute cores.
It wouldn't matter whether you have a monitor connected or not; this is about using several compute queues on the GPU. Currently, as far as I know, only a single queue is used, and if you run WUs "concurrently" they actually send their work into that same queue in alternation, so that at any point in time only work from one WU is being run.
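The serialization described above can be seen in a toy experiment (again a made-up sketch, not project code): two kernels launched into the default stream execute strictly in order, while the same kernels launched into distinct streams are allowed to overlap on hardware with concurrent-kernel support, which Hyper-Q extends to 32 independent hardware queues.

```cuda
// Toy illustration of single-queue serialization vs. multi-stream overlap.
#include <cstdio>
#include <cuda_runtime.h>

// Busy-wait kernel, long enough that overlap is observable with a profiler.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    const long long cycles = 100000000;  // arbitrary spin duration

    // Case 1: both launches go into the default stream -- they serialize,
    // just like two WUs alternating through one compute queue.
    spin<<<1, 1>>>(cycles);
    spin<<<1, 1>>>(cycles);
    cudaDeviceSynchronize();

    // Case 2: separate streams -- on GPUs with concurrent kernel
    // execution (and plenty of queues thanks to Hyper-Q) these can run
    // simultaneously on different SMs, roughly halving wall-clock time.
    cudaStream_t a, b;
    cudaStreamCreate(&a);
    cudaStreamCreate(&b);
    spin<<<1, 1, 0, a>>>(cycles);
    spin<<<1, 1, 0, b>>>(cycles);
    cudaDeviceSynchronize();
    cudaStreamDestroy(a);
    cudaStreamDestroy(b);

    printf("done\n");
    return 0;
}
```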
It's clear that different projects cause different amounts of GPU utilization. And where the errors upon mixing projects come from I don't know - very probably somewhere in the software stack. If Hyper-Q worked as I expect, you might be able to achieve e.g. 98% GPU utilization by running just 2 Einsteins concurrently, instead of 90% from running 4 WUs. Don't take those numbers for granted, though - I only want to argue that it might get better.
MrS
Scanning for our furry friends since Jan 2002