Means to specify number of parallel CUDA tasks per card

Michael Karlinsky
Joined: 22 Jan 05
Posts: 888
Credit: 23502182
RAC: 0
Topic 195846

Hi.

It would be nice to get rid of the need to use app_info.xml to specify how many CUDA tasks are allowed to run on a CUDA capable card.

This could be done either by adding a new project/BOINC preference setting or (semi-)automatically, e.g. by dividing the GPU memory available by the GPU memory needed per task.
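The "memory available divided by memory needed" idea can be sketched in a few lines of Python. This is only an illustration: the function names and the example numbers are made up here, and nothing in it is part of BOINC or of any project application.

```python
def max_parallel_tasks(gpu_mem_available_mb, gpu_mem_per_task_mb):
    """How many tasks fit into GPU memory (hypothetical helper, not a BOINC API)."""
    if gpu_mem_per_task_mb <= 0:
        raise ValueError("memory per task must be positive")
    # Integer division: a task that only partly fits does not count.
    return max(gpu_mem_available_mb // gpu_mem_per_task_mb, 0)


def gpu_fraction_per_task(n_tasks):
    """Fraction of one GPU each task would claim, i.e. 1/n."""
    if n_tasks < 1:
        raise ValueError("the card cannot hold even one task")
    return 1.0 / n_tasks


# Illustrative numbers only, not measured values for any real application.
n = max_parallel_tasks(gpu_mem_available_mb=1024, gpu_mem_per_task_mb=300)
print(n, gpu_fraction_per_task(n))
```

Memory is of course only one constraint; a real implementation would also have to look at GPU utilization and CPU load.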

Kind Regards,

Michael

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 734223262
RAC: 1292360

Hi!

I agree it's a pain setting up app_info.xml files for this purpose.

I have a feeling, though, that this feature would be non-trivial in many respects.

At first I thought that this required, maybe, nothing more than some new applications being configured in the project (say BRP3_CUDA_x2, BRP_CUDA_x3, ...) where all these apps use the same code but the metadata that the project serves to the clients would indicate that the app uses only 1/2, 1/3, ... of a GPU.

But then again, when people do specify 1/n GPU in their app_info.xml files, BOINC seems to always try to instantiate n apps, even if the video memory is too small to support that many instances. I wonder whether this can be considered a bug in the BOINC client code?

Then again, the root cause of wanting to run several GPU tasks in parallel in the first place is the remaining CPU load in the computation that reduces the GPU utilization.

I'm currently looking a bit into speeding up the only remaining CPU calculation of the main loop. So maybe this feature is best considered only after I have finished my experiments, because *maybe* running parallel GPU tasks won't be that much of a speed-up anymore with an improved version.

CU
HB

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2960995982
RAC: 696495

In the BOINC framework, there's also the need to consider the 'GPU-occupancy' when multiple projects are scheduled to have access to the same GPU. It took time to get there, but the later members of the BOINC v6.10.xx range eventually got the ability to time-slice nVidia GPU projects (pre-empting a running task after the 'task switch interval', in the way that is familiar for CPU tasks). I believe BOINC v6.12.xx (currently v6.12.33 recommended) has the same ability for native-mode ATI projects - I don't own an ATI card, so I can't test personally - and the next range of clients, which has just gone into pre-alpha testing as v6.13.0, should eventually have the same ability for OpenCL projects.

But many GPU tasks/projects run for less than the default task switch interval (TSI), and hence can't be pre-empted (or at least, they shouldn't be). What happens if you have two tasks running in half a GPU each? When one finishes, BOINC only has half a GPU free and available to schedule. Unless your other project can also work in a fractional GPU, life gets very difficult - BOINC tends to lock you into the fractional-GPU project for ever. Unless that problem can be resolved, I can't see BOINC having native fractional-GPU support any time soon - the commitment to multi-project support is more fundamental.
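The lock-in described above can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions (one GPU, no pre-emption, a task starts only if its GPU fraction fits into the free share); the `Task` class, the project names and the `startable` helper are invented for illustration and do not reflect how the BOINC client is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    project: str         # hypothetical project name
    gpu_fraction: float  # share of one GPU the task claims


def free_gpu(running):
    """Share of the single GPU not claimed by running tasks."""
    return 1.0 - sum(t.gpu_fraction for t in running)


def startable(running, waiting):
    """Waiting tasks that fit into the free GPU share (no pre-emption)."""
    free = free_gpu(running)
    return [t for t in waiting if t.gpu_fraction <= free + 1e-9]


half = Task("fractional-project", 0.5)
full = Task("whole-gpu-project", 1.0)

running = [half, half]  # two tasks, half a GPU each
running.pop()           # one of them finishes
# Only half a GPU is free, so the whole-GPU project never gets a turn.
print([t.project for t in startable(running, [full, half])])
```

In this model the whole-GPU task becomes startable only when both fractional tasks happen to finish at the same time and nothing fractional is started in between.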

FrankHagen
Joined: 13 Feb 08
Posts: 102
Credit: 272200
RAC: 0

Quote:
At first I thought that this required, maybe, nothing more than some new applications being configured in the project (say BRP3_CUDA_x2, BRP_CUDA_x3, ...) where all these apps use the same code but the metadata that the project serves to the clients would indicate that the app uses only 1/2, 1/3, ... of a GPU.

Probably worth a try - anyone who wants to could opt in for CUDA_x2 or CUDA_xx.

Quote:
Then again, the root cause of wanting to run several GPU tasks in parallel in the first place is the remaining CPU load in the computation that reduces the GPU utilization.

OF COURSE!

It's just a PITA to watch a GPU running at 30% load.

Thomas
Joined: 27 Aug 11
Posts: 7
Credit: 7152760
RAC: 0

I'd vote for this feature as well. I just learned that one can run multiple WUs on a GPU. Now I need to figure out how to do that on my Linux rig (I just switched from Windows).

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250682743
RAC: 35161

What would be possible with reasonable effort is a set of different plan classes that one could select with a dedicated project-specific preference setting. This would basically transfer the responsibility for a valid setting from the project and client to the user. It won't work appropriately on hosts with multiple, possibly different cards, on farms with more different cards than venues, etc. And the setting may have to be changed manually with every change of the application and the workunits, i.e. of the memory requirements.

I'm not sure that it's worth the effort on the project side if these limitations would remain.

BM

cliff
Joined: 15 Feb 12
Posts: 176
Credit: 283452444
RAC: 0

Hi Bernd,
I second that. I have a Win7 rig with 2 GTX GPUs, a GTX460 and a GTX560, with a 6-core AMD CPU.
When I tried to double up on the GPU tasks, other factors kicked in.
My rig crashed within seconds of BOINC picking up on the app_info.xml settings.
I suspect one or both of the following:
[1] A 750 Watt 80+ Gold PSU is not up to the task of handling the extra power required when 2 GPUs are maxed out.
[2] My household electricity supply comes off a 15Watt socket outlet feeding a power strip extension that has not only my computer plugged into it, but also 2 printers and a 2nd monitor.
I also have 6x 1 TB SATA hard drives and 2 small SSDs in the computer, as well as 2 DVD/CD combo burners, so the power drawn in normal use is fairly high. Running a double load on the 2 GPUs was the straw that broke the camel's back.
My mains supply also threw a wobbly at the same time :-( I couldn't just reboot; my rig simply wasn't getting any power.
After a bit of faffing around, a new extension strip and a reboot, minus the app_info.xml entries for double GPU usage, I had no problems.

However, if the option was provided in BOINC to select the number of concurrent GPU tasks, there is a fair risk that anyone else with a nominal PSU/electrical supply would be hit with a similar problem. And since it would have been set in BOINC, there would be no way to back out of the problem. At least with an app_info.xml file a user can reboot in protected mode and edit that file before rebooting and having BOINC start up automatically.

To avoid being locked into a mode of operation that constantly crashes a computer, there would have to be some preset delay before BOINC contacted the client software, and it might have to be a fairly long one. Or the client would have to be started manually.

After all, it wouldn't help the cause of distributed computing if BOINC was seen to send computers into an endless loop of reboots, with no way to cure the problem other than a protected mode removal of the BOINC directory structure :-(

Regards,

Cliff,

Been there, Done that, Still no damm T Shirt.

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2699403
RAC: 0

Quote:
Hi Bernd,
I second that. I have a Win7 rig with 2 GTX GPUs, a GTX460 and a GTX560, with a 6-core AMD CPU.
When I tried to double up on the GPU tasks, other factors kicked in.
My rig crashed within seconds of BOINC picking up on the app_info.xml settings.
I suspect one or both of the following:
[1] A 750 Watt 80+ Gold PSU is not up to the task of handling the extra power required when 2 GPUs are maxed out.
[2] My household electricity supply comes off a 15Watt socket outlet feeding a power strip extension that has not only my computer plugged into it, but also 2 printers and a 2nd monitor.
I also have 6x 1 TB SATA hard drives and 2 small SSDs in the computer, as well as 2 DVD/CD combo burners, so the power drawn in normal use is fairly high. Running a double load on the 2 GPUs was the straw that broke the camel's back.
My mains supply also threw a wobbly at the same time :-( I couldn't just reboot; my rig simply wasn't getting any power.
After a bit of faffing around, a new extension strip and a reboot, minus the app_info.xml entries for double GPU usage, I had no problems.


Asus's Recommended Power Supply Wattage Calculator recommends a minimum PSU of 900 Watts for a system of your specs, i.e. an FX-6100, 2*GTX560, 6*hard drives, 2*DVD/CD-RW combos.

Even Corsair's outdated Power Supply Finder recommends 850 Watts if you put in something similar.

You've basically underspecced your PSU by a decent margin. I suggest you either don't try to run multiple CUDA tasks at once, or remove one of your GPUs until you've got a PSU that will cope.

Claggy

cliff
Joined: 15 Feb 12
Posts: 176
Credit: 283452444
RAC: 0

Hi Claggy,
A decent PSU over 900 Watts is very pricey, so I've opted for the less costly solution :-) Just run 1 task at a time per GPU.

When we have the next outage I'll look into my mains supply again as well.
I didn't like it that I lost power to my kit at the same time.

Nevertheless, I am probably not the only one who has added more and more kit over time and not had a PSU-caused crash. So I still think my original point is valid. If the option exists in BOINC other than in an app_info.xml file, there is the possibility of others ending up with a system crash and no easy fix, since one cannot access BOINC as easily as a file that can be edited.

Regards,

Cliff,

Been there, Done that, Still no damm T Shirt.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 734223262
RAC: 1292360

Hi!

Thanks for sharing your experience. If it's decided to include this option, there will definitely be a "big-fat-warning" next to it in the Web GUI, with the input from this thread. It's definitely still on the radar of the devs.

Update: it will be done this week, see Bernd's announcement on new app versions.

Cheers
HB

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250682743
RAC: 35161

Quote:
Update: it will be done this week, see Bernd's announcement on new app versions.

Done.

BM
