Can anyone explain this result?

Darrell
Joined: 11 Nov 04
Posts: 32
Credit: 15397991
RAC: 0

Except for the one CasA task that threw an exception, the others have completed, but the validator is marking them all as invalid. I should probably change the exclusions and put the HD3000 to rest.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3000752006
RAC: 694498

There's an interesting stderr posted at MilkyWay, which shows one possible way problems like this can happen - Milkyway thread 3534.

Scenario is very different - OS X, NVidia plus Intel GPU, and I don't trust MilkyWay as a reliable witness either. But these lines seem unambiguous:

BOINC GPU type suggests using OpenCL vendor 'NVIDIA Corporation'
...
Found 1 platform
Platform 0 information:
Name: Apple
Version: OpenCL 1.2 (Dec 8 2013 21:07:05)
Vendor: Apple
...
Didn't find preferred platform
Using device 0 on platform 0
Found 2 CL devices
Device 'HD Graphics 4000' (Intel:0x1024400) (CL_DEVICE_TYPE_GPU)
...
Double extension: (none)
Device doesn't support double precision


and the task duly failed.
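
The log suggests a selection path along these lines: the app looks for a platform whose vendor matches BOINC's hint, finds only Apple's platform, and falls back to device 0 on platform 0, which happens to be the Intel GPU. Below is a minimal Python sketch of that fallback. `Platform`, `Device` and `pick_device` are hypothetical names for illustration, not MilkyWay's actual code, and the second device is an assumption, since the log elides it.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    vendor: str
    extensions: set = field(default_factory=set)


@dataclass
class Platform:
    vendor: str
    devices: list


def pick_device(platforms, preferred_vendor):
    """Return (device, matched) using the fallback the log appears to show."""
    for platform in platforms:
        if platform.vendor == preferred_vendor:
            return platform.devices[0], True
    # No platform matches the hint: fall back to device 0 on platform 0.
    return platforms[0].devices[0], False


# Hypothetical host modelled on the stderr: one Apple platform, two devices.
# The second device is not shown in the log; it is assumed to be the NVIDIA card.
apple = Platform(
    vendor="Apple",
    devices=[
        Device("HD Graphics 4000", "Intel"),
        Device("NVIDIA GPU (assumed)", "NVIDIA Corporation", {"cl_khr_fp64"}),
    ],
)

device, matched = pick_device([apple], "NVIDIA Corporation")
print(device.name, matched)
```

Matching on the platform vendor cannot succeed here because every device sits under the single Apple platform; matching on the device vendor instead would be one possible way around that.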

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

The first of some changes is now added in https://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=c42457f30559596f83ef110c8a45745bbaafa5d1 (and a fix for it)

Quote:
API: return CL_INVALID_DEVICE from boinc_get_opencl_ids() if init_data.xml passes a value for gpu_opencl_dev_index which does not correspond to an OpenCL capable device.


Before Einstein rushes to use this API, please wait, as there may be another buglet lurking in the grass here. I will continue to monitor the conversation.
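
As a rough illustration of the behaviour the quoted commit message describes, here is a minimal Python sketch. It is not the BOINC API: `lookup_opencl_device` and its list argument are hypothetical, and only the error code value is taken from the OpenCL headers.

```python
CL_SUCCESS = 0
CL_INVALID_DEVICE = -33  # value defined in the OpenCL headers


def lookup_opencl_device(opencl_devices, gpu_opencl_dev_index):
    """Return (status, device) for the index passed in by the client.

    opencl_devices is a hypothetical list of the OpenCL-capable devices
    of the requested vendor, in OpenCL enumeration order.
    """
    if gpu_opencl_dev_index < 0 or gpu_opencl_dev_index >= len(opencl_devices):
        # The index does not correspond to an OpenCL-capable device.
        return CL_INVALID_DEVICE, None
    return CL_SUCCESS, opencl_devices[gpu_opencl_dev_index]


devices = ["HD 5850"]  # the only OpenCL-capable GPU in this example
ok_status, ok_device = lookup_opencl_device(devices, 0)
bad_status, bad_device = lookup_opencl_device(devices, 1)
print(ok_status, ok_device)
print(bad_status, bad_device)
```

An app that checks the returned status can then exit cleanly instead of silently running on whichever device it finds first.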

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

BOINC Git wrote:

client: don't try to run OpenCL jobs on non-OpenCL GPUs

Suppose
- the host has 2 GPUs of same vendor; A is OpenCL capable, B isn't
- the volunteer sets "use_all_gpus" config flag
Then the client will try to run OpenCL jobs on B.
Depending on how the app is written,
it may run on B and fail, or run on A and overload A.

Solution: when assigning GPUs to OpenCL jobs,
check that the GPU instance is OpenCL capable.

Note: this problem would go away if we treated each GPU as a separate resource.


source

This will come in a future BOINC client (past 7.3.17 at least).
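
To make the commit message concrete, here is a minimal Python sketch of the check it describes. It assumes a much simpler model than the real client: `GpuInstance` and `assign_gpu` are hypothetical names, and each instance runs at most one job.

```python
from dataclasses import dataclass


@dataclass
class GpuInstance:
    device_num: int
    name: str
    opencl_capable: bool
    busy: bool = False


def assign_gpu(instances, needs_opencl):
    """Return the first free instance the job can use, or None."""
    for gpu in instances:
        if gpu.busy:
            continue
        if needs_opencl and not gpu.opencl_capable:
            continue  # the check the commit message describes
        gpu.busy = True
        return gpu
    return None


# Two same-vendor GPUs, as in the commit message: A is OpenCL capable, B is not.
gpus = [GpuInstance(0, "A", True), GpuInstance(1, "B", False)]

first = assign_gpu(gpus, needs_opencl=True)    # gets A
second = assign_gpu(gpus, needs_opencl=True)   # A is busy and B is unsuitable
third = assign_gpu(gpus, needs_opencl=False)   # a non-OpenCL job can still use B
print(first.name, second, third.name)
```

In this model the second OpenCL job simply waits, rather than being sent to B or doubled up on A.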

mikey
Joined: 22 Jan 05
Posts: 12855
Credit: 1884342265
RAC: 309863

Quote:
BOINC Git wrote:

client: don't try to run OpenCL jobs on non-OpenCL GPUs

Suppose
- the host has 2 GPUs of same vendor; A is OpenCL capable, B isn't
- the volunteer sets "use_all_gpus" config flag
Then the client will try to run OpenCL jobs on B.
Depending on how the app is written,
it may run on B and fail, or run on A and overload A.

Solution: when assigning GPUs to OpenCL jobs,
check that the GPU instance is OpenCL capable.

Note: this problem would go away if we treated each GPU as a separate resource.


source

This will come in a future BOINC client (past 7.3.17 at least).

WOO HOO!!! Thanks Jord!!!

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3000752006
RAC: 694498

Quote:
Quote:
BOINC Git wrote:

client: don't try to run OpenCL jobs on non-OpenCL GPUs

Suppose
- the host has 2 GPUs of same vendor; A is OpenCL capable, B isn't
- the volunteer sets "use_all_gpus" config flag
Then the client will try to run OpenCL jobs on B.
Depending on how the app is written,
it may run on B and fail, or run on A and overload A.

Solution: when assigning GPUs to OpenCL jobs,
check that the GPU instance is OpenCL capable.

Note: this problem would go away if we treated each GPU as a separate resource.


source

This will come in a future BOINC client (past 7.3.17 at least).

WOO HOO!!! Thanks Jord!!!


Trouble is, unless the platform is also enforced, the commit as described (I haven't burrowed down into the actual code) won't solve the MilkyWay problem.

The Intel 'HD Graphics 4000' is fully OpenCL 1.2 capable (we use them in that mode on this project too), but lacks the other attributes that MW requires.
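
The distinction can be sketched in a few lines of Python: being OpenCL capable is not the same as meeting an app's requirements, such as double precision. `usable_devices` is a hypothetical helper, and the extension sets below are illustrative rather than taken from real hardware queries; `cl_khr_fp64` is the standard OpenCL double-precision extension name.

```python
def usable_devices(devices, required_extensions):
    """Keep only devices that advertise every required extension."""
    required = set(required_extensions)
    return [name for name, exts in devices if required <= set(exts)]


# Hypothetical device list; the extension sets are illustrative only.
devices = [
    ("HD Graphics 4000", []),                   # OpenCL capable, no double precision
    ("NVIDIA GPU (assumed)", ["cl_khr_fp64"]),  # advertises double precision
]

fp64_capable = usable_devices(devices, ["cl_khr_fp64"])
any_opencl = usable_devices(devices, [])
print(fp64_capable)
print(any_opencl)
```

A client-side check that only asks "is this GPU OpenCL capable?" would pass both devices, so the app still has to select the right one itself.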

mikey
Joined: 22 Jan 05
Posts: 12855
Credit: 1884342265
RAC: 309863

Quote:
Quote:
Quote:
BOINC Git wrote:

client: don't try to run OpenCL jobs on non-OpenCL GPUs

Suppose
- the host has 2 GPUs of same vendor; A is OpenCL capable, B isn't
- the volunteer sets "use_all_gpus" config flag
Then the client will try to run OpenCL jobs on B.
Depending on how the app is written,
it may run on B and fail, or run on A and overload A.

Solution: when assigning GPUs to OpenCL jobs,
check that the GPU instance is OpenCL capable.

Note: this problem would go away if we treated each GPU as a separate resource.


source

This will come in a future BOINC client (past 7.3.17 at least).

WOO HOO!!! Thanks Jord!!!


Trouble is, unless the platform is also enforced, the commit as described (I haven't burrowed down into the actual code) won't solve the MilkyWay problem.

The Intel 'HD Graphics 4000' is fully OpenCL 1.2 capable (we use them in that mode on this project too), but lacks the other attributes that MW requires.

I wasn't thinking of that particular issue, but I don't think it will be as easy as 'turning it on' either. Over time, though, people are crunching with more than 3 GPUs in a single machine, each on a different project, all on the same one, or some combination of the two. BOINC will figure it out; it may just take a few releases to do it completely seamlessly.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

Giving an early warning: Darrell, you may want to look out for 7.3.19 once it is available for testing (it is not yet at the time of writing), as it will include a fix for your problem. Or at least, the developers hope it'll fix your problem. :-)

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

It is available now. So if you can, Darrell, please install this version, then re-allow your HD3000 in combination with the HD58x0 and see if you still see the behaviour from this thread. You shouldn't.

- boinc_7.3.19_windows_intelx86.exe
- boinc_7.3.19_windows_x86_64.exe

Darrell
Joined: 11 Nov 04
Posts: 32
Credit: 15397991
RAC: 0

Before installing 7.3.19 I had 1 Milkyway OpenCL task and 1 Seti OpenCL task on device 0 (HD5850), and 1 Moo wrapper task and 1 Primegrid Genefer (OpenCL) task on device 1 (HD3000). After installation, no change. I will have to wait and see what happens when a CasA task gets downloaded (right now it is excluded from device 0).

I have 1 Solo_collatz OpenCL task waiting to start. The problem with the Collatz tasks is that the Brook/Cal solo and OpenCL solo apps have the same short app name, so the OpenCL one is excluded from device 0 as well. It will be interesting to see where BOINC starts it.

A couple of days ago I had four OpenCL tasks running, 2 on the HD5850 and 2 on the HD3000. Process Explorer showed that the top thread of all four tasks was amdocl64.dll, so in reality they were probably all using the HD5850, since it and the CPU are the only OpenCL-capable devices.
