Except for the one CasA that threw an exception, the others have completed, but the validator is marking them all as invalid. I should probably change the exclusions and put the HD3000 to rest.
There's an interesting stderr posted at MilkyWay, which shows one possible way problems like this can happen - Milkyway thread 3534.
Scenario is very different - OS X, NVidia plus Intel GPU, and I don't trust MilkyWay as a reliable witness either. But these lines seem unambiguous:
and the task duly failed.
The first of some changes is now added in https://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=c42457f30559596f83ef110c8a45745bbaafa5d1 (and a fix for it):

API: return CL_INVALID_DEVICE from boinc_get_opencl_ids() if init_data.xml passes a value for gpu_opencl_dev_index which does not correspond to an OpenCL capable device.

Before Einstein runs off to use this API, please wait, as there may be another buglet lurking in the grass here. I will continue to monitor the conversation.
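The commit message above describes a bounds check on the device index passed in from init_data.xml. A minimal C sketch of that kind of validation, under stated assumptions: validate_opencl_dev_index and num_opencl_devices are hypothetical names of mine, not BOINC's; only the CL_INVALID_DEVICE value (-33) matches the real OpenCL header.

```c
#include <stddef.h>

/* Error codes mirroring OpenCL's cl.h values. */
#define CL_SUCCESS 0
#define CL_INVALID_DEVICE -33

/* Sketch: given how many OpenCL-capable devices the client enumerated,
 * reject a gpu_opencl_dev_index that doesn't name one of them. */
static int validate_opencl_dev_index(int gpu_opencl_dev_index,
                                     int num_opencl_devices) {
    if (gpu_opencl_dev_index < 0 || gpu_opencl_dev_index >= num_opencl_devices) {
        return CL_INVALID_DEVICE;
    }
    return CL_SUCCESS;
}
```

An app calling the real boinc_get_opencl_ids() would then see the error and exit cleanly instead of running on a device that cannot execute its kernels.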
BOINC Git wrote:

client: don't try to run OpenCL jobs on non-OpenCL GPUs

Suppose
- the host has 2 GPUs of same vendor; A is OpenCL capable, B isn't
- the volunteer sets "use_all_gpus" config flag
Then the client will try to run OpenCL jobs on B. Depending on how the app is written, it may run on B and fail, or run on A and overload A.

Solution: when assigning GPUs to OpenCL jobs, check that the GPU instance is OpenCL capable.

Note: this problem would go away if we treated each GPU as a separate resource.

source
This will come in a future BOINC client (past 7.3.17 at least).
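The scenario in the quoted commit boils down to a selection loop that must not hand an OpenCL job the first free GPU. A sketch under stated assumptions: struct gpu_instance and pick_gpu_for_opencl_job are hypothetical stand-ins of mine for the client's real coprocessor bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical model of one detected GPU of a given vendor. */
struct gpu_instance {
    int device_num;      /* index as the driver reports it */
    bool opencl_capable; /* did OpenCL enumeration find this instance? */
};

/* The fix in sketch form: when assigning a GPU to an OpenCL job,
 * skip instances that lack OpenCL support rather than taking the
 * first instance use_all_gpus made visible. Returns -1 if no
 * usable device exists. */
static int pick_gpu_for_opencl_job(const struct gpu_instance *gpus, int n) {
    for (int i = 0; i < n; i++) {
        if (gpus[i].opencl_capable) {
            return gpus[i].device_num;
        }
    }
    return -1;
}
```

Without the capability check, the loop would return device B (the non-OpenCL GPU) whenever it happened to be free, which is exactly the "run on B and fail, or run on A and overload A" behaviour described above.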
WOO HOO!!! Thanks Jord!!!
Trouble is, unless the platform is also enforced, the commit as described (I haven't burrowed down into the actual code) won't solve the MilkyWay problem.
The Intel 'HD Graphics 4000' is fully OpenCL 1.2 capable (we use them in that mode on this project too), but lacks the other attributes that MW requires.
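The post above points at the gap: being OpenCL capable at all is not the same as meeting a particular app's requirements. A sketch of the extra per-app check, under stated assumptions: the missing HD 4000 attribute is assumed here to be something advertised in the device's extension string (for example cl_khr_fp64 double precision — my guess, not stated in the post), and device_meets_app_requirements is a hypothetical helper, not a BOINC or OpenCL function.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch: decide whether a device satisfies an app's requirement by
 * searching the extension string that clGetDeviceInfo(...,
 * CL_DEVICE_EXTENSIONS, ...) would return. Simplification: strstr()
 * also matches substrings of longer extension names; a real check
 * would tokenize on whitespace. */
static bool device_meets_app_requirements(const char *extensions,
                                          const char *required_ext) {
    return strstr(extensions, required_ext) != NULL;
}
```

A scheduler that enforced this per app (or per plan class), not just plain OpenCL capability, is what the "unless the platform is also enforced" objection is asking for.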
I wasn't thinking of that particular issue, but I don't think it will be as easy as 'turning it on' either. Over time, though, people are now crunching with more than 3 GPUs in a single machine, each on a different project, or all on the same one, or even some combination of that. BOINC will figure it out; it may just take a few releases to do it completely seamlessly.
Giving an early warning: Darrell may want to look out for 7.3.19, when that one is available for testing (at the time of writing, not yet), as it will have a fix for your problem included. Or at least, the developers hope it'll fix your problem. :-)
It is available now. So if you can, please install this version, Darrell, then re-allow your HD3000 in combination with the HD58x0 and see if you still see the behaviour from this thread. You shouldn't.
- boinc_7.3.19_windows_intelx86.exe
- boinc_7.3.19_windows_x86_64.exe
Before installing 7.3.19 I had 1 MilkyWay OpenCL task and 1 SETI OpenCL task on device 0 (HD5850), plus 1 Moo wrapper task and 1 PrimeGrid Genefer (OpenCL) task on device 1 (HD3000). After installation, no change. I will have to wait and see what happens when a CasA gets downloaded (right now it is excluded from device 0).

I have 1 Solo_collatz OpenCL task waiting to start. The problem with the Collatz tasks is that the Brook/CAL solo and OpenCL solo apps have the same short app name, so it is excluded from device 0. It will be interesting to see where BOINC starts it.

A couple of days ago I had four OpenCL tasks running, 2 on the HD5850 and 2 on the HD3000. Process Explorer showed that all four tasks' top thread was amdocl64.dll, so in reality they were probably all using the HD5850, since it and the CPU are the only OpenCL-capable devices.
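The exclusion juggling described above is done with <exclude_gpu> blocks in cc_config.xml. As I understand it, the filter matches on project URL, device number, and short app name only, with no plan-class field, which is why a single exclusion catches both the Brook/CAL and OpenCL Collatz variants. A sketch of such an entry; the project URL below is illustrative, not the real one:

```xml
<cc_config>
  <options>
    <!-- Exclude device 0 for this app. The <app> element matches the
         short app name, so every plan-class variant of the app
         (Brook/CAL and OpenCL alike) is excluded on that device. -->
    <exclude_gpu>
      <url>http://project.example.com/collatz/</url>
      <device_num>0</device_num>
      <app>solo_collatz</app>
    </exclude_gpu>
  </options>
</cc_config>
```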