Curious... Just noticed the option for ABP Search (SP) on my Preferences. Am I to assume it's referring to the CUDA App...?
Not exactly.
The two options for "Arecibo Pulsar Search" and "ABP Search (SP)" refer to two different searches, ABP1 and ABP2.
Currently only ABP1 work is distributed; there are ABP1 apps for CPU and GPU (CUDA). If you want to continue crunching ABP1 tasks on the CPU but not on the GPU, deselect the "Use GPU" option on the same configuration screen.
While the ABP1 search does some of its processing in single precision (e.g. the part that also runs on the GPU in the CUDA app), other parts are still done in double precision. The new ABP2 apps will do nearly everything in single precision, which will also make it possible to put more load on the GPU with high efficiency.
CU
Bikeman
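To make the single- versus double-precision distinction concrete, here is a minimal Python sketch. It is only an illustration and is not taken from the ABP apps: the helper names `to_single` and `single_sum` are made up for this example, and single precision is emulated by rounding through the standard `struct` module.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def single_sum(values):
    """Add values as a single-precision unit would: round after every addition."""
    total = 0.0
    for v in values:
        total = to_single(total + to_single(v))
    return total


if __name__ == "__main__":
    # 0.1 is stored with a visibly larger error in single precision.
    print(repr(0.1), repr(to_single(0.1)))

    # Above 2**24 single precision can no longer represent every integer.
    print(to_single(16777217.0))  # 16777216.0

    # The same effect swallows a small addend during accumulation.
    print(sum([16777216.0, 1.0]), single_sum([16777216.0, 1.0]))
```

Single precision carries roughly 7 significant decimal digits against roughly 16 for double, so steps that need finer resolution have to stay in double precision or be restructured.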
Nah... I'm not too worried about CUDA apps; Macs can't even support CUDA. Just curious about the new SP option.
Any idea when the new ABP2 search will be implemented...?
CUDA Beta App testers should drain their work cache and switch back to the normal project work.
BM
This looks like an opt-out condition ;o) With the beta version it worked on my computers. First I simply removed the app_info.xml (after I had finished all work and stopped BOINC). But starting it again resulted in a message that BOINC was too old (installed just one month ago) to get any work from Einstein@Home. Surprising, because other computers without GPUs still use BOINC 5.10 and get work from Einstein@Home (I cannot update these, because the home directory is shared by several computers, which seems to stop newer versions of BOINC from working, regardless of the fact that they are installed in different directories for each computer).
I switched to BOINC 6.10 on this new computer, and the result is that the GPU gets no work anymore because of the driver/CUDA 2.3 condition: BOINC was not able to get the correct driver version, and it obviously uses CUDA 2.2. Well, no big problem, because there is no big difference anyway whether it works with or without the GPU. Now the CPUs have started to crunch ABP again.
The installed driver already comes from NVidia; it is not the one that Debian indicates as the current stable version. Because I do not want to reinstall new experimental drivers every month on an otherwise stable computer, for now this Core i7 crunches without the GPU again.
For another new notebook I think I will continue to crunch with this app_info.xml, just because it worked without problems and for this notebook GPU those ABP tasks are OK (looks like it is already too small or too slow for GPUGrid ;o)
...And hey, don't tell me that I can leave the project if I don't like it. This is the most antisocial attitude I've heard of. So if you don't like to share the resources of this planet with others, maybe it's time for you to leave it!
I agree with that!!
No one told anyone to leave the project. XJR-Maniac was just quite rude without reason.
If you don't want Einstein CUDA tasks, deselect them in your preferences. Nothing easier than that.
Yes, I too am unsure how that deduction was made from Gary's post. :-)
In any case the primary requirement for CUDA to yield significant benefit is that the problem must lend itself to massive parallelism ( ideally thousands of threads, plus other restrictions ). This is a basic reason ( plus of course issues like compiler technology ) that leads to variable success with apps.
The development here at E@H is quite cautious, with a considerable user pool feeding back via beta testing. CUDA is no exception. While not always successful ( a failure outcome is within the definition of testing ), one hopes to be able to productively generalise beyond the test participants. One can opt out of CUDA if it doesn't fly well enough. In fact that is likely to be a common response for those with unsuitable hardware for optimal CUDA use. Alas as Oliver pointed out, without changing BOINC code ( not under E@H control ) then a default setting of opt-out was/is not available.
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Hmmm ... the CUDA app also seems to have a similar problem to GPUGrid, in that it fails to correctly detect CUDA hardware on a Windows host that has been connected to with Remote Desktop and has not yet been accessed again from the console:
6.10.17
Activated exception handling...
[20:00:57][4508][INFO ] Starting data processing...
[20:00:57][4508][INFO ] Using CUDA device #0 "Device Emulation (CPU)" (518.40 GFLOPS)
[20:00:57][4508][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[20:00:58][4508][WARN ] Couldn't allocate 25165824 bytes of CUDA pinned host memory for resampled time series! Trying fallback...
[20:00:58][4508][WARN ] Couldn't allocate 25165832 bytes of CUDA pinned host memory for resampled time series FFT! Trying fallback...
[20:00:58][4508][ERROR] Error allocating CUDA device memory: 25165832 bytes (no CUDA-capable device is available)
[20:00:58][4508][ERROR] Demodulation failed (error: 3)!
20:00:58 (4508): called boinc_finish
The GPU in this host is a 512MB 9800GT:
6.10.17
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce 9800 GT"
# Clock rate: 1.62 GHz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 14
# Number of cores: 112
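As a quick sanity check on the numbers in the two logs above, the sketch below converts the byte counts to MiB. The figures are copied from the logs; the helper name `to_mib` is just for illustration.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / MIB


# Values copied from the log excerpts.
device_memory = 536543232   # "Total amount of global memory"
series_buffer = 25165824    # resampled time series
fft_buffer = 25165832       # resampled time series FFT

if __name__ == "__main__":
    print(f"device memory : {to_mib(device_memory):.4f} MiB")
    print(f"series buffer : {to_mib(series_buffer):.4f} MiB")
    print(f"FFT buffer    : {to_mib(fft_buffer):.4f} MiB")
    share = (series_buffer + fft_buffer) / device_memory
    print(f"both buffers  : {share:.1%} of device memory")
```

Each failed request is about 24 MiB on a card that reports just under 512 MiB, so the failure fits the error text (no CUDA-capable device available) much better than an out-of-memory condition.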
GPUGrid work units that are already running will continue to run without issue across a remote desktop session, but no new work units can be started until you have logged in again from the console first. This may be related to GPUGrid seemingly using a deprecated function to test for the presence of CUDA hardware: I'm not sure if you have the same problem.
My current workaround for GPUGrid is to suspend work fetch when I am expecting to be away and to suspend all tasks except the currently active one. This strategy, of course, regularly results in GPU idle time.
Unfortunately, the "stealth" release of the CUDA version caught me unawares and resulted in a swag of errored work units within the space of ~90 seconds (about five hours ago, while I was out) due to this problem and that, in turn, has reduced this host's quota to 2/day, so it seems I won't be doing much Einstein work on this host (CPU or GPU) for some time :-(
In an ideal world (a) the CUDA hardware would continue to be available despite the Remote Desktop video driver having been invoked and (b) if a work unit fails for this reason, the queue of GPU WUs needs to be paused, since once the first has failed, all the others in the queue *are* going to suffer a similar fate.
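Suggestion (b) could look roughly like the Python sketch below. It is not BOINC's actual scheduler logic or API: `GpuQueue`, `run_all` and `is_missing_device_failure` are hypothetical names, and the only grounded parts are the two marker strings, which are copied from the log above.

```python
from dataclasses import dataclass, field

# Strings copied from the failing task's output.
NO_DEVICE_MARKERS = (
    "no CUDA-capable device is available",
    "Device Emulation (CPU)",
)


def is_missing_device_failure(stderr_text: str) -> bool:
    """True if the task output points at a missing GPU rather than a bad work unit."""
    return any(marker in stderr_text for marker in NO_DEVICE_MARKERS)


@dataclass
class GpuQueue:
    """Hypothetical queue of GPU work units (not a BOINC data structure)."""
    pending: list
    paused: bool = False
    errored: list = field(default_factory=list)

    def run_all(self, run_task):
        """Run tasks in order; run_task(name) must return (ok, stderr_text)."""
        while self.pending and not self.paused:
            name = self.pending.pop(0)
            ok, stderr_text = run_task(name)
            if ok:
                continue
            self.errored.append(name)
            # One missing-device failure means every queued task would fail too,
            # so stop here instead of burning through the rest of the queue.
            if is_missing_device_failure(stderr_text):
                self.paused = True


if __name__ == "__main__":
    def always_fails(name):
        return False, "Error allocating CUDA device memory (no CUDA-capable device is available)"

    queue = GpuQueue(pending=["wu1", "wu2", "wu3"])
    queue.run_all(always_fails)
    print(queue.errored, queue.pending, queue.paused)
```

With a policy like this only the first work unit is lost; the remaining ones stay queued until the device is visible again.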
That has nothing to do with any project's application but everything to do with Remote Desktop and how Microsoft handles the graphics drivers.
There have been plenty of threads at SETI and BOINC dev about that topic.
Regards,
Gundolf
I don't wish to appear rude, but did you actually read my post?
a) "... GPUGrid appears to be using a deprecated function ..."
b) If it is "unfixable" (and I'm not yet convinced, but I don't frequent the SETI board), then my suggestion that "if a work unit fails for this reason, the queue of GPU WUs needs to be paused" seems all the more relevant.
More importantly, running work units don't spontaneously abort upon a Remote Desktop connection being initiated, so the CUDA hardware clearly remains accessible to the *running* app, which leads me to question whether there is an assertion in the start-up code that isn't testing what it thinks it is testing.
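The kind of start-up check in question can be illustrated with a small Python sketch. Both functions are hypothetical and neither is taken from the Einstein@Home or GPUGrid sources; the only grounded detail is the device name "Device Emulation (CPU)", which is what the failing task reported in the log above.

```python
EMULATION_NAME = "Device Emulation (CPU)"  # name reported in the failing log


def naive_check(device_names):
    """Passes as soon as any device is listed, emulated or not."""
    return len(device_names) > 0


def strict_check(device_names):
    """Passes only if at least one listed device is not the CPU emulation device."""
    return any(name != EMULATION_NAME for name in device_names)


if __name__ == "__main__":
    console_session = ["GeForce 9800 GT"]
    remote_session = [EMULATION_NAME]

    for label, names in (("console", console_session), ("remote", remote_session)):
        print(label, naive_check(names), strict_check(names))
```

A check like `naive_check` would let the task start under Remote Desktop and fail later at the first device allocation, which matches the sequence in the log; whether the real apps test this way is an assumption.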
As a follow up:
[BOINC] #936: CUDA devices not detected when logged in through Remote Desktop
[pre]
---------------------------+------------------------------------------------
  Reporter:  mart0258      |       Owner:
      Type:  Enhancement   |      Status:  closed
  Priority:  Undetermined  |   Milestone:  Undetermined
 Component:  Undetermined  |     Version:  6.6.31
Resolution:  fixed         |    Keywords:
---------------------------+------------------------------------------------[/pre]
Changes (by romw):
* status: new => closed
* resolution: => fixed
Comment:
This is now fixed in the 6.10 version of the BOINC client.
"resolution: => fixed" doesn't seem consistent with your comment, although this issue doesn't actually seem to be fixed as of 6.10.17.
Hmmm ... the CUDA app also seems to have a similar problem to GPUGrid, in that it fails to correctly detect CUDA hardware on a Windows host that has been connected to with Remote Desktop and has not yet been accessed again from the console:
Hi Jim,
We (and the BOINC team) are already aware of that problem. Do you use Windows Vista or Windows 7 and run BOINC as a service?
Thanks,
Oliver
Einstein@Home Project
Any idea when the new ABP2 search will be implemented...?
According to Oliver's message here, probably in the next 1-2 weeks. I guess the road-map will be something like
- Beta-Test for ABP2 CPU app
- Beta-Test for ABP2 CUDA app
- Release of ABP2 CPU app
- Release of ABP2 CUDA app
CU
Bikeman