With the impending end of BRP6 (Parkes) work availability, many of us who have worked exclusively with the cuda55 application are faced with running BRP4G as the only available work type, pending the hoped-for GPU version of the Gamma-Ray pulsar search.
Until today, BRP4G was only available to nvidia users in cuda32 form but, apparently sparked by Mumak's inquiry, Bernd announced release of a cuda55 variant.
Sadly, multiple users have seen tasks from the new variant fail in much the same way, often in the first few seconds. A common symptom has the web page for the task showing "-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION".
This problem has occurred on multiple hosts with different versions of Windows, different nvidia driver versions, and different models of GPU card.
example error on host 11368689
example error on host 3409259
example error on host 12421865
example error on host 12254311
example error on host 12318766
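For anyone puzzled by the two numbers in that error string: they are the same 32-bit value, shown once as a signed decimal exit code and once as the unsigned hex status code. A minimal sketch of the conversion follows; the helper name `to_ntstatus` is made up for illustration and is not part of BOINC or the Einstein@Home application.

```python
def to_ntstatus(exit_code: int) -> str:
    """Render a signed 32-bit exit code as an unsigned 32-bit hex string.

    Hypothetical helper for illustration only.
    """
    # Masking with 0xFFFFFFFF reinterprets the negative value as unsigned.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741819))  # prints 0xC0000005
```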
Despite the quite recent initial release, it is actually rather easy to find examples of the syndrome. While some tasks appear to have run times far longer than just a few seconds, I harbor a suspicion that this just reflects a longer delay before the user responded to a Windows error notification, as the stderr does not log any useful activity in the cases I have reviewed.
Possibly in response to these observations, or perhaps based on other data available at the mother ship, Bernd announced withdrawal of the cuda55 BRP4G variant from service pending analysis.
Why bother with this thread? I thought people interested in the potential availability and status of a cuda55 form of BRP4G might find it useful.
I too was encouraged by the release of the CUDA55 variant last night. Unfortunately, I have already returned one error of the -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION type. Still have many queued up. Question is to abort them or let them run for analysis by the developers.
example error on host 3967953
Keith Myers wrote: Question is to abort them or let them run for analysis by the developers.
There is so very little information in stderr for this particular situation, and it seems so uniform, that I doubt adding more instances from the same machine to the pile would help.
As has been pointed out to me many times over at SETI, the stderr.txt file that gets sent back to the project is just a "friendly name" file and NOT the science result. The science result is a completely different file and is not normally viewable by the public. That file is the one the scientists need to troubleshoot what went wrong. It is only available for examination while it is still in the slot directory the task was run in and has NOT already been uploaded to the project. That makes it hard for the user to find and examine unless you know what you are doing and are quick enough to copy it to another directory before it gets deleted. At least that is how it works at SETI; I am assuming the same conditions apply here at Einstein.
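If someone wants to try grabbing those files before they vanish, something like the following might do it. This is only a sketch under assumptions: the function name `snapshot_slot` and the example paths in the comment are hypothetical, the slot number varies per task, and files the running application holds open may refuse to copy on Windows.

```python
import shutil
import time
from pathlib import Path


def snapshot_slot(slot_dir, backup_root):
    """Copy a whole slot directory into a timestamped folder under backup_root.

    Hypothetical helper; both arguments are paths chosen by the user.
    """
    slot_dir = Path(slot_dir)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dest = Path(backup_root) / f"{slot_dir.name}_{stamp}"
    # copytree creates dest (and any missing parents) and copies everything in the slot.
    shutil.copytree(slot_dir, dest)
    return dest


# Example call (paths are assumptions, adjust to your own BOINC data directory):
# snapshot_slot(r"C:\ProgramData\BOINC\slots\0", r"C:\boinc_slot_backups")
```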
It seems that only the x64 version of BRP4G-CUDA55 (v1.56) had a problem.
I have completed several v1.57 tasks on an XP32 machine and was surprised by the speed-up on a 750 Ti: from 2600s for cuda32 to 2000s (running 1 WU).
-----
My GTX660 is doing 48 BRP4G betas a day for, obviously, 48000 credit; it was doing 18 BRP6 betas a day for 79200 credit. Is this what is intended?
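To make the comparison concrete, here is the arithmetic implied by those figures. The per-task credit values are simply derived from the numbers quoted in this post, not taken from any project documentation.

```python
# Daily figures as quoted in the post above.
brp4g_tasks_per_day, brp4g_credit_per_day = 48, 48000
brp6_tasks_per_day, brp6_credit_per_day = 18, 79200

# Implied credit per task for each work type.
brp4g_per_task = brp4g_credit_per_day / brp4g_tasks_per_day
brp6_per_task = brp6_credit_per_day / brp6_tasks_per_day

# Fraction of the former daily credit this card now earns on BRP4G.
daily_ratio = brp4g_credit_per_day / brp6_credit_per_day

print(brp4g_per_task, brp6_per_task, round(daily_ratio, 2))  # prints 1000.0 4400.0 0.61
```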
My 1070 is doing 96 a day (3x every 45min). I think the 6G WUs were more PPD but so goes it.
mmonnin wrote: My 1070 is doing 96 a day (3x every 45min).
My HD7950 is completing four of these WUs every hour under Linux, so it does not look bad compared to your GTX1070.
My GTX 660Ti is finishing 1 every 30 minutes as it kicks in as backup of GPUGRID.
Just info.
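For comparing the rates quoted in the last few posts, it may help to put them all on a per-day basis. A small sketch; the function name `tasks_per_day` is made up for illustration, and the inputs are just the figures posted above.

```python
def tasks_per_day(tasks: int, minutes: float) -> float:
    """Convert 'tasks finished per interval of minutes' into tasks per 24 hours."""
    return tasks * 24 * 60 / minutes


print(tasks_per_day(3, 45))  # GTX 1070, 3 at a time every 45 min: prints 96.0
print(tasks_per_day(4, 60))  # HD7950, four per hour: prints 96.0
print(tasks_per_day(1, 30))  # GTX 660Ti, one every 30 min: prints 48.0
```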
Not quite a CUDA topic, but close enough. The OpenCL BRP4G app is giving transfer buffer errors on a Mac Pro. This looks vaguely familiar from a few years back (Oliver?).
https://einsteinathome.org/host/12303547/tasks
[20:40:58][23994][ERROR] Error in OpenCL context: OpenCL Error : Error loading transfer buffer