Multi-Directional Gravitational Wave search on O3 (GPU) v1.02 () windows_x86_64
I'm seeing computation errors (I think floating point exceptions, 0xc0000090) on this app on a Windows 11 machine with NVIDIA GPUs (980 Ti and 980; both have the same compute capability). I've tried changing NVIDIA driver versions, but it didn't make a difference.
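For anyone trying to decode similar exit codes: a minimal Python sketch that maps the Windows NTSTATUS floating-point exception values to their names. The table is taken from the standard `ntstatus.h` definitions; the helper name `describe_exit_code` is made up for illustration, and the masking step assumes the code may be reported as a signed 32-bit number.

```python
# Windows NTSTATUS values for floating-point exceptions (from ntstatus.h).
FLOAT_EXCEPTIONS = {
    0xC000008D: "STATUS_FLOAT_DENORMAL_OPERAND",
    0xC000008E: "STATUS_FLOAT_DIVIDE_BY_ZERO",
    0xC000008F: "STATUS_FLOAT_INEXACT_RESULT",
    0xC0000090: "STATUS_FLOAT_INVALID_OPERATION",
    0xC0000091: "STATUS_FLOAT_OVERFLOW",
    0xC0000092: "STATUS_FLOAT_STACK_CHECK",
    0xC0000093: "STATUS_FLOAT_UNDERFLOW",
}


def describe_exit_code(code: int) -> str:
    """Return the NTSTATUS name for a task exit code (hypothetical helper)."""
    # Exit codes are sometimes shown as negative signed 32-bit integers,
    # so normalise to the unsigned form before the lookup.
    code &= 0xFFFFFFFF
    return FLOAT_EXCEPTIONS.get(code, f"unknown (0x{code:08X})")


if __name__ == "__main__":
    print(describe_exit_code(0xC0000090))
```

So 0xc0000090 would be an invalid floating-point operation rather than, say, a divide by zero or an overflow.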
e.g. https://einsteinathome.org/task/1394004030
I realize people are on holidays, but thought I'd ask in case anyone else is seeing this.
Copyright © 2024 Einstein@Home. All rights reserved.
Got a bunch of those this morning on a Windows 10 host running an NVIDIA RTX A2000. My only other host that is currently GPU-enabled is my programming laptop, which has a GTX 1060. That one received 4 of those tasks a few minutes ago, but they have not started yet.
Binary Radio Pulsar Search (MeerKAT) is running on the same machine just fine, though I noticed in general that not all of the received and returned WUs are showing up under my account here, but that's a story for another thread...
Well, after the last post, I got a lot more of those GPU WUs, and all of them terminate with a computation error (after anywhere from a couple of seconds to a couple of minutes). And now, for the second time within an hour, the whole system blue screened and rebooted.... :(
If you don't already use it, get MSI Afterburner; it works for both AMD and NVIDIA GPUs. Check the heat that is being generated on that laptop by those O3 tasks, then scroll down and check the CPU temps as well. You could be seeing problems due to overheating.
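If you would rather log temperatures from a script than watch a GUI, here is a rough Python sketch that polls `nvidia-smi` (installed with the NVIDIA driver) instead of MSI Afterburner. It only covers NVIDIA GPUs, not CPU temps. The function names are made up for illustration, and the parser assumes every field comes back as a plain number, which may not hold on every card.

```python
import subprocess

# nvidia-smi query: one CSV row per GPU, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_rows(text: str) -> list:
    """Turn nvidia-smi CSV output into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        index, name, temp, util = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "temp_c": int(temp),    # assumes a numeric reading
            "util_pct": int(util),  # assumes a numeric reading
        })
    return rows


def read_gpu_temps() -> list:
    """Run nvidia-smi once and return the parsed rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_rows(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in read_gpu_temps():
            print(f"GPU {gpu['index']} ({gpu['name']}): "
                  f"{gpu['temp_c']} C, {gpu['util_pct']}% load")
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print(f"Could not query nvidia-smi: {err}")
```

Running it in a loop while one of the O3 tasks starts up should show whether the GPU is anywhere near its thermal limit before the error appears.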
No, I don't run that program (yet), but I doubt that it is a heat-related issue. On one batch that then led to a blue screen, the WUs went from 0 to 90% in a couple of seconds, then showed a computation error a couple of seconds later, before the machine even had a chance to run hot. It also runs other GPU tasks just fine (like the OPNG ones from WCG).
Also, when the latest batch came in last night, I suspended all the NVIDIA tasks (the Intel GPU tasks ran just fine all this time, and so did the CPU tasks), then resumed them manually one by one until I went to bed, and again this morning when I got back to my desk. Those WUs run and finish just fine (I just still don't get any credit for them)...
So you are running tasks on your CPU, your GPU AND the GPU built into the CPU, all at the same time? AND all of that on a laptop as well?
I started seeing these errors on a machine with a successful validation rate, so something changed. These work units have multiple "Error while computing" instances.
Here is one of these units. Workunit 693628022 | Einstein@Home (einsteinathome.org)
Here is another from my other computer. Workunit 694423029 | Einstein@Home (einsteinathome.org)
Thank you.
I enabled some extra logging, and I am also seeing an error with the AMD/ATI version of the task.
Log snippet here:
The NVIDIA version is throwing a floating point error:
The AMD/ATI version is throwing an Account/Password error:
Am happy to run specific tests or provide more data.
Best regards
Colin
The project has acknowledged an issue with the Windows application, but they won't be able to address it until early next year.
https://einsteinathome.org/content/multi-directional-gravitational-wave-search-o3-data-o3md1f?page=4#comment-205427
Thanks for confirming other folks have the issues too.
I've modified my cc_config.xml to disable the O3MDF tasks, and everything else is running fine.
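For anyone wanting to do something similar, one way is BOINC's `<exclude_gpu>` option in cc_config.xml, which can keep a single app off a given GPU type. The Python sketch below just generates that fragment. It is only a guess at the kind of change described above, not the exact edit: the project URL and the app short name `einstein_O3MDF` are assumptions, so check them against your own client's event log before using them.

```python
import xml.etree.ElementTree as ET

# Assumed values: verify both against your own BOINC client before use.
PROJECT_URL = "https://einsteinathome.org/"
APP_NAME = "einstein_O3MDF"


def build_cc_config(project_url: str, app_name: str, gpu_type: str = "NVIDIA") -> str:
    """Build a cc_config.xml fragment with one <exclude_gpu> entry."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    # No <device_num> is written, so all GPUs of this type are excluded.
    ET.SubElement(exclude, "type").text = gpu_type
    ET.SubElement(exclude, "app").text = app_name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config(PROJECT_URL, APP_NAME))
```

If you already have a cc_config.xml, merge the `<exclude_gpu>` element into the existing `<options>` section by hand rather than overwriting the file, then have the client re-read its config files or restart it.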