06-Sep-2011 20:37:23 [Einstein@Home] Task p2030.20090419.G61.17-01.48.S.b2s0g0.00000.dm_2144_0 exited with zero status but no 'finished' file
All CUDA units have been doing this for a few days now. I've tried resetting the project, but it didn't help.
"NVIDIA GPU 0: GeForce GT 240 (driver version unknown, CUDA version 4000, compute capability 1.2, 511MB, 257 GFLOPS peak)"
BOINC: 6.12.34
GPU: Nvidia GeForce GT 240 (512 MB)
Nvidia drivers: 270.41.19
CPU units are completing normally. GPU units used to work fine; this started happening without any changes on my end.
06-Sep-2011 21:34:07 [---] NVIDIA GPU 0: GeForce GT 240 (driver version unknown, CUDA version 4000, compute capability 1.2, 511MB, 257 GFLOPS peak)
I updated the Nvidia drivers to 275.09.07; no change.
Does anyone have any ideas?
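For anyone comparing several hosts, here is a minimal sketch of pulling the fields out of the GPU line that BOINC prints at startup. The pattern is inferred only from the single log line quoted above, so other BOINC versions may format it differently; `parse_gpu_line` is a hypothetical helper name, not part of BOINC.

```python
import re

# Pattern inferred from the one GPU line quoted in this post (an assumption,
# not an official BOINC log specification).
GPU_LINE = re.compile(
    r"NVIDIA GPU (?P<index>\d+): (?P<name>.+?) "
    r"\(driver version (?P<driver>[^,]+), "
    r"CUDA version (?P<cuda>\d+), "
    r"compute capability (?P<cc>[\d.]+), "
    r"(?P<mem_mb>\d+)MB, "
    r"(?P<gflops>\d+) GFLOPS peak\)"
)

def parse_gpu_line(line):
    """Return a dict of fields from a BOINC GPU line, or None if it does not match."""
    m = GPU_LINE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    # Convert the purely numeric fields; 'driver' and 'cc' stay as strings.
    for key in ("index", "cuda", "mem_mb", "gflops"):
        info[key] = int(info[key])
    return info

line = ("06-Sep-2011 21:34:07 [---] NVIDIA GPU 0: GeForce GT 240 "
        "(driver version unknown, CUDA version 4000, compute capability 1.2, "
        "511MB, 257 GFLOPS peak)")
info = parse_gpu_line(line)
print(info["name"], info["driver"], info["mem_mb"])
```

The "driver version unknown" field stays a string on purpose, since BOINC reports it as text when it cannot detect the driver.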
Copyright © 2024 Einstein@Home. All rights reserved.
exited with zero status but no 'finished' file
Did you see the error message in the erroneous CUDA tasks?
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
Nope, no version mismatch problem. This started before I updated the Nvidia drivers; I only did that to see if it would help, and it didn't. Note that if the versions didn't match, it wouldn't even have let me restart X11 after I updated the drivers. A mismatch can happen, though, if I update the drivers but don't restart X11 and leave the old Nvidia module in memory.
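A rough sketch of how one could check for that kind of mismatch on Linux: compare the version of the module currently loaded (the NVIDIA driver exposes it in `/proc/driver/nvidia/version`) with the version that was just installed. The helper names and the sample text are made up for illustration, and the regular expression assumes the "Kernel Module <version>" wording, which may differ between driver releases.

```python
import re
from pathlib import Path

# Present only while the NVIDIA kernel module is loaded.
PROC_VERSION = Path("/proc/driver/nvidia/version")

def loaded_module_version(text):
    """Extract the driver version from the contents of /proc/driver/nvidia/version."""
    m = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return m.group(1) if m else None

def is_mismatch(loaded, installed):
    """True when both versions are known and differ."""
    return loaded is not None and installed is not None and loaded != installed

# Illustrative sample text, not copied from the affected machine.
sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  270.41.19"
loaded = loaded_module_version(sample)
print(loaded, is_mismatch(loaded, "275.09.07"))

# On a real system, read the live value if the file exists.
if PROC_VERSION.exists():
    print("currently loaded:", loaded_module_version(PROC_VERSION.read_text()))
```

With the sample above, the old 270.41.19 module is still loaded while 275.09.07 is installed, which is the situation described in this post.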
Did you try rebooting the system? The only way to clear something stuck in the memory of a GPU is to reboot the whole computer.
That excerpt was from your task 246074743 sent back on 7 Sep 2011 4:38:27 UTC.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
His tasks on September 7th, September 6th, September 5th, and September 4th show a memory problem.
Now, of course, you can ignore that and keep trying to run tasks while hoping for a miracle, but it will only be fixed once enough GPU memory is free, and that can only reliably be achieved with a system reboot. There is no other way; it takes a full power cycle.
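To see how much GPU memory is actually free before and after a reboot, something like the sketch below might help. It assumes a reasonably recent `nvidia-smi` that supports the `--query-gpu` and `--format=csv,noheader,nounits` options; drivers as old as the ones in this thread may not offer them, and some cards report `[N/A]` instead of numbers, which this sketch does not handle. The function names and the sample values are made up for illustration.

```python
import shutil
import subprocess

def parse_memory_csv(output):
    """Parse 'used, total' MiB pairs (one line per GPU) into (used, total, free) tuples."""
    rows = []
    for line in output.strip().splitlines():
        # Assumes both fields are plain integers.
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total, total - used))
    return rows

def query_gpu_memory():
    """Run nvidia-smi if it is installed; return None if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_memory_csv(result.stdout) if result.returncode == 0 else None

# Made-up sample: one GPU with 480 of 511 MiB in use.
print(parse_memory_csv("480, 511\n"))
```

Calling `query_gpu_memory()` on the affected host while BOINC is idle would show whether memory is still being held after the tasks have exited.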
Right, that was after the driver update but before I did a reboot. It wasn't in that state for long; I just left BOINC running while I updated a few other things.
And the problem started before then.
Note the CUDA units seem to be completing normally now, without changing anything on this end.
Yes, the system has been restarted since the problem began.
Which is what fixed it. Next time you see loads of errors, don't ask, just reboot first. Chances are high that'll instantly fix the problem.
It was still doing it for a while after the reboot.
It started doing it again. Just to be clear: A cold boot DID NOT help.
Well, if your videocard has those errors repeatedly, and it continues immediately after a reboot, there's a good chance it's a problem with the videocard (broken memory, broken capacitors, too much heat, etc.). I don't see any other option but to either replace that videocard, or test with another.