[SOLVED] Upgrade to Cuda 5.0 NOT causing invalid results - Problem due to poor cooling.

microchip
Joined: 10 Jun 06
Posts: 50
Credit: 218941113
RAC: 59681

Quote:

Checking the system isn't a bad idea. I just hadn't considered it, since the upgrade to CUDA 5 is what caused invalid results to happen en masse. At least, that is what appeared to be the case. I am going to let the majority of my completed work validate and then start crunching again after a system evaluation.

Here is a listing of the typical temps in my system when crunching:

coretemp-isa-0000
Adapter: ISA adapter
Core 0: +72.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 3: +70.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 1: +73.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 2: +70.0°C (high = +82.0°C, crit = +100.0°C)

Gpu : N/A
Gpu : 69 C
Fan Speed : 40 %

What kind of system is it? A lappy or a desktop? I find the temp values a bit too high. Maybe you need to dust out the fan and heatsink.

Derek
Joined: 9 Feb 05
Posts: 9
Credit: 2059412
RAC: 0

Quote:
What kind of system is it? A lappy or a desktop? I find the temp values a bit too high. Maybe you need to dust out the fan and heatsink.

It is a desktop.

So, I will admit that at first I blew your comment off. But after thinking about it, I took a look inside the case and have now realized the horror that two golden retrievers can inflict on the insides of a computer. I got the temps to come down a bit. Core 0 and Core 1 fluctuate a lot; I'm not really sure what that's all about... I also noticed that my case fan stopped working at some point since July, when this computer was last cleaned out. Thanks for the comment.

coretemp-isa-0000
Adapter: ISA adapter
Core 0: +66.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 3: +59.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 1: +66.0°C (high = +82.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 2: +61.0°C (high = +82.0°C, crit = +100.0°C)

Gpu : N/A
Gpu : 68 C
Fan Speed : 38 %
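For anyone wanting to compare the two listings without eyeballing them, here is a minimal Python sketch. It assumes the `Core N: +T°C (high = ..., crit = ...)` line format shown in the posts above; the names `parse_core_temps` and `headroom` are made up for this example, and the readings are copied from the two listings in this thread.

```python
import re

# Matches lines like "Core 0: +66.0°C (high = +82.0°C, crit = +100.0°C)".
CORE_LINE = re.compile(
    r"Core\s+(\d+):\s*\+?([\d.]+)°C\s*"
    r"\(high\s*=\s*\+?([\d.]+)°C,\s*crit\s*=\s*\+?([\d.]+)°C\)"
)


def parse_core_temps(text):
    """Return {core_number: (temp, high, crit)} parsed from sensors-style text."""
    temps = {}
    for match in CORE_LINE.finditer(text):
        core = int(match.group(1))
        temps[core] = tuple(float(match.group(i)) for i in (2, 3, 4))
    return temps


def headroom(temps):
    """Degrees between each core's reading and its 'high' threshold."""
    return {core: high - temp for core, (temp, high, _crit) in temps.items()}


# Readings copied from the thread: before and after cleaning out the case.
BEFORE = """
Core 0: +72.0°C (high = +82.0°C, crit = +100.0°C)
Core 3: +70.0°C (high = +82.0°C, crit = +100.0°C)
Core 1: +73.0°C (high = +82.0°C, crit = +100.0°C)
Core 2: +70.0°C (high = +82.0°C, crit = +100.0°C)
"""

AFTER = """
Core 0: +66.0°C (high = +82.0°C, crit = +100.0°C)
Core 3: +59.0°C (high = +82.0°C, crit = +100.0°C)
Core 1: +66.0°C (high = +82.0°C, crit = +100.0°C)
Core 2: +61.0°C (high = +82.0°C, crit = +100.0°C)
"""

if __name__ == "__main__":
    before = parse_core_temps(BEFORE)
    after = parse_core_temps(AFTER)
    for core in sorted(before):
        drop = before[core][0] - after[core][0]
        print(f"Core {core}: dropped {drop:.1f} C, "
              f"headroom now {headroom(after)[core]:.1f} C")
```

On these numbers the cores dropped by 6 to 11 degrees, and the tightest margin to the `high` threshold went from 9 degrees to 16.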

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5883
Credit: 119035044879
RAC: 24737948

Quote:
... I took a look inside the case and have now realized the horror ...


Welcome to the computer cleaning club :-).

In my experience, almost invariably, when errors start creeping into results that were previously quite OK, it's time to check out the cooling efficiency. It's quite tempting to blame things like driver upgrades, etc, but when others aren't having issues with the driver in question, you've got to start looking elsewhere.

Once you find that the spring clean has cured the problem (I imagine it will) you might like to change the thread title to something like, "Upgrade to Cuda 5.0 didn't cause invalid results - it was a cooling problem caused by dust buildup", or something along those lines. I believe you can change the thread title during the one hour edit window after you make a new post to the thread. When you edit your most recent message, the thread title is editable too (I believe).

It would be helpful to have a more accurate title to inform others reading about your problem of the need to check on cooling.

Cheers,
Gary.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6592
Credit: 331181593
RAC: 300347

Quote:
.... the horror that two golden retrievers can have on the insides of a computer. I got the temps to come down a bit ....


Know the issue well! Depending on the variety, it's the finest of hairs from around the ears, perhaps also the lower legs, that get anywhere and everywhere. :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

microchip
Joined: 10 Jun 06
Posts: 50
Credit: 218941113
RAC: 59681

@ Derek

So do you still have problems with crunching after dusting out?

Derek
Joined: 9 Feb 05
Posts: 9
Credit: 2059412
RAC: 0

I got 17 work units on Dec 20th and returned them all on Dec 20th. A few have validated at this point, and the rest are waiting for validation. After a few days the rest should be validated/found invalid and then I will know if the computer is back to normal.

I ran memtest86+ overnight the 19th and found no issues. Is there an equivalent thing for GPUs? I couldn't find anything on Google. If not, I might be able to cook something up.

Thanks for the help!
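As a rough illustration of what a memtest-style checker does, here is a minimal Python sketch of the write-pattern, read-back, compare loop. It runs on an ordinary host buffer, not on GPU memory; a real GPU tester would run the same idea on device memory (for example inside a CUDA kernel). The names `pattern_test` and `StuckBitBuffer` are hypothetical, and the stuck bit is simulated purely to show a detected fault.

```python
# Classic fill patterns: all zeros, all ones, and two alternating-bit bytes.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def pattern_test(buf, patterns=PATTERNS):
    """Write each pattern to every byte, read it back, and collect mismatches.

    Returns a list of (offset, expected, actual) tuples; empty means no errors.
    """
    errors = []
    for pattern in patterns:
        for i in range(len(buf)):
            buf[i] = pattern
        for i in range(len(buf)):
            actual = buf[i]
            if actual != pattern:
                errors.append((i, pattern, actual))
    return errors


class StuckBitBuffer:
    """Hypothetical faulty memory: bit 0 of one byte always reads back as 1."""

    def __init__(self, size, bad_offset):
        self._data = bytearray(size)
        self._bad_offset = bad_offset

    def __len__(self):
        return len(self._data)

    def __setitem__(self, index, value):
        self._data[index] = value

    def __getitem__(self, index):
        value = self._data[index]
        if index == self._bad_offset:
            value |= 0x01  # simulate the stuck bit
        return value


if __name__ == "__main__":
    print("healthy buffer errors:", pattern_test(bytearray(1024)))
    print("faulty buffer errors: ", pattern_test(StuckBitBuffer(8, bad_offset=3)))
```

Several patterns are used because one alone can miss a fault: the simulated stuck bit above is invisible to `0xFF` and `0x55`, which already have that bit set, and only shows up under `0x00` and `0xAA`.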

Sid
Joined: 17 Oct 10
Posts: 164
Credit: 996910748
RAC: 594602

Quote:

I got 17 work units on Dec 20th and returned them all on Dec 20th. A few have validated at this point, and the rest are waiting for validation. After a few days the rest should be validated/found invalid and then I will know if the computer is back to normal.

I ran memtest86+ overnight the 19th and found no issues. Is there an equivalent thing for GPUs? I couldn't find anything on Google. If not, I might be able to cook something up.

Thanks for the help!


Try this one:

http://mikelab.kiev.ua/index_en.php?page=PROGRAMS/vmt_en

Not perfect but good enough.

Zapp
Joined: 27 Mar 10
Posts: 5
Credit: 1354983
RAC: 0

Folding@Home developed some memory checkers in the vein of memtest86+, first for CUDA and later for OpenCL: http://folding.stanford.edu/English/DownloadUtils.

microchip
Joined: 10 Jun 06
Posts: 50
Credit: 218941113
RAC: 59681

Quote:
Folding@Home developed some memory checkers in the vein of memtest86+, first for CUDA and later for OpenCL: http://folding.stanford.edu/English/DownloadUtils.

Thanks! That's a very useful tool. Didn't know about it.

Derek
Joined: 9 Feb 05
Posts: 9
Credit: 2059412
RAC: 0

So, 13 of 17 results from the 20th have been validated. The other 4 are still waiting to be validated. So far none have come up as fishy. I'm going to call this one fixed. Downloading WUs now!

I tried to update the title to reflect the true reason I was having trouble, but I don't think I can.

Thanks, everyone, for your help and for reminding me that computers get dirty.
