I recently upgraded to CUDA 5.0 and have since noticed a large number of invalid results. Previously I was getting maybe one a month; now it is at least 6 per day, if not more.
Previous config:
CUDA 4.0 with the 290.10 nvidia driver
Ubuntu 10.04 LTS Server
Now:
CUDA 5.0 with the 304.54 nvidia driver
Ubuntu 10.04 LTS Server
Graphics card:
Nvidia GTX 550 Ti
Copyright © 2024 Einstein@Home. All rights reserved.
[SOLVED] Upgrade to Cuda 5.0 NOT causing invalid results - Probl
In case anyone is having similar problems: I have upgraded to the 310.19 driver, and things seem to be better now.
I think I spoke too soon... Most of my tasks are in a validation inconclusive state. Is there any way I can find out why the validator does not like the tasks? It might help me pin down the problem on my end.
Here is an example work unit that is invalid: 140678872
Here is one that is in the inconclusive state: 140693298
Thanks for any help
So, I found this thread, which seems to describe a similar problem: http://einsteinathome.org/node/196578
I ran lsof and BOINC is indeed using the 64-bit libraries. It seems to be running fine, though, except for the validation errors. I tried the trick at the bottom of that thread (making a link in the project directory), but to no avail. I have installed the 32-bit libraries, but no amount of ldconfig seems to get BOINC to use them.
Have you tried 'export LD_LIBRARY_PATH=$PATH_TO_32BIT_LIBS' before running boinc? And perhaps 'ldd $INSTALL_PREFIX/boinc' to see which libs would be used.
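To narrow the ldd output down to the CUDA libraries, something like the sketch below might help. The library names in the pattern (libcuda, libcudart, libcufft) and the APP variable are assumptions for illustration, not something taken from the project's documentation. Note that ldd only lists libraries a binary links against; anything loaded at run time will not show up.

```shell
#!/bin/sh
# Filter `ldd` output (read from stdin) down to CUDA-related libraries.
# The names in the pattern are assumptions; adjust them to your install.
cuda_libs() {
    grep -E 'lib(cuda|cudart|cufft)\.so'
}

# Usage (APP is a hypothetical path to the binary you want to inspect):
#   ldd "$APP" | cuda_libs
# To see what would change with a different search path:
#   LD_LIBRARY_PATH="$PATH_TO_32BIT_LIBS" ldd "$APP" | cuda_libs
```

The resolved path after "=>" on each line shows which directory the library would be picked up from.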
Hi!
The version of the app you are using on this host is:
einsteinbinary_BRP4_1.31_x86_64-pc-linux-gnu__BRP4cuda32nv270
so it is 64-bit and therefore needs 64-bit libs. That is OK.
Some of the hosts that did produce valid results for the tasks that failed on your PC also had CUDA 5 drivers under Linux (there is a line in the stderr.txt file that gets uploaded and is visible in the result view).
http://einsteinathome.org/task/326520791
I doubt there is something fundamentally wrong with the app wrt. CUDA 5 drivers.
Cheers
HB
Neil, thanks for the advice. I had tried all kinds of things to sort out the library issue, but I think Bikeman's comment indicates that I should be using the 64-bit drivers/libs, so that's good.
Bikeman, are you saying that I should expect invalid results from CUDA 5 until NVIDIA works things out, or is there something I can do on my end? I would prefer not to submit invalid results to the project on a daily basis.
I'm going to assume that rolling back to CUDA 4 might be my only option, and that is going to be a royal pain. I think I still have the install files...
Thanks for the responses.
I'm running Ubuntu x64 10.04 LTS with the 310.14 driver, which has been quite stable, generally speaking.
Here is something I prepared earlier...
http://einsteinathome.org/task/327511338
I don't recall installing any specific CUDA version, so I'm not sure if that helps.
To me, the fact that others are completing the same workunits with CUDA 5 on Linux seems to indicate the problem is not with NVIDIA. In this kind of situation it is always good to reboot and to check the cooling, the power supply, and the memory. It's hard to diagnose these things remotely, of course.
CU
HB
I'm on openSUSE with the latest stable NVIDIA driver (310.19), which offers CUDA 5, and I have no issues with it. Yes, I'm on 64-bit Linux too. I agree with Bikeman: check your system...
Checking the system isn't a bad idea. I just hadn't considered it, since the upgrade to CUDA 5 is what caused the invalid results to appear en masse. At least that is what appeared to be the case. I am going to let the majority of my completed work validate and then start crunching again after a system evaluation.
Here is a listing of the typical temps in my system when crunching:
Adapter: ISA adapter
Core 0: +72.0°C (high = +82.0°C, crit = +100.0°C)
coretemp-isa-0001
Adapter: ISA adapter
Core 3: +70.0°C (high = +82.0°C, crit = +100.0°C)
coretemp-isa-0002
Adapter: ISA adapter
Core 1: +73.0°C (high = +82.0°C, crit = +100.0°C)
coretemp-isa-0003
Adapter: ISA adapter
Core 2: +70.0°C (high = +82.0°C, crit = +100.0°C)
Gpu : N/A
Gpu : 69 C
Fan Speed : 40 %
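For anyone who wants to keep an eye on readings like these, here is a rough sketch that flags cores running close to their "high" limit. It assumes lines in the format shown above ("Core N: +T°C (high = +H°C, ...)"); the near_high function name and the default margin of 10 degrees are made up for illustration.

```shell
#!/bin/sh
# Read `sensors`-style output on stdin and print every core whose
# temperature is within MARGIN degrees of its "high" limit.
# MARGIN is the first argument; the default of 10 is an arbitrary choice.
near_high() {
    awk -v margin="${1:-10}" '
        $1 == "Core" && $4 == "(high" {
            temp = $3 + 0    # numeric prefix of e.g. "+72.0°C"
            high = $6 + 0    # numeric prefix of e.g. "+82.0°C,"
            if (high - temp <= margin)
                printf "%s %s %.1f (high %.1f)\n", $1, $2, temp, high
        }'
}

# Usage:
#   sensors | near_high 10
```

This only looks at the CPU core lines; the GPU temperature and fan speed are reported in a different format and would need their own check.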