Hi,
Sorry if this has already been discussed (I could not find a specific post about it). Some of my computers with CUDA GPUs, and apparently everyone else's, return "Completed, marked as invalid" (see http://einsteinathome.org/host/3825357, or check the top 20 list for any computer with GPUs). This seems to be common with GPU jobs: some of them end with this status. Why does it happen, and how can one avoid it?
Thanks! Best,
Aron
Copyright © 2024 Einstein@Home. All rights reserved.
CUDA jobs marked as invalid
Hi!
There is a known "cross validation" issue which is discussed in this thread: http://einsteinathome.org/node/195567&nowrap=true#109649. This means that different hosts will return slightly different results for the same workunit, depending on whether the task was run on CPU or GPU, or even on which model of GPU.
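To illustrate why two correct hosts can disagree slightly, here is a minimal Python sketch. It is not the project's validator: the two evaluation orders and the tolerance value are assumptions chosen only to show the effect of floating-point rounding.

```python
import math

# Two mathematically identical sums, evaluated in a different order,
# much like a CPU and a GPU may order their arithmetic differently.
order_a = (0.1 + 0.2) + 0.3
order_b = 0.1 + (0.2 + 0.3)

# Bit-for-bit comparison fails because of rounding in the last digit.
bit_identical = order_a == order_b

# A tolerance-based comparison (rel_tol is a hypothetical choice here)
# treats the two results as agreeing.
within_tolerance = math.isclose(order_a, order_b, rel_tol=1e-9)

print(order_a, order_b, bit_identical, within_tolerance)
```

When such small differences accumulate over a long computation, two results can land on opposite sides of whatever tolerance a validator uses, and one of them gets marked invalid.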
In addition, I had a situation myself recently where a GPU consistently returned results that would not validate with other results. After a reboot, the results validated again.
So it's currently not unusual to get an invalid result from time to time, unfortunately. But if you only get invalid results, a reboot would be a good idea, and one should check cooling and reduce overclocking, if applicable.
HB
Hi,
Ok, I see. Thank you for your prompt reply.
Best,
Aron
I am working with an old PC (Dell Optiplex 320) but with a Gigabyte G210 graphics card, which is CUDA capable, and I am getting a lot of errors.
I read somewhere that I should update to the latest BOINC software, which I have done, and I also updated to the latest Nvidia drivers today.
I hope the number of errors will go down. We'll see.
RE: I am working with an
A few things I see that may be causing your problems. First, you are checkpointing every minute:
[18:58:31][2168][INFO ] Checkpoint committed!
[18:59:38][2168][INFO ] Checkpoint committed!
[19:00:44][2168][INFO ] Checkpoint committed!
[19:01:50][2168][INFO ] Checkpoint committed!
[19:02:52][2168][INFO ] Checkpoint committed!
[19:03:53][2168][INFO ] Checkpoint committed!
[19:04:53][2168][INFO ] Checkpoint committed!
[19:05:54][2168][INFO ] Checkpoint committed!
[19:06:54][2168][INFO ] Checkpoint committed!
[19:07:59][2168][INFO ] Checkpoint committed!
If you raise that to, say, every 5, 10 or even 15 minutes, it will use less memory. Second, if 'leave applications in memory while suspended' is not already set, change that so they ARE kept in memory. Both of these can be changed either on the website or in the BOINC Manager. Doing it on the website makes it a global setting; doing it in the BOINC Manager applies it only to that PC. I have my checkpoint interval set to 900 seconds (15 minutes).
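If you want to check how often your own tasks are checkpointing, a small Python sketch like the one below can measure the gaps between "Checkpoint committed!" lines. It assumes the log format shown in the excerpt above; the function name and the sample text are only for illustration.

```python
import re
from datetime import datetime

# Matches lines such as "[18:58:31][2168][INFO ] Checkpoint committed!"
CHECKPOINT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\[\d+\]\[INFO\s*\] Checkpoint committed!"
)

def checkpoint_intervals(log_text):
    """Return the seconds between consecutive checkpoint lines."""
    times = []
    for line in log_text.splitlines():
        match = CHECKPOINT_RE.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    intervals = []
    for earlier, later in zip(times, times[1:]):
        delta = (later - earlier).total_seconds()
        if delta < 0:  # the log rolled past midnight
            delta += 24 * 3600
        intervals.append(int(delta))
    return intervals

# First four lines of the excerpt quoted above.
SAMPLE = """\
[18:58:31][2168][INFO ] Checkpoint committed!
[18:59:38][2168][INFO ] Checkpoint committed!
[19:00:44][2168][INFO ] Checkpoint committed!
[19:01:50][2168][INFO ] Checkpoint committed!
"""

intervals = checkpoint_intervals(SAMPLE)
average = sum(intervals) / len(intervals)
print(intervals, round(average, 1))
```

An average close to 60 seconds means the checkpoint interval is still at about one minute; after raising the setting to 900 seconds the gaps should grow accordingly.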
The 3rd thing I see is that your PC only has 2 GB of RAM in it. Raising that will let your PC 'breathe' and it will run much better. Right now it is using the hard drive as 'virtual' RAM; raising the physical amount will stop it from having to do that. I see you are running the 64-bit version of Windows 7, which means you are not confined to using only 3.5 GB but are instead limited by what the board physically supports. A PDF on the internet says your PC can have a maximum of 4 GB of memory. I would look at the costs involved and see if they are worth it to you. A quick look at this website
http://www.crucial.com/store/listparts.aspx?model=OptiPlex%20380%20Desktop says it is 25 bucks for one 2 GB module, which means 50 bucks to upgrade. This is for a desktop, and you did not say whether you have the mini form factor, a tower, or some other kind of case, so the price could change based on that.