I have a second machine running both Einstein and SETI.
It's set for both to process CPU and GPU tasks.
With E@H it runs all tasks OK except the gravity wave ones.
The gravity wave tasks error out in about 1 second with a 'no output file' error.
But it runs the other CPU & GPU tasks OK.
I have no idea why this occurs. I have suspended new tasks, completed all outstanding tasks and reset the project in case it was a corrupted client application causing the problem.
However, after the reset I got 3 gravity wave tasks and all 3 errored out the same way.
Since I can process the other E@H tasks OK, I was loath to stop getting more tasks, but seemingly have no choice, since the option to deselect gravity wave tasks is greyed out and cannot be altered.
As it stands, the machine is now set not to receive tasks until this can be sorted out.
I posted earlier in the crunching forum in error, now corrected I hope by posting in this one.
Regards,
Cliff,
Been there, Done that, Still no damm T Shirt.
Copyright © 2024 Einstein@Home. All rights reserved.
Compute errors on only 1 type of task
Well, 'no output file' is a symptom that something has gone wrong, but not an error in itself.
The actual error is visible in stderr_txt:
We know from discussions elsewhere that your other computer may be under-powered: it looks as if this one could use some hardware TLC too.
Be aware that the Einstein Gravity Wave application has some of the tightest code around (thanks, Akosf!) and perhaps stresses CPUs more than most projects - though I don't think the 'hot-loop' will be reached as early as this. This application also stresses the memory bus very heavily at startup. Do you monitor CPU temperatures, and have you checked for dust bunnies recently?
Hi Richard,
I only look at that machine now and then. The GPUs are brand new; the CPU has been around for some time, but when I pressed it into service it was with an upgraded heatsink, not the standard AMD one, and I used a good thermal paste. Since it's a 125 watt CPU and the motherboard is rated for that, with a 700 watt PSU driving just the CPU, 2 GPUs, 1 hard drive and one DVD burner, that PSU should be OK.
The CPU is currently at 65C; I don't think that's too high.
I have 8 gig of PC3 1600 memory, CAS latency 9.
I'll have a look on the AMD site and see if I can find out the max temp it should run at.
Damn, looks like it should run at 62C :-( I'll see what I can do to lower that CPU temp.
Regards,
Cliff,
Been there, Done that, Still no damm T Shirt.
Hi Richard,
Just put CPUID's HW Monitor on that rig. It was showing 65C, so I added another external fan [mains ex an old drs500 mini] and it now shows the CPU temp at 54C, but lower down on the HW Monitor are the core temps, and those are running at about 68C and staying there.
I'm not sure if the max 62C refers to the CPU overall or to the core temps :-/
I suppose I might be able to lower the core temps by watercooling the CPU, but I can't do that just yet. No funds right now.
Guess it's going to be a no-tasks situation for a while, or I just let those tasks fall over if they get sent.
I've reduced CPU usage to 90% in local settings and that 'seems' to have lowered the core temps a bit.
Regards,
Cliff,
Been there, Done that, Still no damm T Shirt.
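For anyone following the numbers, here is a minimal Python sketch that compares the readings quoted above against the 62C figure. The names `MAX_TEMP_C`, `readings` and `over_limit` are made up for illustration, and it is still an open question in this thread whether 62C applies to the overall CPU reading or to the core temps, so treat the result as a rough check only.

```python
# Readings quoted in the posts above, in degrees C.
# ASSUMPTION: the 62 C limit is the figure found on the AMD site; the thread
# does not settle whether it applies to the overall CPU or to the cores.
MAX_TEMP_C = 62

readings = {
    "cpu_before_extra_fan": 65,
    "cpu_after_extra_fan": 54,
    "core_after_extra_fan": 68,
}


def over_limit(temps, limit=MAX_TEMP_C):
    """Return the readings above the limit, mapped to the excess in degrees."""
    return {name: t - limit for name, t in temps.items() if t > limit}


if __name__ == "__main__":
    for name, excess in over_limit(readings).items():
        print(f"{name}: {excess} C over the {MAX_TEMP_C} C limit")
```

If the 62C figure does apply to the cores, the core reading is still over the limit even with the extra fan.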
I also am receiving similar results. I have three machines that run Einstein pretty much 24/7. Neither of the other machines is particularly capable, being laptops running very throttled for heat management, but have never had a problem completing a task. And then I have a brand new machine ( http://einsteinathome.org/host/4671431 ) that errors on every single Gravity Wave task it gets.
I'd just turn off the Gravitational Wave app and happily continue crunching the others, but as Cliff stated, I can't.
Hi AM,
I'm not sure, but if you suspend those GW tasks, will BOINC request more of them?
I've not long been back with these programs, so I don't know how BOINC or its clients treat suspended tasks.
Also, I see you have 2 identical machines running E@H, but only 1 seems to have this problem.
So there must be some difference between those machines. You seem to be running the same OS, Win7 SP1, so maybe, as Richard suggested to me, it might be heat related. CPUID also make a HW [hardware monitor] utility, which is on their website for D/L.
I also found it useful to change the CPU usage % in local settings from 100% down to less than 90%. At least a local setting doesn't cause a slowdown on all the other machines, just the one :-) But it did bring my CPU heat down a bit.
Still, there has to be some difference between those 2 computers, either in hardware or software. Do both have the same type, quantity and speed of memory?
Regards
Cliff,
Been there, Done that, Still no damm T Shirt.
Actually, they're the same machine; one of those entries is bogus (I think it came from BAM) and I'm just waiting for the one task it has assigned to time out before I can delete it.
My CPU temps sit right around 50C or a bit less at full load, graphics runs at around 65C (again, under load).
As for suspending, I can't say currently, as I haven't gotten any new Gravity Wave tasks yet.
Let's continue this discussion here
http://einsteinathome.org/node/196209 to keep it in one central place.
HB