Massive problems starting 4 Nov with cruncher

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0
Topic 196053

Haven't looked at BOINC in a couple of days. Checked and it shows nothing but errors since the fourth of November. Could some kind person take a look at my results and give an opinion? My initial thought is memory, but I'm also trying to remember when I installed Win7 SP1. SETI is running fine.
My machine number: 2240837
Thanks!

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

The Gamma-ray pulsar search tasks all seem to end up with

Network access is denied. (0x41) - exit code 65 (0x41)

whatever that means.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)
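
For what it's worth, 0x41 is simply 65 written in hexadecimal, and 65 is the standard Win32 error code ERROR_NETWORK_ACCESS_DENIED, whose message text is "Network access is denied." A minimal sketch of that lookup follows. The table holds only two entries copied from the Windows header winerror.h, and the helper name describe_exit_code is made up for illustration. It shows how to read the message, not why the science app hit that error.

```python
# Tiny hand-written subset of Win32 error codes (values as in winerror.h).
# Only entries relevant to this thread are listed; this is not a full table.
WIN32_ERRORS = {
    5: ("ERROR_ACCESS_DENIED", "Access is denied."),
    65: ("ERROR_NETWORK_ACCESS_DENIED", "Network access is denied."),
}


def describe_exit_code(code: int) -> str:
    """Hypothetical helper: format an exit code the way the task log prints it."""
    name, message = WIN32_ERRORS.get(code, ("UNKNOWN", "Unknown error."))
    return f"{message} (0x{code:x}) - exit code {code} (0x{code:x}) [{name}]"


print(describe_exit_code(0x41))
```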

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0

Thanks Gundolf.
Opened up the machine, popped the 4 sticks of RAM out, cleaned the contacts, reinstalled. Everything else looks cleaner than my car (g).
Allowed new work for Einstein. Got two WUs. The first one errored out with:
11/11/2011 18:03:40 Einstein@Home Output file LATeah0052S_416.0_54350_0.0_1_0 for task LATeah0052S_416.0_54350_0.0_1 absent
11/11/2011 18:03:40 Einstein@Home Output file LATeah0052S_416.0_54350_0.0_1_1 for task LATeah0052S_416.0_54350_0.0_1 absent

Opened up BOINC Manager properties. Ticked "run as administrator" and "run in XP compatibility mode". We'll see if that has any effect on the remaining WU.
thanks again!
john
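
When many tasks fail at once, it can help to group the "absent" messages by task instead of reading them one by one. The sketch below does that for the two event-log lines quoted in this post. The regular expression and the function name missing_outputs are assumptions based only on the wording of those two lines, so other BOINC client versions may phrase the message differently.

```python
import re
from collections import defaultdict

# The two event-log lines quoted in the post above.
LOG = """\
11/11/2011 18:03:40 Einstein@Home Output file LATeah0052S_416.0_54350_0.0_1_0 for task LATeah0052S_416.0_54350_0.0_1 absent
11/11/2011 18:03:40 Einstein@Home Output file LATeah0052S_416.0_54350_0.0_1_1 for task LATeah0052S_416.0_54350_0.0_1 absent
"""

# Pattern inferred from the quoted lines only (an assumption, not a documented format).
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def missing_outputs(log_text):
    """Map each task name to the output files reported absent for it."""
    by_task = defaultdict(list)
    for line in log_text.splitlines():
        match = ABSENT.search(line)
        if match:
            output_file, task = match.groups()
            by_task[task].append(output_file)
    return dict(by_task)


for task, files in missing_outputs(LOG).items():
    print(task, "is missing", len(files), "output file(s):", ", ".join(files))
```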

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0

Second WU errored out the same way. Baffled. Thinking of doing a backup (Acronis) and then a Windows restore to a point before the SP1 install... matter pends.
Hate wasting Einstein's resources!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118425855168
RAC: 25863597

Quote:
Opened up BOINC Manager properties. Ticked "run as administrator" and "run in XP compatibility mode". We'll see if that has any effect on the remaining WU.


It's not needed and it's not really a good idea to run BOINC as a privileged user.

I think it's much more likely to be a hardware-related problem, so here's a list of things to try in order to work out which bit of hardware is at fault.

    * If you are overclocking, reduce (or fully remove) the overclocking. Use one of the CPU stressing tools to check for stability.
    * You said the insides were clean but really check the fins of the heat sink and the rotation speed of the CPU fan. The other possibility is that the thermal interface material (TIM) may have dried out with age and the CPU might be overheating that way.
    * Loosen your RAM timings to a little worse than what the SPD values specify. I've seen a few examples where this has cured problems like yours which have suddenly started after a long period of stability.
    * Test memory with something like memtest86 (several passes without error).
    * Carefully inspect your motherboard for any signs of swollen caps. If you find any in the voltage regulator section showing domed or split tops, this would probably be the cause. With basic soldering skills, it's actually quite easy to remove and replace capacitors on the motherboard.
    * Inspect the PSU in the same way. Try a spare PSU and see if the problem goes away. Is your PSU running hot? Maybe the PSU fan needs re-lubrication.
    * If still no joy, remove two sticks of RAM and see if the problem goes away. If not, swap the inserted pair with the removed pair and try again. After all this, if the problem persists, it's probably not your RAM. You could try running with just a single stick in case there's a problem with dual channel mode.
    * If you have a discrete graphics card, check it for swollen caps. If you can, try running with just integrated graphics and see if anything changes.

Good luck with tracking down the cause.

Cheers,
Gary.

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0

Thanks!
All fans are running, CPU temps are OK, and the 500 W power supply is pushing out plenty of hot air, but you never know. Ran the Win7 memory check with no errors. It's about worthless IMO.
Motherboard and video card look good. Running four 1 GB Corsair RAM sticks. I'll pull two and see what happens.
In a fit of stupidity I decided to install Win7 64-bit over my XP install. I'd forgotten what a pain it is to reinstall all of my programs.
Back on my Win7 32-bit until my irritation ebbs. Getting too old for this Windows nonsense. While Windows was installing I did update my Linux box, so not a complete waste of time.
Christmas is coming, so it might just be time to build a new machine and replace this dual-core AMD 4400 with something newer.
Thanks again, john

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0

The saga continues. Stopped running Einstein entirely and am just running Rosetta, since its tasks complete faster and show sooner whether the problem is resolved. Pulled two memory sticks (four 1 GB sticks total), no change. Swapped the memory sticks, no change. Keep getting "output file not found". Software issue??? I'm dazed and confused about the real cause!! Windows restore gave me a 'file not found'. How trustworthy. Backed up the current C drive and replaced it with an image from six months ago (prior to the SP1 install) to see if it's a software issue.
If this doesn't work I may have to go for a complete tear-down.
Matter pends... john

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

Have you tried disabling your antivirus? Or at least making sure that BOINC's data directory is on its "exclude" list?

As to the no network access bit, I honestly haven't got a clue. I want to say firewall, but to me that would block both ways. Proxy, perhaps?

good luck! :)

seeing without seeing is something the blind learn to do, and seeing beyond vision can be a gift.

peristalsis
Joined: 20 Mar 05
Posts: 29
Credit: 49915332
RAC: 0

All of my problems seemed to start after I installed Win 7 SP1. I reloaded an old image and installed all of the Windows updates issued before October 2011. Haven't had a bad WU for Einstein or Rosetta since then. Still have to install the newer updates and see whether any of them borks my system, and if so which one. It's got me baffled, but as long as things are working again I don't care (g).

Edward Lim
Joined: 13 Aug 10
Posts: 9
Credit: 8436040
RAC: 20302

I have two computers running BOINC, yet one has failed to report completed WUs while the other does. I have tried uninstalling BOINC and installing it fresh, thinking a fresh start would help. I have rebooted both computers. No luck.
On ALLPROJECTSSTATS.com I have not seen an update for several days now.

This is my last message, attached:

2011-11-15 16:27:17.0597 [PID=14654] Request: [USER#xxxxx] [HOST#3831119] [IP xxx.xxx.xxx.75] client 6.12.35
2011-11-15 16:27:17.0622 [PID=14654] [send] effective_ncpus 4 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2011-11-15 16:27:17.0622 [PID=14654] [send] effective_ngpus 0 max_jobs_on_host_gpu 999999
2011-11-15 16:27:17.0622 [PID=14654] [send] Not using matchmaker scheduling; Not using EDF sim
2011-11-15 16:27:17.0622 [PID=14654] [send] CPU: req 1198.55 sec, 0.00 instances; est delay 0.00
2011-11-15 16:27:17.0622 [PID=14654] [send] work_req_seconds: 1198.55 secs
2011-11-15 16:27:17.0622 [PID=14654] [send] available disk 9.24 GB, work_buf_min 8640
2011-11-15 16:27:17.0622 [PID=14654] [send] active_frac 0.999958 on_frac 0.992463 DCF 1.992044
2011-11-15 16:27:17.0636 [PID=14654] [send] [HOST#3831119] is reliable
2011-11-15 16:27:17.0637 [PID=14654] [send] set_trust: random choice for error rate 0.000010: yes
2011-11-15 16:27:19.2341 [PID=14654] [version] Checking plan class 'SSE2'
2011-11-15 16:27:19.2343 [PID=14654] [version] reading plan classes from file '../plan_class_spec.xml'
2011-11-15 16:27:19.2344 [PID=14654] [version] Best version of app einstein_S6Bucket is ID 268 (5.24 GFLOPS)
2011-11-15 16:27:19.2352 [PID=14654] [debug] Sorted list of URLs follows [host timezone: UTC-18000]
2011-11-15 16:27:19.2352 [PID=14654] [debug] zone=-21600 url=http://einstein-dl2.phys.uwm.edu
2011-11-15 16:27:19.2352 [PID=14654] [debug] zone=-21600 url=http://einstein-dl4.phys.uwm.edu
2011-11-15 16:27:19.2352 [PID=14654] [debug] zone=-28800 url=http://einstein.ligo.caltech.edu
2011-11-15 16:27:19.2352 [PID=14654] [debug] zone=+03600 url=http://einstein-mirror.aei.uni-hannover.de/EatH
2011-11-15 16:27:19.2354 [PID=14654] [send] [HOST#3831119] Sending app_version einstein_S6Bucket 6 101 SSE2; 5.24 GFLOPS
2011-11-15 16:27:19.2364 [PID=14654] [send] est. duration for WU 109566859: unscaled 12365.51 scaled 24820.74
2011-11-15 16:27:19.2364 [PID=14654] [HOST#3831119] Sending [RESULT#257336996 h1_0432.70_S6GC1__2321_S6BucketA_1] (est. dur. 24820.74 seconds)
2011-11-15 16:27:19.2380 [PID=14654] [send] don't need more work
2011-11-15 16:27:19.2380 [PID=14654] [send] don't need more work
2011-11-15 16:27:19.2380 [PID=14654] [send] don't need more work
2011-11-15 16:27:19.2380 [PID=14654] [send] don't need more work
2011-11-15 16:27:19.2393 [PID=14654] Sending reply to [HOST#3831119]: 1 results, delay req 60.00
2011-11-15 16:27:19.2396 [PID=14654] Scheduler ran 2.186 seconds

I would appreciate any help. Keep in mind I am a computer tech novice.
Thanks,
Ed
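
One figure in that log that can look alarming is the estimated duration, where the scaled value is roughly double the unscaled one. The numbers are consistent with the unscaled estimate being multiplied by the host's duration correction factor (DCF) and divided by on_frac and active_frac, all of which appear in the same log. That formula is inferred from the logged values, not taken from the scheduler source, so treat the sketch below as a sanity check only. The function name scaled_estimate is made up for illustration.

```python
# Values copied from the scheduler log above.
UNSCALED_SECONDS = 12365.51  # "est. duration ... unscaled"
LOGGED_SCALED = 24820.74     # "est. duration ... scaled"
DCF = 1.992044               # duration correction factor for this host
ON_FRAC = 0.992463           # logged on_frac
ACTIVE_FRAC = 0.999958       # logged active_frac


def scaled_estimate(unscaled, dcf, on_frac, active_frac):
    """Assumed scaling: stretch by DCF, then by the two availability fractions."""
    return unscaled * dcf / (on_frac * active_frac)


estimate = scaled_estimate(UNSCALED_SECONDS, DCF, ON_FRAC, ACTIVE_FRAC)
print(f"computed {estimate:.2f} s, logged {LOGGED_SCALED:.2f} s")
print(f"difference {abs(estimate - LOGGED_SCALED):.3f} s")
```

Under that assumption the computed figure lands within a small fraction of a second of the logged one, so the doubling comes almost entirely from the DCF of about 2 and does not by itself point to a communication fault.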

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

I'm not sure where you see a problem, since both of your computers have reported completed tasks today.

3831119: 15 Nov 2011 15:03:41 UTC
3267790: 15 Nov 2011 20:03:39 UTC

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)
