slow work

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

RE: RE: Here's the link

19 Nov 2006 13:49:22 UTC

Message 50900 in response to message 50899


Quote:

Quote:
Here's the link to Dr. Allen's post on the matter:

Long WU Criteria

The only part I'm not clear on is how they determine the credit per CPU second.

Essentially it works out that roughly 1GHz class hosts and faster will get long WUs.

Alinator

Hosts whose benchmarks place them among the slowest 20% of hosts are given short WU if possible. The remaining 80% of machines get both slow and fast WU.

Cheers,
Bruce

Hi Dr. Allen,

That part I got. Here's a snippet from the last contact log from my P4:

2006-11-19 06:21:14.7627 [PID=18514] [normal ] [HOST#656342] [RESULT#53452272 l1_1408.5_S5R1__322_S5R1a_1] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2006-11-19 06:21:14.7627 [PID=18514] [debug ] cpu 31471.171875 cpcs 0.003723, cc 111.696181

The part I'm not getting is how the CPCS is calculated. If you multiply the shown CPCS and the CPU seconds reported you get a credit value of 117.167173, which doesn't match the reported CC value of 111.696181. When I calculate CPCS manually I get 0.003549.
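For anyone who wants to re-check the arithmetic, here is a minimal Python sketch using only the three numbers from the debug line above. The variable names are made up for illustration, and it says nothing about how the server actually derives CPCS; it only reproduces the two hand calculations.

```python
# Values copied from the scheduler debug line quoted above.
cpu_seconds = 31471.171875    # "cpu"
logged_cpcs = 0.003723        # "cpcs" (credit per CPU second, as logged)
claimed_credit = 111.696181   # "cc" (claimed credit, as logged)

# Credit implied by multiplying the logged CPCS by the CPU time.
implied_credit = cpu_seconds * logged_cpcs

# CPCS implied by dividing the logged claimed credit by the CPU time.
implied_cpcs = claimed_credit / cpu_seconds

print(f"implied credit: {implied_credit:.6f}")
print(f"implied CPCS:   {implied_cpcs:.6f}")
print(f"credit gap:     {implied_credit - claimed_credit:.6f}")
```

The gap between the two credit figures is about 5.47, which is the discrepancy being asked about.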

Not much of a difference I grant you, but it's a "mystery" and I hate computational mysteries! :-)

Regards,

Alinator

Bruce Allen

Moderator

Joined: 15 Oct 04

Posts: 1119

Credit: 172127663

RAC: 0

RE: RE: RE: Here's the

20 Nov 2006 19:01:51 UTC

Message 50901 in response to message 50900


Quote:

Quote:
Quote:
Here's the link to Dr. Allen's post on the matter:

Long WU Criteria

The only part I'm not clear on is how they determine the credit per CPU second.

Essentially it works out that roughly 1GHz class hosts and faster will get long WUs.

Alinator

Hosts whose benchmarks place them among the slowest 20% of hosts are given short WU if possible. The remaining 80% of machines get both slow and fast WU.

Cheers,
Bruce

Hi Dr. Allen,

That part I got. Here's a snippet from the last contact log from my P4:

2006-11-19 06:21:14.7627 [PID=18514] [normal ] [HOST#656342] [RESULT#53452272 l1_1408.5_S5R1__322_S5R1a_1] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2006-11-19 06:21:14.7627 [PID=18514] [debug ] cpu 31471.171875 cpcs 0.003723, cc 111.696181

The part I'm not getting is how the CPCS is calculated. If you multiply the shown CPCS and the CPU seconds reported you get a credit value of 117.167173, which doesn't match the reported CC value of 111.696181. When I calculate CPCS manually I get 0.003549.

Not much of a difference I grant you, but it's a "mystery" and I hate computational mysteries! :-)

Regards,

Alinator

Your machine must have microscopic benchmark values. What happens if you use the BOINC manager to re-run the benchmarks when the machine is idle? Do the benchmark values change? Note: you can find the benchmark values in client_state.xml

Bruce

Director, Einstein@Home

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

RE: Your machine must

3 Dec 2006 1:14:00 UTC

Message 50902 in response to message 50901


Quote:

Your machine must have microscopic benchmark values. What happens if you use the BOINC manager to re-run the benchmarks when the machine is idle? Do the benchmark values change? Note: you can find the benchmark values in client_state.xml

Bruce

Sorry about not replying sooner, but the P4 and PIII got "locked out" shortly after you posted with a LAN Router meltdown at their location (it's remote) and you folks had your server failure, so I hadn't fully thought about your reply until it was too late. I decided to let them run their course to study their behaviour on full "AutoBOINC" and they just got back in the game yesterday.

Anyway, when I first posted the P4 was showing slightly lower BM's than normal, which was most likely due to Windows trying to figure out why the router was acting up and not responding normally. In addition, I had failed to take into account the other performance metrics. Rerunning the calcs, now they work out as expected.

Conclusion: DUH..... Alinator! Give self slap on head! :-)

Alinator

Omikronman

Joined: 23 Nov 06

Posts: 33

Credit: 83254

RAC: 0

I have now seen several

7 Dec 2006 10:52:19 UTC

Message 50903


I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time

Pooh Bear 27

Joined: 20 Mar 05

Posts: 1376

Credit: 20312671

RAC: 0

RE: I have now seen several

7 Dec 2006 12:55:50 UTC

Message 50904 in response to message 50903


Quote:

I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time

Many factors can change the time, depending on what else the computer is doing. If you are video editing and crunching simultaneously, your times will be longer, because the CPU is more heavily used and that slows the process down. Dust and dirt in the fans and heatsink, or other heat issues, can also cause your times to go up. Make sure your system is clean.

Omikronman

Joined: 23 Nov 06

Posts: 33

Credit: 83254

RAC: 0

Everything is new and clean

9 Dec 2006 10:03:13 UTC

Message 50905 in response to message 50904


Everything is new and clean here. I do no other work when Boinc is active. :-)

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2960096072

RAC: 710299

RE: I have now seen several

9 Dec 2006 11:14:19 UTC

Message 50906 in response to message 50903


Quote:

I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time

Your ~2,000 second results are short WUs, running normally, and your ~20,000 second results are long WUs running normally.

Anything more than that is a problem. For example, look at your result 56773680 (~38,500 seconds). It contains the section:

2006-12-05 09:23:20.7530 [normal]: Start of BOINC application 'einstein_S5R1_4.28_i686-apple-darwin'.
2006-12-05 09:23:20.7676 [normal]: Started search at lalDebugLevel = 0
2006-12-05 09:23:22.1964 [normal]: Found checkpoint-file 'Fstat.out.ckp'
Failed to read checkpoint-counters from 'Fstat.out.ckp'!
2006-12-05 09:23:22.1971 [normal]: No usable checkpoint found, starting from beginning.

So the early results got lost, and a substantial part of the work had to be re-done: that's where the extra time went.
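If you want to check your other results for the same thing, a rough Python sketch like the one below would do it. It assumes you have pasted the stderr text of a result into a string; the function name `find_checkpoint_restarts` is hypothetical, and the two marker phrases are simply taken from the excerpt above, so other application versions may word them differently.

```python
# Excerpt from the result quoted above, pasted in as plain text.
LOG = """\
2006-12-05 09:23:20.7530 [normal]: Start of BOINC application 'einstein_S5R1_4.28_i686-apple-darwin'.
2006-12-05 09:23:20.7676 [normal]: Started search at lalDebugLevel = 0
2006-12-05 09:23:22.1964 [normal]: Found checkpoint-file 'Fstat.out.ckp'
Failed to read checkpoint-counters from 'Fstat.out.ckp'!
2006-12-05 09:23:22.1971 [normal]: No usable checkpoint found, starting from beginning.
"""

# Phrases that, in this excerpt, signal the checkpoint could not be used.
MARKERS = (
    "Failed to read checkpoint-counters",
    "No usable checkpoint found",
)


def find_checkpoint_restarts(log_text):
    """Return the log lines that contain any of the marker phrases."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in MARKERS)
    ]


for line in find_checkpoint_restarts(LOG):
    print(line)
```

Any result that prints lines here lost its checkpoint at least once, so its CPU time will include the re-done work.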

It doesn't say in the text why the program restarted. Possible reasons are:
a) You switched the machine off and went to bed!
b) You are crunching multiple projects, and the machine gave some time to another one.
c) You started doing some other work on the machine, and BOINC is set to exit when the machine is in use.

If the situation is (b) or (c), you could try changing your preferences (this board, 'Your account', click on 'General preferences'). The two to look at are the second and fourth in the top group: "Do work while computer is in use?" and "Leave applications in memory while suspended?". If you change both of these to 'yes' (then save and update the BOINC Manager), you may have a lower chance of this kind of error.

(Some people have said that 'leave in memory' can cause problems on some Macs - keep an eye on it, and be prepared to switch the preferences back if it doesn't work out).

Omikronman

Joined: 23 Nov 06

Posts: 33

Credit: 83254

RAC: 0

Yes, I use the settings "Do

9 Dec 2006 23:41:51 UTC

Message 50907 in response to message 50906


Yes, I use the settings "Do work while computer is in use?" and "Leave applications in memory while suspended?" both with "yes" running. I have seen that a work needed to be re-done when I exit BOINC without switching the running work to "pause" before. If I set it to "pause" and then exit BOINC it is possible to continue later. The work infos say that sometimes the work checkpoint can't be found. O.o
