Boinc 4.45

gravywavy

Joined: 22 Jan 05

Posts: 392

Credit: 68962

RAC: 0

RE: ...I've read the Wiki,

28 Jul 2005 11:39:56 UTC

Message 14421 in response to message 14417


Quote:

...I've read the Wiki, I've set my connect time to 2 days (for the last six months; reasonable for an AMD64 3000), and I fully understand the concept of Short/Long term debt and the scheduler's rules for fetching WUs.

How come E@H receives 45hrs of work to be completed in 7 days when my resource is set at 8% (which is about 14hrs of 7 days)? This leads to overcommitment of boinc, ie panic mode, no work fetch (all projects) and earliest deadline first.

The received wisdom is not that this doesn't happen; clearly it does, as you and many others have pointed out.

The received wisdom is that it does not matter. After a binge on Einstein wu, the client will abstain from Einstein for a while, then binge again. The net effect is that Einstein gets 8% of the processing time over the long run.

It is a sort of controlled panic - the client goes into panic mode (officially known as EDF, not panic) every so often, but controlled in the long term by the LTD which then prevents more work being downloaded from that project for a while.
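
To put rough numbers on that cycle, here is a minimal sketch using the figures from the quoted post (45 hours of Einstein work, an 8% resource share). It assumes the machine crunches around the clock and that the long-run fraction works out exactly to the share. The function name is made up for illustration, and this is not how the client itself computes anything.

```python
HOURS_PER_WEEK = 7 * 24

def cycle_hours(binge_hours, share):
    """Total crunching time over which `binge_hours` equals `share` of the work."""
    return binge_hours / share

share = 0.08        # Einstein resource share from the quoted post
binge_hours = 45.0  # work received in one go, from the quoted post

weekly_entitlement = share * HOURS_PER_WEEK   # about 13.4 hours per week
cycle = cycle_hours(binge_hours, share)       # binge plus abstinence, in hours

print(f"8% of a week is {weekly_entitlement:.1f} hours")
print(f"A 45 hour binge is 8% of {cycle / 24:.1f} days of crunching")
```

On those assumptions the client would go roughly three weeks between Einstein binges, which fits the "binge, then abstain" pattern described above.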

As John pointed out, if you get impatient and do a reset of the Einstein project, cos you feel it is about time you saw some Einstein wu again, then you defeat the whole process.

Quote:

And why does this ONLY happen with E@H?

It is not true to say it only happens to Einstein. Set your resource for SETI to 0.1% and you will see it happen there too. For any given project and any given machine, there will be a resource share below which this effect occurs.

For a given machine, this critical resource share is higher for Einstein than for other projects, leading to the appearance in many cases that it is 'only Einstein'.

The reason Einstein sees this effect at higher resource shares is due to some combination of the following two distinctive features of this project's current server configuration:

a) Einstein has the shortest deadlines, counted in number of wu that can be processed in the deadline interval (eg on a 700MHz box, Einstein takes around 24 hours to run, meaning ~7 wu in the deadline time of 1 week, LHC takes 11 hours, meaning ~30 wu in the deadline time of 2 weeks, Predictor takes ~4 hours, meaning ~40 wu in the deadline time of 1 week)

b) Einstein underestimates its run durations on most platforms (eg on a 700MHz box it estimates 18 hours but takes around 24 per wu) whereas all the other projects I know of overestimate (LHC est 27 hours, takes 11; Predictor est 6 hours, takes 3-4)
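
The figures in (a) and (b) can be tabulated with a short script. This is only a sketch using the 700MHz numbers quoted above; the 3.5 hour Predictor run time is an assumed midpoint of the "3-4 hours" given in (b), and the dictionary layout and function names are purely illustrative.

```python
# Figures quoted above for a 700MHz box: (actual hours per wu, deadline in days)
deadlines = {
    "Einstein": (24.0, 7),
    "LHC": (11.0, 14),
    "Predictor": (4.0, 7),
}

# (estimated hours, actual hours); Predictor's 3.5 is an assumed midpoint of 3-4
estimates = {
    "Einstein": (18.0, 24.0),
    "LHC": (27.0, 11.0),
    "Predictor": (6.0, 3.5),
}

def wu_per_deadline(run_hours, deadline_days):
    """How many wu fit into one deadline interval when run back to back."""
    return deadline_days * 24 / run_hours

def estimate_ratio(estimated, actual):
    """Below 1 means the project underestimates, above 1 means it overestimates."""
    return estimated / actual

for name, (run_hours, days) in deadlines.items():
    est, act = estimates[name]
    print(f"{name}: {wu_per_deadline(run_hours, days):.1f} wu per deadline, "
          f"estimate/actual = {estimate_ratio(est, act):.2f}")
```

With these numbers Einstein is the only project of the three with fewer than ten wu per deadline and an estimate/actual ratio below 1.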

Many people have suggested (with varying degrees of politeness) altering (a) by increasing Einstein deadlines; this is not done because of issues with the database size.

I have suggested before that Einstein simply doubles its quoted estimates of the amount of work in a WU, which would bring it more in line with the practice on other projects and address issue (b). Nobody has told me why this is not done...

However, I emphasize that the effect of either change would not be to remove the effect, it would simply be to make it occur less often. It is understandable that the project admins are not rushing to make adjustments.

~~gravywavy

Gordon Hartman

Joined: 19 Feb 05

Posts: 34

Credit: 34812635

RAC: 2777

RE: Hi Gordon, Still the

28 Jul 2005 13:48:20 UTC

Message 14422 in response to message 14420


Quote:

Hi Gordon,

Still the same problem for me as well. Read our discussion at

http://einsteinathome.org/node/189616

V7

As of today no work!

7/28/2005 9:45:10 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/28/2005 9:45:10 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/28/2005 9:45:20 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Jim Baize

Joined: 22 Jan 05

Posts: 116

Credit: 582144

RAC: 0

I looked through this thread

28 Jul 2005 15:14:02 UTC

Message 14423 in response to message 14422


I looked through this thread but could not find the info. What are your resource shares? How many WU's of each project do you have in queue already?

Quote:

As of today no work!

7/28/2005 9:45:10 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/28/2005 9:45:10 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/28/2005 9:45:20 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Jim

Gordon Hartman

Joined: 19 Feb 05

Posts: 34

Credit: 34812635

RAC: 2777

RE: I looked through this

28 Jul 2005 17:14:07 UTC

Message 14424 in response to message 14423


Quote:

I looked through this thread but could not find the info. What are your resource shares? How many WU's of each project do you have in queue already?

Quote:
As of today no work!

7/28/2005 9:45:10 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/28/2005 9:45:10 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/28/2005 9:45:20 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Seti is 100 (40%) and has 9 WUs, Einstein is 50 (20%) and has ZERO WUs, Climate is 100 (40%) and has 2 WUs. I'm running WinXP with an Intel 3.2 GHz. Everything was fine until I went to 4.45; before, on 4.25, I normally had 5 Einstein WUs (I was also giving each project 33 1/3%). I also have a Win2K AMD 1.6 GHz that has the same problem: no Einstein units! I'm at a loss!

Jim Baize

Joined: 22 Jan 05

Posts: 116

Credit: 582144

RAC: 0

One thing to note is that

28 Jul 2005 17:24:57 UTC

Message 14425 in response to message 14424


One thing to note is that 4.45 will not behave like 4.25. The scheduler code was changed. So, if you are looking for identical behaviour, you're not going to see it.

Another aspect you have to keep in mind is the LT and the ST debts. It may still be trying to equalize those two debts.

Did you also update the second machine to 4.45 at the same time?

Quote:

Seti is 100 (40%) and has 9 WUs, Einstein is 50 (20%) and has ZERO WUs, Climate is 100 (40%) and has 2 WUs. I'm running WinXP with an Intel 3.2 GHz. Everything was fine until I went to 4.45; before, on 4.25, I normally had 5 Einstein WUs (I was also giving each project 33 1/3%). I also have a Win2K AMD 1.6 GHz that has the same problem: no Einstein units! I'm at a loss!

Jim

Gordon Hartman

Joined: 19 Feb 05

Posts: 34

Credit: 34812635

RAC: 2777

RE: One thing to note is

28 Jul 2005 20:46:46 UTC

Message 14426 in response to message 14425


Quote:

One thing to note is that 4.45 will not behave like 4.25. The scheduler code was changed. So, if you are looking for identical behaviour, you're not going to see it.

Another aspect you have to keep in mind is the LT and the ST debts. It may still be trying to equalize those two debts.

Did you also update the second machine to 4.45 at the same time?

Quote:
Seti is 100 (40%) and has 9 WUs, Einstein is 50 (20%) and has ZERO WUs, Climate is 100 (40%) and has 2 WUs. I'm running WinXP with an Intel 3.2 GHz. Everything was fine until I went to 4.45; before, on 4.25, I normally had 5 Einstein WUs (I was also giving each project 33 1/3%). I also have a Win2K AMD 1.6 GHz that has the same problem: no Einstein units! I'm at a loss!

No, I'm not looking for identical behavior, just some work. Yes, I updated a WinXP, a Win2K, & a Win98 (just Seti) the same day.

I'm leaving it alone for 7 days then I'll change something!

venox7

Joined: 22 Jan 05

Posts: 16

Credit: 10072175

RAC: 0

After all our discussions

29 Jul 2005 7:58:18 UTC

Message 14427


After all our discussions here, and reading other threads as well, I'm sure the only problem is the misunderstanding of how the scheduler/resource share actually works. The fact is that it requests the amount of work equal to your reconnect time, and not equal to your reconnect * share * time to deadline. Based on this E@H is the only project that is doing what it is supposed to be doing.

On the other hand, if all the projects were doing that, deadlines would surely be missed....
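
A rough sketch of the difference, using numbers from earlier in the thread (a 2 day connect time and an 8% share). Both function names are hypothetical, and the "share-scaled" variant is only what people tend to expect, not something any Boinc version is known to do.

```python
def requested_hours(connect_days):
    """Work request sized to the connect interval alone, as described above."""
    return connect_days * 24

def share_scaled_hours(connect_days, share):
    """What a request would look like if it were scaled by resource share."""
    return connect_days * 24 * share

connect_days = 2   # connect time mentioned earlier in the thread
share = 0.08       # Einstein resource share mentioned earlier in the thread

print(requested_hours(connect_days))            # 48 hours
print(share_scaled_hours(connect_days, share))  # a little under 4 hours
```

The first figure is close to the 45 hours of work reported earlier in the thread, which fits the idea that the request tracks the connect time rather than the share.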

Gordon Hartman

Joined: 19 Feb 05

Posts: 34

Credit: 34812635

RAC: 2777

RE: RE: Hi Gordon, Still

29 Jul 2005 10:07:07 UTC

Message 14428 in response to message 14422


Quote:

Quote:
Hi Gordon,

Still the same problem for me as well. Read our discussion at

http://einsteinathome.org/node/189616

V7

As of today no work!

7/28/2005 9:45:10 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/28/2005 9:45:10 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/28/2005 9:45:20 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

7/29/2005 6:05:08 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/29/2005 6:05:08 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/29/2005 6:05:10 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Gordon Hartman

Joined: 19 Feb 05

Posts: 34

Credit: 34812635

RAC: 2777

RE: After all our

29 Jul 2005 10:08:16 UTC

Message 14429 in response to message 14427


Quote:

After all our discussions here, and reading other threads as well, I'm sure the only problem is the misunderstanding of how the scheduler/resource share actually works. The fact is that it requests the amount of work equal to your reconnect time, and not equal to your reconnect * share * time to deadline. Based on this E@H is the only project that is doing what it is supposed to be doing.

On the other hand, if all the projects were doing that, deadlines would surely be missed....

Then why am I not getting any work?

7/29/2005 6:05:08 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/29/2005 6:05:08 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/29/2005 6:05:10 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

venox7

Joined: 22 Jan 05

Posts: 16

Credit: 10072175

RAC: 0

Gordon, Check if you're

29 Jul 2005 10:23:30 UTC

Message 14430


Gordon,

Check if your long term debt is negative.

You can use BoincView to do this. Download it at

http://boincview.amanheis.de/

It's a very useful little program.

If your E@H long term debt is negative, Boinc won't request work for it, unless all other projects (with positive debt) are out of work.

That is a long term way of honouring the resource share settings.

Once the E@H long term debt is positive again, Boinc will automatically request work for it.

The 'problem' starts when E@H downloads enough work to fill your reconnect time, which it is supposed to do according to the current rules of Boinc. Because of this, Boinc goes into EDF (earliest deadline first) and commits all resources to E@H to make sure the work is completed by the deadline. The consequence is that E@H runs up a large negative long term debt by hogging your cpu for days to finish before the deadline. As a result, the other projects must get their share of cpu time (based on your resource shares) before Boinc requests any further E@H work.
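
The bookkeeping can be illustrated with a toy model. This is a sketch, not Boinc's actual code: it assumes every project earns debt in proportion to its resource share while the project that is running pays for the cpu time it uses. It uses the 100/50/100 shares posted earlier in the thread with a made-up 24 hour E@H run; the function name and the run lengths are hypothetical.

```python
def update_debts(debts, shares, running, hours):
    """Toy rule: every project earns share * hours, the running one pays hours."""
    total = sum(shares.values())
    updated = {}
    for name, debt in debts.items():
        earned = shares[name] / total * hours
        used = hours if name == running else 0.0
        updated[name] = debt + earned - used
    return updated

shares = {"Seti": 100, "Einstein": 50, "Climate": 100}
debts = {name: 0.0 for name in shares}

# E@H hogs the cpu for a (hypothetical) 24 hours under EDF
debts = update_debts(debts, shares, "Einstein", 24)
print(debts)  # Einstein is now well below zero, the others are above zero

# The other two projects then run for 48 hours each
debts = update_debts(debts, shares, "Seti", 48)
debts = update_debts(debts, shares, "Climate", 48)
print(debts)  # every debt is back near zero
```

In this toy model a 24 hour E@H binge at a 20% share takes 96 hours of other work to pay off, which is the "no work for a while" effect seen in the logs.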

In theory, over time, the resource settings will be honoured, although E@H won't get work for a week, or two or three, depending on your resource settings and reconnect time.

A way to get a more even spread of work on your machine is to lower your reconnect time, but this is only feasible for people with always-on internet connections. With a 0.1 or 0.2 day reconnect time the scheduler only downloads 1 or 2 workunits at a time, which can easily be finished before the deadline, so the client does not overcommit and go into EDF.
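
As a rough illustration, assume the server hands out whole workunits until the requested time is covered, and assume an estimated run time of 4 hours per workunit. Both are assumptions made for this sketch, not figures from the project, and the function name is hypothetical.

```python
import math

def workunits_sent(connect_days, est_hours_per_wu):
    """Whole workunits needed to cover a request sized to the connect time."""
    request_hours = connect_days * 24
    return max(1, math.ceil(request_hours / est_hours_per_wu))

EST_HOURS = 4.0  # assumed estimate per workunit, for illustration only

for days in (0.1, 0.2, 2):
    print(days, "day connect time ->", workunits_sent(days, EST_HOURS), "workunits")
```

The exact counts depend entirely on the assumed estimate; the point is only that the batch size scales with the connect time.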

In short, short term debt determines which project gets the cpu, and long term debt determines which projects get new work units.

Hope this helps.
