...I've read the Wiki, I've had my connect time set to 2 days for the last six months (reasonable for an AMD64 3000), and I fully understand the concept of short/long term debt and the scheduler's rules for fetching WUs.
How come E@H receives 45 hrs of work to be completed in 7 days when my resource share is set at 8% (which is about 14 hrs out of 7 days)? This leads to overcommitment of BOINC, i.e. panic mode, no work fetch (all projects) and earliest deadline first.
The received wisdom is not that this doesn't happen; clearly it does, as you and many others have pointed out.
The received wisdom is that it does not matter. After a binge on Einstein WUs, the client will abstain from Einstein for a while, then binge again. The net effect will be to process Einstein for 8% of the time over the long run.
It is a sort of controlled panic: the client goes into panic mode (officially known as EDF, not panic) every so often, but this is controlled in the long term by the LTD, which then prevents more work being downloaded from that project for a while.
As John pointed out, if you get impatient and do a reset of the Einstein project because you feel it is about time you saw some Einstein WUs again, then you defeat the whole process.
Quote:
And why does this ONLY happen with E@H?
It is not true to say it only happens to Einstein. Set your resource for SETI to 0.1% and you will see it happen there too. For any given project and any given machine, there will be a resource share below which this effect occurs.
For a given machine, this critical resource share is higher for Einstein than for other projects, leading to the appearance in many cases that it is 'only Einstein'.
The reason Einstein sees this effect at higher resource shares is due to some combination of the following two distinctive features of this project's current server configuration:
a) Einstein has the shortest deadlines, counted in number of WUs that can be processed in the deadline interval (e.g. on a 700MHz box, Einstein takes around 24 hours per WU, meaning ~7 WUs in the deadline time of 1 week; LHC takes 11 hours, meaning ~30 WUs in the deadline time of 2 weeks; Predictor takes ~4 hours, meaning ~40 WUs in the deadline time of 1 week)
b) Einstein underestimates its run durations on most platforms (e.g. on a 700MHz box it estimates 18 hours but takes around 24 per WU), whereas all the other projects I know of overestimate (LHC est 27 hours, takes 11; Predictor est 6 hours, takes 3-4)
Many people have suggested (with varying degrees of politeness) altering (a) by increasing Einstein deadlines. This is not done because of issues with the database size.
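To put numbers on (a) and (b), here is a rough sketch that just redoes the arithmetic with the 700MHz figures quoted above. The dictionary layout and function names are made up for illustration, and Predictor's run time is taken as 4 hours.

```python
# Figures quoted in the post for a 700MHz box, all in hours.
PROJECTS = {
    "Einstein": {"actual": 24.0, "estimate": 18.0, "deadline": 7 * 24.0},
    "LHC": {"actual": 11.0, "estimate": 27.0, "deadline": 14 * 24.0},
    "Predictor": {"actual": 4.0, "estimate": 6.0, "deadline": 7 * 24.0},
}


def wus_per_deadline(actual_hours, deadline_hours):
    """How many WUs fit in one deadline interval if the box runs nothing else."""
    return deadline_hours / actual_hours


def estimate_ratio(estimate_hours, actual_hours):
    """Below 1.0 means the project underestimates, above 1.0 it overestimates."""
    return estimate_hours / actual_hours


for name, p in PROJECTS.items():
    wus = wus_per_deadline(p["actual"], p["deadline"])
    ratio = estimate_ratio(p["estimate"], p["actual"])
    print(f"{name}: {wus:.1f} WUs per deadline, estimate/actual = {ratio:.2f}")
```

Einstein comes out lowest on both counts: fewest WUs per deadline, and the only one of the three with an estimate/actual ratio below 1.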
I have suggested before that Einstein simply doubles its quoted estimates of the amount of work in a WU, which would bring it more in line with the practice on other projects and address issue (b). Nobody has told me why this is not done...
However, I emphasize that the effect of either change would not be to remove the effect, it would simply be to make it occur less often. It is understandable that the project admins are not rushing to make adjustments.
~~gravywavy
As of today no work!
7/28/2005 9:45:10 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/28/2005 9:45:10 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/28/2005 9:45:20 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
I looked through this thread but could not find the info. What are your resource shares? How many WU's of each project do you have in queue already?
Jim
Seti is 100 (40%) and has 9 WUs, Einstein is 50 (20%) and has ZERO WUs, Climate is 100 (40%) and has 2 WUs. I'm running WinXP on an Intel 3.2 GHz. Everything was fine until I went to 4.45; before, on 4.25, I normally had 5 Einstein WUs (I was also giving each project 33 1/3%). I also have a Win2K AMD 1.6 GHz that has the same problem, no Einstein units! I'm at a loss!
One thing to note is that 4.45 will not behave like 4.25. The scheduler code was changed. So, if you are looking for identical behaviour, you're not going to see it.
Another aspect you have to keep in mind is the LT and the ST debts. It may still be trying to equalize those two debts.
Did you also update the second machine to 4.45 at the same time?
Jim
No, I'm not looking for identical behavior, just some work. Yes, I updated a WinXP, a Win2K, and a Win98 (just Seti) machine the same day.
I'm leaving it alone for 7 days then I'll change something!
After all our discussions here, and reading other threads as well, I'm sure the only problem is a misunderstanding of how the scheduler/resource share actually works. The fact is that it requests an amount of work equal to your reconnect time, and not equal to your reconnect * share * time to deadline. Based on this, E@H is the only project that is doing what it is supposed to be doing.
On the other hand, if all the projects were doing that, deadlines would surely be missed....
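A back-of-envelope sketch of that difference, using the 2 day connect time and 8% share from the original question. The function names are invented here and this is not the client's real work-fetch code, just the two rules of thumb side by side.

```python
HOURS_PER_DAY = 24.0


def request_by_connect_interval(connect_days):
    """Work requested if the client simply fills the connect interval (hours)."""
    return connect_days * HOURS_PER_DAY


def request_scaled_by_share(connect_days, share):
    """Work requested if the connect interval were scaled by resource share (hours)."""
    return connect_days * HOURS_PER_DAY * share


def share_of_deadline(deadline_days, share):
    """CPU time the project is entitled to within one deadline period (hours)."""
    return deadline_days * HOURS_PER_DAY * share


connect_days, share, deadline_days = 2.0, 0.08, 7.0
print("fill connect interval:", request_by_connect_interval(connect_days), "hours")
print("scaled by share:", round(request_scaled_by_share(connect_days, share), 2), "hours")
print("8% of one deadline:", round(share_of_deadline(deadline_days, share), 2), "hours")
```

Filling the connect interval gives 48 hours, close to the 45 hours reported, while an 8% share of a 7 day deadline is only about 13.4 hours. That gap is what pushes the client into EDF.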
Then why am I not getting any work?
7/29/2005 6:05:08 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
7/29/2005 6:05:08 AM|Einstein@Home|Requesting 0 seconds of work, returning 0 results
7/29/2005 6:05:10 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Gordon,
Check if your long term debt is negative.
You can use BoincView to do this. Download it at
http://boincview.amanheis.de/
It's a very useful little program.
If your E@H long term debt is negative, Boinc won't request work for it, unless all other projects (with positive debt) are out of work.
That is a long term way of honouring the resource share settings.
Once the E@H long term debt is positive again, Boinc will automatically request work for it.
The 'problem' starts when E@H downloads enough work to fill your reconnect time, which it is supposed to do according to the current rules of BOINC. Because of this, BOINC goes into EDF (earliest deadline first) and commits all resources to E@H to make sure the work is completed by the deadline. The consequence is that E@H runs up a large negative long term debt by hogging your CPU for days to finish before the deadline. As a result, the other projects must catch up on CPU time (in proportion to your resource shares) before BOINC requests any further E@H work.
In theory, over time, the resource settings will be honoured, although E@H won't get work for a week, or two or three, depending on your resource settings and reconnect time.
A way to get a more even spread of work on your machine is to lower your reconnect time, but this is only feasible for people with always-on internet connections. With a 0.1 or 0.2 day reconnect time the scheduler only downloads 1 or 2 workunits at a time, which can easily be finished before the deadline, so the client does not overcommit and go into EDF.
In short, short term debt determines which project gets the CPU, and long term debt determines which projects get new work units.
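Here is a toy model of the long term debt bookkeeping, to show why the wait can run to weeks. It is only a sketch under simple assumptions: each project is credited its share of every CPU hour and the one actually running is charged the full hour. The real client's accounting is more involved, and the function names and the two-project setup are made up for illustration.

```python
def update_debts(debts, shares, running, hours):
    """Credit every project its share of the elapsed hours, charge the one that ran."""
    for project in debts:
        ran = hours if project == running else 0.0
        debts[project] += shares[project] * hours - ran


def hours_until_fetch_allowed(debt, share):
    """Hours of other projects' work needed before a negative debt is back at zero."""
    if debt >= 0:
        return 0.0
    return -debt / share


# Assumed setup: Einstein at 8%, everything else lumped together at 92%.
shares = {"Einstein": 0.08, "Other": 0.92}
debts = {"Einstein": 0.0, "Other": 0.0}

# Einstein hogs the CPU for 45 hours in EDF mode.
update_debts(debts, shares, "Einstein", 45.0)
wait = hours_until_fetch_allowed(debts["Einstein"], shares["Einstein"])
print("Einstein debt after the binge:", round(debts["Einstein"], 1), "hours")
print("Wait before more Einstein work:", round(wait / 24.0, 1), "days")
```

With these numbers the wait is roughly three weeks, and 45 hours out of the whole cycle still works out to 8%.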
Hope this helps.
V7