Message from server: No work available (there was work but your computer would not finish it before it is due Review preferences for this project's Resource Share
> What would be really nice would be if the scheduler used the actual work on
> hand on the client rather than a resource share to determine whether and how
> much work to download. Slower machines that could finish one WU on time for
> one project are now going to be limited to one project with no backup. This
> means that if that project goes down for any reason, those machines will be
> sitting idle.
John,
I've spent quite a bit of time on the scheduler during the past couple of days. It now does this, at least to the degree that is possible with the information that is available to the server.
The basic idea is that if a machine does not have any work to do for Einstein@Home, it will at least get one workunit. After that, it will only get work if the estimated completion time of the work (taking into account any work that the machine is already doing) is before the deadline for that work. This is perhaps not ideal in all cases but hopefully will work fairly well most of the time.
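The rule described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual Einstein@Home scheduler code; the function name and parameters are hypothetical, and it assumes the server can only count this project's own work on the host.

```python
def should_send_work(queued_hours: float,
                     new_task_hours: float,
                     hours_to_deadline: float) -> bool:
    """Decide whether to send one more workunit to a host.

    queued_hours:      estimated hours of this project's work already on the host
    new_task_hours:    estimated hours the new workunit needs on this host
    hours_to_deadline: hours from now until the new workunit's deadline
    All three names are assumptions made for this sketch.
    """
    # A host with no work for this project always gets at least one workunit.
    if queued_hours <= 0:
        return True
    # Otherwise the new workunit must be expected to finish before its
    # deadline, counting the work the host is already doing.
    return queued_hours + new_task_hours < hours_to_deadline
```

For example, a host with nothing queued gets a workunit even if it is unlikely to finish on time, while a host that already holds 100 hours of work is refused another 100-hour workunit due in a week.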
Bruce
Director, Einstein@Home
So, would it be a good idea if the client told the server how much work is in the local queue?
Supporting BOINC, a great concept !
I think the server actually has this information: it has the number of WUs sent to that PC, and it can calculate the estimated work time.
Administrator
Message@Home
> I think, the server actually has this information - it has the number of WUs
> that are sent to that pc, and it can calculate the estimated work time.
>
No, it doesn't. The server only knows about its own project. If the client runs multiple projects (as most BOINC clients do), the server has no way to calculate this correctly, because it doesn't know whether the client queue is empty or nearly full.
EDIT: And the server doesn't know how the resource share is split among the other projects.
Supporting BOINC, a great concept !
> No, it doesn't. The server only knows about its own project. If the client runs
> multiple projects (as most BOINC clients do), the server has no way to
> calculate this correctly, because it doesn't know whether the client queue is
> empty or nearly full.
>
> EDIT: And the server doesn't know how the resource share is split among the
> other projects.
Yes, but the client does know this. I tried to make that point a couple of weeks ago on the developers' mailing list and was basically told I don't know what I am talking about. Right now BOINC is suffering from a developer mindset that each project is directed from the information available on the project's servers (database) and should assume full availability of the participant's computer.
The fact that the BOINC Manager does know about the projects that it is attached to, and therefore how much time is really available to the project, and the resource shares for that client for all of the projects that it is attached to ...
Anyway, if history is any guide, as soon as someone else makes that point, maybe we will see the client side information being provided back to the servers so they can make more intelligent choices ...
Paul,
maybe I should make a similar post in the LHC forum. The administrator there pointed out that the good thing about BOINC is being attached to more than one project, and that people should stop crying and whining when LHC doesn't give out work.
If I remember right, he announced that LHC will more often alternate between phases of delivering work and not delivering work. So perhaps this will get his attention at some point.
As far as I know, E@H is the first project to use this calculation on the server side. Give it a few days, I guess, and they will also see the need for this value to be communicated from the client.
Supporting BOINC, a great concept !
I guess that's where the queue lengths come in. If set too high, there would be no chance in heck of finishing all the WUs in that queue. Multiply that by 4 or 5 projects and you can see what I mean. With a connect-every-10-days setup on my computer, that would be close to 60 WUs.
Perhaps some actual data would help.
Here is the share from one of my computers:
Now, this is not the fastest computer in the world, but it has scheduled completions in 19 hours. Asking for three days' worth and providing a 97% resource share should, it seems to me, result in my having at the very minimum three work units in my queue. I now have ONE.
I want to thank E@H for putting in the restriction. I've wasted many hours calculating away on E@H w/units that never get done in time on my slower machines (partly because they run up to 6 BOINC projects).
One of my slowest machines (397 MHz PII) is currently 54% through an E@H w/u that was due about 20 hrs ago. It's got 23:51:20 left to complete... think that's gonna do any good? Maybe I should reset that one since it's the only E@H w/u on that box? 28 hrs of CPU down the flusher! Don't know if it would get credit if it finished, since only 2 machines have reported in on this w/u (325170).
Unencumbered by knowledge...
I do NOT see how the server can do any reasonable estimate w/o detail from each computer. Seems to me it'd need the following items or their equiv:
1) work load (resource debt) for each project
2) resource share (28% for my E@H)
3) CPU benchmarks (should have this data)
4) recent up time % (numbers on account aren't good enough since the machine in question was literally on the 'slow boat from China' for 2 months and only shows 10% up time but is actually running 24/7 now)
5) deadlines for all pending work (maybe)
6) average turnaround times (maybe)
The 5th item brings me to another possible issue/solution. That is, the client-side scheduler could check due dates and, if one is 'in danger', override the debt/share scheduling.
Let me give an example to try and be clearer.
A BOINC client has E@H set to a 25% resource share.
The client has 1 E@H w/u with 5 hours of remaining effort.
The w/u's deadline is 24 hours away.
Since 0.25 * 24 hrs = 6 hrs and 6 > 5, it's reasonable to think it will finish in time.
Now add another E@H w/u so we have:
A BOINC client has E@H set to a 25% resource share.
The client has 2 E@H w/u with a total of 12 hours (5 hrs & 7 hrs) of remaining effort.
The w/u's deadlines are both 24 hours away.
Since 0.25 * 24 hrs is still 6 hrs but 6 is now less than the 12 hrs of work left, a temporary override of the schedule could be done to allow both w/units to finish in time.
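The arithmetic in the two cases above can be written as a small check. This is just a sketch of the idea in this post, not how the BOINC client actually schedules work; the function names are made up for illustration, and it assumes all of the project's work units share the same deadline, as in the example.

```python
def available_hours(resource_share: float, hours_to_deadline: float) -> float:
    """CPU hours the project can expect before the deadline at its normal share."""
    return resource_share * hours_to_deadline


def needs_override(resource_share: float,
                   remaining_hours: list[float],
                   hours_to_deadline: float) -> bool:
    """True if the remaining work will not fit at the normal resource share,
    so the debt/share schedule would have to be overridden temporarily."""
    return sum(remaining_hours) > available_hours(resource_share, hours_to_deadline)


# Case 1: one w/u with 5 hours left, 25% share, deadline in 24 hours.
one_unit = needs_override(0.25, [5.0], 24.0)        # 5 <= 6, no override needed

# Case 2: two w/units with 5 and 7 hours left, same share and deadline.
two_units = needs_override(0.25, [5.0, 7.0], 24.0)  # 12 > 6, override needed
```

With different deadlines per w/u, the same check would have to be run for each deadline in turn, which is where the complexity comes in.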
I can see how this could get complex since this might impact other projects/work and since the example work units are likely to have different due date/times. Maybe it could be kept simple by doing the evaluation ONLY on a single w/u. In my example it would be the 2nd w/u that would get the "emergency mode override" after the 1st one finished (since it would have 7 hours of work to finish in under 12 clock hours left).
So, having written this and again, unencumbered by any actual knowledge of the system, it seems maybe the server side solution suggested by John McL7 and others may be better (less complex). However there may be a client side safety net available.
> > No, it doesn't. The server only knows about its own project. If the client
> > runs multiple projects (as most BOINC clients do), the server has no way to
> > calculate this correctly, because it doesn't know whether the client queue
> > is empty or nearly full.
> >
> > EDIT: And the server doesn't know how the resource share is split among the
> > other projects.
>
> Yes, but the client does know this. I tried to make that point a couple of
> weeks ago on the developers' mailing list and was basically told I don't
> know what I am talking about. Right now BOINC is suffering from a developer
> mindset that each project is directed from the information available on the
> project's servers (database) and should assume full availability of the
> participant's computer.
>
> The fact that the BOINC Manager does know about the projects that it is
> attached to, and therefore how much time is really available to the
> project, and the resource shares for that client for all of the projects that
> it is attached to ...
>
> Anyway, if history is any guide, as soon as someone else makes that point,
> maybe we will see the client side information being provided back to the
> servers so they can make more intelligent choices ...
Paul:
You are right... it is a BOINC issue, but kudos to the Einstein people for doing their own workaround... I think you will find they are the most participant-friendly project around... as are some of the Prime Number projects and FindaDrug, I would say.