Message from server: No work available (there was work but your computer would not finish it before it is due Review preferences for this project's Resource Share

Michael Berger
Joined: 22 Jan 05
Posts: 36
Credit: 37252
RAC: 0

Resource share

"If you participate in multiple BOINC projects, this is the proportion of your resources used by Einstein@Home"

Proportion is the key word here. The Resource share value must be greater than zero and may be larger than 100. Don't confuse this setting with a percentage: a value of 100 does not necessarily mean 100%.

example:

LHC@Home is currently down, and my Resource share at LHC is set to 100. Because the scheduler does not respond, it can't be changed. I want Einstein@Home to use BOINC 100% of the time. I only participate in these two projects. What do I do to achieve this proportion?

Einstein@Home Resource share = 1000000
LHC@Home Resource share = 100

The results produced from these settings are:

Einstein@Home = 99.99%
LHC@Home = 0.01%
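
The arithmetic behind these percentages can be checked with a short sketch. It assumes, as the quoted description suggests, that each project's fraction is its share divided by the sum of all shares. The function name `share_fractions` is made up for illustration and is not part of BOINC.

```python
def share_fractions(shares):
    """Return each project's fraction of the total resource share.

    Assumption: fraction = share / sum of all shares.
    """
    if any(value <= 0 for value in shares.values()):
        raise ValueError("resource share must be greater than zero")
    total = sum(shares.values())
    return {project: value / total for project, value in shares.items()}


shares = {"Einstein@Home": 1000000, "LHC@Home": 100}
for project, fraction in share_fractions(shares).items():
    print(f"{project} = {fraction * 100:.2f}%")
```

Only the ratio between the two numbers matters, so 10000 and 1 would give the same split.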

There's a fine line between fishing and standing on the shore looking like an idiot -- Steven Wright

Ziran
Joined: 26 Nov 04
Posts: 194
Credit: 615123
RAC: 1329

Message 3061 in response to message 3050

> I've spent quite a bit of time on the scheduler during the past couple of
> days. It now does this, at least to the degree that is possible with the
> information that is available to the server.
>
> The basic idea is that if a machine does not have any work to do for
> Einstein@Home, it will at least get one workunit. After that, it will only
> get work if the estimated completion time of the work (taking into account any
> work that the machine is already doing) is before the deadline for that work.
> This is perhaps not ideal in all cases but hopefully will work fairly well
> most of the time.
>
> Bruce

I just got the "No work available (there was work but your computer would not finish it before it is due)" message on my pIII450. I pressed the update button when I discovered this, and a new WU downloaded fine. Would it be possible to include in your workaround something like: if less than 20 min remain on the current WU, let the computer download one new WU? I copied the message window so you could see how the scheduler behaved in my case.

--- - 2005-02-16 19:04:35 - Starting BOINC client version 4.19 for windows_intelx86
Einstein@Home - 2005-02-16 19:04:38 - Project prefs: no separate prefs for home; using your defaults
Einstein@Home - 2005-02-16 19:04:39 - Host ID is 5349
--- - 2005-02-16 19:04:42 - General prefs: from Einstein@Home (last modified 2005-02-07 00:21:01)
--- - 2005-02-16 19:04:42 - General prefs: no separate prefs for home; using your defaults
Einstein@Home - 2005-02-16 19:04:57 - Resuming computation for result H1_0081.9__0082.2_0.1_T25_Test02_5 using einstein version 4.75
--- - 2005-02-16 22:51:34 - May run out of work in 0.01 days; requesting more
Einstein@Home - 2005-02-16 22:51:34 - Requesting 335 seconds of work
Einstein@Home - 2005-02-16 22:51:34 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2005-02-16 22:51:35 - Computation for result H1_0081.9__0082.2_0.1_T25_Test02 finished
Einstein@Home - 2005-02-16 22:51:35 - Started upload of H1_0081.9__0082.2_0.1_T25_Test02_5_0
Einstein@Home - 2005-02-16 22:51:39 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2005-02-16 22:51:39 - Message from server: No work available (there was work but your computer would not finish it before it is due
Einstein@Home - 2005-02-16 22:51:39 - Project prefs: no separate prefs for home; using your defaults
Einstein@Home - 2005-02-16 22:51:39 - Got request to delete file: H1_0081.9
Einstein@Home - 2005-02-16 22:51:39 - No work from project
Einstein@Home - 2005-02-16 22:51:39 - Deferring communication with project for 1 hours, 0 minutes, and 0 seconds
Einstein@Home - 2005-02-16 22:51:43 - Finished upload of H1_0081.9__0082.2_0.1_T25_Test02_5_0
Einstein@Home - 2005-02-16 22:51:43 - Throughput 423 bytes/sec
Einstein@Home - 2005-02-16 23:29:44 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2005-02-16 23:29:48 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2005-02-16 23:29:48 - Got request to delete file: H1_0081.9
Einstein@Home - 2005-02-16 23:29:48 - Started download of einstein_4.79_windows_intelx86.exe
Einstein@Home - 2005-02-16 23:29:48 - Started download of einstein_4.79_windows_intelx86.pdb
Einstein@Home - 2005-02-16 23:30:03 - Finished download of einstein_4.79_windows_intelx86.pdb
Einstein@Home - 2005-02-16 23:30:03 - Throughput 213125 bytes/sec
Einstein@Home - 2005-02-16 23:30:03 - Started download of H1_0953.4
Einstein@Home - 2005-02-16 23:30:04 - Finished download of einstein_4.79_windows_intelx86.exe
Einstein@Home - 2005-02-16 23:30:04 - Throughput 108637 bytes/sec
Einstein@Home - 2005-02-16 23:30:42 - Finished download of H1_0953.4
Einstein@Home - 2005-02-16 23:30:42 - Throughput 375248 bytes/sec
Einstein@Home - 2005-02-16 23:30:43 - Starting result H1_0953.4__0953.5_0.1_T00_Test02_1 using einstein version 4.79

When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Message 3062 in response to message 3061

Ziran,

I thought it would be worthwhile for me to explain the logic behind this, not so much for you as for others who may be wondering what's going on behind the scenes. So my comments are interspersed below in your message log.


> --- - 2005-02-16 19:04:35 - Starting BOINC client version 4.19 for
> windows_intelx86
> Einstein@Home - 2005-02-16 19:04:38 - Project prefs: no separate prefs for
> home; using your defaults
> Einstein@Home - 2005-02-16 19:04:39 - Host ID is 5349
> --- - 2005-02-16 19:04:42 - General prefs: from Einstein@Home (last modified
> 2005-02-07 00:21:01)
> --- - 2005-02-16 19:04:42 - General prefs: no separate prefs for home; using
> your defaults
> Einstein@Home - 2005-02-16 19:04:57 - Resuming computation for result
> H1_0081.9__0082.2_0.1_T25_Test02_5 using einstein version 4.75
> --- - 2005-02-16 22:51:34 - May run out of work in 0.01 days; requesting more
> Einstein@Home - 2005-02-16 22:51:34 - Requesting 335 seconds of work
> Einstein@Home - 2005-02-16 22:51:34 - Sending request to scheduler:
> http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
> Einstein@Home - 2005-02-16 22:51:35 - Computation for result
> H1_0081.9__0082.2_0.1_T25_Test02 finished
> Einstein@Home - 2005-02-16 22:51:35 - Started upload of
> H1_0081.9__0082.2_0.1_T25_Test02_5_0
> Einstein@Home - 2005-02-16 22:51:39 - Scheduler RPC to
> http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
> Einstein@Home - 2005-02-16 22:51:39 - Message from server: No work available
> (there was work but your computer would not finish it before it is due

Let's see what led to this:
2005-02-16 21:44:20 [normal ] [HOST#5349] got request for 334.990120 seconds of work; available disk 0.707837 GB
2005-02-16 21:44:20 [debug ] est cpu dur 127051.428571; running_frac 0.200000; rsf 1.000000; est 635257.142857

2005-02-16 21:44:20 [debug ] [WU#348347 H1_0249.4__0249.6_0.1_T29_Test02] needs 635257 seconds on [HOST#5349]; delay_bound is 604800 (request.estimated_delay is 747.976118)
2005-02-16 21:44:20 [normal ] [HOST#5349] Sent 0 results

So the scheduler has concluded that your computer is available LESS than 20% of the time (running_frac). This is the product of the fraction of the time it is turned on, times the fraction of that time that BOINC is actually running (rather than suspended because you are using the computer for other things). Now the WU that was being considered is estimated to take 127051 CPU seconds on your machine. Hence, dividing this by 0.2, the estimated time to finish this work is 635257 seconds, which is longer than one week (the deadline).

(Note that 747 seconds is the estimated time to complete the E@H work that was still on your computer.)

2005-02-16 21:44:20 [debug ] [WU#348347 H1_0249.4__0249.6_0.1_T29_Test02] needs 635257 seconds on [HOST#5349]; delay_bound is 604800 (request.estimated_delay is 747.976118)
2005-02-16 21:44:20 [normal ] [HOST#5349] Sent 0 results

So these numbers explain the reply just above. You would not be able to complete the work before the delay bound of one week = 604800 seconds.
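
The check described above can be sketched roughly as follows. This is not the actual scheduler code: the function name `fits_deadline` is hypothetical, and it assumes the wallclock estimate is the CPU estimate divided by running_frac, with work already queued on the host finishing first.

```python
def fits_deadline(est_cpu_dur, running_frac, estimated_delay, delay_bound):
    """Return True if new work is expected to finish before its deadline.

    Assumption: wallclock estimate = CPU estimate / running_frac, and work
    already queued on the host (estimated_delay) has to finish first.
    """
    wallclock = est_cpu_dur / running_frac
    return estimated_delay + wallclock <= delay_bound


# Values taken from the scheduler log for HOST#5349 above.
ok = fits_deadline(
    est_cpu_dur=127051.428571,
    running_frac=0.2,
    estimated_delay=747.976118,
    delay_bound=604800,
)
print(ok)  # False: 635257 s alone already exceeds the 604800 s deadline
```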

> Einstein@Home - 2005-02-16 22:51:39 - Project prefs: no separate prefs for
> home; using your defaults
> Einstein@Home - 2005-02-16 22:51:39 - Got request to delete file: H1_0081.9
> Einstein@Home - 2005-02-16 22:51:39 - No work from project
> Einstein@Home - 2005-02-16 22:51:39 - Deferring communication with project for
> 1 hours, 0 minutes, and 0 seconds
> Einstein@Home - 2005-02-16 22:51:43 - Finished upload of
> H1_0081.9__0082.2_0.1_T25_Test02_5_0
> Einstein@Home - 2005-02-16 22:51:43 - Throughput 423 bytes/sec

Now your computer COMPLETES its E@H work and has no further E@H work to do. So....

> Einstein@Home - 2005-02-16 23:29:44 - Sending request to scheduler:
> http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi

At this point your computer has no further work, and so it sends another request to the scheduler.

> Einstein@Home - 2005-02-16 23:29:48 - Scheduler RPC to
> http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Now the E@H scheduler is programmed (as described above) to completely IGNORE the fraction of time you are on, and your resource share, and always send ONE result IF you have no further E@H work to do.

2005-02-16 22:22:30 [normal ] [HOST#5349] got request for 591.666881 seconds of work; available disk 0.707837 GB

2005-02-16 22:22:30 [debug ] [HOST#5349] Sending app_version einstein windows_intelx86 479
2005-02-16 22:22:30 [debug ] est cpu dur 127051.428571; running_frac 0.200000; rsf 1.000000; est 635257.142857
2005-02-16 22:22:30 [normal ] [HOST#5349] Sending [RESULT#1187443 H1_0953.4__0953.5_0.1_T00_Test02_1] (fills 635257.14 seconds)
2005-02-16 22:22:30 [normal ] [HOST#5349] Sent 1 results

The estimate is that this WU will take 127051 CPU seconds on your machine.
Since your machine is able to run BOINC only 20% of the time, this is estimated to take 635257 seconds (more than 1 week!).

2005-02-16 22:22:30 [normal ] [HOST#5349] Sending [RESULT#1187443 H1_0953.4__0953.5_0.1_T00_Test02_1] (fills 635257.14 seconds)
2005-02-16 22:22:30 [normal ] [HOST#5349] Sent 1 results

The basic conclusion is this: if you want to get more work from the scheduler, you need to either leave your computer turned on for longer, or alternatively you need to ensure that BOINC spends more of its time running and less of its time waiting for the computer to be "Free".
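
Put together, the behaviour walked through in this post might be sketched like this. It is a simplified illustration only: the function name `results_to_send` is hypothetical, and it considers a single workunit per request rather than whatever the real scheduler does internally.

```python
def results_to_send(host_has_project_work, wallclock_estimate,
                    estimated_delay, delay_bound):
    """Return how many results a single request would be granted (0 or 1).

    Hypothetical simplification: a host with no work for the project always
    gets one result; otherwise the deadline check decides.
    """
    if not host_has_project_work:
        return 1  # deadline estimate is ignored for an idle host
    if estimated_delay + wallclock_estimate <= delay_bound:
        return 1
    return 0


# HOST#5349, first request: still finishing its last result.
print(results_to_send(True, 635257.142857, 747.976118, 604800))
# HOST#5349, second request: nothing left to crunch.
print(results_to_send(False, 635257.142857, 0.0, 604800))
```

With the logged numbers, the first call yields 0 and the second yields 1, which is the sequence seen in the message log.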

I hope this helps to explain what's going on at the server end!

Cheers,
Bruce

Director, Einstein@Home

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Message 3063 in response to message 3057

> Perhaps some actual data would help.
>
> Here is the share from one of my computers:
>
>
>
> Now this is not the fastest computer in the world but it has scheduled
> completions in 19 hours. Asking for three days worth, and providing a 97%
> resource share, it seems to me, should result in my having at the very
> minimum, three work units in my queue - I now have ONE.

James, I thought I'd provide a detailed reply to give you and others a snapshot of how this works from the server side. I hope this is helpful.

The key point is that 19 hours is the CPU time to complete the result, NOT the wallclock time. Apparently, according to the core client, your computer is only able to run BOINC jobs less than 20% of the time. Hence the 19 hours of CPU time is more like 100 hours of wallclock time.

2005-02-16 14:44:45 [debug ] est cpu dur 69433.173333; running_frac 0.200000; rsf 0.967742; est 358738.038306

The key number is the 0.200000, which is the product of the fraction of time that your computer is turned on, times the fraction of time that it is running BOINC. If less than 0.20, it is set to 0.20 by the scheduler. Hence although the job is estimated to take 69433 seconds of CPU time on your machine, the estimated time to completion is 358738 seconds, which is around four days.

2005-02-16 14:44:45 [debug ] [WU#348347 H1_0249.4__0249.6_0.1_T29_Test02] needs 358738 seconds on [HOST#10548]; delay_bound is 604800 (request.estimated_delay is 259197.491119)

The work currently on your machine is estimated to take 259197 seconds (around 3 days) to complete. Hence it does not make sense to send new work NOW, since it would not finish by the deadline of 1 week (604800 seconds).
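
For anyone who wants to reproduce these numbers, here is a rough sketch. The formula is an assumption inferred from the debug line: dividing the CPU estimate by running_frac times rsf reproduces the logged "est" value. Reading rsf as the resource share fraction is also a guess, based on the 97% share mentioned in the quoted message. The names `estimated_wallclock` and `MIN_RUNNING_FRAC` are made up for illustration.

```python
MIN_RUNNING_FRAC = 0.20  # floor described in the post above


def estimated_wallclock(est_cpu_dur, running_frac, rsf):
    """Estimate wallclock seconds needed for one result.

    Assumption: est = est_cpu_dur / (running_frac * rsf), with running_frac
    raised to the 0.20 floor. This form reproduces the logged 'est' value.
    """
    running_frac = max(running_frac, MIN_RUNNING_FRAC)
    return est_cpu_dur / (running_frac * rsf)


# Values from the debug lines for HOST#10548.
est = estimated_wallclock(69433.173333, 0.2, 0.967742)
slack = 604800 - 259197.491119  # time left in the week after queued work
print(round(est), round(slack), est <= slack)
```

The estimate (about 358738 seconds) is larger than the slack left after the queued work (about 345603 seconds), so no result is sent.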

2005-02-16 14:44:45 [normal ] [HOST#10548] Sent 0 results
2005-02-16 14:44:45 [normal ] sending delay request 51839

The delay request is 1/5 of the time before your machine will need new work. This is intended to ensure that if your machine works faster than expected, it will get plenty of new work.
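
As a quick check of that figure, here is a minimal sketch. The function name `delay_request` is hypothetical, and truncating to whole seconds is an assumption chosen because it matches the logged value.

```python
def delay_request(estimated_delay, divisor=5):
    """Seconds the client is asked to wait before contacting the scheduler.

    Assumption: one fifth of the queued-work estimate, truncated to whole
    seconds so that it matches the logged value.
    """
    return int(estimated_delay / divisor)


# request.estimated_delay for HOST#10548 from the log above.
print(delay_request(259197.491119))
```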

Bottom line: to get more work, leave your computer on more, and make sure that BOINC runs a larger fraction of the time.

Cheers,
Bruce

Director, Einstein@Home

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Message 3064 in response to message 3059

Isaiah, Yeti,

Thank you for your replies ...

> You are right....it is a BOINC issue but kudos to the Einstein people for
> doing their own work around....I think you will find they are the most
> participant friendly project around...as are some of the Prime Number projects
>
> and FindaDrug, I would say.

Since I am a "hard" physics nut I am likely to be altering my shares to do Einstein@Home and LHC@Home as the largest share when they get out of the starting gate. And I am CURRENTLY attached to 4 projects, 2 of them on all computers, 3 of them on all but two, and Einstein@Home on only one.

CPDN = 5 Computers
Predictor@Home = 5
SETI@Home = 6
Einstein@Home = 1

I have one computer that resets itself all of the time (cheap motherboard, I plan to replace that soon ...) so I don't run long WUs on that machine ... I should make the Macintosh run Predictor@Home I suppose ...

I do like the multi-project feature a lot. The point I have been unsuccessful in making is that, even though the design intent is to foster multi-project participation, it does not work as intended when you actually subscribe to multiple projects. And if you participate in many projects, it does not work that well because of the project-server-centric mindset.

And, Yes, I do like Bruce Allen & company as they are, at the moment, the most forthcoming with the project participants. As was LHC@Home when they were live and not Memorex ...

Anyway, My participation is oriented toward CPDN right now, with SETI@Home as a smaller part of my allocations. I was more aggressive with Einstein@Home but I was running into problems with the deadlines and so backed off. With the many fixes I am back with one computer to see if the problems are smaller.

Anyway, if you are interested in what I had to say along these lines, get the archive of the mailing list for the BOINC Developers and look for one dated January 30 this year named "Feature Request" ...

Skip Da Shu
Joined: 18 Jan 05
Posts: 152
Credit: 1044665853
RAC: 702971

Message 3065 in response to message 3062

> Ziran,
> I thought it would be worthwhile for me to explain the logic behind this, not
> so much for you as for others who may be wondering what's going on behind the
> scenes. So my comments are interspersed below in your message log.
LINES DELETED
>
> > --- - 2005-02-16 19:04:35 - Starting BOINC client version 4.19 for
> > windows_intelx86
> ... ensure that BOINC spends more of its time running and less of its
> time waiting for the computer to be "Free".
>
> I hope this helps to explain what's going on at the server end!
>
> Cheers, Bruce

Sweet... thanx for the good info.

Skip Da Shu
Joined: 18 Jan 05
Posts: 152
Credit: 1044665853
RAC: 702971

Looks like all the bla bla bla I had earlier about "emergency" schedule overrides is just that.. bla bla bla... as it seems the next release of the BOINC client GUI will have the ability to "Suspend" by project. This would allow one to manually get the "in trouble" w/u caught up by suspending some or all the other projects.

S@NL - EJG
Joined: 18 Jan 05
Posts: 34
Credit: 93500
RAC: 0

Message 3067 in response to message 3066

> as it seems the next release of the BOINC client GUI will have
> the ability to "Suspend" by project.

Indeed, it does (you can even suspend single WUs). I already used this once to crunch a WU before its deadline.
And after that I decreased my cache size of course. ;-)

You can also abort single WUs, including WUs that BOINC has not started to crunch yet.

Yeti
Joined: 17 Nov 04
Posts: 59
Credit: 1370775225
RAC: 1290524

@Paul

Thanks for your invitation to read Feature Request, but sorry, I'm a little bit tired of "fighting" for features.

I've been with BOINC since version 1.04, and I have spent a lot of time with it. And with my farm of computers, each update of the client costs me a lot of time.

I see it from this side: My clients get connected to all BOINC projects. If a project doesn't deliver WUs or doesn't like my clients, this doesn't really hurt me. With a minimum of 4 projects on my clients, they will always have work, especially because CPDN is one of the 4.

If I see things that could get better or need attention, I point it out. But that's it! If developers don't have time for this, it's okay. BOINC is at an early stage and everyone can see that they have a lot to do to get all things working as they should.

I guess, it will take some time and then, one of the project-administrators will recognize the need of a feature and bring it in.

Keep on crunching

Yeti

Supporting BOINC, a great concept !
