Multiproject handover inconsistent at WU finish

RandyC
Joined: 18 Jan 05
Posts: 6603
Credit: 111139797
RAC: 0
Topic 187283

(Posted at both SETI and Einstein boards)

I've just added the Einstein@home project to one of my systems that was previously running SETI only, and I've noticed an inconsistency in how the SETI and Einstein clients handle project swapping when they finish a result. To summarize:

SETI client handover when finishing a result:
1. Resume calculations from previous pause
2. WU is completed in nnn minutes
3. WU is uploaded to Server and new WU calc starts
4. SETI client crunches new WU for 1 hour and pauses

Einstein client handover when finishing a result:
1. Resume calculations from previous pause
2. WU is completed in yyy minutes
3. SETI client resumes crunching
4. Einstein WU is uploaded to Server while SETI continues crunching
5. SETI client pauses after 1 hour

Obviously the separate clients are using different rules for handover when they finish a WU. Has anybody noticed similar differences (oops, that's an oxymoron) for other projects?

Turnover logs for each client:

SETI client turnover:
SETI@home - 2005-01-18 23:22:35 - Resuming result 26mr04aa.22534.24880.253394.18_1 using setiathome version 4.08
Einstein@Home - 2005-01-18 23:22:35 - Pausing result H1_0205.4__0205.7_0.1_T08_Test02_4 (left in memory)
SETI@home - 2005-01-19 00:14:46 - Computation for result 26mr04aa.22534.24880.253394.18 finished
SETI@home - 2005-01-19 00:14:47 - Starting result 27mr04aa.17618.5345.897148.234_3 using setiathome version 4.08
SETI@home - 2005-01-19 00:14:47 - Started upload of 26mr04aa.22534.24880.253394.18_1_0
SETI@home - 2005-01-19 00:14:53 - Finished upload of 26mr04aa.22534.24880.253394.18_1_0
SETI@home - 2005-01-19 00:14:53 - Throughput 4722 bytes/sec
Einstein@Home - 2005-01-19 01:14:47 - Resuming result H1_0205.4__0205.7_0.1_T08_Test02_4 using einstein version 4.71
SETI@home - 2005-01-19 01:14:47 - Pausing result 27mr04aa.17618.5345.897148.234_3 (left in memory)
Einstein@Home - 2005-01-19 02:14:47 - Pausing result H1_0205.4__0205.7_0.1_T08_Test02_4 (left in memory)
SETI@home - 2005-01-19 02:14:47 - Resuming result 27mr04aa.17618.5345.897148.234_3 using setiathome version 4.08
Einstein@Home - 2005-01-19 03:14:47 - Resuming result H1_0205.4__0205.7_0.1_T08_Test02_4 using einstein version 4.71
SETI@home - 2005-01-19 03:14:47 - Pausing result 27mr04aa.17618.5345.897148.234_3 (left in memory)

Einstein client turnover:
Einstein@Home - 2005-01-19 12:43:11 - Resuming result H1_0205.4__0205.7_0.1_T08_Test02_4 using einstein version 4.71
SETI@home - 2005-01-19 12:43:11 - Pausing result 27mr04aa.17618.5345.897148.233_2 (left in memory)
Einstein@Home - 2005-01-19 13:36:02 - Computation for result H1_0205.4__0205.7_0.1_T08_Test02 finished
SETI@home - 2005-01-19 13:36:02 - Resuming result 27mr04aa.17618.5345.897148.233_2 using setiathome version 4.08
Einstein@Home - 2005-01-19 13:36:02 - Started upload of H1_0205.4__0205.7_0.1_T08_Test02_4_0
Einstein@Home - 2005-01-19 13:36:08 - Finished upload of H1_0205.4__0205.7_0.1_T08_Test02_4_0
Einstein@Home - 2005-01-19 13:36:08 - Throughput 9585 bytes/sec
SETI@home - 2005-01-19 14:36:02 - Pausing result 27mr04aa.17618.5345.897148.233_2 (left in memory)
Einstein@Home - 2005-01-19 14:36:02 - Starting result H1_0205.4__0205.8_0.1_T08_Test02_4 using einstein version 4.71
SETI@home - 2005-01-19 15:36:02 - Resuming result 27mr04aa.17618.5345.897148.233_2 using setiathome version 4.08
Einstein@Home - 2005-01-19 15:36:02 - Pausing result H1_0205.4__0205.8_0.1_T08_Test02_4 (left in memory)
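
For anyone who wants to measure these run lengths rather than eyeball them, here is a minimal Python sketch that reads log lines in the format shown above and reports how long each project ran between a Resuming/Starting line and the next Pausing/Computation line. The function names (parse_line, run_intervals) are made up for illustration and are not part of BOINC. The sample lines are copied from the Einstein turnover log above.

```python
from datetime import datetime

def parse_line(line):
    """Split 'Project - YYYY-MM-DD HH:MM:SS - message' into its three parts."""
    project, stamp, message = line.split(" - ", 2)
    return project, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message

def run_intervals(lines):
    """Return (project, minutes) for each stretch a project spent running."""
    started = {}      # project -> time it last started or resumed
    intervals = []
    for line in lines:
        project, when, message = parse_line(line)
        if message.startswith(("Resuming result", "Starting result")):
            started[project] = when
        elif message.startswith(("Pausing result", "Computation for result")):
            # Ignore a pause whose matching start is outside the excerpt.
            if project in started:
                minutes = (when - started.pop(project)).total_seconds() / 60
                intervals.append((project, minutes))
    return intervals

log = [
    "Einstein@Home - 2005-01-19 12:43:11 - Resuming result H1_0205.4__0205.7_0.1_T08_Test02_4 using einstein version 4.71",
    "SETI@home - 2005-01-19 12:43:11 - Pausing result 27mr04aa.17618.5345.897148.233_2 (left in memory)",
    "Einstein@Home - 2005-01-19 13:36:02 - Computation for result H1_0205.4__0205.7_0.1_T08_Test02 finished",
    "SETI@home - 2005-01-19 13:36:02 - Resuming result 27mr04aa.17618.5345.897148.233_2 using setiathome version 4.08",
    "SETI@home - 2005-01-19 14:36:02 - Pausing result 27mr04aa.17618.5345.897148.233_2 (left in memory)",
]

intervals = run_intervals(log)
for project, minutes in intervals:
    print(f"{project}: ran for {minutes:.2f} minutes")
```

On this excerpt it reports the Einstein result finishing after a partial slice and SETI then getting a full hour, which matches the numbered summary above.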

Seti Classic Final Total: 11446 WU.

S@NL - Marleen
Joined: 18 Jan 05
Posts: 25
Credit: 4068135
RAC: 0

Hmm, interesting. I just started with Einstein and I haven't finished an Einstein result yet, but I did turn in and download some Seti results.
I run the 4.15 BOINC client on XP sp1.

So I did take a good look at my log and noticed several things:
Resuming Seti WU early after downloading Seti results
Einstein@Home - 2005-01-20 21:59:19 - Resuming result H1_0054.4__0054.5_0.1_T03_Test02_3 using einstein version 4.72
SETI@home - 2005-01-20 21:59:19 - Pausing result 28mr04aa.6343.22848.809642.91_2 (left in memory)
--- - 2005-01-20 22:20:51 - May run out of work in 2.00 days; requesting more
SETI@home - 2005-01-20 22:20:51 - Requesting 20052 seconds of work
SETI@home - 2005-01-20 22:20:51 - Sending request to scheduler: http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
SETI@home - 2005-01-20 22:20:55 - Scheduler RPC to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
SETI@home - 2005-01-20 22:20:55 - Started download of 28mr04aa.6343.28368.153414.37
SETI@home - 2005-01-20 22:20:55 - Started download of 28mr04aa.6343.28368.153414.30
SETI@home - 2005-01-20 22:21:03 - Finished download of 28mr04aa.6343.28368.153414.37
SETI@home - 2005-01-20 22:21:03 - Throughput 45159 bytes/sec
Einstein@Home - 2005-01-20 22:21:03 - Pausing result H1_0054.4__0054.5_0.1_T03_Test02_3 (left in memory)
SETI@home - 2005-01-20 22:21:03 - Resuming result 28mr04aa.6343.22848.809642.91_2 using setiathome version 4.08
SETI@home - 2005-01-20 22:21:06 - Finished download of 28mr04aa.6343.28368.153414.30
SETI@home - 2005-01-20 22:21:06 - Throughput 33285 bytes/sec

I was not out of Seti workunits at that point (although I was working the last one) so I don't think it's some kind of "catching up". And the three changeovers before this one were exactly one hour apart.

Also interesting are the next log entries, when the Seti WU finishes:
Very short run of Einstein WU at Seti turnover
SETI@home - 2005-01-20 22:48:17 - Computation for result 28mr04aa.6343.22848.809642.91 finished
Einstein@Home - 2005-01-20 22:48:18 - Resuming result H1_0054.4__0054.5_0.1_T03_Test02_3 using einstein version 4.72
SETI@home - 2005-01-20 22:48:18 - Started upload of 28mr04aa.6343.22848.809642.91_2_0
Einstein@Home - 2005-01-20 23:48:18 - Pausing result H1_0054.4__0054.5_0.1_T03_Test02_3 (left in memory)
SETI@home - 2005-01-20 23:48:18 - Starting result 28mr04aa.6343.28368.153414.37_0 using setiathome version 4.08
SETI@home - 2005-01-20 22:48:24 - Finished upload of 28mr04aa.6343.22848.809642.91_2_0
SETI@home - 2005-01-20 22:48:24 - Throughput 5232 bytes/sec
Einstein@Home - 2005-01-21 00:48:18 - Resuming result H1_0054.4__0054.5_0.1_T03_Test02_3 using einstein version 4.72
SETI@home - 2005-01-21 00:48:18 - Pausing result 28mr04aa.6343.28368.153414.37_0 (left in memory)

At the end of the Seti WU, the Einstein WU runs for less than one second. Then Seti takes over again and crunches for one hour.
Now I've looked at your Seti turnover log more closely, and there's a very short run of Einstein, too. So my Seti turnover looks the same.
I haven't seen the Einstein turnover yet, but I will take a look at it when it occurs.

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

At the end of a WU, the debt is rechecked. Whichever project has the largest debt gets the next hour. I have seen all projects either hand off to another or keep going.

RandyC
Joined: 18 Jan 05
Posts: 6603
Credit: 111139797
RAC: 0

Message 1383 in response to message 1382

> At the end of a WU, the debt is rechecked. Whichever project has the largest
> gets the next hour. I have seen all projects hand off to another or keep
> going.
>

Debt? I don't understand.

Seti Classic Final Total: 11446 WU.

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1384 in response to message 1383

> Debt? I don't understand.

Resource debt.

BOINC keeps track of the CPU time devoted to each project. If a project has been out of work for a while, it will have more debt (fewer CPU cycles devoted to it while it was down) than another project. As soon as it gets some work, BOINC will run this project more to balance out the resource debt (depending on resource shares, of course).

So what's happening is that BOINC finishes a WU, then looks at all the projects and switches to whatever is most in debt. Even if it might have only switched from that project a few minutes before.

Pretty savvy little application...
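
To make the switching rule concrete, here is a minimal Python sketch of the selection step only: given each project's current resource debt, pick the one with the largest. The function name next_project and the debt values are invented for illustration. This is not the actual BOINC client code, just the rule as described in this thread.

```python
def next_project(debts):
    """Return the name of the project with the largest resource debt."""
    return max(debts, key=debts.get)

# Hypothetical debts (in seconds) at the moment a WU finishes.
debts = {"SETI@home": 120.0, "Einstein@Home": 950.0}

chosen = next_project(debts)
print(f"Next hour goes to: {chosen}")
```

With these made-up numbers Einstein@Home is chosen, even if it was paused only a few minutes earlier.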

RandyC
Joined: 18 Jan 05
Posts: 6603
Credit: 111139797
RAC: 0

Message 1385 in response to message 1384

> > Debt? I don't understand.
>
> Resource debt.
>
> BOINC keeps track of the CPU time devoted to each project. If a project has
> been out of work for a while, it will have more debt (fewer CPU cycles devoted
> to it while it was down) than another project. As soon as it gets some work,
> BOINC will run this project more to balance out the resource debt (depending
> on resource shares, of course).
>
> So what's happening is that BOINC finishes a WU, then looks at all the
> projects and switches to whatever is most in debt. Even if it might have only
> switched from that project a few minutes before.
>
> Pretty savvy little application...
>
>
OK. I understand now. I've just never heard it referred to as 'debt' before.

Seti Classic Final Total: 11446 WU.

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1386 in response to message 1385

> OK. I understand now. I've just never heard it referred to as 'debt' before.

That's what the programmers call it. :)

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Message 1387 in response to message 1385

> OK. I understand now. I've just never heard it referred to as 'debt' before.

If you look in the glossary (hmmm) I don't think I have my set-up here yet ... we will do it the OLD way ...

If you go to my documentation site, look in the glossary to find the explanations of resource debt and resource share ...

Site

For all your spare time to read the manual ...

Oh, and about a good half of the explanation comes from John ... but I still own all the mistakes ...

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

Message 1388 in response to message 1387

> > OK. I understand now. I've just never heard it referred to as 'debt' before.
>
> If you look in the glossary (hmmm) I don't think I have my set-up here yet ...
> we will do it the OLD way ...
>
> If you go to my documentation site, look in the glossary to find the
> explanations of resource debt and resource share ...
>
> Site
>
> For all your spare time to read the manual ...
>
> Oh, and about a good half of the explanation comes from John ... but I still
> own all the mistakes ...
>
Yes, but that was a while back. Meanwhile, the programmers have changed everything. The resource debt is calculated once an hour, or when a WU finishes crunching or downloading. The project with the largest resource debt at any of these trigger points gets to run for a while. If a project has no work downloaded, its resource debt is set to 0.

The calculation of the resource debt is:

DebtNew = DebtOld + WallTime * ResourceFraction - CPUTime

CPUTime is the time that the CPU spent working on that WU during the last period of time. The debts are then shifted so that the lowest one is 0 (to avoid a slow creep based on CPU time != wall time).
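
As a worked example of the formula, here is a small Python sketch of one debt update. It is only an illustration of the description in this post, not the real client code. The function name update_debts, its parameters, and the sample numbers are all assumptions. One more assumption: the shift to zero is applied among projects that have work, and projects without work are then pinned at 0, since the post does not say in which order those two steps happen.

```python
def update_debts(debts, cpu_time, wall_time, fraction, has_work):
    """Apply DebtNew = DebtOld + WallTime * ResourceFraction - CPUTime.

    debts, cpu_time, fraction and has_work are dicts keyed by project name;
    wall_time is the length of the period in seconds.
    """
    # Update only the projects that currently have work downloaded.
    runnable = {
        name: debts[name] + wall_time * fraction[name] - cpu_time.get(name, 0.0)
        for name in debts
        if has_work[name]
    }
    # Shift so the lowest debt is 0 (assumed to apply to runnable projects only).
    lowest = min(runnable.values(), default=0.0)
    updated = {name: value - lowest for name, value in runnable.items()}
    # A project with no work has its debt set to 0.
    for name in debts:
        if not has_work[name]:
            updated[name] = 0.0
    return updated

# Hypothetical hour: equal resource shares, SETI ran the whole time.
old_debts = {"SETI@home": 0.0, "Einstein@Home": 0.0}
new_debts = update_debts(
    old_debts,
    cpu_time={"SETI@home": 3500.0, "Einstein@Home": 0.0},
    wall_time=3600.0,
    fraction={"SETI@home": 0.5, "Einstein@Home": 0.5},
    has_work={"SETI@home": True, "Einstein@Home": True},
)
print(new_debts)
```

In this made-up hour SETI comes out at -1700 and Einstein at +1800 before the shift, so after shifting SETI sits at 0 and Einstein at 3500, and Einstein would get the next slot.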

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Message 1389 in response to message 1388

> Yes, but that was a while back. Meanwhile, the programmers have changed
> everything. The resource debt is calculated once an hour or when a WU is
> finished crunching or downloading. The project with the largest resource debt
> at any of the trigger points gets to run for a while. If a project has no
> work downloaded then the resource debt is set to 0. The calculation of the
> resource debt is DebtNew = DebtOld + WallTime * ResourceFraction - CPUTime.
> CPUTime is the time that the CPU spent working on that WU during the last
> period of time. The debts are then shifted so that the lowest one is 0 (to
> avoid a slow creep based on CPU time != wall time).

So, I need to change the glossary I guess.

OK, I will look at it. Can you review it when I get done?

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

Message 1390 in response to message 1389

> > Yes, but that was a while back. Meanwhile, the programmers have changed
> > everything. The resource debt is calculated once an hour or when a WU is
> > finished crunching or downloading. The project with the largest resource debt
> > at any of the trigger points gets to run for a while. If a project has no
> > work downloaded then the resource debt is set to 0. The calculation of the
> > resource debt is DebtNew = DebtOld + WallTime * ResourceFraction - CPUTime.
> > CPUTime is the time that the CPU spent working on that WU during the last
> > period of time. The debts are then shifted so that the lowest one is 0 (to
> > avoid a slow creep based on CPU time != wall time).
>
> So, I need to change the glossary I guess.

Yes. However, it may change again.
>
> Ok, I will look at it, can you review it when I get done?
>
