Server state: Over

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 6
Topic 228869

I'm currently running some of the big "Binary Radio Pulsar Search (Arecibo, large)" WUs...

They've gone past their deadline and the server status gives:

Server state: Over

 

Should they be aborted?

Or is it still worthwhile to leave running even though past the deadline?

 

One example is:

Task 1404932359

Name: p2030.20200628.G47.06-03.53.C.b5s0g0.00000_2080_0

Workunit ID: 697989736

 

 

Thanks,

Martin

 

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Keith Myers
Joined: 11 Feb 11
Posts: 5043
Credit: 19041432471
RAC: 6650609

No, abort them if they are past deadline. You are wasting CPU cycles and wasting upload/download server resources if you try to return them past deadline. The tasks have no value to the project now.

 

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 6

Thanks for clarifying that, good to know.

They're now aborted.

 

Thanks,

Martin

 


ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 6

Keith Myers wrote:

No, abort them if they are past deadline. You are wasting CPU cycles ...

From this example, that is not the full story...

 

See: Task 1430264656

[...]

Report deadline: 5 Mar 2023 16:15:48 UTC

Received: 5 Mar 2023 21:47:24 UTC

Server state: Over

Outcome: Success

Client state: Done

[...]

Validation state: Valid

Granted credit: 693

[...]

Note that the server state was listed as "Over", and yet the result was accepted even though it was reported a few hours late.

 

My guess is that you can still meaningfully return a result, provided no one else has already returned a result and provided you can still see the details listed for your particular task.

 

Happy crunchin'!

Martin

 

 

 


Keith Myers
Joined: 11 Feb 11
Posts: 5043
Credit: 19041432471
RAC: 6650609

Yes, this can happen, though rarely, when your returned task is validated after the deadline. But you have to get your result in before any other host has been sent the follow-up wingman task.

I wouldn't count on those circumstances happening consistently.

 

mikey
Joined: 22 Jan 05
Posts: 12831
Credit: 1884037515
RAC: 1027048

Keith Myers wrote:

Yes, this can happen, though rarely, when your returned task is validated after the deadline. But you have to get your result in before any other host has been sent the follow-up wingman task.

I wouldn't count on those circumstances happening consistently.

Especially with people running very small caches these days at most projects.

Keith Myers
Joined: 11 Feb 11
Posts: 5043
Credit: 19041432471
RAC: 6650609

Also depends on the project. 

I have several hundred "server cancelled" wingman tasks on a couple of projects where the schedulers get antsy when a wingman task is within a couple of hours of its deadline and hasn't been returned yet. The project's scheduler pre-issues another wingman task, and then the first wingman task squeaks in under the deadline after the secondary, tertiary, etc. wingman task has been issued.

So the latest wingman task is cancelled, because consensus was achieved by the original tasks after the latest wingman task had already been issued.

And you might have a fast host that has already completed the task, all for nought, and get no credit.

So it all depends on project schedulers and circumstances.
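
To make the sequence of events concrete, here is a toy Python model of the race described above. It is only a sketch and not BOINC server code: the `Task` class, the `settle_workunit` function, the state strings, and the quorum of 2 are all assumptions made for illustration, and it assumes the returned results agree with each other.

```python
from dataclasses import dataclass


@dataclass
class Task:
    host: str
    state: str = "in progress"  # "in progress", "returned" or "server cancelled"


def settle_workunit(tasks: list[Task], quorum: int = 2) -> list[Task]:
    """Once enough results are back to reach the quorum, cancel unfinished tasks."""
    returned = [t for t in tasks if t.state == "returned"]
    if len(returned) >= quorum:
        for t in tasks:
            if t.state == "in progress":
                t.state = "server cancelled"
    return tasks


# Host A returned early; host B is slow and close to its deadline.
tasks = [Task("A", "returned"), Task("B")]
tasks.append(Task("C"))      # the scheduler pre-issues an extra wingman task
tasks[1].state = "returned"  # B squeaks in just before its deadline
settle_workunit(tasks)
print([(t.host, t.state) for t in tasks])
```

In this model host C ends up "server cancelled" even though it did nothing wrong. How eagerly a real scheduler pre-issues the extra task differs from project to project.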

 

mikey
Joined: 22 Jan 05
Posts: 12831
Credit: 1884037515
RAC: 1027048

Keith Myers wrote:

Also depends on the project. 

I have several hundred "server cancelled" wingman tasks on a couple of projects where the schedulers get antsy when a wingman task is within a couple of hours of its deadline and hasn't been returned yet. The project's scheduler pre-issues another wingman task, and then the first wingman task squeaks in under the deadline after the secondary, tertiary, etc. wingman task has been issued.

So the latest wingman task is cancelled, because consensus was achieved by the original tasks after the latest wingman task had already been issued.

And you might have a fast host that has already completed the task, all for nought, and get no credit.

So it all depends on project schedulers and circumstances.

That's true, I had forgotten about that!

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 6

For another example that didn't die:

 

Task 1444966435

From:

Name: h1_1188.80_O3aC01Cl0In0__O3MD1V2a_VelaJr1_1189.50Hz_96_0

Workunit ID: 717664498

Created: 22 Mar 2023 19:56:32 UTC

Sent: 22 Mar 2023 22:30:23 UTC

Report deadline: 5 Apr 2023 22:30:23 UTC

Received: 1 Jan 1970 0:00:00 UTC

Server state: Over

Outcome: No reply

Client state: New

Exit status: 0 (0x00000000)

Computer: 13071036

Run time (sec): 0.00

CPU time (sec): 0.00

Peak working set size (MB): 0

Peak swap size (MB): 0

Peak disk usage (MB): 0

Validation state: Initial

Granted credit: 0

Application: Multi-Directional Gravitational Wave search on O3 (CPU) v1.03 (GW-SSE2)

To:

Name: h1_1188.80_O3aC01Cl0In0__O3MD1V2a_VelaJr1_1189.50Hz_96_0

Workunit ID: 717664498

Created: 22 Mar 2023 19:56:32 UTC

Sent: 22 Mar 2023 22:30:23 UTC

Report deadline: 5 Apr 2023 22:30:23 UTC

Received: 9 Apr 2023 17:35:41 UTC

Server state: Over

Outcome: Success

Client state: Done

Exit status: 0 (0x00000000)

Computer: 13071036

Run time (sec): 175,043.60

CPU time (sec): 152,640.40

Peak working set size (MB): 3324.66

Peak swap size (MB): 3343.07

Peak disk usage (MB): 15.36

Validation state: Valid

Granted credit: 1,000

Application: Multi-Directional Gravitational Wave search on O3 (CPU) v1.03 (GW-SSE2) x86_64-pc-linux-gnu

 

 

 

So... Despite being 4 days late, for that particular example the result was still accepted and still gained credit.
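
As a quick check on the lateness figures, here is a minimal Python sketch that parses the timestamps as the task pages print them and computes how far past the deadline each result arrived. The helper names `parse_utc` and `lateness` are made up for this illustration. Treating "1 Jan 1970 0:00:00 UTC" as "not yet received" is an assumption, based on that date being Unix time zero.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_utc(text: str) -> datetime:
    """Parse a task-page timestamp such as '5 Apr 2023 22:30:23 UTC'."""
    parsed = datetime.strptime(text, "%d %b %Y %H:%M:%S UTC")
    return parsed.replace(tzinfo=timezone.utc)


def lateness(deadline: str, received: str) -> Optional[timedelta]:
    """Return how late a result was (negative if early), or None if not received."""
    got = parse_utc(received)
    if got == EPOCH:  # assumption: Unix time zero means "no reply yet"
        return None
    return got - parse_utc(deadline)


# Task 1444966435 (gravitational wave search)
print(lateness("5 Apr 2023 22:30:23 UTC", "9 Apr 2023 17:35:41 UTC"))
# Task 1430264656
print(lateness("5 Mar 2023 16:15:48 UTC", "5 Mar 2023 21:47:24 UTC"))
# The "From:" snapshot above, before anything had been reported
print(lateness("5 Apr 2023 22:30:23 UTC", "1 Jan 1970 0:00:00 UTC"))
```

This prints `3 days, 19:05:18`, `5:31:36` and `None`, so the gravitational wave task came in a little under four days late and the earlier task about five and a half hours late.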

 

Happy crunchin'!

Martin

 

 

 

 

 

 

 

 


Scrooge McDuck
Joined: 2 May 07
Posts: 1095
Credit: 18445883
RAC: 11009

ML1 wrote:
So... Despite being 4 days late, for that particular example the result was still accepted and still gained credit.

There's no wingman for O3MD1 CPU tasks. E@H's scheduler is in no hurry to reassign such delayed tasks. I think that's also because Einstein@Home heavily modified the BOINC server software to optimize for task locality. The very large data files for O3MD1 CPU tasks should be sent to as few user hosts as possible (saving network bandwidth), so that each client host processes as many tasks as possible that are close together (different parameters, same raw data). Waiting many days past the deadline before reassigning a task seems to be a smart strategy for O3MD1 CPU tasks.

It's different for other science apps, like BRP4 and BRP4X64, where there are wingmen and each workunit processes different raw data files. No task locality...
