pending credit

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4349

Credit: 253250781

RAC: 39198

26 Nov 2008 11:21:38 UTC

Topic 194053

I'm afraid that the main reason for the sometimes large amount of pending credit lies in our "monster crunchers": ATLAS and the two GRID accounts (eScience and Britta Daudert).

ATLAS' normal usage is quite bursty and barely predictable, and so is its contribution to Einstein@home. This would suggest a small work cache size, but I found this to stress our scheduler too much at times when ATLAS runs Einstein@home almost exclusively (on >50% of the nodes). The current cache size of 4 days keeps the project healthy, but leads to a lot of tasks being held "captive" when the normal load rises and ATLAS is not running Einstein@home.

The GRID jobs run for a fixed number of seconds, and thus they request work for only this amount of time. The scheduler will give them the least amount of work that runs longer than the requested duration. With the large number of machines there will be many tasks hanging over the cliff that will never be completed once the "job" terminates. In addition, BOINC's duration prediction isn't perfect to begin with, and this kind of job scheduling is something that BOINC was not designed for, which makes it even worse.
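
To put rough numbers on the "hanging over the cliff" effect, here is a minimal sketch. It assumes perfect runtime prediction, one task at a time per node, and made-up runtimes; the function names are hypothetical and are not part of the BOINC scheduler.

```python
def tasks_assigned(requested_seconds, task_runtime_seconds):
    """Smallest number of tasks whose total runtime exceeds the request."""
    return requested_seconds // task_runtime_seconds + 1


def stranded_tasks(requested_seconds, task_runtime_seconds, nodes):
    """Tasks left unfinished across all nodes when their jobs terminate."""
    assigned = tasks_assigned(requested_seconds, task_runtime_seconds)
    # Only tasks that fit completely into the job's time slice get finished.
    completed = requested_seconds // task_runtime_seconds
    return (assigned - completed) * nodes


if __name__ == "__main__":
    # Illustrative values only: a 10 hour job slice and 25000 second tasks.
    print(tasks_assigned(36000, 25000))
    print(stranded_tasks(36000, 25000, nodes=1000))
```

Under these assumptions every node strands exactly one task per job, so the number of captive tasks grows with the number of nodes rather than with the length of the time slice.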

For a small number of machines you probably wouldn't notice, but these accounts have literally thousands of machines and are the largest single contributors to Einstein@home, so any fluctuation there is highly noticeable to the whole project. You probably saw the pendings rise (and the oldest unsent result getting older) when we took ATLAS off E@H completely for a while.

If anybody knows a reasonably easy way to lower the pendings that arise from the facts stated above, I'd do my best to implement it.

Novasen169

Joined: 14 May 06

Posts: 43

Credit: 2767204

RAC: 0

pending credit

26 Nov 2008 13:24:44 UTC

Message 88655

I don't really have a solution, but it isn't that bad to have pending credits, right? You get them anyway after a while, and whether you get them now or in a few weeks... I don't really see the problem. Especially since ATLAS is making a big contribution to the project.

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0

I have more problems getting

26 Nov 2008 14:22:10 UTC

Message 88656

I have more problems getting credits at SETI. Deadlines are longer and all Astropulse results, even if crunched by two users, need a third result to validate. So I am not worried even if ATLAS is a frequent wingman.
Tullio

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 3014827265

RAC: 1062243

RE: ... and all Astropulse

26 Nov 2008 16:31:51 UTC

Message 88657 in response to message 88656

Quote:

... and all Astropulse results, even if crunched by two users, need a third result to validate ...

Sorry Tullio, but I think this is a red herring, not relevant to the Einstein question.

So far as I'm aware (and I've been pretty closely involved in Astropulse testing, both the project's own applications and the third-party optimised applications), the only time "all" Astropulse results need a third validator is when one of the participants is using a third-party optimised Linux build, which unfortunately was released into the wild without the same thorough testing as the third-party optimised Windows builds.

There are other Astropulse issues, such as a project application upgrade (equivalent to S5R3/S5R4 here) without adequate demarcation between the two - they don't validate against each other. But the only universal problem is the buggy Linux build, and the solution is in the users' own hands: don't run it.

With regard to the Einstein issue: am I right in thinking that the major problem, especially with the GRID nodes, is that once the allocated number of seconds has passed, the virtual machine is effectively wiped clean, and starts afresh the next time it is tasked with an Einstein time-slice? As contrasted with an ordinary home PC, which preserves its state and data files between runs? If so, the only solution would seem to be a mechanism for sweeping Einstein 'work-in-progress' files off to a server or NAS box before the time runs out, and retrieving them at the start of the next run.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 815018942

RAC: 1270991

Hi! I was thinking before

26 Nov 2008 17:04:31 UTC

Message 88658

Hi!

I was thinking before that the cluster accounts (more precisely, all the hosts that belong to them) could be marked (e.g. by a script) with a special location code that is not selectable from the web GUI, something like "CLUSTER", in addition to the standard ones (like home, work, school). This would be a first step to mark them as special and allow customized scheduler code to take this into account and treat them differently.

But how should they be treated differently? One way would be to favor pairing cluster hosts with cluster wingmen. Clusters don't care about credit or complain about pending credits (and download bandwidth should also not be a top concern), so one option would be to reserve or prioritize certain subsets/segments of the overall set of results to be computed by cluster hosts.
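
As a rough illustration of the pairing idea, here is a minimal sketch. The "CLUSTER" venue code, the workunit dictionaries and the function names are all assumptions made up for this example; none of this is existing BOINC scheduler code.

```python
CLUSTER = "CLUSTER"  # hypothetical venue code, not an existing BOINC value


def is_cluster(venue):
    return venue == CLUSTER


def choose_workunit(host_venue, workunits):
    """Pick a workunit for a host, preferring a wingman of the same kind.

    Each workunit is a dict with a 'name' and a 'partner_venue': the venue of
    the host already holding the other result, or None if nothing was sent yet.
    """
    def rank(wu):
        partner = wu["partner_venue"]
        if partner is None:
            return 1  # no wingman yet: neutral choice
        if is_cluster(partner) == is_cluster(host_venue):
            return 0  # wingman of the same kind: best match
        return 2  # mixed pairing: last resort

    return min(workunits, key=rank) if workunits else None


if __name__ == "__main__":
    pool = [
        {"name": "wu_a", "partner_venue": "home"},
        {"name": "wu_b", "partner_venue": None},
        {"name": "wu_c", "partner_venue": CLUSTER},
    ]
    print(choose_workunit(CLUSTER, pool)["name"])
    print(choose_workunit("home", pool)["name"])
```

Mixed pairings stay possible as a fallback here, so no host would go without work just because no matching wingman is available.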

Would that work? Should not be that much effort I guess.

Bikeman

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4349

Credit: 253250781

RAC: 39198

RE: With regard to the

26 Nov 2008 17:06:51 UTC

Message 88659 in response to message 88657

Quote:

With regard to the Einstein issue: am I right in thinking that the major problem, especially with the GRID nodes, is that once the allocated number of seconds has passed, the virtual machine is effectively wiped clean, and starts afresh the next time it is tasked with an Einstein time-slice? As contrasted with an ordinary home PC, which preserves its state and data files between runs? If so, the only solution would seem to be a mechanism for sweeping Einstein 'work-in-progress' files off to a server or NAS box before the time runs out, and retrieving them at the start of the next run.

Honestly I'm not sure how they manage their hostids and if this could be easily implemented. A task reported from a different host than it was previously assigned to would not be accepted by the server.

I'm not sure that this helps, but I'll propose to them to send the client a message to abort all tasks in progress or detach from the project before the job actually terminates. This will at least report the tasks as client errors and create a new unsent task immediately instead of waiting for the deadline timeout.

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 3014827265

RAC: 1062243

RE: RE: With regard to

26 Nov 2008 17:31:56 UTC

Message 88660 in response to message 88659

Quote:

Quote:
With regard to the Einstein issue: am I right in thinking that the major problem, especially with the GRID nodes, is that once the allocated number of seconds has passed, the virtual machine is effectively wiped clean, and starts afresh the next time it is tasked with an Einstein time-slice? As contrasted with an ordinary home PC, which preserves its state and data files between runs? If so, the only solution would seem to be a mechanism for sweeping Einstein 'work-in-progress' files off to a server or NAS box before the time runs out, and retrieving them at the start of the next run.

Honestly I'm not sure how they manage their hostids and if this could be easily implemented.

If the statefiles could be preserved between runs, then the hostid wouldn't be a problem - the hostid is stored in client_state.xml, after all.

archae86

Joined: 6 Dec 05

Posts: 3164

Credit: 7371041687

RAC: 2217440

I don't have a giant fleet,

26 Nov 2008 17:32:33 UTC

Message 88661

I don't have a giant fleet, but in the small sample given by my three serious hosts, the main source of pending in the last couple of months has not been long waits for unresponsive quorum partners (cluster or not), but rather long delay in issue.

In current conditions it seems far more common than in the past for there to be a delay of multiple days from the issue of the _0 result to a first quorum partner until the issue of the _1 result to the second. If one of the two initial partners errors out, it is rather common for there to be a multiple-day delay before the _2 result is issued. When things got really bad I had long strings of consecutive results which waited about ten days for partner issue. Even now two to three days is very, very common.

Perhaps this situation (which was not the norm in the farther past) is somehow an indirect effect of the impact of the clusters on the dynamics of work allocation--but it is not just a matter of simple waiting for cluster partners, at least not in my case.

As I know nothing of how the code works, I can only wave my arms vigorously. But to the extent that there is prioritization between the competing goals of avoiding sending out new datafiles (such as l1_0728.30_S5R4) and of avoiding too long a delay in issuing the next result for a WU that has one sent and one unsent result, the weighting penalty assigned to new datafile issue could be altered relative to the weighting given to delayed issue.
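
Purely as arm-waving in code form, the trade-off could be sketched like this. The scoring function, its parameters and the default weights are invented for illustration and do not reflect how the Einstein@Home scheduler actually decides.

```python
def issue_score(needs_new_datafile, days_waiting,
                datafile_penalty=3.0, delay_weight=1.0):
    """Higher score means the unsent result should be issued sooner.

    Waiting time raises the score; having to ship a new datafile to the
    requesting host lowers it by a fixed penalty.
    """
    penalty = datafile_penalty if needs_new_datafile else 0.0
    return delay_weight * days_waiting - penalty


if __name__ == "__main__":
    # A result that has waited 5 days but needs a new datafile on this host,
    # compared with a fresh result for a datafile the host already has.
    print(issue_score(True, 5))
    print(issue_score(False, 0))
    # With a much larger penalty the waiting result loses out again.
    print(issue_score(True, 5, datafile_penalty=10.0))
```

Lowering `datafile_penalty` or raising `delay_weight` in such a scheme would shorten partner-issue delays at the cost of more datafile downloads.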

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0

RE: So far as I'm aware

26 Nov 2008 23:47:09 UTC

Message 88662 in response to message 88657

Quote:

So far as I'm aware (and I've been pretty closely involved in Astropulse testing, both the project's own applications and the third-party optimised applications), the only time "all" Astropulse results need a third validator is when one of the participants is using a third-party optimised Linux build, which unfortunately was released into the wild without the same thorough testing as the third-party optimised Windows builds.

There are other Astropulse issues, such as a project application upgrade (equivalent to S5R3/S5R4 here) without adequate demarcation between the two - they don't validate against each other. But the only universal problem is the buggy Linux build, and the solution is in the users' own hands: don't run it.

I would not use it if the standard app weren't so slow (115 hours) on my Opteron 1210 CPU. Now it takes 55 hours.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5887

Credit: 119365312612

RAC: 25848888

RE: RE: With regard to

27 Nov 2008 7:14:59 UTC

Message 88663 in response to message 88659

Quote:

Quote:
With regard to the Einstein issue: am I right in thinking that the major problem, especially with the GRID nodes, is that once the allocated number of seconds has passed, the virtual machine is effectively wiped clean, and starts afresh the next time it is tasked with an Einstein time-slice? As contrasted with an ordinary home PC, which preserves its state and data files between runs? If so, the only solution would seem to be a mechanism for sweeping Einstein 'work-in-progress' files off to a server or NAS box before the time runs out, and retrieving them at the start of the next run.

Honestly I'm not sure how they manage their hostids and if this could be easily implemented. A task reported from a different host than it was previously assigned to would not be accepted by the server.

Actually, a task can be completed by a totally different host. It's something that I do quite regularly. If I want to add a new and faster host to my fleet, I will quite often pick an older, slower machine to shut down and transfer the entire BOINC folder (including any partially crunched tasks) from the old host to the new one (via a network share). It's a quick and painless process and because the hostID is transferred as part of client_state.xml, there is no wastage of hostIDs.

I think you should look seriously at Bikeman's suggestion. Why couldn't a grid node which is about to complete its time slice simply stop BOINC and then save the entire BOINC directory to a network share? Then a node about to start a time slice could go find one of these saved directories, load it up, and then resume crunching from where the previous node left off. The new node would simply adopt the hostID that it finds in the state file that it inherits and absolutely nothing would be wasted.

You just need a mechanism for managing the flock of saved directories (a numeric naming system that increments in some fashion for example) and some way of ensuring (a lock mechanism) that two nodes can't simultaneously grab the same saved directory.
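
A minimal sketch of such a claim step might look like the following. It assumes the saved BOINC directories sit in one pool directory under names like "saved-0001", and it relies on a rename within the same filesystem being atomic, which should be checked for the particular network share in use. All names and the naming scheme are hypothetical.

```python
import os


def claim_saved_directory(pool_dir, node_name):
    """Atomically claim one saved BOINC directory from the shared pool.

    Returns the path of the claimed directory, or None if none is available.
    """
    for entry in sorted(os.listdir(pool_dir)):
        if not entry.startswith("saved-"):
            continue  # skip directories already claimed by another node
        source = os.path.join(pool_dir, entry)
        claimed = os.path.join(pool_dir, "claimed-" + node_name + "-" + entry)
        try:
            # The rename acts as the lock: only one node can succeed.
            os.rename(source, claimed)
        except OSError:
            continue  # another node won the race, try the next entry
        return claimed
    return None


if __name__ == "__main__":
    # "/shared/boinc-pool" is a placeholder path for illustration only.
    pool = "/shared/boinc-pool"
    if os.path.isdir(pool):
        print(claim_saved_directory(pool, "node42"))
```

A node finishing its time slice would do the reverse: stop BOINC, then move its directory back into the pool under a fresh "saved-" name.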

Quote:

I'm not sure that this helps, but I'll propose to them to send the client a message to abort all tasks in progress or detach from the project before the job actually terminates. This will at least report the tasks as client errors and create a new unsent task immediately instead of waiting for the deadline timeout.

BM

I'm sure you don't need to abort tasks in progress. That's just a waste. You should be able to save and reuse.

Cheers,
Gary.

Huff

Joined: 5 Jan 06

Posts: 36

Credit: 1378476

RAC: 0

Thank you, that sure sheds a

27 Nov 2008 7:24:11 UTC

Message 88664

Thank you, that sure sheds a little light on the issue. If that work could be saved, it would make the whole project more efficient.
