pending credit

Novasen169
Joined: 14 May 06
Posts: 43
Credit: 2767204
RAC: 0

RE: Thank you, that sure

Message 88665 in response to message 88664

Quote:
Thank you, that sure sheds a little light on the issue. If that work could be saved, it would make the whole project more efficient.


It would only make ATLAS's contribution to it more efficient. The fact that there are pending credits in itself doesn't cause inefficiency or affect the speed of the project in any way.

But still, saving that time (possibly a few hours per timeslice) would be really nice :)

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

I looked at some of

I looked at some of archae86's hosts and did see some "anonymous" hosts of Opteron 1212, Opteron 275, and Xeon X5355 flavors all running Linux, which are likely cluster nodes. They might not be, but it's a reasoned guess... :-)

Anyway, a few of those had timed out, but others showed "client detached". For those that were detached, like he had mentioned, there was sometimes a significant delay between the detach and the issuance of another replication.

I still think there are multiple things that can be done here. The discussion about sending aborts and having the nodes report in that they've been aborted is a good one. I also believe that getting the 6.06 app out to the Windows user base will help. The faster the data is crunched, the more quickly hosts can become available for a new data set.

Once the 6.06 app is out and becomes the stock application, I think a reduction in the deadline back to 14 days is appropriate and will also help reduce the level of pending credit...

As always, IMO, YMMV, etc, etc, etc...

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

I don't know if setting the

I don't know if setting the deadline tighter is such a good idea. Given my fleet, that's the last thing I want to see! ;-)

All that does is reduce the host population which is effectively running the project. It either cuts some hosts off from participation completely or, in the case of hosts which run a lot of projects, makes their contribution more spotty due to accumulating debt issues when EAH comes up for a run slot. IOW, the host grabs a task, which 'hogs' the machine to beat the tight deadline given its situation, then disappears from the project for an extended period to pay back the debt to the other projects.

IMHO, if there is no time pressure to get the work back quickly (like MW for example) then the deadline should be set as loose as possible in keeping with the science goals of the project and the ability of the backend to track all the outstanding work without grinding to a halt.

I've said it many times before, who cares how long a task stays pending as long as you get credit for it ultimately? Over the course of weeks, months, years, and/or decades, it makes virtually zero difference to any of the metrics you can measure (including RAC) for a host. :-)

I just cleared a bunch of pendings which had been sitting around for a month or so. So my RAC was dropping some while I was waiting for wingmen to catch up. Now it's back where it usually is, and the world didn't stop turning in the meantime! :-D

Alinator

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: I don't know if setting

Message 88668 in response to message 88667

Quote:
I don't know if setting the deadline tighter is such a good idea. Given my fleet, that's the last thing I want to see! ;-)

Is it possible that the fact that you have multiple slower hosts adds some bias to your opinion? That seems to be what you're stating...

Ever since the deadline was increased from 14 days to 18 days, the number of complaints about EDF has dropped off significantly, perhaps even to nearly zero.

One of the conditions I stated a long time ago when I asked for the deadline increase was that when the Windows app had SSE, then I felt it could be brought back down to 14 days. I didn't bring the issue up in S5R3 and early on in S5R4 because there was still a significant difference between the Windows and Linux apps. That difference has been greatly reduced now, so I think now is the appropriate time to consider bringing it back to where it was.

To further explain why, I believe that at the time I requested the increase my AMD system was taking 12-15 hours per task. I'm now down to 8-9 hours per task.

Bernd said that deadlines can easily be changed during the run. If changing it back to 14 causes a problem, it can be brought back up.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

While we're suggesting

While we're suggesting multiple classes of host, why not do a fast/slow host split as well? The fast hosts could be set with a 1 week deadline, and the slow hosts could have something like 3 or 4 weeks so that people still crunching on their P2-400s or with EDFobia and 50 projects running would be able to complete work normally. Defaults could be set automatically via initial benchmarks and reported runtime/duty cycles, with user overrides available for the obsessive-compulsive.
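
A rough sketch of how such a split might look, purely as an illustration. The field names, the 2-day cutoff, and the 28-day slow deadline are assumptions picked for the example, not anything BOINC or the project actually implements:

```python
from dataclasses import dataclass
from typing import Optional

# Deadlines from the suggestion above: 1 week for fast hosts,
# "3 or 4 weeks" for slow ones (4 weeks assumed here).
FAST_DEADLINE_DAYS = 7
SLOW_DEADLINE_DAYS = 28


@dataclass
class Host:
    est_task_hours: float           # hypothetical: CPU hours per task, from benchmarks
    duty_cycle: float               # hypothetical: fraction of wall time given to this project (0..1)
    override: Optional[str] = None  # user override: "fast" or "slow"


def expected_turnaround_days(host: Host) -> float:
    """Estimate wall-clock days needed to finish one task."""
    if host.duty_cycle <= 0:
        return float("inf")  # host never runs the project
    return host.est_task_hours / (24.0 * host.duty_cycle)


def classify(host: Host, fast_cutoff_days: float = 2.0) -> str:
    """Return "fast" or "slow"; a user override always wins."""
    if host.override in ("fast", "slow"):
        return host.override
    if expected_turnaround_days(host) <= fast_cutoff_days:
        return "fast"
    return "slow"


def deadline_days(host: Host) -> int:
    """Deadline in days for the host's class."""
    return FAST_DEADLINE_DAYS if classify(host) == "fast" else SLOW_DEADLINE_DAYS
```

With these made-up numbers, a host needing 9 CPU hours per task at a 50% duty cycle lands in the fast class, while a host needing 60 hours at a 5% share lands in the slow class unless its owner overrides it.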

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: While we're suggesting

Message 88670 in response to message 88669

Quote:
While we're suggesting multiple classes of host, why not do a fast/slow host split as well? The fast hosts could be set with a 1 week deadline, and the slow hosts could have something like 3 or 4 weeks so that people still crunching on their P2-400s or with EDFobia and 50 projects running would be able to complete work normally. Defaults could be set automatically via initial benchmarks and reported runtime/duty cycles, with user overrides available for the obsessive-compulsive.

While a good thought, it sets up a scenario where someone will complain because their system got classed either way, they feel it should be the other, and they got short-changed because they missed a deadline and the 3rd host reported in before they got their replication back...

It is really rare that there are "Einstein is hogging my CPU" posts now, so 14 days should be enough once the switching app becomes the stock app, just like how 18 days is enough now, particularly in light of the sheer number of P4 machines connected to this project that will benefit from SSE2...

Novasen169
Joined: 14 May 06
Posts: 43
Credit: 2767204
RAC: 0

RE: IMHO, if there is no

Message 88671 in response to message 88667

Quote:

IMHO, if there is no time pressure to get the work back quickly (like MW for example) then the deadline should be set as loose as possible in keeping with the science goals of the project and the ability of the backend to track all the outstanding work without grinding to a halt.

I've said it many times before, who cares how long a task stays pending as long as you get credit for it ultimately? Over the course of weeks, months, years, and/or decades, it makes virtually zero difference to any of the metrics you can measure (including RAC) for a host. :-)


^

Even though I don't have hosts that even get near the deadline (usually they're < 1 day average turnaround time), I completely agree with you.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 735640271
RAC: 1278788

I don't really care about

I don't really care about pending credits (or credits anyway). However, the very existence and length of this thread indicates that many people DO care and do not feel comfortable with a large backlog of pending credits (for whatever reasons).

CU
Bikeman

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: I don't really care

Message 88673 in response to message 88672

Quote:
I don't really care about pending credits (or credits anyway). However, the very existence and length of this thread indicates that many people DO care and do not feel comfortable with a large backlog of pending credits (for whatever reasons).

Yep. It's a balancing act between the complaints about "hogging the CPU" (defined as EDFobia) and the complaints about not getting that gratification of seeing credits go up in a "timely fashion". Deadlines should be responsible, not too long, not too short. I don't think that going back to 14 days along with having most everyone using SSE2 is going to be detrimental. It will take 4 days off of the wait time for an absent host. It might increase the pressure on hosts with larger caches, but performance has grown by at least 30% over the course of S5R4, and 30% of 18 days is 5.4 days. In other words, the 4-day reduction to a 14-day deadline is smaller than the 5.4 days gained from the improvement, so it still errs on the side of favoring the slower / less-allocated hosts.
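
To make the arithmetic concrete, here is a small sketch. The 18-day and 14-day deadlines and the 30% figure come from the post; the two ways of reading "30%" are an assumption added for illustration, and both give a margin larger than the 4-day cut:

```python
CURRENT_DEADLINE_DAYS = 18
PROPOSED_DEADLINE_DAYS = 14
IMPROVEMENT = 0.30  # "at least 30%" over the course of S5R4


def days_freed_by_runtime_cut(deadline_days, fraction):
    """The post's arithmetic: treat the 30% as a straight cut in time needed."""
    return deadline_days * fraction


def days_freed_by_throughput_gain(deadline_days, fraction):
    """Alternative reading: hosts do (1 + fraction) times the work per day."""
    return deadline_days - deadline_days / (1.0 + fraction)


days_cut = CURRENT_DEADLINE_DAYS - PROPOSED_DEADLINE_DAYS
runtime_margin = days_freed_by_runtime_cut(CURRENT_DEADLINE_DAYS, IMPROVEMENT)
throughput_margin = days_freed_by_throughput_gain(CURRENT_DEADLINE_DAYS, IMPROVEMENT)

print(f"deadline cut:             {days_cut} days")
print(f"freed (runtime reading):  {runtime_margin:.1f} days")
print(f"freed (throughput reading): {throughput_margin:.1f} days")
```

Under the stricter throughput reading the margin is slimmer, but the proposed cut still stays inside it.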

mikey
Joined: 22 Jan 05
Posts: 12706
Credit: 1839111974
RAC: 3614

RE: RE: I don't really

Message 88674 in response to message 88673

Quote:
Quote:
I don't really care about pending credits (or credits anyway). However, the very existence and length of this thread indicates that many people DO care and do not feel comfortable with a large backlog of pending credits (for whatever reasons).

Yep. It's a balancing act between the complaints about "hogging the CPU" (defined as EDFobia) and the complaints about not getting that gratification of seeing credits go up in a "timely fashion". Deadlines should be responsible, not too long, not too short. I don't think that going back to 14 days along with having most everyone using SSE2 is going to be detrimental. It will take 4 days off of the wait time for an absent host. It might increase the pressure on hosts with larger caches, but performance has grown by at least 30% over the course of S5R4, and 30% of 18 days is 5.4 days. In other words, the 4-day reduction to a 14-day deadline is smaller than the 5.4 days gained from the improvement, so it still errs on the side of favoring the slower / less-allocated hosts.

BUT if the Project has a lot of older PCs crunching for it, your idea could eliminate some of them. That is also a balancing act for the Project. Make it a 1 day deadline and only the best of the best PCs and those connected to no other projects could crunch here. That would eliminate most of the people here. As we go through the years, more and more of the older PCs will have to be dropped off as unable to keep up, but that point will always be contentious. Some people really do believe in the Project's idea and contribute because of that. Others just crunch because they can.
One thing the Project could do, though it may be a BOINC thing, is pair PCs up a little better. Pair up a PC that returns units in less than, say, 7 days with another that is doing the same thing, then those that take more than 7 days with others doing the same. As I said, this may be a BOINC thing, not a Project thing. That would help solve the problem of those with fast PCs complaining, and those with slower PCs would already understand. A little education on the Project's part would go a long way to getting people used to the idea.
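
A minimal sketch of the pairing idea, assuming the scheduler has an average turnaround time per host available. The function name, the dictionary input, and the handling of leftover hosts are hypothetical and only meant to show the grouping, not how BOINC assigns replications:

```python
def pair_hosts(turnaround_days, cutoff_days=7.0):
    """Pair hosts with similar turnaround: under the cutoff together, the rest together.

    turnaround_days maps a host name to its average turnaround in days.
    Returns (pairs, unpaired), where unpaired holds the odd host out of each group.
    """
    fast = sorted((h for h, d in turnaround_days.items() if d < cutoff_days),
                  key=turnaround_days.get)
    slow = sorted((h for h, d in turnaround_days.items() if d >= cutoff_days),
                  key=turnaround_days.get)

    pairs, unpaired = [], []
    for group in (fast, slow):
        # Walk the sorted group two at a time so neighbours are paired.
        for i in range(0, len(group) - 1, 2):
            pairs.append((group[i], group[i + 1]))
        if len(group) % 2:
            unpaired.append(group[-1])
    return pairs, unpaired


hosts = {"a": 1.0, "b": 3.0, "c": 10.0, "d": 12.0, "e": 2.0}
pairs, unpaired = pair_hosts(hosts)
```

Any host left without a partner in its own group would still need a wingman from somewhere, which is one of the practical questions such a scheme would have to answer.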
