What's up with LHC at Home?

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 317951182

RAC: 390674

Does all this mean that they

26 Jan 2009 22:05:21 UTC

Message 89936

Does all this mean that they send out 5, and when 3 come in they cancel the other 2 ??

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Phil

Joined: 24 Feb 05

Posts: 176

Credit: 1817881

RAC: 0

RE: Does all this mean that

26 Jan 2009 22:22:29 UTC

Message 89937 in response to message 89936

Quote:

Does all this mean that they send out 5, and when 3 come in they cancel the other 2 ??

Cheers, Mike.

They send 5, and once a quorum of 3 has been returned, any work that hasn't been started is cancelled when the other two hosts next contact the server.
This means that anyone who returns a correct result earns credit, but it could be considered wasteful, as that result is not used in any way.

There are a few other projects that do this, but only with work that lasts a few minutes, so that queues of jobs on hosts can be managed quickly.

Dagorath

Joined: 22 Apr 06

Posts: 146

Credit: 226423

RAC: 0

RE: Does all this mean that

26 Jan 2009 22:55:32 UTC

Message 89938 in response to message 89936

Quote:

Does all this mean that they send out 5, and when 3 come in they cancel the other 2 ??

Cheers, Mike.

They try to cancel the other 2, but if the host has already started crunching the task then it doesn't get canceled. That is only fair to the user: he starts the task on the presumption that it's needed, so he should get credit for any work he does even if it's not really needed. It's fair as far as credits are concerned, but it's still a very wasteful strategy for getting the batch of work units done quickly. There are other strategies that get the batch finished just as quickly but have next to 0 waste.
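
For anyone who wants to play with the numbers, here is a rough sketch of the send-5 / quorum-of-3 behaviour described above. It is only an illustration: the `Task` class and both function names are made up for this example and are not BOINC server code, and it ignores the detail that a cancel only reaches a host when that host next contacts the server.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical record of one replica of a work unit on one host.
    host: str
    started: bool = False
    returned: bool = False
    cancelled: bool = False


def cancel_unstarted(tasks, min_quorum):
    """Cancel replicas that have not started once the quorum is in.

    Replicas already being crunched are left alone, so they can still
    finish and earn credit. Returns the number of replicas cancelled.
    """
    returned = sum(1 for t in tasks if t.returned)
    if returned < min_quorum:
        return 0  # quorum not reached yet, every replica is still needed
    cancelled = 0
    for t in tasks:
        if not t.returned and not t.started and not t.cancelled:
            t.cancelled = True
            cancelled += 1
    return cancelled


def wasted_fraction(tasks, min_quorum):
    """Fraction of all replicas that ran beyond what the quorum needed."""
    ran = sum(1 for t in tasks if t.returned or (t.started and not t.cancelled))
    surplus = max(0, ran - min_quorum)
    return surplus / len(tasks)
```

With 5 replicas and a quorum of 3, the worst case is that both extra replicas have already started, so nothing can be cancelled and 2 of 5 replicas (40%) are surplus. If one of the two extras has not started, it is cancelled and the surplus drops to 1 of 5 (20%).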

BOINC FAQ Service
Official BOINC wiki
Installing BOINC on Linux

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 317951182

RAC: 390674

Thanks for that. There's a

26 Jan 2009 23:23:27 UTC

Message 89939

Thanks for that. There's a few queries on this following scenario :

- quorum returned already. I'm one of the two later ones. I contact the server. Having started a WU ( say ~ half way by now ) I'll get credit from this point only if (a) I complete AND (b) I'm correct. ??

- if so, then the quorum is going to determine 'correctness' by that point?

versus this scenario :

- quorum not returned already. I'm one of the three earlier ones. I contact the server. Having finished a WU I'll get credit from this point only if (a) I'm correct. ??

- if so, then the quorum ( which is part me ) is going to determine 'correctness' by this stage?

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Donald A. Tevault

Joined: 17 Feb 06

Posts: 439

Credit: 73516529

RAC: 0

RE: RE: RE: RE: Quote

26 Jan 2009 23:26:23 UTC

Message 89940 in response to message 89935

Quote:

Quote:
Quote:
Quote:
Quote:
How is having more people wanting to crunch LHC work than LHC has meaningful work to provide an LHC problem? They could increase replication or create junk WUs to keep a continual supply, but the only results would be increased server load on their end and less real science done boincwide.

What you mean "they could"? They already do replicate more tasks than necessary and because of it less real science gets done boincwide.

Agreed, having the IR/MQ at 5/3 is retarded and a complete waste of 40% of the donated resources they get every time they make a science run.

This has been pointed out to them more than once, and if they couldn't care less what the impact of their bad decision is on other projects (not to mention wasting the money it takes for participants to run the useless trailers), then I couldn't care less about running their work (and don't). ;-)

Alinator

Actually, the last time I had any workunits from them, they had started canceling any uncompleted workunits that were no longer needed.

Wrong. LHC@home cancels unneeded tasks that have not started crunching. The problem is hosts regularly start tasks even though the work unit has already achieved quorum. Therefore the wasted effort continues, as many of us predicted it would when they announced that they were looking at implementing cancels.

Well, that's actually what I meant. I just didn't word it well.

Donald A. Tevault

Joined: 17 Feb 06

Posts: 439

Credit: 73516529

RAC: 0

RE: RE: But, have you

26 Jan 2009 23:30:13 UTC

Message 89941 in response to message 89934

Quote:

Quote:

But, have you tried accessing their site lately? Here, every time I try to connect, I get a "connection error" message. And, it's been like that for the past several months. It shouldn't be due to anything on my end, since everything else I access works just fine.

Seriously, the site is fine, both the website and with BOINC Manager.
Can you reach them on your BOINC Manager and get a "No Work" message from them, or is neither working?

No, even with BOINC, I get connection errors.

Not that I'm worried about missing any work. Mainly, I'm just curious about the project's status.

Dagorath

Joined: 22 Apr 06

Posts: 146

Credit: 226423

RAC: 0

RE: RE: RE: But, have

27 Jan 2009 0:26:55 UTC

Message 89942 in response to message 89941

Quote:

Quote:
Quote:

But, have you tried accessing their site lately? Here, every time I try to connect, I get a "connection error" message. And, it's been like that for the past several months. It shouldn't be due to anything on my end, since everything else I access works just fine.

Seriously, the site is fine, both the website and with BOINC Manager.
Can you reach them on your BOINC Manager and get a "No Work" message from them, or is neither working?

No, even with BOINC, I get connection errors.

Not that I'm worried about missing any work. Mainly, I'm just curious about the project's status.

Looks like a DNS problem. Try "traceroute 138.37.50.115" and post the output.

BOINC FAQ Service
Official BOINC wiki
Installing BOINC on Linux

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

RE: Thanks for that.

27 Jan 2009 1:07:40 UTC

Message 89943 in response to message 89939

Quote:

Thanks for that. There's a few queries on this following scenario :

- quorum returned already. I'm one of the two later ones. I contact the server. Having started a WU ( say ~ half way by now ) I'll get credit from this point only if (a) I complete AND (b) I'm correct. ??

- if so, then the quorum is going to determine 'correctness' by that point?

If the quorum is already formed you have to be:

1.) In before your deadline.

2.) At least weakly similar to the Canonical Result.

Quote:

versus this scenario :

- quorum not returned already. I'm one of the three earlier ones. I contact the server. Having finished a WU I'll get credit from this point only if (a) I'm correct. ??

- if so, then the quorum ( which is part me ) is going to determine 'correctness' by this stage?

Cheers, Mike.

The tricky part here is that the criterion for forming a quorum is a successful outcome for the MQ number of tasks (i.e. no errors), not necessarily a 'correct' one (passing validation). If no strongly similar pair is found in the MQ, the whole checking process is run again on all successful outcomes as additional replications are returned.

However, the catch is that even if you are part of an MQ of 3, the WU can clear validation if the other 2 were strongly similar. It's even possible for you, as the third quorum member, not to get credit for the WU if you turned out not to be at least weakly similar to the one chosen as canonical.

IOW, once a canonical is chosen, there is no point in running the task if it hasn't started yet (the whole idea behind '221' Redundant Result aborts). If you have started running it, your host is wasting its time, since its output is not used for determining the 'correctness' of the science or helping to set the 'proper' credit (for projects where that makes a difference).
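
The checking process described above could be sketched roughly like this. Treat it as a toy model only: it assumes each result is a single number, the two tolerance values are invented for illustration, the `validate` function is hypothetical and not the BOINC validator API, and deadlines are ignored.

```python
from itertools import combinations

STRONG_TOL = 1e-6  # hypothetical tolerance for "strongly similar"
WEAK_TOL = 1e-3    # hypothetical tolerance for "weakly similar"


def validate(results, min_quorum):
    """Pick a canonical result and decide which hosts get credit.

    results maps host name -> numeric output of a successful (error-free) task.
    Returns (canonical_host, credited_hosts). If the quorum is not yet formed,
    or no strongly similar pair exists, returns (None, set()) and the check
    would simply be run again when another result comes in.
    """
    if len(results) < min_quorum:
        return None, set()
    for a, b in combinations(results, 2):
        if abs(results[a] - results[b]) <= STRONG_TOL:
            canonical = a
            # Every result at least weakly similar to the canonical one is credited.
            credited = {
                host for host, value in results.items()
                if abs(value - results[canonical]) <= WEAK_TOL
            }
            return canonical, credited
    return None, set()
```

This shows the catch mentioned above: with three results where two agree closely and the third is far off, the WU validates on the strength of the agreeing pair and the third quorum member gets no credit.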

Alinator

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 317951182

RAC: 390674

Thanks Alinator. I guess

27 Jan 2009 1:50:06 UTC

Message 89944

Thanks Alinator. I guess that's workable ( as per equity ) if the 'weak similarity' test has the same tolerance in either scenario. Point taken though that work after quorum resolution doesn't contribute to scientific correctness.

[musing]

Quote:

I guess one could award some points for later returners ( based on fraction of WU thus far, say ), cancel ongoing calculations for the current WU, and then get them immediately over to fresh WU's. But you wouldn't know if a partial return was going to be correct ...... there could be a case for those who formed the quorum to cry foul.

[/musing]

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

RE: Thanks Alinator. I

27 Jan 2009 2:38:24 UTC

Message 89945 in response to message 89944

Quote:

Thanks Alinator. I guess that's workable ( as per equity ) if the 'weak similarity' test has the same tolerance in either scenario. Point taken though that work after quorum resolution doesn't contribute to scientific correctness.

[musing]
Quote:
I guess one could award some points for later returners ( based on fraction of WU thus far, say ), cancel ongoing calculations for the current WU, and then get them immediately over to fresh WU's. But you wouldn't know if a partial return was going to be correct ...... there could be a case for those who formed the quorum to cry foul.
[/musing]

Cheers, Mike.

Yep, that's why virtually all the projects today have gone to an IR/MQ of 2/2 (or less in some cases).

IIRC, the rationale for an MQ of three or more was that back in the early days of BMT scoring this would give a better representation of the 'true' credit value of the task (assuming they all passed validation).

The rationale for excess replication by default was that in the early days of BOINC there were a great many individual task failures, due to an extensive list of problems (both client and server) as well as the small host population back then. It was felt that having backup wingmen right from the start (rather than waiting for one to get shot down first) helped expedite troubleshooting and debugging efforts. It also helped to keep your pending list from growing too large! ;-)

LOL... People like to complain about how many they have nowadays. I remember when it wasn't uncommon to have three, four, or more times the number of tasks pending than I had in progress at any given time! :-D

Here on EAH now, it's unusual for me to have more than one, maybe two, and typically that only happens when one of the big clusters wanders off to do something else for a while or develops a problem. :-)

Alinator
