I think the validator

Crunch3r

Joined: 22 Jan 05

Posts: 90

Credit: 30237616

RAC: 0

RE: P.S. The peak is

24 Feb 2007 1:51:53 UTC

Message 61156 in response to message 61155

Quote:

P.S. The peak is obviously very high; we'll see how broad it gets over the next few days. It's a little like watching the light-curve of a supernova …

Well... I still hope that we get an answer as to what caused the issue in the first place.

As for now, I'm sure that the issues were caused by switching from BOINC 506 to 507 and introducing a "bug" that hasn't been there before (that's the server version, of course).

So please tell us what caused the issues in the first place...

Thanks.

Pooh Bear 27

Joined: 20 Mar 05

Posts: 1376

Credit: 20312671

RAC: 0

Matt L post Not a guess,

24 Feb 2007 2:00:48 UTC

Message 61157

Matt L post

Not a guess, but an actual post by a BOINC developer explaining that there were database issues. If you read the whole thread, it does explain that Einstein is part of this issue.

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2989663115

RAC: 698911

RE: Here's an impressive

24 Feb 2007 10:13:53 UTC

Message 61158 in response to message 61155

Quote:

Here's an impressive illustration of how great the backlog was, from the E@h overview page on BOINCstats:

-- it appears that over sixty million credits were granted yesterday.

P.S. The peak is obviously very high; we'll see how broad it gets over the next few days. It's a little like watching the light-curve of a supernova …

I expect it will be a one-day wonder. From what I could see, they held off the stats export until the validator had finished catching up - and very sensible too.

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2989663115

RAC: 698911

RE: Matt L post Not a

24 Feb 2007 10:32:37 UTC

Message 61159 in response to message 61157

Quote:

Matt L post

Not a guess, but an actual post by a BOINC developer explaining that there were database issues. If you read the whole thread, it does explain that Einstein is part of this issue.

Matt's post explained what happened, but not why, nor how it came to escape into the wild without checking.

We've just finished 95% of a 16 million WU run with no problems at all: and I think it's generally accepted that the problems with the final 5% were more attributable to data traffic load and fileserver problems, rather than database issues.

So I don't see why the database has such problems with the new 7 million WU run. I'm afraid I just don't buy the 'tipping point' theory, that Moore's Law has speeded up the user base just enough to outrun the servers.

I'm with Crunch3r on this one: I suspect that a BOINC back-end upgrade went out the door without being fully tested. Having seen the state of readiness of the 5.8.x client range when they were declared fit for public use, it wouldn't surprise me if the same thing had happened in the server code.

If that analysis is correct, I hope the BOINC team (by which I mean the integrators and releasers of the volunteer code) learns the lesson: More Haste, Less Speed.

Pooh Bear 27

Joined: 20 Mar 05

Posts: 1376

Credit: 20312671

RAC: 0

Bruce Allen's post on the

24 Feb 2007 22:26:45 UTC

Message 61160

Bruce Allen's post on the subject

archae86

Joined: 6 Dec 05

Posts: 3161

Credit: 7305098356

RAC: 2291032

RE: Bruce Allen's post on

25 Feb 2007 1:08:04 UTC

Message 61161 in response to message 61160

Quote:

Bruce Allen's post on the subject

If I understand Bruce's post correctly, the late-stage problems in the previous run were directly related to one of this run's problems. When the supply of "short" (as we call them) work units ran lower than the demand (from slow systems), the behavior of the BOINC code very badly bogged down the system with massive slow searches.

Annika

Joined: 8 Aug 06

Posts: 720

Credit: 494410

RAC: 0

Yep, I understood it the same

25 Feb 2007 2:51:14 UTC

Message 61162

Yep, I understood it the same way. Maybe that's why the old P3 here didn't get any more work. No idea, but it sounds kinda logical. Don't worry, it hasn't been idle; my Dad developed a liking for Quantum Monte Carlo, so the box was busy there. The post just made me wonder.
Well, never mind. What is more important is that everything now seems to be up and running again, and we even got some very good, detailed information about the problems and how they were fixed. Thanks to all the project staff and especially to Dr. Bruce Allen for his latest post (hope you read it over here) and... well, a belated "Welcome to Germany".

Bruce Allen

Moderator

Joined: 15 Oct 04

Posts: 1119

Credit: 172127663

RAC: 0

RE: Thanks to all the

27 Feb 2007 10:08:51 UTC

Message 61163 in response to message 61162

Quote:

Thanks to all the project staff and especially to Dr. Bruce Allen for his latest post (hope you read it over here) and... well, a belated "Welcome to Germany".

You're welcome -- and thank you!

Director, Einstein@Home
