"Resent Lost Result"

Collin
Joined: 4 Dec 05
Posts: 19
Credit: 71274
RAC: 0
Topic 191790

I did a search and perhaps didn't phrase it correctly...

What does "Resent Lost Result" mean beyond the obvious? What does it exactly mean from the server? Does this mean the server lost the result or...?

Michael Karlinsky
Joined: 22 Jan 05
Posts: 888
Credit: 23502182
RAC: 0

"Resent Lost Result"

Quote:

I did a search and perhaps didn't phrase it correctly...

What does "Resent Lost Result" mean beyond the obvious? What does it exactly mean from the server? Does this mean the server lost the result or...?

No, you lost the result, either by reinstalling and reattaching to E@H (and merging your hosts) or because it was lost during transfer. The scheduler checks whether you have all the results you should have according to the database. If not, the missing results are sent again.

Michael

Collin
Joined: 4 Dec 05
Posts: 19
Credit: 71274
RAC: 0

RE: RE: I did a search

Message 45057 in response to message 45056

Quote:
Quote:

I did a search and perhaps didn't phrase it correctly...

What does "Resent Lost Result" mean beyond the obvious? What does it exactly mean from the server? Does this mean the server lost the result or...?

No, you lost the result, either by reinstalling and reattaching to E@H (and merging your hosts) or because it was lost during transfer. The scheduler checks whether you have all the results you should have according to the database. If not, the missing results are sent again.

Michael

Most likely they were lost during transfer for whatever reason. Neither of the other two possibilities you mentioned applies. :)

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

RE: RE: RE: I did a

Message 45058 in response to message 45057

Quote:
Quote:
Quote:

I did a search and perhaps didn't phrase it correctly...

What does "Resent Lost Result" mean beyond the obvious? What does it exactly mean from the server? Does this mean the server lost the result or...?

No, you lost the result, either by reinstalling and reattaching to E@H (and merging your hosts) or because it was lost during transfer. The scheduler checks whether you have all the results you should have according to the database. If not, the missing results are sent again.

Michael

Most likely they were lost during transfer for whatever reason. Neither of the other two possibilities you mentioned applies. :)

Collin, the occasional lost result is nothing to worry about. But if this is happening frequently or always, then it is an indication that something is wrong. In that case please ask for help.
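
To judge whether resends are "occasional" or "frequent", one option is to count them in the client's message log. The following is only a rough sketch: it assumes log lines of the form `YYYY-MM-DD HH:MM:SS [Project] Message from server: Resent lost result <name>`, and the helper name `resent_by_day` is made up for illustration, not part of BOINC.

```python
import re
from collections import Counter

# Assumed line format: date, time, [project], then the server message.
RESENT = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) \S+ \[[^\]]*\] "
    r"Message from server: Resent lost result (\S+)$"
)


def resent_by_day(lines):
    """Count distinct resent results per day (hypothetical helper).

    A message that is logged twice for the same result on the same
    day is counted only once.
    """
    seen = set()
    for line in lines:
        match = RESENT.match(line.strip())
        if match:
            seen.add((match.group(1), match.group(2)))
    return Counter(day for day, _name in seen)


sample = [
    "2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1",
    "2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1",
    "2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__347_S5R2c_2",
    "2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded",
]

for day, count in sorted(resent_by_day(sample).items()):
    print(day, count)
```

A day with one or two resends is probably nothing; the same results coming back day after day would be worth reporting.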

Cheers,
Bruce

Director, Einstein@Home

arcturus
Joined: 11 Feb 05
Posts: 44
Credit: 1008160
RAC: 0

After more or less flawless

After more or less flawless operation I'm getting this ...

2007-07-03 08:25:22 [Einstein@Home] Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
2007-07-03 08:25:22 [Einstein@Home] Reason: To fetch work
2007-07-03 08:25:22 [Einstein@Home] Requesting 172800 seconds of new work
2007-07-03 08:25:28 [Einstein@Home] Scheduler request succeeded
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__347_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__347_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__339_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__339_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__334_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__334_S5R2c_1
2007-07-03 08:25:30 [Einstein@Home] Started download of file einstein_S5R2_4.21_i686-pc-linux-gnu
2007-07-03 08:25:30 [Einstein@Home] Started download of file einstein_S5R2_4.21_i686-pc-linux-gnu.so
2007-07-03 08:25:44 [Einstein@Home] Finished download of file einstein_S5R2_4.21_i686-pc-linux-gnu.so
2007-07-03 08:25:44 [Einstein@Home] Throughput 166243 bytes/sec
2007-07-03 08:25:55 [Einstein@Home] Finished download of file einstein_S5R2_4.21_i686-pc-linux-gnu
2007-07-03 08:25:55 [Einstein@Home] Throughput 231012 bytes/sec
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Using earliest-deadline-first scheduling because computer is overcommitted.
2007-07-03 08:25:56 [Einstein@Home] Starting task h1_0502.00_S5R2__340_S5R2c_1 using einstein_S5R2 version 421
2007-07-03 08:25:59 [---] Suspending work fetch because computer is overcommitted.

This is the second time I've reset the project, but it seems the server wants to send the same (possibly corrupt?) work units back. Note this is a Linux box.

Resolution?

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 812105525
RAC: 1268723

RE: After more or less

Message 45060 in response to message 45059

Quote:

After more or less flawless operation I'm getting this ...

2007-07-03 08:25:22 [Einstein@Home] Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
2007-07-03 08:25:22 [Einstein@Home] Reason: To fetch work
2007-07-03 08:25:22 [Einstein@Home] Requesting 172800 seconds of new work
2007-07-03 08:25:28 [Einstein@Home] Scheduler request succeeded
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__340_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__347_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__347_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__339_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__339_S5R2c_2
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__334_S5R2c_1
2007-07-03 08:25:28 [Einstein@Home] Message from server: Resent lost result h1_0502.00_S5R2__334_S5R2c_1
2007-07-03 08:25:30 [Einstein@Home] Started download of file einstein_S5R2_4.21_i686-pc-linux-gnu
2007-07-03 08:25:30 [Einstein@Home] Started download of file einstein_S5R2_4.21_i686-pc-linux-gnu.so
2007-07-03 08:25:44 [Einstein@Home] Finished download of file einstein_S5R2_4.21_i686-pc-linux-gnu.so
2007-07-03 08:25:44 [Einstein@Home] Throughput 166243 bytes/sec
2007-07-03 08:25:55 [Einstein@Home] Finished download of file einstein_S5R2_4.21_i686-pc-linux-gnu
2007-07-03 08:25:55 [Einstein@Home] Throughput 231012 bytes/sec
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Rescheduling CPU: files downloaded
2007-07-03 08:25:56 [---] Using earliest-deadline-first scheduling because computer is overcommitted.
2007-07-03 08:25:56 [Einstein@Home] Starting task h1_0502.00_S5R2__340_S5R2c_1 using einstein_S5R2 version 421
2007-07-03 08:25:59 [---] Suspending work fetch because computer is overcommitted.

This is the second time I've reset the project, but it seems the server wants to send the same (possibly corrupt?) work units back. Note this is a Linux box.

Resolution?


The science app was updated; I guess this could also be the reason for re-fetching some workunits. I'd give it a try as it is now.

CU

BRM

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5887
Credit: 119303813030
RAC: 25541915

RE: After more or less

Message 45061 in response to message 45059

Quote:

After more or less flawless operation I'm getting this ...

This is the second time I've reset the project ...

I'm just curious as to why you felt it necessary to reset twice if things were more or less "flawless"? ... :).

Resetting should be regarded as a "last resort" action when all other avenues have failed. I can't remember when I last used it so I don't have an up-to-date understanding of the precise details but I seem to recall that essentially all data and executables are thrown away and a fresh lot of everything is downloaded. I'm not sure but I believe that the server would not send you the very same data after a reset.

The "resending lost results" mechanism is a way that the server and client parts of BOINC reconcile any differences between them. For example, if the server has sent something but the client hasn't actually successfully received it, this can be automatically rectified using this mechanism. Also if a user has a disk problem which causes data to be corrupted but without losing the rest of the BOINC folder, the data will be resent. Another possibility is that if a user simply decides to delete some of his data without telling BOINC about it, BOINC will simply replace what has been deleted.
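
As a rough illustration of that reconciliation idea (not the actual BOINC scheduler code; the function name and data layout here are assumptions), the check boils down to comparing what the server's database says the host should have with what the client reports having:

```python
def find_lost_results(server_in_progress, client_reported):
    """Return results the server thinks the host has but the client did not report.

    Hypothetical helper: both arguments are plain lists of result names.
    """
    reported = set(client_reported)
    # Keep the server's ordering so the resend list is deterministic.
    return [name for name in server_in_progress if name not in reported]


# What the database says was sent to this host and is still outstanding.
server_in_progress = [
    "h1_0502.00_S5R2__340_S5R2c_1",
    "h1_0502.00_S5R2__347_S5R2c_2",
    "h1_0502.00_S5R2__339_S5R2c_2",
]
# What the client lists in its scheduler request.
client_reported = ["h1_0502.00_S5R2__347_S5R2c_2"]

for name in find_lost_results(server_in_progress, client_reported):
    print(f"Resent lost result {name}")
```

Anything in the first list but not the second is treated as lost and sent again, whatever the reason it went missing on the client side.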

Is it possible that data was deleted *before* the reset was performed? Under those circumstances BOINC might simply replace what has been deleted.

Cheers,
Gary.

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: The science app was

Message 45062 in response to message 45060

Quote:
The science app was updated; I guess this could also be the reason for re-fetching some workunits. I'd give it a try as it is now.

If you look at the results that show compute errors, they have this:

Quote:

5.4.11

process got signal 11

called
SIGABRT: abort called

They were already on version 4.21 of the Linux app when it happened.

I don't think I'd so much worry about the resends as I would about the SIGABRT issue...

arcturus
Joined: 11 Feb 05
Posts: 44
Credit: 1008160
RAC: 0

RE: I'm just curious as to

Message 45063 in response to message 45061

Quote:
I'm just curious as to why you felt it necessary to reset twice if things were more or less "flawless"? ... :).

Simple, you misinterpreted :). Everything was going just fine before doing the first reset. Obviously by the 2nd reset things weren't going well.

Quote:
Resetting should be regarded as a "last resort" action when all other avenues have failed.

LOL are you always this verbose? Frankly I'm not interested so much in the cause as I am in possible solutions. Have any?

arcturus
Joined: 11 Feb 05
Posts: 44
Credit: 1008160
RAC: 0

RE: They were already on

Message 45064 in response to message 45062

Quote:

They were already on version 4.21 of the Linux app when it happened.

I don't think I'd so much worry about the resends as I would about the SIGABRT issue...

It does seem more than coincidental that SIGABRT issues popped up around the same time. I'll do some memory diagnostics, as there don't appear to be any widespread issues with work units currently.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5887
Credit: 119303813030
RAC: 25541915

RE: RE: I'm just curious

Message 45065 in response to message 45063

Quote:
Quote:
I'm just curious as to why you felt it necessary to reset twice if things were more or less "flawless"? ... :).

Simple, you misinterpreted :). Everything was going just fine before doing the first reset. Obviously by the 2nd reset things weren't going well.

Actually, I didn't "misinterpret" at all but I fear you may have. I wasn't in any way attacking you or poking fun at you. I was simply curious as to exactly what event occurred that caused you to choose the reset option the first time.

Quote:
Quote:
Resetting should be regarded as a "last resort" action when all other avenues have failed.

LOL are you always this verbose?

Absolutely - it doesn't matter at all to me whether you think the verbosity was justified or not. There are potentially many more than you reading this, many of whom do not have your level of experience and expertise. Many of those would not have come across the "resending lost results" issue, so I simply took the opportunity to explain it more fully. I also took the opportunity to put in a plug for not immediately selecting the reset option by trying to explain the consequences. Obviously you have taken this to be an insult to your intelligence, which was certainly not intended in the slightest.

Quote:
Frankly I'm not interested so much in the cause as I am in possible solutions. Have any?

That's a very interesting point of view. To me it's much easier to find a solution to a problem if you understand what caused it in the first place. I simply go back to my original query. Exactly what event or sequence of events happened that caused you to choose the first reset? I'm not saying it was wrong to reset. I simply want to understand what it was that caused the initial disruption. My feeling is that the messages snippet you posted was from well after that stage.

Cheers,
Gary.
