Results page shows 27 in progress, BOINC only shows 14?

Mystic
Joined: 24 Feb 05
Posts: 7
Credit: 806007
RAC: 0
Topic 190556

Hello all,

I have two new issues that showed up this morning. I requested new work and received three new WUs, but my results page now shows 16 new WUs, 13 of which never made it to me.

Second issue: I completed 3 WUs overnight, signed on and uploaded them, and the client keeps showing me this:

Einstein@Home - 2006-01-08 12:05:22 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2006-01-08 12:05:28 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2006-01-08 12:05:28 - Already have result r1_1002.0__919_S4R2a_0
Einstein@Home - 2006-01-08 12:05:28 - Already have result r1_1002.0__918_S4R2a_0
Einstein@Home - 2006-01-08 12:05:28 - Already have result r1_1002.0__917_S4R2a_0
Einstein@Home - 2006-01-08 12:07:07 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2006-01-08 12:07:12 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2006-01-08 12:07:12 - Already have result r1_1002.0__919_S4R2a_0
Einstein@Home - 2006-01-08 12:07:12 - Already have result r1_1002.0__918_S4R2a_0
Einstein@Home - 2006-01-08 12:07:12 - Already have result r1_1002.0__917_S4R2a_0
Einstein@Home - 2006-01-08 12:12:09 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2006-01-08 12:12:13 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2006-01-08 12:12:14 - Already have result r1_1002.0__919_S4R2a_0
Einstein@Home - 2006-01-08 12:12:14 - Already have result r1_1002.0__918_S4R2a_0
Einstein@Home - 2006-01-08 12:12:14 - Already have result r1_1002.0__917_S4R2a_0

Any ideas for a culprit or a fix for these?

Sharky T
Joined: 19 Feb 05
Posts: 159
Credit: 1187722
RAC: 0

Maybe it's that way-old 4.19 client that wants to retire.. ;)


Mystic
Joined: 24 Feb 05
Posts: 7
Credit: 806007
RAC: 0

He has kicked butt through all this, I was hoping he would make it a little longer...

I just figured out that the three it says "already have result" for are the three that I did receive this morning. But then why is it trying to resend those three and not the ones I didn't get, if that is what it's doing? And why won't it send in the results for the 3 that I did complete overnight?

Sharky T
Joined: 19 Feb 05
Posts: 159
Credit: 1187722
RAC: 0

Maybe the EAH servers are about to crap out.
Even your average download speed looks, well.. let's say a bit fast. LOL
Which ones are you trying to upload? (I'll try to look in the scheduler logs to see if I can find any clues.)
*edit* Found this in the scheduler logs:
2006-01-08 18:06:52.8874 [PID=26165] [debug ] REQUEST_METHOD=POST CONTENT_TYPE=application/octet-stream HTTP_ACCEPT= HTTP_USER_AGENT=
2006-01-08 18:06:52.8874 [PID=26165] [debug ] CONTENT_LENGTH=6719
2006-01-08 18:06:54.1277 [PID=26165] [normal ] Handling request: host 494017, platform windows_intelx86, version 4.19.0, RSF 1.000000
2006-01-08 18:06:54.1277 [PID=26165] [normal ] OS version Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
2006-01-08 18:06:54.1335 [PID=26165] [debug ] Request [HOST#494017] Database [HOST#494017] Request [RPC#50] Database [RPC#49]
2006-01-08 18:06:54.1345 [PID=26165] [normal ] Processing request [HOST#494017] [RPC#50] core client version 4.19.0
2006-01-08 18:06:54.1408 [PID=26165] [normal ] [HOST#494017] [RESULT#13706602 r1_1002.0__959_S4R2a_0] got result
2006-01-08 18:06:54.1408 [PID=26165] [CRITICAL] [HOST#494017] [RESULT#13706602 r1_1002.0__959_S4R2a_0] got result twice
2006-01-08 18:06:54.1409 [PID=26165] [normal ] [HOST#494017] [RESULT#13706614 r1_1002.0__958_S4R2a_0] got result
2006-01-08 18:06:54.1409 [PID=26165] [CRITICAL] [HOST#494017] [RESULT#13706614 r1_1002.0__958_S4R2a_0] got result twice
2006-01-08 18:06:54.1409 [PID=26165] [normal ] [HOST#494017] [RESULT#13910792 r1_1002.0__954_S4R2a_1] got result
2006-01-08 18:06:54.1409 [PID=26165] [CRITICAL] [HOST#494017] [RESULT#13910792 r1_1002.0__954_S4R2a_1] got result twice
2006-01-08 18:06:54.1417 [PID=26165] [normal ] sending delay request 61.000000

Anybody out there who can decode this?
I read it as you sending the same results over again..??
At 18:05 it already got those results. And it does this again at 18:11..
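
For anyone who wants to check a longer scheduler log for the same thing, here is a minimal Python sketch that counts how often each result is flagged "got result twice". The regular expression is an assumption based only on the line layout in the excerpt above, and `find_duplicate_reports` is a made-up helper name, not anything shipped with BOINC.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   [RESULT#<id> <result name>] got result twice
DUPLICATE_RE = re.compile(r"\[RESULT#(\d+) (\S+)\] got result twice")

def find_duplicate_reports(log_text):
    """Count how often each result name is flagged as 'got result twice'."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DUPLICATE_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# Three lines copied from the scheduler log quoted above.
sample = """\
2006-01-08 18:06:54.1408 [PID=26165] [normal ] [HOST#494017] [RESULT#13706602 r1_1002.0__959_S4R2a_0] got result
2006-01-08 18:06:54.1408 [PID=26165] [CRITICAL] [HOST#494017] [RESULT#13706602 r1_1002.0__959_S4R2a_0] got result twice
2006-01-08 18:06:54.1409 [PID=26165] [CRITICAL] [HOST#494017] [RESULT#13706614 r1_1002.0__958_S4R2a_0] got result twice
"""

for name, count in find_duplicate_reports(sample).items():
    print(name, count)
```

A result that shows up here on several separate requests would fit the reading that the client keeps re-reporting work the server has already accepted.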


Mystic
Joined: 24 Feb 05
Posts: 7
Credit: 806007
RAC: 0

OK, I really don't understand the logs, so forgive me if I'm totally off base here. I was looking at my initial contact and found this:

17:59:47.5281 [PID=23678] [debug ] get_working_set_filename(): returning r1_1120.5
2006-01-08 17:59:47.5282 [PID=23678] [debug ] send_new_file_working_set will try filename r1_1120.5
2006-01-08 17:59:47.5376 [PID=23678] [debug ] in_send_results_for_file(r1_1120.5, 0) prev_result.id=0
2006-01-08 17:59:47.7402 [PID=23678] [debug ] est cpu dur 18103.792152; running_frac 0.997095; rsf 1.000000; est 18156.536892
2006-01-08 17:59:47.7409 [PID=23678] [debug ] [HOST#494017] Sending app_version albert windows_intelx86 437
2006-01-08 17:59:47.7421 [PID=23678] [debug ] est cpu dur 18103.792152; running_frac 0.997095; rsf 1.000000; est 18156.536892
2006-01-08 17:59:47.7421 [PID=23678] [normal ] [HOST#494017] Sending [RESULT#14108942 r1_1120.5__1310_S4R2a_2] (fills 18156.54 seconds)
2006-01-08 17:59:47.7432 [PID=23678] [normal ] [HOST#494017] Sent 13 results [scheduler ran 27.638383 seconds]
2006-01-08 17:59:47.7442 [PID=23678] [normal ] sending delay request 61.000000

I received nothing from this batch. Also, for this time period, the times on the right-hand side seem to jump around a lot. Is this normal, just something I hadn't noticed before?

Then, a few minutes later, since my computer didn't receive the original work, it requested additional new work, and since I had already "received" 13, it sent out three more, which I did receive fine, to make the daily quota of 16.

18:01:41.4521 [PID=24417] [normal ] [HOST#494017] Sent 3 results [scheduler ran 2.872020 seconds]

Thoughts?
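
The 13 + 3 = 16 arithmetic can be checked against the scheduler's own "Sent N results" lines. This is only a rough Python sketch: the pattern is assumed from the two log lines quoted in this post, the quota of 16 is the figure mentioned above, and `total_sent` is a made-up helper name.

```python
import re

# "Sent N results" wording is taken from the scheduler log lines quoted above.
SENT_RE = re.compile(r"Sent (\d+) results")
DAILY_QUOTA = 16  # the daily quota mentioned in the post

def total_sent(log_lines):
    """Add up the counts from every 'Sent N results' line."""
    total = 0
    for line in log_lines:
        match = SENT_RE.search(line)
        if match:
            total += int(match.group(1))
    return total

lines = [
    "2006-01-08 17:59:47.7432 [PID=23678] [normal ] [HOST#494017] Sent 13 results [scheduler ran 27.638383 seconds]",
    "18:01:41.4521 [PID=24417] [normal ] [HOST#494017] Sent 3 results [scheduler ran 2.872020 seconds]",
]

sent = total_sent(lines)
print(f"scheduler sent {sent} results against a daily quota of {DAILY_QUOTA}")
```

If the total matches the quota while the client holds fewer tasks, the difference is the set of results the server believes it sent but the host never received.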

Sharky T
Joined: 19 Feb 05
Posts: 159
Credit: 1187722
RAC: 0

You mean the times on the left side, don't you? ;)
I've seen jumps like that before, but this one looks big.
Maybe the server had too much to do right at that time.
I noticed one thing though. It requested a delete of r1_0099.0 after
the 12 r1_1002.0 units and then the last r1_1120.5 unit.
Then it requested another delete of the same r1_0099.0 2 minutes later, when you got your 3 successful downloads.
There's one idea I have in my head of what might have happened, but I really don't think so..
But what the heck, here's the story..
If you limited your HD space in prefs, or in some other way refused to let it accept more data, it couldn't receive the downloaded stuff until the r1_0099.0 file was gone. And the last r1_1120.5 unit was too close behind it to succeed.
2 minutes go by and now you could download some more r1_1002.0 units.
But while I was writing this I saw this line in there: "available disk 117.698731 GB", so I guess that crashes my theory.. LOL
Man.. I think I'll go watch some TV..


Ziran
Joined: 26 Nov 04
Posts: 194
Credit: 615123
RAC: 904

Something is real fishy here. First of all, I think you've got a ghost WU problem. A fix for that was implemented in later versions of BOINC, so an upgrade would hopefully fix it. Do a search for ghost WU; it was discussed a couple of months ago. Are you behind a proxy?

Looking at the logs, something is really messed up here. The host's logs in your original post refer to the last 3 results you downloaded at 18:01.

The logs from the scheduler refer to the last 3 results you reported at 18:05. According to the scheduler logs, you are trying to report the 3 results you reported at 18:05 a second time.

So the host and the scheduler aren't talking about the same 3 results. This is as far as I can help you. Does anyone else have any suggestions?

Then you're really interested in a subject, there is no way to avoid it. You have to read the Manual.

Mystic
Joined: 24 Feb 05
Posts: 7
Credit: 806007
RAC: 0

Lol, yeah, left, the other right. I have had it set to leave 10 GB free, with normally ~120 GB free, so I don't think that's the issue. (I have also set that to 200 for now to stop any more work from coming in until this is figured out.)
No proxy involved.
And I did read about the ghost workunits when all that became a big issue, and I have downloaded the newest version; I was just hoping to make it to 100,000 credits on the original client I started with.

Update:
Completed work still hanging around: I have since finished another WU. It uploaded and was sent in fine, but the client will not remove it from the work screen. The results page shows it as received and pending, but it's still there with the other 3 from this morning that will not disappear.

On the ghost WU side of the issue, I still get this, for the 3 that I did receive:

Einstein@Home - 2006-01-08 18:02:23 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2006-01-08 18:02:27 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2006-01-08 18:02:27 - Already have result r1_1002.0__919_S4R2a_0
Einstein@Home - 2006-01-08 18:02:27 - Already have result r1_1002.0__918_S4R2a_0
Einstein@Home - 2006-01-08 18:02:27 - Already have result r1_1002.0__917_S4R2a_0

Mystic
Joined: 24 Feb 05
Posts: 7
Credit: 806007
RAC: 0

I was going through the logs on the client and found this; it should have happened on the last upload that worked properly. Any ideas what it means, lol?

Einstein@Home - 2006-01-07 18:51:35 - Error on file upload: [r1_1002.0__961_S4R2a_0_0] locked by file_upload_handler PID=14480

And I have since lowered my "connect to network" setting to 0.1 days, but the client does not see this. I have updated it a couple of times and it keeps saying:

--- - 2006-01-08 22:21:44 - May run out of work in 2.50 days; requesting more
Einstein@Home - 2006-01-08 22:21:44 - Requesting 216342 seconds of work
Einstein@Home - 2006-01-08 22:21:44 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2006-01-08 22:21:48 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2006-01-08 22:21:48 - Already have result r1_1002.0__919_S4R2a_0
Einstein@Home - 2006-01-08 22:21:48 - Already have result r1_1002.0__918_S4R2a_0
Einstein@Home - 2006-01-08 22:21:48 - Already have result r1_1002.0__917_S4R2a_0
Einstein@Home - 2006-01-08 22:21:48 - No work from project
Einstein@Home - 2006-01-08 22:21:48 - Deferring communication with project for 13 minutes and 53 seconds

Anyone really understand this stuff, 'cause it's sure going over my head?
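
One way to see that the lowered setting has not taken effect is to convert the numbers in the log above. This is just a back-of-the-envelope Python sketch; how the 4.19 client actually works out its request size is an assumption here, and only the two figures (216342 seconds and 0.1 days) come from this post.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def seconds_to_days(seconds):
    """Convert a work request in seconds to days."""
    return seconds / SECONDS_PER_DAY

def days_to_seconds(days):
    """Convert a preference given in days to seconds."""
    return days * SECONDS_PER_DAY

requested_seconds = 216342  # from "Requesting 216342 seconds of work"
new_setting_days = 0.1      # the lowered "connect to network" value

print(round(seconds_to_days(requested_seconds), 2), "days requested")
print(round(days_to_seconds(new_setting_days)), "seconds would match the new setting")
```

The request works out to roughly 2.5 days, which lines up with the "May run out of work in 2.50 days" message rather than with a 0.1 day setting (8640 seconds), so the client still appears to be using the old value.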

S@NL - Marleen
Joined: 18 Jan 05
Posts: 25
Credit: 4068135
RAC: 0

Message 23799 in response to message 23798

Quote:

I was going through the logs on the client and found this; it should have happened on the last upload that worked properly. Any ideas what it means, lol?

Einstein@Home - 2006-01-07 18:51:35 - Error on file upload: [r1_1002.0__961_S4R2a_0_0] locked by file_upload_handler PID=14480

Looks like a process (file_upload_handler, which I guess is a part of BOINC) on your computer is holding the result file open ("locked"). That means the file cannot be deleted etc. by another (BOINC) process, and Einstein gets confused.

I take it you didn't log out or restart your computer while this was going on? Doing that will end such a process and release the file.
You can also open the Task Manager and kill that process.

After this, I expect that Einstein will maybe repeat that message "Already have result" once, but then it should start working normally.
If you get that "locked by file_upload_handler" again, reinstall BOINC or upgrade to a newer version.

Good luck!

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Mystic, it looks as if things are working for you again. You may have had a file upload problem. Could you confirm that all is now OK?

Director, Einstein@Home
