[error] Can't parse workunit in scheduler reply

Steven Bradbury
Joined: 15 Aug 10
Posts: 7
Credit: 169538212
RAC: 0
Topic 197442

One of my computers has 20 completed work units but will not upload them and retrieve new work. I see this in the log:

3/11/2014 11:18:06 AM | Einstein@Home | [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
3/11/2014 11:18:06 AM | Einstein@Home | [error] No close tag in scheduler reply

Any idea on correcting this?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3008191262
RAC: 766719

You have updated 7 of your 10 computers from BOINC v7.2.39 to v7.2.42

If this is one of the ones you haven't updated yet, I suggest you do so and try again. v7.2.39 has a bug which can block communications.

Steven Bradbury
Joined: 15 Aug 10
Posts: 7
Credit: 169538212
RAC: 0

This particular computer is running v7.2.42

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

Mind telling us which hostID it is? Perhaps its scheduler log will show us something.

Steven Bradbury
Joined: 15 Aug 10
Posts: 7
Credit: 169538212
RAC: 0

ID: 6242068

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

OK, it shows that host asking for work.

Quote:
2014-03-11 16:17:59.6346 [PID=7717 ] [send] CPU: req 261360.00 sec, 12.00 instances; est delay 0.00
2014-03-11 16:17:59.6346 [PID=7717 ] [send] CUDA: req 21780.00 sec, 1.00 instances; est delay 0.00
2014-03-11 16:17:59.6346 [PID=7717 ] [send] work_req_seconds: 261360.00 secs


It shows that all the work it just reported was already reported before.

Quote:
2014-03-11 16:17:59.6428 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_384_0 refused: result already reported as success
2014-03-11 16:17:59.6428 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_383_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.35_S6Directed__S6CasAf40a_840.75Hz_713_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_382_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.35_S6Directed__S6CasAf40a_840.75Hz_712_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_381_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_380_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.35_S6Directed__S6CasAf40a_840.75Hz_711_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_379_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_378_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_377_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.40_S6Directed__S6CasAf40a_840.85Hz_676_2 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_376_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_375_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.35_S6Directed__S6CasAf40a_840.75Hz_710_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result p2030.20131017.G50.89-00.62.S.b0s0g0.00000_176_1 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_374_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result PA0092_00651_102_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result LATeah0058C_32.0_1854_-5.02e-10_0 refused: result already reported as success
2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Completed result p2030.20131017.G50.89-00.62.S.b5s0g0.00000_64_1 refused: result already reported as success
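For anyone who wants to tally lines like these instead of counting by eye, here is a minimal Python sketch. The regular expression is only inferred from the log lines quoted above, not taken from any documented scheduler log format, and the function name `summarize_refusals` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the quoted log lines; not an official format.
REFUSED = re.compile(
    r"\[HOST#(?P<host>\d+)\] MSG\(high\) Completed result (?P<name>\S+) "
    r"refused: (?P<reason>.+)$"
)

def summarize_refusals(lines):
    """Count refused results per (host, reason) and collect result names."""
    counts = Counter()
    names = []
    for line in lines:
        m = REFUSED.search(line.strip())
        if m:
            counts[(m.group("host"), m.group("reason"))] += 1
            names.append(m.group("name"))
    return counts, names

# Three lines copied from the scheduler log quoted in this thread.
sample = [
    "2014-03-11 16:17:59.6428 [PID=7717 ] [debug] [HOST#6242068] MSG(high) "
    "Completed result h1_0840.30_S6Directed__S6CasAf40a_841.05Hz_384_0 "
    "refused: result already reported as success",
    "2014-03-11 16:17:59.6429 [PID=7717 ] [debug] [HOST#6242068] MSG(high) "
    "Completed result PA0092_00651_102_0 "
    "refused: result already reported as success",
    "2014-03-11 16:17:59.6430 [PID=7717 ] Sending reply to [HOST#6242068]: "
    "1 results, delay req 60.00",
]
counts, names = summarize_refusals(sample)
```

A count that matches the number of finished tasks sitting on the host would fit the picture of a client repeating the same report over and over.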


And it shows that it resent one lost task to that host:

Quote:
2014-03-11 16:17:59.6408 [PID=7717 ] [HOST#6242068] Sending [RESULT#425797446 h1_0840.25_S6Directed__S6CasAf40a_841.4Hz_1_2] (est. dur. 36997.34 seconds)
2014-03-11 16:17:59.6430 [PID=7717 ] [debug] [HOST#6242068] MSG(high) Resent lost task h1_0840.25_S6Directed__S6CasAf40a_841.4Hz_1_2
2014-03-11 16:17:59.6430 [PID=7717 ] Sending reply to [HOST#6242068]: 1 results, delay req 60.00


So what does it now show in the event log of that host?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 3008191262
RAC: 766719

All those messages sound right for a host which has sent in a sched_request, but failed to parse the sched_reply - and hence sent the same set of requests in again.

If it doesn't clear up by itself (perhaps after restarting BOINC), we might have to ask for copies of those two files.

Steven Bradbury
Joined: 15 Aug 10
Posts: 7
Credit: 169538212
RAC: 0

I ended up just uninstalling, removing everything in c:\program data\BOINC, and re-installing. Not sure what happened, but it's working again.

Thanks,

Steven Bradbury
Joined: 15 Aug 10
Posts: 7
Credit: 169538212
RAC: 0

Well, I take that back, it's still not working correctly. It downloaded two work units and now I'm getting this again:

[error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
[error] No close tag in scheduler reply

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

Sounds as if there's a lack of an EOL character in the sched_reply*.xml file, just before the last end-tag. But why it's on your system only, I don't know. I'll try to get one of the admins to look in here, and if need be, you'll have to send your sched_request_einstein.phys.uwm.edu.xml and sched_reply_einstein.phys.uwm.edu.xml files to him, so he can check that they're doing what they're supposed to do.

Don't post them here, please, as they contain your account key. With that, anyone reading could do harm to your account.
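If you want to look at the tail of the reply file yourself without sharing it, a rough local check could look like the Python sketch below. It assumes the reply is wrapped in a `scheduler_reply` element (the `root` parameter, which is an assumption and can be changed), and the helper name `inspect_reply_tail` is hypothetical. It reports whether the file ends with the closing tag and whether a line break sits right before it.

```python
def inspect_reply_tail(text, root="scheduler_reply"):
    """Return (has_close_tag, newline_before_close_tag) for a reply file.

    `root` is an assumed element name; adjust it if your file differs.
    """
    close = f"</{root}>"
    stripped = text.rstrip()
    has_close = stripped.endswith(close)
    newline_before = False
    if has_close:
        # Look at what precedes the closing tag, ignoring spaces and tabs.
        before = stripped[: -len(close)]
        newline_before = before.rstrip(" \t").endswith("\n")
    return has_close, newline_before

# Usage on a local file (run it in the BOINC data directory, keep the output private):
# with open("sched_reply_einstein.phys.uwm.edu.xml", encoding="utf-8", errors="replace") as fh:
#     print(inspect_reply_tail(fh.read()))

well_formed = "<scheduler_reply>\n<x/>\n</scheduler_reply>\n"
no_eol = "<scheduler_reply>\n<x/></scheduler_reply>"
cut_off = "<scheduler_reply>\n<workunit>\n<na"
```

This only looks at the end of the file; it says nothing about whether the rest of the XML is valid.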

Now, I see that your hostID changed; it's now #10728002, if I am not mistaken. So here's a general question: why is hostID #6242068 now all of a sudden an anonymous platform, with zero credit?

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

OK, an update. There's been a flurry of activity in the background since I emailed the administrators & moderators about this. This appears to be a problem that occurs when the scheduler reply doesn't fit into the XML buffer of the BOINC client. That happens when a task requires many files to download (as is the case for GW (S6CasA) work) and there are many download mirrors available (though only 4 in this case).
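To make the mechanism concrete, here is a toy Python sketch of how a reply that outgrows a fixed-size buffer ends up without its close tag. Everything in it is an assumption for illustration: the element names are a simplified stand-in rather than the real reply layout, the mirror URLs are placeholders, and `BUFFER_SIZE` is a made-up number, since the client's actual buffer size isn't stated in this thread.

```python
import xml.etree.ElementTree as ET

BUFFER_SIZE = 4096  # hypothetical size, chosen only for the demonstration

def build_reply(n_files, n_mirrors):
    """Build a toy reply: one file entry per file, one URL per mirror."""
    parts = ["<scheduler_reply>"]
    for f in range(n_files):
        parts.append("<file_info>")
        parts.append(f"<name>file_{f}.dat</name>")
        for m in range(n_mirrors):
            parts.append(f"<url>http://mirror{m}.example/file_{f}.dat</url>")
        parts.append("</file_info>")
    parts.append("</scheduler_reply>")
    return "\n".join(parts)

def parses_cleanly(text):
    """True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

small = build_reply(n_files=2, n_mirrors=4)   # fits in the buffer
large = build_reply(n_files=40, n_mirrors=4)  # does not fit

# Cutting the text at BUFFER_SIZE mimics a reply that overflowed the buffer.
small_ok = parses_cleanly(small[:BUFFER_SIZE])
large_ok = parses_cleanly(large[:BUFFER_SIZE])
```

The reply size grows roughly with files times mirrors, which is why fewer mirrors or fewer files per task would both shrink it.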

There are a couple of options to remedy this. Some are server-side, one requires an updated BOINC client that isn't available yet (so no need to look for it), and one could be done by you. The administrators prefer to try the server-side changes first, just in case someone else runs into the same problem.

In any case, don't do anything yet. I'll post a further update later on.
