Unexpected XML tag or syntax

Grenadier
Joined: 9 Feb 05
Posts: 14
Credit: 2823344
RAC: 0
Topic 195200

One of my hosts is getting the following error message when trying to report a result.

7/21/2010 10:11:35 AM [error] Task h1_1105.75_S5R4__737_S5GC1a: bad command line
7/21/2010 10:11:35 AM [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
7/21/2010 10:11:35 AM [error] No close tag in scheduler reply

I googled around, and found that Primegrid had a similar problem. They ended up downgrading the scheduler to fix it. Did Einstein recently upgrade to a newer scheduler and get the same bug?

Grimm
Joined: 22 Jan 05
Posts: 40
Credit: 237320425
RAC: 69050

I am also getting the same errors. The workunit referenced is not in my Task pane.

07/21/10 11:08:07 AM Einstein@Home Requesting new tasks
07/21/10 11:08:09 AM [error] Task h1_1138.45_S5R4__828_S5GC1a: bad command line
07/21/10 11:08:09 AM Einstein@Home [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
07/21/10 11:08:09 AM Einstein@Home [error] No close tag in scheduler reply

S@NL - Marleen
Joined: 18 Jan 05
Posts: 25
Credit: 4068135
RAC: 0

I have the same error. I think it is a downloading error, because I currently don't have finished workunits for Einstein and it occurred when asking for more work.

21/07/2010 16:40:42	Einstein@Home	[sched_op_debug] Starting scheduler request
21/07/2010 16:40:42	Einstein@Home	Sending scheduler request: To fetch work.
21/07/2010 16:40:42	Einstein@Home	Requesting new tasks
21/07/2010 16:40:42	Einstein@Home	[sched_op_debug] CPU work request: 340.52 seconds; 0 idle CPUs
21/07/2010 16:40:48		[error] Task h1_1094.00_S5R4__728_S5GC1a: bad command line
21/07/2010 16:40:48	Einstein@Home	[error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
21/07/2010 16:40:48	Einstein@Home	[error] No close tag in scheduler reply
21/07/2010 16:40:48	Einstein@Home	[sched_op_debug] Deferring communication for 1 min 0 sec
21/07/2010 16:40:48	Einstein@Home	[sched_op_debug] Reason: can't parse scheduler reply

This was the second attempt to download that unit; the previous attempt was at 14:28:01 (GMT+2:00), with the same result.
The workunit in question is in my workunit list here on the site, but NOT in the BOINC Manager list on my computer, so it didn't download.

Best regards,
Marleen

Fat B
Joined: 22 Jan 05
Posts: 45
Credit: 2687926
RAC: 0

I have the same problem, but it's only on 1 machine; the other 3 are working fine.

Quote:
21/07/2010 16:50:15 Einstein@Home Sending scheduler request: To fetch work.
21/07/2010 16:50:15 Einstein@Home Requesting new tasks
21/07/2010 16:50:19 Einstein@Home [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
21/07/2010 16:50:19 Einstein@Home [error] No close tag in scheduler reply
21/07/2010 16:58:59 Einstein@Home Sending scheduler request: To fetch work.
21/07/2010 16:58:59 Einstein@Home Requesting new tasks
21/07/2010 16:59:04 Einstein@Home [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
21/07/2010 16:59:04 Einstein@Home [error] No close tag in scheduler reply

Hartmut Geissbauer
Joined: 5 Jan 06
Posts: 31
Credit: 152941307
RAC: 0

The same behaviour here. Only one machine of four is affected.

3x MacOSX (Tiger, Leopard, Snow Leopard)
1x Ubuntu 10.04 x86_64

The Leopard is affected.

Hardy

Atheist
Joined: 19 Feb 05
Posts: 1
Credit: 1255303
RAC: 0

Bump

Ensor
Joined: 9 Feb 05
Posts: 49
Credit: 1450362
RAC: 0

Same problem here when trying to report finished tasks or trying to request new work.

TTFN - Pete.


Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118390868724
RAC: 25655771

Quote:
7/21/2010 10:11:35 AM [error] Task h1_1105.75_S5R4__737_S5GC1a: bad command line
7/21/2010 10:11:35 AM [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
7/21/2010 10:11:35 AM [error] No close tag in scheduler reply


I saw the same messages on a couple of my machines so I reported the details to Bernd by email about 12 hours ago. I won't know the outcome until I get to work in about an hour and check my email.

It would seem to be a server problem. The mentioned tasks are allocated on the server (you can see them in your tasks list on the server) but fail to download correctly to the client. It seems to be affecting GC1 tasks and not ABP2.

Cheers,
Gary.

Grenadier
Joined: 9 Feb 05
Posts: 14
Credit: 2823344
RAC: 0

Thanks for the update, Gary. Please post back here if there's something we need to do to resolve this.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118390868724
RAC: 25655771

Message 98645 in response to message 98644

Quote:
Thanks for the update, Gary. Please post back here if there's something we need to do to resolve this.


I've had no reply from Bernd as yet but I've now had time to analyse the problem and I'm quite sure there is nothing that a user can do but wait for the Devs to fix whatever is broken.

The error messages that you see in the BM messages tab refer to problems with the scheduler reply. You can browse this reply in the file named 'sched_reply_einstein.phys.uwm.edu.xml' which you can find in your BOINC data folder. Do a search for the <workunit> tag as this is the start of the data for a new task that is being sent to your host. Within the complete <workunit> ... </workunit> block you will find a <command_line> tag which contains the full command line string that the science app will use to process the data contained in the large data files that exist on your host.
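
For anyone who would rather not scroll through the file by hand, the search described above can be sketched in a few lines of Python. This is only an illustration, not anything shipped with BOINC: it assumes the tags are named <workunit> and <command_line> (inferred from the wording of the error messages), the function name is made up, and it uses plain text search instead of an XML parser because the reply may not be well-formed.

```python
import re

def command_lines_in_reply(reply_text):
    """List (command_line_text, is_closed) for each <workunit> block.

    Tag names are an assumption based on the client's error messages.
    Plain string search is used on purpose: a damaged reply may not be
    valid XML, so a strict parser could refuse to read it at all.
    """
    results = []
    # Non-greedy match so each <workunit> ... </workunit> block is taken separately.
    for block in re.findall(r"<workunit>(.*?)</workunit>", reply_text, re.S):
        start = block.find("<command_line>")
        if start == -1:
            continue  # this block carries no command line at all
        rest = block[start + len("<command_line>"):]
        end = rest.find("</command_line>")
        closed = end != -1
        if closed:
            text = rest[:end]
        else:
            # No closing tag: keep only what precedes the next tag, if any.
            next_tag = rest.find("<")
            text = rest if next_tag == -1 else rest[:next_tag]
        results.append((text.strip(), closed))
    return results
```

To try it, read the contents of 'sched_reply_einstein.phys.uwm.edu.xml' from your BOINC data folder into a string and pass that string to the function. Any entry whose second value is False is a command line with no closing tag.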

Now it turns out that this command line string has been corrupted - probably by the WU generator. There should be a closing </command_line> tag to delimit the end of the complete command line string but in the example I looked at, there isn't. Here's what my string looked like.

 --Freq=1140.4174671 --FreqBand=0.05 --dFreq=6.71056161393e-06 --f1dot=-2.64248266531e-09 --f1dotBand=2.90673093185e-09 --df1dot=5.77553186099e-10 --skyGridFile=skygrid_1150Hz_S5GC1.dat --numSkyPartitions=887 --partitionIndex=673 --tStack=90000 --nStacksMax=205 --gammaRefine=1399 --ephemE=earth --ephem±

The string is obviously truncated (scroll to the end and look) with some garbage at the end and certainly has no closing tag. The tag that is there should start on a new line and should be after the missing closing tag. I would imagine the corrupted --ephem flag should actually read --ephemS=sun by analogy with the 'earth' flag. I'm not sure if there should be even more flags in a complete command line.
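
A rough way to spot this kind of damage automatically is to check every flag in the string. The sketch below is a guess at a useful check, not part of BOINC or the science app: it assumes every flag has the form --name=value in plain ASCII, which holds for the string shown above but is not confirmed for the app in general, and the function name is hypothetical.

```python
def malformed_flags(command_line):
    """Return the flags that do not look like --name=value in plain ASCII.

    Assumption: every flag takes a value after '='. That matches the
    string quoted in this thread; it is not a documented rule.
    """
    bad = []
    for token in command_line.split():
        # Split on the first '=' only, so negative values like -2.6e-09 survive.
        name, sep, value = token.partition("=")
        ok = (
            token.isascii()            # garbage bytes usually show up as non-ASCII
            and name.startswith("--")
            and len(name) > 2          # something must follow the dashes
            and sep == "="
            and value != ""
        )
        if not ok:
            bad.append(token)
    return bad
```

Run on the string above, the only flag that fails the check is the final, truncated --ephem one; all the complete flags before it pass.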

So we just need to wait until somebody fixes whatever is causing this corruption of the command line. I've also sent a further email to Bernd giving him the above details.

EDIT:
I've just had a look in the state file (client_state.xml) where you can see fully legal examples of blocks. Yes, there should be many more flags after the --ephemS=sun flag so there is no way a user could even attempt to 'fix' things :-).

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118390868724
RAC: 25655771

Message 98646 in response to message 98644

Quote:
... something we need to do to resolve this.


Not something you need to do but rather something you need to be aware of.

I've still had no response to any emails I've sent so I guess this won't be fixed for a few more hours yet.

When things do get fixed, you will get some messages that some of you may be concerned about. At the moment, you would think that completed work is not being reported properly because the tasks are still visible in BM with a 'ready to report' status. However, work is being reported properly. It's just that the mangled sched_reply is causing the client to think that the server hasn't got the report. So the client keeps trying to report these tasks.

When things get fixed and a correct sched_reply gets delivered to your client, it will contain the info that the latest attempts to report completed work are being refused because these very tasks have been received successfully on a previous connection. There will probably be quite a few of these messages, they will be in red, and it may appear to some of you that your completed tasks are being junked. Just read the messages carefully and understand that nothing has been lost at all.

Cheers,
Gary.
