This is not a problem with the app or even with the particular tasks that were affected. The actual stderr output is
7.0.38
WU download error: couldn't get input files:
stochastic_full.bank
-200
]]>
and from this you can see that the problem was with a file that is required for all BRP4 tasks in your cache - stochastic_full.bank.
It's not clear why this file should suddenly be missing (or corrupt) as you seem to have other BRP4 tasks in your cache at the time and that would guarantee that BOINC would be keeping that file at all times. It would normally be deleted if you had no BRP4 tasks on board but would be downloaded again if you then got further new BRP4 tasks. If you browse the file stdoutdae.txt in your BOINC data directory, you may find additional clues as to what happened to the file so as to cause BOINC to trash two tasks that depended on it.
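If you'd rather not scroll through the whole log by hand, something along these lines could pull out the lines that mention the file. It is only a sketch: the helper names `find_mentions` and `scan_log` are made up for illustration, and it assumes stdoutdae.txt is plain text with one event per line.

```python
def find_mentions(lines, needle="stochastic_full.bank"):
    """Return (line_number, text) pairs for every line that contains needle."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


def scan_log(path, needle="stochastic_full.bank"):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_mentions(handle, needle)
```

Run from the BOINC data directory, `scan_log("stdoutdae.txt")` would list every event that names the file, which should narrow down when it went missing.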
Seeing the same thing here.
1/28/2013 07:55:51 | Einstein@Home | [error] File p2030.20110112.G194.37-00.87.N.b4s0g0.00000_2089.bin4 has wrong size: expected 2098320, got 957502
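For anyone collecting these messages from several hosts, here is a rough sketch of pulling the numbers out of such a line. The regular expression is my own guess based on the single message above, not an official format description, and `parse_wrong_size` is a made-up name.

```python
import re

# Pattern inferred from the one event log line quoted above (an assumption).
WRONG_SIZE = re.compile(
    r"File (?P<name>\S+) has wrong size: "
    r"expected (?P<expected>\d+), got (?P<got>\d+)"
)


def parse_wrong_size(line):
    """Return (file name, expected bytes, received bytes, got/expected) or None."""
    match = WRONG_SIZE.search(line)
    if match is None:
        return None
    expected = int(match["expected"])
    got = int(match["got"])
    return match["name"], expected, got, got / expected


line = (
    "1/28/2013 07:55:51 | Einstein@Home | [error] File "
    "p2030.20110112.G194.37-00.87.N.b4s0g0.00000_2089.bin4 "
    "has wrong size: expected 2098320, got 957502"
)
print(parse_wrong_size(line))
```

For the line above, the received file is a bit under half of the expected size, which is what you would see if a compressed file were served where a plain one was expected.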
Looks more as if something (a workunit generator) used the old value when generating the file info block:
but the download server had a compressed version on disk.
Couldn't repair that problem, canceled the workunits in order to not waste more bandwidth. Will be put back in the queue with other names.
First compressed beam will be p2030.20120224.G192.84-03.16.S.b4s0g0
BM
Edit: Here are more:
p2030.20111211.G194.63-00.87.N.b3s0g0.00000
p2030.20111231.G194.23-00.65.C.b0s0g0.00000
p2030.20120103.G193.36-03.15.S.b6s0g0.00000
p2030.20120103.G193.50-03.38.N.b1s0g0.00000
p2030.20111116.G192.57-02.72.C.b4s0g0.00000
p2030.20111210.G175.94-04.37.S.b3s0g0.00000
p2030.20120104.G194.01-02.46.N.b2s0g0.00000
p2030.20120104.G194.01-02.46.N.b3s0g0.00000
p2030.20120104.G194.01-02.46.N.b4s0g0.00000
p2030.20120104.G194.01-02.46.N.b6s0g0.00000
p2030.20120104.G194.39-01.78.N.b0s0g0.00000
p2030.20120104.G194.39-01.78.N.b2s0g0.00000
p2030.20120104.G194.39-01.78.N.b4s0g0.00000
p2030.20120104.G194.39-01.78.N.b5s0g0.00000
p2030.20120104.G194.39-01.78.N.b6s0g0.00000
p2030.20120104.G194.39-01.78.N.b1s0g0.00000
p2030.20120104.G194.39-01.78.N.b3s0g0.00000
BM
It's actually worse: The file length refers to the plain, uncompressed version, the md5 checksum to the compressed version of the file. As both values are already in the DB and being replicated to every new task this is impossible to fix in a consistent way. I canceled all affected workunits of the three "manually" compressed beams. Something was wrong with the script that generated these. The new workunits produced from our pre-processing pipeline all look ok, though.
BM
BM
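To illustrate why a record that mixes the two versions cannot be satisfied by either file, here is a small sketch. It is not the project's code: `zlib` merely stands in for whatever compression was actually used, the payload is dummy data, and the name `check` is invented for this example.

```python
import hashlib
import zlib


def check(data, expected_nbytes, expected_md5):
    """Return (size_ok, md5_ok) for a downloaded file held in memory."""
    size_ok = len(data) == expected_nbytes
    md5_ok = hashlib.md5(data).hexdigest() == expected_md5
    return size_ok, md5_ok


plain = b"example payload " * 1000   # dummy uncompressed data
packed = zlib.compress(plain)        # stand-in for the compressed file

# The inconsistent record described above:
# length taken from the plain file, checksum taken from the compressed one.
record_nbytes = len(plain)
record_md5 = hashlib.md5(packed).hexdigest()

print(check(plain, record_nbytes, record_md5))   # size matches, checksum does not
print(check(packed, record_nbytes, record_md5))  # checksum matches, size does not
```

Whichever version the download server hands out, one of the two checks fails, so the only way out is a new record with both values taken from the same file.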
Thank you for your fast response.
WUs
p2030.20120224.G192.84-03.16.S.b4s0g0.00000_3736 - 3743.bin4 and
p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1656 - 1663.bin4
arrived here with no error, filesize each about 686.xxx and 707.xxx bytes.
Also earlier, at the time of my first post (14:08 local), this one arrived with no error!?
WU p2030.20120224.G192.84-03.16.S.b4s0g0.00000_2496-2504-.bin4, 686.xx bytes.
Strange.
28.01.2013 15:26:44 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1657.bin4
28.01.2013 15:26:44 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1658.bin4
28.01.2013 15:26:46 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1657.bin4
28.01.2013 15:26:46 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1659.bin4
28.01.2013 15:26:47 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1658.bin4
28.01.2013 15:26:47 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1660.bin4
28.01.2013 15:26:48 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1659.bin4
28.01.2013 15:26:48 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1661.bin4
28.01.2013 15:26:49 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1660.bin4
28.01.2013 15:26:49 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1662.bin4
28.01.2013 15:26:51 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1661.bin4
28.01.2013 15:26:51 | Einstein@Home | Started download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1663.bin4
28.01.2013 15:26:52 | Einstein@Home | Finished download of p2030.20120224.G192.84-03.16.S.b4s0g0.00000_1662.bin4
The earlier beams gave "download failed" here as well, but the new one is looking good - this task is currently running on this host (not fast, but should be done in about 85 minutes, ~16:40 UTC). I'll do an 'UPDATE' then to report it in.
Hi there,
I too have had an issue with downloading. It started after I requested an UPDATE for E@H ...
... While viewing the Schedule tab, I noticed that there were two items active at one time (which is normal for my PC, as is running two tasks at once). However, it kept stopping and moving on to the next one in the list (I am referring to the downloads in the Schedule tab). It also said that it would retry in a certain number of seconds or minutes. For some of these I clicked the Retry Now button (usually when it was going to back off within the following few minutes), and for some I did not. Some resulted in a back-off.
I have saved a copy of the Event Log for the period in question. There were about 11 lines in the list in the Schedule tab following my update request, so the Event Log is fairly lengthy (8 pages). There is also a screenshot of my Tasks tab list after this happened.
Where could I send it so that you could analyse it? I thought it might be of assistance to you to see physical evidence of what has happened at our end.
PS: It's after 4:15am Tuesday 29th here in Australia, so I won't be able to send it at this moment, but I could send it sometime Tuesday afternoon or Tuesday night (our time), or any time after that.
Cheers :-)
Gary,
Check your Tasks list to see if it DID download. I have some that appeared to download successfully but did not download at all.
Gary,
THX for the answer.
As I see in the stats for this particular PC, there are two more download errors and 4 or 5 SSE WUs that did not validate.
Looks like I should do some tests with this system now ...
Something unusual seems to be happening with the compressed data files for BRP4 tasks. Initially, they were in the expected range of around 400 KB to 700 KB each. I've just watched a bunch of new work being downloaded (increase in work cache size on several hosts) and the new data files have been around 1.7 MB to 1.8 MB.
Is there any reason why the compression now seems almost non-existent? I hope this is just a temporary regression :-).
Cheers,
Gary.