Hello!
For a long time now I have had trouble crunching LAT files. Most of them end up with a crunching error.
27.02.2012 15:19:41 | Einstein@Home | Aborting task LATeah2222A_96.0_21600_-2.7e-11_0: exceeded disk limit: 52.37MB > 19.07MB
27.02.2012 15:19:41 | Einstein@Home | Aborting task LATeah2222A_96.0_21550_-4.8e-11_0: exceeded disk limit: 41.61MB > 19.07MB
27.02.2012 15:19:42 | Einstein@Home | Computation for task LATeah2222A_96.0_21600_-2.7e-11_0 finished
27.02.2012 15:19:42 | Einstein@Home | Output file LATeah2222A_96.0_21600_-2.7e-11_0_0 for task LATeah2222A_96.0_21600_-2.7e-11_0 absent
27.02.2012 15:19:42 | Einstein@Home | Output file LATeah2222A_96.0_21600_-2.7e-11_0_1 for task LATeah2222A_96.0_21600_-2.7e-11_0 absent
27.02.2012 15:19:42 | Einstein@Home | Computation for task LATeah2222A_96.0_21550_-4.8e-11_0 finished
27.02.2012 15:19:42 | Einstein@Home | Output file LATeah2222A_96.0_21550_-4.8e-11_0_0 for task LATeah2222A_96.0_21550_-4.8e-11_0 absent
27.02.2012 15:19:42 | Einstein@Home | Output file LATeah2222A_96.0_21550_-4.8e-11_0_1 for task LATeah2222A_96.0_21550_-4.8e-11_0 absent
27.02.2012 15:19:42 | Einstein@Home | Starting task LATeah2222A_96.0_21700_-1.8e-11_0 using hsgamma_FGRP1 version 23
27.02.2012 15:19:42 | Einstein@Home | Starting task LATeah2222A_96.0_21700_-1.6e-11_0 using hsgamma_FGRP1 version 23
This error message, "exceeded disk limit: 41.61MB > 19.07MB", is peculiar, since more than 50 GB are free and available for BOINC and E@H. Also peculiar is that two LAT tasks can be crunched in parallel (the processor is an i3-2100, a dual core with HT, so four cores are emulated). With S6Bucket I had no problems at all.
I carried out a repair of BOINC, let all E@H tasks run out in the regular way, deleted all E@H files from the directory and let fresh files reload from the server, with no effect. Furthermore I took care that the four LAT tasks did not start close together in time, also with no effect.
It seems to me that the tasks end up with a crunching error at the moment the result is written to the output files for the first time, to be stored on disk. On my machine this is set to every 5 minutes.
It is also noticeable that the progress bar for LAT tasks steps up only in multiples of 2.000%.
In the task manager everything seems to run properly.
Since the LAT application has been running for such a long time without other participants reporting such difficulties, I am entertaining the idea that my processor has some individual malfunction. But S6Bucket tasks were running well! So something must also be going on in the programming; the S6Bucket application is obviously more tolerant of this error.
I would be pleased to get some ideas on how to overcome this malfunction.
Kind regards
Martin
Severe trouble with LAT-files
I gave up processing LAT files on my Linux box since they all end in a validate error. I am processing Arecibo pulsar data and gravitational-wave searches with no problem.
Tullio
RE: This error remark
Where do you see those 50 GiB? Did you check the startup messages in your BOINC event log?
(example from mine: 26/02/2012 01:19:44||Preferences limit disk usage to 1.86GB)
Regards,
Gundolf
Computers aren't everything in life. (Just a little joke)
19.07 MB is the binary
19.07 MB is the binary equivalent of the maximum amount of disk space a single LAT workunit is allowed to occupy on disk:
That limit is set by the project when the workunits are made: it has nothing to do with your choice of how much disk space is allocated to BOINC in general.
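To make the "binary equivalent" concrete: a displayed limit of 19.07 MB is what a per-workunit disk bound of 20,000,000 bytes looks like when converted to binary mebibytes. (The value 20,000,000 is an assumption here; the workunit's actual bound is not shown in this thread.) A minimal sketch of the conversion:

```python
# Hypothetical value: a per-workunit disk bound of 20,000,000 bytes,
# as the project might set it in the workunit template.
rsc_disk_bound = 20_000_000

# The client reports the limit in binary mebibytes (1 MiB = 2**20 bytes),
# which is why 20,000,000 bytes shows up as "19.07MB" in the event log.
limit_mib = rsc_disk_bound / 2**20
print(f"{limit_mib:.2f}MB")  # -> 19.07MB
```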
It sounds more likely that the LAT application is generating garbage output (a large amount of it!) on your system, which might be a clue to the validate errors (garbage in result files) too.
Hello to all helpers!
Hello to all helpers!
Hello Gundolf! I know my limits are set much higher than logically needed; I set them that high on purpose, for testing. The event log says:
Really, only 6 to 11 GB are needed.
Furthermore, why this error message:
Why is this obviously needed output file not generated by the program? Is this an indication of a programming error, or what?
Hello Tullio! I had also deselected LAT. But by now S6Bucket files are becoming rare, and this unchecking no longer works.
I would be very pleased to get your comments and suggestions.
Kind regards
Martin
RE: Why is this obviously
It's not generated because the application has crashed, or, in your particular case, has been forcibly terminated by BOINC for generating vastly more output than expected for a valid run.
It would be interesting to
It would be interesting to see how the wingman was doing with this particular WU that had disk quota problems.
The per-task disk limit is a safeguard against a task running wild (because of a programming bug or a malfunction of the host) and trashing all the remaining allowed disk space, so it makes sense that it is enforced. If this particular WU happened to generate huge legitimate output files, it would fail with the same error for all wingmen. At least here: http://einsteinathome.org/workunit/116558405 that is not the case. It could still be a sporadic bug, though, that mysteriously is more likely to happen on your host.
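As a rough illustration of the safeguard described above (this is not BOINC's actual client code, and all names below are invented for the sketch): the enforcement amounts to summing the sizes of the files a task has written to its slot directory and aborting the task once the project-set bound is exceeded.

```python
import os

# Hypothetical per-task byte limit; BOINC's real bound comes from the
# workunit definition, not a constant like this.
RSC_DISK_BOUND = 20_000_000

def slot_disk_usage(slot_dir):
    """Total bytes of all files the task has written to its slot directory."""
    total = 0
    for root, _dirs, files in os.walk(slot_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def check_task(slot_dir):
    """Abort (raise) if the task's on-disk footprint exceeds its bound.

    The message mirrors the event-log format, with both values shown
    in binary mebibytes, e.g. "exceeded disk limit: 52.37MB > 19.07MB".
    """
    used = slot_disk_usage(slot_dir)
    if used > RSC_DISK_BOUND:
        raise RuntimeError(
            f"exceeded disk limit: "
            f"{used / 2**20:.2f}MB > {RSC_DISK_BOUND / 2**20:.2f}MB"
        )
```

A task writing legitimate checkpoints stays far below the bound; one that loops writing garbage output trips the check regardless of how much total disk space BOINC is allowed to use.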
CU
HB