15/04/2008 04:13:14|Einstein@Home|Sending scheduler request: Requested by user. Requesting 38435 seconds of work, reporting 0 completed tasks
15/04/2008 04:13:20|Einstein@Home|Scheduler request succeeded: got 1 new tasks
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.75_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file l1_0885.75_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.80_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file l1_0885.80_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.85_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file l1_0885.85_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.90_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file l1_0885.90_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.95_S5R3
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file l1_0885.95_S5R3
15/04/2008 04:13:22|Einstein@Home|Started download of skygrid_0890Hz_S5R3.dat
15/04/2008 04:13:22|Einstein@Home|Started download of h1_0886.10_S5R3
15/04/2008 04:15:41|Einstein@Home|Finished download of h1_0886.10_S5R3
15/04/2008 04:15:41|Einstein@Home|Started download of h1_0886.15_S5R3
15/04/2008 04:15:42|Einstein@Home|Sending scheduler request: To fetch work. Requesting 15238 seconds of work, reporting 0 completed tasks
15/04/2008 04:15:47|Einstein@Home|Scheduler request succeeded: got 0 new tasks
15/04/2008 04:15:47|Einstein@Home|Message from server: No work sent
15/04/2008 04:15:47|Einstein@Home|Message from server: (won't finish in time) Computer on 45.9% of time, BOINC on 96.3% of that
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.00_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.00_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.05_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.05_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.10_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.10_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.15_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.15_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.20_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.20_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.25_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.25_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file h1_0886.30_S5R3
15/04/2008 04:15:47|Einstein@Home|Got server request to delete file l1_0886.30_S5R3
Can anyone explain why the server would give me a new WU and then ask for it to be deleted within minutes? If you look at the log above you can see a WU was deleted when a new WU was given, but then the new one was deleted straight away. Problem is, the downloads are still going and it's still in my task list, so is this WU obsolete? Should I abort it? Several times now I've been crunching a WU for this project only for the server to mysteriously delete it after approx 20 hrs of work. Is this normal? My CPU time could be better spent on WUs that will actually finish! Any help would be much appreciated.
PS: http://einsteinathome.org/workunit/38019293 may be helpful
Weird activity
Because the servers have no more tasks to send out that need that data file. Once you finish the tasks you have on hand that file will just be wasting space on your hard drive.
BOINC WIKI
BOINCing since 2002/12/8
RE: Can anyone explain why
Hi Richard,
You are misunderstanding the difference between tasks and the large data files on which they depend. The two things are quite separate and quite different.
When a computer is first attached to the E@H project, it is actually sent four different groups of files. These are:-
* various support files - earth_05_09, sun_05_09, skygrid_0xxxHz_S5R3.dat,
* a significant number of large data files at slightly different frequency steps (LIGO data) eg see the names being requested for deletion, and
* an actual task to crunch - what you are calling a WU - which is so small and fast to download that you probably don't even notice it and its name is not listed in the log - just the fact that you got it.
A task is actually a set of instructions (ie a set of parameters or flag values) that tells the science app how to process the LIGO data. Over time you may receive quite a number of tasks with different parameters that will all relate to the same set of LIGO data. These are listed on the tasks tab of BOINC Manager and you will notice they have a sequence number as part of their name which identifies them from other tasks that belong to the same frequency band.
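As an illustration of how a task name relates to its data file, here is a minimal Python sketch that splits a name into its parts. The layout it assumes (data file name, a double underscore, the sequence number, then a trailing suffix) is inferred only from the single task name quoted in this thread, h1_0886.00_S5R3__101_S5R3b_2, so treat it as a guess rather than the project's official naming scheme. The helper name `parse_task_name` is hypothetical.

```python
from typing import NamedTuple


class TaskName(NamedTuple):
    data_file: str       # e.g. "h1_0886.00_S5R3"
    frequency_hz: float  # frequency step embedded in the data file name
    sequence: int        # sequence number within the frequency band
    suffix: str          # whatever follows the sequence number, left uninterpreted


def parse_task_name(name: str) -> TaskName:
    """Split a task name, assuming the layout <data_file>__<seq>_<suffix>."""
    data_file, sep, rest = name.partition("__")
    if not sep:
        raise ValueError(f"no '__' separator in task name: {name!r}")
    seq_text, _, suffix = rest.partition("_")
    # The second underscore-separated field of the data file is the frequency.
    frequency_hz = float(data_file.split("_")[1])
    return TaskName(data_file, frequency_hz, int(seq_text), suffix)


task = parse_task_name("h1_0886.00_S5R3__101_S5R3b_2")
print(task.data_file, task.frequency_hz, task.sequence)
```

Two tasks that share the same `data_file` part but differ in `sequence` would be different sets of instructions applied to the same LIGO data.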
When all the different tasks relating to the one set of LIGO data have been issued, that set of large data files will be marked for deletion and a new set will be issued in a different set of frequency steps. Usually data can last for weeks or even months, depending on the appetites of other computers that are sharing the same pool of frequencies. A fast machine devoted entirely to E@H can get hundreds of tasks from the one data set whilst a slow machine (or more importantly a machine with a low resource share) may get very few before the data needs replacing.
When you look at your log snippet, you will see deletions of expended data files and not the deletion of any task. This is quite normal. First a comment about the period prior to the log snippet. Your machine returned its previous task (freq 0885.75, seq# __136) on April 10. Five days later it is now requesting further work and this is where your log snippet starts. One new task was downloaded (freq 0886.00, seq# __101). Notice the change in frequency by 0.25Hz from the earlier one. Obviously tasks for the former lower frequency are now all gone so the data files for those now completed tasks will be deleted, some of them at least. Some of the frequency steps will have been retained (eg 0886.00 and 0886.05) and will have been added to by the acquisition of files at slightly higher steps, eg 0886.10 and 0886.15. This was all happening around 04:13:20 and was to ensure that all data for the new task (seq# __101) was in place for the start of crunching.
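To check this against a log yourself, a rough Python sketch like the one below can sort each entry into a category. It assumes only the pipe-separated layout visible in the snippet (timestamp|project|message); the category names and the `classify` and `summarise` helpers are made up for this example and are not part of BOINC.

```python
from collections import Counter

# A few entries copied from the log snippet in this thread.
SAMPLE = """\
15/04/2008 04:13:20|Einstein@Home|Scheduler request succeeded: got 1 new tasks
15/04/2008 04:13:20|Einstein@Home|Got server request to delete file h1_0885.75_S5R3
15/04/2008 04:13:22|Einstein@Home|Started download of h1_0886.10_S5R3
15/04/2008 04:15:47|Einstein@Home|Message from server: No work sent
"""


def classify(message: str) -> str:
    """Put one log message into a coarse, illustrative category."""
    if message.startswith("Got server request to delete file "):
        return "data-file deletion"
    if message.startswith(("Started download of ", "Finished download of ")):
        return "download"
    if message.startswith(("Sending scheduler request", "Scheduler request succeeded")):
        return "scheduler"
    return "other"


def summarise(log_text: str) -> Counter:
    """Count log entries per category, skipping lines that do not have three fields."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        counts[classify(parts[2])] += 1
    return counts


for category, count in summarise(SAMPLE).items():
    print(f"{category}: {count}")
```

Every deletion entry in the snippet names a data file (h1_... or l1_...), never a task, which is the point being made above.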
There is a small gap in your log between 04:13:22 and 04:15:42 - just over two minutes. The completion of the downloading of the skygrid file should have occurred somewhere in that interval but this isn't shown for some reason. There should also be an entry showing the start of downloading of l1_0886.10_S5R3. This should have happened immediately the skygrid download finished.
Have you left out some log entries for this period?
There is an entry for the start of download of h1_0886.15_S5R3 but no entry for its finish. There should also be entries for the start and finish of downloading of l1_0886.15_S5R3.
I can't really comment further until I know whether or not you have left out any entries.
There are other unusual aspects that need a comment from you in order to explain things fully. For example, at the start of the log snippet, the scheduler request was initiated by the user rather than BOINC itself and the request was for an unusually large amount of work - 38435 secs. How did you prevent BOINC from making a much earlier request for a much smaller amount? Perhaps you had set "No New Tasks" (NNT) and were now allowing tasks, or perhaps you had just made an increase in cache size, or perhaps you had suspended other projects so that BOINC turned to E@H for work all of a sudden?
Can you explain the circumstances behind your "updating" of E@H which triggered the large request for work?
The sending of the first task (under normal circumstances) would have more than satisfied the 38435 sec request. The previous task had taken 45Ksecs so BOINC's estimate of what it had received should certainly be greater than what it had asked for. There shouldn't have been any reason to make a further request. Yet at 04:15:42, BOINC is requesting more work. You do have a dual core machine so this is pointing towards the BOINC client thinking that it needed work for both cores. By the look of things, you support a lot of projects so it's hard to understand why the BOINC client was trying to get so much for E@H. It points very strongly towards user intervention of some sort.
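The arithmetic behind that statement is simple enough to write down. The sketch below uses the 38435 sec figure from the log and rounds "45Ksecs" to 45000; it ignores whatever correction factors the real client applies to its estimates, so it is only a back-of-envelope check.

```python
import math

requested_seconds = 38435       # first request in the log, at 04:13:14
previous_task_seconds = 45_000  # "45Ksecs" for the previous task, rounded

# How many tasks of that size would be needed to cover the request?
tasks_needed = math.ceil(requested_seconds / previous_task_seconds)

# How far one such task overshoots the request.
surplus_seconds = previous_task_seconds - requested_seconds

print(f"tasks needed: {tasks_needed}, surplus: {surplus_seconds} s")
```

On these numbers a single task covers the request with room to spare, which is why the second request for 15238 seconds looks out of place.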
The interesting thing was that the server vetoed what the client was trying to do. I'm not at all surprised at that, given that you must have a quite low resource share for E@H and the tasks do take quite a few hours to complete.
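A rough way to see why the server might say "won't finish in time" is to convert CPU time into wall-clock time using the two percentages in its message. The Python sketch below does that for a 45000 sec task. It is a simplification: the deadline is not stated in this thread, so it is left as a parameter, and the real scheduler will also take into account things this sketch ignores, such as other queued work and resource share.

```python
# Figures quoted in the server message at 04:15:47.
COMPUTER_ON_FRACTION = 0.459  # "Computer on 45.9% of time"
BOINC_ON_FRACTION = 0.963     # "BOINC on 96.3% of that"


def wall_clock_seconds(cpu_seconds: float,
                       computer_on: float = COMPUTER_ON_FRACTION,
                       boinc_on: float = BOINC_ON_FRACTION) -> float:
    """Estimate elapsed time if crunching only happens while BOINC is running."""
    available = computer_on * boinc_on
    return cpu_seconds / available


def finishes_in_time(cpu_seconds: float, seconds_until_deadline: float) -> bool:
    """Simplified check; the real server-side test is more involved."""
    return wall_clock_seconds(cpu_seconds) <= seconds_until_deadline


estimate = wall_clock_seconds(45_000)
print(f"about {estimate / 3600:.0f} hours of wall-clock time for one task")
```

With BOINC effectively available for less than half of the time, a task of roughly 12.5 CPU hours needs more than a day of elapsed time, before any sharing with other projects is considered.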
If you can give more detail about the points I have posed and answer the questions asked, I'm sure that there would be a good reason for what BOINC has done. In the meantime, you should not take any action to delete or suspend work and you should not interfere with any downloads that BOINC decides to do.
I'm quite happy to respond further once you give more details.
Cheers,
Gary.
RE: 15/04/2008
The server said that it didn't give you a new WU (see the "got 0 new tasks" and "No work sent" entries at 04:15:47 in the log above).
What was downloaded was a data file, which can be used by multiple WUs in succession. It puzzles me, however, that file h1_0886.00_S5R3 was requested to be deleted, since your current WU (I made it clickable above) is h1_0886.00_S5R3__101_S5R3b_2. But I'm no Einstein@home guru; perhaps someone else can figure it out and explain it to us.
Regards,
Gundolf
PS: Gary answered while I was still editing my answer!
Computers aren't everything in life. (Just kidding)
RE: It puzzles me, however,
I've noticed exactly this behaviour several times previously and I interpret it to be a request in advance for the client to act upon, at an appropriate time in the future. In other words, the server is saying that, at the current time, it has no more tasks to issue that depend on that data set so the client can delete that data set as soon as the client has overseen the safe completion of any tasks in the cache that depend on it.
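That interpretation can be modelled in a few lines of Python. This is only a sketch of the behaviour as described here, not the BOINC client's actual code: the `ClientCache` class and its method names are hypothetical, and a real task depends on many data files rather than the one shown.

```python
from dataclasses import dataclass, field


@dataclass
class ClientCache:
    """Toy model of a client that defers deletion of data files still in use."""
    tasks: dict[str, set[str]] = field(default_factory=dict)  # task -> files it needs
    files: set[str] = field(default_factory=set)              # files on disk
    pending_delete: set[str] = field(default_factory=set)     # server asked to delete

    def request_delete(self, file_name: str) -> None:
        """Record the server's request; act on it only when it is safe."""
        self.pending_delete.add(file_name)
        self._sweep()

    def finish_task(self, task_name: str) -> None:
        """Drop a completed task, then retry any deferred deletions."""
        self.tasks.pop(task_name, None)
        self._sweep()

    def _sweep(self) -> None:
        in_use = set().union(*self.tasks.values())
        removable = self.pending_delete - in_use
        self.files -= removable
        self.pending_delete -= removable


cache = ClientCache(
    tasks={"h1_0886.00_S5R3__101_S5R3b_2": {"h1_0886.00_S5R3"}},
    files={"h1_0885.75_S5R3", "h1_0886.00_S5R3"},
)
cache.request_delete("h1_0885.75_S5R3")  # no task needs it, so it goes at once
cache.request_delete("h1_0886.00_S5R3")  # still needed, so deletion is deferred
print(sorted(cache.files))
cache.finish_task("h1_0886.00_S5R3__101_S5R3b_2")
print(sorted(cache.files))
```

In this model the file for the task still in the cache survives the deletion request and is only removed once that task has finished.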
There is a weakness in this however that I've also observed. There is no guarantee that the server won't have resend tasks in the future that also depend on this exact same data set. Some of the initial hosts with tasks from a particular data set may not return them within the deadline or may return them with a client error, or there even may be "checked but no consensus" issues. This means there may be no alternate hosts with that data set any more if they've all been instructed to delete the data and have actually done the deletion. The upshot is that the data sets may need to be sent out to more hosts just to get a (hopefully) small number of resends crunched. I'd like to see some mechanism where the requests for deletion only get sent out after the server has determined that every last task is returned and validated. I guess this would be a lot more work for the server.
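The kind of mechanism being wished for could look something like the following Python sketch. It is purely hypothetical: the state names and the `safe_to_request_deletion` function are invented for illustration and do not describe how the E@H server actually tracks tasks.

```python
from enum import Enum, auto
from typing import Iterable


class TaskState(Enum):
    UNSENT = auto()
    IN_PROGRESS = auto()
    CLIENT_ERROR = auto()   # returned with an error, needs a resend
    NO_CONSENSUS = auto()   # "checked but no consensus", needs a resend
    VALIDATED = auto()


def safe_to_request_deletion(states: Iterable[TaskState]) -> bool:
    """True only when every task for the data set has been validated.

    An empty collection counts as safe, since no task depends on the data.
    """
    return all(state is TaskState.VALIDATED for state in states)


# One task still awaiting consensus keeps the data set alive on its hosts.
data_set_tasks = [TaskState.VALIDATED, TaskState.NO_CONSENSUS, TaskState.VALIDATED]
print(safe_to_request_deletion(data_set_tasks))
```

The cost is that the server has to examine the state of every task belonging to a data set before it can tell any host to delete, which is the extra work mentioned above.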
Cheers,
Gary.
RE: I've noticed exactly
I suspected so, too. Thanks for the explanation.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)