I just started to download a new set of files to a computer that had not run Einstein for quite some time. As expected, the old files were deleted and the new files began to arrive. Nearly 400 files making up an estimated 1.5 GB of new stuff were scheduled to download. That is way more than I have ever picked up before. Even though I have a moderately fast connection, this would have taken about three hours. That is way too much, so I aborted the transfer and connected to a different project. I suppose that it is a result of the tail end of the current run as explained by Gary Roberts in another thread.
I am still running Einstein on other machines but I'm not going to try a new batch until the next run brings the downloads to a reasonable size. I'm not complaining here. I just want people to be aware of the possibility of some really big and time-consuming transfers. Imagine if you were on a dial-up connection. It could have taken, quite literally, days. Even a slower DSL connection might run into a monthly cap problem.
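As a rough check on the transfer times mentioned above, here is a minimal sketch of the arithmetic. It assumes decimal gigabytes, an ideal 56 kbit/s dial-up link, and no protocol overhead; none of these figures come from the post apart from the 1.5 GB and the three-hour estimate.

```python
def transfer_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move size_gb (decimal GB) over a link of the given speed."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 3600


def implied_mbit_per_s(size_gb: float, hours: float) -> float:
    """Average link speed implied by moving size_gb in the given number of hours."""
    return size_gb * 1e9 * 8 / (hours * 3600) / 1e6


DOWNLOAD_GB = 1.5          # estimate quoted in the post
DIALUP_MBIT = 0.056        # assumption: ideal 56 kbit/s modem

if __name__ == "__main__":
    print(f"Implied speed for 3 h: {implied_mbit_per_s(DOWNLOAD_GB, 3):.2f} Mbit/s")
    print(f"Dial-up estimate: {transfer_hours(DOWNLOAD_GB, DIALUP_MBIT) / 24:.1f} days")
```

Under these assumptions the three-hour estimate corresponds to a little over 1 Mbit/s sustained, and the same download on dial-up would take roughly two and a half days.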
Plus SETI Classic = 21,082 WUs
Huge download
How big a cache do you have set, i.e. how many days?
Claggy
Hi!
Note that you can configure a quota on download volume in newer BOINC versions. In the web interface, it is:
Transfer at most XXX Mbytes every XXX days
(Enforced by version 6.10.46+)
Note this will apply to all BOINC projects, tho.
CU
HB
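For anyone who prefers a local setting to the web interface, the sketch below builds the equivalent preference as XML. It is an assumption that the two tag names `daily_xfer_limit_mb` and `daily_xfer_period_days` under a `global_preferences` root match what your BOINC version reads from a local override file; the limit values are purely illustrative, so check the BOINC documentation before using the output.

```python
import xml.etree.ElementTree as ET


def build_transfer_quota(limit_mb: float, period_days: int) -> str:
    """Return an XML fragment expressing 'at most limit_mb every period_days'.

    Tag names are assumed, not taken from the post; verify them against
    the preferences reference for your BOINC version.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "daily_xfer_limit_mb").text = str(limit_mb)
    ET.SubElement(root, "daily_xfer_period_days").text = str(period_days)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical quota: 2000 MB every 30 days.
    print(build_transfer_quota(2000, 30))
```

As noted in the post, a limit like this applies to every attached project, not just Einstein.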
I have a two day cache set plus one day of buffer. I've never had so many work units try to download at one time. It's just too much.
My other machines are not pulling in nearly this much at a time so I think it may be a combination of restarting a machine that hasn't been used for Einstein for a long time and the end-of-run clean up alluded to before.
If I were just starting Einstein for the first time ever, seeing a set of files to be downloaded like this would certainly scare me off.
Thanks for your help.
Plus SETI Classic = 21,082 WUs
Your problem is precisely the same as the one I was describing to Oliver in this message recently. Here is the key part from that post that describes what happens.
When you restarted after a considerable gap, it was unlikely that any of your LIGO data files would still be current. The problem you saw occurs because your client was asking for multiple tasks with no current data. The scheduler will choose some random available frequency band and will assign whatever tasks it has for that band (probably between 1 and 3 tasks). Quite often it will be just 1 task because the scheduler has a habit of choosing a single resend in these sorts of cases. I've seen it happen many times. The data download for this single task will be 48 LIGO files and 1 skygrid file, assuming you still have the sun and earth files from previously.
With that frequency band now effectively "used up" (temporarily), the scheduler will move on to an entirely different frequency band and repeat the whole process until it has enough tasks to fill the work request. The scheduler is not smart enough to realise that it could use tasks from a related band (0.05 Hz higher) which would only require 4 more LIGO files rather than the full 48.
The easy way to prevent the problem is to temporarily reduce your cache size to 0.05 days total. Then when you request work you're only requesting 1 task and that's exactly what you will get: 1 task and about 180 MB of data files. Now that this data is described with all the blocks in your state file, you can ask for more tasks and the scheduler will send them for this same set of frequencies as for the first task, provided the first task wasn't an isolated resend with nothing else available in a nearby frequency band. You can cop extra full downloads if that happens on the first task.
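To put numbers on the difference, here is a minimal sketch comparing the two cases using the figures above (48 LIGO files plus 1 skygrid file and about 180 MB per full set, 4 extra LIGO files for a neighbouring band). The task count of eight is only a guess that roughly reproduces the "nearly 400 files, 1.5 GB" reported at the top of the thread, and the sketch assumes all files are the same size and that a neighbouring band needs no new skygrid file.

```python
LIGO_FILES_PER_BAND = 48      # from the post
SKYGRID_FILES_PER_BAND = 1    # from the post
FULL_SET_MB = 180             # "about 180MB" from the post
EXTRA_FILES_ADJACENT = 4      # extra LIGO files for a band 0.05 Hz higher

FILES_PER_FULL_SET = LIGO_FILES_PER_BAND + SKYGRID_FILES_PER_BAND
MB_PER_FILE = FULL_SET_MB / FILES_PER_FULL_SET  # assumption: equal file sizes


def scattered_bands(n_tasks: int) -> tuple[int, float]:
    """Every task lands in an unrelated band, so each needs a full data set."""
    return n_tasks * FILES_PER_FULL_SET, n_tasks * FULL_SET_MB


def adjacent_bands(n_tasks: int) -> tuple[int, float]:
    """First task needs a full set; each later task only needs 4 more files."""
    if n_tasks == 0:
        return 0, 0.0
    files = FILES_PER_FULL_SET + (n_tasks - 1) * EXTRA_FILES_ADJACENT
    return files, files * MB_PER_FILE


if __name__ == "__main__":
    n = 8  # hypothetical number of tasks in the work request
    print("scattered:", scattered_bands(n))   # 392 files, 1440 MB
    print("adjacent: ", adjacent_bands(n))    # 77 files
```

With eight tasks the scattered case needs 392 files against 77 for neighbouring bands, which is why filling the cache one task at a time pays off.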
There is an even simpler way to reinstate a machine like yours and enjoy the benefits of zero downloads. All you do is use the data from one of your other hosts. Here is a step by step set of instructions using a pen drive or network share as the transfer medium.
* Take a copy of the state file from the donor host so that you can extract the blocks that describe the skygrid file and all your chosen LIGO files
* Make sure the host to be revived is not running BOINC yet. Shut BOINC down if necessary.
* Delete all old LIGO files on that host and seed the project directory with the new files of your chosen frequency that you harvested.
* Edit the state file on that host and delete all the old blocks for both data and skygrid and then insert the replacement blocks.
* While you have the state file open you might like to correct the section so that you can pretend that the machine hasn't had a big holiday after all :-).
* Save your changes and restart BOINC.
When BOINC starts up and makes a work request, all the tasks will come from the chosen frequency. You can quite easily have zero downloads.
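The block-harvesting step in the list above could be sketched roughly as follows. This is an illustration only: it assumes the file descriptions in the state file are `<file_info>` blocks containing a `<name>` element, and the file names in the sample are made up. Check the layout of your own state file, and work on a copy with BOINC shut down.

```python
import re

# Assumed layout: one <file_info>...</file_info> block per data file.
FILE_BLOCK = re.compile(r"<file_info>.*?</file_info>", re.DOTALL)
NAME = re.compile(r"<name>(.*?)</name>", re.DOTALL)


def harvest_blocks(state_text: str, wanted) -> list[str]:
    """Return the raw blocks whose <name> satisfies the wanted(name) predicate."""
    blocks = []
    for block in FILE_BLOCK.findall(state_text):
        match = NAME.search(block)
        if match and wanted(match.group(1).strip()):
            blocks.append(block)
    return blocks


# Hypothetical donor state file content; real names will differ.
SAMPLE_STATE = """<client_state>
<file_info>
    <name>h1_0100.00_EXAMPLE</name>
</file_info>
<file_info>
    <name>h1_0250.00_EXAMPLE</name>
</file_info>
<file_info>
    <name>skygrid_0250Hz_EXAMPLE.dat</name>
</file_info>
</client_state>"""

if __name__ == "__main__":
    # Keep only the blocks for the chosen frequency (here a made-up "0250").
    for block in harvest_blocks(SAMPLE_STATE, lambda name: "0250" in name):
        print(block)
```

The harvested blocks are kept as raw text so they can be pasted into the target host's state file unchanged, alongside the matching data files copied into the project directory.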
I've just done something very similar to about a dozen machines today. For the last six weeks they were all crunching tasks for a particular frequency that is now virtually exhausted. I have saved data and blocks for a different frequency that still has plenty of tasks left. So rather than risk the sort of problem you experienced, I added the new frequency set to the existing frequency set. So, when the tasks for the old set do run out, all these hosts will transition to the new frequency when they are good and ready and there will be no big downloads - in fact no downloads at all. You can allow the transition to occur by itself or there are tricks you can use to make it happen exactly when you want it to. But that's getting a bit complicated :-).
Cheers,
Gary.
Yes, thank you, Gary Roberts. That was exactly the message that I was referring to in my post at the top of this thread. I appreciate both the confirmation and the elaboration in your explanation. You did an excellent job of filling in the gaps in my knowledge. Thanks again.
Plus SETI Classic = 21,082 WUs