Since I rejoined BOINC in August, my monthly bandwidth usage has increased roughly fivefold, and my router is refusing connections from places like the Czech Republic, Beijing and Austria. Could any of this be BOINC related? I have detached from all the projects for now; I hope this means none of the BOINC-related servers will try to connect with me.
I shudder to think what my internet overage charges would have been if I did not have an unlimited plan.
How much is downloaded
If you wish to support science projects by crunching data, you should expect some extra bandwidth use, depending on the data that needs to be sent to you. Einstein has quite large data requirements, but even so, for just one computer the extra usage would hardly be noticeable. I run around 70 computers, I do employ some data-saving strategies, and my monthly consumption is around 250GB - i.e. less than 4GB per computer per month. On average, each machine does around 15 tasks per day, so I figure that 4GB per host per month is really quite modest these days. How much bandwidth per month is your machine using? It's actually quite possible (by taking advantage of the 'locality scheduling' used for GW tasks, plus a little micro-management) to have virtually negligible ongoing data requirements.
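For anyone who wants the rough arithmetic behind those figures (approximate numbers only):

    250 GB / 70 hosts         ~ 3.6 GB per host per month
    15 tasks/day x 30 days    ~ 450 tasks per host per month
    3.6 GB / 450 tasks        ~ 8 MB of new data per task, averaged out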
I would think not.
The servers of BOINC projects don't contact you. Your BOINC client contacts the servers if it wants to request work or to return completed work. The server never initiates the conversation.
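For example (assuming the boinccmd command-line tool that ships with the BOINC client), even a manually forced project update is something your client initiates and sends to the server, never the other way around:

    boinccmd --project http://einstein.phys.uwm.edu/ update

(Substitute whatever project URL your BOINC Manager shows.)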
They were never doing that in the first place. Is your primary concern the extra bandwidth or the unwanted connection attempts? BOINC shouldn't be responsible for the latter.
I thought internet charges in Australia were supposed to be pretty bad, but it only costs me $35 for 150GB (30GB peak, 120GB off-peak). What are your costs like in Canada?
Cheers,
Gary.
And else use the maximum
Or else use the maximum download preferences (only available on projects with up-to-date server software, so not available on all projects, and therefore best set through the local advanced preferences in BOINC Manager):
Go to your Einstein@Home preferences;
Edit them;
Add your maximum value in megabytes and the number of days it applies to in the "Transfer at most --- Mbytes every --- days" line;
Save the changes to the site with the "Update preferences" button.
When set through the BOINC Manager's advanced preferences instead, be aware that these override all of the equivalent web preferences. So you will then have to set your CPU usage, network usage and disk & memory usage preferences there according to how you had them configured on the web. The option itself is found on the Network usage tab.
Used as such:
Setting 100MB per 2 days will only allow a maximum download and upload of 100 megabytes of data every 2 days. Get close to 100MB, or go over it, and your BOINC will stop the transfer of data. It will continue to crunch the tasks in the queue; it just won't upload and report them, or download new work, for the rest of the 2 days.
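For what it's worth, the local equivalent lives in a file called global_prefs_override.xml in the BOINC data directory. Assuming a reasonably recent client (the tag names here are from memory, so treat this as a sketch), a 100MB-per-2-days cap would look something like:

    <global_preferences>
       <daily_xfer_limit_mb>100</daily_xfer_limit_mb>
       <daily_xfer_period_days>2</daily_xfer_period_days>
    </global_preferences>

After saving it, run 'boinccmd --read_global_prefs_override' (or use the equivalent menu item in BOINC Manager) so the running client picks it up.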
In case anybody was wondering
In case anybody was wondering about what I meant by the "data saving strategies" and "taking advantage of the 'locality scheduling' used for GW Tasks" references in my previous message, I thought I'd add a bit of detail. If you're not interested in the detail, please just ignore, as it's bound to be rather long winded :-).
If someone is joining the project afresh, or rejoining after an absence, the new S5GC1HF run will require you to take a considerable download 'hit' just to get your first task. You will need the new apps themselves, as well as the 'sun', 'earth' and 'skygrid' files as a 'once-off' download, and then you will need quite a large slab of LIGO data - perhaps as many as around 40 individual files of about 4MB each. All up, you would be looking at perhaps 200-300MB total download.
Locality scheduling means that you get to reuse all the LIGO data for (hopefully) many extra tasks for the same frequency bin. If you look at the name of the task in BOINC Manager, you get an idea of how many more tasks might be available. As an example, if you look at the OP's task list on the website, you can see these three tasks (oldest issue at the bottom, newest on top)
* h1_1310.85_S5R4__1072_S5GC1HFa_0
* h1_1310.85_S5R4__1125_S5GC1HFa_1
They all belong to the same frequency bin - 1310.85Hz.
They all have an inbuilt sequence number (seq#) (the bit immediately after the double underscore) which for the HF run will be starting well over 1000.
They all require the same set of LIGO data (this is what locality scheduling tries to achieve).
If you were able to look into the Einstein project directory of the OP's computer, you could see all these LIGO data files with names like h1_1310.85_S5R4, h1_1310.85_S5R7, l1_1310.85_S5R4, l1_1310.85_S5R7, ... with that sequence of 4 being repeated for a whole lot of frequency bins above 1310.85 - such as 1310.90, 1310.95, ... up to perhaps as high as 1311.30 or so. These are NOT tasks - they are all data files that must be on board before any task sent can be crunched. There are actually no separate files representing tasks. Tasks are just a collection of switches and parameters that the science app uses to crunch the LIGO data files. The details for all tasks sent are simply stored within the state file.
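Purely as an illustration (the project directory name below is from memory and the exact listing will differ), on a Linux host you could see the pattern from within the BOINC data directory with something like:

    ls projects/einstein.phys.uwm.edu | grep '_1310\.85_'
    h1_1310.85_S5R4
    h1_1310.85_S5R7
    l1_1310.85_S5R4
    l1_1310.85_S5R7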
The tasks with the highest seq# - such as 1125 above - are distributed first and on each subsequent request for work, the scheduler will attempt to send you the next available (lower) seq# for the same freq bin. From time to time you may also get 'resend' tasks - tasks that have failed on someone else's computer that are sent to you because you already have the 'right' data. Such tasks are easily recognisable because they have a suffix of _2 or higher. _0 and _1 are always 'first issue' tasks.
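To make the naming concrete, here's a throwaway bit of shell (illustration only) that pulls the interesting pieces out of a task name:

    name="h1_1310.85_S5R4__1125_S5GC1HFa_1"
    freq=$(echo "$name" | cut -d_ -f2)                            # frequency bin -> 1310.85
    seq=$(echo "$name" | awk -F'__' '{print $2}' | cut -d_ -f1)   # sequence number -> 1125
    issue=${name##*_}                                             # issue number -> 1 (_2 or higher means a resend)
    echo "freq bin: $freq   seq#: $seq   issue: $issue"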
There are always multiple hosts sharing a freq bin so at times, the seq#s can roll down quite quickly, depending on how many hosts are feeding and their appetites.
Eventually the seq# will roll down to zero at which point all initial tasks for that freq bin will have been issued. Even then the LIGO data files are still needed because there are bound to be quite a few failures of 'initial issue' tasks over the next few days to weeks. These failures will generate 'resends' over a surprisingly long period.
The work unit generator (WUG) doesn't generate all the seq#s at once. It creates a very small number of tasks per bin, on demand, as required. If the scheduler is asked for a particular freq bin, it may discover no available tasks. It then sets a flag which the WUG notices and, within maybe a minute or so, the WUG will have generated a few more tasks having the next available seq#s. If you happen to be the host asking for a task which isn't immediately available, the flag will be set but you will actually get tasks from the next highest freq bin. This will usually result in a server request for the 4 data files for the current freq bin to be marked for deletion in your state file (client_state.xml). It will also result in the download of 4 replacement LIGO data files at the high end of the freq range. So if you already had the 4 files for 1311.30, you would now get the 4 files for 1311.35.
Files are not deleted immediately. The client will only carry out the order once all tasks in your cache that depend on those data files are completed, returned and acknowledged. This could actually take quite a while. In most cases the delete request is premature - there are possibly hundreds of tasks still to come in the series, and the lack of tasks is quite transient. If the flag is manually removed from the state file, a subsequent work request will get tasks for this frequency because more will have been created in the interim. The particularly bad thing about the flag is not that it has been set, but rather that it cannot be automatically unset, and it prevents the client from requesting tasks for that particular freq bin, even though the data is still physically on the host and likely to remain there for a while to come. That is quite a waste of downloaded data.
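If you're curious, you can see the entries concerned by peeking at the state file, for example with something like the line below - but do it with the client shut down, and treat any hand-editing of client_state.xml as strictly at-your-own-risk:

    grep -B 2 -A 8 '1310.85' client_state.xml | less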
Another bit of useful information is that if multiple tasks are being requested, which can easily result in several freq bin shifts, the flag is NOT set for those shifts as long as at least one task was available before a bin was 'emptied'. In other words, you can jump through a number of bins, exhausting their contents, without them being marked for deletion. Within a minute or two, those bins will be replenished, ready for further work fetches at those frequencies.
By using these particular 'features' of the scheduler, I can start with a group of adjacent freq bins and have hosts continue to feed on this range of frequencies, jumping between bins as required without ever needing to download additional LIGO data files for other frequencies. By sharing this cache of LIGO data between a number of hosts (around 20 at the moment) the data downloads for these 20 hosts have been essentially zero for the last week after having set up the data caches. It's actually rather trivial to have a client ask for tasks for a particular group of frequencies and then have the scheduler play ball and keep sending just those tasks until they really are exhausted. That can easily continue for several weeks or longer.
Cheers,
Gary.
RE: Another bit of useful
Interesting. Thanks. Is that why many of the computers processing the same tasks as mine have about 80 tasks on the go and time out on great swathes of them? Seems a bit selfish if it is.
NG
RE: RE: Another bit of
Don't know the answer to your question, but I do know that as those people have units that 'time out' they will get fewer and fewer units to crunch. This protects the project from having PCs that just return junk, or don't return units on time, clog up the system. Now, as they start to return units properly, they will get more units to crunch again. It is a self-fixing algorithm, so the units available to those PCs will go up and down over time.
RE: Interesting. Thanks. Is
I wouldn't think that what you are noticing has anything to do with what I was describing in my previous message.
I can't speak for others but in my case, all my quad core hosts have around 100 tasks or more on the go because they can crunch around 16-20 tasks per day and I tend to keep a 5-6 day cache of work. It would be quite rare for any of my hosts (about 70 in total - not all quads) to have tasks that time out since they are usually monitored sufficiently and any problems corrected before any deadlines are missed. So, there isn't really any problem for a productive host to have quite a lot of tasks on board.
There are probably at least a couple of scenarios (and quite likely more) which would account for large numbers of tasks timing out in the way you describe.
Firstly, hosts crunching 24/7 and not being regularly monitored can lock up or crash without the owner being aware of it until after the deadline has passed. There are probably large numbers of volunteers who 'set and forget', so a few examples of this would not be surprising. In my case, I rarely have time to look at individual machines on a regular basis. So, to make problems more visible, I wrote a shell script which runs on one machine and contacts all the others on the LAN twice per day. The script pings the IP address of each host to make sure the host is alive. It also uses the boinccmd utility to do things like allowing and disallowing new tasks so that I can take maximum advantage of my ISP's off-peak period where downloads aren't charged. The script produces a log of everything it does so I can simply browse this log to see if there are any problems.
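In case it helps anyone, the skeleton below gives the flavour of that script. It's a sketch only, not the real thing - the host addresses, password handling, log path and the peak/off-peak scheduling are all placeholders:

    #!/bin/sh
    # Check each cruncher is alive and tell it to stop fetching new Einstein work.
    # A matching cron entry would run the same loop with 'allowmorework' once off-peak starts.
    HOSTS="192.168.1.10 192.168.1.11 192.168.1.12"
    PASSWD="gui_rpc_password_goes_here"
    LOG="$HOME/boinc_farm.log"
    for h in $HOSTS; do
        if ping -c 1 -W 2 "$h" > /dev/null 2>&1; then
            echo "$(date) $h alive" >> "$LOG"
            boinccmd --host "$h" --passwd "$PASSWD" \
                     --project http://einstein.phys.uwm.edu/ nomorework >> "$LOG" 2>&1
        else
            echo "$(date) $h NOT RESPONDING" >> "$LOG"
        fi
    done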
The second scenario is to do with specific circumstances that apply at the moment - ABP2 tasks have essentially finished and there is no GPU work until new data (from Parkes telescope) and new apps are released. Take a close look at this thread and you will see lots of people complaining about excessive numbers of CPU tasks being downloaded by BOINC as it tries to get some (non-existent) GPU tasks. This is a problem with BOINC and the workaround is to (temporarily) change the E@H preferences so as not to use the GPU and hence stop BOINC from trying to get any GPU tasks. There are probably many people being affected by this and probably a lot of them don't even realise that there is a problem. It would be a bit harsh to brand these people as 'selfish' because of a BOINC problem. Those who do notice and who seek advice would be aborting the excess tasks. Those who don't notice or don't take any action would have excess tasks that would be simply timing out - probably about now. This should stop being a problem as soon as the new GPU app is released and more GPU work becomes available.
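As an aside, and purely as a local stopgap that I haven't verified against this particular scheduler problem, reasonably recent clients also let you switch GPU use off from the command line (the preference change described above is still the proper fix):

    boinccmd --set_gpu_mode never 0

with the trailing 0 meaning, as far as I know, "until changed back".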
Cheers,
Gary.
Thanks for both mikey's and
Thanks for both mikey's and Gary's replies.
My experience was probably due to the "Scheduler went nuts" problem. The outstanding validations have now mostly caught up, and it's more usual for others to be waiting for my machine to report! I have just one outstanding validation remaining: a task sent out on the 10th of this month, which I completed on the 12th. The other computer's "All tasks" page shows (as I write) 28 tasks in progress, 7 timeouts, 2 aborts, 3 too-lates, 1 error, and 4 validations. All but one of the failures, plus one of the validated tasks, were sent out at intervals of just over a minute on the 5th, and 27 of the currently in-progress tasks were sent at a similar rate on the 10th. The machine does indeed have a GPU co-processor.
I happily abandon any idea of selfishness on the part of the other owners and commiserate with them over what must have been a worrying and frustrating time. :)
NG