Task deadlines and commissioning/upgrading hardware

Gordon Haverland
Joined: 28 Oct 16
Posts: 20
Credit: 428489605
RAC: 0
Topic 211022

Living in the northern hemisphere, November is winter (I live at 56N, so a long way from the equator).  In the summer, too many things related to weather keep me away from serious computer work.  But, I can buy hardware.  Maybe.

 

This year I was lucky, and bought some hardware.  My desktop machine was intended to be running an RX-460, but issues meant it had an R7-250 in it, with a dual-core CPU.  The upgrades meant putting in an 8-core CPU and putting the RX-460 back in.  I've been running SETI@Home for a long time, in part because I support getting to the Moon permanently.  Einstein@Home seems to have the best policy with respect to getting GPUs to work.  If neither project is providing CPU jobs, I will register World Community Grid with a machine, but they seldom have GPU jobs (have they ever?).

 

Okay, I swapped the CPU and put the RX-460 back in, and then upgraded the OS to Devuan/Ceres (think Debian/unstable without systemd).  I got jobs from the projects (at first, I had a vsyscall problem, which I have since fixed - well, until BOINC jobs quit requiring vsyscall).  With new hardware and a new OS, I was expecting problems.  So I think I dealt with them.

 

But since things started seeming to work, I've seen this computer spend almost all of its time on Einstein@Home.  I can see SETI jobs or World Community Grid jobs in the queue with some run time, but days later the runtime either hasn't changed, or has changed very little.

 

Maybe this is a startup quirk which needs to work itself through?  Fine, go on to the second hardware upgrade and OS upgrade.  It seems to be doing the same thing.

 

I haven't spent time analyzing things, but what it looks like is a deadline issue.  The priorities I have asked for with respect to the various projects are being ignored, because too many tasks have deadlines that are too close.  Which ends up with my machines spending all their time on Einstein@Home.  I can't cut Einstein@Home off, because it at least exercises my (new) GPUs.  Which isn't happening with SETI@Home or World Community Grid.  And I don't want to restrict Einstein@Home to only GPU jobs.

 

I am not complaining, I am just reporting a symptom.  It could be just a statistical fluke.  I do Monte Carlo runs with variables from distributions of infinite variance, so I know flukes.  :-)  I am just trying to understand the system.

 

I like Open Source, and hence I am trying to use Mesa instead of AMD proprietary stuff.  I now have 2 Polaris GPUs.  Next up is commissioning a new machine with a Ryzen processor and a Polaris GPU.  Finally, I will be updating my server and swapping the HD6450 in it for a Polaris GPU.  So I should soon have 4 machines running Mesa/Linux with new kernels and amdgpu support.

 

Have a great day!

 

Gordon Haverland
Joined: 28 Oct 16
Posts: 20
Credit: 428489605
RAC: 0

I spent the day dealing with things not related to computers.  At this point, I think I will just micromanage the uploading of new jobs.  The way it appears to me is that Einstein@Home (and possibly other BOINC sites) sends jobs with deadlines sufficiently close to "now" that they defeat the load balancing that is supposed to follow from giving priorities to projects.

 

I think that priorities need to be split between CPU and GPU.  Some BOINC projects do not provide GPU tasks.  There is no sense applying a project priority to tasks that will never happen.

 

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117567143296
RAC: 35281157

Hi Gordon,

I'm not sure what sort of response you might be looking for so I've simply read what you have posted (a couple of times) and I'll make some guesses/general comments on what might be happening.  I hope some of this might shed some light and/or be useful to you.

How BOINC allocates priorities, what work it chooses to download and from which project(s), together with the order in which it chooses to run that work, are not things that individual projects have any real control over.  If BOINC seems to be making a mess of things, your best option is to make sure you (and your preference choices) are not making things virtually impossible for BOINC to handle.  Micromanaging is not a long term solution and quite often in the short term it can make things worse, unless you really understand all the factors and how they interplay.

By the sound of what you're describing, BOINC is in high priority mode (I call it PANIC mode because quite often some stupid things can happen in that mode) and this is where BOINC can ignore your settings and do what it thinks it needs to do, quite often based on erroneous data that it can't know is erroneous.  It's a long complicated story and quite often it starts with the user setting too large a work cache size for the conditions under which BOINC is operating.

If you have an oversupply of work and a close deadline problem on lots of tasks, temporarily set the basic cache to 0.1 days and zero for extra days, which might get BOINC out of panic mode.  If you have task completion estimates way larger than they should be, that might be why BOINC is in panic mode.  You might be able to get out of panic mode by temporarily suspending some of those tasks until what's left no longer causes BOINC to panic.
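The two cache values mentioned above can also be set locally through a global_prefs_override.xml file in the BOINC data directory.  The Python sketch below only illustrates what such a file might contain.  It assumes the element names work_buf_min_days and work_buf_additional_days, which is how I understand the BOINC client to name these two settings, and the helper name build_override is made up for this example, so check against your own client version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_override(min_days: float = 0.1, extra_days: float = 0.0) -> str:
    """Return the text of a minimal global_prefs_override.xml.

    Assumption: the client reads <work_buf_min_days> (basic cache) and
    <work_buf_additional_days> (extra days) inside <global_preferences>.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:g}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:g}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before saving it anywhere.
    print(build_override())
```

After saving the output as global_prefs_override.xml, the client has to be told to re-read it, for example with boinccmd --read_global_prefs_override or the equivalent option in BOINC Manager.  Setting the same two values in the Manager's computing preferences should have the same effect.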

If you have partly crunched tasks for other projects yet BOINC seems fixated on crunching Einstein, it's probably because the other projects aren't under deadline pressure but Einstein is.   If you suspend or abort the excess Einstein tasks, maybe BOINC will drop out of panic mode and things can return to normal.  If you abort tasks without reducing your work cache size, maybe your client will just request more work again so be careful with that.

All of the above is supposition and could be quite wrong.  If you don't tell us the full details of projects, work cache settings, resource shares, task estimates compared to true completion times, hours/day that BOINC runs, etc, etc, it's difficult to give proper advice.

As far as Einstein is concerned, the standard deadline is 14 days.  If you have tasks failing the deadline, it's not because you were given almost expired work.  It's probably because your client asked for too much and it sat for far too long before it was allowed to start by your client.  Eventually it got to panic mode stage.  That's where partially completed tasks can be abandoned while BOINC concentrates on 'at risk' tasks.
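To make the cache-versus-deadline arithmetic concrete, here is a toy model in Python.  It is not BOINC's actual scheduler, which is considerably more elaborate.  It just runs tasks in earliest-deadline-first order on a fixed number of cores and reports which ones would finish late.  The function name, the task tuple layout and the sample numbers are all made up for illustration.

```python
import heapq
from typing import List, Tuple

# (name, estimated crunch hours, deadline in wall-clock hours from now)
Task = Tuple[str, float, float]


def tasks_at_risk(tasks: List[Task], cores: int,
                  hours_per_day: float = 24.0) -> List[str]:
    """Return the names of tasks that would finish after their deadline.

    Toy model: each task runs to completion on one core, tasks start in
    earliest-deadline-first order, and the machine crunches for
    hours_per_day out of every 24 hours.
    """
    slowdown = 24.0 / hours_per_day   # wall-clock hours per crunching hour
    free_at = [0.0] * cores           # wall-clock time each core becomes free
    heapq.heapify(free_at)
    late = []
    for name, est_hours, deadline in sorted(tasks, key=lambda t: t[2]):
        start = heapq.heappop(free_at)        # next core to become free
        finish = start + est_hours * slowdown
        heapq.heappush(free_at, finish)
        if finish > deadline:
            late.append(name)
    return late
```

For example, with 8 cores running around the clock and 12-hour tasks on a 14-day (336-hour) deadline, this model finishes at most 224 tasks in time, so a cache of 240 such tasks leaves 16 at risk even though none of them arrived close to expiry.  If the machine only crunches 12 hours a day, the number of tasks it can finish in time halves.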

Projects don't force work on you.  They try to send exactly what your client requests.  If you have too much work, you need to work out why your client asked for it in the first place.  And there's no such thing as a "startup quirk" which will just work itself out :-).  However there are ways to give BOINC a set of preferences that will cause it to have great difficulty in maintaining equilibrium :-).

In your very last paragraph, you refer to project priorities.  Do you actually mean resource share and how it should be honoured?  If BOINC isn't sharing your resources between the selected projects to your satisfaction, it's not a matter that individual projects can do much about.  To get things to change, the BOINC developers would need to do a lot of redesign and I can't see that happening in the current circumstances.

I hope some of this might be useful.

 

Cheers,
Gary.
