No new work if a task is suspended

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1578044612
RAC: 18363
Topic 213065

In your BOINC Manager you can see the tasks that have been downloaded, and you can allow or stop further downloads with the No new work button.

Now if you suspend a task (or several tasks), no new downloads start any more. Downloads only resume once the suspended task(s) are either running again or aborted.

In other words, the Suspend this task button behaves in the same way as the No new work button.

Is that a little bug related to the recent database reshuffling, or is it an older weird BOINC feature?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118443081731
RAC: 25905890

It has always worked that way.  I guess the thinking must be that if a user has suspended tasks that are in the work cache, it's probably not a good idea to allow further tasks to be sent until the user decides to deal with what is already on board.  The user could either allow them to crunch or abort them.

It's not the same as 'No new tasks'. Suspending some tasks to allow others to run 'out of sequence' is a legitimate (and quite transient) use of the 'suspend a task' function.

I think it might have been designed to be some sort of 'anti-cherry-picking' initiative, particularly for projects where cherry picking of tasks had been a problem in the past.

Why do you need to get more tasks if you have already suspended some?  I don't think it's fair to call it either a 'bug' or a 'weird BOINC feature' :-).  If you want to 'top up the cache', just unsuspend those tasks temporarily, get some new work and then immediately suspend them again.

 

Cheers,
Gary.

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1578044612
RAC: 18363

Right, so that feature has a justification and was introduced with good intentions, so let me take back the words bug and weird. :-) But I stick to my view that for new and inexperienced users like me it has an unwelcome side effect.

I had a few CPU tasks and a few GPU tasks in the BOINC Manager. I suspended one CPU task in order to measure the speed-up when crunching the remaining ones, because we know that a multicore system only runs at full boost speed when not all cores are busy. Soon afterwards I ran out of GPU tasks, which I assumed was related to the project downtime. I didn't imagine that suspending a single CPU task would leave the GPU sitting idle with no further downloads. I'm always eager to learn something new, but it's a pity that sometimes it only works by trial and error. :-)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118443081731
RAC: 25905890

solling2 wrote:
I had a few CPU tasks and a few GPU tasks in the BOINC Manager.

OK.

solling2 wrote:
I suspended one CPU task in order to measure the speed-up when crunching the remaining ones, because we know that a multicore system only runs at full boost speed when not all cores are busy.

I don't understand what purpose this would serve.  If you suspend a single task, wouldn't one of the other unstarted tasks in your cache simply start crunching in its place so that the same number of CPU cores would still be loaded?  I could understand it if all other CPU tasks that were waiting to start were suspended as well, or if you had no other CPU tasks ready to start.

If that was in fact what you were doing (suspending all available CPU tasks to reduce the active CPU cores by one), you could have achieved that more easily by reducing the number of cores BOINC is allowed to use by one.  For example, on a quad core host, reduce the % of cores setting by 25%.  This is quick, local, immediate, and just as easy to reverse when you are done.  And you can leave it in place for as long as you like without any interruption to the work flow for either CPU or GPU tasks.

 

Cheers,
Gary.

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1578044612
RAC: 18363

I'm completely with you in your analysis. I should have been more detailed in my description, but when writing in a foreign language you want to avoid every unnecessary word. :-)

I usually avoid touching the percentage-of-cores setting, because I remember a time when I reduced it for a test, forgot to set it back, had to go out, and came back to find a lot of tasks untouched. So this time I had a small number of CPU tasks and, to start with, suspended all but one of them, thereby imitating a 75% core usage setting. A few GPU tasks were still left at that point, so it didn't seem to matter. But the GPU tasks were crunched down much faster than the CPU task, and that is where the no-download misery began: I didn't know the server wouldn't differentiate by the kind of task that is suspended. At first I found no way to download GPU tasks. So I set No new work, which is my default setting, then tried to download work from my fallback project, but it had no work at all. Then I allowed all the remaining CPU tasks to crunch at once, and only after that did I have the idea to undo the No new work setting. Because downloads then arrived, to my surprise, I suspended just one of the remaining two CPU tasks to confirm that this alone causes the download inhibition described above.

So one day I'd like to see on a BOINC wish list that the server could differentiate between CPU and GPU tasks in this suspend/download behaviour.

My initial intention, to find out more about a possible speed-up from limiting core usage, given the difference between the standard and boost clock speeds of a multicore system, was more for fun, since the overall speed-up is probably marginal. Say I gain a 10% speed-up. The GPU is the time-limiting step, so those 10% only affect the 89.997% to 100% interval. Say that stage is 2% of a task's duration, so we're talking about seconds. Still, it's fun! :-)
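
To put a number on it: a 10% speed-up applied to a stage that is only 2% of the runtime saves roughly 0.10 x 0.02 = 0.2% of the total crunch time, which for typical task durations really is just a matter of seconds.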

mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873199436
RAC: 1894103

solling2 wrote:

... that is where the no-download misery began: I didn't know the server wouldn't differentiate by the kind of task that is suspended. ... So one day I'd like to see on a BOINC wish list that the server could differentiate between CPU and GPU tasks in this suspend/download behaviour. ...

A solution would be to set up a backup project with a zero resource share. That way, if you suspend one project, or it hiccups and runs out of work to send, the backup project will automatically kick in and send you new tasks. BUT it will only send enough tasks to fill the CPU or GPU cores that are actually free on your PC. This would NOT have helped you in the way you ran your test, because BOINC still thinks you want to use 100% of the CPU cores and you would just get more CPU tasks from the backup project. But you would get GPU tasks too, and that would solve this particular problem. I always keep one CPU project and one GPU project as backups, each one different, for when my main project runs out of work.

I personally don't run the same project on both my CPUs and GPUs on the same machine; the workunits differ greatly in crunching time, yet the BOINC software doesn't differentiate between them when sizing the cache. That means if the GPU tasks are 2 minutes each but the CPU tasks are 30 minutes each, I would still get a boatload of CPU units due to the very short GPU workunits.

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1578044612
RAC: 18363

mikey wrote:

A solution would be to set up a backup project with a zero resource share. That way, if you suspend one project, or it hiccups and runs out of work to send, the backup project will automatically kick in and send you new tasks. ...

Another useful hint for me, thanks! I already had a feeling that I might be micromanaging this project too much. But that falls into the category of: if something is somehow running, you never think about it again.

One of the good points of the Einstein project is that you rarely need the fallback project, since Einstein runs extremely reliably and stably, even, for example, during the recent major server update.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118443081731
RAC: 25905890

mikey wrote:
.... if the GPU tasks are 2 minutes each but the CPU tasks are 30 minutes each, I would still get a boatload of CPU units due to the very short GPU workunits.

For the benefit of anyone following this discussion, please be aware that the above doesn't give the full picture.  It doesn't matter at all that there is a difference in crunch times between GPU tasks and CPU tasks.  What does matter is the accuracy of estimates and how estimates are corrected locally by BOINC.

Firstly, consider the difference between the estimated crunch time and the true crunch time.  The project does try to send out tasks where the estimate agrees with the true value as much as possible.  Due to the wide range of differently performing hardware that crunches these tasks, that can be quite difficult to achieve.  Some hosts may be able to do better than the estimate and some may do worse.  It's never going to be perfect.

Secondly, the mechanism used to correct the estimate may be different between different projects.  Einstein uses DCF (Duration Correction Factor).  This works fine, except for one little problem - there is only a single DCF parameter for the entire project.  This is a well known BOINC deficiency that has been around for a long time with little prospect of ever being resolved, so it would seem.

This means that if you run different searches at the one project, and if the relationship of estimate to actual is markedly different between these searches, the DCF can react in opposite directions in a see-saw fashion and BOINC will never be able to settle on a good stable value for crunch time.  This will be most troublesome if GPU tasks finish a lot faster than their estimate and CPU tasks finish much more slowly than their estimate (or vice-versa).

Here is a more detailed version of the above quote that better explains the situation:-

"If the GPU tasks take 2 minutes each with a 4 minute estimate but the CPU tasks are 30 minutes each with a 15 minute estimate I would get a boatload of CPU units because the 'faster than estimated' GPU tasks will have reduced the DCF which in turn has dragged down the already low estimate for CPU tasks, requiring BOINC to request even more of them."

So, if you only take GPU work from Einstein and CPU tasks from elsewhere, there won't be a fluctuating DCF problem and the accompanying instability in CPU task fetch.  Mikey's solution will work nicely.

But that's not the end of the matter.  There is another factor, that compounds the potential over-fetch of CPU tasks where there is a nice fast GPU crunching as well.  By default, when running a GPU task, a CPU core is prevented from running a CPU task.   This is a default project setting for Einstein because GPU tasks can be adversely affected if appropriate, timely support is not immediately available when needed.  For example, in a quad core host, crunching both CPU and GPU tasks,  BOINC would fetch work for 4 cores but only allow 3 to run CPU tasks.  If you decide to crunch 2 concurrent GPU tasks, BOINC will still fetch for 4 cores but only allow 2 to crunch CPU tasks.  This could allow a significant over-fetch of CPU work, even if the CPU work was coming from a different project (so not being affected by an inappropriate DCF).

This form of over-fetch can be eliminated by the use of BOINC's %cores setting and the use of app_config.xml files.  Again using a quad core example, if the %cores was set at 50% and the app_config.xml file set <gpu_usage> to 0.5 and <cpu_usage> to anything less than 0.5 (eg. 0.2), 2xGPU tasks and 2xCPU tasks would be allowed to run with BOINC fetching new CPU tasks for just 2 cores.  The two non-loaded cores would be available for GPU support whenever needed.  Please realise that these settings do not have any impact on the number of CPU cycles that a GPU task will need for support.  The settings are just to allow BOINC to calculate how many tasks of each type to run.  You have to decide what the appropriate numbers should be, essentially by trial and error (and a bit of common sense) :-).
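
For anyone who hasn't used one before, here is a minimal sketch of an app_config.xml for that quad core example. It goes in the Einstein project directory inside your BOINC data folder. The app name below is only a placeholder - use the exact name shown in your client_state.xml (or on the project's Applications page) for the search you are actually running, and adjust the numbers to suit your own hardware:

<app_config>
   <app>
      <!-- placeholder name: substitute the real app name from client_state.xml -->
      <name>hsgamma_FGRPB1G</name>
      <gpu_versions>
         <!-- 0.5 of a GPU per task, i.e. 2 concurrent GPU tasks -->
         <gpu_usage>0.5</gpu_usage>
         <!-- budget 0.2 of a CPU core per GPU task for BOINC's scheduling sums -->
         <cpu_usage>0.2</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

After saving the file, use 'Read config files' in BOINC Manager (advanced view) or restart the client for it to take effect. Combined with the 50% cores setting on a quad core, BOINC will then run 2 GPU tasks plus 2 CPU tasks and only fetch CPU work for the 2 cores it is allowed to use.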

 

Cheers,
Gary.

mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873199436
RAC: 1894103

Gary Roberts wrote:
For the benefit of anyone following this discussion, please be aware that the above doesn't give the full picture. ...

Thank you very much for stating that much better than I did; I also learned a thing or two along the way!!
