It took me a while to compose a response to your earlier message. I didn't see the latest one until after I had posted that response.
Glenn Hawley, RASC Calgary wrote:
... [coproc] ATI instance 0; 0.330000 ...
You must have set a GPU utilization factor of 0.33. That's rather crazy for such a low end card. I guess you started with 0.5 (since the tasks are branded with that) and then decided to change to 0.33 at some point. The card you have is not capable of running these tasks at 3x. It should run at 2x, but I rather suspect you may see a zero to negative performance improvement.
You should first work out what is causing the slow (and variable) crunch times when running single tasks - unless those much slower times in your list happen to be evidence that perhaps at some stage there were actually two tasks crunching simultaneously. How often do you monitor what is actually running?
1 & 2- This machine runs only Einstein.
3- All 16 threads are running an Einstein CPU workunit
4- Deadlines appear to be all December 15th or later
5- Work cache is set at 2 days
6- I used the GPU utilization factor
7- Yes, new work was downloaded identified as requiring 0.5 GPUs, and then after I changed it again, at 0.33 GPUs.
I chose the various components of this machine. The card is about the highest level I could find that would output to both VGA and DVI-D cables. Most modern high end graphics cards only have HDMI outputs, which are of no use without an HDMI compliant display screen. Of course I could buy such a card and use it solely for crunching BOINC, but my wife might not agree :) :)
Is the variability in time due to mixing up the Einstein production of my other computer? It's a laptop, and CPU tasks seem to take over 2 days, whereas on this one it looks like somewhat less.
I tried the test you suggested, suspending all "ready to start" CPU tasks and then suspending one running task. Two GPU tasks immediately lit up. So BOINC seems to be prioritizing the CPU threads over the GPU.
My simplest option, then (other than buying a graphics card that I can never use for graphics), appears to be to just stay content with one GPU task at a time. At least Richie's suggestions managed to clear up the total blockage of work from the card.
Sorry I didn't respond earlier... I was sort of wishing that Gary would show up and explain and fix in more detail some of those things that I was chatting about. And that happened, good.
Glenn Hawley, RASC Calgary wrote:
The card is about the highest level I could find that would output to both VGA and DVI-D cables.
Have you checked if adapters could solve that scenario? I've used VGA-DVI (those small adapters used to come with many GPUs in the package). Also adapter cables are sold in many types (everything between DP/HDMI/DVI). A simple VGA/DVI adapter is cheap. Prices for the cables vary from cheap to nasty, depending on cable length and place of purchase.
Well, I'm not going to change to a different graphics card at this point.
The KVM switch between this new Win10 and the older WinXP is working fine as it is.
At some later date I might look into getting another graphics card, one that costs $500 today but will be cheaper in the future as it becomes obsolete... but still faster than what I've got now.
I wasn't even aware until just recently that more than one task could even be run on a GPU.
Hi Glenn,
Thanks for responding. From the information provided, some things are now a lot clearer but some puzzles remain.
Glenn Hawley, RASC Calgary wrote:
3- All 16 threads are running an Einstein CPU workunit
Are you sure about that? If so, that's an immediate problem since with just a single GPU task running, BOINC should be limiting the maximum number of CPU tasks to 15. When you set the GPU utilization factor to 0.5 (and with each GPU task requiring the support of a full CPU thread) the maximum allowed to run should be 14. Are you really sure that when any GPU tasks are running, there are also 16 CPU tasks running?
Please be aware that running all threads does put a heavy load on the machine. Since the machine is new and hopefully with good cooling, it should be OK. Over time, you will need to watch temperatures and keep the internals clean and free from dust blockages. You will probably see quite an increase in your power bill.
In addition to that, you also need to be aware of what 8 cores, 16 threads really means. Yes, you can run a full 16 CPU tasks. Two CPU tasks will be 'sharing' the one physical core. Each one will run more slowly than it would if it had sole access to that physical core. The hope is that although they take longer when running concurrently, it should overall be somewhat shorter than the total time for running the same two tasks consecutively on a single core. Just how much shorter depends on the nature of the tasks. They are compute intensive so I suspect the gain won't be large. The only way to know for sure would be to measure it.
To do that, you could set BOINC to use 50% of your threads. With all GPU tasks suspended, you should see 8 running CPU tasks. As long as nothing much else was running, this should give you a reasonable estimate of the '1 per core' crunch time to compare with the '1 per thread' that you should already be seeing.
The number of running CPU tasks will also impact on GPU performance. As a test to find the true GPU performance, you could try temporarily running GPU tasks only. This would give you the absolute lowest possible GPU crunch time. During that test you could run GPU tasks singly to get that figure and then 2x (as a separate test) to see how that goes. This would certainly tell you if there is any advantage from running 2x.
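The comparison Gary describes can be reduced to a simple throughput calculation: convert measured crunch times into tasks per day for each configuration. The times in this sketch are made-up placeholders, not measurements; plug in your own results from the two tests.

```python
# Rough throughput comparison: does 16 tasks on 8 hyperthreaded cores
# beat 8 tasks at one per physical core? The crunch times below are
# hypothetical placeholders; substitute your measured values.
def tasks_per_day(concurrent_tasks, seconds_per_task):
    return concurrent_tasks * 86400 / seconds_per_task

one_per_core = tasks_per_day(8, 30000)     # e.g. 8 tasks at ~8.3 h each
one_per_thread = tasks_per_day(16, 52000)  # e.g. 16 tasks at ~14.4 h each

print(f"1 per core:   {one_per_core:.1f} tasks/day")
print(f"1 per thread: {one_per_thread:.1f} tasks/day")
# If the 16-task figure is higher, hyperthreading is a net win even
# though each individual task takes longer.
```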
Glenn Hawley, RASC Calgary wrote:
The card is about the highest level I could find that would output to both VGA and DVI-D cables.
Do you really need both? All cards I've bought recently have DVI-D as well as HDMI, and even the oldish screens I use support both DVI-D and VGA. Does your screen not support DVI-D?
Glenn Hawley, RASC Calgary wrote:
Is the variability in time due to mixing up the Einstein production of my other computer? It's a laptop, and CPU tasks seem to take over 2 days, whereas on this one it looks like somewhat less.
I'm not sure what you mean by "mixing up". Your computers list shows two separate hosts, the new Ryzen and an i5-6200U, which I imagine is your laptop. Two separate host IDs, two separate tasks lists, two separate sets of stats. One can't affect the performance of the other. Both are listed as running Win 10.
In your most recent message you mention a KVM switch and Win XP?? Do you have a third machine, running XP, and sharing peripherals with your Ryzen? Is it the KVM switch that requires you to also need VGA?
You also mentioned $500 for a graphics card. You wouldn't need to spend anything like that. Recently I've been buying graphics cards that cost $US135 ($AU189) and in my country prices are always more expensive than in North America. These cards complete an FGRPB1G task in less than 10mins. I'm certainly not trying to encourage you to buy a new card. I'm just trying to correct the impression that others reading here might get about how expensive it is to own a decent GPU capable of efficient crunching performance.
Recently, you must have made a change to your profile here because, as a moderator, I get the job of 'approving' new profiles or ones that change. I remember approving yours. I was impressed by your background, your involvement with RASC and your long service to BOINC projects. In an earlier message in this thread, you mentioned, "with BOINC capability as an important consideration" so I'm just trying to provide information about things that seem to be important to you :-).
Glenn Hawley, RASC Calgary wrote:
I tried the test you suggested, suspending all "ready to start" CPU tasks and then suspending one running task. Two GPU tasks immediately lit up. So BOINC seems to be prioritizing the CPU threads over the GPU.
That is quite puzzling. Are you saying that before you suspended the single running task there were no GPU tasks running? That is what your words seem to imply. Or did you mean that "one extra GPU task lit up", which would be less puzzling? Seeing as you seem to have a GPU utilization factor of 0.33, did you try suspending a further CPU task to see if 2 running GPU tasks would turn into 3? Actually, I probably wouldn't even try that for fear it would turn into an immediate crash :-).
Answers to earlier questions (eg. did you really have 16 running CPU tasks when you had a single GPU task running?) might reduce the puzzle so I'll wait until you respond to those. Also, if you have further questions about any of my comments, please ask :-).
The Tasks screen typically shows 16 CPU tasks as "running". Whether they're all actually computing or not isn't immediately obvious.
I used BOINC manager to run only 50% of the CPUs, and the 8 least advanced tasks became "Waiting to run".
I'll have to wait a day or two to see if that increases CPU task run speed significantly.
I wonder if setting that reduces the actual number of CPUs being used to four, and thus each still running two threads, or if it reduces the number of threads per core by half.
The second test, running only GPU tasks, will have to wait until the first test has shown its results.
My two screens have, respectively, a VGA and a DVI-D cable. The VGA one is connected through a KVM switch so that I can bring up my old XP machine through its VGA output (the Pentium 4 is no longer good enough to run Einstein, and is now relegated to Asteroids and SETI). I have become spoiled using two screens (three at work, when I was active in the seismic industry), so the new computer runs both screens, and the older one is now reduced to just the one screen. We have things that still run great on WinXP, but not at all on Win10, or with alternatives that for various reasons we see as inadequate.
Maybe I could buy another graphics card and install it solely for crunching, since even the old machine's graphics are fast enough for the uses my wife and I have (and the card in the new computer is MUCH faster). But I'd want to get the thing stable and figured out first in its current configuration. I can see already that even the modest card I have looks to be producing as much credits as my entire laptop does (~20K).
I have seen hints on the Einstein and BOINC fora that there are ways for BOINC (or Einstein) to control individual GPUs, with different parameters for each, through some sort of config.xml type of coding, but I have only the vaguest idea of the necessary syntax.
The test suspending all "ready to start" CPU tasks was done with only one GPU task running (the task saying 1 CPU & 0.33 GPU). When I suspended one of the running CPU tasks, two more such GPU tasks started up, making three active, and presumably sharing not just the one freed-up CPU thread but probably others as well, thus competing to some extent with the CPU tasks. I've returned to running only one GPU task at a time.
This whole scenario presents interesting puzzles, and I'm learning a whole lot more about BOINC and Einstein than I ever knew... since previously I didn't have the hardware to even explore these options.
I wonder if setting that reduces the actual number of CPUs being used to four, and thus each still running two threads, or if it reduces the number of threads per core by half.
The BOINC setting you are considering here just influences how many tasks are launched, not where they run. Your OS will choose where to run them. Last time I checked, on my Windows machines, the launching was somewhat random among the available CPUs. Thus in your hyperthreaded case, when launching only four tasks, some of them sometimes would get paired on the two sides of a physical core, and sometimes some physical cores would get no Einstein tasks.
You would probably find that you would get more Einstein work done, and much more repeatable elapsed time, if you restricted the Einstein task to run on only the even or the odd numbered CPUs. As you run Windows, you could do this for the currently running instances using, for example, the utility Process Explorer. Doing it as a standing policy might most easily be done using the Process Lasso application.
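The "even-numbered CPUs only" restriction described above boils down to a CPU-affinity bitmask, one bit per logical CPU. This sketch only computes the mask value; applying it is done in whichever tool you use (Process Explorer, Process Lasso, or anything that accepts a hex affinity mask).

```python
# Sketch: build an affinity bitmask selecting only the even-numbered
# logical CPUs (0, 2, 4, ... 14 on a 16-thread machine). On a
# hyperthreaded Ryzen this picks one thread from each physical core.
def even_cpu_mask(n_cpus):
    mask = 0
    for cpu in range(0, n_cpus, 2):
        mask |= 1 << cpu  # set the bit for this even-numbered CPU
    return mask

print(hex(even_cpu_mask(16)))  # 0x5555: eight bits set, one per core
```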
All my comments here are directed to the specific case where Einstein is restricted to run just four tasks on an 8-CPU machine, where the 8 CPUs are implemented by hyperthreading of four physical cores.
Here's an extract from someone else's app_config.xml that specifies max_concurrent... that might be an approach to throttle CPU tasks to free up cores for GPU efforts.
If there's a way to distinguish gpu_0 from gpu_1 and set different gpu_usage for each, plus set "suspend when computer is in use" for gpu_0, (assuming gpu_0 is the weaker of the two) while limiting the CPU workunits without limiting BOINC access to the cores, that would resolve many potential issues with having a second graphics card.
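As far as I know, BOINC has no way to set a different gpu_usage per device, but the client configuration file cc_config.xml does support an <exclude_gpu> option that can keep a given app off a given card, which covers part of that wish list. This is a sketch only; the project URL, app name, and device numbering would all need checking against your own BOINC event log before use.

```xml
<cc_config>
  <options>
    <!-- Sketch: keep the FGRPB1G app off device 0 (assumed here to be
         the weaker card), so only device 1 runs it. Device numbers are
         those reported in the BOINC event log at startup. -->
    <exclude_gpu>
      <url>einstein.phys.uwm.edu</url>
      <device_num>0</device_num>
      <app>hsgamma_FGRPB1G</app>
    </exclude_gpu>
  </options>
</cc_config>
```

The "suspend when computer is in use" part would still apply to both cards, though; that preference is global, not per device.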
I'm learning more every day.
I've used Process Lasso to exclude all the odd numbered CPUs from the einstein_O2MD1_2.00 running processes, and thus dropped the number running to 8 (out of 16 threads possible).
Plus I excluded hsgamma_FGRPB1G from all even numbered CPUs to avoid conflict there.
However, when I reset the computing preference to "Use at most 100% of the CPUs" instead of "50%", it picked up 8 more CPU workunits and started them.
So it looks like the 16 workunits may be trying to multitask on only four cores (eight CPUs).
In Process Explorer, it looks like the "number of CPUs" being used by each workunit is varying from 1.12 to 6.21. At least I think that's what it means by the "CPU" column.
The GPU must now be competing for time against something, since the estimated time for the WU has gone from about 3.5 hours up to 3+ days.
I think a variant of max_concurrent would be necessary to rein in the CPU processes.
BOINC and Win10 seem bound and determined to run 16 CPUs worth of work concurrently, even though I've restricted the application (in Process Lasso) to even CPUs only.
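A minimal app_config.xml along those lines might look like the sketch below. The <name> value is an assumption based on the einstein_O2MD1_2.00 executable mentioned earlier; the real app name should be confirmed in client_state.xml before relying on this.

```xml
<app_config>
  <!-- Sketch only: cap the O2MD1 CPU search at 8 concurrent tasks,
       leaving the other threads free for GPU support and the OS.
       App name assumed from the executable name; verify it first. -->
  <app>
    <name>einstein_O2MD1</name>
    <max_concurrent>8</max_concurrent>
  </app>
</app_config>
```

Unlike the Process Lasso affinity restriction, this stops BOINC from launching the extra tasks in the first place, so it and the affinity setting should no longer fight each other.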