I've always seen no GPU load and 100% load on one CPU thread at 90% completion. With an RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.
mmonnin wrote: I've always seen no GPU load and 100% load on one CPU thread at 90% completion. With an RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

I have an RX 580 in a Q6600 quad core host. The Q6600 is 2008 vintage, so relatively slow by today's standards. It runs 3 concurrent GPU tasks and 1 CPU task. Using the properties function in BOINC Manager, I recorded the following information for one of the concurrent GPU tasks in the final stages of crunching:

% Complete    CPU Time    Total Time
89.54%        1:00        28:18
89.93%        1:00        28:28
100%          1:02        28:31
The time figures are minutes:seconds as displayed on the properties page, with a granularity of 1 second. The main point I'm trying to make is that this is quite different from what it was about a month or so ago. At that earlier time on this particular GPU, the follow-up stage (% complete sitting on 89.997%) lasted around 30 seconds to a minute. I never measured it precisely.
On the above figures, the very last bit of crunching, plus the follow-up stage (if there still is one), plus the time to retrieve the final results from the GPU, shut down the app and write the results to disk ready for uploading, took 3 seconds of elapsed time (28:28 to 28:31) and roughly 2 seconds of CPU time (1:00 to 1:02). So the above quote doesn't really give the complete picture, particularly regarding how short the 90-100% stage really is and how relatively little of the total CPU consumption occurs there.
Cheers,
Gary.
Hi, in case it helps, here is a link to a list from Tom's Hardware ranking GPU performance from 2018:
https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html
No Titan V?
Must have assumed most people don't have that kind of cash for a GPU.
No matter how long the 90% to 100% stage takes, GPU utilization drops during it, which was my point. Whether it's the few seconds it takes now or the couple of minutes with the earlier data, there was little load on the GPU during that time frame.
I took a look at the Top Computers list. Here's the single GPU performance I gathered. The actual performance depends on overclocking and the number of work units run at the same time.

GPU            RAC
GTX 1080 Ti    700k~800k
GTX 1080       550k~700k
Vega 64        1100k~1500k
RX 480/580     550k~750k
RX 570         ~500k
AMD GPUs obviously have a huge advantage over NVIDIA GPUs even when they have similar FLOPS. For example, the GTX 1080 Ti and Vega 64 both have 484 GB/s memory bandwidth, and the GTX 1080 Ti has 11,340 GFLOPS FP32 while the Vega 64 has 12,583 GFLOPS. However, the Vega 64 is almost twice as fast in Einstein@home. I understand FGRPG uses OpenCL, but AMD cards do not outperform NVIDIA cards in most OpenCL benchmarks:
https://www.phoronix.com/scan.php?page=article&item=12-opencl-98&num=6
I don't understand why Einstein@home has such poor optimization for NVIDIA cards when, according to https://einsteinathome.org/server_status.php, the number of hosts with NVIDIA GPUs is more than twice the number of hosts with AMD GPUs.
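As a rough sanity check on the "almost twice as fast" claim, here is a quick back-of-the-envelope comparison of RAC per GFLOPS using the table above; the RAC midpoints are just my own assumption taken from those ranges, not measured values.

```python
# Back-of-the-envelope RAC-per-GFLOPS comparison using the figures quoted above.
# RAC midpoints are a rough assumption taken from the ranges in the table.
cards = {
    #              (RAC midpoint, FP32 GFLOPS)
    "GTX 1080 Ti": (750_000, 11_340),
    "Vega 64":     (1_300_000, 12_583),
}

for name, (rac, gflops) in cards.items():
    print(f"{name}: {rac / gflops:.0f} RAC per GFLOPS")

# With these numbers the Vega 64 delivers roughly 1.5-1.6x the RAC per GFLOPS of
# the 1080 Ti, so the gap is much larger than the ~10% raw FLOPS difference.
```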
E@H's GPU apps are written in OpenCL so the project only has to maintain one code base, not two. Unfortunately NVidia's OpenCL implementation kinda sucks; many people suspect that's deliberate on NVidia's part and that they're trying to encourage developers to target their proprietary CUDA API instead.
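To illustrate the "one code base" point, here is a minimal sketch (not E@H's actual code) of how the same OpenCL kernel source is compiled at run time by whichever vendor's driver is present, AMD or NVidia; the pyopencl usage and the trivial kernel are just an illustration.

```python
# Minimal sketch: one OpenCL kernel source, built and run on every GPU found,
# regardless of vendor. Requires the pyopencl and numpy packages.
import numpy as np
import pyopencl as cl

KERNEL_SRC = """
__kernel void scale(__global const float *in, __global float *out, float k) {
    int i = get_global_id(0);
    out[i] = in[i] * k;
}
"""

def run_on_every_gpu():
    for platform in cl.get_platforms():          # e.g. "AMD Accelerated Parallel Processing", "NVIDIA CUDA"
        try:
            gpus = platform.get_devices(device_type=cl.device_type.GPU)
        except cl.Error:
            continue                             # this platform has no GPU devices
        for dev in gpus:
            ctx = cl.Context(devices=[dev])
            queue = cl.CommandQueue(ctx)
            prog = cl.Program(ctx, KERNEL_SRC).build()   # vendor driver compiles the kernel
            x = np.arange(16, dtype=np.float32)
            y = np.empty_like(x)
            mf = cl.mem_flags
            x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
            y_buf = cl.Buffer(ctx, mf.WRITE_ONLY, y.nbytes)
            prog.scale(queue, x.shape, None, x_buf, y_buf, np.float32(2.0))
            cl.enqueue_copy(queue, y, y_buf)
            print(platform.name, dev.name, y[:4])

if __name__ == "__main__":
    run_on_every_gpu()
```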
Vega 64 doesn't outperform GTX 1080 Ti by more than 50% in any of the tests in this article.
https://www.phoronix.com/scan.php?page=article&item=12-opencl-98
In the Single Precision FFT benchmark, which I think is the most pertinent to E@H, Vega 64 leads by only 10%.
Looks like an old AMD R9 390 still produces about as much as an RX 480/580 or a GTX 1080. The initial cost to get one is much lower, but it will use more electricity.
I have 6 computers on my LAN, all with AMD CPUs.
One machine currently "has no GPU": it has an HD5450 (?), but that crappy proprietary add-on from AMD is not supported by the version of X it is running (it runs Debian Jessie). It's an 8320E (8 cores).
I have a 2-core AMD CPU paired with an RX-550.
I have a 4-core AMD APU (but the GPU part is being ignored) paired with an RX-550.
I have an 8-core 8320E with an RX-460. The fans have been making noises and I got worried, and accidentally ordered two RX-570 replacements for it, for when it dies or I get tired of hearing it.
I have a Ryzen-1600X with 32 GB of RAM and currently an RX-560 in it. One of the RX-570s I bought has 8 GB of RAM; it will go in this machine.
I have a Ryzen-1600 with 16 GB of RAM which had an RX-560 in it. That was recently replaced with an RX-570 with 4 GB of RAM. This machine has the Ryzen idle-freeze problem. I am only running it (at the moment) with 3 BOINC jobs, because I still haven't found a solution to the idle freeze. It is running the latest BIOS, I have installed the most recent firmware-amd-graphics package, and I am playing with BIOS settings and some kind of Zen Python program to disallow C6.
The machine with the 8320E and (effectively) no GPU was meant to get hardware upgrades last winter, but I ran out of time. So soon it will get the hardware upgrades, which will mean putting an RX-560 in it.
I have done nothing so far in terms of tuning how many jobs to run, or anything else. By and large, I was hoping to lean on BOINC to learn about doing computations across machines on a LAN. At some point, I am hoping that something (OpenMPI, OpenCL, ...) may allow me to work on problems where my whole LAN contributes to solving a problem. One of the problems is such that I might need to set up a BOINC server to have other computers contribute to solving it. That's really vague.
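For what it's worth, the kind of thing I have in mind for the OpenMPI route is a minimal sketch like the one below, using mpi4py; the toy partial-sum problem, the host file and the script name are placeholders, not anything I actually run yet.

```python
# A minimal mpi4py sketch of spreading a toy computation across LAN hosts.
# The partial-sum "problem", the host file and the script name are placeholders.
# Run with something like:  mpirun --hostfile hosts.txt -np 6 python lan_sum.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N = 10_000_000
# Each rank handles its own slice of the index range [0, N).
lo = rank * N // size
hi = (rank + 1) * N // size
partial = np.arange(lo, hi, dtype=np.float64).sum()

# Combine the partial results on rank 0.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print(f"sum over {size} ranks: {total:.0f}  (expected {N * (N - 1) / 2:.0f})")
```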
I live in the "Peace Country", which is partly in NE BC Canada, and partly in NW Alberta, Canada. It is an area about the size of Germany, with about 150,000 people. That is not enough people for anyone to consider climate modeling for. The Peace Country is known worldwide for honey, because with our long summer days (19 hours? 20?) we have lots of flowers and hence bees. I've written to a couple of people who have a lot of experience with climate/weather stuff, and they both thought I have enough CPU, GPU, RAM and storage to do this. My background is materials science, but heavy on the computing side. So I am hoping to figure out one or more variations on downscaling, so that I can try and couple global circulation model results, to what is happening in the "Peace".