archae86 wrote: ...but I suspect my card under current conditions may have a slightly elevated invalid rate at 3X.
With quite a bit of evidence from running four 5700 cards in three systems at 3X for days, I'm reversing this position. Not only does running at 3X appear to impose no meaningful validity penalty over 2X, it also gives a meaningful improvement in output and power efficiency. On my Windows 10 systems with three different CPUs it also does not seem to impair interactive responsiveness.
The benefit I saw from running 3X vs. 2X varied appreciably among my three systems. Something about the host characteristics (CPU, motherboard, ...) meant somewhat uneven Einstein productivity at 2X on the three machines, with corresponding differences in GPU load in the 90 to 95% range. Running at 3X compressed the differences, with all four 5700 cards on all three systems reporting GPU load very near 98.0%, and a pretty tight range of elapsed times when I kept the power limit at a level that gave reasonably comparable GPU clock rates.
The actual productivity improvement I saw going from 2X to 3X ranged from about 5% to about 12%, depending on the system. Until I tried the Radeon VII and the RX 570, I had been an Nvidia guy here at Einstein, and in recent applications the improvement available from going 2X to 3X on my Nvidia cards was nowhere near that high. Also, the polling method the Einstein Nvidia applications use to service the GPU meant that adding a task had an appreciable impact on host interactive performance. So I did not run at 3X anywhere, and actually limited myself to 1X on my wife's system, whose CPU has only two physical cores (hyperthreaded).
Imagine my surprise when I found that the system currently hosting two 5700 cards, with a CPU that, while pretty fast, has only four cores and no hyperthreading, was very happy to run at 3X. Of course, BOINC would have refused to start the required six tasks had it believed I only had four cores, but I had previously adjusted matters to lie to BOINC, claiming 16 CPUs, as a means of obtaining an adequate daily task dispatch limit back when this same system hosted a Radeon VII.
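For anyone wanting to do the same: the CPU-count override described above is set through BOINC's standard cc_config.xml options. A minimal sketch (the <ncpus> option is standard BOINC; 16 simply mirrors the value mentioned above, and you should pick a count that suits your own host):

```xml
<cc_config>
  <options>
    <!-- Report this many CPUs to the scheduler instead of the real core count.
         Used here to raise the daily task dispatch limit, as described above. -->
    <ncpus>16</ncpus>
  </options>
</cc_config>
```

Place the file in the BOINC data directory, then re-read config files from BOINC Manager or restart the client.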
Speaking of that Radeon VII, it is worth mentioning that the system in question is currently appreciably more productive running two RX 5700 cards at 3X than it was running the Radeon VII at 2X (about 1.8M credits/day vs. 1.6M, actual rather than nominal). It currently burns somewhat more power, but there is considerable room to trade a substantial reduction in power consumption for a moderate reduction in credit rate.
This post is already long, so I'll save other 5700 observations for later, but suffice it to say that at the moment I am quite happy to have done a 100% conversion of my little three host flotilla from RX 570 cards (most recently) to RX 5700 cards. I must repeat my previous thanks to Gavin and to Mumak for their initial reports that with new driver releases the RX 5700 behavior on Einstein GRP had done a sudden change from almost 100% failure rate to an ordinary invalid rate (currently about 1.6% on my flotilla).
archae86 wrote: I must repeat my previous thanks to Gavin and to Mumak for their initial reports that with new driver releases the RX 5700 behavior on Einstein GRP had done a sudden change from almost 100% failure rate to an ordinary invalid rate (currently about 1.6% on my flotilla).
Thank you for including me in your recent thanks, but really all the credit should go to Mumak for his initial testing and reporting of his findings. I only jumped on the new driver bandwagon because of him!
I have now switched my 5700XT machine to run tasks x3 on the back of your reports :-) Let's see if I can gain the same sort of improvement you have seen.
Gav.
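For anyone else trying 3X: on Einstein this is usually set either via the project's "GPU utilization factor" web preference or with an app_config.xml in the Einstein project directory. A minimal sketch, assuming the Gamma-ray pulsar GPU application name hsgamma_FGRPB1G (verify the name in your own client_state.xml, as it may differ):

```xml
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <!-- 0.33 GPU per task => three tasks share one GPU (3X) -->
      <gpu_usage>0.33</gpu_usage>
      <!-- Reserve one CPU core per GPU task for support/polling -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```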
It looks like all is fine now: https://einsteinathome.org/workunit/451441061 etc. (my PC is 12809247, driver is amdgpu-pro-20.10-1048554-ubuntu-18.04).
Vit wrote: It looks like all is fine now...
So we can run Navi 10s under Linux without the OCL problem now?
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
I have been running my AMD Radeon 5700 under Ubuntu with the latest recommended Linux driver long enough, with no apparent problems, that I claim the driver issue has been fixed.
So I ordered another 5700 (open box). On advice from a top-50 host running 5700s, I am running 3 GPU tasks:
1X ~ 6.5 minutes
2X ~ 10.5 minutes
3X ~ 15.5 minutes
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Tom M wrote: I have been running my AMD Radeon 5700 under Ubuntu with the latest recommended Linux driver...
what task type for these run times? Gamma Ray? or Gravitational Wave?
Ian&Steve C. wrote: what task type for these run times? Gamma Ray? or Gravitational Wave?
Oops, sorry.
These are my preliminary numbers for Gamma-ray Pulsar Search #1 (GPU).
I should have some GW GPU results later. I just switched it on with two tasks per GPU to avoid the 3GB memory problem.
Tom M
Tom M wrote: I should have some GW gpu later. I just switched it on with a two tasks per gpu to avoid the 3GB memory problem.
If it works well at 2X, you might find it interesting to check 3X. I suspect it won't run out of memory.
I've not noticed anyone talking about the 3GB problem post actual readings of memory use at different multiplicities. Of course the situation is complicated by variation in the requirements of individual tasks.
A few years ago when I had a new Nvidia 2080 I did some initial checkout work twisting various knobs. The very first thing I tried was 1X, 2X, 3X, 4X and by good luck my log sheet has a column titled AvgMemUse. I presume this was a GPU-Z reading. The result is distinctly lower than a linear extrapolation from the 1X requirement. This was a while ago, and though the tasks were Einstein, they were certainly not the GW type, and probably not GRP on the current application. So this is just illustrative. But the wonderful thing was that these tasks all needed the same amount of memory.
1X--1640
2X--2165
3X--2861
4X--3635
One issue is how much of the memory use has nothing to do with BOINC; that presumably does not increase with multiplicity. A second issue is whether any of the Einstein tasks' memory requirement gets shared among multiple instances.
While I have not seen posts giving reported memory use, I have seen a post asserting that 2X was working on a machine for which the speculation was that it could only work if a needy task got lucky enough to be paired with a less needy one--every time. I suggest a different possible reason: sublinear memory requirement as multiplicity increases. (OK, for the picky: maybe linear, but not passing through the 0,0 origin.)
If I may? A quick back-of-the-envelope fit for the data points:
1X--1640
2X--2165
3X--2861
4X--3635
.... is well matched by a linear fit of 668 * X + 905 (I actually found an online linear regression calculator), i.e. 905 when there is no WU on the card, plus 668 for each additional task. So, as you say, definitely not going through the origin, with ~900 as the non-BOINC load.
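The coefficients above are easy to reproduce; a minimal ordinary least-squares sketch over the four posted data points (plain Python, no external libraries):

```python
# Ordinary least-squares fit of AvgMemUse against task multiplicity X,
# using the four data points posted above.
xs = [1, 2, 3, 4]              # multiplicity (1X .. 4X)
ys = [1640, 2165, 2861, 3635]  # AvgMemUse readings

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

print(f"AvgMemUse ~= {slope:.0f} * X + {intercept:.0f}")  # AvgMemUse ~= 668 * X + 905
```

The ~905 intercept is the fixed non-BOINC load, with ~668 added per concurrent task.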
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Some preliminary Linux results with Gravitational Wave on an R5700 under Ubuntu 18.04.x, using the AIO BOINC Manager from TBar/petri.
1 task ~ 16.2 Minutes
2 tasks ~ 16.9-17.1 minutes
3 tasks ~ 18.9 minutes
Mixed Gamma Ray/Gravity Wave (3 tasks)
1 GRsearch ~ 15 minutes?
2 GW ~21 minutes
Notes:
1) GW tasks do not peg the GPU utilization the way the GR pulsar tasks do.
2) Nvidia GW GPU tasks occasionally blow up to 3GB of memory; apparently the Linux versions do not (so far).
Tom M