https://einsteinathome.org/host/13168193
AMD sweet spot?
This AMD RX 6900 XT system running under Linux is pumping out 1.5M RAC, and the card costs about $450 (current used price on eBay).
So that is 6M RAC for 4 GPUs vs. 4M RAC for the Titan V. The V costs maybe $50-100 more per GPU.
Yes, I am ignoring the slot width issue for this speculation.
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Radeon VII is almost certainly the AMD sweet spot for Einstein.
It closes in on 3M RAC (depending on CPU) with O3AS, with eBay prices ranging from $200-250.
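For a rough side-by-side of the numbers quoted so far, here is a minimal Python sketch that turns RAC and used price into RAC per dollar. The function name is made up for illustration, the inputs are only the approximate figures posted in this thread, and it ignores the host CPU, power draw, and slot width.

```python
def rac_per_dollar(rac, price_usd):
    """Return RAC earned per dollar of used-card price."""
    return rac / price_usd


# Approximate figures quoted in this thread (used eBay prices).
rx6900xt = rac_per_dollar(1_500_000, 450)
radeon_vii_low = rac_per_dollar(3_000_000, 250)   # pricier end of the range
radeon_vii_high = rac_per_dollar(3_000_000, 200)  # cheaper end of the range

print(f"RX 6900 XT: {rx6900xt:,.0f} RAC per dollar")
print(f"Radeon VII: {radeon_vii_low:,.0f} to {radeon_vii_high:,.0f} RAC per dollar")
```

On these figures the Radeon VII comes out several times ahead per dollar, but only if it really does reach about 3M RAC on the host in question.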
_________________________________________________________________________
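Ian&Steve C. wrote:
Radeon VII is almost certainly the AMD sweet spot, for Einstein. closes in on 3M RAC (depending on CPU) with O3AS, ebay price ranging from $200-250.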
I missed that. None of the Radeon VIIs I looked at were running near 3M for a single GPU, or near 6M for 2 GPUs.
Hmmmm....
So why is it running O3AS so well?
Fairly sure it's because of its HBM memory.
Keith Myers wrote:
Fairly sure it's because of its HBM memory.
And AMD seems to have always been fairly efficient when running multiple tasks.
Tom, there was a whole thread about Radeon VII performance: Testing Radeon VII on All-Sky Gravitational Wave O3 (O3AS)
A fast CPU helps a lot.
_________________________________________________________________________
It sure looks like a fast CPU is a major factor. I overlooked either of them in the top 50. If they are not there, does that mean they didn't stay with it?
@Tom M: may I ask how many concurrent BRP7 tasks you run on the iGPU of your Ryzen 5700G? I assume 2 considering the difference in runtime? And how many CPU tasks with it?
I seem to hit some kind of memory bandwidth wall at 12x Milkyway on the CPU and 1x BRP7 on the iGPU. If I add any more CPU tasks, the GPU utilisation begins to drop.
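Tom M wrote:
Sure looks like a fast CPU is a major factor. I over looked either in the top 50. If they are not there then they didn't stay with it?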
A fast CPU definitely makes a big difference on the GW tasks.
My 2x Radeon VIIs are frequently in the top 10 hosts on a ppd basis. The RAC is just not very high, since those GPUs are not always in the same host (as seen by Einstein and BOINC). I'm also not running those cards on Einstein 24/7/365, and when they are running, they will sometimes be on a different BOINC instance, in a VM for testing, etc. FWIW that host is #9 all time, so I am sticking with it. ;)
Once I migrate everything over to my new 7960X workstation, I'll try to run some isolated tests. With a faster CPU 3M ppd is within reach for a Radeon VII. With an OC on the Radeon VII the raw performance will be pretty good, but the ppd/watt might not be all that impressive.
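Link wrote:
@Tom M: may I ask how many concurrent BRP7 tasks you run on the iGPU of your Ryzen 5700G? I assume 2 considering the difference in runtime? And how many CPU tasks with it?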
@Link,
I run 1 BRP7/MeerKAT task at a time on my 5700G. I am usually set to use only 14 out of 16 threads overall. I have not noticed BRP7 slowing down when the CPU is loaded. But I don't usually run Asteroids at home.
I am running 4 memory modules (8 GB x 4 slots).
I have everything on NNT at the moment. I was planning on shutting it down.
But you raise some interesting questions. I will set it up for those two projects.
I don't pay close attention to my daily driver results so I could easily be missing the variation you spotted.
===edit===
@Link,
I have both of those projects started, with 1 BRP7 task on the iGPU and 6 A@h tasks. I have an 8-thread PrimeGrid task running for about another 8 hours. After that I will run 1 GPU task and 12 CPU tasks until things are stable. Once I have a baseline I can try 13/14. I need to get the BRP7 CPU usage set down to 0.1 so it is using the same settings as my Linux systems.
I do have the Oracle version of the BOINC client loaded and enabled at the BIOS level. I was trying to get some LHC@Home tasks. No luck.
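One common way to set a GPU app's CPU reservation to 0.1 in BOINC is an app_config.xml file in the project's directory. The Python sketch below only generates such a file as text. The helper name is made up for illustration, and the app name einsteinbinary_BRP7 is an assumption; check the name your client actually reports before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Build the text of a BOINC app_config.xml for one GPU application."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage 1.0 means one task per GPU; 0.5 would mean two per GPU.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    # cpu_usage is the fraction of a CPU thread budgeted per GPU task.
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Assumed app name, not confirmed in this thread.
print(build_app_config("einsteinbinary_BRP7", 1.0, 0.1))
```

The output would be saved as app_config.xml in the Einstein@Home project folder, and the client told to re-read its config files.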
===more edit====
It looks like your reported BOINC floating point benchmark is at least 10% faster than my 5700G's. I have made no attempt to overclock my CPU or iGPU. I assume you have, or we are looking at the "luck of the draw".
???????
It looks like I may have been confused. After applying the TPUI air cooled automated tuning and getting a reported 13% gain, my benchmark is what I thought I remembered yours was.
In any case, my BIOS doesn't seem to offer any OC for the iGPU.
And I am running Win11 to your Win10. May check and see if AMD has any driver upgrades. Done.
Tom M wrote:
I run 1 brp7/meerKat at a time on my 5700g. I usually am set to only use 14 out of 16 threads overall. I have not noticed the brp7 slowing down when the CPU is loaded. But I don't usually run Asteroids at home.
Hmm... that's pretty slow then. With 12 threads loaded with Milkyway (not Asteroids; that might be different with all those power limits in modern CPUs) I get one BRP7 done in about 3950 seconds. I have now added one more Milkyway task and that seems to add something like 50 seconds, which is still OK I think. I will test 14 later, still experimenting.
Tom M wrote:
I am running 4 memory modules (8m x 4 slots).
I have two sticks of DDR4 @3400MT/s. Actually they are 3600, but the system wasn't stable at anything above 3400. At 3400 it seems perfectly stable so far.
Tom M wrote:
It looks like the reported floating point test for Boinc is as least 10% faster than my 5700G. I have made no attempt to overclock my Cpu/iGpu's.
Not overclocked either, except the RAM, just undervolted. Yes, I was pretty lucky with that chip: -30 on all cores and -20 on the iGPU. -30 on the iGPU worked for Moo! but not for BRP7; that's where the one errored task actually comes from. It doesn't make much difference in power consumption, so I guess I'll leave it where it is now as long as all tasks validate.
With Milkyway using pretty little energy compared to anything else, the CPU is boosting constantly @4.6GHz even if I run 16 of their single core N-Body WUs concurrently. But then there's nothing left for the GPU, and it would run @400MHz most of the time. So now I'm searching for the optimal number of threads on the CPU while watching what the hardware is doing with HWiNFO64. I guess with Asteroids or PrimeGrid, with their pretty highly optimized applications, the situation might be completely different.
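To put the runtimes above in perspective, here is a small Python sketch that converts a BRP7 runtime into tasks per day and shows the relative slowdown from adding a CPU thread. The helper names are made up for illustration, and the 4000 second figure is just 3950 plus the roughly 50 seconds mentioned above, not a separate measurement.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(runtime_s, concurrent=1):
    """GPU tasks finished per day at a given per-task runtime."""
    return concurrent * SECONDS_PER_DAY / runtime_s


def slowdown_pct(runtime_s, baseline_s):
    """Percentage increase in runtime relative to the baseline."""
    return (runtime_s - baseline_s) / baseline_s * 100


# (CPU threads running Milkyway, BRP7 runtime in seconds on the iGPU)
measurements = [(12, 3950), (13, 4000)]
baseline = measurements[0][1]

for threads, runtime in measurements:
    print(
        f"{threads} CPU threads: {tasks_per_day(runtime):.2f} BRP7 tasks/day, "
        f"{slowdown_pct(runtime, baseline):.1f}% slower than baseline"
    )
```

Adding further rows for 14, 15, and 16 threads as they are measured would show where the extra CPU work stops being worth the lost GPU throughput.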