All things Nvidia GPU

Tom M

Joined: 2 Feb 06

Posts: 6453

Credit: 9579843175

RAC: 7430300

YummyCheese,Tom M

7 Nov 2024 16:15:09 UTC

Message 229623 in response to message 229621

YummyCheese,

Tom M wrote:

You should only compare your results to other Windows systems. Linux has Nvidia MPS (Multi-Process Service), which is not available under Windows.

The current top-performing Windows box is running BRP7/MeerKAT tasks. They are taking about 390s or less, so I am presuming he is running 1x. The catch is he has 8 GPUs running, possibly 8 RTX 3080s.

A rig somewhat closer to yours has 2 RTX 3080 Tis. He is running a mix of BRP7/MeerKAT and the All-Sky GW tasks.

In either case, they are not getting even close to 3M / GPU.
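As a rough sanity check on that, the runtime can be turned into a credits-per-day estimate. This is only a sketch: the credit per task below is an assumed figure (check the task list of the host you are comparing against), and `daily_credit_per_gpu` is a made-up helper name, not anything from BOINC.

```python
SECONDS_PER_DAY = 86_400


def daily_credit_per_gpu(task_seconds: float,
                         credit_per_task: float,
                         concurrent_tasks: int = 1) -> float:
    """Steady-state credit per day for one GPU.

    task_seconds is the wall-clock time of one task while
    `concurrent_tasks` tasks share the GPU (1 means "running 1x").
    """
    if task_seconds <= 0 or concurrent_tasks < 1:
        raise ValueError("task_seconds must be > 0 and concurrent_tasks >= 1")
    tasks_per_day = concurrent_tasks * SECONDS_PER_DAY / task_seconds
    return tasks_per_day * credit_per_task


# ASSUMPTION: credit granted per task; replace with the value you actually see.
ASSUMED_CREDIT_PER_TASK = 3333.0

estimate = daily_credit_per_gpu(390, ASSUMED_CREDIT_PER_TASK)
print(f"{estimate:,.0f} credits/day per GPU at 390 s per task, 1x")
```

With that assumed credit value, 390s tasks at 1x land well under 1M per GPU per day, which fits the "not even close to 3M" observation.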

HTH,

Tom M

###edit###

My own experience is you don't want to try the above two apps at the same time on the same GPU.

A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!

KLiK

Joined: 1 Apr 14

Posts: 67

Credit: 432378463

RAC: 1287576

Does E@h have some "Top GPU"

18 Nov 2024 22:26:17 UTC

Message 229935

Does E@h have some "Top GPU" list, like SETi@home or MilkyWay@home had? So we can see the Top (10) GPUs on E@h?

Thanks

non-profit org. Play4Life in Zagreb, Croatia, EU

Tom M

Joined: 2 Feb 06

Posts: 6453

Credit: 9579843175

RAC: 7430300

KLiK wrote: Does E@h have

18 Nov 2024 22:32:27 UTC

Message 229936 in response to message 229935

KLiK wrote:

Does E@h have some "Top GPU" list, like SETi@home or MilkyWay@home had? So we can see the Top (10) GPUs on E@h?

Thanks

I am not aware of one. Maybe someone else has found one.

A little data analysis of the Top 50 list will give you at least a list of "suspects".

Tom M

Joined: 2 Feb 06

Posts: 6453

Credit: 9579843175

RAC: 7430300

Tom M wrote: KLiK

19 Nov 2024 18:35:58 UTC

Message 229960 in response to message 229936

Tom M wrote:

KLiK wrote:

Does E@h have some "Top GPU" list, like SETi@home or MilkyWay@home had? So we can see the Top (10) GPUs on E@h?

Thanks

I am not aware of one. Maybe someone else has found one.

A little data analysis of the Top 50 list will give you at least a list of "suspects".

:)

For instance, if you review the single-GPU systems listed:

RTX 4090 is number one.

RTX 3080 Ti is two.

Radeon RX 6900 XT is three.

Titan V is five.

Etc.

If you go through the list and divide the RAC by the number of GPUs, you get a guesstimate of the individual GPU RAC. This allows a rough apples-to-apples comparison.

The Nvidia Tesla V100-SXM2-16GB (16144MB) is averaging a little over 3M per GPU.

No multiple-GPU system appears to equal the RTX 4090.
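For anyone who wants to do the division in bulk, here is a minimal sketch. The `Host` class, the helper names, and every number in the sample list are hypothetical placeholders, not real entries from the Top 50; you would type in the GPU count and RAC you read off the host pages.

```python
from dataclasses import dataclass


@dataclass
class Host:
    label: str      # free-text description, e.g. the GPU model
    gpu_count: int  # number of GPUs reported for the host
    rac: float      # host RAC as shown on the top-hosts list


def per_gpu_rac(host: Host) -> float:
    """Guesstimate of RAC per GPU: host RAC divided by GPU count."""
    if host.gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return host.rac / host.gpu_count


def rank_by_gpu(hosts: list[Host]) -> list[Host]:
    """Hosts sorted by per-GPU RAC, best first."""
    return sorted(hosts, key=per_gpu_rac, reverse=True)


# HYPOTHETICAL sample data, for illustration only.
hosts = [
    Host("8-GPU box (made up)", 8, 16_000_000.0),
    Host("single-GPU box (made up)", 1, 4_000_000.0),
    Host("2-GPU box (made up)", 2, 5_000_000.0),
]

for h in rank_by_gpu(hosts):
    print(f"{h.label}: {per_gpu_rac(h):,.0f} per GPU")
```

Keep in mind this ignores CPU differences and any CPU tasks contributing to the host's RAC, so it stays a guesstimate.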

Boca Raton Comm...

Joined: 4 Nov 15

Posts: 240

Credit: 10589455586

RAC: 21902845

Tom M wrote: Tom M

19 Nov 2024 20:00:40 UTC

Message 229962 in response to message 229960

Tom M wrote:

Tom M wrote:

KLiK wrote:

Does E@h have some "Top GPU" list, like SETi@home or MilkyWay@home had? So we can see the Top (10) GPUs on E@h?

Thanks

I am not aware of one. Maybe someone else has found one.

A little data analysis of the Top 50 list will give you at least a list of "suspects".

:)

For instance, if you review the single-GPU systems listed:

RTX 4090 is number one.

RTX 3080 Ti is two.

Radeon RX 6900 XT is three.

Titan V is five.

Etc.

If you go through the list and divide the RAC by the number of GPUs, you get a guesstimate of the individual GPU RAC. This allows a rough apples-to-apples comparison.

The Nvidia Tesla V100-SXM2-16GB (16144MB) is averaging a little over 3M per GPU.

No multiple-GPU system appears to equal the RTX 4090.

The Titan V on the list is still "stabilizing". It has been up and running for maybe a week or so with the new system with current settings that we think will be best. Its RAC is still going up, but slowly. We are interested to see where it ends up.

KLiK- The GPU will only tell part of the story with some of the work right now. The O3 work right now is also very dependent on the CPU even though it also relies on the GPU. Not all projects are so dependent, but this one definitely is.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3956

Credit: 46952672642

RAC: 64619576

Boca Raton Community HS

19 Nov 2024 21:09:53 UTC

Message 229964 in response to message 229962

Boca Raton Community HS wrote:

The Titan V on the list is still "stabilizing". It has been up and running for maybe a week or so with the new system with current settings that we think will be best. Its RAC is still going up, but slowly. We are interested to see where it ends up.

KLiK- The GPU will only tell part of the story with some of the work right now. The O3 work right now is also very dependent on the CPU even though it also relies on the GPU. Not all projects are so dependent, but this one definitely is.

based on the daily production, seems like it'll end up somewhere around 3.1M points per day. the fast CPU is helping it power through!
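The slow climb is what you would expect from how RAC is averaged: BOINC uses an exponentially weighted average with a one-week half-life. A minimal sketch of the idealized curve follows, assuming perfectly constant daily production (real hosts are lumpier, and BOINC applies the update each time credit is granted rather than continuously). `rac_after` is a made-up helper name.

```python
RAC_HALF_LIFE_DAYS = 7.0  # BOINC's recent-average-credit half-life


def rac_after(days: float, daily_credit: float, starting_rac: float = 0.0) -> float:
    """Idealized RAC after `days` of constant production.

    The old RAC decays by half every RAC_HALF_LIFE_DAYS while the
    new production fills in the remainder.
    """
    decay = 0.5 ** (days / RAC_HALF_LIFE_DAYS)
    return starting_rac * decay + daily_credit * (1.0 - decay)


# Using the ~3.1M/day figure estimated above for a host starting from zero.
for day in (7, 14, 21, 28):
    print(f"day {day:2d}: RAC about {rac_after(day, 3_100_000):,.0f}")
```

So a host that has been up for about a week would only show roughly half of its eventual RAC, and it takes around a month to get within a few percent.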

running MPS and 5x tasks at a time I assume?

Boca Raton Comm...

Joined: 4 Nov 15

Posts: 240

Credit: 10589455586

RAC: 21902845

Ian&Steve C. wrote: Boca

19 Nov 2024 21:20:13 UTC

Message 229965 in response to message 229964

Ian&Steve C. wrote:

Boca Raton Community HS wrote:

The Titan V on the list is still "stabilizing". It has been up and running for maybe a week or so with the new system with current settings that we think will be best. Its RAC is still going up, but slowly. We are interested to see where it ends up.

KLiK- The GPU will only tell part of the story with some of the work right now. The O3 work right now is also very dependent on the CPU even though it also relies on the GPU. Not all projects are so dependent, but this one definitely is.

based on the daily production, seems like it'll end up somewhere around 3.1M points per day. the fast CPU is helping it power through!

running MPS and 5x tasks at a time I assume?

Correct!

We retired an older system and got the 14900KS up and running (with KLEVV CRAS V RAM running stable at 8400MHz out of the box). Although the CPU has a bad reputation, it is incredible when undervolted and NOT running all-core (all-core is almost impossible to cool, since the temperature spikes in ~1 second). And when it is not overclocked (and because it is undervolted), it is actually a relatively efficient system. We are running MPS at 40%; we have been playing with that number a little but are not noticing a difference. The final recalc step takes roughly 1 minute on the system across all 5 cores running tasks. Students put it all together and are now fine-tuning.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3956

Credit: 46952672642

RAC: 64619576

I was also just looking at

19 Nov 2024 21:25:32 UTC

Message 229966

I was also just looking at the A6000 Threadripper 5995 host.

are you really running 11-12x tasks per GPU? or is this system not running all the time? what MPS setting?

Boca Raton Comm...

Joined: 4 Nov 15

Posts: 240

Credit: 10589455586

RAC: 21902845

Ian&Steve C. wrote:I was

21 Nov 2024 14:10:11 UTC

Message 229968 in response to message 229966

Ian&Steve C. wrote:

I was also just looking at the A6000 Threadripper 5995 host.

are you really running 11-12x tasks per GPU? or is this system not running all the time? what MPS setting?

Right now, yes it is. It has not been fully optimized, but here is our line of reasoning; anyone is welcome to chime in if there are flaws in it. That system runs at 100% across all cores with CPU research (usually WCG, when they are able to keep the system full). That slows the clock speed of the CPU, and we accept that. From our observations of other systems running this work, there are always periods where the GPU cores and memory bus are not fully utilized because of the recalc steps. This is understandable and accepted on basically all of our systems, even if the VRAM is ~80% full. We don't have that issue with the A6000 GPUs because the VRAM is huge. If we run 12 tasks concurrently, the GPU stays fully saturated (cores and memory bus), with only momentary dips. We understand that we CAN overload the GPU and that tasks would slow down. We still have to refine the amount, but we saw idle time at 10x, so we set it to 12x if I remember correctly. This still needs to be refined, along with MPS. It's running MPS at 40%, but I know that no single task will use more than 40% at any given time. We still have to refine the percentage.
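To put numbers on the task count versus MPS percentage trade-off, here is a small sketch. `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` is the environment variable Nvidia documents for capping the share of the GPU available to an MPS client; the helper names are made up, and whether the science apps actually pick the variable up depends on how the BOINC client is launched on a given system, which is an assumption here.

```python
import os
from typing import Mapping, Optional


def mps_oversubscription(tasks_per_gpu: int, active_thread_pct: float) -> float:
    """Sum of the per-task caps as a multiple of one whole GPU.

    1.0 means the caps add up to exactly 100%; above 1.0 the tasks can
    collectively ask for more than the GPU has, which is the intent when
    each task only bursts to its cap some of the time.
    """
    return tasks_per_gpu * active_thread_pct / 100.0


def client_env(active_thread_pct: int,
               base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Copy of the environment with the MPS per-client cap set."""
    if not 0 < active_thread_pct <= 100:
        raise ValueError("percentage must be in (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(active_thread_pct)
    return env


for tasks in (5, 10, 12):
    print(f"{tasks}x at 40%: {mps_oversubscription(tasks, 40):.1f}x of one GPU")
```

At 40%, anything above 2 or 3 concurrent tasks is already oversubscribed on paper, so the percentage mostly limits how much one task can grab during its busy phase rather than reserving a slice for it.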

If I am way off target with reasoning, let me know.

Edit 11/21- We have already bumped it down to 10x and might still go lower. I believe that when we originally tried 10x and saw idle time, all of the tasks had started at roughly the same time and could not "naturally" separate from each other. Now we are adjusting the concurrent task count on the fly while BOINC is open (read config file), and the tasks are already spaced out from each other. I think we might still go down from 10x, but it is hard to know where it will end up. We will be almost completely offline next week, so it is something we will figure out eventually. Not sure how to approach the MPS percentage except by trial and error.
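For anyone wanting to try the same on-the-fly adjustment: the task count per GPU is set through `gpu_usage` in the project's `app_config.xml`, where N tasks per GPU means a value of 1/N. Below is a sketch that generates the file contents. The app name `einstein_O3AS` is an assumption (use the name shown for the app on your own host), `build_app_config` is a made-up helper, and the fraction is rounded down so that N tasks still fit within 1.0.

```python
import math
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int,
                     cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text that runs `tasks_per_gpu` tasks per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Round DOWN to 4 decimals; rounding up (e.g. 1/6 -> 0.1667) would make
    # six tasks add up to slightly more than one GPU.
    gpu_usage = math.floor(10_000 / tasks_per_gpu) / 10_000

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = f"{gpu_usage:.4f}"
    ET.SubElement(gpu_versions, "cpu_usage").text = f"{cpu_per_task:.2f}"
    ET.indent(root)  # pretty-print; needs Python 3.9+
    return ET.tostring(root, encoding="unicode")


# ASSUMPTION: app name; check the application name on your own host.
print(build_app_config("einstein_O3AS", tasks_per_gpu=10))
```

Save the output as `app_config.xml` in the Einstein@Home project folder and use "Read config files" in BOINC Manager; the new count applies as running tasks finish and new ones start.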

KLiK

Joined: 1 Apr 14

Posts: 67

Credit: 432378463

RAC: 1287576

Tom M wrote:Tom M

22 Nov 2024 15:12:26 UTC

Message 230034 in response to message 229960

Tom M wrote:

Tom M wrote:

KLiK wrote:

Does E@h have some "Top GPU" list, like SETi@home or MilkyWay@home had? So we can see the Top (10) GPUs on E@h?

Thanks

I am not aware of one. Maybe someone else has found one.

A little data analysis of the Top 50 list will give you at least a list of "suspects".

:)

For instance, if you review the single-GPU systems listed:

RTX 4090 is number one.

RTX 3080 Ti is two.

Radeon RX 6900 XT is three.

Titan V is five.

Etc.

If you go through the list and divide the RAC by the number of GPUs, you get a guesstimate of the individual GPU RAC. This allows a rough apples-to-apples comparison.

The Nvidia Tesla V100-SXM2-16GB (16144MB) is averaging a little over 3M per GPU.

No multiple-GPU system appears to equal the RTX 4090.

Thanks, will do that... but it would be nice for E@h to have such a feature. It is widely used on other projects, so it must already be built into the system (here too).
