Mikey, your spreadsheet needs a bit of updating to current project states: who runs what, who is still active, etc.
Dido wrote: I was wondering, ...
Yes, I do. The EPYC systems make great multi-GPU systems.
_________________________________________________________________________
Does anybody know why I can't merge some of my VMs from the "Computers" tab on the Web? I manually assigned different external IP addresses to some of them upon creation. Could that be the reason?
EDIT: I'm also seeing some weird GPU load dips from 98% to 55% on multiple GPUs in the load profiling software that the data center runs on the entire mainframe. I can confirm the CPU cores are at low utilization, so that's not the problem. Any ideas what could be causing that? Is there a debugging feature available for BOINC apps?
The app in question is "Gamma-ray pulsar binary search #1 on GPUs v1.24 () x86_64-pc-linux-gnu"
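In case it helps anyone reproduce the dips independently of the data center's profiler, here is a minimal sketch of how the per-GPU load could be sampled from inside a VM. It assumes an NVIDIA driver plus the nvidia-ml-py package; the 90% threshold, 1-second interval, and 10-minute duration are arbitrary choices of mine, not anything the profiler uses:

    # Minimal GPU-load sampler: logs whenever any GPU's utilization
    # drops below a threshold. Requires: pip install nvidia-ml-py
    import time
    import pynvml

    THRESHOLD = 90   # percent; anything below this is logged as a "dip" (my choice)
    INTERVAL = 1.0   # seconds between samples
    DURATION = 600   # total sampling time in seconds

    pynvml.nvmlInit()
    try:
        handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
                   for i in range(pynvml.nvmlDeviceGetCount())]
        end = time.time() + DURATION
        while time.time() < end:
            for i, h in enumerate(handles):
                util = pynvml.nvmlDeviceGetUtilizationRates(h)  # .gpu is a percentage
                if util.gpu < THRESHOLD:
                    print(f"{time.strftime('%H:%M:%S')} GPU{i}: {util.gpu}%")
            time.sleep(INTERVAL)
    finally:
        pynvml.nvmlShutdown()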
Keith Myers wrote: Mikey, ...
It's not mine, it's the official Grid Coin White List page, but yes, I agree: WCG is no longer run by IBM, for one, and Minecraft is no longer active, for another.
I'm sure someone can explain this better than me; if I'm wrong, please correct me. You mentioned in another post that the GPUs are FP32-capable. As far as I know, when a card can't do FP64, the FP64 portion of FGRPB1G tasks (the Gamma-ray app you mentioned) is done on the CPU, so the GPU load drops. But since you also mentioned the CPU cores are at low utilization, I don't really know what it could be.
Until last year I had a notebook with Nvidia MX-150 dedicated graphics crunching FGRPB1G tasks; the last 10% of each task took longer, and the GPU load dropped while the CPU load increased. But the MX-150 is FP64-capable, so I'm not sure what's happening.
EDIT: The Tesla T4 is also FP64-capable, like the MX-150.
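One thing worth keeping in mind: "FP64-capable" doesn't mean fast at FP64. On this class of NVIDIA GPU the double-precision rate is only a small fraction of the single-precision rate (spec sheets commonly quote roughly 1/32 for the T4 and MX-150; I'm taking both the ratio and the peak figure below from public spec sheets as assumptions, not measurements), so even if the FP64 stage stays on the GPU it can look like a stall in a utilization graph. Back-of-the-envelope:

    # Rough arithmetic only; the peak figure and the ratio are assumed
    # from public spec sheets, not measured on this hardware.
    t4_fp32_tflops = 8.1     # Tesla T4 peak FP32 throughput (assumed)
    fp64_ratio = 1 / 32      # typical FP64:FP32 rate for this GPU class (assumed)

    t4_fp64_tflops = t4_fp32_tflops * fp64_ratio
    print(f"T4 FP64 peak ~ {t4_fp64_tflops:.2f} TFLOPS")  # ~0.25 TFLOPS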
Dido,
Glad you got it running. On the regular leaderboard you would need all those systems running under a single computer ID to compete for the top individual system.
You should be able to compete for the top user however.
Your nodes will use considerably more power under load than when idling. Who is paying for it?
I note you are showing 46 distinct systems. Not 1,000 :)
May this activity not cause you harm.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
I thought I got it running, but no. If I deploy around 20 VMs concurrently, everything seems to work just fine, but as I increase the number of active VMs, everything becomes unstable. On some VMs the BOINC client randomly crashes; on others I see inexplicable performance degradation and chaotic fluctuations in GPU load, while CPU utilization never exceeds 40% per thread. I've looked extensively at the performance and load profiling data from the mainframe and I can't explain why these issues occur. This is a typical case of demonic possession. Perhaps I should hire an exorcist. WTF.
To answer your question about who is paying for the electricity - nobody. The entire data center has an independent power grid. The source of that power is renewable energy. When you see big companies brag about being "carbon neutral", this is what it means.
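If anyone wants to suggest what to look for, here is roughly the kind of host-side logging I can run to line the crashes and GPU dips up on one timeline against the VM count. It's only a sketch, assuming psutil and nvidia-ml-py are available on the host; the CSV layout is just my choice:

    # Host-side sampler: one CSV row per second with average CPU use,
    # the busiest core, and each GPU's utilization, for later correlation
    # with the number of active VMs. Requires: pip install psutil nvidia-ml-py
    import csv
    import time

    import psutil
    import pynvml

    pynvml.nvmlInit()
    handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
               for i in range(pynvml.nvmlDeviceGetCount())]

    with open("host_load.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "cpu_avg", "cpu_max_core"]
                        + [f"gpu{i}" for i in range(len(handles))])
        try:
            while True:
                # cpu_percent blocks for the interval, so it also paces the loop
                per_core = psutil.cpu_percent(interval=1.0, percpu=True)
                gpus = [pynvml.nvmlDeviceGetUtilizationRates(h).gpu for h in handles]
                writer.writerow([time.strftime("%H:%M:%S"),
                                 round(sum(per_core) / len(per_core), 1),
                                 max(per_core)] + gpus)
        except KeyboardInterrupt:
            pass
    pynvml.nvmlShutdown()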
Dido wrote: I thought I got ...
I have no clue, but your process of figuring it out should help the programmers once you have to hand it back to them.
WOO HOO!!
Quote:
To answer your question about who is paying for the electricity - nobody. The entire data center has an independent power grid. The source of that power is renewable energy. When you see big companies brag about being "carbon neutral", this is what it means.
Did he ever demonstrate the ability to place a system in the top 5 or 10 at e@h?
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Tom M wrote: Did he ever ...
Each host was set up as a VM with a single GPU. Nvidia T4s aren't that powerful and probably wouldn't even make the top 50.
But I'm not sure if he ever got it working. He said before that he had a lot of weird issues when all systems were under load.
_________________________________________________________________________