"In addition, about 350 high-performance, specialized graphics cards (GPUs) have been added in parallel with about 2,000 existing cards for specialized applications. These additions increase Atlas' theoretical peak computing performance to more than 2 PFLOP/s."
That doesn't sound much. I have a £150 GPU that does a theoretical 8 Tflops. They only have 250 times the power of one of my GPUs, yet they say they have 2350 GPUs.
The UPS beats mine though, I only have 1.5kW. But with deep cycle leisure batteries it can last for an eternity. None of the sealed lead acid crap that comes with it. My neighbour once asked me why all my lights were on during a powercut :-)
Pah! "Each cable is rated for 10 Gb/s." I have a 40Gb/s cable between my house and garage. I can't find switches and network cards that go that fast though :-(
They seem to have neater wiring than Summit though:
If this page takes an hour to load, reduce posts per page to 20 in your settings, then the tinpot 486 Einstein uses can handle it.
"In addition, about 350 high-performance, specialized graphics cards (GPUs) have been added in parallel with about 2,000 existing cards for specialized applications. These additions increase Atlas' theoretical peak computing performance to more than 2 PFLOP/s."
That doesn't sound much. I have a £150 GPU that does a theoretical 8 Tflops. They only have 250 times the power of one of my GPUs, yet they say they have 2350 GPUs.
The UPS beats mine though, I only have 1.5kW. But with deep cycle leisure batteries it can last for an eternity. None of the sealed lead acid crap that comes with it. My neighbour once asked me why all my lights were on during a powercut :-)
Pah! "Each cable is rated for 10 Gb/s." I have a 40Gb/s cable between my house and garage. I can't find switches and network cards that go that fast though :-(
They seem to have neater wiring than Summit though:
and this is the problem. this is not a valid way to measure FLOPS at all. this increase to ~13 PFLOPS is largely from the shifting of systems from O3AS gravitational wave tasks (which awarded much less credit) when GW ran out, over to the only available GPU work, FGRPB1G, which awards ~10x more credit per unit time. so it "looks" like FLOPS increased just because people started earning much more credit with the same devices.
nope. according to the server status page, there are ~13,000 hosts with either an Nvidia or AMD GPU (last 7 days). the vast majority of those are probably slower, low end devices. and there will be some percentage that aren't even crunching or crunching other projects. there are ~1.6 million WUs that still need processed, which equates to ~3.2 million tasks that need to be completed, not even accounting for errors and invalids resulting in resends.
In a real sense we can always exceed the computing power of E@H. Or Atlas for that matter, or <*insert you favourite supercomputer here*>. The multidimensional parameter spaces for these searches can be explored in different ways to look for new signals & regularities. Nowadays there is such a mass of information available from the various detection devices, so there always a wealth of data. What is discarded as noise for one search template may constitute a detection for another. That's because all manner of radiation traverses the universe and our local space. Suppose for a given investigation the search sensitivity goes like the square root of the signal integration time, then to double your chances of finding something you need to quadruple the time. This is quite typical : you can always 'listen' for longer to access the 'quieter' sources. There will always be something to do here at E@H. ;-)
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
"In addition, about 350
)
"In addition, about 350 high-performance, specialized graphics cards (GPUs) have been added in parallel with about 2,000 existing cards for specialized applications. These additions increase Atlas' theoretical peak computing performance to more than 2 PFLOP/s."
That doesn't sound like much. I have a £150 GPU that does a theoretical 8 TFLOPS. They only have 250 times the power of one of my GPUs, yet they say they have 2,350 GPUs.
The UPS beats mine, though; I only have 1.5 kW. But with deep-cycle leisure batteries it can last for an eternity. None of the sealed lead-acid crap that comes with it. My neighbour once asked me why all my lights were on during a power cut :-)
Pah! "Each cable is rated for 10 Gb/s." I have a 40 Gb/s cable between my house and garage. I can't find switches and network cards that go that fast though :-(
They seem to have neater wiring than Summit though:
If this page takes an hour to load, reduce posts per page to 20 in your settings, so the tinpot 486 Einstein uses can handle it.
Yup, the technology curve for GPUs especially is such that by the time a card is installed it's already well out of date.
Our server status page puts E@H at 13179.3 TFLOPS ~ 13 PFLOPS (estimated from collective RAC).
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
All our GPU compute power is not enough?
Mike Hewson wrote: Our server status page puts E@H at 13179.3 TFLOPS ~ 13 PFLOPS (estimated from collective RAC).
And this is the problem: it's not a valid way to measure FLOPS at all. The increase to ~13 PFLOPS is largely from systems shifting off the O3AS gravitational-wave tasks (which awarded much less credit) when the GW work ran out, over to the only available GPU work, FGRPB1G, which awards ~10x more credit per unit time. So it "looks" like FLOPS increased just because people started earning much more credit with the same devices.
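For anyone wondering where a RAC-based figure like that comes from: BOINC's credit unit (the cobblestone) is defined so that a host sustaining 1 GFLOPS earns about 200 credits per day, so a project-wide FLOPS estimate is essentially total RAC divided by 200. Here's a rough sketch of that conversion, and of how a change in credit-per-task inflates it with no hardware change at all (all numbers below are made up for illustration, not actual E@H figures):

```python
# Back-of-envelope sketch: a RAC-based FLOPS estimate, and why a change in the
# credit awarded per task skews it. All numbers are illustrative only.

COBBLESTONES_PER_GFLOPS_DAY = 200  # BOINC credit definition: 1 GFLOPS sustained ~ 200 credits/day

def estimated_tflops(total_rac: float) -> float:
    """Convert project-wide Recent Average Credit into an estimated TFLOPS figure."""
    return total_rac / COBBLESTONES_PER_GFLOPS_DAY / 1000.0  # GFLOPS -> TFLOPS

# Same fleet of hosts, same real throughput in tasks per day...
tasks_per_day = 50_000

# ...but the credit granted per task depends on which search the tasks belong to.
rac_on_gw_tasks = tasks_per_day * 1_000     # hypothetical credit per GW task
rac_on_fgrp_tasks = tasks_per_day * 10_000  # hypothetical ~10x higher credit per FGRPB1G task

print(estimated_tflops(rac_on_gw_tasks))    # 250.0  "TFLOPS"
print(estimated_tflops(rac_on_fgrp_tasks))  # 2500.0 "TFLOPS" -- same hardware, 10x the apparent FLOPS
```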
Filipe wrote: All our GPU compute power is not enough?
Nope. According to the server status page, there are ~13,000 hosts with either an Nvidia or AMD GPU (last 7 days). The vast majority of those are probably slower, low-end devices, and some percentage of them aren't crunching at all, or are crunching other projects. There are ~1.6 million WUs that still need processing, which equates to ~3.2 million tasks that need to be completed, not even accounting for errors and invalids resulting in resends.
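As a rough sanity check on those numbers (a sketch only; the quorum factor of 2 comes from the figures above, but the per-host rate is a pure guess and real hosts vary enormously):

```python
# Crude backlog estimate from the figures quoted above. Only the quorum factor
# comes from the post; the per-host daily rate is an invented placeholder.

workunits_remaining = 1_600_000
quorum = 2                                        # valid results needed per workunit, before resends
tasks_remaining = workunits_remaining * quorum    # ~3.2 million tasks

active_gpu_hosts = 13_000        # Nvidia/AMD hosts seen in the last 7 days (server status page)
tasks_per_host_per_day = 20      # hypothetical fleet average, NOT a measured value

days_to_drain = tasks_remaining / (active_gpu_hosts * tasks_per_host_per_day)
print(f"~{days_to_drain:.0f} days at that assumed rate")  # ~12 days with these made-up inputs
```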
In a real sense we can always exceed the computing power of E@H, or Atlas for that matter, or <insert your favourite supercomputer here>. The multidimensional parameter spaces for these searches can be explored in different ways to look for new signals and regularities. Nowadays there is such a mass of information available from the various detection devices that there is always a wealth of data. What is discarded as noise for one search template may constitute a detection for another, because all manner of radiation traverses the universe and our local space. Suppose for a given investigation the search sensitivity goes like the square root of the signal integration time; then to double your chances of finding something you need to quadruple the time. This is quite typical: you can always 'listen' for longer to access the 'quieter' sources. There will always be something to do here at E@H. ;-)
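Spelled out in symbols (just restating the square-root assumption above):

```latex
% Assume search sensitivity S grows as the square root of integration time T.
S(T) \propto \sqrt{T}
\quad\Longrightarrow\quad
\frac{S(kT)}{S(T)} = \sqrt{k}
\quad\Longrightarrow\quad
\sqrt{k} = 2 \;\Leftrightarrow\; k = 4,
% i.e. doubling the sensitivity requires four times the integration time.
```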
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Ian&Steve C. wrote: The days remaining estimate is derived from the 5-day average WU completed per day metric. That 5-day average has been dropping since it looks like the "Atlas Condor Jobs" super computer ...
I haven't looked at the stats pages for quite a while so the mention of "Atlas Condor Jobs" surprised me since (in years past) the only Atlas entry was for "Atlas AEI Hannover".
So I took a look at the top participants list and can see the Condor entry but none for AEI Hannover. The current value of total credit for Condor is way too low for everything that Atlas had accumulated so it can't just be a rename. From memory, Atlas AEI Hannover had a total credit of many tens of billions.
Realising that the table is constructed in RAC order rather than Total Credit, I clicked on the Total heading to get a reordering. It actually starts from the bottom up so I had to click twice :-). That caused Atlas AEI Hannover to appear as #2, so it's still in the list but no longer usually visible because its RAC is less than 7M these days.
That was quite a blast from the past, as several other 'high producers' appeared as well. In particular I remember Gavin, who had some high-producing machines and was active in the Forums some years ago. His current RAC is only a shadow of what it was, but he must still be around since it's still significant.
It's quite a reminder that there are people who have contributed a lot in the past but whose efforts are no longer normally seen in the default view. Perhaps the default view should be based on total credit to direct attention to past significant contributions.
Condor doesn't get a look-in (yet) if the ordering is based on Total :-).
Cheers,
Gary.
Gary Roberts wrote: From memory, Atlas AEI Hannover had a total credit of many tens of billions.
Now that you mention that: I wonder if this total can possibly be all simply due to 'burn-ins' of new nodes? There aren't that many nodes. If correct, this suggests to me that in addition to its role of pre- and post-processing E@H work, maybe it is scheduled to do some of the actual work units when it has nothing else/better to do (using nodes of any age). I'm pretty sure that once built, these supercomputers are kept busy close to 24/7 @ 100%.
Just a thought.
{ Of course, I know who #1 is. But there is a computer at the bottom of the total credit list owned by 'ballen'. I guess that 2005 era computer, probably the very first enrolled, could not now hack the pace. ;-) }
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Atlas runs BOINC in two different ways: there is a single BOINC client, running at low CPU priority, on every node of Atlas, announcing as many CPU cores as the node has. These clients get and run only CPU jobs. The associated account on E@H is "Atlas AEI Hannover", and it has been there basically since 2008.
To make use of the growing number of GPUs on Atlas, we recently (~May 2022) developed another scheme for submitting E@H (GPU) tasks as low-priority Condor jobs (minimal priority, so as not to interfere with 'real' people using the GPUs on Atlas). The associated E@H account is "Atlas Condor Jobs". There is basically one host(id) for every GPU on Atlas (~2100). The main reason for setting this up was to help finish the "O3AS1" GW analysis. As this has now ended, the automatic submission has been turned off, and the RAC of this account should drop noticeably again.
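For the curious, a minimal sketch of how one such low-priority GPU job could be handed to HTCondor (illustrative only — the wrapper script, file names and settings here are invented for the example, not the actual Atlas submission machinery):

```python
# Illustrative only: queue one hypothetical E@H GPU task as a "nice" HTCondor job,
# so it yields to any regular user's job that wants the GPU. All names are invented.
import subprocess
import textwrap

submit_description = textwrap.dedent("""\
    # hypothetical wrapper around the E@H GPU app, plus one input file
    universe     = vanilla
    executable   = run_eah_gpu_task.sh
    arguments    = task_0001.in
    request_gpus = 1
    request_cpus = 1
    # nice_user places the job below every normal user's jobs
    nice_user    = true
    output       = task_0001.out
    error        = task_0001.err
    log          = task_0001.log
    queue
""")

with open("eah_task.sub", "w") as handle:
    handle.write(submit_description)

# condor_submit is the standard HTCondor CLI for queueing a submit description
subprocess.run(["condor_submit", "eah_task.sub"], check=True)
```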
BM
Will any of the new work types be for CPUs as well (or will old GPU types be converted to run on CPU)? Can BRP7 be done on a CPU, and if not, why not? Time to run, I suppose, will be a limiting factor; memory won't be.
The current large Arecibo work units can take up to 13 hours if a number are run together, but that is not a problem (getting less credit than Gamma-ray #5 is).
I would just like to run some new CPU work.
Conan