Benchmarking List for GPUs and Gamma-ray Pulsar binary search

Markus Windisch
Joined: 23 Aug 21
Posts: 61
Credit: 97881372
RAC: 0
Topic 226938

Average seconds per task for the LATeah3012 task series. Times vary between machines; where they do, the values are given as a range.

Note that this is not really a benchmark, more a collection of user experience. But it may help when you plan to buy a card.

A CUDA benchmark is also relevant (although BOINC doesn't use CUDA drivers): link

GTX 660 TI (2GB) - 3332

GTX 750 TI - 1550

AMD RX 5600XT (6GB) - 253

RTX 2070 (8GB) - 253

RTX 3060 (12GB) - 223 (OC)

RTX 2080 - 200 (OC)

RTX 3060TI LHR (8GB) - 180

RTX 2080Ti - 127-142 (OC) 

RTX 3070Ti - 128 (OC)

RTX A6000 - 113

TITAN V - 112 (OC) 

Radeon VII - 98 (OC)

RTX 3080Ti - 90 (OC) 

Cheers!

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46750502642
RAC: 64129220

I run 3x per GPU with these current tasks. Times listed are 1/3 of the actual runtime, to reflect effective task speed.


My RTX 3080Tis do them in about 90 seconds per task (+50 core, +1000 mem, 300W PL).

My RTX 3070Tis do them in about 128 seconds per task (+50 core, +1000 mem, 230W PL).

My RTX 2080Tis do them in about 142 seconds per task (+50 core, +400 mem, 225W PL).

Before I stopped using them, the RTX 2080 was doing about 200 seconds per task (+50 core, +400 mem, 185W PL).

 

(note: these are times for the current LAH3000 series tasks, 4000 series run slower) 
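As a rough sketch of how the "1/3 of actual runtime" figure works: the effective time per task is the wall-clock runtime of one task divided by the number of tasks sharing the GPU. The function name and the 270 s raw runtime below are illustrative assumptions (270 s is simply 3 x 90 s), not measured values.

```python
def effective_seconds_per_task(wall_seconds: float, concurrent_tasks: int) -> float:
    """Effective time per task when several tasks share one GPU."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return wall_seconds / concurrent_tasks


# Hypothetical raw runtime: each task takes 270 s while 3 run at once.
effective = effective_seconds_per_task(270.0, 3)
print(f"{effective:.0f} s effective per task")
```

This only holds if all concurrent tasks are of the same kind and finish in roughly the same time.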

_________________________________________________________________________

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220614931
RAC: 940490

Ian&Steve C. wrote:

(note: these are times for the current LAH3000 series tasks, 4000 series run slower) 

Yes, that is an important thing to control for in making any comparisons.

A much tinier, but non-zero, complication is that the current issue of 3000 series tasks appears to me to require progressively slightly more computation week by week, in contrast to the multi-month stability of effort level I think the 4000 series possessed.

Markus Windisch
Joined: 23 Aug 21
Posts: 61
Credit: 97881372
RAC: 0

I don't know what that means exactly, but my tasks seem to be 3000ish series, too (for example LATeah3012L00_884.0_0_0.0_9384993_0).

It's amazing how much more these newer cards can crunch compared to older ones. They are about 6x faster in a gaming benchmark, but about 30x faster at crunching with BOINC (GTX 660 Ti vs RTX 3080).
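A small sketch of how such a speedup factor can be read off the list in the first post: divide the older card's seconds per task by the newer card's. The figures are copied from that list; note that it contains an RTX 3080 Ti rather than a plain RTX 3080, so the ratio here is not the same comparison as the rough estimate above. The dictionary and function names are made up for illustration.

```python
# Seconds per task, taken from the list in the first post (subset).
seconds_per_task = {
    "GTX 660 Ti": 3332,
    "GTX 750 Ti": 1550,
    "RTX 2070": 253,
    "RTX 3080 Ti": 90,
}


def speedup(baseline: str, card: str, times: dict) -> float:
    """How many times faster `card` is than `baseline` (ratio of task times)."""
    return times[baseline] / times[card]


for card in seconds_per_task:
    factor = speedup("GTX 660 Ti", card, seconds_per_task)
    print(f"{card}: {factor:.1f}x")
```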

petri33
Joined: 4 Mar 20
Posts: 123
Credit: 4030645819
RAC: 7008964

The variation in the tasks of the 3000 series seems to be related to the ...

"I don't know what that means exactly, but my tasks seem to be 3000ish series, too (for example LATeah3012L00_884.0_0_0.0_9384993_0)."

... 884 part of the task name.

The higher the number at the place of '884' the longer it takes to process. That is what I have observed.
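To group run times by that number, the field can be pulled out of the task name. This is only a sketch: it assumes the name is underscore-separated and that the '884.0' value is always the second field, which is inferred from the single example name in this thread, not from any documentation. The function name is hypothetical.

```python
def second_field(task_name: str) -> float:
    """Return the second underscore-separated field of a task name as a number."""
    fields = task_name.split("_")
    return float(fields[1])


name = "LATeah3012L00_884.0_0_0.0_9384993_0"
print(second_field(name))
```

Sorting a list of (field value, run time) pairs by the first element would then show whether run time grows with the number, as observed above.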

*nag and hope*

*i* ()___)_____________))))~~

It messes up my testing and timing. If the series changes, I have to start testing all over again (in the production environment, i.e. the Einstein (your) database).

I'd really need a performance testing environment like the one the Lunatics provided when developing and optimizing Seti@home software.

That is: a user-defined set of tasks and executable versions, a comparison of run times in percent, and a check of result validity.
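A minimal sketch of what the comparison step of such a test bench could look like. This is not the Lunatics tool and it does not run any executables; it assumes each run has already produced a run time and a list of numeric outputs per task, and every name, tolerance and number below is made up for illustration.

```python
import math


def percent_change(baseline_s: float, candidate_s: float) -> float:
    """Run time change of the candidate relative to the baseline, in percent.

    Negative values mean the candidate is faster.
    """
    return (candidate_s - baseline_s) * 100.0 / baseline_s


def outputs_agree(reference, candidate, rel_tol=1e-5) -> bool:
    """Crude validity check: same length and all values close to the reference."""
    return len(reference) == len(candidate) and all(
        math.isclose(r, c, rel_tol=rel_tol) for r, c in zip(reference, candidate)
    )


def compare_versions(baseline: dict, candidate: dict) -> dict:
    """Map each task to (percent run time change, results match?)."""
    report = {}
    for task, (base_s, base_out) in baseline.items():
        cand_s, cand_out = candidate[task]
        report[task] = (percent_change(base_s, cand_s), outputs_agree(base_out, cand_out))
    return report


# Hypothetical data: task -> (run time in seconds, numeric outputs).
baseline = {"task_a": (100.0, [1.0, 2.0]), "task_b": (200.0, [3.0, 4.0])}
candidate = {"task_a": (90.0, [1.0, 2.0]), "task_b": (210.0, [3.0, 4.5])}

for task, (pct, valid) in compare_versions(baseline, candidate).items():
    print(f"{task}: {pct:+.1f}% run time, results {'match' if valid else 'DIFFER'}")
```

A fixed task set is the important part: it removes the week-to-week variation in task difficulty from the comparison.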

------------- and other things that came to my mind --------------

"Oh please Code Lords -- Gimme gimme gimme -- ABBA -- time goes by -- Madonna."

p.s.

I'm developing an AIO solution for NVIDIA users that have at least OpenCL 2.0 and a reeeelatively new GPU. I have RTX 2080 Ti's and a TITAN V (an old one). + No need (?) for a libsleep.so. User-editable clFFT NVIDIA kernels to test at will.

This coming weekend I may be able to give a preliminary version to Ian&Steve to test with. (Should you get one, please make a backup of your current version(s). The all-new version may not perform as expected.) -- For now, the version I'm running under "anonymous platform" does not seem to reach its full potential.

But as said .. bubbling under. Something is coming. Summer is coming to the northern hemisphere -- I hope.

--

me

Markus Windisch
Joined: 23 Aug 21
Posts: 61
Credit: 97881372
RAC: 0

petri33 wrote:

It messes up my testing and timing. If the series changes, I have to start testing all over again (in the production environment, i.e. the Einstein (your) database).

I'd really need a performance testing environment like the one the Lunatics provided when developing and optimizing Seti@home software.

Damn... well. Can we maybe rely on CUDA Benchmarks? These are available everywhere

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18714890382
RAC: 6364950

Wow! Exciting things coming, Petri. Sounds like you have been busy.

I'll be in line after Ian does his tests.

I once tried to figure out how to get the old Lunatics and BenchMT benchmarking scripts working on Einstein. But after a lot of wheel spinning, it was all for nought.

 

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18714890382
RAC: 6364950

Markus Windisch wrote:

petri33 wrote:

It messes up my testing and timing. If the series changes, I have to start testing all over again (in the production environment, i.e. the Einstein (your) database).

I'd really need a performance testing environment like the one the Lunatics provided when developing and optimizing Seti@home software.

Damn... well. Can we maybe rely on CUDA Benchmarks? These are available everywhere

Einstein does not use CUDA.  They use OpenCL.

 

cecht
Joined: 7 Mar 18
Posts: 1533
Credit: 2900978889
RAC: 2188333

I've kept a tally of completed tasks and times over the past month or so for my
AMD RX 5600XT (6 GB) running 3X tasks of the LATeah3012 series.

13099 tasks reported over 921 continuous hours, effective times per task
average: 253 sec
stdev: 16 sec
range: 116 sec -- 321 sec
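The reported average is consistent with simply dividing the total wall-clock time by the number of tasks reported. The sketch below assumes that is how the effective time was derived (the post does not say so explicitly) and also converts the tally into tasks per day; the function names are illustrative.

```python
def effective_seconds(tasks: int, hours: float) -> float:
    """Average wall-clock seconds spent per reported task."""
    return hours * 3600 / tasks


def tasks_per_day(tasks: int, hours: float) -> float:
    """Average number of tasks reported per 24 hours."""
    return tasks / hours * 24


# Figures from the tally above: 13099 tasks over 921 continuous hours.
print(f"{effective_seconds(13099, 921):.0f} s per task")
print(f"{tasks_per_day(13099, 921):.0f} tasks per day")
```

Because the 3X concurrency is already included in the task count, no extra division by 3 is needed here.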

 

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 240
Credit: 10553155586
RAC: 25479456

The RTX A6000s complete the tasks in about 113 sec. 

I have been messing around with the settings recently and turned on the ECC of the GPUs to see the impact. It slows computation by about 20%, understandably. 

So my question: does enabling ECC on these GPUs have any benefit for these work units? Will it improve the accuracy of the results?

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46750502642
RAC: 64129220

Boca Raton Community HS wrote:

The RTX A6000s complete the tasks in about 113 sec. 

I have been messing around with the settings recently and turned on the ECC of the GPUs to see the impact. It slows computation by about 20%, understandably. 

So my question: does enabling ECC on these GPUs have any benefit for these work units? Will it improve the accuracy of the results?

Maybe. But I think any inaccuracy is due to the code and validator and not the GPU’s lack of ECC hardware (a bit flip would undoubtedly result in a computation error and not an invalid). The invalid rate of the new 1.28 Nvidia app on normal consumer GPUs seems to be around 3-4%. Not worth the 20% hit in my opinion, even if it drives the invalid rate to less than 1%, which I doubt anyway.
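A back-of-the-envelope sketch of that trade-off, counting only valid tasks per hour. Several things here are assumptions: the 20% slowdown is read as 20% fewer tasks per unit time, 3.5% is taken as the midpoint of the 3-4% invalid rate, 1% is the optimistic rate with ECC, and the consumer-GPU invalid rate is applied to the 113 s A6000 figure purely for illustration.

```python
def valid_tasks_per_hour(seconds_per_task: float, invalid_rate: float) -> float:
    """Tasks completed per hour that also pass validation."""
    return 3600 / seconds_per_task * (1 - invalid_rate)


base_seconds = 113.0               # task time without ECC (from the post above)
ecc_seconds = base_seconds / 0.8   # assumption: ECC costs 20% of the throughput

no_ecc = valid_tasks_per_hour(base_seconds, 0.035)   # assumed 3.5% invalid
with_ecc = valid_tasks_per_hour(ecc_seconds, 0.01)   # assumed 1% invalid

print(f"without ECC: {no_ecc:.1f} valid tasks/hour")
print(f"with ECC:    {with_ecc:.1f} valid tasks/hour")
```

Under these assumptions the run without ECC still returns more valid tasks per hour, since a few percentage points of invalids cannot make up for a 20% throughput loss.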

_________________________________________________________________________
