How does BOINC / Einstein handle multiple GPUs?

Filipe
Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 408428888
RAC: 315057


With a pair of GTX 550s, you can expect to run 4 WUs at the same time: 2 per GPU.

Read this thread:

http://einsteinathome.org/node/196075

If a single WU takes you 48 minutes, 2 WUs at once will take roughly 1h10.

That means roughly 80 WUs per 24 hours, instead of 60!

24 hours = 1440 minutes.

1440 / 70 ≈ 20 runs × 4 WUs ≈ 80 WUs

1440 / 48 = 30 runs × 2 WUs (one per GPU) = 60 WUs
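The arithmetic above can be sketched as a few lines of Python (using the run times quoted in the post; the ~70-minute figure for a doubled-up run is the poster's estimate, not a measurement of mine):

```python
# Throughput comparison for a pair of GPUs:
# running 1 WU per GPU vs 2 WUs per GPU concurrently.
MINUTES_PER_DAY = 24 * 60  # 1440

# One WU at a time per GPU: 48 min each, 2 GPUs in parallel.
single = (MINUTES_PER_DAY / 48) * 2  # 60 WUs/day

# Two WUs at a time per GPU: a pair finishes in ~70 min,
# so each ~70-min cycle yields 4 WUs across the 2 GPUs.
doubled = (MINUTES_PER_DAY / 70) * 4  # ~82 WUs/day

print(f"1 WU per GPU:  {single:.0f} WUs/day")
print(f"2 WUs per GPU: {doubled:.0f} WUs/day")
```

Doubling up pays off whenever two concurrent WUs finish in less than twice the single-WU time, which is the case here (70 min < 2 × 48 min).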

Here's my app_info.xml:

<app_info>
    <app>
        <name>einstein_S6Bucket</name>
        <user_friendly_name>Gravitational Wave S6 GC search v1.01</user_friendly_name>
    </app>
    <file_info>
        <name>einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einstein_S5R6_3.01_graphics_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>einstein_S6Bucket</app_name>
        <version_num>101</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>1.000000</avg_ncpus>
        <max_ncpus>1.000000</max_ncpus>
        <plan_class>SSE2</plan_class>
        <api_version>6.13.0</api_version>
        <file_ref>
            <file_name>einstein_S6Bucket_1.01_windows_intelx86__SSE2.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R6_3.01_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
    </app_version>
    <app>
        <name>einsteinbinary_BRP4</name>
        <user_friendly_name>Binary Radio Pulsar Search</user_friendly_name>
    </app>
    <file_info>
        <name>einsteinbinary_BRP3_1.07_windows_intelx86__BRP3cuda32.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einsteinbinary_BRP3_1.00_graphics_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cudart_xp32_32_16.dll</name>
    </file_info>
    <file_info>
        <name>cufft_xp32_32_16.dll</name>
    </file_info>
    <file_info>
        <name>db.dev.win.96b133b1</name>
    </file_info>
    <file_info>
        <name>dbhs.dev.win.96b133b1</name>
    </file_info>
    <app_version>
        <app_name>einsteinbinary_BRP4</app_name>
        <version_num>105</version_num>
        <platform>windows_intelx86</platform>
        <avg_ncpus>0.200000</avg_ncpus>
        <max_ncpus>1.000000</max_ncpus>
        <plan_class>BRP3cuda32</plan_class>
        <api_version>6.13.0</api_version>
        <file_ref>
            <file_name>einsteinbinary_BRP3_1.07_windows_intelx86__BRP3cuda32.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>cudart_xp32_32_16.dll</file_name>
            <open_name>cudart32_32_16.dll</open_name>
        </file_ref>
        <file_ref>
            <file_name>cufft_xp32_32_16.dll</file_name>
            <open_name>cufft32_32_16.dll</open_name>
        </file_ref>
        <file_ref>
            <file_name>einsteinbinary_BRP3_1.00_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
        <file_ref>
            <file_name>db.dev.win.96b133b1</file_name>
            <open_name>db.dev</open_name>
        </file_ref>
        <file_ref>
            <file_name>dbhs.dev.win.96b133b1</file_name>
            <open_name>dbhs.dev</open_name>
        </file_ref>
        <coproc>
            <type>CUDA</type>
            <count>0.500000</count>
        </coproc>
        <flops>220200960.000000</flops>
    </app_version>
</app_info>
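For reference, on newer BOINC clients (roughly 7.0.40 and later) the same "2 WUs per GPU" effect no longer requires an anonymous-platform app_info.xml: an app_config.xml dropped into the same project directory with gpu_usage set to 0.5 does it, while keeping the project's stock apps and automatic updates. A minimal sketch, assuming the BRP4 short app name used above:

```xml
<app_config>
    <app>
        <name>einsteinbinary_BRP4</name>
        <gpu_versions>
            <!-- 0.5 GPUs per task => BOINC schedules 2 tasks per GPU -->
            <gpu_usage>0.5</gpu_usage>
            <cpu_usage>0.2</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```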

Filipe
Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 408428888
RAC: 315057


The missing files can be downloaded here:

http://einstein.phys.uwm.edu/download/

Quote:

The BOINC data directory doesn't show in the default view of Explorer, i.e. if you don't go to 'Folder Options' in Control Panel and select 'Show hidden files, folders, and drives'. If it's hidden, the Windows search facility won't search it either.

Go to

C:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu and create your app_info.xml there. Then restart BOINC.

zombie67 [MM]
Joined: 10 Oct 06
Posts: 121
Credit: 503746790
RAC: 1244905


Quote:
PCI-E bus does have an effect with Einstein CUDA apps. I have seen at least a 30% loss in performance in both Linux and Windows by moving a GPU from an x16 slot to an x8 slot.

Can anyone else duplicate this? My team mates have told me that running a very fast card even in a x1 slot (via adaptor) has no degradation in performance.

Reno, NV Team: SETI.USA

Sid
Sid
Joined: 17 Oct 10
Posts: 164
Credit: 973288111
RAC: 360037


Quote:
Quote:
PCI-E bus does have an effect with Einstein CUDA apps. I have seen at least a 30% loss in performance in both Linux and Windows by moving a GPU from an x16 slot to an x8 slot.

Can anyone else duplicate this? My team mates have told me that running a very fast card even in a x1 slot (via adaptor) has no degradation in performance.


My result is no more than 10%, so 2 cards at x8 are faster than one at x16. I was running 6 WUs simultaneously on one 560 Ti card; now I'm running 4 WUs + 4 WUs on 2 cards, because my processor can't handle more than 8 tasks simultaneously.

Filipe
Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 408428888
RAC: 315057


Quote:
now I'm running 4WUs+4WUs on 2 cards because my processor can't handle more then 8 task simultaneously.

Are you running 24/7?

Sid
Sid
Joined: 17 Oct 10
Posts: 164
Credit: 973288111
RAC: 360037


Quote:
Quote:
now I'm running 4WUs+4WUs on 2 cards because my processor can't handle more then 8 task simultaneously.

Are you runnung 24/7?


I'm trying to run 24/7, but it is not always possible, for various reasons.
If you are asking about RAC: I wasn't getting more than 20K with one card, and not more than 45K with two cards.
Theoretically I should be getting much more.
I'm still investigating why.

Horacio
Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0


Quote:
Quote:
PCI-E bus does have an effect with Einstein CUDA apps. I have seen at least a 30% loss in performance in both Linux and Windows by moving a GPU from an x16 slot to an x8 slot.

Can anyone else duplicate this? My team mates have told me that running a very fast card even in a x1 slot (via adaptor) has no degradation in performance.

I've been using two 9500 GTs in the same box, one at x16 and the other at x1 (in an x16 mechanical slot with only x1 electrical). A BRP WU on the x16 card took around 3 hours, while the other took around 5 hours. The difference was less noticeable for the optimized SETI apps, as they do less data transfer after the initial load.

Anyway, the overall performance was better than running just one card, and since running at x1 doesn't cut performance below 50%, I'd guess that running 2 cards at x8 will always be better than running just one at x16.

DanNeely
DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0


Quote:
Quote:
PCI-E bus does have an effect with Einstein CUDA apps. I have seen at least a 30% loss in performance in both Linux and Windows by moving a GPU from an x16 slot to an x8 slot.

Can anyone else duplicate this? My team mates have told me that running a very fast card even in a x1 slot (via adaptor) has no degradation in performance.

Was your team talking specifically about E@H, or about GPU apps in general? Most apps probably won't care much about an x1 slot, since they only talk to the CPU at startup, at shutdown, and (optionally) when saving checkpoints. E@H's GPU app is only partially GPU-accelerated and relies on talking back and forth with the CPU, which does other parts of the work (and is why you run into a sharp diminishing-returns wall above the 460/560-series GPUs). Because of this back-and-forth chatter, it's one of the few apps that takes a non-trivial hit from having only x8 bandwidth instead of x16.
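The distinction can be illustrated with a toy model (my own back-of-envelope assumption, not Einstein@Home's actual profile): per-WU time is GPU compute time plus host-to-device transfer time, with transfer time inversely proportional to slot bandwidth. A "chatty" app that keeps shuttling intermediate results over the bus loses far more in a narrow slot than a compute-only app does. All figures below are made up for illustration.

```python
# Approximate usable bandwidth per slot width, GB/s (PCIe 2.0, rough figures).
PCIE_GBPS = {"x16": 8.0, "x8": 4.0, "x1": 0.5}

def wu_time(compute_s, transfer_gb, slot):
    """Seconds per WU: GPU compute time plus time to move transfer_gb over the bus."""
    return compute_s + transfer_gb / PCIE_GBPS[slot]

# Chatty app: lots of intermediate data crosses the bus during the run.
for slot in ("x16", "x8", "x1"):
    print(f"chatty app,       {slot}: {wu_time(1800, 4800, slot):.0f} s")

# Compute-only app: data moves mostly at startup, so slot width barely matters.
for slot in ("x16", "x8", "x1"):
    print(f"compute-only app, {slot}: {wu_time(1800, 8, slot):.0f} s")
```

With these (invented) numbers the chatty app slows ~25% going x16 to x8 and craters at x1, while the compute-only app is essentially unaffected, matching the pattern reported in this thread.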

hotze33
hotze33
Joined: 10 Nov 04
Posts: 100
Credit: 368387400
RAC: 0


I can also confirm the ~30% loss in performance from going PCIe x16 -> x8. I recently removed the GTX 460, and now the GTX 470 is crunching alone: computation time decreased from ~3600 s to ~2600 s.

Novotno
Novotno
Joined: 15 Jan 12
Posts: 2
Credit: 130016
RAC: 0


I can also confirm a loss of performance going from x16 to x4. On my Q6600 @ 3 GHz with 2x GTX 285 (3 CUDA WUs per card), the time to complete 3 CUDA WUs was 2:20 h on the GTX 285 @ PCIe x16 versus 3:30 h on the GTX 285 @ PCIe x4. My ASUS P5K-E only allows a 16x/4x configuration.
