Comparing shader counts between models isn't useful when it comes to OpenCL, or even CAL. That comparison is only meaningful between GPUs of the same make and generation; only then does more mean better.
What OpenCL uses is Compute Units and OpenCL Processing Elements.
See Geeks3D for a complete write-up and, while you're there, get GPU Caps Viewer. That'll show you the 'power' of your GPU.
From the introduction to OpenCL (PDF) we know that the number of SIMD processors on the GPU equals the number of Compute Units; they are the same thing.
The HD6850 has 12 SIMDs or Compute Units.
The HD5570 has 5 SIMDs or Compute Units.
This way we can calculate the strength of the GPU; we do that by dividing the number of stream processors by the number of compute units.
The HD6850 has 12 Compute Units: 960 stream processors / 12 Compute Units = 80 OpenCL Processing Elements per Compute Unit.
The HD5570 has 5 Compute Units: 400 stream processors / 5 Compute Units = 80 OpenCL Processing Elements per Compute Unit.
Which means that in pure OpenCL speed, the HD6850 and HD5570 are equal.
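The division above can be sketched in a few lines of Python; the shader and SIMD counts are the figures quoted in the post:

```python
# Per-compute-unit width (stream processors per SIMD) for the
# two cards compared above. Figures are the ones quoted in the thread.
cards = {
    "HD 6850": {"stream_processors": 960, "compute_units": 12},
    "HD 5570": {"stream_processors": 400, "compute_units": 5},
}

for name, c in cards.items():
    per_cu = c["stream_processors"] // c["compute_units"]
    print(f"{name}: {per_cu} processing elements per compute unit")
    # Both cards print 80, which is the point being made above.
```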
Hello! Thank you for the explanation. I have the 1 GB model and a Core i7-2600 (8 virtual cores), ratio 0.5. But nevertheless I have only 9 tasks running at a time. Why is that? I thought there would be 10 tasks (8 cores and 2 GPUs). Two of the tasks are marked "0.5 CPU + 0.5 GPU" - what's that supposed to mean?
That actually makes perfect sense. Think about it - the 7 CPU tasks that are running require one core each, leaving only one core free. The 2 GPU tasks that are running require only 0.5 CPUs each (for a total of one whole core). So 7 CPU tasks consume 7 CPU cores, while the 2 GPU tasks share the 8th and final CPU core, for a total of 9 tasks running at any one time...
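The core accounting described above can be written out as simple arithmetic (a sketch; the 0.5-CPU reservation per GPU task is the figure shown in BOINC's task list quoted above):

```python
# Each CPU task needs a full core; each GPU task reserves 0.5 of a core.
cores = 8
gpu_tasks = 2
cpu_per_gpu_task = 0.5

cores_for_gpu = gpu_tasks * cpu_per_gpu_task   # 1.0 core reserved for GPU feeding
cpu_tasks = int(cores - cores_for_gpu)         # 7 cores left for CPU-only tasks
total_running = cpu_tasks + gpu_tasks          # 9 tasks in total

print(cpu_tasks, total_running)  # 7 9
```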
Quote:
The HD6850 has 12 SIMDs or Compute Units.
The HD5570 has 5 SIMDs or Compute Units.
This way we can calculate the strength of the GPU; we do that by dividing the amount of stream processors by the amount of compute units.
HD6850 has 12 Compute Units. 960 stream shaders / 12 Compute Units = 80 OpenCL Elements.
HD5570 has 5 Compute Units. 400 stream shaders / 5 Compute Units = 80 OpenCL Elements.
Which means that in pure OpenCL speed, the HD6850 and HD5570 are equal.
So I guess the same can be said about my HD 5870, since 1600 stream processors / 20 compute units = 80 OpenCL Elements. And I guess that would explain why the HD 6850 run times were similar to mine, even though the 5870 is far more powerful than the 6850. But shouldn't the HD 5570 have similar run times too, then?
...but shouldn't the HD 5570 have similar run times too then?
When you put it in your system, yes. ;-)
But now you have to put into the equation:
- How much memory does the card have?
- What memory speed does the GPU have?
- What core speed does the GPU have?
- What kind of PCIe slot is it in?
- What kind of CPU is next to it?
- What else is this CPU doing?
- Did the user allow for the extra core for the GPU only?
- Is anything (CPU or GPU) overclocked and by how much?
- Which drivers are we using?
...but shouldn't the HD 5570 have similar run times too then?
When you put it in your system, yes. ;-)
But now you have to put into the equation:
- How much memory does the card have?
- What memory speed does the GPU have?
- What core speed does the GPU have?
- What kind of PCIe slot is it in?
- What kind of CPU is next to it?
- What else is this CPU doing?
- Did the user allow for the extra core for the GPU only?
- Is anything (CPU or GPU) overclocked and by how much?
- Which drivers are we using?
Today I tried setting up for all cores: BOINC uses 100% of the CPU/all 4 cores with 100% duty cycle. Result: I still get only one GPU + 0.5 CPU job plus 4 other (non-GPU) Einstein@home jobs.
AFAIK, in a host with different GPUs, BOINC should recognize all of them, but by default it will use just the best one (you need to add/edit the cc_config file to instruct it to use all of them...)
The weird thing is that your hosts page seems to show BOINC reporting only the 5570... so something is wrong there...
It might help to restart BOINC and then look in the first lines of the event log to see which devices are listed/recognized by BOINC...
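For reference, instructing BOINC to use every GPU rather than just the most capable one is done with the `<use_all_gpus>` option in cc_config.xml (placed in the BOINC data directory); a minimal sketch:

```xml
<cc_config>
  <options>
    <!-- Use all usable GPUs, not just the most capable one -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

After editing the file, restart BOINC (or use "Read config files" in the manager) so the option takes effect.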
Comparing shaders between models isn't useful when it comes to OpenCL, or even CAL. That's only useful when you compare between GPUs of the same make and generation. Only then is more better.
Yeah, my bad; my gamer bias was showing. I was so impressed with the size of my pipeline that I forgot it's not the size that matters but what you can do with it... :P
Quote:
What OpenCL uses is Compute Units and OpenCL Processing Elements.
See Geeks3D for a complete write-up and while you're there, get CAPS GPU Viewer. That'll show you the 'power' of your GPU.
From the introduction to OpenCL (PDF) we know that when you know the amount of SIMD processors on the GPU, that you know the amount of Compute Units, as these are the same.
Thanks for the info. I haven't been keeping up with things on the OpenCL side for a while... and I killed several hours poring over the new data and testing software... damn you. :)
This did lead me to another link people here might find interesting: an OpenCL benchmark database that compares compute performance across all the different hardware, including AMD CPUs and the new Intel cores that support OpenCL. It gives a lot of detailed information when you open the links for each piece of hardware.
...but shouldn't the HD 5570 have similar run times too then?
When you put it in your system, yes. ;-)
But now you have to put into the equation:
- How much memory does the card have?
- What memory speed does the GPU have?
- What core speed does the GPU have?
- What kind of PCIe slot is it in?
- What kind of CPU is next to it?
- What else is this CPU doing?
- Did the user allow for the extra core for the GPU only?
- Is anything (CPU or GPU) overclocked and by how much?
- Which drivers are we using?
etc.
True... I didn't think about all that, LOL.
Ageless and I have very similar systems, both 6850s and i5-2500Ks, but our run times are a little different because my i5 is OCed to 4.5 GHz and his isn't, which looks to give me about a 300 sec boost on my GPU times. Funny that it should be so... wonder how much difference a PCIe 3.0 card and slot would make.
I direct you over to the PCIe 2.0 vs 3.0 thread in Cruncher's Corner.
In short, it makes a HUGE difference. These tasks rely heavily on the CPU; when running 3 tasks, one can nearly max out PCIe 3.0 bandwidth.
4 would max it for sure. This is why I stated in the 690 thread that a 690 would do more harm than good: it would be able to run 6 tasks at a time, but would be limited greatly by bandwidth.
Having a well-overclocked processor makes a big difference for this project.
BOINC: 7.0.28
ATI: 12.3 (apparently 12.4 might have some issues with BOINC) - APP SDK installed.
GFX Card: Radeon 5670, 1GB onboard RAM
Collatz Conjecture works fine on this and uses 100% of my GPU.
However when I suspend that, the GPU goes to E@H (good) which is only using between 25-35% of the GPU (bad). The command that initialised it:
06/06/2012 20:51:18 | Einstein@Home | Starting task p2030.20100903.G43.15+00.93.C.b5s0g0.00000_504_1 using einsteinbinary_BRP4 version 124 (atiOpenCL) in slot 14
Extrapolating from the time taken and % complete, I'm estimating about 10.5 hrs to complete this task (I don't know if that's fast or slow, but it could be 4 times faster if it used all of my GPU).
Any thoughts?
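The extrapolation described above is just elapsed time divided by the fraction done; a sketch with hypothetical example figures (the post does not give the exact elapsed time or percentage):

```python
# Estimate total runtime from elapsed time and fraction completed.
# These input figures are illustrative, not taken from the post.
elapsed_hours = 2.6
fraction_done = 0.25

estimated_total = elapsed_hours / fraction_done
print(f"estimated total: {estimated_total:.1f} h")  # estimated total: 10.4 h
```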
Hi!
It usually helps to configure BOINC so that one core of the CPU is not allocated to CPU-only tasks and is therefore free to be used exclusively to feed the GPU.
E.g. if you have a quad-core, set preferences to use only 75% of the available cores on a multi-core CPU => 3 cores will be allocated to CPU-only jobs (e.g. the E@H GW search) and the remaining core will get a BRP4 OpenCL job.
CU
HB
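The preference arithmetic above can be sketched as follows (BOINC rounds the "use at most N% of CPUs" setting down to whole cores, so on a quad-core 75% leaves one core free):

```python
# On a quad-core, a 75% CPU-usage preference leaves one core
# free to feed the GPU.
total_cores = 4
use_percent = 75

usable = (total_cores * use_percent) // 100   # cores for CPU-only tasks
free_for_gpu = total_cores - usable           # core left to feed the GPU

print(usable, free_for_gpu)  # 3 1
```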