This is only true if you are running cpu tasks as well. There is a whole new world out there once you stop the cpu from interfering with the gpu...
Gord
well there's some truth in everyone's statements here...
i suppose i should rephrase my claim as follows: 3 simultaneous E@H tasks run more efficiently on a GTX 560 Ti 1GB than 2 simultaneous E@H tasks do on a Win7 x64 platform so long as you also have sufficient CPU resources and a full 16 lanes of PCIe 2.0 bandwidth for the GPU. this also means that 3 simultaneous E@H tasks will run better on a GTX 560 Ti 1GB than 2 simultaneous E@H tasks will on a Win7 x64 platform so long as you have sufficient CPU resources and at least 8 lanes of PCIe 3.0 bandwidth. i can also confirm the same results as above when limited to only 8 lanes of PCIe 2.0 bandwidth (from when i had dual GTX 560 Ti's installed and the x16 slots were limited to PCIe 2.0 x8 bandwidth), and so the same must be true when limited to 4 lanes of PCIe 3.0 bandwidth on a Win7 x64 platform.
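For reference, the lane equivalences in the post above follow from the theoretical per-lane rates of the two PCIe generations. Here is a minimal sketch of that arithmetic, assuming the standard figures (PCIe 2.0: 5 GT/s with 8b/10b encoding; PCIe 3.0: 8 GT/s with 128b/130b encoding) and ignoring protocol overhead. The function name is just illustrative. Note that it only describes the slot: a card that is itself a PCIe 2.0 device links at 2.0 rates even in a 3.0 slot, so the 3.0 rows apply only if the link actually trains at 3.0.

```python
# Theoretical one-direction PCIe link bandwidth, ignoring protocol overhead.
# (transfer rate in GT/s, encoding efficiency) per generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Return usable GB/s in one direction for a link of the given width."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return rate_gt_s * efficiency / 8 * lanes


for gen, lanes in [("2.0", 16), ("3.0", 8), ("2.0", 8), ("3.0", 4)]:
    print(f"PCIe {gen} x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

On paper that gives 8.00 vs 7.88 GB/s for 2.0 x16 vs 3.0 x8, and 4.00 vs 3.94 GB/s for 2.0 x8 vs 3.0 x4, so each pair is within about 2% of each other.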
Not exactly.
On 2 of my 3 PCs no CPU units are running, and the Wheezy box has a 4-core CPU running only 2 CPU units in parallel.
Moreover, both Linux PCs run the GPU tasks at increased priority.
The WinXP PC has an integrated AMD video card, so the 560 Ti is dedicated to E@H only, not to desktop operations.
Good for Sunny to have a fast PC, probably Intel-based (?).
My 3 PCs are based on obsolete 45W-65W AMDs (2x AM2 / 1x AM3, 2x DDR2 / 1x DDR3), slow junk ;-)
On the other hand, I do not think the difference between 2 and 3 parallel units on a 560 Ti is all that significant.
that's what i do as well - IGP as dedicated display GPU, allowing the discrete GPUs to be completely dedicated to crunching. one of my home machines has a 6-core 1090T CPU and utilizes the HD 4290 IGP from the 890GX chipset to run the display. my other two home crunchers have i7 3770K CPUs and utilize the CPU's integrated HD 4000 GPU...and yes, having several available CPU core/threads certainly helps on the GPU end ;-)
it depends - on your slightly outdated hardware that may very well be the case, but i can assure you the difference on my machines is substantial (i can't recall the exact difference in terms of RAC or PPD b/c i experimented with and optimized my platform a long time ago...but i do remember that the difference was substantial enough for me to continue running 3 simultaneous tasks). with the necessary additional VRAM and compute power, i'm able to run 4 simultaneous tasks on my GTX 580s with substantially better results than running only 3 simultaneous tasks. but like any other GPU (or combination of GPUs), production efficiency will depend on a slew of other factors, and running the same number of simultaneous tasks on the exact same video card w/ the exact same core and memory clocks may not work for someone else due to those other factors (like PCIe slot bandwidth and/or speed, mobo chipset, CPU type, etc)...
I am amending the first part of my statement to read:
This is only true if your cpu doesn't have the horsepower to feed the gpu to its full potential.
Both of my 560Ti's run optimally at x4.
Gord
I updated my Linux AMD drivers from version 13.4 to 13.10 Beta 2 last week on both of my AMD hosts. On my dual card host, I have seen a significant reduction in run time compared to the older driver.
3x BRP5 tasks per GPU - 20 tasks averaged per version
Driver - Run time in seconds per task
13.4 - 7694.10
13.10 Beta 2 - 6406.51
The 3-card host also saw a reduction, but not as significant as on my dual card host. I think this may be due to that host having the extra card running in an x8 slot, which I have previously seen reduce performance a bit compared with cards running in x16 slots. There may be another reason as well.
The groups of tasks that I averaged were from last week. I wanted to get a few days of run time to confirm driver stability before posting.
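To put the two averages above in perspective, here is a quick sketch of the arithmetic. The run times are the ones posted. The tasks-per-day figure additionally assumes that each quoted time is the wall-clock time of one task while 3 run concurrently on the GPU, which the post does not state outright. Function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def percent_reduction(old_seconds: float, new_seconds: float) -> float:
    """Percentage by which the run time dropped going from old to new."""
    return (old_seconds - new_seconds) / old_seconds * 100


def tasks_per_day(seconds_per_task: float, concurrent_tasks: int) -> float:
    """Daily output of one GPU; ASSUMES seconds_per_task is wall-clock time
    per task while concurrent_tasks run side by side."""
    return concurrent_tasks * SECONDS_PER_DAY / seconds_per_task


old, new = 7694.10, 6406.51  # averages posted for 13.4 and 13.10 Beta 2
print(f"run time reduction: {percent_reduction(old, new):.1f}%")
print(f"13.4:         {tasks_per_day(old, 3):.1f} tasks/day per GPU")
print(f"13.10 Beta 2: {tasks_per_day(new, 3):.1f} tasks/day per GPU")
```

That works out to about a 16.7% shorter run time, or roughly 33.7 vs 40.5 tasks per day per GPU under the stated assumption.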
Hi,
does anyone have experience with the new NVIDIA GT 630 with the GK208-301-A1 "Kepler" chip?
The card is rated at 25W maximum and, according to TomsHardware, has (nearly) the same performance as the GT 640. But the benchmarks are mostly based on games, not on OpenCL crunching.
Alex
If you run Linux, AMD GPUs will perform faster, clock higher, and use lower voltages than in Windows as far as OpenCL is concerned. Plus, Linux uses far fewer resources than Windows XP.
After doing some testing in Ubuntu with my Radeon HD 7950 running Catalyst 13.11 @ 1200MHz with no voltage increase and a GPU load of 93%, the effective runtime per task is 800 seconds with three BRP4G-opencl-ati work units running at once (as in, run three work units, then divide their run time by three to get the actual throughput rate). I'll soon have numbers for the longer BRP5 tasks.
yeah, that beats the pants off my 1083 seconds per task in Windows 7, and i'm running dual 7970s @ 1000MHz each!
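The "divide by three" bookkeeping used a couple of posts up can be written out as a tiny helper. This is only a sketch: the 2400 s wall-clock figure is back-calculated from the quoted 800 s (3 x 800), and the comparison assumes the 1083 s Windows figure was derived the same way, which the post does not say.

```python
def effective_seconds_per_task(wall_seconds: float, concurrent_tasks: int) -> float:
    """Throughput-equivalent run time: wall-clock time of a batch that runs
    side by side on one GPU, divided by the number of tasks in the batch."""
    return wall_seconds / concurrent_tasks


# 3 BRP4G tasks finishing together after 2400 s (back-calculated, see above).
linux_hd7950 = effective_seconds_per_task(2400.0, 3)
windows_hd7970 = 1083.0  # quoted figure, ASSUMED to be computed the same way

print(f"Linux HD 7950: {linux_hd7950:.0f} s per task")
print(f"Windows figure is {windows_hd7970 / linux_hd7950:.2f}x as long per task")
```

Under those assumptions the Windows figure is about 1.35x the Linux one per task.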