You miss my point. There are reviews that do benchmark the compute performance of video cards. My post was in response to the previous posts wishing that there were single-precision and double-precision benchmarks done on recent video cards. There are. The Anandtech Folding@Home benchmarks are still valid for other projects, since you can interpret the comparison results as a relative scaling across different brands of cards and different generations of hardware. If you want a direct comparison of compute performance at Einstein@Home, then look at the Top 100 participants or Top computers in Statistics. You still cannot divorce yourself from the performance differences caused by different operating systems, desktop hardware, drivers, or apps.
Alright, I got your point, Keith. However, my comment was formulated a bit harshly to make it clear that there can't be something like "general compute benchmarks". It doesn't really matter that Anandtech offers what the OP was looking for, because he can't draw the conclusion he's looking for from that data. And if he does so nevertheless, he's probably going to be wrong. Compute workloads are just too diverse to be represented by anything but the specific app, or at least the same algorithm, library, etc.
BTW: do you already have results from a GTX 1060 3 GB? My GTX 1060 6 GB yields about 119k RAC running 2 x BRP CUDA 5.5 at a 2.0 GHz (OC) chip clock and a 4000 MHz memory clock (stock is 3800 MHz for compute loads).
I have a 1060 3GB and a 1060 6GB. Using similar standards and overclocking methods, but on a somewhat different host, I'm seeing my 3GB at about 93% of my 6GB's productivity, running 3x on BRP6/CUDA55 on a Windows 10 host.
I'll post more specific numbers later. (Right now the 1060 3GB is halfway through a 24-hour proof run at one increment less in both core and memory clock than the maximum observed success; if that run is flawless, I'll back down one more increment for cruising speed.)
As stated previously, I currently operate a Founders Edition GTX 1070, a dual-fan MSRP-priced PNY GTX 1060 6GB, and an MSRP-priced single-fan Zotac GTX 1060 3GB. All of them operate in Windows 10 systems with two Nvidia GPU cards. Each has gone through a similar commissioning process, ending with them running at a "cruising speed" somewhat below maximum observed success in what I believe and hope to be a long-term stable, error-free condition.
Here are my performance observations:
Attribute      1070      1060_6GB   1060_3GB
Paid           $450      $250       $200
core_offset    180       170        205
core_MHz       2012      2025       2006
mem_offset     800       550        550
mem_MHz        2304      2177       2177      (on the GPU-Z scale)
credit/day     203,657   143,097    132,306   (BRP6/CUDA55, computed from average elapsed time)
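For anyone who wants to reproduce a credit/day figure like these on their own host, here is a minimal sketch that averages task elapsed times from BOINC's per-project job log. The log file name and the per-task credit value are assumptions, so verify both (the file sits in the BOINC data directory; the current BRP6 award is listed on the project site), and set the concurrency to match how many tasks you run per GPU.

```python
# Minimal sketch: estimate credits/day from a BOINC per-project job log.
# ASSUMPTIONS: the log file name (check your BOINC data directory) and
# the per-task credit value (check the project site before trusting it).
from statistics import mean

LOG = "job_log_einstein.phys.uwm.edu.txt"  # assumed name; verify locally
CONCURRENCY = 3          # tasks run simultaneously per GPU (3x in this thread)
CREDIT_PER_TASK = 4400   # assumed BRP6 award per task

elapsed = []
with open(LOG) as f:
    for line in f:
        tokens = line.split()
        # Lines look like: "<time> ue <est> ct <cpu> fe <flops> nm <name> et <elapsed> ..."
        fields = dict(zip(tokens[1::2], tokens[2::2]))
        if "et" in fields:
            elapsed.append(float(fields["et"]))

if not elapsed:
    raise SystemExit("no completed tasks found in the log")

avg = mean(elapsed[-50:])  # average the most recent completions
tasks_per_day = CONCURRENCY * 86400.0 / avg
print(f"avg elapsed {avg:.0f} s -> ~{tasks_per_day:.1f} tasks/day "
      f"-> ~{tasks_per_day * CREDIT_PER_TASK:,.0f} credits/day")
```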
Very nice tabled results, Archae86. Looks to me like the 1060 3GB models offer the best bang for the buck, with only 8% less performance than the 1060 6GB models at 80% of their list price.
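Putting numbers on that bang-for-the-buck claim, using only the prices and credit rates from the table above:

```python
# Credits/day per dollar, straight from the table above.
cards = {
    "GTX 1070":     (450, 203_657),
    "GTX 1060 6GB": (250, 143_097),
    "GTX 1060 3GB": (200, 132_306),
}
for name, (price_usd, credit_per_day) in cards.items():
    print(f"{name}: {credit_per_day / price_usd:,.0f} credits/day per dollar")
# -> roughly 453, 572, and 662: the 3GB card leads on this metric
```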
Does a memory overclock improve production? I have a 1070 but haven't really overclocked the memory; I think only about 50 MHz.
Yes, it does. In fact, on the Pascal cards I've worked with so far, the memory overclock I've reached has contributed far more production improvement than the core clock has.
Be careful, though, to check results. On one of my cards, tasks run at even a substantial memory overclock still completed and reported normally, and passed the sanity test at the beginning of quorum validation. The first hint of trouble was an inconclusive comparison out of multiple validation attempts, followed, typically days later, by invalid conclusions reached by comparison with the tie-breaker when it came back. I thought I was moving pretty slowly and carefully, but wound up wasting the computation of about 15 BRP6 WUs this way.
Another of my cards had a nasty habit of doing unrequested reboots if the memory overclock was too high. I'm putting together some observations, thoughts, and conclusions on Pascal overclocking, and mean to start a thread on that topic "real soon now".
I seem to be getting away with a 400 MHz overclock on memory with my 1070s. This is after getting them to run at the stock 8000 MHz in the P2 state first. I've found the BRP6 CUDA55 tasks always respond better to memory overclocks than to core clocks.
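The different memory-clock scales quoted in this thread (the 3800/4000 figures, the GPU-Z 2177/2304 readings, and the 8000 here) can be reconciled if one assumes Pascal's P2 compute state drops GDDR5 to about 1900 MHz on the GPU-Z scale and that the overclock offset is applied on the doubled scale. That reading is inferred purely from the numbers posted above, not from Nvidia documentation:

```python
# Hedged reconciliation of the memory-clock scales used in this thread.
# ASSUMPTION (inferred from the quoted numbers, not from Nvidia docs):
# the P2 compute state runs GDDR5 at ~1900 MHz on the GPU-Z scale, and
# the overclock offset is applied on the doubled (2x GPU-Z) scale.
P2_BASE_GPUZ = 1900.0  # MHz, assumed P2 baseline on the GPU-Z scale

def gpuz_clock(offset_mhz: float) -> float:
    """GPU-Z memory clock implied by an offset on the assumed 2x scale."""
    return P2_BASE_GPUZ + offset_mhz / 2.0

for card, offset in [("GTX 1070", 800), ("GTX 1060", 550)]:
    clk = gpuz_clock(offset)
    # GDDR5 transfers 4 bits per pin per GPU-Z clock, hence the 4x rate
    print(f"{card}: +{offset} -> {clk:.0f} MHz GPU-Z, {4 * clk:.0f} MT/s effective")
# -> ~2300 and ~2175 MHz, matching the 2304/2177 table entries within
#    the clock-step quantization
```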
Thanks for those numbers! Could you also include average power consumption? HWiNFO64 reports it in W for convenience.
While I have no doubt that Mumak's program faithfully reports what the card tells it, and it has a handy averaging function, it is definitely not an accurate representation of the system-level power impact of running Einstein work on one of these cards. Even if the board-level power reading were perfect, which I gravely doubt, it cannot possibly include the extra power consumption imposed on the CPU, motherboard, RAM, and I/O, nor the power-supply inefficiency overhead.
With all those "pay no attention to these numbers" caveats as preface, just for you I post these HWiNFO averages while running 3X BRP6/CUDA55:
GTX 1070 127.4 watts
GTX 1060 6GB 88.9 watts
The host system which is running these two cards had an idle power consumption, with these cards installed, of about 65 watts, and consumes an average of 295 watts at the wall (box, not monitor) when running Einstein BRP6 work at these rates.
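From those wall figures, a rough efficiency estimate follows (treating the 65 W idle as the baseline, so the CPU feeding overhead and PSU losses are included, unlike the board-level readings above):

```python
# Rough crediting efficiency from the wall-power figures quoted above.
idle_w, load_w = 65.0, 295.0            # measured at the wall, box only
credit_per_day = 203_657 + 143_097      # 1070 + 1060 6GB from the table
einstein_w = load_w - idle_w            # incremental draw for Einstein work
print(f"{einstein_w:.0f} W -> {credit_per_day / einstein_w:,.0f} credits/day per watt")
# -> 230 W and roughly 1,508 credits/day per watt for this two-card host
```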