The gain will, however, only show up SIGNIFICANTLY when using it for something like this (and other) projects, meaning that some of the code is re-used when running HT and the two cores are running the same app.
I beg to differ. It is oft reported that SETI runs much faster when paired with Einstein in a hyperthreaded machine than when paired with another SETI. On my Gallatin the difference is that a typical SETI unit consumes 45 CPU minutes when paired with an Einstein run, compared to about 75 CPU minutes when paired with another SETI.
On the other hand, the Einstein half of the pair seems to take the same time in these two cases.
Cache problem. SETI needs the whole cache, but not Einstein.
But it works with two separate CPUs like my dual P3, so it's not solely a cache-defined thing.
Andy
Why?
Two separate CPUs have two separate caches.
I have two Xeon 3.0 Irwindales on Asus NCCH-DL boards running this app. Only one chip per board. Same 220 FSB for each board, S-39L.
Xeon with Hyper-Threading on (2 threads working):
6240 seconds average for 10 workunits
Xeon with Hyper-Threading off (only 1 thread available to Windows, working):
3960 seconds average for 10 workunits
Note: If you look at my results... they may change from the time of this post because I am adding more CPUs.
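A quick way to read those Xeon numbers (my own back-of-the-envelope arithmetic, not the poster's, and it assumes the HT-on average means two workunits were being crunched at the same time) is to convert the per-workunit times into throughput; a minimal Python sketch:

def throughput_gain(seconds_per_wu_ht, seconds_per_wu_off, wus_at_once_ht=2, wus_at_once_off=1):
    # Workunits finished per second in each mode; the gain is the ratio minus one.
    ht_rate = wus_at_once_ht / seconds_per_wu_ht
    off_rate = wus_at_once_off / seconds_per_wu_off
    return ht_rate / off_rate - 1.0

# The 10-workunit averages reported just above for the Irwindale Xeon.
print(f"Implied HT throughput gain: {throughput_gain(6240.0, 3960.0):.0%}")  # roughly +27%

So on those figures, HT on this Xeon trades longer per-workunit times for roughly a quarter more total output.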
Good to see fresh data on this topic.
Here is a little from the last two days on my one hyperthreaded machine.
It is a Gallatin 3.2 GHz P4 EE.
As such it has a 3-level cache:
L1: 8 KB data, 12 Kµop code
L2: 512 KB
L3: 2048 KB
My Einstein science app is akosf's S-39L, and my SETI science app is crunch3r's current version:
$Build: Windows SSE2 Intel Pentium 4 V2.10 by Crunch3r $
$Rev: 166.10 Windows SSE2 Intel Pentium 4 V2.10 $
My recent Einstein results have been from z1_1228.5
with most reported CPU times in HT mode around 125 minutes.
For this observation, I turned off HT and ran two pure results, logging 72.1 and 72.2 minutes. The first two results after resuming HT, also done pure (no SETI, no mixing with the non-HT state) logged 125.3 and 125.6 minutes.
So my observed production rate gain from using HT for Einstein on this machine is 15%. As always, possible inconsistencies in the WU's and possible variations in the machine conditions put some error haze around this observation.
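For anyone who wants to check the 15% figure, the same arithmetic can be spelled out as a small Python sketch (my own calculation from the times quoted above, assuming both HT threads each finish one result at the ~125-minute pace while the non-HT run finishes one result at a time):

ht_minutes = (125.3 + 125.6) / 2     # two Einstein results finished side by side, HT on
off_minutes = (72.1 + 72.2) / 2      # one result at a time, HT off

ht_rate = 2 / ht_minutes             # results per minute, HT on
off_rate = 1 / off_minutes           # results per minute, HT off

print(f"HT production rate gain: {ht_rate / off_rate - 1:.1%}")  # about +15%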
Unfortunately my current stock of SETI results is unusually inconsistent in required CPU time from result to result. I will report, however, that for this particular machine the SETI HT gain is far higher than that I am reporting for Einstein, with an especially high benefit when SETI is one thread and Einstein is the other. I'll look for a run of results with consistent CPU times and rerun that part of the observation and report here. The results I observed were that "SETI/Einstein" HT gave SETI CPU times indistinguishable from non-HT SETI times, with a very modest (less than 3%) degradation in the Einstein time. Even I think this sounds too good to be true.
There is quite a variety of HT-supporting machines out there, differing in cache size, levels, and performance, not to mention the more basic Willamette vs. Prescott parentage differences. It would be great to see recent measured data from more machines.
At the moment I do not have the time to be switching HT on/off to check the benchmark times of workunits. Aside from the sheer science of what E@H is, another distinct advantage that I see for this project is that the client(s)/project is just so darn stable - such that I can leave machines unattended for weeks with no worries about their productivity. :)
I hope I'm not jinxing anything by saying that... even though I'm sure "Murphy" is lurking somewhere nearby!
"Chance is irrelevant. We will succeed."
- Seven of Nine
You are the computer expert, you tell me. 11k seconds for SETI when running SETI and Einstein together, 15k seconds when running two SETIs.
Andy
Ok, I understand you. As far as I know, SETI uses 1-2 MB (or bigger) memory blocks at a time, while Einstein needs only 20-30 kB. Probably, if you run two SETI applications at once, your memory cannot serve both processors as fast as they need the data. Crunch3r's SSE application is very hungry. :-)
The same problem appears on the dual-core CPUs also, but there it seems to be a cache problem, because the cache sits between the CPU and the main memory. So, if you want faster SETI processing, you need more memory bandwidth.
Edit: if you run Einstein and SETI together, the required memory bandwidth is lower.
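To make that working-set argument concrete, here is a rough, purely illustrative Python sketch. The sizes are just the figures quoted above (SETI roughly 1-2 MB at a time, Einstein roughly 20-30 kB) and the 2 MB cache is the Gallatin L3 mentioned earlier in the thread; nothing here is measured.

KB = 1024
SHARED_CACHE = 2048 * KB            # e.g. the Gallatin's 2 MB L3
SETI = int(1.5 * 1024 * KB)         # middle of the quoted 1-2 MB range
EINSTEIN = 30 * KB                  # upper end of the quoted 20-30 kB

for name, a, b in [("SETI + SETI", SETI, SETI), ("SETI + Einstein", SETI, EINSTEIN)]:
    total = a + b
    verdict = "fits in the shared cache" if total <= SHARED_CACHE else "spills to main memory"
    print(f"{name}: {total // KB} kB combined -> {verdict}")

On two physical CPUs each chip has its own cache, which fits Andy's dual-P3 observation: there the remaining bottleneck is the shared path to main memory rather than cache space.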
Akosf,
Thanks for the explanation, I've wondered for some time why it happens.
Andy