It's again an incredible boost in performance! I just switched from 40.12 to 41.06 on my (non-overclocked) Athlon 3800+, and the time for a long WU went down from 3300-3500 sec to less than 2600 sec! This means that nearly 3 WUs are calculated each hour, and even without short ones this machine will now be limited by the quota of 32 WUs per CPU per day!
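A quick check of the quota claim above, as a rough sketch. The run times and the 32-per-CPU-per-day quota are taken from the post; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAILY_QUOTA_PER_CPU = 32  # quota quoted in the post


def wus_per_day(seconds_per_wu: float) -> float:
    """Work units one CPU can finish in 24 hours at a constant run time."""
    return SECONDS_PER_DAY / seconds_per_wu


def quota_limited(seconds_per_wu: float, quota: int = DAILY_QUOTA_PER_CPU) -> bool:
    """True if the CPU could crunch more WUs per day than the quota allows."""
    return wus_per_day(seconds_per_wu) > quota


# Run times for long WUs as reported in the post.
for label, secs in [("40.12, slow end", 3500), ("40.12, fast end", 3300), ("41.06", 2600)]:
    print(f"{label}: {wus_per_day(secs):.1f} WU/day, quota limited: {quota_limited(secs)}")

# Run time at which one CPU produces exactly the quota.
print(f"break-even: {SECONDS_PER_DAY / DAILY_QUOTA_PER_CPU:.0f} sec per WU")
```

So any long-WU time below 2700 sec puts a CPU over 32 results per day, which is what happens at 2600 sec.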
So the only thing that would give an HT'ed P4 Prescott a boost is SSE2- or SSE3-optimized code? Or would even that not increase the speed?
Cheers
Only to a very limited extent, in my opinion. The only thing that, to the best of my knowledge, would make a big impact is a decrease in the size of the "important" dataset, as has been done for S39L. But even that dataset (~11KB) doesn't fit in the 8KB L1 data cache of my Prestonias (Northwood-based Xeons), so when running with HT enabled there are 2 threads, both "wanting" 11KB of L1 at the same time, while the CPU only has 8KB to offer.
This means there are cache misses, flushes, reloads and fetches from the L2 cache, or even from main system RAM (worst case), all of which add latency to the memory handling alone.
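To put rough numbers on the cache argument above, here is a deliberately crude sketch. It only compares total working-set size against L1 size and ignores associativity, line size and everything else living in the cache. The 11KB, 8KB and 16KB figures come from this thread; the function name is hypothetical.

```python
def l1_shortfall_kb(dataset_kb: float, threads: int, l1_kb: float) -> float:
    """KB of hot data that cannot stay in L1 when every thread wants its own copy."""
    return max(0.0, dataset_kb * threads - l1_kb)


DATASET_KB = 11  # approximate "important" dataset of S39L, per the post

# (label, L1 data cache in KB) as quoted in the thread
cpus = [("Prestonia Xeon", 8), ("Prescott P4", 16)]

for name, l1_kb in cpus:
    for threads in (1, 2):  # HT disabled vs. HT enabled
        short = l1_shortfall_kb(DATASET_KB, threads, l1_kb)
        print(f"{name}, {threads} thread(s): {short:.0f}KB does not fit in L1")
```

Under this simple model only the single-threaded Prescott case fits entirely in L1; every other combination spills into L2.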
Another issue with HT is the fact that those two Einstein threads are basically doing the same type of work, both claiming resources of a similar nature from the CPU, which only has so many ALU- and FPU-execution units available.
Under ideal circumstances for HyperThreading, you would be running two different threads, one claiming ALU and the other claiming FPU execution units, with their combined datasets fitting together in the L1 and/or L2 cache.
From my own "experiments" with HyperThreading, I have found that combinations like SETI + SIMAP or SETI + Distributed.net RC5 running at the same time make the best use of my Xeons, resulting in crunch times for both projects that were very, very close to the times I got with HT disabled on those systems.
(hope that made sense)
Yes I get your point.
I will try S39L again, though I seem to remember that it was slower than S40.04. It might fit better on my Prescott with its 16KB L1 cache.
One additional question.
Is there an easy way to disable HT on my P4, and then perhaps enable it again later, without an XP reinstall?
You could try assigning it to another "location" on the computer page under your Einstein account and setting the general preferences for that location to use only 1 processor on multi-CPU computers.
That way HyperThreading will still be enabled, but only a single Einstein process will be running, which should get (nearly) all available CPU resources (assuming you're only running Einstein on that computer).
Is there an easy way to disable HT on my P4, and then perhaps enable it again later, without an XP reinstall?
I think this is usually done in the boot screens where you alter CMOS settings (some call these the BIOS screens). Just how this looks will depend on your motherboard/BIOS supplier. On my ASUS motherboard running a Gallatin under Windows XP Pro, I switch back and forth from hyperthreaded to not just by Start|turn off computer|Restart, then during the initial "black screens" hitting the Del key to get to the BIOS screens. I think HT is in the second tab over--possibly labelled advanced settings?--and the HT setting toggles back and forth between HT enabled and disabled. Then Save Settings/reboot, and the deed is done.
On my system, a few software installs did not run right in HT, which cleared right up with HT disabled for the install only. Also, on another system at work, the video driver for one video card rendered the system unstable in HT. So if you don't need HT, you might have a more stable system with it turned off (most likely this is a matter of exposing previously silent software bugs, rather as the implementation of protected mode on the 286 "made" lots of applications not run when it caught long-existing violations). I have, however, happily run in HT for most of the last two years.
On the more negative side, you may well find the system less responsive when not run in HT. Apparently a lot of software makes excessive use of long critical sections, so one advantage of HT claimed by my colleagues is better interactive response, particularly in the presence of large tasks.
Dual Xeon @ 2.8/800 (Prestonia)
S40.12
Hyperthreading disabled:
20-25 minutes for short
75-80 minutes for long
Hyperthreading enabled:
40-45 minutes for short
145-155 minutes for long
S41.06
Hyperthreading disabled:
18-20 minutes for short
60-70 minutes for long
Hyperthreading enabled:
40-45 minutes for short
140-145 minutes for long
Except for the one I reported earlier, no further invalid results (yet).
It seems that with S41.06 I have reached the point where running only 2 Einstein results at a time gives me a higher throughput on the Xeons than running 4 simultaneously.
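The throughput comparison can be worked out from the long-WU times listed above. This is only a sketch: it assumes 2 results at a time with HT disabled and 4 with HT enabled (as the post describes for the dual Xeon), and it uses the midpoint of each reported range, which is my simplification.

```python
def midpoint(low: float, high: float) -> float:
    """Middle of a reported time range, in minutes."""
    return (low + high) / 2


def results_per_hour(concurrent: int, minutes_per_result: float) -> float:
    """Finished results per hour when `concurrent` results run side by side."""
    return concurrent * 60.0 / minutes_per_result


# Long-WU ranges in minutes from the post: (HT disabled, HT enabled)
long_wu = {
    "S40.12": ((75, 80), (145, 155)),
    "S41.06": ((60, 70), (140, 145)),
}

for app, (ht_off, ht_on) in long_wu.items():
    off = results_per_hour(2, midpoint(*ht_off))
    on = results_per_hour(4, midpoint(*ht_on))
    print(f"{app}: HT off {off:.2f}/h, HT on {on:.2f}/h")
```

With these midpoints HT still comes out slightly ahead for S40.12, while for S41.06 two results at a time come out ahead of four.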
Back on topic:
2x Athlon MP2400+ (Thoroughbred core, L1 cache: 64KB data, 64KB instruction)
S40.12
18-20 minutes for short
60-65 minutes for long
D41.12
16-18 minutes for short
53-56 minutes for long
S41.06
14-16 minutes for short
47-49 minutes for long
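For a sense of scale, the relative gains on this Athlon MP can be computed from the ranges above. A small sketch using range midpoints (my simplification); the helper name is made up.

```python
def percent_reduction(old_minutes: float, new_minutes: float) -> float:
    """How much shorter the new run time is, as a percentage of the old one."""
    return (old_minutes - new_minutes) / old_minutes * 100.0


# Midpoints of the reported ranges, in minutes: (short, long)
times = {
    "S40.12": (19.0, 62.5),
    "D41.12": (17.0, 54.5),
    "S41.06": (15.0, 48.0),
}

base_short, base_long = times["S40.12"]
for app in ("D41.12", "S41.06"):
    short, long_ = times[app]
    print(
        f"{app} vs S40.12: short {percent_reduction(base_short, short):.1f}% faster, "
        f"long {percent_reduction(base_long, long_):.1f}% faster"
    )
```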
Opteron 275 (approx. seconds)
S41.06 2250-2500
D41.13 3000
D41.12 2600
S40.12 3100
Athlon 64 3800+ (approx. seconds)
S41.06 2200
D41.13 2600
D41.12 2400
S40.12 2850
Lots of valid and no invalid results.
So, one of my PCs is crunching with S41 as a check.
Version: S41.06
Processor: Pentium-M 1860MHz (Dothan core)
I hope that it will show me the reason for the invalid results.
xp2500+
S40.12 65min long, 9min short
S41.06 45-50min long, no short to report yet
Results for S41.06
15 successfully done
8 valid
7 pending
0 invalid
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
I sometimes get this error:
5.2.13 BoincStudio 0.4b
The semaphore cannot be set again. (0x67) - exit code 103 (0x67)
2006-05-02 19:54:20.1250 [normal]: Optimised by akosf S41.06 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'.
2006-05-02 19:54:20.1406 [normal]: Started search at lalDebugLevel = 0
bad CRC 8026d7cc (should be b4f067de)
2006-05-02 19:54:20.1718 [CRITICAL]: Error in unzipping file '../../projects/einstein.phys.uwm.edu/skygrid_1310_z_T04.dat'. Return value: 2
2006-05-02 19:54:20.1718 [CRITICAL]: System says: Bad file descriptor
Level 0: $Id: ComputeFStatistic.c,v 1.364 2005/12/19 20:06:34 bema Exp $
Function call `InitFStat(status, &GV)' failed.
file ComputeFStatistic.c, line 498
2006-05-02 19:54:20.1718 [normal]:
Level 1: $Id: ComputeFStatistic.c,v 1.364 2005/12/19 20:06:34 bema Exp $
2006-05-02 19:54:20.1718 [normal]: Status code 3: Invalid input
2006-05-02 19:54:20.1718 [normal]: function InitFStat, file ComputeFStatistic.c, line 2517
2006-05-02 19:54:20.1718 [CRITICAL]: BOINC_ERR_EXIT(): now calling boinc_finish()
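The "bad CRC" line in the log suggests that the downloaded skygrid file itself is damaged. If that file is an ordinary ZIP archive (an assumption on my part, based only on the "Error in unzipping file" message), its integrity could be re-checked outside BOINC with something like the sketch below. `check_archive` is a hypothetical helper; `zipfile.ZipFile.testzip()` is the standard-library call that re-reads every member and compares CRCs.

```python
import zipfile


def check_archive(path: str) -> str:
    """Report whether a file is a readable ZIP archive with valid member CRCs."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            bad_member = archive.testzip()
    except zipfile.BadZipFile:
        return "not a zip archive"
    if bad_member is None:
        return "ok"
    return f"bad member: {bad_member}"


# Example call (path as it appears in the log, relative to the slot directory):
# print(check_archive("../../projects/einstein.phys.uwm.edu/skygrid_1310_z_T04.dat"))
```

If the check reports a bad member, deleting the file so that it gets downloaded again would be the obvious thing to try.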
I have 6 done by now, and the 5th has been validated and happens to be my fastest ever (2444 sec.).
It's an old one which didn't get a quorum, so it was sent out again.
It makes a nice comparison, the other two being different machines and applications.
http://einsteinathome.org/workunit/7108367