This sounds good. I mean, personally, I still won't be using Windows much for crunching, but it's great we figured this out :-) and if it could be made available to everyone, e.g., all the 80% who can't or don't want to use Linux, the project could benefit quite a lot...
Yes! Just think about it: if only 10% (a conservative estimate) of E@H's computing power is currently contributed by modern AMD PCs under Windows, a 30% performance increase just for those boxes would mean about 2 additional teraflops for the project, just by changing a single byte.
Hey, this was fun, wasn't it, and well worth sacrificing some sleep!
In the meantime, I think it's a matter of courtesy to keep the number of modified clients to a minimum until Bernd OKs the change. Trying out the change was essential to verify our hypothesis, but let's wait for the official OK before everybody patches the app. If 1000 people are patching and one of them makes a mistake, it can mess up quite a few results. As a software engineer, I'd prefer that the new version be formally tested, approved, and only then released with a new version number before it's widely used, so that any negative effects are traceable.
CU
BRM
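The back-of-the-envelope estimate in this post (a 10% share of the computing power, a 30% speedup, about 2 teraflops gained) can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function names are made up here, and the project total is not stated anywhere in the thread; it is simply back-solved from the quoted figures.

```python
def added_throughput(total_tflops, share, speedup):
    """Extra throughput when hosts holding `share` of the total run `speedup` faster."""
    return total_tflops * share * speedup


def implied_total(gain_tflops, share, speedup):
    """Project total that a given gain implies (assumption: back-solved, not a measured value)."""
    return gain_tflops / (share * speedup)


# Figures quoted in the post: 10% share, 30% speedup, about 2 TFLOPS gained.
total = implied_total(2.0, 0.10, 0.30)
print(f"implied project total: {total:.1f} TFLOPS")
print(f"gain at that total: {added_throughput(total, 0.10, 0.30):.1f} TFLOPS")
```

In other words, the quoted numbers are consistent with a project total of roughly 67 TFLOPS; with a different total the gain scales proportionally.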
Hi, I'm one of the guys that patched his AMD farm with great results.
I like your statement as a software engineer; I just wish the guys responsible for this project had the same standards!!!
What a waste of CPU power right now, without using the optimized instruction sets of the newer CPUs that we mainly use, plus the time wasted on crashed units in the first 2 weeks of this new run. So much for your nice standards in this project.
I hope they give us an Akosf optimized version sooooooon. It feels like watching $300 worth of my monthly electricity bill being wasted right now, or like driving on the Autobahn in 1st gear only.
On the other hand, THANKS for your great insight, which helps improve this run a little bit,
Pete
Goodness, huge farms :-D That makes my single AMD box (and even my friend's three, two of which are SSE2 capable) pale in comparison. Of course with so many boxes the effect will be much more noticeable; I'm sure it will pay off for both the team and the project. Still, what about the "not so many patched clients" policy?
Hey, that policy sounds funny :-)
But does it pay the electricity bill over here?
I'm the guy with the 19 A64 + 6 A64 X2 from Ziegenmelker's post above.
Glad that the problem has been found for the AMD CPUs.
But I do think that the programmers of all projects need to get together to iron out these problems. I am pretty sure that the SETI optimisers have known about this Intel compiler problem for a couple of years. Their main site is owned by KWSN - Chicken of Angnor, AKA Simon: http://lunatics.at/index.php
Andy
Just as a word of caution:
The last time someone was modding / patching apps on this project, the project team / scientists intervened and said "no".
While this could have a positive benefit for my system, I am not going to make a change to a closed-source application. If it was open-source, I'd be a little more willing...
FWIW, IMO, YMMV, etc, etc, etc...
Yup, I second that 100%. Let's be a bit patient.
I did inform Bernd about our findings, so he's aware of it and is investigating "clean" and legal ways to deal with it. Let's not forget that this is not the only issue; there are some client errors still happening which need to be addressed, too.
CU
BRM
Reading about "some client errors still happening", I think it might be good to point out that all the "Client error" / "Compute error" pairs in my results from the last 10 days are not faults of the application. They all stem from my own efforts to avoid wasting energy on work units I would never have had a chance to deliver in time. Unfortunately I had not looked at my computers for a while, so some of them were coming too close to the deadlines of a bunch of WUs; I then decided to abort those plus a few extra. (Not all of my computers can crunch 24/7; it depends heavily on the holidays or illness of colleagues. If all of them are healthy and not on holiday, most of the computers have to pause for 8 to 12 hours, 5 days a week.)
All computers on which I've decided to try the patch have been running rock-solid for a while now.
Muetze
http://einsteinathome.org/task/84250668
http://einsteinathome.org/task/84196371
Outcome marked as success, however the validate state for both is marked as invalid.
I don't get it; this whole run seems to me like an utter waste of energy.
Go get the apps fixed to run faster on AMD@Win boxes and also get rid of the debug info. I am currently pulling all my remaining boxes that I can reach (there aren't that many now either) off Einstein.
Actually the debug info can be most helpful in resolving any problems that might (or actually do) occur, so it would be counterproductive to remove it. Only a few bytes per second of debug info are produced, so it doesn't matter performance-wise.
The validation errors are frequent when the initial replication is 3 (3 workunits delivered) and 2 of the hosts have the same OS but the third has a different one (e.g. 2 x Darwin vs 1 x Linux, or 2 x Windows vs 1 x Linux). I guess the validation problem will be reduced once the app uses hand-crafted assembly code for most of the computation and the numerical differences introduced by different compilers are minimized.
CU
BRM
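The point above about numerical differences between compilers can be illustrated with plain floating-point arithmetic: the same sum evaluated in a different order, as two compilers may legitimately do, can differ in the last bits. The sketch below is not the project's validator; `results_agree` and its tolerance are hypothetical names and values chosen only for this example.

```python
import math

# The same three numbers summed in two different orders,
# as two differently optimizing compilers might do.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)


def results_agree(a, b, rel_tol=1e-9):
    """Hypothetical comparison: accept results that match within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


print(left == right)               # exact comparison
print(results_agree(left, right))  # tolerance-based comparison
```

The exact comparison fails while the tolerance-based one passes. That is the general trade-off a cross-platform check faces: too strict and results from different OS builds disagree, too loose and genuinely wrong results slip through.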
I can only say I 100% agree with your post :-D
Have a look at one of my hosts, HostID 752545.
Do I really have to say any more than the last results?
Result 84292629 was the last without the patch:
The next two were partially affected by it.
And as of result 84292635 you can see the full impact of the patch:
So please keep on exploring; I hope that we can soon expect a really optimized Einstein binary.
Muetze