While I agree about the goal of cross-project comparability, I still cannot fathom the method being used to reach it. Attempting to correct across projects using the "averages" you discuss seems nearly impossible. Since the credit rate is completely arbitrary, why not negotiate a standard rate (e.g., X credits per hour on machine Y) to which all projects must conform in order to use the BOINC system? It seems to me that the BOINC developers are stuck on trying to bring credits back into line with the pre-optimized rate from SETI.
The current concept of granting credits is (see the BOINC WIKI):
You get credits for donating your CPU (and memory and hard disk space).
Donating one hour of CPU x should give you y credits, regardless of the project you share it with.
Optimizing application z will return more WUs/h than before, but you donate the same amount of CPU, so you will get the same credits per CPU-hour as before, though fewer credits per WU.
Udo
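A quick numeric sketch of that last point, in Python (the rates are invented for illustration, not actual BOINC numbers):

    # Hypothetical payout: y credits for donating one hour of CPU x.
    CREDITS_PER_CPU_HOUR = 10.0

    def credits_per_wu(wus_per_hour):
        """Credits granted per work unit at a given throughput."""
        return CREDITS_PER_CPU_HOUR / wus_per_hour

    print(credits_per_wu(2.0))  # stock app z, 2 WUs/h -> 5.0 credits per WU
    print(credits_per_wu(5.0))  # optimized app z, 5 WUs/h -> 2.0 credits per WU

    # Either way the host earns 10 credits per CPU-hour: optimization
    # changes credits per WU, not credits per hour donated.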
Scott Brown: Since credit rate is completely arbitrary, why not negotiate a standard rate (e.g., X credits per hour on machine Y) to which all projects must conform in order to use the BOINC system?
Why do you think that machine Y is better suited for defining a standard than the average of all machines?
Norbert
Norbert: Why do you think that machine Y is better suited for defining a standard than the average of all machines?
Good point! I wish I had been able to think of that myself (no irony here).
"Entia non sunt multiplicanda praeter necessitatem" (Ockham: entities must not be multiplied beyond necessity)
Norbert: Why do you think that machine Y is better suited for defining a standard than the average of all machines?
Simply put, averages (or means) are very susceptible to skew in a distribution, whether caused by outliers or by the very different projects currently working under the BOINC framework. More importantly, the "averages" being discussed are actually moving averages, since the projects and their applications (as well as the average crunching machine) are always changing. What I am suggesting is that a standard machine Y (which could be an actual machine used for calibration by individual projects, or by BOINC to help projects set how they grant credit) may be a better path to standardization than trying to calibrate against these constantly moving targets.
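To see the outlier problem concretely, here is a toy Python example with nine ordinary per-host credit-rate claims and one outlier (all figures invented):

    from statistics import mean, median

    # Hypothetical credits-per-CPU-hour claims; the 42.0 could be one
    # host running a heavily optimized client.
    claims = [9.5, 10.0, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 10.0, 42.0]

    print(mean(claims))    # 13.15 -- dragged up by the single outlier
    print(median(claims))  # 10.0  -- matches the typical host

    # Calibrating grants against the mean would overpay everyone relative
    # to the typical host; a fixed reference machine (or even the median)
    # is far less sensitive to this kind of skew.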
Scott Brown: Simply put, averages (or means) are very susceptible to skew in a distribution ... may be a better path to standardization than trying to calibrate against these constantly moving targets.
What type of host do you have in mind? I only have data for my Athlon XP 2200+ and my son's P4 2.66 GHz. At most projects they are about even. At some projects my Athlon (with even less memory) is nearly 20% faster. But if a project uses SSE2 and not 3DNow!, I'm lost. There is no standard machine.
Norbert
May I ask what you think about the new credit system now implemented in Rosetta@home? Just have a look... you won't be disappointed.
I think it's as fair as possible with the given system for Rosetta. In the long run I would aim for a more self-calibrating mechanism: clients calibrating themselves toward an average credit claim, plus projects being given information on how they compare to the other projects each host is attached to, so they can calibrate themselves too.
Norbert
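As a rough sketch of what such self-calibration might compute, in Python (the figures and the ratio-to-baseline rule are my own illustration, not an actual BOINC mechanism):

    from statistics import mean

    # Hypothetical credits-per-CPU-hour one host observes per project.
    observed = {"SETI": 12.0, "Einstein": 9.0, "Rosetta": 15.0}

    # The host's cross-project average is its calibration baseline.
    baseline = mean(observed.values())  # 12.0

    # Each project's correction factor: its distance from the baseline.
    corrections = {p: baseline / rate for p, rate in observed.items()}
    print(corrections)  # SETI 1.0, Einstein ~1.33, Rosetta 0.8

    # A project scaling its grants by this factor, averaged over many
    # hosts, would drift toward cross-project parity over time.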
My latest update on different projects as they pertain to cross-project parity can be found here. Note: the Einstein data represented was collected between Jul 25 and Aug 10 (the old S5 system). I've been shuffling projects to collect data and have put Einstein back into the mix this morning so I can see where the new system is at.
Note: all data is from standard/official project software.
And yes, Eric K from SETI has a copy of this. Others are also collecting data on this issue.
Tony
Hmmm... let me see if this can be stated more clearly then...
There is no standard machine until one is set. As noted in my earlier post (though perhaps not stated with perfect clarity), under the BOINC system ALL credit systems are completely arbitrary. My suggestion was that, instead of relying on moving averages, setting a standard machine to calibrate across all projects might be a simpler solution (keeping in mind that a "standard" machine could also be a straightforward average across two or three machines, say an AMD and an Intel, which is still much simpler than trying to nail down moving statistical targets).
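A sketch of what calibrating against a small fixed reference set might look like, in Python (machines, rates, and target all invented for illustration):

    from statistics import mean

    # Hypothetical credits-per-CPU-hour each reference box earns on one project.
    reference_rates = {"amd_box": 9.2, "intel_box": 10.8}

    # The "standard machine" rate is the mean of the fixed references;
    # unlike a population average, this target never moves.
    standard_rate = mean(reference_rates.values())  # 10.0

    TARGET = 10.0  # the negotiated X credits per hour on "machine Y"

    # The project scales its grants so the reference set earns exactly TARGET.
    scale = TARGET / standard_rate
    print(scale)  # 1.0 -- this project is already on target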
@Tony
Very nice graphical work. I wonder if it would be possible to display histograms from your data? The potential problems of averaging (outliers, skewed distributions, etc.) cannot be detected from the point estimates displayed in the bar graphs.
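For example, with one credits-per-CPU-hour figure per host in a CSV (the file name and column are hypothetical), a histogram takes only a few lines of Python:

    import csv
    import matplotlib.pyplot as plt

    rates = []
    with open("project_rates.csv") as f:
        for row in csv.DictReader(f):
            rates.append(float(row["credits_per_cpu_hour"]))

    plt.hist(rates, bins=30)
    plt.xlabel("credits per CPU-hour")
    plt.ylabel("number of hosts")
    plt.title("Distribution of claimed credit rates")
    plt.show()
    # Skew and outliers that a bar chart of means hides show up here.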
@Tony
Thanks for the charts, Tony; I've not done that with the machines here. But a quick analysis says my mainly Intel collection mirrors your P4, except for the celery, and not the AMDs. Unfortunately the one AMD machine we have is now to be retired; it hadn't done much work anyway.
@Scott Brown,
A standard machine would be nice, but as you can see from Tony's results and my reply, different machines perform differently on different projects. The celery and AMD mentioned were both primarily Einstein crunchers, as that was where they performed the most science. SETI likes Intels with as large an L2 cache as it can get. I'm not too sure there is much difference on most of the other projects, although I get the impression CPDN, and maybe the other climate projects, like lots of memory and frequent backups.
Andy