It is not just about processing speeds; it is also about how many hours per day you allow your computers to run. For example, I use my home computer (Pentium IV, 3 GHz) for 2 hours a day on average. In the part of the world I come from, neither electricity nor computers are cheap, so it is just not possible to keep computers running 24/7. There are a lot of people like me, and they will be forced out of this project, which is very unfortunate.
You have a lot of sunlight in India. Why don't you exploit it?
Tullio
OK - I originated this thread expecting some sort of answer. But I didn't get one - only a lot of "Me too!"s, some spam, some OT comments.
Well here is an on-topic comment:
It just did it again - 2 more 64-hour WUs with due dates shorter than the original SETI WUs (see top of thread) that are still in the queue. So I stopped getting any more Einstein workunits. When these are done, I'll maybe visit this message board in a month or so to see if anything has changed.
TTFN (Ta Ta For Now)
Ditto.
I've lost workunits that took 70 CPU hours to complete and couldn't be finished within the 2-week due date. I've gotten credit for a few others that were late, when the quorum had not yet been met.
Please lengthen the due date, or change the deadline so the clock starts at the workunit's start time.
There's an issue here which hasn't been addressed: why is the project sending work to hosts which cannot meet the deadline in the first place?
For example, looking at this result 570498:
You can get the project-estimated FLOP count for the result from the command line sent to the app, out of the client_state file. We also know the estimated run time for the result will be:
(Est_FLOPs / FP_BM) * (RDCF / (On_Frac * Run_Frac * CPU_Eff))
which in this case is over 2.8 Msecs. The parameters haven't changed significantly for this host for months and are always available to the project-side scheduler, so why was this result sent to the host at all, when it obviously can't make a 2-week deadline, missing it by a wide margin?
Alinator
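To put some numbers through that formula: here is a quick sketch in Python. Every value below is an illustrative placeholder for a slow host, not data from the actual machine in question.

```python
# Estimated wall-clock time for a result, using the formula above.
# All parameter values are illustrative placeholders, not real host data.

est_flops = 60e12     # project-estimated FLOP count for the result
fp_bm     = 30e6      # Whetstone floating-point benchmark, FLOPS (slow host)
rdcf      = 1.0       # result duration correction factor
on_frac   = 0.9       # fraction of time the host is powered on
run_frac  = 0.9       # fraction of on-time BOINC is allowed to run
cpu_eff   = 0.9       # share of the CPU the science app actually gets

cpu_seconds  = (est_flops / fp_bm) * rdcf
wall_seconds = cpu_seconds / (on_frac * run_frac * cpu_eff)

deadline = 14 * 24 * 3600          # the 2-week deadline, in seconds
print(f"estimated wall time: {wall_seconds / 1e6:.2f} Msec")
print(f"deadline:            {deadline / 1e6:.2f} Msec")
```

With these placeholder values the estimate comes out well past the two-week mark, which is the shape of the problem being described.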
Has that computer finished crunching an S5R2 unit yet? If not, the RDCF will be left over from S5R1: your host metrics may not have changed for months, but the project's certainly have - I was going to say especially with the de-optimisation on AMD discussed elsewhere, but that doesn't apply to MMX! Let it crunch to the end to get a new RDCF, and then let us know what the difference was?
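To illustrate why a leftover RDCF matters: a smoothed correction factor only catches up after completed results report their actual run times. This is a toy update rule with made-up numbers, not BOINC's actual algorithm.

```python
# Toy sketch of how a duration correction factor (like RDCF) could adapt
# after each completed workunit. NOT BOINC's exact update rule -- just an
# illustration of why estimates lag until new results finish and report.

def update_rdcf(rdcf, actual_seconds, estimated_seconds, rate=0.1):
    """Nudge rdcf toward the observed actual/estimated run-time ratio."""
    ratio = actual_seconds / estimated_seconds
    return rdcf + rate * (ratio - rdcf)

rdcf = 1.0                      # stale value left over from the previous run
for _ in range(10):             # ten results, each taking twice the estimate
    rdcf = update_rdcf(rdcf, actual_seconds=200_000, estimated_seconds=100_000)
print(f"RDCF after 10 results: {rdcf:.2f}")   # drifts toward 2.0
```

The point is the lag: even after ten completed results the factor is still well short of the true ratio, so a host that has not yet finished any S5R2 work is estimating with entirely stale numbers.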
(...)
why is the project sending work to hosts which cannot meet the deadline in the first place?.
(...)
Because they don't care, as someone made me aware of today:
Quote:
The application we'll use for this run is all new and has never been used before. The algorithm used in the old App is still a part of the new one, and other parts have also been used before, but they have never been used in the present combination, and in particular not in a distributed computing project of that scale. We expect some problems to arise from this.
One of the issues we are still working on is some "overhead", i.e. calculations that are performed for technical reasons, but don't actually contribute to the result, thus wasting computing power.
We therefore will set up S5R2 as a short, experimental run that limits the search to parts of the parameter space where the overhead is well under control. During this short run we will improve the Application in various aspects. The results will also help us tune the parameters for the next, larger run (probably named S5R3).
So don't bother either and just turn off your client if it isn't running 24/7 due to other reasons.
RE: Has that computer finished crunching an S5R2 unit yet? (...)
This was the second S5R2 this host has run. The first was a ~160 MHz one which finished just within the deadline, and there was no appreciable change in the RDCF after it reported and validated.
However, let's look at this logically for a second. We know the current app is plain vanilla at this point across all platforms. Therefore the RDCF should be going up for all platforms since the app is less efficient by definition.
Under that condition, you should be able to safely assume that if the estimated time to completion is outside the deadline with the old parameters there is no way a less efficient app can improve on that, given the WU will take longer to complete compared to the last runs at a similar template frequency.
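That argument can be checked with a couple of lines of arithmetic: if the old estimate already misses the deadline, scaling the RDCF up (a less efficient app) can only push the estimate further past it, never back inside it. All numbers here are illustrative only.

```python
# Sanity check of the argument above: a larger RDCF (less efficient app)
# can only make an already-infeasible estimate worse. Illustrative numbers.

deadline     = 14 * 24 * 3600   # two weeks, in seconds
old_estimate = 2.8e6            # old estimated wall time, already past deadline

for scale in (1.0, 1.2, 1.5):   # hypothetical de-optimisation factors
    new_estimate = old_estimate * scale
    assert new_estimate >= old_estimate > deadline
    print(f"RDCF x{scale}: {new_estimate / 1e6:.2f} Msec "
          f"(deadline {deadline / 1e6:.2f} Msec)")
```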
RE: This was the second S5R2 this host has run. (...)
If the previous WU had exited before the scheduler work-fetch contact that resulted in the current WU, then I would agree with you. (I'm sure the RDCF is calculated and stored locally, so the reporting/validation doesn't matter: it's only reported to the project so we can see it conveniently on the webserver).
I very much doubt that they would even have considered changing the server algorithm to say 'hey, the new app is de-optimised - let's include a fiddle-factor in the Est_FLOPs - to allow a bit of leeway'.
Looking at your formula, I see four possibilities:
1. Bad Est_FLOPs - their end - perfectly possible with a new work-generator.
2. Bad FP_BM or RDCF - your end - should correct itself over time.
3. Bad crunch - a checkpoint read error causing a restart, for example. Should be visible in
4. Bad scheduler allocation decision.
If you can absolutely rule out 1, 2 and 3, then 4 is a BOINC server bug and a candidate for reporting on trac.
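For what it's worth, the feasibility test being asked about in case 4 is cheap to express. This is a hypothetical sketch of what a server-side gate could look like, not actual BOINC scheduler code; the function name and parameters are invented for illustration.

```python
# Hypothetical sketch of the deadline-feasibility check a scheduler could
# apply before sending a result to a host. Not actual BOINC server code.

def can_meet_deadline(est_flops, fp_bm, rdcf, on_frac, run_frac, cpu_eff,
                      deadline_seconds):
    """Return True if the estimated wall time fits inside the deadline."""
    wall = (est_flops / fp_bm) * rdcf / (on_frac * run_frac * cpu_eff)
    return wall <= deadline_seconds

# A slow host (illustrative numbers) should be refused a 2-week result:
print(can_meet_deadline(est_flops=60e12, fp_bm=30e6, rdcf=1.0,
                        on_frac=0.9, run_frac=0.9, cpu_eff=0.9,
                        deadline_seconds=14 * 24 * 3600))   # prints False
```

Since every input is already known on the project side, the open question in the thread is why a check of this shape is not (or is incorrectly) being applied.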
RE: You have a lot of sunlight in India. Why don't you exploit it?
VERY helpful comment :-(
I'm pretty sure that projects can reset the RDCF from the server end. If so, shouldn't this have been done when S5R2 started?
Andy