Hi!
I'm new to this project and to BOINC in general; I joined a couple of months ago.
Probably I'm missing something.
For about a week now, BOINC has refused to download new workunits.
It fetches a new WU only if I press the update button *and* the last downloaded WU is 100% done.
I get this message:
Message from server: No work sent
Message from server: (won't finish in time) Computer on 54.8% of time, BOINC on 99.9% of that, this project gets 100.0% of that
I really can't understand what happened!
My PC takes about 3 CPU hours to process a work unit that has a 2-week deadline.
I have some pending results (about 8 now).
My PC is not up 24/7, but each WU is done and uploaded on the same day or, failing that, the next day.
EDIT:
Is it maybe something related to this value?
Result duration correction factor 10.292671
EDIT2:
When a new unit starts, the "To completion" column shows about 160 hrs, but, as I've said, it is done in ~3 hrs.
Copyright © 2024 Einstein@Home. All rights reserved.
no new WU - won't finish in time
Yes, that is most likely the problem, together with the fact that BOINC knows that your computer is only going to be on about 50% of the time. Effectively, BOINC thinks that it will need about 60 hours of wall clock time just to complete one result (roughly 3 hours x a DCF of 10, divided by the 50% on-time), even though the actual time is about three hours.
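For anyone who wants to check that figure, here is a minimal Python sketch of the arithmetic. It assumes the estimate is simply CPU hours times DCF, divided by the fractions quoted in the server message; the function name and parameters are made up for illustration, and BOINC's real scheduler logic is more involved than this.

```python
def wall_clock_hours(cpu_hours, dcf, on_fraction,
                     boinc_fraction=1.0, project_share=1.0):
    """Rough wall-clock estimate for one result (illustrative model only).

    cpu_hours      -- CPU time the result really needs
    dcf            -- duration correction factor applied by the client
    on_fraction    -- fraction of the time the computer is switched on
    boinc_fraction -- fraction of the on-time during which BOINC runs
    project_share  -- fraction of BOINC's time given to this project
    """
    corrected_cpu_hours = cpu_hours * dcf
    available = on_fraction * boinc_fraction * project_share
    return corrected_cpu_hours / available


# Rounded figures used in the reply: 3 h, DCF of 10, machine on 50% of the time.
print(wall_clock_hours(3.0, 10.0, 0.5))

# Figures as reported in the first post (54.8% on, BOINC 99.9%, share 100%).
print(wall_clock_hours(3.0, 10.292671, 0.548, 0.999, 1.0))
```

The point is only that a large DCF combined with a low on-time fraction multiplies up quickly, which is what makes the server decide the work "won't finish in time".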
When a new task is downloaded, it contains an estimate of the time required to crunch it. This estimate is affected by your CPU benchmarks and so can be quite wrong if your benchmarks happen to be wrong. Your local BOINC client has the benefit of previous experience on your computer to make "adjustments" to this internal estimate. This is what the duration correction factor (DCF) is all about. When you started crunching, your DCF would have been 1.000000. As an example, let us imagine that your very first result was estimated at 4.0 hours but actually took only 2.0 hours, perhaps because your benchmark scores were too low. You might think that the DCF should be changed to 0.5 since it only took half the estimated time. However, that would be dangerous if it were a "one-off" fast result, as it could lead BOINC into downloading too much work if your machine didn't sustain the same rate of production. So BOINC is programmed to make the adjustment gradually (10% of the difference between the current DCF and the value the result implies) when the adjustment is in the downwards direction. So in this particular example, the DCF would only change from 1.00 to 0.95. Eventually, if you continued to crunch this same type of result every 2.0 hours, the DCF would approach 0.50. It would take 20 or 30 successive results to get there.
When going the other way, things are very different. Imagine that your 4.0 hour estimated result actually took 40.0 hours to complete. This needn't even be a "real" 40.0 hours. BOINC just has to be fooled into thinking it took 40.0 hours. In this case the change isn't made 10% at a time. As this is considered to be likely to cause missed deadlines, BOINC is programmed to make the full change immediately and so suddenly you have a DCF of 10 and BOINC is in panic mode and restricting further work fetches.
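To make the asymmetry concrete, here is a small Python sketch of the update rule as described in this post. It is a simplified model, not BOINC's actual code: the function name is made up, and the 10% downward step and the immediate upward jump are taken from the explanation above only.

```python
def update_dcf(dcf, estimated_hours, actual_hours, down_step=0.10):
    """Return the new DCF after one finished result (simplified model).

    estimated_hours is the raw estimate shipped with the task, before the
    DCF is applied; actual_hours is what BOINC believes the result took.
    """
    implied = actual_hours / estimated_hours  # DCF this result suggests
    if implied > dcf:
        # Result was slower than expected: adopt the full change at once.
        return implied
    # Result was faster than expected: move only 10% of the way down.
    return dcf + down_step * (implied - dcf)


# First example: estimated 4.0 h, took 2.0 h, DCF creeps from 1.00 to 0.95.
print(round(update_dcf(1.0, 4.0, 2.0), 6))

# Second example: estimated 4.0 h, "took" 40.0 h, DCF jumps straight to 10.
print(update_dcf(1.0, 4.0, 40.0))

# Repeating the fast result shows how slowly the DCF drifts towards 0.50.
dcf = 1.0
for n in range(1, 31):
    dcf = update_dcf(dcf, 4.0, 2.0)
    if n in (10, 20, 30):
        print(n, round(dcf, 3))
```

So one bad result can push the DCF up by a factor of ten in a single step, while recovering from it takes dozens of good results unless the value is reset by hand.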
So how did your DCF get to 10? If your OS was Windows 9x then it would have been somewhat understandable. In your case it seems to be FreeBSD so I have no idea. Maybe you have had some crashes where system time got interfered with - I'm only clutching at straws. The important thing is that you can correct this very easily. In your BOINC directory, you have a state file called "client_state.xml" which contains the values of a whole raft of parameters. There will be one DCF value for each project to which you are attached. You can simply stop BOINC completely and then edit the EAH DCF value in this file using your favourite text editor. Please be very careful NOT to make unintended changes to any other parameters. If you are not comfortable in changing configuration files, don't do it. The DCF will gradually revert to its proper value anyway as you continue to finish results. The editing is just to hasten the process :).
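If you would rather script the change than do it by hand, something like the following Python sketch could work. Treat it as a sketch under assumptions: it supposes that each project sits in a `<project>...</project>` block containing its URL and a `<duration_correction_factor>` element, so check your own client_state.xml first. The function name and the URL fragment in the usage comments are hypothetical. Stop BOINC and keep a backup copy of the file before trying anything like this.

```python
import re

# Assumed tag name; verify it against your own client_state.xml.
DCF_TAG = re.compile(
    r"(<duration_correction_factor>)[^<]*(</duration_correction_factor>)"
)


def set_dcf(state_text, project_url_fragment, new_value=1.0):
    """Return state_text with the DCF reset for the matching project only."""

    def fix_project(match):
        block = match.group(0)
        if project_url_fragment not in block:
            return block  # a different project: leave it untouched
        return DCF_TAG.sub(
            lambda m: f"{m.group(1)}{new_value:.6f}{m.group(2)}", block
        )

    # Process each <project> block separately so other projects keep their DCF.
    return re.sub(r"<project>.*?</project>", fix_project, state_text,
                  flags=re.S)


# Possible usage, with BOINC stopped and a backup already made:
#
#   with open("client_state.xml") as f:
#       text = f.read()
#   with open("client_state.xml", "w") as f:
#       f.write(set_dcf(text, "einstein", 1.0))
```

Working on the raw text rather than parsing the XML means nothing else in the file is reformatted or touched.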
Please ask further questions if the above procedure is not self-evident to you.
Cheers,
Gary.
Thank you for clarifying me
Thank you for clarifying all these concepts for me.
A few days ago a process that should have been idle was eating a lot of CPU cycles, and I noticed it only the day after I started it.
The workunit BOINC was processing was at the same point where I had left it the day before. The DCF value was probably altered by that.
So I think that your guess is good!
As you noticed I'm using FreeBSD, so I ought to be :)
I ended up editing the file, as I can't sit in front of the BOINC GUI waiting for the current WU to finish so that I can press update.
And a delay in pressing the update button would probably push the DCF even higher.
I set the value to 1 and BOINC started working normally!
I have processed and uploaded 4 WUs and the value is now 0.7.
Thank you very very much for the solution!!!
P.S.
Are you a C or Java programmer?
RE: Thank you for
You're most welcome.
Yes, I had realised that you should be skilled at editing config files. I actually started using FreeBSD myself in the days when it was still known as 386BSD and I still have a couple of PI type boxes running early FreeBSD 2.x and 4.x versions. Thanks for fixing my typo with the bolding. That comment was meant for any other people happening to read about client_state.xml and deciding to start changing things. I wanted to make sure that they understood the potential dangers.
No, any delay in "updating" would have no effect on DCF. If you are watching your work list in a BOINC Manager window and if you see a result actually finish and the next one start, you can see the immediate change in the time estimates of the "ready to run" results once the crunch time for the result being crunched is known. This is because the DCF is being changed immediately. It is all happening locally without any server contact. Of course, when your DCF is way out like yours was and therefore the corrections are large, the BOINC client will suddenly notice that your cache no longer contains as much work as was previously estimated and so a work fetch is quite likely at this point.
If DCF is still dropping that quickly it could be a sign that your benchmarks are too low - ie your machine is considerably better than BOINC thinks it is. I've just looked at your actual benchmark scores as recorded on the website and they do look low for a 2.0Gig P4. It might be worth your while shutting all other tasks down and manually rerunning the benchmarks to see if you can improve the numbers.
No problems. No, I'm not a programmer of any description really. I've written Bourne shell scripts many years ago but nothing of any significance lately.
Cheers,
Gary.
RE: I actually started
Really? Are you still using them?
No problem. I hope someone else in trouble finds this thread useful.
This is why I asked whether you are a C or Java programmer, but I figured it was a silly guess: you typed "{b]", and on the Italian keyboard layout you get "[" with alt gr+è and "{" with alt gr+shift+è. Having to type a lot of "{", my fingers often press the second combination even when I mean to type a "[". You are probably not using the same layout.
The problem was that I had only one unit and, once it finished, BOINC downloaded a new one only after I pressed "update".
Now the DCF of the PC has reached a value of 0.2! I've also noticed that the benchmark value is low: even my 1.5 GHz WinXP laptop is faster. I've already tried re-running the benchmarks after stopping everything that could be stopped, but the result was nearly the same.
I suspected that the problem was the FreeBSD port, so I put an old HD with W2K in that PC, booted it and installed BOINC. The benchmark values nearly doubled!
So I think I should ask the boinc-client port maintainer.
Sorry for my English, and thank you very much again!
RE: So I think I should ask
Here is a boinc-client port maintainer speaking, good morning.
The port is compiled with gcc -O3 optimizations. I know of no other way to beef the numbers up, sorry.