Hi, it is not necessary to checkpoint this often. It seems to be hard-coded. Every 200 seconds would be enough.
I tried it, and it turns out that the BRP4 app does listen to BOINC checkpoint requests. It looks like at one point you had BOINC set to 10 min, since that's about how often one of your tasks checkpointed according to its log. It'd be interesting to see whether significantly reducing checkpoint frequency (or even eliminating checkpoints) would reduce runtime in a meaningful way.
But it is not BOINC that reports progress and writes this log; it's the app itself.
It's checkpointing after some regular time interval defined in BOINC, not because it reached some milestone in the app that triggers a checkpoint.
You're getting it wrong. That's not how it works at all. Applications make checkpoints only when they reach certain points in the calculation set by the APPLICATION programmers, where it is possible/convenient to record the state (and later restore from it). The BOINC client simply cannot influence this. All that the corresponding setting in the BOINC client does is tell the app, "please do not write checkpoints more often than once every xx minutes." But when, at which places, and how often to write them is up to the scientific application alone. The corresponding option is even worded accordingly:
"Request task to checkpoint at most every xxx seconds"
The app can follow this recommendation by skipping the next checkpoint if less than the specified interval has passed since the previous one was recorded, or it can ignore the recommendation. But in any case, checkpoints are written only at points predefined by the app programmer, when the calculation reaches them. This holds in theory and has been tested repeatedly in practice by me and many other users. A recent example with an FGRP5 application is in another topic: https://einsteinathome.org/content/strange-wus-names-and-checkpoint-issues-latest-fgrp5-batch
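To make the mechanism described above a bit more concrete, here is a minimal Python sketch. It is not code from any Einstein@Home app or from the BOINC library; the function name, the 100-second spacing of safe points, and the runtimes are all hypothetical. It only models the rule as stated in the post: the app reaches safe points of its own choosing, and at each one it writes a checkpoint only if the client's minimum interval has passed since the last one.

```python
def simulate_checkpoints(milestones, min_interval):
    """Return the times (in seconds) at which checkpoints are written.

    milestones   -- times at which the app reaches a safe point (set by the app)
    min_interval -- client preference: checkpoint at most every this many seconds
    """
    written = []
    last = 0.0  # the task start is the reference for the first interval
    for t in sorted(milestones):
        # A checkpoint can only happen at a milestone, and only if enough
        # time has passed since the previous checkpoint.
        if t - last >= min_interval:
            written.append(t)
            last = t
    return written


# Hypothetical task: a safe point every 100 s over 2200 s of runtime.
milestones = [100 * k for k in range(1, 23)]
for interval in (60, 600, 1500):
    times = simulate_checkpoints(milestones, interval)
    print(f"interval {interval:>4} s: {len(times)} checkpoints at {times}")
```

Note that raising the interval only thins out the checkpoints. They still land on the app's own safe points and never in between, and a task that finishes before the first interval has passed writes none at all.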
Go ahead and let the task "finish". It won't. It's hung or stuck in some kind of infinite loop. It might get to 99.999 but will never complete.
Fast machines with a GPU never write a checkpoint, because the calculation finishes after 30 seconds. Then why do we need 40 checkpoints on slow computers? This is just a waste of time.
In the case of BRP4, if you want fewer checkpoints, increase the time interval for checkpoints in the BOINC Manager settings. I did a test, and BRP4 does adhere relatively closely to BOINC checkpoint requests. You can see the evidence for this in the task logs.
Unfortunately, however, reducing the number of checkpoints does not improve run times. I went from 22 checkpoints per task to just 1 and noticed no difference in run times. It appears that in the case of BRP4, the checkpoint process is very quick and uses very few resources. I was hopeful that this was going to be one way to reduce run times, but it is not.
GPU tasks do checkpoint; I've seen it with both BRP7 and O3AS, which are strictly GPU tasks. In the case of BRP7, you can even see it in the logs. It depends on the app coding, your BOINC settings, and the run times on a given machine.
Someone tested the checkpoint effect on an RPi 4 with an SD card that had extremely poor I/O performance. It resulted in a 10% performance drop. However, under normal conditions, checkpoints should have a negligible impact on performance, 1% at most.
Interesting. I tested it on an old laptop with a Celeron N3050 CPU and a 32GB eMMC drive and saw no difference. It seems like those with slow storage such as an SD card could reduce their BRP4 run times in a meaningful way by going with 1 or 0 checkpoints.
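A rough way to see why storage speed decides whether checkpoints matter is to estimate the share of runtime spent writing them. The sketch below is only back-of-the-envelope arithmetic, not a measurement: the compute time and the per-checkpoint write times are made-up numbers, and it assumes the app stalls for the whole write, which is a simplification.

```python
def checkpoint_overhead(compute_s, n_checkpoints, write_s):
    """Fraction of total runtime spent writing checkpoints.

    Assumes (hypothetically) that the app is blocked for write_s seconds
    during each checkpoint and does no useful work in that time.
    """
    io_s = n_checkpoints * write_s
    return io_s / (compute_s + io_s)


# All numbers below are hypothetical and chosen only for illustration.
fast = checkpoint_overhead(compute_s=20000, n_checkpoints=22, write_s=0.5)
slow = checkpoint_overhead(compute_s=20000, n_checkpoints=22, write_s=100)
print(f"fast storage: {fast:.2%} of runtime spent on checkpoints")
print(f"slow storage: {slow:.2%} of runtime spent on checkpoints")
```

With writes that take a fraction of a second, even 22 checkpoints disappear in the noise, which fits the "no difference" results reported above. Only when each write stalls the task for a long time does the checkpoint count become visible in run times.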
It seems I have overreacted. I am sorry! This is a non-issue.
Not at all. The checkpointing mechanism is not obvious. It's important to discuss its details every now and then, in particular how it works between the BOINC client and the science app, as the useful repost by 'San-Fernando-Valley' did.
Risky64 wrote: Hi, it is not ...
I tried it, and it turns out that the BRP4 app does listen to BOINC checkpoint requests. It looks like at one point you had BOINC set to 10 min, since that's about how often one of your tasks checkpointed according to its log. It'd be interesting to see whether significantly reducing checkpoint frequency (or even eliminating checkpoints) would reduce runtime in a meaningful way.
It is worse! The CPU seems to wait until the checkpoint is written. I want fewer checkpoints!!!
Have a look at post 231731 (sorry, I don't know how to link to the referenced post).
Might be of interest.
Here is a copy:
From MAD_MAX
But it is not BOINC that reports progress and writes this log; it's the app itself.
You're getting it wrong. That's not how it works at all. Applications make checkpoints only when they reach certain points in the calculation set by the APPLICATION programmers, where it is possible/convenient to record the state (and later restore from it). The BOINC client simply cannot influence this. All that the corresponding setting in the BOINC client does is tell the app, "please do not write checkpoints more often than once every xx minutes." But when, at which places, and how often to write them is up to the scientific application alone. The corresponding option is even worded accordingly:
"Request task to checkpoint at most every xxx seconds"
The app can follow this recommendation by skipping the next checkpoint if less than the specified interval has passed since the previous one was recorded, or it can ignore the recommendation. But in any case, checkpoints are written only at points predefined by the app programmer, when the calculation reaches them. This holds in theory and has been tested repeatedly in practice by me and many other users. A recent example with an FGRP5 application is in another topic: https://einsteinathome.org/content/strange-wus-names-and-checkpoint-issues-latest-fgrp5-batch
.... etc.
sfv,
Fast machines with a GPU never write a checkpoint, because the calculation finishes after 30 seconds. Then why do we need 40 checkpoints on slow computers? This is just a waste of time.
Risky64 wrote: Fast ...
In the case of BRP4, if you want fewer checkpoints, increase the time interval for checkpoints in the BOINC Manager settings. I did a test, and BRP4 does adhere relatively closely to BOINC checkpoint requests. You can see the evidence for this in the task logs.
Unfortunately, however, reducing the number of checkpoints does not improve run times. I went from 22 checkpoints per task to just 1 and noticed no difference in run times. It appears that in the case of BRP4, the checkpoint process is very quick and uses very few resources. I was hopeful that this was going to be one way to reduce run times, but it is not.
GPU tasks do checkpoint; I've seen it with both BRP7 and O3AS, which are strictly GPU tasks. In the case of BRP7, you can even see it in the logs. It depends on the app coding, your BOINC settings, and the run times on a given machine.
Someone tested the checkpoint effect on an RPi 4 with an SD card that had extremely poor I/O performance. It resulted in a 10% performance drop. However, under normal conditions, checkpoints should have a negligible impact on performance, 1% at most.
ahorek's team wrote: Someone ...
Interesting. I tested it on an old laptop with a Celeron N3050 CPU and a 32GB eMMC drive and saw no difference. It seems like those with slow storage such as an SD card could reduce their BRP4 run times in a meaningful way by going with 1 or 0 checkpoints.
ahorek's team,
It seems I have overreacted. I am sorry! This is a non-issue.
I have tried 1500 seconds for checkpoints. This is okay for me.
Risky64 wrote: It seems I ...
Not at all. The checkpointing mechanism is not obvious. It's important to discuss its details every now and then, in particular how it works between the BOINC client and the science app, as the useful repost by 'San-Fernando-Valley' did.
Someone could write a guide with hints about possible defaults for newbies and noobs.
It could be called "Settings made easy" or "Noob guide to Einsteinathome".