FGRP5 (FGRPSSE 1.08) tasks reset progress

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422

2 Oct 2024 11:42:24 UTC

Message 228681

"Normal" FGRP5 CPU tasks write some ~60 checkpoints between 0% and ~90% progress. The final 10% of progress is spent on the final evaluation of the 10 toplist candidate signals, which adds 10 further checkpoints.

At the beginning of a new raw data file (e.g. LATeah2108.dat) there are lots of tasks with only a few skypoints (as few as 6, ... 4, or, as can currently be seen, only TWO skypoints!!!).

These workunits can be identified by a small number (< 100) in the workunit name next to the leading raw data file name:

e.g.: LATeah2108F_72.0_3124_-8.599999999999999e-11 --> only 2 skypoints: 2 checkpoints until 90% progress

see command line:

command line: projects/einstein.phys.uwm.edu/hsgamma_FGRP5_1.08_windows_intelx86__FGRPSSE.exe --inputfile [...] --numSkyPoints 2 --f1dot -8.699999999999998e-11 [...]

This workunit checkpoints at ~45%, ~90%, then ten further checkpoints until 100% progress.

See: stderr.txt logfile of running tasks:

% C 1 0   <-- 1st cp: ~45% progress
% C 2 0   <-- 2nd cp: ~90% progress
% C 3 2   <-- 1st toplist candidate checkpoint
% C 4 3   <-- 2nd toplist...
% C 5 4   <-- 3rd
% C 6 5
% C 7 6
% C 8 7
% C 9 8
% C 10 9
% C 11 10
% C 12 11 <-- 10th toplist candidate --> 100% progress
FPU status flags:  PRECISION
12:58:08 (1960): [normal]: done. calling boinc_finish(0).
12:58:08 (1960): called boinc_finish
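
The checkpoint pattern above can be reproduced with a few lines of Python. This is only a rough sketch, assuming one checkpoint per skypoint spread evenly over the first ~90% and one checkpoint per toplist candidate over the last 10%; the function names are made up for illustration and are not part of the FGRP5 application or BOINC.

```python
import re


def num_sky_points(command_line: str) -> int:
    """Return the value passed to --numSkyPoints in a task's command line."""
    match = re.search(r"--numSkyPoints\s+(\d+)", command_line)
    if match is None:
        raise ValueError("no --numSkyPoints option found")
    return int(match.group(1))


def checkpoint_schedule(sky_points: int,
                        search_share: float = 90.0,
                        candidates: int = 10) -> list[float]:
    """Approximate progress (in %) reached at each checkpoint.

    Assumption: checkpoints are evenly spaced, one per skypoint up to
    search_share, then one per toplist candidate up to 100 %.
    """
    search = [search_share * k / sky_points for k in range(1, sky_points + 1)]
    step = (100.0 - search_share) / candidates
    followup = [search_share + step * k for k in range(1, candidates + 1)]
    return search + followup


# Command line copied from the post, elisions left as they are.
cmd = ("hsgamma_FGRP5_1.08_windows_intelx86__FGRPSSE.exe --inputfile [...] "
       "--numSkyPoints 2 --f1dot -8.699999999999998e-11 [...]")

n = num_sky_points(cmd)
schedule = checkpoint_schedule(n)
print(n, [round(p, 1) for p in schedule])
```

For the 2-skypoint task this gives 12 entries (45, 90, then 91 to 100), which matches the twelve `% C` lines in the log.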

It's almost impossible to run such tasks on a computer that is powered down regularly. You will lose most of the computation effort to frequent resets to the last checkpoint, because checkpoints are extremely rare (HOURS apart!). So either you run it 24/7, or you use hibernate mode (suspend to disk) instead of shutting the BOINC client down. Or avoid FGRP5 CPU with these "unnormal" tasks.
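
To see why short sessions are so costly, here is a small sketch of how much CPU time survives a shutdown. It assumes evenly spaced checkpoints and a session that starts exactly at a checkpoint; the 4-hour interval and the session lengths are hypothetical numbers, not measurements.

```python
import math


def session_outcome(session_hours: float,
                    checkpoint_interval_hours: float) -> tuple[float, float]:
    """Return (hours kept, hours lost) for one power-on session.

    Only work up to the last checkpoint written during the session
    survives a shutdown; everything after it is recomputed next time.
    """
    checkpoints_reached = math.floor(session_hours / checkpoint_interval_hours)
    kept = checkpoints_reached * checkpoint_interval_hours
    lost = session_hours - kept
    return kept, lost


# Hypothetical example: checkpoints 4 hours apart.
short_session = session_outcome(3.0, 4.0)  # PC switched off after 3 hours
long_session = session_outcome(5.0, 4.0)   # PC switched off after 5 hours
print(short_session, long_session)
```

With these numbers the 3-hour session keeps nothing at all, so a machine that never stays on longer than the checkpoint interval would never get past the first checkpoint.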

On the other hand, these occasionally occurring WUs complete in less than half the time of "normal" FGRP5 tasks while giving the same credit; that is, they boost your RAC by 100...120% if, like me, you don't use a real GPU.
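
The 100...120% figure follows from simple arithmetic, sketched here under the assumption that the host runs only such short tasks, that credit per task is unchanged, and that RAC settles in proportion to credit earned per unit of time. The function name is made up for illustration.

```python
def rac_boost_percent(time_ratio: float) -> float:
    """Percent gain in credit rate when a task takes time_ratio times the
    normal runtime for the same credit (time_ratio = short / normal)."""
    return (1.0 / time_ratio - 1.0) * 100.0


half_time = rac_boost_percent(0.5)           # exactly half the runtime
faster_still = rac_boost_percent(1.0 / 2.2)  # about 45 % of the runtime
print(round(half_time), round(faster_still))
```

Half the runtime doubles the credit rate (+100%); a task that needs about 45% of the normal runtime gives roughly +120%.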

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422


2 Oct 2024 11:25:09 UTC

Message 228683


An old thread from the "wish list" forum on the same issue* with FGRP5 CPU, where I explained it in detail before:

https://einsteinathome.org/de/content/cpu-time-checkpoint-4h

* not an issue but a feature. ;-)

carmar

Joined: 27 May 21

Posts: 32

Credit: 535865

RAC: 506


2 Oct 2024 15:52:07 UTC

Message 228686


Thank you.

carmar

Joined: 27 May 21

Posts: 32

Credit: 535865

RAC: 506


12 Oct 2024 18:51:23 UTC

Message 228999


This should be the final update.

I removed the project and added it back several days later. Since last night, it has been checkpointing more frequently: roughly every 20 minutes, judging from my occasional checks.

Thanks to all for teaching me more about this.

Jim Martin

Joined: 24 Jun 05

Posts: 9

Credit: 9596878

RAC: 11005


15 Oct 2024 18:38:18 UTC

Message 229078


Emails/personal inputs, during program, result in program restarts.

I enjoy running Einstein@home, but don't feel it should dominate the computer's use.

Checkpoints seem fewer, lately. I'll not stop running E@home, have just let my "aborts" be a friendly heads-up to you.

Resend, and will give it another try!

San-Fernando-Valley

Joined: 16 Mar 16

Posts: 397

Credit: 10113953455

RAC: 28518589


16 Oct 2024 6:50:44 UTC

Message 229089 in response to message 229078


Jim Martin wrote:

Emails/personal inputs, during program, result in program restarts.

I enjoy running Einstein@home, but don't feel it should dominate the computer's use.

...

You should be able to control the behavior to your needs with the "Options" tab.

sfv

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422


16 Oct 2024 7:33:58 UTC

Message 229090 in response to message 229078


Jim Martin wrote:

Checkpoints seem fewer, lately.

We are back at 22 skypoints, that is, 22 checkpoints within the first 90% (more precisely, 89.989%) of progress for the FGRP5 CPU tasks currently being sent out.

90% / 22 = 4.09% --> checkpointing each ~4% of progress.
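
The same division can be repeated for the other skypoint counts mentioned in this thread. This is a minimal sketch assuming evenly spaced checkpoints over the first 90% of progress; the helper name is made up for illustration.

```python
def checkpoint_step_percent(sky_points: int, search_share: float = 90.0) -> float:
    """Progress (in %) between two checkpoints, assuming one checkpoint
    per skypoint, evenly spaced over the first search_share percent."""
    return search_share / sky_points


# Skypoint counts mentioned in this thread.
steps = {n: round(checkpoint_step_percent(n), 2) for n in (2, 4, 6, 22, 60)}
for n, step in steps.items():
    print(f"{n:2d} skypoints: checkpoint every ~{step}% of progress")
```

The fewer skypoints a task has, the larger the jump between checkpoints, from 1.5% for a "normal" 60-skypoint task up to 45% for a 2-skypoint task.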

As SFV already mentioned: memory usage can be limited to a fixed share:

Use max xx% of memory*:

...or by limiting the number of parallel running tasks (FGRP5 currently: ~700 MiB per task):

Use max xx% of CPUs:

CPU usage can be throttled as well: (see effect in Windows task manager)

Use max xx% of CPU time

There are also 3rd-party throttling tools for BOINC (e.g. TThrottle) which let you set CPU temperature thresholds, so that throttling follows the actual thermal output of the CPU (e.g. different processing steps of an FGRP5 task cause different heat output).

[*] Not sure about the exact wording because I use a German localization of BOINC.

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422


16 Oct 2024 7:43:55 UTC

Message 229091


A ten-year-old post from a project admin on checkpoint frequency:

https://einsteinathome.org/de/content/more-checkpoints#comment-119513

Bernd Machenschalk wrote:

In general we design our Apps to (potentially) checkpoint as often as possible / feasible, i.e. after each reasonably independent computation.

Feasibility limits here include the programming effort (parameters in data structures modified in nested loops saved and restored) and the data volume (storage space and time to write) of the necessary checkpoints. It doesn't make much sense to checkpoint every minute when writing the checkpoint takes several seconds (multiplied by the number of instances that may be running and checkpointing at once) and thus noticeably slows down computation, or if initializing the application picking up from a checkpoint takes several minutes alone.
