No progress made on tasks

ALex Lee
Joined: 12 Jan 20
Posts: 3
Credit: 15914
RAC: 0
Topic 220426

I'm new to this thing, and really enjoying it! But this also means I don't know much as far as the technical side is concerned.

When I run tasks for this project, the progress bar stops progressing, and stays at a given total. Sometimes it'll even reduce the progress.

In task manager, my cpu is running at 100%, so I don't know what's going on. My cpu temps are mid to high sixties.

CPU: Intel i5 6200u

OS: Windows 10

Thanks in advance.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Hi and welcome to Einstein!

Depending on the speed of your computer some of the tasks here can run for quite some time before updating their progress, so the progress bar stays on the same number between updates.
When a new task starts and the application doesn't report progress to BOINC during the first 1-2 minutes, BOINC starts showing simulated progress based on the estimated time the task will take. Then, when the application does update its progress, BOINC switches to showing the real progress. This often shows up as the progress going down; that's normal.
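
To illustrate why the displayed progress can drop, here is a toy model in Python. It is only a sketch of the behaviour described above, not BOINC's actual code; the function name, the 99% cap and the example numbers are all assumptions made up for illustration.

```python
def displayed_progress(elapsed_s, estimated_total_s, reported_fraction=None):
    """Return the fraction a client might display for a running task.

    reported_fraction stays None until the application has reported
    real progress (hypothetical model, not BOINC's real logic).
    """
    if reported_fraction is not None:
        # The application has reported: show the real value.
        return reported_fraction
    # No real report yet: fall back to a time-based guess, capped below 100%.
    return min(elapsed_s / estimated_total_s, 0.99)


# Hypothetical task estimated at 1 hour that really needs 8 hours.
before = displayed_progress(1800, 3600)               # simulated progress
after = displayed_progress(1800, 3600, 1800 / 28800)  # first real report
print(f"simulated: {before:.1%}, real: {after:.1%}")
```

In this made-up case the bar would fall as soon as the first real report arrives, because the time estimate was far too optimistic.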

Based on your credits you seem to complete tasks OK.
If you want more details, you need to go to your privacy settings and unhide your computers so that I or someone else can look at your tasks. If you click on my name and then the Show computers link, you'll see what type of info will be shown.

ALex Lee
Joined: 12 Jan 20
Posts: 3
Credit: 15914
RAC: 0

Ok, so I left the tasks going overnight and they did make some progress.

I've also noticed that when the task is running with the screensaver on, it makes progress, but not when I can actually look at the progress.

I unhid my computers as well.

Thanks!

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Upon looking at your devices I see that they seem to complete and return tasks successfully.
There's only one error listed and I can't tell what caused it by looking at the logs sent back to the server.

My advice is that you just let things run as they are to get a feel for how different tasks behave here on Einstein.

If you go to your task list and then click on one of the task IDs you'll see the log sent back to the server.
The Gamma-Ray tasks you've run have had 79 skypoints to process and they ran for 28000 - 36000 seconds. That's 5.9 - 7.6 minutes per skypoint, and I think that the progress only updates upon completion of a skypoint. These tasks will also hold at 89 point something percent done while doing the final computing/sorting of the results, and this stage may last for quite some time, up to half an hour or more.
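
The per-skypoint figure can be checked with a few lines of Python. This is just the arithmetic from the numbers above; the function name is made up for the example.

```python
SKYPOINTS = 79  # skypoints per Gamma-Ray task, from the task log


def minutes_per_skypoint(runtime_s, skypoints=SKYPOINTS):
    """Average minutes spent on each skypoint for a task of runtime_s seconds."""
    return runtime_s / skypoints / 60


for runtime_s in (28000, 36000):
    print(f"{runtime_s} s -> {minutes_per_skypoint(runtime_s):.1f} min per skypoint")
```

So if progress really only updates once per skypoint, the bar can sit still for several minutes at a time on this CPU.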

Your in-progress tasks at the time of writing are Gravitational Wave tasks and you haven't completed any of those yet.
On my computer they run for about twice as long as the Gamma-Ray tasks so expect something like that on yours too.

And as for the screensaver I don't use it and tasks complete fine without it.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118404651971
RAC: 25761516

ALex Lee wrote:
Ok, so I left the tasks going overnight and they did make some progress.

Welcome to Einstein@Home!  We understand there's quite a steep learning curve in order to become familiar with how things work.  One of the first things you should do is browse through all the computing and project preference settings so that you become familiar with the available 'knobs' that you can adjust.  Please ask specific questions if you are not precisely sure of the purpose of each one.

Your CPU (6200U) is an Ultra low power variety designed for mobile devices like notebooks and tablets.  By its very nature, scientific number crunching is an Ultra high power occupation, so you need to be careful how you use your device and not to have too high an expectation for how quickly the numbers can be crunched.

Your device may very well be quite aggressive about how it attempts to conserve power/reduce heat.  When the user is not active, it may well be throttling itself down to a much lower operating frequency.  This may be why you used the term "some progress" in the above quote.  I took that to mean that you were perhaps a bit disappointed with how much progress was made.

One of your available preference settings is concerned with "suspending crunching when the user is active".  If that setting is turned on and you are using keyboard/mouse etc to view progress, you won't be seeing any.  Crunching will have been suspended until you stop 'using' your computer :-).  This is why you should go through all your settings and think about each one.  You probably need to experiment to find conditions where progress is not overly slowed and power/heat is not overly high.

Another potential 'gotcha' with this very setting is that tasks may not be "kept in memory when suspended".  You should have that activated if you have sufficient memory.  If not activated, when crunching restarts it will have to be from a 'checkpoint' saved on disk, rather than from an 'in-core' process image.  If the time between checkpoints is significant, you could lose a significant amount of progress each time tasks get suspended.
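
As a rough illustration of the checkpoint point, here is a small Python sketch. It assumes checkpoints are written at a fixed interval, which is a simplification; the function name and the numbers are hypothetical and this is not how BOINC itself tracks anything.

```python
def lost_seconds(elapsed_s, checkpoint_interval_s, keep_in_memory):
    """Estimate crunch time thrown away when a task is suspended.

    Assumes (for illustration only) a checkpoint every
    checkpoint_interval_s seconds of run time.
    """
    if keep_in_memory:
        # The process image stays in RAM, so nothing has to be redone.
        return 0
    # Otherwise work since the last checkpoint is repeated on restart.
    return elapsed_s % checkpoint_interval_s


# Hypothetical task suspended after 1000 s with a checkpoint every 300 s.
print(lost_seconds(1000, 300, keep_in_memory=False))
print(lost_seconds(1000, 300, keep_in_memory=True))
```

If tasks get suspended many times a day, that repeated loss adds up, which is why the "kept in memory" setting matters.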

Just be aware that many mobile devices are NOT designed for continuous high power use like number crunching.  There is quite a risk that you may shorten the device's life if you try to extract too much performance from it.

Cheers,
Gary.

ALex Lee
Joined: 12 Jan 20
Posts: 3
Credit: 15914
RAC: 0

Ok, I think I get it now, thank you all so much!

Gary, I did have the option to "suspend while in use" disabled. I'm pretty sure everything else is pretty good. I guess it just isn't the smooth progress transition that I see on SETI, for example. I'm also considering getting a desktop PC as well, hopefully I won't fry the cpu in my laptop! Thanks again.

Holmis, I have now completed two of the gravitational wave tasks, and the other two are (somewhat) close to being finished as well. I think it's just down to the weak cpu. Thank you as well.

hernco.com
Joined: 14 Feb 12
Posts: 2
Credit: 191411196
RAC: 85808

Aloha,

I am having the same problem on all my computers. The tasks are not advancing or in some cases the remaining time is increasing. I had to un-install on my laptop as it would crash after 30 min of computing. Any ideas?

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

hernco.com wrote:
I am having the same problem on all my computers. The tasks are not advancing or in some cases the remaining time is increasing. I had to un-install on my laptop as it would crash after 30 min of computing. Any ideas?

Hi!

I see you have four hosts. Currently the Einstein database shows that only two of the hosts have had some kind of problems with tasks.

The first one of those unlucky hosts is running Android 9 (a smartphone of some sort?). It has been running tasks with the 'Binary Radio Pulsar Search (Arecibo) v1.46 () arm-android-linux-gnu' application. That has resulted in valid tasks, invalid tasks and tasks that errored out before validation. I don't know about Android stuff so I'll skip that.

BOINC says the other host has an "AMD FirePro W5130M (2048MB)". That GPU has GCN 1.0 architecture.

https://www.techpowerup.com/gpu-specs/firepro-w5130m.c2769

All GPUs with GCN 1.0 architecture have been observed to be incompatible with the current GW GPU application. You should disable GW GPU tasks for that host for now. You can do that by visiting https://einsteinathome.org/account/prefs/project and unchecking the box for 'Gravitational Wave search O2 Multi-Directional GPU'. Save the changes and preferably restart BOINC.

That same host is able to run the current GW CPU tasks (Gravitational Wave search O2 Multi-Directional). You can keep that application checked. All those CPU tasks have run successfully. The host with the AMD FirePro W5130M is also able to run tasks from the 'Gamma-ray pulsar binary search #1' application. There is still one successful example of those tasks in the task history. You can keep that application checked for that host as well.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

I'd like to add one note to Richie's answer, and that is to also select NO for "Allow non-preferred apps". This will prevent download of tasks for deselected applications if the selected applications don't have any tasks when you ask for more work.

As for the Android device, when I looked it had 14 valid tasks, 9 invalid and 4 errors. That's only about a 50% success rate. Check if the device is overheating.
I would try to reduce the number of tasks that run at the same time and if that doesn't help I would seriously consider not running Einstein@home on it. Maybe another project works better?
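
For reference, the success rate quoted above works out like this in Python. The counts are the ones I saw for the Android device; the variable names are just for the example.

```python
valid, invalid, errors = 14, 9, 4  # task counts seen for the Android device

total = valid + invalid + errors
success_rate = valid / total

print(f"{valid} of {total} tasks valid = {success_rate:.0%}")
```

Roughly every second task on that device is wasted work, which is why reducing the load or trying another project is worth considering.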

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118404651971
RAC: 25761516

hernco.com wrote:
I am having the same problem on all my computers. The tasks are not advancing or in some cases the remaining time is increasing. I had to un-install on my laptop as it would crash after 30 min of computing. Any ideas?

Richie and Holmis have given you good advice.  I'd just like to add a couple of points.

If the machine with the FirePro W5130M is your laptop, it would be very considerate towards your quorum partners if you fired it back up and aborted and returned all the tasks that currently show as 'in progress'.  That way, your partners won't have to wait the full two weeks for those tasks to time out before being sent out again.

Before doing so, you could go to your preferences on the website and allocate that machine to one of the 4 available locations so that you can give it its own unique set of preferences.  Then change those (just for that location) to exclude all searches except the gamma-ray pulsar GPU search (code name FGRPB1G).  Being a laptop, you need to worry about excess heat but if you don't do any CPU crunching, maybe it will be OK with just running those gamma-ray pulsar GPU tasks.  Your existing completed task for that search shows about 2 hrs for the crunch time so the machine could be reasonably productive, just running that GPU search.

Your comment about, "in some cases the remaining time is increasing" is probably not any sort of actual problem but just an artifact of wrong task estimates (far too short) created by the fact that a single duration correction factor (DCF) is having wild oscillations as some tasks crunch more slowly than predicted whilst other tasks crunch more rapidly.  As long as you keep a moderate work cache size, these inaccurate estimates are nothing to worry about.

If you do restart the machine with the W5130M GPU, it probably could out-produce your currently most productive machine - the one with the i7-3770 CPU.  That is a decent CPU and it's taking way too long to crunch CPU tasks - I saw times up to 2 days.  You have 2 old, low end GPUs in that machine so you are probably overloading it.  You have 8 threads (only 4 real cores) so if you are running more than about 2 CPU tasks concurrently you are probably overloading it.  There are no invalids or errors so it just looks like the machine is fine - but overloaded.

Here's my advice.  Suspend (for the time being) all CPU tasks and see what sort of difference that makes to GPU crunch times.  I suspect they may improve a bit.  Then (when you have an indication about that and with GPU tasks still running) resume just a single CPU task and see how much faster it might run.  I suspect quite a bit.  When you have a feel for that, resume a second CPU task to see if it will do the same.  When any CPU task finishes, resume another one to replace it.  Keep going in this fashion until you find an extra CPU task that starts to slow things down.  Then you will know what the limit is.  You can set the % of cores that BOINC is allowed to use to make sure you don't exceed that limit in the future.  With that limit in place you can resume the remaining suspended CPU tasks.
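
The procedure above could be sketched in Python roughly as follows. The throughput numbers are invented purely for illustration (you would have to measure your own), the function names are hypothetical, and using overall throughput as the yardstick is a simplification of "starts to slow things down". The percentage helper also assumes BOINC rounds the allowed thread count down, which I believe is the case but haven't verified.

```python
def find_cpu_task_limit(throughput_by_count):
    """Return the number of concurrent CPU tasks after which adding one
    more no longer helps.

    throughput_by_count[i] is the measured throughput (e.g. tasks per
    day) when running i + 1 CPU tasks at once.
    """
    limit = 1
    for n in range(1, len(throughput_by_count)):
        if throughput_by_count[n] <= throughput_by_count[n - 1]:
            break  # the extra task slowed things down
        limit = n + 1
    return limit


def cpu_percent_for(limit, threads):
    """Smallest whole percentage of threads that still allows `limit` tasks,
    assuming the allowed count is threads * pct // 100."""
    return -(-limit * 100 // threads)  # ceiling division


# Made-up measurements for 1, 2, 3 and 4 concurrent CPU tasks.
measured = [2.0, 3.8, 4.1, 3.9]
limit = find_cpu_task_limit(measured)
percent = cpu_percent_for(limit, threads=8)
print(f"limit: {limit} CPU tasks, set 'use at most' to about {percent}% of the CPUs")
```

With real measurements in place of the made-up list, the resulting percentage is what you would enter in the computing preferences.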

If you really want to get adventurous and drastically increase the production of that machine, you would consider replacing the two low end GPUs with a single, or even two, more modern ones.  Depending on exactly how adventurous you wanted to be, you might have to upgrade the PSU as well.  You could get a GPU that didn't require extra PCIe power that would be quite cheap and would complete a task in around 20 mins instead of the several hours that you currently see :-).

It's entirely up to you, but if you wanted to really make that machine fly, I'm sure there would be lots of people ready to give advice :-).

Cheers,
Gary.
