Tasks "Running" but no progress made

hermanm
Joined: 4 Feb 14
Posts: 4
Credit: 2736
RAC: 0
Topic 197371

I'm running BOINC on a laptop with Windows 7. I currently have 4 tasks that claim they are in a "Running" state but they have not made any progress in the past few hours. They appear to be running since my CPU usage has been at 100% the entire time, each taking about a 22-25% share, but the progress bar is stuck at 46.153% for all four.

I have seen other people have issues when the tasks are close to being completed, but since these are only half way done, I don't think it should be a problem.

Anyone had similar issues or know how to fix this? Thanks.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7307361689
RAC: 2299180

I don't have any magic here, and can't promise success but in your circumstance I would:

1. In the BOINC interface, suspend all of your tasks except one of the Gravitational Wave tasks. This includes any tasks for BOINC projects other than Einstein, and your GPU work. If you have other projects enabled but with no current WUs to suspend, also disable download of new work for them.
2. Do a full power-off shutdown of the laptop.
3. After the reboot, monitor events.

In the most conservative case, I might actually put all four of the "offending" WUs in suspend, after the reboot, and just enable a single fresh GW task which has never yet run. Watching that one progress could give you an idea of how things go on your machine with the least feasible impediment.

I think at least some laptops are equipped with far more computing power than they can usefully cool in steady state use. This is not useless, as much of the user perception of speed can be improved by short bursts of high computational throughput while the user is waiting for the computer, followed by long resting periods while the computer is waiting for the user.

Your laptop has two physical cores, which with hyperthreading enabled allows four CPU tasks, plus a Radeon GPU. If you can get things running again, you may find it prudent to throttle back considerably. You could put the number of running CPU jobs down to one by setting "On multiprocessors, use at most" 25% "of the processors" for the location (aka venue) of your host in the Computing Preferences page of your Einstein account online pages.

Even that may not help much if you have a heat problem, as between supporting your Radeon running Perseus work and one CPU job, your two physical cores would still be working pretty hard.
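
As a rough illustration of the arithmetic behind the "use at most" preference, the sketch below converts a percentage into a number of usable CPUs. The function name is hypothetical, and the rounding rule (round down, but never below one) is an assumption rather than something stated in this thread; the thread only confirms that 25% gives one task and 50% gives two on this 4-thread host.

```python
import math


def usable_cpus(logical_cpus, max_pct):
    """Number of CPUs allowed to run tasks for a given percentage limit.

    Assumption: the result is rounded down and is never less than one.
    """
    return max(1, math.floor(logical_cpus * max_pct / 100))


# The host in this thread reports 4 logical processors.
for pct in (25, 50, 100):
    print(pct, "% ->", usable_cpus(4, pct), "CPU task(s)")
```

With four logical processors this gives one, two and four CPU tasks for 25%, 50% and 100% respectively.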

Good luck, and let us know what you see.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118627020665
RAC: 18211793

Hi hermanm, welcome to the project!

Quote:
I'm running BOINC on a laptop with Windows 7. I currently have 4 tasks that claim they are in a "Running" state but they have not made any progress in the past few hours. They appear to be running since my CPU usage has been at 100% the entire time, each taking about a 22-25% share, but the progress bar is stuck at 46.153% for all four.


Your laptop shows up as having an AMD 7700 series (CapeVerde) GPU and indeed you have both CPU and GPU tasks. Are the 4 'in progress' tasks all CPU tasks (S6 Gravitational Wave Directed Search)? You have 6 GPU tasks showing on the website (Binary Radio Pulsar). Are any of those in progress? Have you changed any project specific preference settings (e.g. GPU utilization factor of BRP apps) or are all your settings at default values? I'm a bit surprised that you don't seem to have a GPU task in progress.

Have you tried stopping and restarting BOINC? If not, give that a try and once restarted, have a look at the event log window - In BOINC Manager -> Advanced view, Advanced -> Event Log. If you can copy and paste about the first 40 or so lines of startup messages, it will give a good indication of what BOINC thinks about your machine. Because you have a laptop and because you may be crunching on all CPU cores plus the GPU, you may have a heat problem. You need to monitor that fairly carefully.

Quote:
I have seen other people have issues when the tasks are close to being completed, but since these are only half way done, I don't think it should be a problem.


That issue only applies to the new Fermi Gamma Ray Pulsar (FGRP3) run so it's certainly not your problem since you only have GW CPU tasks in your cache of work.

EDIT: I see that archae86 has replied whilst I was composing my reply. I think that limiting BOINC's use of your CPUs to 50% or 25% would be a good idea, but please do show us the startup messages. You should also install a CPU temperature monitoring utility to see how hot your CPUs are running. If you intend to crunch with your GPU, you should also use a tool to monitor load and temperature there as well.

Cheers,
Gary.

hermanm
Joined: 4 Feb 14
Posts: 4
Credit: 2736
RAC: 0

Thanks for the welcome, Gary.

After reading the advice both of you gave, I noticed that since I posted this, two of the gravitational wave tasks did make some progress. The other two stayed put. I decided to restart my machine as archae86 suggested and after I started BOINC back up I checked out the event log. Not sure if it will help at all but here it is:

2/5/2014 8:35:14 PM | | cc_config.xml not found - using defaults
2/5/2014 8:35:14 PM | | Starting BOINC client version 7.2.33 for windows_x86_64
2/5/2014 8:35:14 PM | | log flags: file_xfer, sched_ops, task
2/5/2014 8:35:14 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
2/5/2014 8:35:14 PM | | Data directory: C:\ProgramData\BOINC
2/5/2014 8:35:14 PM | | Running under account hermanm
2/5/2014 8:35:14 PM | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1703, 1024MB, 985MB available, 1728 GFLOPS peak)
2/5/2014 8:35:14 PM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version CAL 1.4.1703 (VM), device version OpenCL 1.1 AMD-APP (898.1), 1024MB, 985MB available, 1728 GFLOPS peak)
2/5/2014 8:35:14 PM | | OpenCL CPU: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 2.0, device version OpenCL 1.1 AMD-APP (898.1))
2/5/2014 8:35:14 PM | | Host name: MSOE-5CB3270VZF
2/5/2014 8:35:14 PM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz [Family 6 Model 58 Stepping 9]
2/5/2014 8:35:14 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 pbe
2/5/2014 8:35:14 PM | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
2/5/2014 8:35:14 PM | | Memory: 3.93 GB physical, 4.93 GB virtual
2/5/2014 8:35:14 PM | | Disk: 78.12 GB total, 16.56 GB free
2/5/2014 8:35:14 PM | | Local time is UTC -6 hours
2/5/2014 8:35:14 PM | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 10317833; resource share 100
2/5/2014 8:35:14 PM | | No general preferences found - using defaults
2/5/2014 8:35:14 PM | | Preferences:
2/5/2014 8:35:14 PM | | max memory usage when active: 2013.78MB
2/5/2014 8:35:14 PM | | max memory usage when idle: 3624.80MB
2/5/2014 8:35:14 PM | | max disk usage: 16.76GB
2/5/2014 8:35:14 PM | | don't use GPU while active
2/5/2014 8:35:14 PM | | suspend work if non-BOINC CPU load exceeds 25%
2/5/2014 8:35:14 PM | | (to change preferences, visit a project web site or select Preferences in the Manager)
2/5/2014 8:35:14 PM | | Not using a proxy
2/5/2014 8:35:27 PM | | Suspending GPU computation - computer is in use
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.4Hz_837_0 using einstein_S6CasA version 105 (SSE2) in slot 0
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.4Hz_838_1 using einstein_S6CasA version 105 (SSE2) in slot 1
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.35Hz_893_2 using einstein_S6CasA version 105 (SSE2) in slot 2
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.05_S6Directed__S6CasAf40a_856.4Hz_780_0 using einstein_S6CasA version 105 (SSE2) in slot 3
2/5/2014 8:35:48 PM | | Suspending computation - CPU is busy
2/5/2014 8:35:58 PM | | Resuming computation
2/5/2014 8:36:18 PM | | Suspending computation - CPU is busy
2/5/2014 8:36:28 PM | | Resuming computation

All four tasks appear to be blocked again, despite their running state. There are two other gravitational wave tasks that were in progress momentarily before I suspended them, and there is also one GPU task (Binary Radio Pulsar Search) that has a small amount of progress on it but I have never actually seen it running. It says "GPU Suspended - Computer in use", so I am assuming this is why I have never seen it run.

I am going to take the advice and set my preferences to only use 25%. I didn't think it would have been an issue, but I don't really know what I am doing (any tips are appreciated). Any suggestions for a good temperature monitoring utility?

Anyways, I hope that maybe the event log will give someone some clues regarding my issue. I really appreciate everyone's help. Thank you.

P.S. quick question, if I close BOINC manager, do the tasks still execute in the background, and if so, is there some way to disable that?

hermanm
Joined: 4 Feb 14
Posts: 4
Credit: 2736
RAC: 0

Found a similar post: http://einsteinathome.org/node/197347

Seems like it may just be a waiting game, especially since I was running four at once. I will keep the single job running for a while and see if anything changes. I would still like any feedback you have on my previous reply though.

EDIT:

The single running task caught up to the other two that had made progress and is currently up to 53.486% from 46.153%. It's almost like they are just taking a long time and then jumping ahead in progress rather than progressing smoothly.

The time remaining continues to increase though, so I guess that is not an estimate that I can rely on.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7307361689
RAC: 2299180

Quote:
2/5/2014 8:35:14 PM | | max memory usage when active: 2013.78MB

I don't know the memory footprint of the GW applications, but it just might be that four GWs plus a Perseus (plus more for any suspended but still resident in memory) might get you up to this limit. If so, restricting to 25% of CPUs would help, but this is probably not your issue.

Quote:
2/5/2014 8:35:14 PM | | don't use GPU while active

This is why your GPU job ceases work if you have used keyboard or mouse anytime "recently", where "recently" is defined by an item on your computing preferences page for that location (venue).

Quote:
2/5/2014 8:35:14 PM | | suspend work if non-BOINC CPU load exceeds 25%

And this preference is shutting down your CPU jobs anytime your laptop finds much of anything to do (antivirus scan, helping you surf the web...).

Quote:
P.S. quick question, if I close BOINC manager, do the tasks still execute in the background, and if so, is there some way to disable that?

You can select that behavior. I prefer to select the behavior that the tasks shut down if boincmgr is exited, but sometimes they fail to do so in conditions I can't define. As to how to select the behavior, I seem to recall being offered the choice the first time I shut down boincmgr by hand, but can't find an option setting or preference item that controls it on a quick look.

As a general proposition, the fact that BOINC tasks run at low priority is hoped to make them "good neighbors", not appreciably impairing user response even when active. I think you could consider relaxing some of the restrictions currently set for this machine in parameters meant to protect you by holding back BOINC processing, since the holding back of BOINC processing is what is currently troubling you.

I hope someone else will come along to offer additional and contrasting readings of the entrails here.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2990346380
RAC: 699578

Quote:
You can select that behavior. I prefer to select the behavior that the tasks shut down if boincmgr is exited, but sometimes they fail to do so in conditions I can't define. As to how to select the behavior, I seem to recall being offered the choice the first time I shut down boincmgr by hand, but can't find an option setting or preference item that controls it on a quick look.


In the Manager Advanced view, choose 'Options...' from the Tools menu.

On the first (general) tab, check 'Enable Manager exit dialog?'

hermanm
Joined: 4 Feb 14
Posts: 4
Credit: 2736
RAC: 0

Thanks. It was already enabled, I just wasn't using the "Exit BOINC" option from the File menu to close it so it wasn't asking me.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118627020665
RAC: 18211793

I'm sorry for the slow response. I've had some commitments and haven't been able to respond to you immediately.

In addition to and in support of what others have mentioned, I'd like to make some suggestions. Firstly, here are some particular lines from the startup messages with important bits in bold. A lot of this was already mentioned by archae86.

Quote:

....
2/5/2014 8:35:14 PM | | Preferences:
2/5/2014 8:35:14 PM | | max memory usage when active: 2013.78MB
2/5/2014 8:35:14 PM | | max memory usage when idle: 3624.80MB
2/5/2014 8:35:14 PM | | max disk usage: 16.76GB
2/5/2014 8:35:14 PM | | don't use GPU while active
2/5/2014 8:35:14 PM | | suspend work if non-BOINC CPU load exceeds 25%
....

2/5/2014 8:35:27 PM | | Suspending GPU computation - computer is in use
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.4Hz_837_0 using einstein_S6CasA version 105 (SSE2) in slot 0
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.4Hz_838_1 using einstein_S6CasA version 105 (SSE2) in slot 1
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.10_S6Directed__S6CasAf40a_856.35Hz_893_2 using einstein_S6CasA version 105 (SSE2) in slot 2
2/5/2014 8:35:27 PM | Einstein@Home | Restarting task h1_0856.05_S6Directed__S6CasAf40a_856.4Hz_780_0 using einstein_S6CasA version 105 (SSE2) in slot 3
2/5/2014 8:35:48 PM | | Suspending computation - CPU is busy
2/5/2014 8:35:58 PM | | Resuming computation
2/5/2014 8:36:18 PM | | Suspending computation - CPU is busy
2/5/2014 8:36:28 PM | | Resuming computation

(Note that 'Resuming GPU computation' doesn't appear in the messages.)
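
To make that kind of check less error-prone on a longer log, here is a minimal Python sketch that tallies suspend and resume messages. The function name and the assumption that every line has the form `timestamp | project | message` are illustrative, based only on the lines quoted in this thread; it is not part of BOINC.

```python
def count_suspend_resume(lines):
    """Tally suspend/resume messages in BOINC event log lines.

    Assumes each line looks like 'timestamp | project | message'.
    Lines that do not fit that shape are skipped.
    """
    counts = {"cpu_suspend": 0, "cpu_resume": 0,
              "gpu_suspend": 0, "gpu_resume": 0}
    for line in lines:
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        message = parts[2].strip()
        # Check the GPU-specific messages first, since they are more specific.
        if message.startswith("Suspending GPU computation"):
            counts["gpu_suspend"] += 1
        elif message.startswith("Resuming GPU computation"):
            counts["gpu_resume"] += 1
        elif message.startswith("Suspending computation"):
            counts["cpu_suspend"] += 1
        elif message.startswith("Resuming computation"):
            counts["cpu_resume"] += 1
    return counts


sample = [
    "2/5/2014 8:35:27 PM | | Suspending GPU computation - computer is in use",
    "2/5/2014 8:35:48 PM | | Suspending computation - CPU is busy",
    "2/5/2014 8:35:58 PM | | Resuming computation",
    "2/5/2014 8:36:18 PM | | Suspending computation - CPU is busy",
    "2/5/2014 8:36:28 PM | | Resuming computation",
]
print(count_suspend_resume(sample))
```

On the five sample lines this reports one GPU suspension with no GPU resume, and two CPU suspensions each followed by a resume.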

By default, it looks like BOINC is going to halt progress for just about any reason it can think of. Sure, it's important not to let crunching interfere with your day to day use of your computer but you may well find that you can ease restrictions without suffering a dramatic penalty. The best way to find out is to start 'full throttle' and then apply restrictions if you need to. On your account page you have 'computing preferences' and 'project preferences'. Here is a list of the ones I would use for your 'full throttle' trial (all computing preferences).

    * Suspend work while computer is in use? -- no
    * Suspend GPU work while computer is in use? -- no
    * Suspend work if CPU usage is above -- 0
    * Leave tasks in memory while suspended? -- yes
    * On multiprocessors, use at most -- 100%
    * Swap space: use at most -- 90%
    * Memory: when computer is in use, use at most -- 90%
    * Memory: when computer is not in use, use at most -- 100%

Once you change the preferences on the account page, you need to select the Einstein project on the projects tab of BOINC Manager and click 'update'. You should then have 4 CPU tasks and one GPU task, all making progress. To confirm this (and get an idea of the rate of progress), note the starting % values of all 5 running tasks and then note them all again at the end of a test period, say 30-60 mins later. Try not to run anything else that is compute intensive whilst the test is on. Each running task has an elapsed time counter and all time values should be incrementing steadily every second. Calculate the approximate time to complete a task with the following formula:-

Task time (mins) = Selected time interval (mins) X 100 / (Ending % done - Starting % done)
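
The formula above can be sketched in a few lines of Python. The function name is hypothetical, and the 60-minute interval in the example is an assumed value for illustration; only the two percentages come from this thread.

```python
def estimate_task_minutes(interval_min, start_pct, end_pct):
    """Estimated total run time of a task, in minutes.

    interval_min: length of the test period in minutes
    start_pct, end_pct: '% done' at the start and end of that period
    """
    progress = end_pct - start_pct
    if progress <= 0:
        # A stalled or suspended task gives no basis for an estimate.
        raise ValueError("no progress during the interval; cannot estimate")
    return interval_min * 100.0 / progress


# Assumed 60-minute test period with the progress values reported earlier.
minutes = estimate_task_minutes(60, 46.153, 53.486)
print(f"about {minutes:.0f} minutes ({minutes / 60:.1f} hours) per task")
```

If the task makes no progress during the test period, the function raises an error instead of dividing by zero, which is itself a sign that the task is being suspended.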


After the test period is finished but whilst still running 'full throttle', try using your computer for your day to day activities and see how responsive it feels. Try to work out if any particular part of your system is suffering. If performance is not acceptable, the first thing to try is setting the "use at most xx% of the processors" to 50% and then clicking 'update' to read the new setting from the website. If you prefer, you can set these things locally through BOINC Manager (Tools -> Computing preferences) but be aware that local preferences (until you clear them) will trump website preferences. With the change to 50%, you should see only 2 CPU tasks running instead of 4. The GPU task should still be running. This should reduce, in particular, the time taken for CPU tasks. You can easily confirm this by running the previous test again. The GPU task may perhaps speed up slightly as well although I suspect not much.

If you try these things, let us know how you get on and we can work out what to do next based on your report. My experience has been that the science apps run at low priority and do tend to get 'out of the way' when necessary so I don't normally use the restrictive preference settings that tend to cripple BOINC. It very much depends on what your 'normal everyday work' is though :-).

Quote:
Any suggestions for a good temperature monitoring utility?


I'm running Linux on all my machines these days. A couple of years ago, when I had some Windows boxes, I used a utility called CoreTemp. I also used GPU-Z for GPU monitoring. There are quite a few choices that are easily found with Google.

Cheers,
Gary.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7307361689
RAC: 2299180

Regarding temperature monitoring, many of us like and use Speedfan on Windows machines. While the name hints that it has an optional fan speed control function, it is pretty good at displaying the reported temperatures of CPU cores, GPUs, extra motherboard sensors (if present), and hard disk drives, any and all of which it can display on a short-term graph.

The good and bad news is that it has a lot of configuration options. For example you can choose which of the specific temperatures it finds reportable are displayed in either the list or graph views, and introduce temperature offsets if you think you have calibrated an error.

All of this flexibility can make it a little complicated to use.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7307361689
RAC: 2299180

One can see that hermanm's host returned three tasks (all CPU GW 6 CasA) recently and promptly had them validate and get credit. So something is working at least some of the time.
