This almost never helps, except in certain very limited cases (corrupt application, entire string of corrupt WUs issued by project, no work but extreme LTD...) but does destroy any work you have on hand, and adds to the load on the servers, as they have to resend you all the application files, etc.
Not to pick on you, as I see "oh, just reset the project" advised all the time. But PLEASE DON'T DO THAT, not until _EVERY_ other option has been tried, and failed. If I were put in charge of redesigning BOINC Manager, the first thing I'd do is make "reset" and "detach" much harder to get to...
My thoughts exactly. Too many people just start punching whatever buttons they find, whether they're having a problem or not. "Reset" and "Detach" should be buried a bit, under a warning, at least.
"The arc of history is long, but it bends toward justice" - MLK
Gary has posted a recommendation to suspend Seti for the time being.
That's exactly what you need to do. Suspend Seti (on the projects tab) and then the other projects should get work. Tell us how you get on. Read the whole thread that Mark referred to if you want the gory details.
I did suspend SETI and Einstein did send work. Finished and uploaded work for Predictor and never heard from it again. At least I'm no longer idle. I'll check tomorrow to see if SETI is feeling better.
Your suggestions worked a treat, although I did have to download the other files for EAH; it's now crunching away perfectly. But like everybody else, it seems, I still have all my SETI units finished and waiting to upload... 3 days now on my work PC and 4 days on my home PC, so I've taken your suggestion to suspend.
Well, I hoped I wouldn't see those "Removed from memory" & "Restarting results" messages again, but they cropped up again. Not only for EAH, but CPN as well. (I just joined Thursday AM.) I put pertinent lines in bold print. My comments are in italics preceded by 3 plus signs.
I'll save the best till last, but if you begin to get a sour stomach seeing the error messages, jump down to the end of the message for some relief. At the bottom of this message there's a NEW PROBLEM (bug?) that has cropped up with ClimatePredictionNet! Ouch!
12/8/2005 12:59:23 AM|Einstein@Home|Started upload of l1_0570.5__0570.7_0.1_T08_S4lD_2_0
12/8/2005 12:59:43 AM|Einstein@Home|Finished upload of l1_0570.5__0570.7_0.1_T08_S4lD_2_0
12/8/2005 12:59:43 AM|Einstein@Home|Throughput 2749 bytes/sec
12/8/2005 1:01:57 AM|Einstein@Home|Sending scheduler request to
12/8/2005 1:01:57 AM|Einstein@Home|Reason: To fetch work
12/8/2005 1:01:57 AM|Einstein@Home|Requesting 8640 seconds of new work, and reporting 1 results
12/8/2005 1:02:18 AM||Couldn't connect to hostname []
12/8/2005 1:02:22 AM|Einstein@Home|Scheduler request to failed with a return value of -106
12/8/2005 1:02:22 AM|Einstein@Home|No schedulers responded
12/8/2005 1:03:23 AM|Einstein@Home|Requesting 8640 seconds of new work, and reporting 1 results
12/8/2005 1:03:44 AM||Couldn't connect to hostname []
12/8/2005 1:03:48 AM|Einstein@Home|Scheduler request to failed with a return value of -106
12/8/2005 1:03:48 AM|Einstein@Home|No schedulers responded
12/8/2005 1:04:30 AM||request_reschedule_cpus: project op
12/8/2005 1:20:53 AM|Einstein@Home|Starting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
+++ (Signed up & downloading files for ClimatePredictor.Net .)
12/8/2005 4:59:02 AM||Master file download succeeded
12/8/2005 4:59:02 AM||Sending scheduler request to
12/8/2005 4:59:02 AM||Reason: Requested by user
12/8/2005 4:59:02 AM||Requesting 8640 seconds of new work
12/8/2005 4:59:05 AM||Scheduler request to succeeded
12/8/2005 4:59:05 AM||Successfully attached to
12/8/2005 4:59:07 AM||Started download of sulphur_4.22_windows_intelx86.exe
12/8/2005 4:59:07 AM||Started download of
12/8/2005 5:10:30 AM||Finished download of sulphur_4.22_windows_intelx86.exe
12/8/2005 5:10:30 AM||Throughput 3114 bytes/sec
12/8/2005 5:10:30 AM||Started download of
12/8/2005 5:18:25 AM||Finished download of
12/8/2005 5:18:25 AM||Throughput 2532 bytes/sec
12/8/2005 5:18:25 AM||Started download of
12/8/2005 5:27:53 AM||Finished download of
12/8/2005 5:27:53 AM||Throughput 2468 bytes/sec
12/8/2005 5:27:53 AM||Started download of
12/8/2005 5:28:04 AM||Finished download of
12/8/2005 5:28:04 AM||Throughput 1678 bytes/sec
12/8/2005 6:31:45 AM||Finished download of
12/8/2005 6:31:45 AM||Throughput 4328 bytes/sec
12/8/2005 6:31:46 AM||request_reschedule_cpus: files downloaded
+++ OK! Here's where EAH removes the result from memory as though it has UL'd, which it hasn't yet.
12/8/2005 6:31:46 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 6:31:46 AM||Starting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 6:31:47 AM||request_reschedule_cpus: process exited
12/8/2005 6:53:52 AM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 6:53:53 AM||Couldn't connect to hostname []
12/8/2005 6:53:54 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system I/O
12/8/2005 6:53:54 AM|SETI@home|Backing off 3 hours, 15 minutes, and 57 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ EAH restarts result for same file AND CPN removes result from memory!
12/8/2005 7:31:48 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 7:31:48 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 7:31:51 AM||request_reschedule_cpus: process exited
12/8/2005 7:54:56 AM|SETI@home|Started download of 15oc03aa.1719.32993.742326.236
12/8/2005 7:54:56 AM||Couldn't connect to hostname []
12/8/2005 7:54:57 AM|SETI@home|Temporarily failed download of 15oc03aa.1719.32993.742326.236: system I/O
12/8/2005 7:54:57 AM|SETI@home|Backing off 3 hours, 1 minutes, and 39 seconds on download of file 15oc03aa.1719.32993.742326.236
12/8/2005 8:28:57 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:28:58 AM||Couldn't connect to hostname []
12/8/2005 8:28:58 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: system I/O
12/8/2005 8:28:58 AM|SETI@home|Backing off 2 minutes and 28 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:31:27 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:31:28 AM||Couldn't connect to hostname []
12/8/2005 8:31:28 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: system I/O
12/8/2005 8:31:28 AM|SETI@home|Backing off 2 hours, 53 minutes, and 4 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
+++ EAH removes its result from memory AND CPN restarts same file!
12/8/2005 8:31:52 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 8:31:52 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 8:31:53 AM||request_reschedule_cpus: process exited
12/8/2005 8:38:45 AM|SETI@home|Started upload of 13au01aa.6597.28354.311076.55_3_0
12/8/2005 8:38:46 AM||Couldn't connect to hostname []
12/8/2005 8:38:46 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.28354.311076.55_3_0: system I/O
12/8/2005 8:38:46 AM|SETI@home|Backing off 2 hours, 14 minutes, and 47 seconds on upload of file 13au01aa.6597.28354.311076.55_3_0
+++ EAH restarts result for same file AND CPN removes result from memory!
12/8/2005 9:31:53 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 9:31:53 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 9:32:11 AM||request_reschedule_cpus: process exited
12/8/2005 9:33:35 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 9:33:36 AM||Couldn't connect to hostname []
12/8/2005 9:33:36 AM|SETI@home|Temporarily failed upload of 15ap04aa.22158.4642.798576.145_2_0: system I/O
12/8/2005 9:33:36 AM|SETI@home|Backing off 1 hours, 10 minutes, and 24 seconds on upload of file 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 10:09:51 AM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 10:09:52 AM||Couldn't connect to hostname []
12/8/2005 10:09:52 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system I/O
12/8/2005 10:09:52 AM|SETI@home|Backing off 58 minutes and 54 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ EAH removes its result from memory AND CPN restarts same file!
12/8/2005 10:32:12 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 10:32:12 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 10:32:13 AM||request_reschedule_cpus: process exited
12/8/2005 11:32:13 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 11:32:13 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 11:32:37 AM||request_reschedule_cpus: process exited
12/8/2005 10:44:01 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
+++ Omitted for brevity: Multiple attempts to upload 4 WUs to SETI in vain... all "SYSTEM I/O" error messages.
+++ EAH finishes computation, but CPN restarts results for its same file. The reason EAH can't connect is that I haven't been able to get it to use my dial-up connection on its own. It keeps disconnecting after making a connection. I'm working on this!
12/8/2005 11:50:25 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM|Einstein@Home|Computation for result l1_0570.5__0570.8_0.1_T08_S4lD_2 finished
12/8/2005 11:50:25 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 11:50:27 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:50:27 AM||Couldn't connect to hostname []
12/8/2005 11:50:28 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:50:28 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM||Couldn't connect to hostname []
12/8/2005 11:51:29 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:51:29 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM||Couldn't connect to hostname []
12/8/2005 11:52:30 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:52:30 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:30 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:31 AM||Couldn't connect to hostname []
12/8/2005 11:53:31 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:53:31 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:31 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:32 AM||Couldn't connect to hostname []
12/8/2005 11:54:32 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:54:32 AM|Einstein@Home|Backing off 1 minutes and 37 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:10 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:11 AM||Couldn't connect to hostname []
12/8/2005 11:56:11 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:56:11 AM|Einstein@Home|Backing off 3 minutes and 52 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:04 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:05 PM||Couldn't connect to hostname []
12/8/2005 12:00:05 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:00:05 PM|Einstein@Home|Backing off 17 minutes and 39 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:45 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:46 PM||Couldn't connect to hostname []
12/8/2005 12:17:46 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:17:46 PM|Einstein@Home|Backing off 35 minutes and 57 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
+++ I got on the Internet here and EAH ULs the file, but fails to get another WU.
12/8/2005 12:34:43 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 12:34:43 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Finished upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Throughput 3217 bytes/sec
12/8/2005 12:34:59 PM|Einstein@Home|Sending scheduler request to
12/8/2005 12:34:59 PM|Einstein@Home|Reason: To report results
12/8/2005 12:34:59 PM|Einstein@Home|Reporting 1 results
12/8/2005 12:35:05 PM|Einstein@Home|Scheduler request to succeeded
+++ Multiple attempts to connect to SETI in vain. 30 lines. All "System I/O" error messages. I omitted these for brevity.
+++ No WU downloaded for EAH, so I tried "Retry Communications" and only received the following.
+++ No WU downloaded for EAH, but see the next line? EAH is restarting to crunch the same file AND CPN is doing the same shortly thereafter.
12/8/2005 11:32:13 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 11:32:13 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 11:32:37 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM|Einstein@Home|Computation for result l1_0570.5__0570.8_0.1_T08_S4lD_2 finished
12/8/2005 11:50:25 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 11:50:27 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:50:27 AM||Couldn't connect to hostname []
12/8/2005 11:50:28 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:50:28 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM||Couldn't connect to hostname []
12/8/2005 11:51:29 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:51:29 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM||Couldn't connect to hostname []
12/8/2005 11:52:30 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:52:30 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:30 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:31 AM||Couldn't connect to hostname []
12/8/2005 11:53:31 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:53:31 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:31 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:32 AM||Couldn't connect to hostname []
12/8/2005 11:54:32 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:54:32 AM|Einstein@Home|Backing off 1 minutes and 37 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:10 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:11 AM||Couldn't connect to hostname []
12/8/2005 11:56:11 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:56:11 AM|Einstein@Home|Backing off 3 minutes and 52 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:04 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:05 PM||Couldn't connect to hostname []
12/8/2005 12:00:05 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:00:05 PM|Einstein@Home|Backing off 17 minutes and 39 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:45 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:46 PM||Couldn't connect to hostname []
12/8/2005 12:17:46 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:17:46 PM|Einstein@Home|Backing off 35 minutes and 57 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
+++ EAH finally ULs its file, but no DL'd WU.
12/8/2005 12:34:43 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Finished upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Throughput 3217 bytes/sec
12/8/2005 12:34:59 PM|Einstein@Home|Sending scheduler request to
12/8/2005 12:34:59 PM|Einstein@Home|Reason: To report results
12/8/2005 12:34:59 PM|Einstein@Home|Reporting 1 results
12/8/2005 12:35:05 PM|Einstein@Home|Scheduler request to succeeded
+++ From 12:37:53 PM to 1:32:14 PM, omitted 31 lines for SETI (in vain).
+++ No WU DL'd, and none showing under the Work tab.
+++ From 1:32:41 PM to 8:16:36 PM, 86 lines in attempts to Upload to SETI in vain. But there were 2 messages RE: SETI that weren't a "SYSTEM I/O" or "error 500", I included a few lines before & after this message.
12/8/2005 5:17:49 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 5:17:49 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 5:21:18 PM||Attempting to send data to [] failed [failed sending data to the peer]
12/8/2005 5:21:18 PM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system write
12/8/2005 5:21:18 PM|SETI@home|Backing off 3 hours, 44 minutes, and 40 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ But lo and behold - a Download!
12/8/2005 8:16:36 PM|SETI@home|Started download of 15oc03aa.1719.32993.742326.236
12/8/2005 8:18:09 PM|SETI@home|Finished download of 15oc03aa.1719.32993.742326.236
12/8/2005 8:18:09 PM|SETI@home|Throughput 3932 bytes/sec
12/8/2005 8:18:10 PM||request_reschedule_cpus: files downloaded
12/8/2005 8:19:39 PM||request_reschedule_cpus: project op
+++ Here I tried to get a WU for EAH as I still had nothing showing under the work tab.
+++ This is followed by 61 lines to 12/9/2005 2:44:53 AM. Attempts to UL 4 WUs to SETI in vain. All "SYSTEM I/O" error messages, with the exception that the last 3 were "error 500". That was when I got off the Internet. Hmmm.
Still no WU for EAH!
After 16:16:00 CPU Time, ClimatePredictionNet is only 0.60 % completed AND it's saying it will take another 1,143:39:31 to complete! And the time is slowly getting greater instead of shorter! Good Lord! That's about 5 months! Something is amiss!
I suspended the CPN WU, got on the Internet to post this on the message board, and CPN DL'd another WU and started crunching that. It too showed it would take over 1141 hours to completion. I suspended CPN altogether so it wouldn't DL any more WUs. Lo and behold! A WU from EAH DLs and starts running! You did say to trust BOINC, now didn't you?
Here's the messages from when I got on the Internet a few moments ago. Note that CPN removed from memory the results of the file I originally DL'd and worked at for 16 hours, although the file & stats still appear in the Work tab. Maybe they'll disappear later on when there's some free CPU cycles. Okay, it's snowing and I've got to get some sleep before I have to get the wife's shovel and snow blower out for her. I'm a "sidewalk superintendent" don't you know! ;0)
12/9/2005 3:53:54 AM||Sending scheduler request to
12/9/2005 3:53:54 AM||Reason: To fetch work
12/9/2005 3:53:54 AM||Requesting 1 seconds of new work
12/9/2005 3:54:04 AM||Scheduler request to succeeded
12/9/2005 3:54:06 AM||Started download of
12/9/2005 3:54:17 AM||Finished download of
12/9/2005 3:54:17 AM||Throughput 1729 bytes/sec
12/9/2005 3:54:18 AM||request_reschedule_cpus: files downloaded
12/9/2005 3:54:18 AM||Starting result sulphur_f6yj_000708859_0 using sulphur_cycle version 422
12/9/2005 3:56:00 AM||request_reschedule_cpus: project op
12/9/2005 3:56:00 AM||Pausing result sulphur_f6yj_000708859_0 (removed from memory)
12/9/2005 3:56:04 AM|Einstein@Home|Sending scheduler request to
12/9/2005 3:56:04 AM|Einstein@Home|Reason: To fetch work
12/9/2005 3:56:04 AM|Einstein@Home|Requesting 8640 seconds of new work
12/9/2005 3:56:09 AM|Einstein@Home|Scheduler request to succeeded
12/9/2005 3:56:11 AM||request_reschedule_cpus: files downloaded
12/9/2005 3:56:11 AM|Einstein@Home|Starting result l1_0570.5__0570.7_0.1_T09_S4lD_3 using einstein version 479
12/9/2005 4:07:07 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/9/2005 4:10:17 AM|SETI@home|Temporarily failed upload of 15ap04aa.22158.4642.798576.145_2_0: error 500
12/9/2005 4:10:17 AM|SETI@home|Backing off 2 hours, 25 minutes, and 43 seconds on upload of file 15ap04aa.22158.4642.798576.145_2_0
12/9/2005 4:16:52 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/9/2005 4:20:03 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: error 500
12/9/2005 4:20:03 AM|SETI@home|Backing off 2 hours, 30 minutes, and 17 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
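For anyone wading through a long Messages paste like the one above, a small script can tally the failures per project instead of counting by eye. This is only a sketch: it assumes each log line has the form "timestamp|project|message" as shown in the paste, and the function names `parse_line` and `count_failures` are made up for illustration, not part of BOINC.

```python
from collections import Counter


def parse_line(line):
    """Split 'timestamp|project|message'; project is '' when none is shown."""
    timestamp, project, message = line.split("|", 2)
    return timestamp, project, message


def count_failures(lines):
    """Count 'Temporarily failed ...' messages per project."""
    failures = Counter()
    for line in lines:
        if line.count("|") < 2:
            continue  # skip commentary such as the '+++' notes
        _, project, message = parse_line(line)
        if message.startswith("Temporarily failed"):
            failures[project] += 1
    return failures


# Sample lines copied from the log above.
sample = [
    "12/8/2005 6:53:53 AM||Couldn't connect to hostname []",
    "12/8/2005 6:53:54 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system I/O",
    "+++ EAH restarts result for same file AND CPN removes result from memory!",
    "12/8/2005 11:50:28 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O",
    "12/9/2005 4:10:17 AM|SETI@home|Temporarily failed upload of 15ap04aa.22158.4642.798576.145_2_0: error 500",
]
print(count_failures(sample))
```

Run against a full paste, a tally like this makes it easier to see whether one project (here SETI) accounts for most of the trouble.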
Mine says that all the time. Every time it switches from one work unit to the next, i.e. SETI to EAH, I get the same "removed from memory" message... but when it switches back again it appears to just continue the work from where it left off an hour before. I thought it was just a standard message. Does this mean I have another glitch I didn't know about...??
I'll save the best till last, but if you begin to get a sour stomach seeing the error messages, jump down to the end of the message for some relief. At the bottom of this message there's a NEW PROBLEM (bug?) that has cropped up with ClimatePredictionNet! Ouch!
Jim, there's nothing wrong with any of what you posted, other than the SETI problems. It's all normal behavior. I'll attempt to explain why...
+++ OK! Here's where EAH removes the result from memory as though it has UL'd, which it hasn't yet.
12/8/2005 6:31:46 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
If you look on your general preferences webpage, you'll see that "leave applications in memory" is set to "no". Generally, it is better to leave this at "yes", as otherwise you _will_ lose a little bit of crunching time (it will restart from last checkpoint instead of from exactly where it left off) so you probably should change this, but with SETI, Einstein, and CPDN, it's not a real problem either way. Some projects will error out if this is a "no".
+++ EAH restarts result for same file AND CPN removes result from memory!
Again - you have "switch between" set to the default 60 minutes, so when everything is working perfectly, it's going to stop work on one and start another, every hour.
+++ No WU downloaded for EAH, so I tried "Retry Communications" and only received the following.
12/8/2005 1:32:18 PM|Einstein@Home|Note: not requesting new work or reporting results
This is saying that BOINC doesn't "want" Einstein work right now, it has enough of SETI and CPDN, and Einstein is a bit "ahead" of where your resource share settings say it should be.
+++ No WU downloaded for EAH, but see the next line? EAH is restarting to crunch the same file AND CPN is doing the same shortly thereafter.
This is not at an hour break, because you manually updated. That made it recalculate which project was "farthest behind", and switch to it.
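To make the "farthest behind" idea concrete, here is a toy sketch in Python. It is not BOINC's actual scheduler: the function name `farthest_behind`, the shortfall formula and the CPU-time numbers are all assumptions made up for illustration. Each project is treated as being owed a slice of the total CPU time in proportion to its resource share, and the project with the largest shortfall is the one to run next.

```python
def farthest_behind(shares, cpu_seconds):
    """Return the project whose CPU time lags its resource share the most.

    shares: project -> resource share (any positive numbers)
    cpu_seconds: project -> CPU seconds received so far
    Toy model only; the real client tracks more state than this.
    """
    total_share = sum(shares.values())
    total_cpu = sum(cpu_seconds.get(p, 0.0) for p in shares)
    # Shortfall = what the share entitles the project to, minus what it got.
    shortfall = {
        p: total_cpu * shares[p] / total_share - cpu_seconds.get(p, 0.0)
        for p in shares
    }
    return max(shortfall, key=shortfall.get)


# Hypothetical numbers: equal shares, Einstein has had the most CPU time.
shares = {"SETI@home": 100, "Einstein@Home": 100, "CPDN": 100}
used = {"SETI@home": 3600.0, "Einstein@Home": 9000.0, "CPDN": 5400.0}
print(farthest_behind(shares, used))
```

With these made-up figures Einstein is "ahead" of its share, so it would neither run next nor be first in line for new work, which is the same flavour of decision as the "not requesting new work" message.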
+++ But lo and behold - a Download!
12/8/2005 8:16:36 PM|SETI@home|Started download of 15oc03aa.1719.32993.742326.236
+++ Here I tried to get a WU for EAH as I still had nothing showing under the work tab.
12/8/2005 8:19:44 PM|Einstein@Home|Note: not requesting new work or reporting results
It decided that the number of SETI's in your cache was low, but it still says Einstein is far enough ahead that it still doesn't want any.
After 16:16:00 CPU Time, ClimatePredictionNet is only 0.60 % completed AND it's saying it will take another 1,143:39:31 to complete! And the time is slowly getting greater instead of shorter! Good Lord! That's about 5 months! Something is amiss!
No... CPDN's "sulphur" WUs have a ONE YEAR deadline. They take months to run!!! The "estimated" time to complete is a guess from the project based on your benchmarks. As it runs, BOINC can make a better guess of how fast your CPU will actually finish the result, so it updates this time continuously. It could easily continue to grow all the way to 50% completion, three or four or six months from now.
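A quick back-of-the-envelope check shows why the figure can keep growing. The sketch below simply extrapolates from the numbers quoted above (16:16:00 of CPU time at 0.60 % done) and compares the result with the 1,143:39:31 shown. The helper names are hypothetical, and the straight-line extrapolation is an assumption for illustration; it is not claimed to be the formula the client uses.

```python
def hms_to_hours(text):
    """Convert 'H:MM:SS' (hours may contain commas) to fractional hours."""
    hours, minutes, seconds = text.replace(",", "").split(":")
    return int(hours) + int(minutes) / 60 + int(seconds) / 3600


def remaining_by_progress(elapsed_hours, fraction_done):
    """Naive extrapolation: assume the rest runs at the rate seen so far."""
    return elapsed_hours * (1 - fraction_done) / fraction_done


elapsed = hms_to_hours("16:16:00")        # CPU time so far
shown = hms_to_hours("1,143:39:31")       # "to completion" as displayed
extrapolated = remaining_by_progress(elapsed, 0.0060)

print(f"shown:        {shown:.0f} h, about {shown / 24:.0f} days of CPU time")
print(f"extrapolated: {extrapolated:.0f} h, about {extrapolated / 24:.0f} days of CPU time")
```

The extrapolated figure comes out well above the displayed one, so if the displayed estimate is gradually pulled toward actual progress, it will rise for a good while before it starts to fall.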
I suspended the CPN WU, got on the Internet to post this on the message board, and CPN DL'd another WU and started crunching that. It too showed it would take over 1141 hours to completion. I suspended CPN altogether so it wouldn't DL any more WUs. Lo and behold! A WU from EAH DLs and starts running! You did say to trust BOINC, now didn't you?
Suspending a single WU does just that; suspends that WU. So BOINC looks and says "hey, you said to do CPDN, but I don't have one, I need to download one" and does. Suspending the _project_ CPDN on the other hand says "don't do ANY CPDN work" - so it looked just at SETI and Einstein, and said "well, Einstein is still 'ahead', but not too far... I'll get some work."
Here's the messages from when I got on the Internet a few moments ago. Note that CPN removed from memory the results of the file I originally DL'd and worked at for 16 hours, although the file & stats still appear in the Work tab. Maybe they'll disappear later on when there's some free CPU cycles.
Nope - they'll sit right there until you either "abort" the WU or unsuspend it and it completes. (Yeah, right... especially competing with the other one...)
Here's what you need to do. Resume the suspended CPDN _WORK_UNIT_. Then resume the CPDN _project_, but put it on "no new work". It should start to work (maybe at the next hour break) on the original one, which I assume will have more time on it than the new one (zero or nearly zero). In the work tab, click on the CPDN unit with the least work (this way you'll waste the bare minimum of time). Hit "Abort" on it. Then suspend the _other_ (running) WU (still in the work tab). This will force BOINC to work on the one you have said to abort - in just a few seconds, it should upload the aborted WU to CPDN and get it off your system. Once it has, resume the remaining 'good' CPDN WU. I would leave the project set on "no new work" and _only_ get CPDN work manually, when you want to...
Now - other than changing "leave in memory" to yes, so you lose the minimum number of crunching minutes possible... you're done. If the three projects have equal resource shares, you can sit back and watch it run. "Trust BOINC" - each week (not necessarily each _day_) it will give the "right" amount of CPU time to each project. When the CPDN WU is somewhere around 12 CPU hours, go look at your CPDN account - you'll have credit (161 of them, to be exact). CPDN gives you credit something like 192? times per WU in "trickles"; I haven't been running mine long enough to really be familiar with their system.
There are two things you need to worry about. First, if you get too many SETI results stuck in a "downloading" state, it can cause BOINC to stop getting work from Einstein. Ignore that for the first day. If the second day it still has WU's "stuck", and still isn't getting Einstein, then suspend the SETI project for a day or two, until they have their problems solved.
Second, if four or five months from now you haven't made significant progress on that CPDN result, BOINC _could_ decide that it is in danger of not finishing in time, and will devote 100% of its time to CPDN. That's really still OK, but instead of your resource share being honored "weekly", suddenly it's only going to be honored over the next year... and you probably don't want that. So make sure CPDN gets at least 30% of your resource share, and keep an eye on that "to completion" figure. If it starts looking like it's not going to finish by the deadline, you should give CPDN a bigger share of your computer, and reconsider whether your computer is fast enough to run three projects with one of them being CPDN.
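The "will it finish by the deadline" question is simple arithmetic, sketched below. The assumptions are loud ones: the machine crunches 24 hours a day, the project really gets its resource share of the CPU, and the "to completion" figure is accurate. The function names are made up for illustration.

```python
def days_needed(remaining_cpu_hours, share_fraction, hours_on_per_day=24.0):
    """Wall-clock days to finish if the project gets share_fraction of the CPU."""
    return remaining_cpu_hours / (share_fraction * hours_on_per_day)


def will_make_deadline(remaining_cpu_hours, share_fraction, days_left,
                       hours_on_per_day=24.0):
    """True if the estimated wall-clock time fits inside the days left."""
    needed = days_needed(remaining_cpu_hours, share_fraction, hours_on_per_day)
    return needed <= days_left


# 1,143:39:31 to completion, three projects with equal shares, one-year deadline.
print(round(days_needed(1143.66, 1 / 3)))
print(will_make_deadline(1143.66, 1 / 3, 365))
```

At an equal one-third share, the displayed estimate works out to a bit under five months of wall-clock time, comfortably inside a one-year deadline. If the estimate were to double, or the machine ran only part of the day, the margin would shrink quickly, which is why the "to completion" figure is worth watching.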
@Lynette - your answer's buried in there too... :-)
Thanks for setting me straight, Bill. Now I understand, especially about CPDN's deadline time. That really had me worried. I was beginning to think there was something wrong with my CPU!
Gary was right - just trust BOINC! It'll do its thing, in due course.
I'll do what you suggest and get back to my original CPDN WU. No sense in wasting those 16+ hours of CPU time.
Trust BOINC... Trust BOINC... Trust BOINC. Okay, that's implanted in my head. ;o)
Trust BOINC... Trust BOINC... Trust BOINC. Okay, that's implanted in my head. ;o)
Jim, I don't want to give you the impression that BOINC is perfect... So PLEASE feel free to continue to ask questions, whenever you have one. What I'm trying to get across to EVERYONE, is that when it seems like something is wrong, 90% of the time, the best thing to do is "nothing". Everyone has this itch to "try something" to fix a problem. The things _you_ have tried are very logical, natural possibilities, with low risk. Some people on the other hand will just hit every button and select every menu option, at random, hoping something will work. Then, after they've mangled everything beyond repair, they come here and scream that "bionic sux".
With the current SETI problems, many people _are_ finding it necessary to Suspend SETI. This is an exceptional case; we have a project that is "lying to the scheduler", saying "this work is about to be there", and the scheduler is doing its job of protecting us from having too much work to finish in time. This makes people think the scheduler isn't working, because they don't have work... Do I, personally, think there's a bug in the scheduler? Yes. Simply because if the CPU is totally idle, "downloading" work should only be counted against the project it's downloading _from_. (This still wouldn't make everybody happy, or get everybody work, but it would solve the REAL issue.) But this is one of those "happens 0.000001% of the time" cases, that could NEVER have been predicted to have the effects it's having. The rules the scheduler is operating under are correct... except. :-)
Whatever is going on is affecting all the projects except for Climate Predictor.
Robert, to help, we need SOME info from you. Go to the Projects tab, select one of the projects that you do NOT have work from, and hit "update". Then go to the Messages tab and copy the messages you get in response and paste them here.
Gary, I suspended SETI. However, each of the other agents that are running provides the error message "communication deferred for X minutes" when asked for an update.
RE: This almost never
My thoughts exactly. Too many people just start punching whatever buttons they find, whether they're having a problem or not. "Reset" and "Detach" should be buried a bit, under a warning, at least.
"The arc of history is long, but it bends toward justice" - MLK
RE: RE: Gary has posted a
I did suspend SETI and Einstein did send work. Finished and uploaded work for Predictor and never heard from it again. At least I'm no longer idle. I'll check tomorrow to see if SETI is feeling better.
Hi Gary
Your suggestions worked a treat, although I did have to download the other files for EAH; it's now crunching away perfectly. But like everybody else, it seems, I still have all my SETI units finished and waiting to upload... 3 days now on my work PC and 4 days on my home PC, so I've taken your suggestion to suspend.
(must be manflu...... ;-)... )
thanks again
Hi Gary,
Well, I hoped I wouldn't see those "Removed from memory" & "Restarting results" messages again, but they cropped up again. Not only for EAH, but CPN as well. (I just joined Thursday AM.) I put pertinent lines in bold print. My comments are in italics preceded by 3 plus signs.
I'll save the best till last, but if you begin to get a sour stomach seeing the error messages, jump down to the end of the message for some relief. At the bottom of this message there's a NEW PROBLEM (bug?) that has cropped up with ClimatePredictionNet! Ouch!
12/8/2005 12:59:23 AM|Einstein@Home|Started upload of l1_0570.5__0570.7_0.1_T08_S4lD_2_0
12/8/2005 12:59:43 AM|Einstein@Home|Finished upload of l1_0570.5__0570.7_0.1_T08_S4lD_2_0
12/8/2005 12:59:43 AM|Einstein@Home|Throughput 2749 bytes/sec
12/8/2005 1:01:57 AM|Einstein@Home|Sending scheduler request to
12/8/2005 1:01:57 AM|Einstein@Home|Reason: To fetch work
12/8/2005 1:01:57 AM|Einstein@Home|Requesting 8640 seconds of new work, and reporting 1 results
12/8/2005 1:02:18 AM||Couldn't connect to hostname []
12/8/2005 1:02:22 AM|Einstein@Home|Scheduler request to failed with a return value of -106
12/8/2005 1:02:22 AM|Einstein@Home|No schedulers responded
12/8/2005 1:03:23 AM|Einstein@Home|Requesting 8640 seconds of new work, and reporting 1 results
12/8/2005 1:03:44 AM||Couldn't connect to hostname []
12/8/2005 1:03:48 AM|Einstein@Home|Scheduler request to failed with a return value of -106
12/8/2005 1:03:48 AM|Einstein@Home|No schedulers responded
12/8/2005 1:04:30 AM||request_reschedule_cpus: project op
12/8/2005 1:20:46 AM|Einstein@Home|Sending scheduler request to
12/8/2005 1:20:46 AM|Einstein@Home|Reason: To fetch work
12/8/2005 1:20:46 AM|Einstein@Home|Requesting 8640 seconds of new work, and reporting 1 results
12/8/2005 1:20:51 AM|Einstein@Home|Scheduler request to succeeded
12/8/2005 1:20:53 AM|Einstein@Home|Starting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
+++ (Signed up & downloading files for ClimatePredictor.Net .)
12/8/2005 4:59:02 AM||Master file download succeeded
12/8/2005 4:59:02 AM||Sending scheduler request to
12/8/2005 4:59:02 AM||Reason: Requested by user
12/8/2005 4:59:02 AM||Requesting 8640 seconds of new work
12/8/2005 4:59:05 AM||Scheduler request to succeeded
12/8/2005 4:59:05 AM||Successfully attached to
12/8/2005 4:59:07 AM||Started download of sulphur_4.22_windows_intelx86.exe
12/8/2005 4:59:07 AM||Started download of
12/8/2005 5:10:30 AM||Finished download of sulphur_4.22_windows_intelx86.exe
12/8/2005 5:10:30 AM||Throughput 3114 bytes/sec
12/8/2005 5:10:30 AM||Started download of
12/8/2005 5:18:25 AM||Finished download of
12/8/2005 5:18:25 AM||Throughput 2532 bytes/sec
12/8/2005 5:18:25 AM||Started download of
12/8/2005 5:27:53 AM||Finished download of
12/8/2005 5:27:53 AM||Throughput 2468 bytes/sec
12/8/2005 5:27:53 AM||Started download of
12/8/2005 5:28:04 AM||Finished download of
12/8/2005 5:28:04 AM||Throughput 1678 bytes/sec
12/8/2005 6:31:45 AM||Finished download of
12/8/2005 6:31:45 AM||Throughput 4328 bytes/sec
12/8/2005 6:31:46 AM||request_reschedule_cpus: files downloaded
+++ OK! Here's where EAH removes the result from memory as though it has UL'd, which it hasn't yet.
12/8/2005 6:31:46 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 6:31:46 AM||Starting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 6:31:47 AM||request_reschedule_cpus: process exited
12/8/2005 6:53:52 AM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 6:53:53 AM||Couldn't connect to hostname []
12/8/2005 6:53:54 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system I/O
12/8/2005 6:53:54 AM|SETI@home|Backing off 3 hours, 15 minutes, and 57 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ EAH restarts result for same file AND CPN removes result from memory!
12/8/2005 7:31:48 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 7:31:48 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 7:31:51 AM||request_reschedule_cpus: process exited
12/8/2005 7:54:56 AM|SETI@home|Started download of 15oc03aa.1719.32993.742326.236
12/8/2005 7:54:56 AM||Couldn't connect to hostname []
12/8/2005 7:54:57 AM|SETI@home|Temporarily failed download of 15oc03aa.1719.32993.742326.236: system I/O
12/8/2005 7:54:57 AM|SETI@home|Backing off 3 hours, 1 minutes, and 39 seconds on download of file 15oc03aa.1719.32993.742326.236
12/8/2005 8:28:57 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:28:58 AM||Couldn't connect to hostname []
12/8/2005 8:28:58 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: system I/O
12/8/2005 8:28:58 AM|SETI@home|Backing off 2 minutes and 28 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:31:27 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/8/2005 8:31:28 AM||Couldn't connect to hostname []
12/8/2005 8:31:28 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: system I/O
12/8/2005 8:31:28 AM|SETI@home|Backing off 2 hours, 53 minutes, and 4 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
+++ EAH removes its result from memory AND CPN restarts same file!
12/8/2005 8:31:52 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 8:31:52 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 8:31:53 AM||request_reschedule_cpus: process exited
12/8/2005 8:38:45 AM|SETI@home|Started upload of 13au01aa.6597.28354.311076.55_3_0
12/8/2005 8:38:46 AM||Couldn't connect to hostname []
12/8/2005 8:38:46 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.28354.311076.55_3_0: system I/O
12/8/2005 8:38:46 AM|SETI@home|Backing off 2 hours, 14 minutes, and 47 seconds on upload of file 13au01aa.6597.28354.311076.55_3_0
+++ EAH restarts result for same file AND CPN removes result from memory!
12/8/2005 9:31:53 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 9:31:53 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 9:32:11 AM||request_reschedule_cpus: process exited
12/8/2005 9:33:35 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 9:33:36 AM||Couldn't connect to hostname []
12/8/2005 9:33:36 AM|SETI@home|Temporarily failed upload of 15ap04aa.22158.4642.798576.145_2_0: system I/O
12/8/2005 9:33:36 AM|SETI@home|Backing off 1 hours, 10 minutes, and 24 seconds on upload of file 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 10:09:51 AM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 10:09:52 AM||Couldn't connect to hostname []
12/8/2005 10:09:52 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system I/O
12/8/2005 10:09:52 AM|SETI@home|Backing off 58 minutes and 54 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ EAH removes its result from memory AND CPN restarts same file!
12/8/2005 10:32:12 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)
12/8/2005 10:32:12 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 10:32:13 AM||request_reschedule_cpus: process exited
12/8/2005 11:32:13 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 11:32:13 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 11:32:37 AM||request_reschedule_cpus: process exited
12/8/2005 10:44:01 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
+++ Omitted for brevity: Multiple attempts to upload 4 WUs to SETI in vain... all "SYSTEM I/O" error messages.
+++ EAH finishes computation, but CPN restarts results for its same file. The reason EAH can't connect is that I haven't been able to get it to use my dial-up connection on its own. It keeps disconnecting after making the connection. I'm working on this!
12/8/2005 11:50:25 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM|Einstein@Home|Computation for result l1_0570.5__0570.8_0.1_T08_S4lD_2 finished
12/8/2005 11:50:25 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 11:50:27 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:50:27 AM||Couldn't connect to hostname []
12/8/2005 11:50:28 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:50:28 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM||Couldn't connect to hostname []
12/8/2005 11:51:29 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:51:29 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM||Couldn't connect to hostname []
12/8/2005 11:52:30 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:52:30 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:30 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:31 AM||Couldn't connect to hostname []
12/8/2005 11:53:31 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:53:31 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:31 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:32 AM||Couldn't connect to hostname []
12/8/2005 11:54:32 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:54:32 AM|Einstein@Home|Backing off 1 minutes and 37 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:10 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:11 AM||Couldn't connect to hostname []
12/8/2005 11:56:11 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:56:11 AM|Einstein@Home|Backing off 3 minutes and 52 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:04 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:05 PM||Couldn't connect to hostname []
12/8/2005 12:00:05 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:00:05 PM|Einstein@Home|Backing off 17 minutes and 39 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:45 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:46 PM||Couldn't connect to hostname []
12/8/2005 12:17:46 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:17:46 PM|Einstein@Home|Backing off 35 minutes and 57 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
+++ I got on the Internet here and EAH ULs the file, but fails to get another WU.
12/8/2005 12:34:43 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 12:34:43 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Finished upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Throughput 3217 bytes/sec
12/8/2005 12:34:59 PM|Einstein@Home|Sending scheduler request to
12/8/2005 12:34:59 PM|Einstein@Home|Reason: To report results
12/8/2005 12:34:59 PM|Einstein@Home|Reporting 1 results
12/8/2005 12:35:05 PM|Einstein@Home|Scheduler request to succeeded
+++ Multiple attempts to connect to SETI in vain. 30 lines. All "System I/O" error messages. I omitted these for brevity.
+++ No WU downloaded for EAH, so I tried "Retry Communications" and only received the following.
12/8/2005 1:32:18 PM|Einstein@Home|Sending scheduler request to
12/8/2005 1:32:18 PM|Einstein@Home|Reason: Requested by user
12/8/2005 1:32:18 PM|Einstein@Home|Note: not requesting new work or reporting results
12/8/2005 1:32:23 PM|Einstein@Home|Scheduler request to succeeded
+++ Tried "Retry Communications" again after getting home from our grandson's school concert.
12/8/2005 8:19:44 PM|Einstein@Home|Sending scheduler request to
12/8/2005 8:19:44 PM|Einstein@Home|Reason: Requested by user
12/8/2005 8:19:44 PM|Einstein@Home|Note: not requesting new work or reporting results
12/8/2005 8:19:49 PM|Einstein@Home|Scheduler request to succeeded
+++ No WU downloaded for EAH, but see the next line? EAH is restarting to crunch the same file AND CPN is doing the same shortly thereafter.
12/8/2005 11:32:13 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479
12/8/2005 11:32:13 AM||Pausing result sulphur_eodw_000684788_0 (removed from memory)
12/8/2005 11:32:37 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM||request_reschedule_cpus: process exited
12/8/2005 11:50:25 AM|Einstein@Home|Computation for result l1_0570.5__0570.8_0.1_T08_S4lD_2 finished
12/8/2005 11:50:25 AM||Restarting result sulphur_eodw_000684788_0 using sulphur_cycle version 422
12/8/2005 11:50:27 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:50:27 AM||Couldn't connect to hostname []
12/8/2005 11:50:28 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:50:28 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:51:28 AM||Couldn't connect to hostname []
12/8/2005 11:51:29 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:51:29 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:52:29 AM||Couldn't connect to hostname []
12/8/2005 11:52:30 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:52:30 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:30 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:53:31 AM||Couldn't connect to hostname []
12/8/2005 11:53:31 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:53:31 AM|Einstein@Home|Backing off 1 minutes and 0 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:31 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:54:32 AM||Couldn't connect to hostname []
12/8/2005 11:54:32 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:54:32 AM|Einstein@Home|Backing off 1 minutes and 37 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:10 AM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 11:56:11 AM||Couldn't connect to hostname []
12/8/2005 11:56:11 AM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 11:56:11 AM|Einstein@Home|Backing off 3 minutes and 52 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:04 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:00:05 PM||Couldn't connect to hostname []
12/8/2005 12:00:05 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:00:05 PM|Einstein@Home|Backing off 17 minutes and 39 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:45 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:17:46 PM||Couldn't connect to hostname []
12/8/2005 12:17:46 PM|Einstein@Home|Temporarily failed upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0: system I/O
12/8/2005 12:17:46 PM|Einstein@Home|Backing off 35 minutes and 57 seconds on upload of file l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 12:34:43 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
+++ EAH finally ULs its file, but no DL'd WU.
12/8/2005 12:34:43 PM|Einstein@Home|Started upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Finished upload of l1_0570.5__0570.8_0.1_T08_S4lD_2_0
12/8/2005 12:34:57 PM|Einstein@Home|Throughput 3217 bytes/sec
12/8/2005 12:34:59 PM|Einstein@Home|Sending scheduler request to
12/8/2005 12:34:59 PM|Einstein@Home|Reason: To report results
12/8/2005 12:34:59 PM|Einstein@Home|Reporting 1 results
12/8/2005 12:35:05 PM|Einstein@Home|Scheduler request to succeeded
+++ From 12:37:53 PM to 1:32:14 PM, omitted 31 lines for SETI (in vain).
12/8/2005 1:32:18 PM|Einstein@Home|Sending scheduler request to
12/8/2005 1:32:18 PM|Einstein@Home|Reason: Requested by user
12/8/2005 1:32:18 PM|Einstein@Home|Note: not requesting new work or reporting results
12/8/2005 1:32:23 PM|Einstein@Home|Scheduler request to succeeded
+++ No WU DL'd, and none showing under the Work tab.
+++ From 1:32:41 PM to 8:16:36 PM, 86 lines of attempts to upload to SETI in vain. But there were 2 messages RE: SETI that weren't a "SYSTEM I/O" or "error 500"; I included a few lines before & after them.
12/8/2005 5:17:49 PM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/8/2005 5:17:49 PM|SETI@home|Started upload of 13au01aa.6597.14242.940902.89_1_0
12/8/2005 5:21:18 PM||Attempting to send data to [] failed [failed sending data to the peer]
12/8/2005 5:21:18 PM|SETI@home|Temporarily failed upload of 13au01aa.6597.14242.940902.89_1_0: system write
12/8/2005 5:21:18 PM|SETI@home|Backing off 3 hours, 44 minutes, and 40 seconds on upload of file 13au01aa.6597.14242.940902.89_1_0
+++ But lo and behold - a Download!
12/8/2005 8:16:36 PM|SETI@home|Started download of 15oc03aa.1719.32993.742326.236
12/8/2005 8:18:09 PM|SETI@home|Finished download of 15oc03aa.1719.32993.742326.236
12/8/2005 8:18:09 PM|SETI@home|Throughput 3932 bytes/sec
12/8/2005 8:18:10 PM||request_reschedule_cpus: files downloaded
12/8/2005 8:19:39 PM||request_reschedule_cpus: project op
+++ Here I tried to get a WU for EAH as I still had nothing showing under the work tab.
12/8/2005 8:19:44 PM|Einstein@Home|Sending scheduler request to
12/8/2005 8:19:44 PM|Einstein@Home|Reason: Requested by user
12/8/2005 8:19:44 PM|Einstein@Home|Note: not requesting new work or reporting results
12/8/2005 8:19:49 PM|Einstein@Home|Scheduler request to succeeded
+++ This is followed by 61 lines to 12/9/2005 2:44:53 AM. Attempts to UL 4 WUs to SETI in vain. All "SYSTEM I/O" error messages, with the exception that the last 3 were "error 500". That was when I got off the Internet. Hmmm.
Still no WU for EAH!
After 16:16:00 CPU Time, ClimatePredictionNet is only 0.60 % completed AND it's saying it will take another 1,143:39:31 to complete! And the time is slowly getting greater instead of shorter! Good Lord! That's about 5 months! Something is amiss!
I suspended the CPN WU, got on the Internet to post this on the message board, and CPN DL'd another WU and started crunching that. It too showed it would take over 1141 hours to completion. I suspended CPN altogether so it wouldn't DL any more WUs. Lo and behold! A WU from EAH DLs and starts running! You did say to trust BOINC, now didn't you?
Here are the messages from when I got on the Internet a few moments ago. Note that CPN removed from memory the results of the file I originally DL'd and worked at for 16 hours, although the file & stats still appear in the Work tab. Maybe they'll disappear later on when there are some free CPU cycles. Okay, it's snowing and I've got to get some sleep before I have to get the wife's shovel and snow blower out for her. I'm a "sidewalk superintendent" don't you know! ;0)
12/9/2005 3:53:54 AM||Sending scheduler request to
12/9/2005 3:53:54 AM||Reason: To fetch work
12/9/2005 3:53:54 AM||Requesting 1 seconds of new work
12/9/2005 3:54:04 AM||Scheduler request to succeeded
12/9/2005 3:54:06 AM||Started download of
12/9/2005 3:54:17 AM||Finished download of
12/9/2005 3:54:17 AM||Throughput 1729 bytes/sec
12/9/2005 3:54:18 AM||request_reschedule_cpus: files downloaded
12/9/2005 3:54:18 AM||Starting result sulphur_f6yj_000708859_0 using sulphur_cycle version 422
12/9/2005 3:56:00 AM||request_reschedule_cpus: project op
12/9/2005 3:56:00 AM||Pausing result sulphur_f6yj_000708859_0 (removed from memory)
12/9/2005 3:56:04 AM|Einstein@Home|Sending scheduler request to
12/9/2005 3:56:04 AM|Einstein@Home|Reason: To fetch work
12/9/2005 3:56:04 AM|Einstein@Home|Requesting 8640 seconds of new work
12/9/2005 3:56:09 AM|Einstein@Home|Scheduler request to succeeded
12/9/2005 3:56:11 AM||request_reschedule_cpus: files downloaded
12/9/2005 3:56:11 AM|Einstein@Home|Starting result l1_0570.5__0570.7_0.1_T09_S4lD_3 using einstein version 479
12/9/2005 4:07:07 AM|SETI@home|Started upload of 15ap04aa.22158.4642.798576.145_2_0
12/9/2005 4:10:17 AM|SETI@home|Temporarily failed upload of 15ap04aa.22158.4642.798576.145_2_0: error 500
12/9/2005 4:10:17 AM|SETI@home|Backing off 2 hours, 25 minutes, and 43 seconds on upload of file 15ap04aa.22158.4642.798576.145_2_0
12/9/2005 4:16:52 AM|SETI@home|Started upload of 13au01aa.6597.24417.429814.109_0_0
12/9/2005 4:20:03 AM|SETI@home|Temporarily failed upload of 13au01aa.6597.24417.429814.109_0_0: error 500
12/9/2005 4:20:03 AM|SETI@home|Backing off 2 hours, 30 minutes, and 17 seconds on upload of file 13au01aa.6597.24417.429814.109_0_0
+++ end of messages.
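For anyone wading through a log like the one above, the retry delays are easier to judge once they are converted to seconds. The following is only a rough sketch in Python: it assumes each line has the `timestamp|project|message` layout shown in the pasted messages, and the function names `backoff_seconds` and `summarize` are made up for illustration; they are not part of BOINC.

```python
import re
from typing import Dict, List, Optional

# Seconds per unit, matching the wording used in the pasted log
# (the client writes plural units even for 1, e.g. "1 minutes").
_UNIT_SECONDS = {"hours": 3600, "minutes": 60, "seconds": 1}
_PART = re.compile(r"(\d+) (hours|minutes|seconds)")


def backoff_seconds(message: str) -> Optional[int]:
    """Return the backoff length in seconds, or None for non-backoff lines."""
    if "Backing off" not in message:
        return None
    # Keep only the duration, e.g. "Backing off 2 minutes and 28 seconds".
    duration = message.split(" on ", 1)[0]
    return sum(int(n) * _UNIT_SECONDS[unit] for n, unit in _PART.findall(duration))


def summarize(log_lines: List[str]) -> Dict[str, List[int]]:
    """Collect backoff durations (in seconds) per project name."""
    per_project: Dict[str, List[int]] = {}
    for line in log_lines:
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue  # not a timestamp|project|message line
        _timestamp, project, message = parts
        secs = backoff_seconds(message)
        if secs is not None:
            per_project.setdefault(project, []).append(secs)
    return per_project
```

Feeding it the SETI@home lines from the log would show at a glance that the delays jump around between a couple of minutes and more than three hours, which fits a server that is simply not answering rather than a fault on the local machine.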
Hi... Mine says that all
Mine says that all the time, every time it switches from one work unit to the next, i.e. SETI to EAH. I get the same "removed from memory" message... but when it switches back again it appears to just continue the work from where it left off an hour before. I thought it was just a standard message; does this mean I have another glitch I didn't know about...??
RE: I'll save the best till
Jim, there's nothing wrong with any of what you posted, other than the SETI problems. It's all normal behavior. I'll attempt to explain why...
If you look on your general preferences webpage, you'll see that "leave applications in memory" is set to "no". Generally, it is better to leave this at "yes", as otherwise you _will_ lose a little bit of crunching time (it will restart from last checkpoint instead of from exactly where it left off) so you probably should change this, but with SETI, Einstein, and CPDN, it's not a real problem either way. Some projects will error out if this is a "no".
Again - you have "switch between" set to the default 60 minutes, so when everything is working perfectly, it's going to stop work on one and start another, every hour.
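The hourly switching can be checked directly against the pasted log. Here is a small Python sketch, assuming the `timestamp|project|message` line layout and the US-style timestamps seen in Jim's messages; `switch_intervals` is a hypothetical helper name, not anything BOINC provides.

```python
from datetime import datetime

# Timestamp layout as it appears in the pasted log, e.g. "12/8/2005 6:31:46 AM".
STAMP = "%m/%d/%Y %I:%M:%S %p"


def switch_intervals(log_lines, project="Einstein@Home"):
    """Minutes between successive pause/restart events for one project."""
    times = []
    for line in log_lines:
        parts = line.split("|", 2)
        if len(parts) != 3 or parts[1] != project:
            continue
        if parts[2].startswith(("Pausing result", "Restarting result")):
            times.append(datetime.strptime(parts[0], STAMP))
    # Pair each event with the next one and round to whole minutes.
    return [round((b - a).total_seconds() / 60) for a, b in zip(times, times[1:])]


sample = [
    "12/8/2005 6:31:46 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)",
    "12/8/2005 7:31:48 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479",
    "12/8/2005 8:31:52 AM|Einstein@Home|Pausing result l1_0570.5__0570.8_0.1_T08_S4lD_2 (removed from memory)",
    "12/8/2005 9:31:53 AM|Einstein@Home|Restarting result l1_0570.5__0570.8_0.1_T08_S4lD_2 using einstein version 479",
]
print(switch_intervals(sample))  # [60, 60, 60]
```

The four sample lines are copied from the log above, and the gaps come out at 60 minutes each, which is just the "switch between" preference doing its job.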
This is saying that BOINC doesn't "want" Einstein work right now, it has enough of SETI and CPDN, and Einstein is a bit "ahead" of where your resource share settings say it should be.
This is not at an hour break, because you manually updated. That made it recalculate which project was "farthest behind", and switch to it.
It decided that the number of SETI's in your cache was low, but it still says Einstein is far enough ahead that it still doesn't want any.
No... CPDN's "sulphur" WUs have a ONE YEAR deadline. They take months to run!!! The "estimated" time to complete is a guess from the project based on your benchmarks. As it runs, BOINC can make a better guess of how fast your CPU will actually finish the result, so it updates this time continuously. It could easily continue to grow all the way to 50% completion, three or four or six months from now.
Suspending a single WU does just that; suspends that WU. So BOINC looks and says "hey, you said to do CPDN, but I don't have one, I need to download one" and does. Suspending the _project_ CPDN on the other hand says "don't do ANY CPDN work" - so it looked just at SETI and Einstein, and said "well, Einstein is still 'ahead', but not too far... I'll get some work."
Nope - they'll sit right there until you either "abort" the WU or unsuspend it and it completes. (Yeah, right... especially competing with the other one...)
Here's what you need to do. Resume the suspended CPDN _WORK_UNIT_. Then resume the CPDN _project_, but put it on "no new work". It should start to work (maybe at the next hour break) on the original one, which I assume will have more time on it than the new one (zero or nearly zero). In the work tab, click on the CPDN unit with the least work (this way you'll waste the bare minimum of time). Hit "Abort" on it. Then suspend the _other_ (running) WU (still in the work tab). This will force BOINC to work on the one you have said to abort - in just a few seconds, it should upload the aborted WU to CPDN and get it off your system. Once it has, resume the remaining 'good' CPDN WU. I would leave the project set on "no new work" and _only_ get CPDN work manually, when you want to...
Now - other than changing "leave in memory" to yes, so you lose the minimum number of crunching minutes possible... you're done. If the three projects have equal resource shares, you can sit back and watch it run. "Trust BOINC" - each week (not necessarily each _day_) it will give the "right" amount of CPU time to each project. When the CPDN WU is somewhere around 12 CPU hours, go look at your CPDN account - you'll have credit (161 of them, to be exact). CPDN gives you credit something like 192? times per WU in "trickles"; I haven't been running mine long enough to really be familiar with their system.
There are two things you need to worry about. First, if you get too many SETI results stuck in a "downloading" state, it can cause BOINC to stop getting work from Einstein. Ignore that for the first day. If the second day it still has WUs "stuck", and still isn't getting Einstein, then suspend the SETI project for a day or two, until they have their problems solved.
Second, if four or five months from now you haven't made significant progress on that CPDN result, BOINC _could_ decide that it is in danger of not finishing in time, and will devote 100% of its time to CPDN. That's really still OK, but instead of your resource share being honored "weekly", suddenly it's only going to be honored over the next year... and you probably don't want that. So make sure CPDN gets at least 30% of your resource share, and keep an eye on that "to completion" figure. If it starts looking like it's not going to finish by the deadline, you should give CPDN a bigger share of your computer, and reconsider if your computer is fast enough to run three projects with one of them being CPDN.
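A quick back-of-the-envelope check for the deadline question, sketched in Python. It assumes the machine crunches a fixed number of hours per day and that the resource share is honored on average; the function names and the 24-hours-a-day default are assumptions made for the example, not BOINC settings.

```python
def days_to_finish(remaining_cpu_hours: float,
                   resource_share: float,
                   hours_on_per_day: float = 24.0) -> float:
    """Calendar days needed if the project gets `resource_share` of the CPU."""
    if not 0 < resource_share <= 1:
        raise ValueError("resource_share must be in (0, 1]")
    if hours_on_per_day <= 0:
        raise ValueError("hours_on_per_day must be positive")
    return remaining_cpu_hours / (resource_share * hours_on_per_day)


def finishes_in_time(remaining_cpu_hours: float,
                     resource_share: float,
                     days_until_deadline: float,
                     hours_on_per_day: float = 24.0) -> bool:
    """True if the estimate fits inside the deadline under these assumptions."""
    needed = days_to_finish(remaining_cpu_hours, resource_share, hours_on_per_day)
    return needed <= days_until_deadline
```

With the 1,143.66 hours currently displayed, a 30% share, and a machine that runs around the clock, `days_to_finish(1143.66, 0.30)` comes to about 159 days, comfortably inside a one-year deadline. Rerun it whenever the "to completion" figure changes, since that figure is expected to move.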
@Lynette - your answer's buried in there too... :-)
Thanks for setting me straight, Bill. Now I understand, especially about CPDN's deadline time. That really had me worried. I was beginning to think there was something wrong with my CPU!
Gary was right - just trust BOINC! It'll do its thing, in due course.
I'll do what you suggest and get back to my original CPDN WU. No sense in wasting those 16+ hours of CPU time.
Trust BOINC... Trust BOINC... Trust BOINC. Okay, that's implanted in my head. ;o)
RE: Trust BOINC... Trust
Jim, I don't want to give you the impression that BOINC is perfect... So PLEASE feel free to continue to ask questions, whenever you have one. What I'm trying to get across to EVERYONE, is that when it seems like something is wrong, 90% of the time, the best thing to do is "nothing". Everyone has this itch to "try something" to fix a problem. The things _you_ have tried are very logical, natural possibilities, with low risk. Some people on the other hand will just hit every button and select every menu option, at random, hoping something will work. Then, after they've mangled everything beyond repair, they come here and scream that "bionic sux".
With the current SETI problems, many people _are_ finding it necessary to Suspend SETI. This is an exceptional case; we have a project that is "lying to the scheduler", saying "this work is about to be there", and the scheduler is doing its job of protecting us from having too much work to finish in time. This makes people think the scheduler isn't working, because they don't have work... Do I, personally, think there's a bug in the scheduler? Yes. Simply because if the CPU is totally idle, "downloading" work should only be counted against the project it's downloading _from_. (This still wouldn't make everybody happy, or get everybody work, but it would solve the REAL issue.) But this is one of those "happens 0.000001% of the time" cases, that could NEVER have been predicted to have the effects it's having. The rules the scheduler is operating under are correct... except. :-)
RE: RE: What ever is
Gary, I suspended SETI. However, each of the other agents that are running provides the error message "communication deferred for X minutes" when asked for an update.