Web replica down

TPCBF

Joined: 24 Nov 12

Posts: 17

Credit: 216330463

RAC: 1552860

RE: They are all flattened

6 Mar 2013 21:49:01 UTC

Message 115156 in response to message 115154

Quote:

They are all flattened out to keep the chaos within bounds. My rough estimate is that getting this fixed will take the rest of the week.

As Einstein@Home is working relatively well and reliably at the moment, I don't think they will bother spending a minute on it.

I'll try one or two things tomorrow to get at least the stats updated again.

BM

Many thanks for the update... ;-)

Seems every time something with the hardware on pretty much any of the DC projects goes tits up, it does so big time... :-(

Ralf

Anonymous

I am not sure its just a

6 Mar 2013 23:18:45 UTC

Message 115157 in response to message 115156

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

I am seeing the following in the online log which I don't understand:

2013-03-06 16:20:12.3545 [PID=12285] Request: [USER#xxxxx] [HOST#6382800] [IP xxx.xxx.xxx.22] client 7.0.29
2013-03-06 16:20:12.3568 [PID=12285] [debug] [HOST#6382800] Resetting nresults_today
2013-03-06 16:20:12.3576 [PID=12285] [handle] [HOST#6382800] [RESULT#351302051] [WU#151592225] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2013-03-06 16:20:12.3576 [PID=12285] [handle] cpu time 0.000000 credit/sec 0.003894, claimed credit 0.000000
2013-03-06 16:20:12.3578 [PID=12285] [handle] [RESULT#351302051] [WU#151592225]: setting outcome SUCCESS
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
2013-03-06 16:20:12.4147 [PID=12285] [send] available disk 87.30 GB, work_buf_min 0
2013-03-06 16:20:12.4147 [PID=12285] [send] active_frac 0.999992 on_frac 0.999911 DCF 0.897591
2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: 0 results, delay req 60.00
2013-03-06 16:20:12.4172 [PID=12285] Scheduler ran 0.069 seconds

What is the "matchmaker" comment about?

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2956889726

RAC: 719998

RE: I am not sure its just

6 Mar 2013 23:53:53 UTC

Message 115158 in response to message 115157

Quote:

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

What is the "matchmaker" comment about?

That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.
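
For anyone who wants to check this in their own scheduler log, here is a minimal Python sketch. The function names are my own invention and not part of BOINC, and the pattern only assumes the `[send] <resource>: req <seconds> sec` line format seen in the log posted above. It pulls out how many seconds of work the client requested per resource; if every value is zero, the client did not ask for work.

```python
import re

# Sample lines copied from the scheduler log posted in this thread.
LOG = """\
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
"""

# Matches e.g. "[send] CPU: req 0.00 sec" and captures the resource and seconds.
REQ = re.compile(r"\[send\] (\w+): req ([\d.]+) sec")


def requested_seconds(log_text):
    """Return {resource: seconds requested} for each '[send] X: req' line."""
    return {m.group(1): float(m.group(2)) for m in REQ.finditer(log_text)}


def host_asked_for_work(log_text):
    """True if the client requested a positive amount of work for any resource."""
    return any(sec > 0 for sec in requested_seconds(log_text).values())


print(requested_seconds(LOG))
print(host_asked_for_work(LOG))
```

A request like the one in this log yields zero for both CPU and CUDA, which is why the reply contains 0 results.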

Darth Beaver

Joined: 28 Jul 08

Posts: 49

Credit: 14208989

RAC: 0

Hi, thanks for the update.

7 Mar 2013 0:01:50 UTC

Message 115159 in response to message 115154

Hi, thanks for the update. So, Scotty, that week you're giving it: is that Trek time? Like the captain says, "So, Scotty, that's 7 hours then"... I bloody hope it's not a week.

paris

Joined: 11 Jan 06

Posts: 50

Credit: 10120977

RAC: 12044

I am having a similar issue

7 Mar 2013 0:49:19 UTC

Message 115160

I am having a similar issue and have had for a few days now. The messages I get are as follows:

2013-03-06 23:47:50.1974 [PID=30612] Request: [USER#xxxxx] [HOST#2952443] [IP xxx.xxx.xxx.47] client 6.10.56
2013-03-06 23:47:50.2035 [PID=30612] [debug] [HOST#2952443] Resetting nresults_today
2013-03-06 23:47:50.2036 [PID=30612] [send] effective_ncpus 2 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 23:47:50.2036 [PID=30612] [send] effective_ngpus 0 max_jobs_on_host_gpu 999999
2013-03-06 23:47:50.2036 [PID=30612] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 23:47:50.2036 [PID=30612] [send] CPU: req 34091.36 sec, 0.20 instances; est delay 0.00
2013-03-06 23:47:50.2037 [PID=30612] [send] work_req_seconds: 34091.36 secs
2013-03-06 23:47:50.2037 [PID=30612] [send] available disk 54.81 GB, work_buf_min 172800
2013-03-06 23:47:50.2037 [PID=30612] [send] active_frac 0.990677 on_frac 0.994236 DCF 1.163416
2013-03-06 23:47:50.2043 [PID=30612] [send] [HOST#2952443] is reliable
2013-03-06 23:47:50.2044 [PID=30612] [send] set_trust: random choice for error rate 0.003387: yes
2013-03-06 23:47:50.2185 [PID=30612] [version] Checking plan class 'BRP4cuda32OSX'
2013-03-06 23:47:50.2191 [PID=30612] [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2013-03-06 23:47:50.2192 [PID=30612] [version] OS version required min: 100800, supplied: 81101
2013-03-06 23:47:50.2192 [PID=30612] [version] Checking plan class 'opencl-ati-lion'
2013-03-06 23:47:50.2192 [PID=30612] [version] OS version required min: 110000, supplied: 81101
2013-03-06 23:47:50.2192 [PID=30612] [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#6 (i686-apple-darwin) min_version 0
2013-03-06 23:47:50.2192 [PID=30612] [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#3 (powerpc-apple-darwin) min_version 0
2013-03-06 23:47:50.2238 [PID=30612] [send] stopping work search - no locality app selected
2013-03-06 23:47:50.2239 [PID=30612] [send] stopping work search - no locality app selected
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) No work sent
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/2952/2952443
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) No work available for the applications you have selected. Please check your preferences on the web site.
2013-03-06 23:47:50.2255 [PID=30612] Sending reply to [HOST#2952443]: 0 results, delay req 60.00
2013-03-06 23:47:50.2258 [PID=30612] Scheduler ran 0.035 seconds

I have noticed that the BRP work generators are frequently not running. Is anyone else having a problem? If I am interpreting the above correctly, there has been an upgrade that requires a newer operating system. Any help or enlightenment would be appreciated.

(edit): I have been running OS X (Tiger) 10.4.11 on a Mac mini core duo for a long time with no problems. Do I now need Lion?

Plus SETI Classic = 21,082 WUs

Nobody316

Joined: 14 Jan 13

Posts: 141

Credit: 2008126

RAC: 0

RE: I have noticed that the

7 Mar 2013 1:30:54 UTC

Message 115161 in response to message 115160

Quote:

I have noticed that the BRP work generators are frequently not running. Is anyone else having a problem? If I am interpreting the above correctly, there has been an upgrade that requires a newer operating system. Any help or enlightenment would be appreciated.

I can't say for sure whether the OS has any effect. I am running Windows 7 x64 and at the moment only run BRP4 for CPU and GPU. I still get work daily with no problems on that front. The only problem at the moment is the stats update on BOINCstats; the stats on this site update fine.

PC setup MSI-970A-G46 AMD FX-8350 8 core OC'd 4.45GHz 16GB ram PC3-10700 Geforce GTX 650Ti Windows 7 x64 Einstein@Home

Anonymous

RE: RE: I am not sure its

7 Mar 2013 4:00:08 UTC

Message 115162 in response to message 115158

Quote:

Quote:
I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

What is the "matchmaker" comment about?

That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I am running on Linux x64. If I update E@H using the "Update" button in BOINC Manager, I download new WUs. It almost seems as though the "automatic" update is not taking place when jobs are complete. I have looked at the various parameters on this site and do not see any that could affect the automatic download of WUs.

What am I missing?

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

RE: RE: RE: I am not

7 Mar 2013 7:36:04 UTC

Message 115163 in response to message 115162

Quote:

Quote:
Quote:
I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

What is the "matchmaker" comment about?

That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I am running on Linux x64. If I update E@H using the "Update" button in BOINC Manager, I download new WUs. It almost seems as though the "automatic" update is not taking place when jobs are complete. I have looked at the various parameters on this site and do not see any that could affect the automatic download of WUs.

What am I missing?

The work cache settings work like this in BOINC version 7:
"Computer is connected to the Internet about every: xx days" is a low-water mark.
"Maintain enough work for an additional xx days" forms a high-water mark on top of it.
BOINC will request enough work to reach low + high and then wait until the cache drops below the low-water mark again before asking for more.
So if you set it to something like 1 + 0.1, BOINC will always keep about one day's worth of work.
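
As a rough illustration of that rule, here is a minimal Python sketch. The function and parameter names are my own, not BOINC's, and the real client also weighs resource share, deadlines and per-resource (CPU/GPU) queues, so treat this as a simplified model only.

```python
def work_request_days(buffered_days, low_water, additional):
    """Days of work to request under the simplified low/high water-mark rule.

    buffered_days : days of work currently in the cache
    low_water     : "Computer is connected to the Internet about every" (days)
    additional    : "Maintain enough work for an additional" (days)
    """
    if buffered_days >= low_water:
        # Still at or above the low-water mark: ask for nothing.
        return 0.0
    # Below the low-water mark: refill up to low + additional.
    return (low_water + additional) - buffered_days


# With the 1 + 0.1 setting: nothing is requested at 1.05 days of cache,
# and roughly 0.2 days are requested once the cache has dropped to 0.9 days.
print(work_request_days(1.05, 1.0, 0.1))
print(work_request_days(0.9, 1.0, 0.1))
```

This is also why a scheduler request can legitimately ask for 0 seconds of work: the cache is still above the low-water mark.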

If you run more than one project, you also have to consider resource share: you have probably run more Einstein work than SETI work recently, and SETI is now being allowed to catch up.

TPCBF

Joined: 24 Nov 12

Posts: 17

Credit: 216330463

RAC: 1552860

RE: That's an interesting

7 Mar 2013 8:02:58 UTC

Message 115164 in response to message 115158

Quote:

That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I run E@H on several different hosts (4x XP, one of them exclusively, and 3x Win7), and they all get new tasks as usual. I have not seen any significant variation in getting new tasks since the server issue showed up. I checked all the machines because I noticed a couple of days ago that the stats weren't updating.
The missing stats update is the only thing I noticed, and as long as everything else seems to be working OK, that's fine with me. It just takes a bit more effort to monitor a few hosts for anything out of the ordinary for a few days...

Ralf

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250470891

RAC: 35329

- stats have been dumped from

7 Mar 2013 11:08:14 UTC

Message 115165

- stats have been dumped from the master DB now, it depends on the stats sites when they'll pick it up.

- work at UWM is progressing faster than feared; we may have the replica working (or at least being worked on) later today.

- BOINC offers three "schedulers", referred to as "old"/"array", "locality" and "matchmaker". On Einstein@Home we're using the array scheduler to send work for BRP(4) and FGRP(2) and the locality scheduler for GW (S6BucketLVE) work; the matchmaker isn't used. The log entry about it can safely be ignored.

- scheduling (i.e. reporting and getting tasks) is completely independent of the replica.

- An App selection in BOINC is opt-in. If the project issues a new application after you once made a selection, you won't get work for this new application until you revisit your preferences and select this application. On Einstein@Home FGRP1 has been superseded by FGRP2, and previous GW apps (S6LV1, S6Bucket and older) by the recent S6BucketLVE. If you once chose to run applications that don't exist anymore, you may not get any work at all. The "run other apps if no work is available for selected apps" setting is meant to work around this problem, but it doesn't work reliably on Einstein@Home. For now you need to revisit your Einstein@Home preferences every time we release a new application. Changing this behavior is under discussion, but hasn't been implemented yet.
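
To make the opt-in rule from the last point concrete, here is a minimal Python sketch. It is a simplified model, not the actual scheduler code; the function name and the `run_other_apps` flag are hypothetical, and the application names are the ones mentioned above.

```python
def sendable_apps(selected, current, run_other_apps=False):
    """Apps work could be sent for under the opt-in rule (simplified model).

    selected       : apps the user ticked in the project preferences
    current        : apps the project currently issues work for
    run_other_apps : stand-in for the "run other apps if no work is
                     available for selected apps" setting
    """
    eligible = set(selected) & set(current)
    if not eligible and run_other_apps:
        # Intended fallback; it is described above as not working
        # reliably on Einstein@Home, so do not count on it.
        return set(current)
    return eligible


current = {"BRP4", "FGRP2", "S6BucketLVE"}
old_selection = {"FGRP1", "S6Bucket"}  # both have been superseded

print(sendable_apps(old_selection, current))        # empty set: no work
print(sendable_apps({"BRP4", "FGRP1"}, current))    # only BRP4 remains
```

A selection that contains only superseded applications intersects with nothing the project currently offers, which is the "No work available for the applications you have selected" case.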
