Scheduler request failed: HTTP internal server error

PCRCC
Joined: 23 Oct 07
Posts: 4
Credit: 126574491
RAC: 5304
Topic 194666

Did you see something like this?

09/12/2009 20:15:50 Einstein@Home Sending scheduler request: To fetch work.
09/12/2009 20:15:50 Einstein@Home Requesting new tasks for GPU
09/12/2009 20:17:55 Einstein@Home Scheduler request failed: HTTP internal server error

I'm running BOINC client version 6.10.18 for windows_intelx86 with an ATI GPU, and both CPU and GPU are working fine for other projects (SETI, Collatz, etc.).

Pooh Bear 27
Joined: 20 Mar 05
Posts: 1376
Credit: 20312671
RAC: 0

Scheduler request failed: HTTP internal server error

There are no ATI tasks here, so you will not be getting work for them. The issue is that the BOINC server is set to respond differently depending on your client.

Gallandro
Joined: 8 Dec 05
Posts: 2
Credit: 507157
RAC: 0

RE: There are no ATI tasks

Message 95895 in response to message 95894

Quote:
There are no ATI tasks here, so you will not be getting work for them. The issue is that the BOINC server is set to respond differently depending on your client.

Sorry, but I don't understand.

I'm seeing the exact same messages as the OP.

Are you saying that because of my hardware configuration (i.e. my ATI video card) that Einstein@Home will never work for me?

PCRCC
Joined: 23 Oct 07
Posts: 4
Credit: 126574491
RAC: 5304

Ok, no Einstein@Home work for ATI GPUs, but

07-Dec-2009 22:36:09 [Einstein@Home] Sending scheduler request: To fetch work.
07-Dec-2009 22:36:09 [Einstein@Home] Requesting new tasks for GPU
07-Dec-2009 22:36:44 [Einstein@Home] Scheduler request completed: got 0 new tasks
07-Dec-2009 22:36:44 [Einstein@Home] Message from server: No work sent
07-Dec-2009 22:36:44 [Einstein@Home] Message from server: Your computer has no NVIDIA GPU

and less than 4 minutes later...

07-Dec-2009 22:40:05 [Einstein@Home] Sending scheduler request: To fetch work.
07-Dec-2009 22:40:05 [Einstein@Home] Requesting new tasks for GPU
07-Dec-2009 22:42:10 [Einstein@Home] Scheduler request failed: HTTP internal server error

and since then, with no change on my side, the server always reports the same error, even when the request is for both GPU and CPU (CPU-only requests have no problem).
One problem is that when the client requests new tasks for GPU and CPU together, the server failure means no work is sent for the CPU either, so no work at all is sent for crunching.
The BOINC client also waits the full 2 minutes 5 seconds before the request ends, without starting another one, and for those without a continuous Internet connection that is a waste of time.
And client 6.10.18 is the BOINC recommended version for Windows, so it should not be a protocol error.
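
The 2 minutes 5 seconds can be checked directly from the client messages. The following is only a rough sketch in Python, not part of BOINC: the function names are made up for illustration, and it assumes the `DD-Mon-YYYY HH:MM:SS` timestamp layout shown above and an English locale for the month abbreviation.

```python
from datetime import datetime

# Client messages copied from the log above.
LOG = """\
07-Dec-2009 22:40:05 [Einstein@Home] Sending scheduler request: To fetch work.
07-Dec-2009 22:40:05 [Einstein@Home] Requesting new tasks for GPU
07-Dec-2009 22:42:10 [Einstein@Home] Scheduler request failed: HTTP internal server error
"""

STAMP_FORMAT = "%d-%b-%Y %H:%M:%S"  # assumes English month names


def parse_stamp(line):
    """Parse the 20-character timestamp at the start of a client log line."""
    return datetime.strptime(line[:20], STAMP_FORMAT)


def failed_request_durations(log_text):
    """Return seconds between each 'Sending scheduler request' and its failure."""
    durations = []
    sent_at = None
    for line in log_text.splitlines():
        if "Sending scheduler request" in line:
            sent_at = parse_stamp(line)
        elif "Scheduler request failed" in line and sent_at is not None:
            durations.append((parse_stamp(line) - sent_at).total_seconds())
            sent_at = None
        elif "Scheduler request completed" in line:
            # A successful request is not counted.
            sent_at = None
    return durations


print(failed_request_durations(LOG))  # [125.0]
```

The 125 seconds is the whole time the client spends on a request that ends with no work at all.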

I tried restarting the project, but the server error is still there (no surprise, I think).
Then I tried detach/attach. Now it's worse. Requests for CPU-only work get the same internal server error. So no E@H WU of any kind is being sent to my Pentium/ATI computer.

Maybe someone did something to the E@H server just before 07-Dec-2009 22:40:05 (UTC+1)? SETI did not work with ATI, but it did not produce this error.

PCRCC
Joined: 23 Oct 07
Posts: 4
Credit: 126574491
RAC: 5304

More information:

The scheduler log shows something weird at 2009-12-07 21:36:09.7217 (UTC), and after that it is Apache that complains, as here:

2009-12-07 21:40:06.5495 [PID=1457 ] Request: [HOST#1042085] client 6.10.18
...
2009-12-07 21:40:06.9728 [PID=1457 ] [locality] send_new_file_work(0): try to send working set
2009-12-07 21:40:06.9728 [PID=1457 ] [locality] Host speed 0.002362

2009-12-07 21:42:07.1334 [PID=1457 ] [CRITICAL] Caught SIGTERM (sent by Apache); exiting

As the detach/attach exercise should have cleaned up the client, the other possibility is corruption of the DB on the server.
Any other ideas?
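
To put a number on the gap in the scheduler log, here is a small Python sketch using the three lines above. The helper names are made up for illustration, and it assumes the `YYYY-MM-DD HH:MM:SS.ffff` timestamp layout shown in the excerpt. Reading the result as a timeout on the Apache side is only a guess; nothing in the log confirms it.

```python
from datetime import datetime, timedelta

SCHED_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Scheduler log lines copied from the excerpt above (UTC).
REQUEST = "2009-12-07 21:40:06.5495 [PID=1457 ] Request: [HOST#1042085] client 6.10.18"
LAST_STEP = "2009-12-07 21:40:06.9728 [PID=1457 ] [locality] Host speed 0.002362"
KILLED = "2009-12-07 21:42:07.1334 [PID=1457 ] [CRITICAL] Caught SIGTERM (sent by Apache); exiting"


def parse_sched_stamp(line):
    """Parse the 24-character timestamp at the start of a scheduler log line."""
    return datetime.strptime(line[:24], SCHED_FORMAT)


def seconds_between(first_line, second_line):
    """Elapsed seconds from first_line to second_line."""
    delta = parse_sched_stamp(second_line) - parse_sched_stamp(first_line)
    return delta.total_seconds()


# Time with no log output before Apache ended the process.
print(f"{seconds_between(LAST_STEP, KILLED):.1f}")  # 120.2
# Whole lifetime of the request on the server.
print(f"{seconds_between(REQUEST, KILLED):.1f}")    # 120.6
# Same request in my local time (UTC+1), to match the client messages.
print(parse_sched_stamp(REQUEST) + timedelta(hours=1))
```

So the scheduler process logged nothing for about two minutes after the last [locality] line, which lines up with the 2 minutes 5 seconds seen on the client side.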

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118666893789
RAC: 19176295

RE: More information: The

Message 95898 in response to message 95897

Quote:

More information:

The scheduler log shows something weird at 2009-12-07 21:36:09.7217 (UTC), and after that it is Apache that complains, as here:

2009-12-07 21:40:06.5495 [PID=1457 ] Request: [HOST#1042085] client 6.10.18
...
2009-12-07 21:40:06.9728 [PID=1457 ] [locality] send_new_file_work(0): try to send working set
2009-12-07 21:40:06.9728 [PID=1457 ] [locality] Host speed 0.002362

2009-12-07 21:42:07.1334 [PID=1457 ] [CRITICAL] Caught SIGTERM (sent by Apache); exiting

As the detach/attach exercise should have cleaned up the client, the other possibility is corruption of the DB on the server.
Any other ideas?


Thanks very much for the report.

I'm also seeing the same HTTP internal server error and I've sent a message to the Devs for their attention.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118666893789
RAC: 19176295

RE: RE: There are no ATI

Message 95899 in response to message 95895

Quote:
Quote:
There are no ATI tasks here, so you will not be getting work for them. The issue is that the BOINC server is set to respond differently depending on your client.

Sorry, but I don't understand.

I'm seeing the exact same messages as the OP.

Are you saying that because of my hardware configuration (i.e. my ATI video card) that Einstein@Home will never work for me?


No, you can still get CPU work if your preferences allow it.

I have a number of machines that have been doing E@H on the CPUs and Milkyway on the ATI HD4850 GPUs. They have been working fine for months but very recently have been getting the same error messages with increasing frequency. It's not really a problem that the client is asking for GPU work. The server should tell the client when it asks that the client doesn't have a suitable GPU or preferences don't allow it, or whatever. The server shouldn't be sending the error message and I've only noticed it in the last few days.

I believe that something may have changed recently on the E@H server that is causing this. If you retry the work request, the server eventually sends the work you need. It appears to be getting worse in this regard (more retries needed) and quite often it says 'resending lost results' which means that work was actually allocated on a previous request but didn't actually get sent at that point. Completed results are also having an issue with reporting. Today, very frequently, I saw results being reported (that had failed being reported previously) where the server would announce that it was ignoring the report as the results had already been successfully reported. This is also a sign of server problems.

EDIT: I thought I was seeing this only on machines running 6.10.x BOINC and fitted with ATI HD4850 GPUs. I've just had a look at machines using 6.2.x BOINC and without an added GPU. The same errors are showing there. I've only been monitoring the GPU equipped machines with any regularity so hadn't noticed it elsewhere.

Cheers,
Gary.

Svenie25
Joined: 21 Mar 05
Posts: 139
Credit: 2436862
RAC: 0

After several messages "Message from server: Server can't open database" now the message is "Message from server: Project is temporarily shut down for maintenance".

Maybe there is something broken? Or maintenance for the new ABP2?

Gallandro
Joined: 8 Dec 05
Posts: 2
Credit: 507157
RAC: 0

I'm not sure if it's related or coincidence, but after the maintenance outage earlier, I'm now getting CPU work.

So... maybe fixed?

PCRCC
Joined: 23 Oct 07
Posts: 4
Credit: 126574491
RAC: 5304

For me it's fixed :-)

Thank you

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118666893789
RAC: 19176295

RE: I'm not sure if it's

Message 95903 in response to message 95901

Quote:

I'm not sure if it's related or coincidence, but after the maintenance outage earlier, I'm now getting CPU work.

So... maybe fixed?


If you take a look at your list of tasks, you can see that you had 8 which are now shown as 'client detached' and now you have got two new ones. It looks like you may have detached from the project or reset the project at some point. The 'client detached' status points to something a bit drastic happening but I wouldn't have thought it was solely caused by server problems unless you took some action as well.

You have a quite new hostID created at 8 Dec 2009 23:40:20 UTC, i.e. around 10:00 AM 9 Dec local time, depending on which state you are in. You have previous history with E@H but most of that came from an Athlon XP which last contacted the server back in 2007. Did your new host first join the project just a couple of days ago?

If you have just rejoined after a long period of absence, BOINC may have been trained to think that your computer's on_frac and active_frac values (which are stored in your state file under the time_stats tag) are extremely low. This will affect BOINC's ability to download work. Of course it will also depend on whether or not you have just restarted Seti as well. Your state file is called 'client_state.xml' and it will exist in your BOINC Data folder. You can browse it with a simple text editor like Notepad but please don't change anything unless you know what you are doing.
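
If you'd rather not hunt through the file by hand, a few lines of Python can pull the values out without changing anything. This is only a sketch under assumptions: it takes the two values to be `on_frac` and `active_frac` inside a `time_stats` element, the sample fragment and its numbers are invented for illustration, and a real client_state.xml may not always parse cleanly as strict XML.

```python
import xml.etree.ElementTree as ET

# Invented fragment for illustration only; a real state file is much larger
# and the numbers here are not from any actual host.
SAMPLE_STATE = """\
<client_state>
  <time_stats>
    <on_frac>0.050000</on_frac>
    <active_frac>0.900000</active_frac>
  </time_stats>
</client_state>
"""


def read_time_stats(xml_text, names=("on_frac", "active_frac")):
    """Return the requested time_stats values as floats (read-only)."""
    root = ET.fromstring(xml_text)
    stats = root.find(".//time_stats")
    if stats is None:
        return {}
    values = {}
    for name in names:
        text = stats.findtext(name)
        if text is not None:
            values[name] = float(text)
    return values


print(read_time_stats(SAMPLE_STATE))
```

For the real file, read it with `open("client_state.xml").read()` from inside the BOINC Data folder and pass the text to `read_time_stats`. The sketch never writes to the file.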

As you have a quad core and you only have 2 active E@H tasks (plus whatever you have from Seti) it does look like BOINC is being restricted in what it can download. Perhaps you'd like to tell us your resource shares, your cache sizes and any other specific preference setting you might have changed so that we can suggest how to get a bit more reserve work in place. The two above mentioned values from your state file would also be interesting to know.

Cheers,
Gary.
