Serious BUG: Phantom WUs NOT on user client machines but on result pages

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

Message 6412 in response to message 6411

> > How stable is the 4.25 client with Einstein? I've seen references to bugs with
> > Einstein and earlier beta clients.
>
> I don't know. I'm running 4.24 on an XP test machine without obvious
> problems, but I simply don't know if 4.25 is reliable or not. In the worst
> case you can always put 4.19 back!
>
> Cheers,
> Bruce
>
4.25 appears to be stable from here.

Juerschi
Joined: 4 Jan 05
Posts: 62
Credit: 31245
RAC: 0

First I have to say that I updated yesterday from 4.19 to 4.25 and everything is working fine here.

On Host 3794 I have 2 of these so-called phantom WUs. In BOINC Manager I could find only 3 WUs, whereas according to the result page I should have 5.
The IDs of the missing WUs are 419103 and 466949.

Hope this is useful for you.


chelliot
Joined: 9 Feb 05
Posts: 9
Credit: 135202
RAC: 0

Bruce,

It took a while to reproduce, but finally I had a WU failure while I was capturing outside of my firewall. It indeed does include the Content-Length field.

Note that I am still using the 4.19 client here. It appears that the Microsoft Windows 4.25 client does not work under Linux/Wine, so I will have to stay with 4.19 on at least one system for now. I'll go ahead and upgrade the others now that I've recreated the problem and captured the packets outside of my firewall.

The WU is 498669 and the result I was to calculate was 1856576 or H1_0955.9__0956.4_0.1_T10_Test02_0.

Here's the client side log:

--- - 2005-03-10 10:25:30 - May run out of work in 0.01 days; requesting more
Einstein@Home - 2005-03-10 10:25:30 - Requesting 1726 seconds of work
Einstein@Home - 2005-03-10 10:25:30 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2005-03-10 10:25:39 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi failed
Einstein@Home - 2005-03-10 10:25:39 - No schedulers responded
Einstein@Home - 2005-03-10 10:25:39 - Deferring communication with project for 1 minutes and 0 seconds

Here's the HTTP header from your side (remember this is captured outside of my firewall, so I'm not putting in the Content-Length field):

Internet Protocol, Src Addr: 129.89.61.70 (129.89.61.70), Dst Addr: 69.134.215.43 (69.134.215.43)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
Total Length: 1500
Identification: 0x522b (21035)
Flags: 0x04 (Don't Fragment)
Fragment offset: 0
Time to live: 48
Protocol: TCP (0x06)
Header checksum: 0x17a0 (correct)
Source: 129.89.61.70 (129.89.61.70)
Destination: 69.134.215.43 (69.134.215.43)
Transmission Control Protocol, Src Port: 80 (80), Dst Port: 64976 (64976), Seq: 3512258347, Ack: 163238194, Len: 1460
Source port: 80 (80)
Destination port: 64976 (64976)
Sequence number: 3512258347
Next sequence number: 3512259807
Acknowledgement number: 163238194
Header length: 20 bytes
Flags: 0x0010 (ACK)
Window size: 11680
Checksum: 0xd533 (correct)
Hypertext Transfer Protocol
HTTP/1.1 200 OK\r\n
Response Code: 200
Date: Thu, 10 Mar 2005 15:25:32 GMT\r\n
Server: Apache/2.0.52 (Fedora)\r\n
Content-Length: 6239\r\n
Connection: close\r\n
Content-Type: text/plain; charset=UTF-8\r\n
\r\n

I can send you the packet trace via email or extract more information if you require.

Note that, over the just under 2 days I was capturing packets, your side sent 119 HTTP replies that did not include the Content-Length field and 9 replies that did. Only the last reply was sending a new WU result to be calculated, and it appears to be the only one that the 4.19 client does not handle well.
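For anyone who wants to repeat the count on their own capture, here is a rough Python sketch of the tally described above. It assumes the HTTP replies have already been extracted from the trace as raw byte strings; the function names are made up for illustration, and the second sample reply is hypothetical (only the first is copied from the capture in this post).

```python
from typing import Dict, Iterable, Tuple


def parse_header_block(raw: bytes) -> Dict[str, str]:
    """Parse a raw HTTP reply header block into {lowercased name: value}."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    fields: Dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:  # ignore anything that is not a "Name: value" line
            fields[name.strip().lower()] = value.strip()
    return fields


def tally_content_length(replies: Iterable[bytes]) -> Tuple[int, int]:
    """Return (replies with Content-Length, replies without it)."""
    with_cl = without_cl = 0
    for raw in replies:
        if "content-length" in parse_header_block(raw):
            with_cl += 1
        else:
            without_cl += 1
    return with_cl, without_cl


# Header block copied from the capture in this post.
captured = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Thu, 10 Mar 2005 15:25:32 GMT\r\n"
    b"Server: Apache/2.0.52 (Fedora)\r\n"
    b"Content-Length: 6239\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"\r\n"
)
# Hypothetical reply without the field, for contrast (not from the capture).
hypothetical = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
)

print(tally_content_length([captured, hypothetical]))
```

Header names are compared in lowercase because HTTP header names are case-insensitive.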

Back in your court! :-)

Enjoy!
Chris.

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Message 6415 in response to message 6414

Chris,

Nice work!

I've had a long discussion with David Anderson about this. First, the problems provoked by having 'Content-Length' in the reply have been fixed in the core client. But this still leaves the question: where do these come from?

They are not coming from Apache at my end: I've used tcpdump to look at some of the packets. David says that he thinks that your ISP is using some 'hidden proxies' to do filtering before the packets ever get to your network port. He says that depending upon the routing that they use, sometimes they are using one set of filtering and sometimes another.

Personally I am sceptical of this explanation, but can't see any alternative. What do you think?

Cheers,
Bruce

Director, Einstein@Home

chelliot
Joined: 9 Feb 05
Posts: 9
Credit: 135202
RAC: 0

Bruce,

How many packets did you look at? Note that less than 10% of the packets originating from your scheduler showed this field when I was looking--and that was a pretty small sample size--128 packets over a 2-day period.

I'd also suggest downloading Ethereal--it can accept tcpdump-format packets and will give you much more sophisticated packet display and display filtering capabilities. For example, I was able to set up a filter to look for packets with HTTP content type of text/plain (to eliminate packets to/from the browser interface) and where the Content-Length field is present. The display filter for this is as follows:

http.content_type contains "text/plain" && http.content_length
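The same test can be written as a small Python predicate, for people who prefer to post-process headers in a script rather than in Ethereal. This is only a sketch: it assumes the headers of each reply are already available as a name/value mapping, and the function name is invented for illustration.

```python
from typing import Mapping


def matches_display_filter(headers: Mapping[str, str]) -> bool:
    """True when Content-Type contains text/plain AND Content-Length is present."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    is_text_plain = "text/plain" in lowered.get("content-type", "")
    has_length = "content-length" in lowered
    return is_text_plain and has_length


# Headers as they appear in the scheduler reply captured earlier in this thread.
scheduler_reply = {
    "Content-Length": "6239",
    "Connection": "close",
    "Content-Type": "text/plain; charset=UTF-8",
}
print(matches_display_filter(scheduler_reply))
```

Like the display filter, it only checks that the Content-Length field exists, not what value it carries.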

Due to continuation packets, I don't know how to also look for packets delivering new WU results for processing. I did that manually, although the show TCP stream feature in Ethereal can help here.

Note that Ethereal uses tcpdump-style filtering for capturing (and it uses tcpdump or libpcap to actually do the capturing--technically Ethereal doesn't capture packets itself, but it will direct the capture process). It uses an entirely different syntax for display filters that is much richer. However, many users of Ethereal find this baffling at first.

I use Time Warner's Road Runner service. I don't know if they are using any hidden proxies or not. I can try asking some folks that might know, but I don't know if I'll get an answer or not.

It might be interesting to know what ISPs other people experiencing this problem are using.

I'm upgrading my Windows systems to 4.25. I'm convinced that I should be able to make my Linux system be a proxy for itself and filter out the Content-Length field on the system, but I don't have any experience with doing that. Any advice would be appreciated if you or anyone else has experience with similar things--and I'll post something if I get anything working.
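As a starting point for that filtering idea, here is a minimal Python sketch of just the header-rewriting step. It is not a working proxy: it assumes the whole reply is already in memory as bytes, the function name is hypothetical, and nothing here has been tried against the 4.19 client. Because the captured reply carries "Connection: close", a client should in principle be able to find the end of the body from the connection closing once the field is gone.

```python
def strip_content_length(raw_reply: bytes) -> bytes:
    """Return the reply with any Content-Length header line removed."""
    # Split the header block from the body at the first blank line.
    head, separator, body = raw_reply.partition(b"\r\n\r\n")
    kept = [
        line
        for line in head.split(b"\r\n")
        if not line.lower().startswith(b"content-length:")
    ]
    return b"\r\n".join(kept) + separator + body


# Made-up reply for illustration; the body is a placeholder, not real scheduler output.
reply = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 4\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"\r\n"
    b"body"
)
print(strip_content_length(reply).decode("iso-8859-1"))
```

A real proxy would also have to accept the client connection, forward the request, and relay the body as it arrives, none of which is shown here.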

Enjoy!
Chris.

ChinookFoehn
Joined: 22 Jan 05
Posts: 28
Credit: 4637957
RAC: 0

I have, so far, received 12 phantom/ghost work units (g.w.u.).

Is it also the case that w.u.s are only being sent out 1-at-a-time now, with the second unit not being sent until the first is returned, the third not being sent until the second is returned, etc.?

All mine, for the last two days, including 4 g.w.u., seem to be of this pattern.

P4 2.8 HT Win2000SP4 v4.19

JoeB
Joined: 24 Feb 05
Posts: 124
Credit: 89610116
RAC: 30040

I seem to be having the same sort of problem - the web site (view results) says I have 3 WU due on 18 March, while my work page in Boinc says I have only 2 due that day.

I have an AMD 1.4 running Windows 98SE and BOINC 4.25. I use AT&T dialup.

Joe B

JoeB
Joined: 24 Feb 05
Posts: 124
Credit: 89610116
RAC: 30040

Chris,

Joe B

