The switch should not be the problem, because Win 7 crunches the BRP tasks without any error. Or do you think Windows uses a different packet size?
This is my suspicion, judging from the symptoms.
Something else to try would be to connect the computers directly, without a switch, just a cable.
BM
Replacing the file upload handlers will take a few hours, but should be finished by 16:00 UTC (18:00 CEST). To avoid further validation errors from upload retries I suggest you suspend your network connection until then.
I had tried a direct connection to the Internet router (Fritzbox) with the DHCP server. Network activity was suspended beforehand, and again 50% of all WUs were invalid. I have a crossover cable that I will try, and I can also try a direct connection to the Fritzbox again.
Anyway I did not get any errors doing work for Milkyway, Primegrid and GPUGRID.
I still get the 'transient upload errors' with h1 and BUCKET tasks, but they have all been valid.
So what is the difference from the BRP tasks?
Regards,
Michael
There are a few:
- the files are uploaded to different servers (einstein.phys.uwm.edu and einstein-dl.aei.uni-hannover.de). These servers differ in hardware and software.
- previously two different versions of the file upload handler were running on the two servers (not anymore, though)
- the result of a GW task (S5* or S6Bucket) is a single file of a few hundred kB, while a result of a BRP task consists of four files of a few kB.
I'm not sure how this affects your network problems, though.
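To put rough numbers on the size difference in the last point, here is a minimal sketch that counts how many full-size TCP segments each kind of result would need. The file sizes (300 kB and 4 kB) are assumptions picked from "a few hundred kB" and "a few kB", not measured values, and HTTP overhead is ignored.

```python
import math

# Largest TCP payload per packet on plain Ethernet over IPv4:
# 1500-byte MTU minus 20-byte IPv4 header minus 20-byte TCP header.
MSS = 1460

def segments(size_bytes: int) -> int:
    """Number of TCP segments needed to carry size_bytes of payload."""
    return math.ceil(size_bytes / MSS)

# Assumed sizes, for illustration only.
gw_result_bytes = 300 * 1024   # one GW result file, "a few hundred kB"
brp_file_bytes = 4 * 1024      # one of four BRP result files, "a few kB"

gw_segments = segments(gw_result_bytes)
brp_segments = 4 * segments(brp_file_bytes)

print("GW result:  1 upload,", gw_segments, "segments")
print("BRP result: 4 uploads,", brp_segments, "segments in total")
```

Under these assumptions a GW result is one long transfer, while a BRP result is four very short ones, so the two exercise the upload path quite differently.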
Do you have the same trouble with BRP CPU tasks?
BM
Today I changed the MTU to 1492 and there was only one transient upload error.
Yesterday and today:
0 invalid, 7 valid and 11 pending BRP tasks.
If it stays like this I will be happy.
http://einsteinathome.org/host/737731/tasks&offset=0&show_names=1&state=4
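For reference, 1492 is the value you get when 8 bytes of PPPoE overhead are taken off the standard 1500-byte Ethernet MTU. Whether the Fritzbox uplink here actually uses PPPoE is an assumption; the thread does not say. A small sketch of the arithmetic, including the resulting maximum TCP payload per packet (MSS) over IPv4 without header options:

```python
ETHERNET_MTU = 1500    # standard Ethernet payload size in bytes
PPPOE_OVERHEAD = 8     # 6-byte PPPoE header + 2-byte PPP protocol ID
IPV4_HEADER = 20       # minimum IPv4 header, no options
TCP_HEADER = 20        # minimum TCP header, no options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits into one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

print("MTU behind PPPoE:", pppoe_mtu)
print("MSS at MTU 1500: ", mss_for_mtu(ETHERNET_MTU))
print("MSS at MTU 1492: ", mss_for_mtu(pppoe_mtu))
```

A host that sends 1500-byte packets into a 1492-byte path depends on fragmentation or path MTU discovery working along the way, which is one reason lowering the MTU on the host can change the behaviour.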
I don't run any BRP CPU tasks, because that is inefficient. If you think it is a good idea I will try. In general these are too few results to form an opinion.
As soon as I get the first invalid result I will connect the host directly to the Fritzbox. I guess the DHCP server in there might negotiate the correct MTU size with that box.
If this works (it would surprise me), I will use the crossover cable to get closer to the solution. The switch was a cheap one from D-Link and better ones are pretty expensive. :(
What I don't understand is that the communication is over TCP and not UDP, so the receiving server should ask for any faulty packet to be sent again. Doesn't a 'transient..' error indicate that the server got an invalid packet?
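As background to that question: TCP protects each segment with a 16-bit ones' complement checksum, and a segment that fails the check is dropped and later retransmitted by the sender. The sketch below implements the checksum algorithm from RFC 1071 on a bare byte string; a real TCP checksum also covers a pseudo-header, which is left out here. The payload is a made-up placeholder, and nothing in this sketch says what the BOINC 'transient upload error' itself means.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                    # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                     # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = b"hypothetical result data"      # placeholder, not a real result file
sent_checksum = internet_checksum(payload)

damaged = bytearray(payload)
damaged[3] ^= 0x01                         # flip a single bit "in transit"
received_checksum = internet_checksum(bytes(damaged))

print("checksum matches after damage:", received_checksum == sent_checksum)
```

A single flipped bit always changes this checksum, but some multi-bit errors cancel out, so the check is good rather than perfect.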
And I probably did not mention before that the Phenom doesn't get any transient upload errors at all.
[Edit] It will take some time to do these checks because of a lot of other work units.
Whatever you did, Bernd: it's magic. :-)
There's a big difference between
http://einsteinathome.org/host/2069906/tasks
and
http://einsteinathome.org/host/2069906/tasks&offset=20&show_names=0&state=0
... as well as
http://einsteinathome.org/host/702599/tasks
It seems that with the new FUH (nearly?) all errors are gone .. at least for me.
Stephan
Not even a single invalid task so far. :)
Great job updating the FUH!