I would class that as a problem or bug. There's a thread in Problems and Bug Reports which has been reopened for this purpose - Upload trouble 18/01
It seems the issue has been solved, all units from my computer were uploaded.
Mine just uploaded.
JAN 6TH: UWM NETWORK MAINTENANCE
It seems this is causing uploads to be stuck, oh well.
Have you wandered into a gravitationally induced time warp? :-)
I just checked and it's Feb 7th here - Feb 6th in your location - so nothing to do with that old maintenance news item :-).
In the last several days, there have been three separate instances of this known problem. Fortunately, either the automatic recovery script is responding in a more timely fashion, or the problem was spotted and manually corrected quickly, since the upload failure didn't last very long each time.
The latest instance started around 3:30PM UTC and seems to have been corrected by about 5:30PM. I'm guessing these times based on entries in the logs kept by scripts I run overnight here. The local times were approximately 1:30AM and 3:30AM (UTC+10).
What gets recorded in the logs is the time interval since the last successful scheduler contact for each host in the fleet, but only when that interval is considered to be 'excessive'. Hosts are checked every hour so there's quite a bit of guesswork in translating to wall clock times. There were over a hundred entries in the log when I checked it this morning :-). By that time the problem was over and everything was back to normal.
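For anyone curious, the kind of hourly check described above could be sketched roughly like this. It is only an illustration, not the actual overnight scripts: the host names, the 2 hour threshold and the function names are all made up, since the post only says an interval is logged when it is considered 'excessive'.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: the post does not say what counts as 'excessive'.
EXCESSIVE = timedelta(hours=2)


def excessive_contacts(last_contact, now, threshold=EXCESSIVE):
    """Return {host: interval} for hosts whose last successful
    scheduler contact is older than the threshold."""
    return {
        host: now - seen
        for host, seen in last_contact.items()
        if now - seen > threshold
    }


def format_entries(flagged, now):
    """Build one log line per flagged host, stamped with the check time."""
    return [
        f"{now:%Y-%m-%d %H:%M} UTC {host}: last scheduler contact {interval} ago"
        for host, interval in sorted(flagged.items())
    ]


# Made-up example data: one check run at 17:00 UTC over three hosts.
now = datetime(2000, 1, 1, 17, 0, tzinfo=timezone.utc)
last_contact = {
    "host-a": now - timedelta(minutes=40),
    "host-b": now - timedelta(hours=3, minutes=30),
    "host-c": now - timedelta(hours=1, minutes=59),
}

flagged = excessive_contacts(last_contact, now)
for line in format_entries(flagged, now):
    print(line)
```

Because the check only runs once an hour, an entry like this pins down when the trouble began only to within about an hour, which is where the guesswork about wall clock times comes from.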
I'm sure glad we aren't getting longer outages any more. They were causing quite some issues for me.
Cheers,
Gary.
Looks like a further upload problem has started. Fortunately, I started my work cache replenishment around 6.00am local time (8.00PM UTC) which just happens to be about when uploads stopped again. Since the initial number of failed uploads was very low, hosts were able to get work, even though they couldn't upload. The fact that work could be received meant that the excessive last scheduler contact time was disguised. That's showing up with a vengeance now that cache replenishment finished around 7.00AM (9.00PM UTC). By 8.30AM the warnings started filling the logs :-).
I imagine everyone has gone home by now (11.00PM UTC) so it will be interesting to see if the automatic (scripted) restart procedure brings this event to a timely close. Otherwise I'll be running out of work again.
Cheers,
Gary.
I saw that fresh problem about 1-2 hours ago, but at the moment uploads are working again (with my micro-tiny fleet here).
Uploads are go again! Looks like the automatic restart is happening in a timely manner. You Bewdy!! :-).
As a precaution, and while the going's good, I've started an extra 0.3 days cache update run on all hosts. When that finishes, all hosts will have 0.8 days of work.
Cheers,
Gary.
I did actually see a new batch of upload problems, mid-afternoon UTC Weds 6 Feb, round about the time Betreger posted about the January network maintenance. But they cleared by themselves before I had a chance to comment.
At the moment, I'm not seeing any new problems - and probably won't look for them, since I'll be off to bed within the hour. But there certainly seems to be an increased frequency of problems, and maybe we should check with Bernd (agreed, not at this time of night) to see whether the cron restarts happen often enough.
At the moment uploads are possible, but reporting or getting new work is not :(