You can't have everything. When I was working in IT support at Sony Europe in Tilburg, The Netherlands, we once had a power outage caused by a lightning strike. I had seen the big generator in the cellar that backed up the UPS systems, but heck, just a couple of days before the outage they had serviced it... and forgot to refill the diesel tank.
The generator was a big DAF truck engine that could keep us powered up for over an hour, but when the outage hit, it had only a minimal amount of fuel in the tank (around 1 liter). So by the time all of us had managed to get to the server room (out the door, left, in the next door), it had burned through that minimum and the generator went down. Bye servers. ;)
It took us well into the next day to get everything back up and running. But at least it was fun. :)
>
> > The way I read this was that they had the UPSes, had the shutdown
> > software, but the batteries just didn't last long enough for things
> > to come down gracefully.
>
> Yes, that's what I was talking about. You have to configure a safe shutdown
> (possibly synchronized between servers) that needs no user action, then test
> how long it takes to shut down all operations, then test how long the UPS
> holds power, decide how large the safety buffer should be, and only then
> decide how long to wait before you start the shutdown.
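To put numbers on the procedure quoted above, here is a minimal sketch of the arithmetic in Python. The function name and the example figures are made up for illustration; the real values have to come from your own measurements of the UPS runtime and the shutdown duration.

```python
def shutdown_start_delay(ups_runtime_s, shutdown_duration_s, safety_buffer_s):
    """Seconds to wait on battery before starting the unattended shutdown.

    All three inputs are measured or chosen by the admin (hypothetical
    parameter names, not taken from any UPS vendor's software).
    """
    slack = ups_runtime_s - shutdown_duration_s - safety_buffer_s
    if slack < 0:
        # The battery cannot cover the shutdown plus the buffer at all.
        raise ValueError("UPS runtime too short for a graceful shutdown")
    return slack


# Made-up example: 15 min of battery, 6 min to shut down, 4 min buffer.
print(shutdown_start_delay(900, 360, 240))  # 300
```

Because batteries lose capacity over time, the runtime figure is only as good as the last test, so it needs to be measured again every so often.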
I'm thinking that they could easily have done everything right and tested everything, so that everything was all set and everyone was happy, but that was a few months ago and the batteries have simply quit holding a charge...
> I'm thinking that they could easily have done everything right and tested
> everything, so that everything was all set and everyone was happy, but that
> was a few months ago and the batteries have simply quit holding a charge...
It doesn't really matter whether they did everything the way we think is right. It is not possible to do everything right in such a migration. They are working hard and under difficult conditions. Personally, I am sure they did everything possible. We can only hope that the situation becomes more stable very quickly once the migration is finished. That should reduce the risk in the future.
Greetings from Bremen/Germany
Jens Seidler (TheBigJens)