How does Einstein back up its servers?

Chris S
Joined: 27 Aug 05
Posts: 2469
Credit: 19550265
RAC: 0
Topic 205969

I ask the question because today, as per usual, Seti has gone off air for the rest of the day because of its weekly 10-hour outage. I am quite sure that Einstein does back up its stuff, but they seem to do it whilst still being a 7-days-a-week Boinc project. What lessons could Seti learn here?

Waiting for Godot & salvation :-)

Why do doctors have to practice?
You'd think they'd have got it right by now

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 188634236
RAC: 134091

Yes, we have several mechanisms in place to make sure that we can restore certain services in case of a hardware failure or unintended loss of data. We put a lot of effort into researching an efficient way to do backups (and updates) without interrupting the project for the volunteers. I don't know the specific reasons for Seti going offline for this long, but our database is set up in such a way that we have a master database and a read-only replica which are always kept in sync. When we back up the database, we use the replica so as not to interfere with normal project operations (technical sidenote: if we did the backup using the master, we would need to stop the project because no write access is possible until the backup is complete). Since the database hardware and setup are tuned to our specific needs, the replica synchronizes with the master again with minimal delay. This is all just the BOINC database, not the scientific result files; those are handled separately.
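
For readers who want to see the idea in miniature, here is a rough Python sketch of backing up from a replica while the master keeps taking writes. It is only a toy model and not Einstein@Home's actual setup: the `Master` and `Replica` classes, the in-memory log, and `run_demo` are all made up for illustration, and a real database would use its own replication and dump tools.

```python
class Master:
    """Toy master: accepts writes and records them in an ordered log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered (key, value) writes, replayed by the replica

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))


class Replica:
    """Toy read-only replica that replays the master's log."""

    def __init__(self, master):
        self.master = master
        self.rows = {}
        self.applied = 0  # number of log entries already replayed
        self.paused = False

    def sync(self):
        """Replay log entries not yet applied; does nothing while paused."""
        if self.paused:
            return
        for key, value in self.master.log[self.applied:]:
            self.rows[key] = value
        self.applied = len(self.master.log)

    def lag(self):
        """Number of master writes the replica has not replayed yet."""
        return len(self.master.log) - self.applied

    def pause(self):
        self.paused = True

    def resume(self):
        """Resume replay and catch up with the master."""
        self.paused = False
        self.sync()


def run_demo():
    master = Master()
    replica = Replica(master)

    master.write("task1", "sent")
    replica.sync()

    # Backup window: freeze the replica and copy its state.
    replica.pause()
    snapshot = dict(replica.rows)

    # The project keeps running: the master still accepts writes.
    master.write("task2", "sent")
    replica.sync()  # no effect while paused
    lag_during_backup = replica.lag()

    # Backup finished: the replica catches up again.
    replica.resume()
    return {
        "snapshot": snapshot,
        "lag_during_backup": lag_during_backup,
        "lag_after": replica.lag(),
        "in_sync": replica.rows == master.rows,
    }


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that only the replica is frozen while the copy is taken, so the master never has to refuse a write; the cost is a short replication lag that disappears once the replica resumes.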

To give you a perspective of sizes: our current database is 165 GB in total at 3.3 million tasks. We strive to keep the database small but we know that we can easily handle up to 5 million tasks without problems. The result files for the finished O1MD1G run are 4 TB in size (compressed). The ongoing O1MD1CV search is currently at 11 TB.
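
As a back-of-the-envelope check on those numbers, the few lines below work out the average database footprint per task and what 5 million tasks would come to. This assumes decimal units (1 GB = 1,000,000 KB) and that the database grows linearly with the task count, neither of which is stated above, so treat the result as a rough guess only.

```python
db_size_gb = 165          # current database size quoted above
current_tasks = 3_300_000  # current task count quoted above
target_tasks = 5_000_000   # task count said to be manageable

# Average footprint per task, assuming decimal units.
per_task_kb = db_size_gb * 1_000_000 / current_tasks

# Projected size at the target, assuming strictly linear growth.
projected_gb = db_size_gb * target_tasks / current_tasks

print(f"about {per_task_kb:.0f} KB per task")
print(f"about {projected_gb:.0f} GB at {target_tasks:,} tasks")
```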

mikey
Joined: 22 Jan 05
Posts: 12715
Credit: 1839117536
RAC: 3614

Chris S_2 wrote:
I ask the question because today, as per usual, Seti has gone off air for the rest of the day because of its weekly 10-hour outage. I am quite sure that Einstein does back up its stuff, but they seem to do it whilst still being a 7-days-a-week Boinc project. What lessons could Seti learn here?

To be honest, Seti KNOWS how to do it; they have just chosen to do it the way they do for budgetary and 'resource' reasons. They operate all of the servers out of a small closet, or did a year or so ago, which limits any hardware upgrades to basically replacing what they have or doing nothing. Backups are done within the closet as well, so they are limited there too.

Chris S
Joined: 27 Aug 05
Posts: 2469
Credit: 19550265
RAC: 0

Mikey & Christian, thank you both for your comments which are appreciated.

Firstly, Seti has the following main servers:

BOINC master database, oscar, Running
BOINC replica database, carolyn, Running
SETI@home science database, paddym, Running
Astropulse science database, marvin, Running

They also have:

data-driven web pages, muarae1, Disabled

That is why the project is down as far as users are concerned.

They used to have all their kit in a closet, as Mikey says. Here is a pic of me and Dr Eric Korpela in front of the closet, during my tour of the SSL Seti lab in 2011.

[Image: usa80.jpg]

But in 2013 they decided to move all the Seti servers to the main University campus Co-Location Computer Room (CoLo) because it had better cooling, better UPS facilities, and staff on-site to change hot-plug disks without the need for a Seti staff presence.

They may very well do this for "budgetary and 'resource' reasons", but I still query whether there is a better way. Anyhow, this is not a matter for Einstein; I just wondered how you guys do it.

