How to move 160 servers without moving 160 servers

What are some of the challenges faced by IT Services staff? Here is a guest contribution from “Bobby G” – one of our senior IT technical staff:

“It may not be the type of question we ask ourselves everyday but we have recently been in a position where we have been required to move 160 of the university’s key servers onto new computer hardware often in diverse locations, and we wondered how to carry this work out as quickly as possible and with as little impact to our customers as possible.

The servers are all real working servers providing many important roles for the University including Library, Teaching, Financial, Research, and Support services. The amount of data involved is also quite large with around 5TB of data being involved (For those of us familiar with 1.4MB Floppy Disks that’s around 3.5 Million Floppy disks worth of information).

There is a trick as I suppose there usually is with these types of questions, and the answer is to move most of the servers in a “virtual” manner. This still involves moving where the server really is in terms of all of its intelligence (CPU/Memory), but actually leaves the data with all of the information and disks exactly appearing to be where they always were. There are now a number of systems which allow us to carry out this type of work and the University has used a tool from VMware in this instance. This has allowed us to reduce the total number of real physical servers used by half from 20 to 10 servers while almost doubling the amount of computing power available.

This makes the system much greener as there are significant savings in electricity and room cooling costs, and makes it much easier to add additional servers at a very low cost.

With a little careful planning we were able to move all 160 servers in around 9 hours one weekend with most services being unaffected by the move and most of those that were affected only being shut down for around 10 minutes.

The new setup of the system has been automated in such a manner that as servers get busy or if a physical server fails the “virtual” server will now move around to find a comparatively quiet working server and no one will even know it has moved (unless they have access to the log files). So we currently know the room that your server runs in but not exactly where it is as it may have moved itself in the last 10 mins. One of the next challenges we are giving ourselves is to setup the system so that we don’t even know which room the server is running in to allow the systems to move between buildings for themselves when a service is busy or there is some form of problem (e.g. power outage) in a building.

So in answer to the question “how do you move 160 servers without moving 160 servers” – you only move the little bit of intelligence that runs the servers and leave the rest set up as it is. (i.e. move where the server thinks it is).”

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s