Goodbye St Andrew Street!

Moving a data centre from one location to another without disrupting the organisation is a major undertaking. We successfully moved one data centre to a shared facility with the University of Aberdeen and North East Scotland College last year, and on the weekend of 24th / 25th May we moved our second data centre from St Andrew Street down to a new purpose-built facility on our Garthdee Campus. This one will also be shared with the University of Aberdeen and North East Scotland College.

The St Andrew Street building has been the University’s IT hub pretty much since IT began. When the national JANET network came into being, its main entry point was into St Andrew Street, and for many years the University’s main computer room was there, with network links to the rest of the Campus. The St Andrew Street building will be sold at the end of June, and this data centre move is part of a wider set of tasks to decommission the whole building. This has involved re-routing our external fibre connections (sorry about the roadworks!), and significant changes to the University’s overall Campus network so that we can unplug St Andrew St.

Down at Garthdee, there was construction work to build the new data centre, and this also involved significant changes to the Campus network at Garthdee so that the new data centre was connected into our network. All this work has been going on quietly, stage by stage, over the past several months.

Prior to the weekend of 24th / 25th of May, all of our critical systems had to be switched so that they were running fully from the North East Shared Data centre at the University of Aberdeen. That allowed the IT team over the weekend in May to shut everything down at St Andrew Street, move it down the road, and bring it back up again without any disruption. Once everything was up and running, critical services had to be rebalanced to run across the two data centres.

The School of Computing and Digital Media also had to move their servers from St Andrew Street down to the new data centre and they had completed their physical moves ahead of the 24th and 25th of May.

In a couple of weeks, the three remaining University departments will leave the St Andrew Street building for ever. Once they are gone, IT Engineers will disconnect the building and decommission the internal networks – that truly will represent the end of an era for us, but we are looking forward to moving down to Garthdee!

Here is the St Andrew Street Data centre before the move:
062-01

And here it is looking pretty empty after everything was moved down to Garthdee:
062-02

And here is the new data centre down at Garthdee:
062-03

Did you Notice?

Sometimes the greatest successes are the ones which go largely unnoticed. I reminded you last week about the work that was taking place this past weekend to move our Faculty of Health and Social Care datacentre across to the new shared datacentre at the University of Aberdeen. I’m pleased to say that the work went very well and that all the servers were successfully relocated and are now operating in the new datacentre with little disruption over the weekend.

Yes, there were plenty of glitches along the way, but the team had, and used, plans “B” and “C” and successfully overcame all the critical problems. For most of the weekend, the University was running on just one datacentre instead of the normal two. Thanks to much work over the past several years, however, all our critical services continued to operate as normal over the weekend, with just the occasional short pause when things were being restarted. E-mail, the web site, Moodle VLE, My Apps etc. were all working over the weekend. A few pieces of hardware had problems starting up, but thanks to the use of “virtualisation” technology we were able to just move the “virtual” servers onto other hardware and continue services as normal until the faulty hardware is repaired.

My thanks and well done to all those involved from IT Services!

    OLD DATA CENTRE

server1
The First Items to be removed from the Old Data Centre.

server2
Server No. 7 on its way out.

server3
1 engineer happy to be found amongst the cables.

server4
Let’s get some bubble wrap around server No. 16

server5
35 Servers, 2 large Tape Libraries all wrapped up and ready to go.

    NEW DATA CENTRE

server6
Where did I put that server?
(It’s behind you mate).

server7
A collection of worried engineers.

server8
A nice collection of cables.
(Connected at Last)

Moving to Riverside East

It’s just a few weeks before the first part of Riverside East, our new building down at the Garthdee Campus, will be open for staff and students.  The Library is moving first and they plan to have moved out of the Aberdeen Business School and into the new building around the end of May. Sounds like they are having fun with red and green dots getting ready to move thousands of books – have a look at their blog.

We’re having fun too in IT Services, although we’re not quite seeing dots in front of our eyes yet. Before anyone can move into the new building we need to get all the IT equipment set up ready for staff and students to move in.

First priority is to build the IT network. All the cabling work has been done as part of the construction project, but what we have to do now is install all the network routers and switches which will drive the whole network across the building, and connect it up to the rest of the Campus. The network kit has been bought and the IT Engineers are on site to start the installation. It’s a complex process. The network equipment needs to be physically installed into the communications rooms and communications risers in the building, and thousands of cables need to be connected up to the correct network equipment. It’s essential that this is done systematically and neatly from the beginning to make access for future maintenance easy. Then it’s all got to be systematically configured and tested. The priority is to get everything ready for the Library first, but in order to do that much of the whole building network core needs to be done anyway. So in terms of sequencing, we’ll get the Library done first and then move on to the remainder of the building according to the move schedule for the Schools and Departments moving in. Because some of the construction work is still ongoing, health and safety is an important priority and all the IT engineers who will be working on site are going through formal safety training first.

Next come the IT workstations – approximately 400 are going into the Library space in time for it to open for students. There is a team of IT staff, ably assisted by some students from the School of Engineering, who are currently taking lots of workstations out of their boxes and cabling them up ready for installation in the new building. That’s fine, but of course you can’t put the workstations into the Library until there are desks to sit them on. So, all this IT work is actually now part of a very intensive programme of work to co-ordinate all the activities for the move into the building. The University has contracted with a company called Space Solutions and they are managing the overall programme of work – actually right down to scheduling the use of the lifts in the building. There’s no point in a bunch of IT guys arriving with hundreds of workstations and furniture guys with loads of desks and then fighting over who gets to use the lift. Just goes to show what level of detail has to be planned at this stage.

At the tail end of all of this there will be a fairly intensive period when the desks are going into the library, the workstations are going onto the desks and being connected up and tested, and somewhere round about this time thousands of books will be getting moved across and into their proper places on the shelves. Then of course there will be printers to install, and the self-service issuing terminals for borrowers to use.

I’m sure it will look like an oasis of calm when the doors open and students go into the new building. Inevitably there will be some snagging at that stage, but spare a thought for a month of very hard work which will have preceded the opening!

 

How to move 160 servers without moving 160 servers

What are some of the challenges faced by IT Services staff? Here is a guest contribution from “Bobby G” – one of our senior IT technical staff:

“It may not be the type of question we ask ourselves every day, but we were recently required to move 160 of the University’s key servers onto new computer hardware, often in diverse locations, and we wondered how to carry out this work as quickly as possible and with as little impact on our customers as possible.

The servers are all real working servers providing many important roles for the University including Library, Teaching, Financial, Research, and Support services. The amount of data involved is also quite large, at around 5TB (for those of us familiar with 1.4MB floppy disks, that’s around 3.5 million floppy disks’ worth of information).
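The floppy disk comparison is easy to check. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name `floppies_needed` is made up for this illustration:

```python
TB = 10**12  # bytes in a terabyte (decimal units, an assumption)
MB = 10**6   # bytes in a megabyte (decimal units, an assumption)


def floppies_needed(data_tb: float, floppy_mb: float = 1.4) -> float:
    """Return how many floppy disks of `floppy_mb` MB hold `data_tb` TB."""
    return (data_tb * TB) / (floppy_mb * MB)


if __name__ == "__main__":
    count = floppies_needed(5)
    print(f"{count / 1e6:.2f} million floppy disks")
```

With these units the result comes out a little over 3.5 million, which agrees with the rough figure quoted.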

There is a trick, as I suppose there usually is with these types of questions, and the answer is to move most of the servers in a “virtual” manner. This still involves moving where the server really is in terms of all of its intelligence (CPU/Memory), but leaves the data, with all of the information and disks, appearing to be exactly where it always was. There are now a number of systems which allow us to carry out this type of work, and the University has used a tool from VMware in this instance. This has allowed us to halve the number of real physical servers, from 20 to 10, while almost doubling the amount of computing power available.

This makes the system much greener as there are significant savings in electricity and room cooling costs, and makes it much easier to add additional servers at a very low cost.

With a little careful planning we were able to move all 160 servers in around 9 hours one weekend with most services being unaffected by the move and most of those that were affected only being shut down for around 10 minutes.

The new system has been automated so that, as servers get busy or if a physical server fails, the “virtual” server will move to a comparatively quiet working server and no one will even know it has moved (unless they have access to the log files). So we currently know the room that your server runs in, but not exactly where it is, as it may have moved itself in the last 10 minutes. One of the next challenges we are giving ourselves is to set up the system so that we don’t even know which room the server is running in, allowing the systems to move between buildings by themselves when a service is busy or there is some form of problem (e.g. a power outage) in a building.

So in answer to the question “how do you move 160 servers without moving 160 servers” – you only move the little bit of intelligence that runs the servers and leave the rest set up as it is. (i.e. move where the server thinks it is).”
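The idea of a “virtual” server moving itself to a quieter physical server can be sketched in a few lines of Python. This is only an illustration of the principle, not how the VMware tooling works internally: the `Host` class, the load figures, the capacity of 100 and the function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical server; loads are arbitrary units (hypothetical)."""
    name: str
    healthy: bool = True
    capacity: float = 100.0
    vms: dict = field(default_factory=dict)  # virtual server name -> load

    @property
    def load(self) -> float:
        return sum(self.vms.values())


def quietest_host(hosts, vm_load, exclude=None):
    """Pick the least loaded healthy host that still has room for the VM."""
    candidates = [
        h for h in hosts
        if h.healthy and h is not exclude and h.load + vm_load <= h.capacity
    ]
    return min(candidates, key=lambda h: h.load) if candidates else None


def evacuate(failed, hosts):
    """Move every virtual server off a failed host; return the move log."""
    failed.healthy = False
    log = []
    # Place the heaviest virtual servers first, while there is most room.
    for vm, load in sorted(failed.vms.items(), key=lambda kv: -kv[1]):
        target = quietest_host(hosts, load, exclude=failed)
        if target is None:
            log.append(f"{vm}: no host with spare capacity")
            continue
        target.vms[vm] = load
        del failed.vms[vm]
        log.append(f"{vm}: {failed.name} -> {target.name}")
    return log


if __name__ == "__main__":
    h1 = Host("host-1", vms={"mail": 40, "web": 30})
    h2 = Host("host-2", vms={"library": 50})
    h3 = Host("host-3", vms={"finance": 10})
    for line in evacuate(h1, [h1, h2, h3]):
        print(line)
```

The only trace of each move is the returned log, which mirrors the point above: nobody knows a server has moved unless they read the log files.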

Good Bye PBX – did you notice?

OK, what’s a PBX? It’s the old University telephone exchange that until Tuesday 22nd January 2013 was used to connect the University’s internal telephony system to the public phone system. All incoming and outgoing calls went through that system. We actually had two – one at Schoolhill and one down at Scott Sutherland.

 I explained back in November what we were doing to move from the old “analogue” phones to the new “VOIP” phones. The old “analogue” phones were each connected to the PBX by their own copper wire, and the PBX then connected us to the public phone network. As long as we had some analogue phones, we had to keep the PBX – but now that all the analogue phones have been replaced we don’t need the PBX systems any more. They are old, they are very inflexible, and they are “single points of failure”. If one of the PBX systems fails, then we lose all outgoing and incoming calls linked to that PBX. Not something you want to happen at critical times of the year.

So, the last step in the telephone switch over was to remove the PBX systems and that was done on Tuesday. In their place, our phone system is now connected through the National JANET network. We have dual connections into the University, and the fact that we are connected through a modern voice and data network will give us greater flexibility in future. For example, if we ever need to add additional capacity (more lines) it can now be done much more quickly than was possible before.

The switchover had to be done without a hitch – it was essential that incoming and outgoing calls across the University were not disrupted. So there was a heavy programme of testing the week before the switchover, and a final batch of testing on the Monday, and then the system was handed over from one telecom provider to another early on Tuesday morning. Hopefully you didn’t notice!

That’s the core of the new, modern telephone system very much complete now, apart from some tidying up. Oh, and apart from fax machines. We still have 70 people wanting to keep their fax machines... that’s maybe a topic for another day.

A Busy Week!

If you saw my recent Blog “When things get hot”, you will know the challenges faced by IT staff when server room cooling fails. Well, the air conditioning in that server room failed again on 8th December which was really unexpected because it had received a major overhaul. In the light of this further incident, we are now putting in place permanent 24×7 monitoring of the temperature in that room until we cease using it, and we have mobile cooling units on site that we can use should the need arise – these can keep the room sufficiently chilled in the event of any further failure of the main system. In fact, we will only be using this room for a few more months before moving all the kit into a new state of the art datacentre shared with Aberdeen University and Aberdeen College. The 24×7 monitoring will be in place in time for the Christmas break, and as we do every year, a number of IT Services staff are on call over the holiday.

As if that was not enough, on Tuesday 11th December we had a very unusual technical problem on our storage system, which hosts “home” and “shared” network drives for all staff and students. IT Services staff worked through the day, and through the night until 2am in conjunction with global support engineers from the manufacturer before a chap called Adam in their Australian support centre identified the problem and we got the “home” and “shared” drives back. Big sigh of relief! We will meet with the manufacturer early in January to carry out a review of what happened. Meantime, once again big thanks to a number of IT staff who worked well into the evening and night to sort this.

Over these few days, staff and students would have seen a short outage of some services early on Saturday morning, and the loss of network drives on Tuesday. Behind the scenes, however, staff from IT Services had a heavy programme of work to keep services running and secure for the whole of that week. With one of the server rooms operating at reduced capacity, they had to move some services to the other server room. Systems like e-mail, the web site, our Moodle Virtual Learning Environment, the Portal and many others kept operating throughout all of this period. A lot of the week was spent in conjunction with our Estates Department arranging for the cooling to be fixed – and I’m pleased to say that the faulty parts have been replaced and cooling is working again. IT Services staff also had to work for several days to re-establish the backup systems, which had been significantly affected by the cooling and technical problems. All this is almost finished as I write. Staff and students don’t see that work, but it is essential to ensure that all our services are protected and properly backed up – certainly before the holiday period. Apart from the work to re-establish our backup systems, we have put a freeze on all other changes now until the University re-opens in January.

 

When Things get Hot

Anybody who works in IT today will know that when important IT services fail, IT staff feel the heat. When services are working normally, we don’t usually get thanked. Let’s face it – when did you last write to the Electricity Company to thank them for such a wonderful job delivering electricity? So, for us this situation to some extent goes with the job, but I’ll use this Blog today to say thanks to a number of RGU IT staff who did a great job over the weekend of 17th November.

We have two main server rooms from which we deliver our main IT services. Servers in each room are running all the time so that we can use the full capacity for all of our services. If one room fails, however, we can continue to run critical services from the other. 

Now, servers pump out lots of heat and one of the most common causes of failure is around cooling. That’s exactly what happened to one of our rooms in November – one of the cooling units failed, and then the second cooling unit, now under a heavier load, started to wobble. The temperature climbed rapidly and it wasn’t long before some of the servers started to automatically shut down to protect themselves – a nightmare scenario for IT Staff. 

IT Services and Estates staff managed to get the temperature under control, but it was clear that the air conditioning had to be repaired quickly and that meant having to shut down ALL the cooling for several hours. IT Services staff prepared a plan of action, and worked over the weekend to move essential services into the other computer room ahead of the shutdown. When the time came, the cooling was shut down as were most of the servers in the room. We were able to continue running the most essential University IT services throughout the day from the backup room and once the air conditioning was repaired things were back to normal fairly quickly. 

We were able to do this because for several years we have been building resilience into our overall IT Architectures. We have dual communications links, dual server rooms, and we use technologies that allow us to move services from one room to the other, and keep copies of critical data in both computer rooms. 

It all paid off that weekend, and indeed it has paid off on a number of occasions. More than once, we have had some kind of problem with our network links, or servers or server rooms, but you would never have known because we were able to keep essential services running. 

So, my thanks to all the IT Staff who make this possible!