The problem was one faced by many network admins: the equipment was EOL/EOS and, although it was thankfully still doing the work required of it, it was not going to be able to moving forward. We all know the story well. It's been my career, and it feels like the Bill Murray movie 'Groundhog Day'.
Ok - So initially I was contending with a pair of PIX Firewalls, a pair of Cisco VPN Concentrators - 3xxx series, a 4006 L2 Switch - 4013 Supervisor, and a pair of trusty Cisco 3550-12T's that provided L3 services to the DMZ.
That's really not that bad. Oh yes, there were a few other issues to contend with - physical and logical - the Cabling, Rack, Power, and Physical locations, as well as logical CSS Load Balancing.
The technical issues were not so bad to date. Some finely tuned Spanning-Tree parameters, a few choice VLANs, and HSRP/VRRP were present but were not issues in themselves, and there will always be rules.
Oh yes... Layer 8 came into play... Layer 8 always comes into play. In this case it came in the form of Responsibility for each device, server, or service in the DMZ and then whatever dependencies were in play. The sensitivity of the organization to change is always a bit of a factor. FUD (Fear, Uncertainty, and Doubt) can be crippling to a project. We had some financial barriers to overcome, and with finance comes the specter of time, waiting, and budget approvals. We had to create that word... Yes, the "D" word... Documentation... and finally we had to get buy-in from everyone, both technical and non-technical, and identify the business drivers to help motivate this phase of the project...
FYI - The Co-located Data center serves a few states, several sites, and 3 main sites, including a 24/7/365 Hospital not to mention of course the primary Internet presence.
This was a pretty decent undertaking, and one of the biggest challenges both my team and I were faced with was getting everyone to agree that this was critical and necessary, and of course to agree on a date to perform it... Kind of felt like a doctor proposing a major surgery for a patient. :)
I had some help along the way. I had a predecessor who managed to get the ASA 5540's purchased somehow. I had my director and my manager, who have a vision of what they want to do for the company going forward and where the Data Center/DMZ fits into the picture. My partner and co-designer (a CCIE) was helpful in clearing major roadblocks along the way, and our manager was instrumental in gaining the buy-in and confidence in "Uniting the Tribes" and ultimately made everything possible... And of course there are a lot of little pieces that I'm leaving out for the sake of brevity... but I can't forget the two other co-workers who came in to help me get the cabling done and tested in real time on the Weekend before Christmas. Of course, having the supervisor of the Intel/Systems Team show up for testing verification was awesome, and all those folks who had patience on the conference call came through for us like troopers over a long 4 hours to pull off this operation.
So that is how things started and this was the hand I was dealt. Let's get on with it.
The first thing I had to do when I started with the company was survive probation. I normally consider this a time to get to know the network and the people. I was additionally challenged with provisioning my own lodging. No biggie. Did this all in spades.
Now remember people just don't all at once trust other people to rip out their entire business's veins leading to the heart without a kiss first...
Also recall all of my predecessors before me were CCIE's and some were employees, consultants, or contracted from one of a few Cisco Gold Partners... I'm just a CCNP.
I also had to contend with the normal stuff we all have to do in the course of our jobs over the year too.
Needless to say, this past year I did not take any additional certification exams. I did sit the CCIE RS Lab immediately before taking the job and I did sit it in October. I was fortunate to be allowed to attend Cisco Live 2009 and to attend 2 weeks' worth of CCIE RS Training with IPexpert that I had won immediately before taking the job too.
I did have to come up to speed on the CSS Load Balancers asap. I did have to take a step back and solidify my MPLS/BGP knowledge and I can probably recite the QoS SRND 3.3 right now in a few languages... So anyway... Oh yes, a lot of high availability stuff had to be tuned too, and so I spent a bit of the year getting this right.
I was challenged to come up to speed on the Wireless LAN Controllers and to help work out the final details of the Mobility Wireless.
I did spend a bit of time coming up to speed this year on the Cisco MDS SAN Fabric too.
I used my home lab most of the year off and on for these purposes.
So let's see...
1. The PIX to ASA Migration. Not really that bad. However, anything in the DMZ had to be identified. This was not a problem for me actually. Indeed it was something I was getting rather good at after my stint at SunGard as the PIX/ASA guy... meaning I got a lot of experience working on and taking apart PIX/ASA configs.
2. Now during this whole project I had to contend with Financing and doing more with less. So I ended up having to swap 2 6509's that I needed for a 6506 in one site and a pair of 4500's at another site. This came with time delays due to logistics, availability, etc. One issue to contend with had to do with Power and ordering the correct power outlets for the 4500's and removing those for the 6500's at one site. Anyway these things happen and so it had to be planned, budgeted, financed, and dealt with. All layer 8.
3. The Data Center itself needed some upgrades in the form of electrical provisioning, an additional Chatsworth Cabinet (was there another option?) and cross-connect cabling to get from one set of rows to where the new DMZ would be physically located. Again things had to be ordered, arranged, and installed.
4. For the DMZ itself, those 6509's had to be upgraded with blades, FWSMs, and flash. Oh yes... IOS too. All under Smartnet. Done.
5. Ok so that took care of the physical requirements. During all this time one of my main goals was to actually spend most of my time on the MPLS/BGP Project. However, when all of the purchasing is done and the smoke clears it's time to do the work... That time was coming.
6. The Responsibility Matrix was born from this requirement. What requirement? The requirement that everything in the DMZ be 100% positively identified before receiving permission for the change. This is normal actually. Surprising to some but I've seen a lot worse. When I first started the job I documented the Firewalls ASAP, then the Switches (required after a blade totally went belly up - we call that experience too...). The Load Balancers are self-documented by virtue of what they are. So the issues that were left:
Needless to say this allowed updating the FW Rules and documenting a lot of stuff that may have been previously undocumented or not documented completely.
It's all par for the course. Actually I'm writing this off the top of my head. In reality I use a spreadsheet for this: I fill out as much of it as I can and then send it to the rest of the team leads and key personnel for the rest of it.
I also was very concerned about which devices were dual-homed and could/would not be impacted by the migration - so as to make it as "hitless" and painless as possible.
This sounds easy as I write it. In reality, this is not always easy; it always takes time, and it must be as accurate as possible to get the best results.
This is a key milestone.
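For anyone who wants to see the idea in something other than a spreadsheet, here is a minimal sketch of that kind of Responsibility Matrix check. The column names (`vlan`, `owner_team`, `tester`, `dual_homed`) and the sample devices are hypothetical, made up for illustration; the real spreadsheet had whatever columns the teams needed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MatrixRow:
    """One row of a hypothetical Responsibility Matrix."""
    device: str
    vlan: Optional[int]
    owner_team: Optional[str]
    tester: Optional[str]
    dual_homed: bool = False


def missing_fields(row: MatrixRow) -> List[str]:
    """Return the names of the columns that are still blank for this row."""
    checks = {"vlan": row.vlan, "owner_team": row.owner_team, "tester": row.tester}
    return [name for name, value in checks.items() if value is None or value == ""]


def ready_for_change_control(rows: List[MatrixRow]) -> bool:
    """True only when every row is completely filled in."""
    return all(not missing_fields(row) for row in rows)


def single_homed(rows: List[MatrixRow]) -> List[str]:
    """Devices with one uplink, which will see an interruption when recabled."""
    return [row.device for row in rows if not row.dual_homed]


# Illustrative data only; these are not the real DMZ devices.
rows = [
    MatrixRow("web-01", 110, "Systems", "alice", dual_homed=True),
    MatrixRow("mail-gw", 120, "Messaging", None),
    MatrixRow("vpn-portal", 130, "Network", "bob"),
]

for row in rows:
    gaps = missing_fields(row)
    if gaps:
        print(f"{row.device}: still missing {', '.join(gaps)}")
```

The point of the check is the same as the point of the spreadsheet: nothing goes to change control until every row has an owner and someone on the hook to test it, and the single-homed list tells you where to expect a blip.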
7. After this was achieved, presented to the team leaders, and discussed, resources from each team would be selected as delegates to represent and test each item on the spreadsheet. A date would be selected. The change control would be presented to the committee and all questions answered as best and truthfully as possible. This is where we face the litmus test of "Have we thought of everything?"... Well it looks like we did, and we got the go-ahead to proceed before the end of the year. We were conditionally required to provide a priority list that we would try to adhere to as best as possible.
8. The day came and I had to drive from Orlando to Jax the day before. I ended up recabling the 6509's and the Patch Panel the evening before and doing some last minute cabling documentation with labels that were more visible to present a decent fallback should it be necessary.
9. We started promptly by securing a conference bridge and ensuring we had communications to the rest of the teams who would be verifying each item, one by one. Communications are critical but you have to consider that we were re-patching the very lines that the bridge was on too... Hmmm... Interesting huh?
10. So we started. I was in charge of the priority list and ensuring each device came up as expected per VLAN, and my two co-workers each worked one end of the cable in the other row. Logistics were a challenge and we had specific instructions to do one cable, then verify one cable, and so on. It was a condition we had to agree to in order to get the buy-in and approval to complete the project. No problem, but we only had 4 hours and we had some 96 ports to contend with, multiplied by 2. Pretty much 1 cable per 80 seconds or so. The truth is we had the typical rat's nest of a decade's worth of cabling in the old Rack so... time was going to be tight.
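The back-of-the-envelope math behind that number is easy to redo. This is just a sketch of the arithmetic, assuming the "multiplied by 2" means 192 cable moves in total and that the whole 4-hour window is usable; the function name is mine.

```python
def seconds_per_cable(ports: int, multiplier: int, window_hours: float) -> float:
    """Average time available per cable move, including its verification."""
    total_moves = ports * multiplier
    return (window_hours * 3600) / total_moves


budget = seconds_per_cable(ports=96, multiplier=2, window_hours=4)
print(f"{budget:.0f} seconds per cable")  # prints "75 seconds per cable"
```

So the strict figure is 75 seconds, a touch tighter than the 80 or so I quoted, and that has to cover both the move and the verification on the bridge.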
The first goal was to get the infrastructure moved over. Well... I had prior approval to move the redundant stuff like the secondary Firewall and everything else over prior to the start date. I took the evening before to take advantage of this. Needless to say when you move the backup or secondary devices, fail them over... they are the primary devices at that point, so I pushed the envelope and then moved the new backup/secondary devices/interfaces to the new DMZ. Hitless. Kewl. Oh yes, I suspended network monitoring and alerts for all devices prior to beginning for the length of the window. We track our uptime.
So I kind of started the project with a toe over the start line. It also gave me the peace of mind that the VLANs were extended and functionally operational before I started moving everything else that was business critical.
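The ordering is the whole trick, so here is a rough model of it. This is not firewall configuration, just a toy sketch of the sequence for a redundant pair; the `Unit` class, the unit names, and the "old"/"new" locations are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    """One member of a redundant pair (hypothetical model)."""
    name: str
    active: bool
    location: str = "old"


def move(unit: Unit, destination: str) -> None:
    """Recable a unit, refusing to touch the one carrying traffic."""
    if unit.active:
        raise RuntimeError(f"{unit.name} is active; fail over first")
    unit.location = destination


def fail_over(a: Unit, b: Unit) -> None:
    """Swap the active and standby roles of the pair."""
    a.active, b.active = b.active, a.active


def migrate_pair(primary: Unit, secondary: Unit, destination: str = "new") -> List[str]:
    """Move the standby, fail over to it, then move the former primary."""
    log = []
    move(secondary, destination)
    log.append(f"moved {secondary.name}")
    fail_over(primary, secondary)
    log.append(f"failed over to {secondary.name}")
    move(primary, destination)
    log.append(f"moved {primary.name}")
    return log


fw_a = Unit("fw-a", active=True)
fw_b = Unit("fw-b", active=False)
steps = migrate_pair(fw_a, fw_b)
print(steps)
```

At no point is the unit being recabled the one passing traffic, which is why that part of the move could be hitless. It only works if the VLANs are already extended to the new location, which is exactly what doing this the evening before proved out.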
11. We had one hitch: a reported outage of just about 5 minutes from the Citrix VPN Solution. Aside from that we were pretty much hitless. About 4 hours later we were done... We should have had a video of the event. It was pretty kewl to see everything else going off without a hitch.
Whew! So what were you working on for the Holiday Season?