
Audience: System Admins - Experts
Last Updated: 3/24/2011 6:22:45 PM
Original Creation Date: 3/24/2011 6:22:45 PM
**All times are EST**

Complicated Linux Environment #1 - The Server Nest

By Erik Rodriguez

Tags: Complicated Linux Environment, Fedora Core Mirror, Old Linux Servers, Linux Backup Software, SuSE Linux, Novell eDirectory, Custom Java Software with Linux, Windows 2000 Server

This article contains information about my actual experiences with overly-complicated Linux environments. Also see Complicated Linux Environment #2 and Complicated Linux Environment #3.

Overview

A small financial advisement company had a programmer behind the wheel of its one-man IT shop. In order to "cut costs," they were advised to use Linux, and the IT guy convinced them he could create a CRM/CMS/all-in-one software solution for their business. Exchange server? Nope. They ran an earlier version of Zimbra, and the build was several versions behind and therefore no longer supported.

Their internal application was written in Java. It required a specific version of Java on each desktop; the most recent version would break the app, and employees would not be able to log in. The (former) IT guy was the only one who knew anything about the software. He had next to no documentation and no schema for the database. The database was a PostgreSQL database that ran on 1 of 7 (that's right, 7) servers.

IT infrastructure:

As I mentioned, this organization was utilizing 7 servers for a 25-person company. They were all on-site, in a closet with a door that may as well have been constructed of cardboard. The closet was barely large enough to move around in and troubleshoot things. It had dual 30 amp power feeds run specifically for the IT equipment. I consulted for this company from late 2007 until around late 2009. The servers in use were circa 2005 at best. They had 1 server running a Windows 2000 Active Directory instance. The rest were a mix of Linux operating systems ranging from SuSE 9 to Fedora 5. None had active repositories for updates, and by late 2009, the hardware was starting to show early signs of failure.

Backups were conducted via nightly cron jobs that would tar the directories and files of importance and copy them to a multi-purpose server. The office administrator claimed they had an offsite backup, but nobody could tell me where it was or what was being backed up. I searched the environment high and low, but could not find any evidence of one.
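For illustration, a nightly job like the one described above could look roughly like the following Python sketch. The function names, the archive naming scheme, and the directories passed in are all assumptions made for the example; they are not the scripts that actually ran in this environment, and the step that copies the archive to another server is left out.

```python
import tarfile
import time
from pathlib import Path


def make_backup(sources, dest_dir, stamp=None):
    """Write a gzip-compressed tar archive of `sources` into `dest_dir`.

    `sources` is a list of files or directories (hypothetical paths).
    Returns the path of the archive that was written.
    """
    stamp = stamp or time.strftime("%Y%m%d")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in sources:
            src = Path(src)
            # Store each source under its own name rather than its full path.
            tar.add(src, arcname=src.name)
    return archive


def list_backup(archive):
    """Return the sorted member names, to confirm what was backed up."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(tar.getnames())
```

Listing the archive after writing it is the cheap part that answers the question nobody here could answer: what is actually being backed up.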

Challenges:

In 2007-2008, my primary objective was to maintain the environment and troubleshoot/fix any problems that came up. I was basically providing "managed services." My long-term goal was to eventually migrate to a hosted solution. The following is a list of challenges with the overall environment.

Aged and failing hardware
Software/OS versions deemed EOL (no updates or support)
Proprietary applications
Spaghetti infrastructure
No offsite backup
Full hard drives (more than 95% in some cases)
Huge databases (a single database was >40GB!)
Distorted perception of environment by the owner
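
The full-drive item in the list above is the kind of thing that is easy to check automatically. The following is a minimal Python sketch, assuming a 95% threshold to match the figure mentioned in the list; the function names and the mount points you pass in are hypothetical.

```python
import shutil


def percent_used(total, used):
    """Percentage of a partition in use; 0.0 if the size is reported as 0."""
    return 100.0 * used / total if total else 0.0


def over_threshold(mount_points, threshold=95.0, usage=shutil.disk_usage):
    """Return (mount_point, percent_used) pairs above `threshold`.

    `usage` defaults to shutil.disk_usage and can be swapped out for testing.
    """
    flagged = []
    for mount in mount_points:
        stats = usage(mount)
        pct = percent_used(stats.total, stats.used)
        if pct > threshold:
            flagged.append((mount, round(pct, 1)))
    return flagged
```

Calling `over_threshold(["/", "/var"])` from a cron job and mailing any non-empty result would be one way to use it.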

I covered some of these in detail already. The spaghetti infrastructure I referred to is the cluster of "dependent services" running across multiple machines. Rebooting any one machine would cause some type of issue. Rebooting the "firewall" machine (running IPCop or something like that) would obviously take out the Internet connection, but it would also take out the LDAP services on a SuSE server, which would cause Samba shares on the network to go haywire. How these two were related I never knew. Rebooting the Windows server would affect their internal application, apparently because of some ODBC mapping between a Windows service running on that box and the database used by the application. The old IT guy did at least provide an "order" in which to reboot the servers. This allowed all services and dependencies ample time to start in the proper fashion. Even so, with the age of the hardware/software on each server, I was honestly nervous each time I went through the process.
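
A reboot "order" like this is really a dependency graph written down as a list. As a rough sketch of the idea, the Python below derives a start order from a map of which service needs which. The service names and the dependencies between them are hypothetical, loosely modeled on the description above; the real relationships in this environment were never documented. It uses `graphlib` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical map: each service lists the services it must start after.
DEPENDS_ON = {
    "firewall": set(),
    "ldap": {"firewall"},
    "samba": {"ldap"},
    "database": set(),
    "windows-odbc": {"database"},
    "java-app": {"windows-odbc", "samba"},
}


def boot_order(depends_on):
    """Return a start order in which every service follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, service in enumerate(boot_order(DEPENDS_ON), start=1):
        print(step, service)
```

Keeping the map in a file next to the servers at least documents why the order is what it is, which a bare list does not.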

Large databases and full partitions made it very hard to manage backups and other routine operations. I remember one server in particular had a very small root partition of only 3.5 GB. I was constantly monitoring the log files and other portions of the root partition. I removed EVERYTHING I possibly could, but with only 3.5 GB to work with, the situation was hard to manage.

By far, the worst part of this was the owner's perception of the environment. On one hand, I felt like telling him that his entire company was a ticking time bomb, and that because of the way it had been designed over the last 5 years, it was nearly impossible to upgrade, migrate, or recover from a disaster. On the other hand, I felt I had an obligation to get them out of such a mess. However, with what I was billing them and the hours I was putting in, the account was quickly losing its profit margin. I had several "coming to Jesus" meetings about what it would take to dig out of the hole. Writing it up as a separate project in addition to the ongoing management was "not feasible from a budgeting standpoint." How do you walk away from a ticking time bomb knowing it's going to blow at any moment?

Conclusion

In late 2009, I ended up migrating what I could to new hardware. I wasn't able to sell them on a hosted solution because they didn't want to pay the monthly fee associated with it, they didn't have the bandwidth, and their internal application was designed so poorly that I doubted it would even function over a WAN. To be honest, I figured it would be more trouble than it was worth, and gave up trying to pitch that solution. The moral of this story is that Linux environments are complicated enough without someone making them even more confusing. The discovery process is much harder on Linux, and such an environment can be a real headache to take over. If you have Linux deployed in your organization, make sure you have the proper documentation from your IT guy and preferably have a backup IT solution. That could be a second IT person, or a company on retainer that works closely with your internal IT staff. This whole situation might have been avoided had the owner and IT guy had a consultant take a second look at the plan before it went into motion. You can read more about complicated Linux