Communications infrastructure
High availability is like a jigsaw puzzle--it isn't complete until all the pieces are in place. Even 100 percent uptime for the servers and storage does you no good if the rest of the organisation cannot connect to them.
"By simplifying the network architecture, without losing functionality, the risk of outage can be reduced. Network outage caused by operator errors is [more frequent] in a highly complex network environment, driving the need for risk mitigation in the design phase. Intelligent network elements and management systems will be able to detect human errors and automatically correct error or ask for confirmation of change," says Nortel's Gallagher.
Individual elements in the network may use some of the techniques discussed above to ensure high availability, but there are also some specific issues that must be addressed. Rather than go through each kind of network hardware, we will take firewalls as a representative of the category.
Mike Lee, senior product marketing manager for Check Point, explains that there are several ways of implementing high-availability firewalls. Firewall appliances and servers can be built with redundant power supplies and other components to maximise uptime.
Running two or more firewalls in parallel as a cluster goes a step further, as it provides coverage if a software failure affects one of them.
This can be achieved by putting a network switch in front of the firewalls, which may also provide load sharing. But while it is easy to re-route packets to the second firewall, information about existing connections (such as authentication) will be lost. Lee suggests that switching without maintaining connections does not really count as high availability.
Check Point's ClusterXL high-availability add-on performs state synchronisation between the clustered firewalls, so connections can fail over without interruption. This sounds straightforward, but when you are dealing with hundreds of thousands of connections, "it gets pretty tricky," says Lee.
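To make the difference between plain switching and state synchronisation concrete, here is a minimal sketch in Python. It is not Check Point's implementation: the FirewallNode class, its fields and the accept() and allows() methods are hypothetical names invented for illustration, and a real product would replicate far richer state over a dedicated network link.

```python
from dataclasses import dataclass, field


@dataclass
class FirewallNode:
    """Hypothetical firewall node holding a table of authenticated connections."""
    name: str
    connections: dict = field(default_factory=dict)  # connection key -> user
    peers: list = field(default_factory=list)        # other nodes in the cluster

    def accept(self, key, user, sync=True):
        """Record a new connection locally and, if sync is on, copy it to every peer."""
        self.connections[key] = user
        if sync:
            for peer in self.peers:
                peer.connections[key] = user

    def allows(self, key):
        """A packet for an existing connection passes only if this node knows the connection."""
        return key in self.connections


# Two clustered nodes; fw_a is active and fw_b is the standby (assumed names).
fw_a = FirewallNode("fw-a")
fw_b = FirewallNode("fw-b")
fw_a.peers.append(fw_b)

synced = ("10.0.0.5", 40000, "192.0.2.10", 443)
unsynced = ("10.0.0.6", 40001, "192.0.2.10", 443)
fw_a.accept(synced, "alice", sync=True)     # state is replicated to fw_b
fw_a.accept(unsynced, "bob", sync=False)    # state stays on fw_a only

# If fw_a now fails and the switch re-routes packets to fw_b:
survives = fw_b.allows(synced)      # the connection carries on
dropped = not fw_b.allows(unsynced)  # the user has to reauthenticate
```

In this toy model every new connection costs one extra write per peer, which hints at why the approach becomes harder as the connection count climbs into the hundreds of thousands.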
Firewall failover to a remote site with synchronisation is not generally practical, he says. "It requires an ultra high-speed communications medium" between the firewalls, and while it has been done by some military users, "that's unreasonable for most enterprise users."
In any case, if the problem is big enough to switch to a backup site, forcing users to reauthenticate themselves is no big deal in most cases, he suggests.
However, Stonesoft has developed a clustering technology that allows firewalls to be kept in synchronisation without replicating all the traffic on the internal and external ports.
This arrangement means that multiple firewalls at the same location can fail over to each other, and if the entire subcluster comes to a halt, it fails over to the other location.
This still requires substantial bandwidth with low latency, but it can be done over 45km of single-mode fibre, and one Australian customer uses this arrangement for failover between its primary data centre and the backup site 26km away, says senior network specialist Mathew Butler. Simple primary-to-secondary failover can be done via a 9600bps serial line: "it wasn't pretty, but it did work," says Butler.
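The two-level behaviour described above (fail over within a location first, and move to the other location only when the whole local subcluster is down) can be sketched roughly as follows. This is an illustration under stated assumptions, not Stonesoft's algorithm: the choose_active() function, the site and node names, and the simple healthy/unhealthy flags are all hypothetical.

```python
def choose_active(sites):
    """Return (site, node) for the first healthy node, trying sites in preference order.

    sites is a list of (site_name, {node_name: is_healthy}) pairs, with the
    primary location first. Returns None if no node anywhere is healthy.
    """
    for site_name, nodes in sites:
        for node_name, is_healthy in nodes.items():
            if is_healthy:
                return site_name, node_name
    return None


# One node down at the primary site: traffic stays at the primary location.
partial_failure = [
    ("primary", {"fw1": False, "fw2": True}),
    ("backup", {"fw3": True}),
]
local_choice = choose_active(partial_failure)

# The whole primary subcluster is down: traffic moves to the backup site.
site_failure = [
    ("primary", {"fw1": False, "fw2": False}),
    ("backup", {"fw3": True}),
]
remote_choice = choose_active(site_failure)
```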
Another advantage is that the nodes in the cluster only have to be running the same operating system. A mix of hardware can be used, such as an old 1GHz Pentium III with a new Athlon XP 2000+. "We've avoided the forklift upgrade process," says Butler.
Stonesoft's clustering technology isn't limited to its own firewalls--it also works with third-party firewalls, plus Web and proxy servers and content scanners such as MIMEsweeper. The same GUI is used to manage all these applications.
Lee also points out that changing firewall configurations can lead to outages. "You can make mistakes that bring a firewall down, but it's not super-common," he says. Check Point aims to make its products as easy to use as possible, and helps customers prepare and test appropriate configurations.
Redundant communication links are also important, as you don't want to be taken offline by a single backhoe accident. If your systems are located in a hosting centre, redundant connections to the Internet are probably part of the service, but if you are operating from your own premises you may need multiple connections through different ISPs to ensure high availability.
Stonesoft's StoneGate high-availability firewall and VPN product includes the ability to treat these links as a single virtual connection even if the various servers' public IP addresses have been allocated from different ranges. Outbound traffic is distributed according to which connections are the best performing at the time, Butler says.
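As a rough illustration of choosing among several ISP links by current performance, the sketch below picks the link with the lowest recently measured round-trip time among those that are up. The article does not say how StoneGate measures performance, so the Link class, the rtt_ms metric and the pick_link() function are assumptions made for the example, not the product's method.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical view of one ISP connection."""
    isp: str
    up: bool
    rtt_ms: float  # most recent round-trip time measurement (assumed metric)


def pick_link(links):
    """Return the best-performing link that is currently up."""
    candidates = [link for link in links if link.up]
    if not candidates:
        raise RuntimeError("no ISP link available")
    return min(candidates, key=lambda link: link.rtt_ms)


links = [
    Link("isp-a", up=True, rtt_ms=40.0),
    Link("isp-b", up=True, rtt_ms=12.5),
    Link("isp-c", up=False, rtt_ms=3.0),  # fast on paper, but down
]
first_choice = pick_link(links).isp

# If the preferred link is cut, new outbound traffic moves to the next best link.
links[1].up = False
second_choice = pick_link(links).isp
```

A link that is down is never chosen, however good its last measurement was, which is the point of keeping connections through more than one ISP.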











