CSE 588 Network Systems, Spring 1997

A Comparison of IP Switching Technologies from 3Com, Cascade, and IBM

Executive summary

Network usage continues to grow rapidly. Web-based computing and an ever increasing number of users have brought unprecedented challenges to network infrastructures. LAN switching is currently a popular and cost-effective means of increasing bandwidth. However, it creates new problems. For example, conventional routers can't handle the increased traffic made possible by high performance switches. Further, emerging real-time applications such as video conferencing loom on the horizon and will require massive bandwidth as well as high quality of service (QoS).

Are faster routers and larger pipes, e.g., gigabit Ethernet, the way to go? Ideally, we would like something that scales well into the future and provides good QoS. ATM hardware looks promising but, being connection-oriented, it doesn't mesh well with connectionless IP. A number of schemes have been devised to run IP on top of ATM, none of which is fully satisfying, because none takes full advantage of ATM: they are too complex, too inefficient, or scale poorly. In early 1996, however, a startup company, Ipsilon Networks, introduced an elegant solution to this problem which they called IP Switching. The industry has since embraced it. Some companies license it from Ipsilon, while others have created their own versions with different twists. Indeed, a number of them provide some sort of "cut-through" switching over link-level technologies other than ATM. In this paper we compare the offerings from 3Com, Cascade, and IBM, who have recently joined forces to provide an integrated, end-to-end, desktop to LAN to WAN to LAN to server IP switching solution.

Background

The Ipsilon approach emerged as a high performance alternative to existing ATM protocols such as Multiprotocol Encapsulation (RFC 1483) and the ATM Forum's Multiprotocol over ATM (MPOA). In the first protocol, every router is connected to every other router by a direct ATM virtual circuit (VC) to minimize the number of layer 3 hops. But this full connectivity leads to the "n-squared" problem: the number of VCs required grows quadratically as routers are added (a full mesh of n routers needs n(n-1)/2 VCs, so 100 routers already require 4,950), which limits scalability.

The MPOA model groups workstations and servers within virtual subnets and uses ATM LAN Emulation (LANE) to move packets from one subnet to another. An external route server forwards an arriving packet and simultaneously downloads layer 3 information to the source device, which determines the ATM address of the destination using the Next Hop Resolution Protocol (NHRP). Subsequent packets between the same source and destination then use this ATM address, bypassing the route server completely. Overall response to the MPOA model remains mixed, perhaps because of the extensive hype and high expectations. Critics point out that MPOA requires many new, complicated protocols and depends on a route server, which could limit scalability.

Ipsilon proposed what they feel is a simpler, more robust solution. Ipsilon IP Switches are still based on high speed, high capacity ATM hardware, but General Switch Management Protocol (GSMP) and Ipsilon Flow Management Protocol (IFMP) software (at approximately 2,000 and 10,000 lines of code, respectively) replace the more complicated MPOA protocols (which come in at roughly 300,000 lines of code). GSMP (RFC 1987) replaces standard ATM signaling to request, tear down, and monitor VCs.
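The division of labor is easy to picture in code. Below is a minimal sketch, in Python, of the kind of interface GSMP gives the controller over the switch's VC table; the class, method names, and table layout are our own inventions, and the actual message formats and procedures are specified in RFC 1987.

    # Illustrative sketch only: real GSMP messages and fields are in RFC 1987.
    class GsmpController:
        """The routing/flow software drives the ATM hardware through a small
        request / tear down / monitor interface instead of full ATM signaling."""

        def __init__(self):
            # per-switch VC table: (in_port, in_vci) -> (out_port, out_vci)
            self.vc_table = {}

        def add_branch(self, in_port, in_vci, out_port, out_vci):
            self.vc_table[(in_port, in_vci)] = (out_port, out_vci)   # "request"

        def delete_branch(self, in_port, in_vci):
            self.vc_table.pop((in_port, in_vci), None)               # "tear down"

        def connections(self):
            return len(self.vc_table)                                # "monitor"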
IFMP (RFC 1953) is used between IP Switches, and between IP Switches and edge devices, to associate flows with VCs. An IP switch controller routes like an ordinary router, forwarding packets on a default VC. However, it also performs flow classification for traffic optimization. A flow is an extended IP conversation, i.e., a long-lived sequence of IP packets sent from a particular source to a particular destination sharing the same protocol type. Once a flow is identified, the IP switch sets up a cut-through connection, first establishing a VC for subsequent flow traffic and then asking the upstream node to use this VC. If the upstream node concurs, the traffic begins to flow on the new VC, bypassing the routing software and its associated processing overhead. Ipsilon estimates that flows make up more than 80% of internetwork traffic; the remaining traffic is forwarded by layer 3 hops in the usual way. Flows also provide a convenient hook for QoS. By analyzing IP headers, an IP Switch can relate individual flows to performance requirements and request ATM VCs with the proper type of service. Individual QoS requests for each flow will be supported using RSVP.

This scheme's greatest attribute is its performance: Ipsilon claims throughput of up to 5.3 million packets per second (pps) for their first-generation product, while more expensive high-end routers max out at around 1 million pps or less. It is fully compatible with existing and emerging IP protocols such as RIP, OSPF, DVMRP, and IGMP, and support for IPv6, RSVP, and BGP is planned. On the other hand, because IP switching is based on IP, tunneling or encapsulation techniques are needed for non-IP protocols; the multiservice capability normally expected from ATM doesn't exist in IP Switches, although IPX will be supported soon. The bulk of the criticism, however, relates to Ipsilon's use of virtual circuits. Flows are associated with application-to-application conversations, and each flow gets its very own VC. Large environments like the Internet, with millions of individual flows, would exhaust VC tables. Under these conditions, the design would need to be modified to support flows at a coarser granularity.
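To make the flow-classification mechanism, and the per-flow VC cost just criticized, concrete, here is a minimal sketch of the per-packet decision an IP switch controller might make. The flow key and the packet-count threshold are our own illustrative choices; the actual redirect exchange is specified in IFMP (RFC 1953).

    FLOW_THRESHOLD = 10  # packets seen before we bother setting up a VC

    class FlowClassifier:
        def __init__(self, allocate_vc, ask_upstream_to_redirect):
            self.counts = {}          # (src, dst, proto) -> packets seen
            self.cut_through = {}     # (src, dst, proto) -> dedicated VCI
            self.allocate_vc = allocate_vc
            self.redirect = ask_upstream_to_redirect

        def packet(self, src, dst, proto):
            key = (src, dst, proto)
            if key in self.cut_through:
                return ("switch", self.cut_through[key])  # hardware path
            self.counts[key] = self.counts.get(key, 0) + 1
            if self.counts[key] >= FLOW_THRESHOLD:
                vci = self.allocate_vc()                  # set up the VC first...
                if self.redirect(key, vci):               # ...then ask upstream
                    self.cut_through[key] = vci
            return ("route", None)                        # default VC, via router

    fc = FlowClassifier(allocate_vc=iter(range(100, 200)).__next__,
                        ask_upstream_to_redirect=lambda key, vci: True)
    for _ in range(11):
        action, vci = fc.packet("10.0.0.1", "10.0.0.9", "TCP")
    # the 11th packet returns ("switch", 100): the flow has been cut through

Every entry in cut_through is a dedicated VC, which is exactly why millions of concurrent flows would strain VC tables.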
In January of 1997, networking industry leaders 3Com, Cascade, and IBM announced their plans to cooperate in the implementation of end-to-end IP switching solutions across enterprise and public networks. The plan looks very promising: it has the advantage of being interoperable with other vendors' gear through industry standards, and it promises tremendous gains in end-to-end network performance. Each company has its own area of expertise. 3Com will represent the desktop and LAN community with its Transcend architecture and Fast IP product, while Cascade's IP Navigator will focus on superior WAN service. IBM will continue to develop its Aggregate Route-based IP Switching (ARIS) and Multiprotocol Switched Services (MSS) software, which control the routing, bridging, traffic, and congestion functions of its switches. The rest of the industry has clearly taken notice of the cooperative effort and eagerly awaits demonstrations in May.

3Com's Fast IP

3Com's Fast IP product focuses on LANs and provides IP switching across all types of backbone technologies, including Ethernet, Fast Ethernet, Gigabit Ethernet, FDDI, Token Ring, and ATM. A significant portion of Fast IP's functionality lies in new end-system software which provides 802.1p VLAN registration and NHRP address resolution. In environments where end systems cannot be upgraded, Fast IP switches supporting 802.1p, 802.1Q, and NHRP may establish a Fast IP connection on their behalf.

Fast IP is based on several emerging standards: 802.1Q, 802.1p, and NHRP. 802.1Q provides an architecture, protocol, and mapping for bridges between VLANs; it will enable standards-based identification of VLANs and deliver VLAN communications across common switched backbones. 802.1p specifies protocol mechanisms that allow end systems and switches to dynamically register VLAN membership and convey other information. Fast IP will use the Generic Attribute Registration Protocol (GARP) defined in 802.1p to provide VLAN membership registration and to enable switches to map and exchange topology information. For example, a desktop will issue a GARP message indicating its VLAN membership and location; in this way, all switches learn the VLAN topology. Lastly, NHRP specifies a mechanism that allows a source node to determine the subnetwork-layer address of either the destination node or the next hop toward the destination node. Although primarily designed for non-broadcast multi-access networks such as ATM, NHRP techniques may be extended to broadcast multi-access networks such as Ethernet, FDDI, and Token Ring.

The Fast IP process can be started by either an end system or a Fast IP-enabled switch. An end system will issue an NHRP request based on data to be forwarded to a separate subnet or VLAN. The event that triggers the NHRP request is configurable, e.g., after a specified number of packets are sent to the destination address, or when certain types of packets are sent, such as those based on QoS priorities. The NHRP request is a standard-format packet with source and destination MAC and IP addresses and a frame type indicating an NHRP packet. Contained in the data portion of the packet are the source end node's MAC address and VLAN ID, which the receiving end system will use to send an NHRP response back to the originating source.

The NHRP packet is forwarded to a router just like any other packet. The router can filter the packet or forward it according to configured policies; e.g., it may be configured to deny access to the destination subnet based on the source node's subnet address, or it may filter the packet based on the NHRP type field. If there are no restrictions, the router will forward the NHRP request to the destination node.

The destination node will issue an NHRP response directly to the originating source node using the source MAC address and VLAN ID contained in the NHRP request. Switches along the data path of the NHRP response forward the packet based on either the destination MAC address or the VLAN ID; where a switch does not have the destination MAC address in its address tables, it forwards the packet based on VLAN ID. This ensures that as long as the underlying infrastructure is switched, the NHRP response will reach the originating node. In returning the NHRP response, switches in the data path also learn and map the address of the source node. An NHRP response received by the originating source node indicates that there is an underlying switched connection between VLANs. The source node will then redirect data packets directly to the destination node using its MAC address, effectively bypassing the router and enabling wire-speed switching.
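A rough sketch of the end-system side of this process follows. The trigger value, method names, and message contents are illustrative assumptions, not 3Com's.

    TRIGGER = 5  # configurable: packets to an off-subnet destination before asking

    class FastIpEndSystem:
        def __init__(self, mac, vlan_id, send_nhrp_request):
            self.mac = mac
            self.vlan_id = vlan_id
            self.send_nhrp_request = send_nhrp_request
            self.counts = {}        # dst IP -> packets sent via the router
            self.shortcuts = {}     # dst IP -> destination MAC (bypasses router)

        def next_hop(self, dst_ip):
            if dst_ip in self.shortcuts:
                return ("direct", self.shortcuts[dst_ip])   # switched path
            self.counts[dst_ip] = self.counts.get(dst_ip, 0) + 1
            if self.counts[dst_ip] == TRIGGER:
                # the request carries our MAC and VLAN ID so the destination
                # can answer straight back across the switched fabric
                self.send_nhrp_request(dst_ip, self.mac, self.vlan_id)
            return ("router", None)     # default gateway until a reply arrives

        def nhrp_response(self, dst_ip, dst_mac):
            self.shortcuts[dst_ip] = dst_mac   # no response -> keep routing

    es = FastIpEndSystem("00:a0:24:01:02:03", vlan_id=7,
                         send_nhrp_request=lambda ip, mac, vlan: None)
    for _ in range(TRIGGER):
        es.next_hop("10.1.2.3")     # routed; the 5th packet fires the request
    es.nhrp_response("10.1.2.3", "00:a0:24:0a:0b:0c")
    assert es.next_hop("10.1.2.3") == ("direct", "00:a0:24:0a:0b:0c")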
If a response to a Fast IP connection request is not received, the requesting node simply continues to send packets through the default router gateway. Fast IP is designed to work over multiple network architectures; further, none of the underlying techniques (802.1p/Q, NHRP) is tied to IP, so Fast IP can easily be extended to other protocols. It is the only IP switching proposal that works across multiple backbone technologies and for multiple protocols. A distinctive feature of the 3Com solution is that flow policy is based on requests initiated from desktop and server systems. It seems straightforward: data senders can explicitly tag associated frames with the desired policy, eliminating the guesswork and performance-compromising analysis at downstream devices.

Cascade's IP Navigator

Cascade's IP switching product, IP Navigator, is IP switching for WANs. It's a software upgrade to their existing ATM and frame relay switches running Virtual Network Navigator (VNN). VNN is Cascade's OSPF-based networking architecture, which provides the internal communications for their family of multiservice WAN switches. VNN manages frame relay and ATM attributes such as available bandwidth and QoS, and it performs the routing functions needed to establish end-to-end VCs throughout the network, taking QoS into consideration when calculating the best routes. Unlike Ipsilon's IP Switching, which replaces standard ATM signaling with GSMP, IP Navigator can run alongside other ATM protocols.

IP Navigator adds an IP routing table to VNN, except that instead of recording the IP address of the next hop for each IP destination address, it records the end-destination switch id (e.g., a virtual circuit identifier (VCI) in the case of ATM). This is much like Cisco's Tag Switching or IBM's ARIS.

IP Navigator/VNN addresses the O(N^2) virtual circuit scaling problem by defining a new type of virtual circuit, the Multipoint-to-Point Tunneling (MPT) virtual circuit. In MPT, a switch, call it A, uses itself as the root and establishes a single multicast circuit to all other switches in the network, adding them as leaves. This multicast circuit informs all other switches of the circuit to be used for forwarding data to switch A, and it actually functions as a reverse forwarding tree for data destined to switch A. To forward traffic to switch A, another switch looks for the multicast circuit rooted at switch A and sends data in the reverse direction, from leaf to root. This dramatically reduces the total number of VCs in the core to N, where N is the number of edge switches.

So when a frame is received by a port of a Cascade switch configured for IP Navigator, its IP header is examined and its egress switch is looked up in the IP routing table. The packet then moves rapidly through a preestablished Multipoint-to-Point Tunnel to the egress switch, where another routing table lookup determines the egress port. Unlike Ipsilon's IP Switching, under this scheme every packet gets switched, and no time is spent setting up a session: the MPT VCs are established at startup time. Latency is reduced and packet processing speeds are increased by removing the layer 3 routing hops in the core.
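The data path just described amounts to one longest-prefix match at the ingress. In the sketch below (the prefixes, switch ids, and table layout are our own illustrative choices), the match maps a destination to an egress switch, and the preestablished MPT VC for that switch carries the packet the rest of the way.

    from ipaddress import ip_address, ip_network

    class MptIngress:
        def __init__(self):
            self.routes = []   # (prefix, egress switch id): egress, not next hop
            self.mpt_vc = {}   # egress switch id -> VCI of its reverse tree

        def add_route(self, prefix, egress):
            self.routes.append((ip_network(prefix), egress))

        def forward(self, dst_ip):
            addr = ip_address(dst_ip)
            matches = [(n, sw) for n, sw in self.routes if addr in n]
            if not matches:
                return None
            _, egress = max(matches, key=lambda m: m[0].prefixlen)
            return self.mpt_vc[egress]   # one table lookup, then pure switching

    ingress = MptIngress()
    ingress.add_route("192.168.0.0/16", "switch-A")
    ingress.mpt_vc["switch-A"] = 42
    assert ingress.forward("192.168.7.9") == 42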
Cascade's literature makes a big deal of this MPT technology. We thought we must be missing something here, but as far as we can tell from the limited information available, this is pretty much what Cisco's TDP and IBM's ARIS do as well; they just give it other names and don't make it sound so grandiose. Once an incoming packet is assigned a VCI (in the case of ATM) by an edge switch, the ATM switches do all the rest to move it to its egress switch. Perhaps there is more to it for traffic that isn't best-effort, i.e., traffic requiring some higher QoS. For best-effort traffic, the normal hop-by-hop IP routed path and the reverse MPT path are the same.

In the case of ATM, because IP packets from different sources can converge and share a VCI on the way to the same destination, cells from two packets can become interleaved, with no way to sort them out again. One solution is to buffer colliding packets, which conserves VCs but may require additional hardware; Cisco is taking this approach. Another is to use ATM virtual path (VP) labels: one VP per egress point, with each source point using a different VC within the VP. The destination switch can then sort out interleaved cells. Although this method uses more VCs, the amount of state information is still O(N), where N is the number of destinations. IP Navigator takes this latter approach; IBM's ARIS supports both. Note that this is not a problem for Ipsilon's IP Switching, since every flow receives its own VC and VCs are never shared.
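The VP approach is easy to see in a sketch. Assuming one VPI per egress point and one VCI per ingress (the labels below are invented for illustration), the egress can reassemble interleaved cells per (VPI, VCI) pair.

    class EgressReassembler:
        def __init__(self):
            self.partial = {}   # (vpi, vci) -> cells collected so far

        def cell(self, vpi, vci, payload, last):
            buf = self.partial.setdefault((vpi, vci), [])
            buf.append(payload)
            if last:                       # AAL5-style end-of-packet cell
                del self.partial[(vpi, vci)]
                return b"".join(buf)       # one whole packet, one source
            return None

    rx = EgressReassembler()
    rx.cell(vpi=7, vci=1, payload=b"AA", last=False)   # from ingress 1
    rx.cell(vpi=7, vci=2, payload=b"BB", last=False)   # from ingress 2, interleaved
    assert rx.cell(7, 1, b"aa", True) == b"AAaa"
    assert rx.cell(7, 2, b"bb", True) == b"BBbb"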
IP Navigator provides full ATM-quality QoS for IP by using VNN's existing base of QoS features, including large buffers, weighted fair queuing, the Quad-Plane architecture (four separate routing planes in their switches, giving four QoS levels which can be further subdivided), and rapid convergence and rerouting with OSPF. It allows programmable QoS for IP connections, configurable by port, route, or IP address, or user defined through RSVP. Later enhancements will coordinate QoS support between campus and wide-area backbones by mapping QoS from IFMP, NHRP, PNNI, and TDP to IP Navigator. Also, because the number of VCs can be kept relatively small using MPT VCs, it is practical to later add additional MPT VCs dedicated to guaranteed levels of service.

The number of layer 3 hops can be reduced even further by using IFMP to link NHRP local-area connections to IP Navigator wide-area connections, essentially moving the edge of the MPT tree to the campus. And with 3Com's Fast IP using 802.1p/Q, the edge of the MPT tree moves all the way to the desktop and server.

IBM's ARIS

IBM's Aggregate Route-based IP Switching (ARIS) takes, according to IBM, a more general approach to IP switching, one that is not specific to any one set of products. It's actually quite similar to Cisco's Tag Switching and to IP Navigator. Indeed, IBM and Cisco are co-chairs of the IETF Multi-Protocol Label Switching (MPLS) working group, which will define where label-swapping-based forwarding, i.e., tag switching, is headed for the ultimate standard. IBM will use ARIS in their ATM and frame relay switches as well as their LAN switches.

In ARIS parlance, a switch that has had IP routing capability added to it is known as an Integrated Switch Router (ISR). Edge ISRs perform the usual forwarding of IP datagrams, except that the next-hop field in the IP routing table now contains a reference to a switched path known as the "egress identifier"; in the case of ATM, it would contain an ATM VCI. This switched path may lead just to a neighboring ISR (comparable to IP next hops on conventional routers), or it may traverse a series of ISRs, following a standard IP routing path, to an egress ISR. ARIS pre-establishes switched paths to "well known" egress ISRs; as a result, virtually all best-effort traffic is switched. These well-known egress nodes are learned through standard routing protocols such as OSPF and BGP.

Egress ISRs initiate the setup of switched paths by sending Establish messages to their upstream neighbors. These neighbors forward the messages on to their own upstream neighbors in Reverse Path Multicast (RPM) style, but only after ensuring the switched path is loop free. Eventually all ISRs establish switched paths to all egress ISRs. The switched path to an egress ISR in general takes the form of a tree rooted at the egress ISR; a tree results from the "merging" that occurs at a node when multiple upstream switched paths for an egress point are spliced to a single downstream switched path for that egress point. These ARIS switched-path trees, which look very similar if not identical to the MPT trees in IP Navigator, also solve the VC scaling problem, keeping the number of VCs used in the core to O(N).

However, this isn't the whole story for ARIS. ARIS uses different types of egress identifiers to balance the desire to share the same egress identifier among many IP destination prefixes against the desire to maximize switching benefits; ISRs choose the type of egress identifier based on routing protocol information and local configuration. The first type of egress identifier is the IP destination prefix. It gives each IP destination prefix its own switched-path tree and thus will not scale in large backbone and enterprise networks; however, it is the only information that some routing protocols, such as RIP, can provide, and it may work well in networks where the number of destination prefixes is limited, such as campus environments. The second type is the egress IP address, used primarily with BGP protocol updates, which carry this information in the next_hop attribute. The third type is the OSPF router id, which allows OSPF to aggregate traffic on behalf of multiple datagram protocols. The fourth type is the multicast pair, used by multicast protocols such as DVMRP, MOSPF, and PIM. Other egress identifiers may be defined, such as IS-IS NSAP addresses, NLSP IPX addresses, and IPv6 destination prefixes.

As mentioned before, in the case of ATM, cells corresponding to IP packets from different sources can become interleaved. ARIS supports both the use of ATM switching hardware capable of preventing cell interleaving, of which there is very little currently, and the use of ATM virtual paths (VPs) to the egress points rather than VCs. The current ARIS specifications say ARIS can be extended to support QoS parameters, but this will be addressed in a future revision; currently there is no QoS in ARIS, just best effort.
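The tree building can be sketched as follows. This is illustrative only: the real Establish message contents and label distribution rules are in the ARIS Internet Draft, and per-hop label splicing is simplified here to a single VC id.

    class Isr:
        def __init__(self, name, upstream_neighbors):
            self.name = name
            self.upstream = upstream_neighbors   # reverse-path-multicast style
            self.downstream_vc = {}              # egress id -> VC toward egress

        def establish(self, egress_id, via_vc, path):
            if self.name in path:                # loop check before forwarding
                return
            if egress_id in self.downstream_vc:  # already on the tree: merge,
                return                           # don't grow a second branch
            self.downstream_vc[egress_id] = via_vc
            for nbr in self.upstream:
                nbr.establish(egress_id, via_vc, path + [self.name])

    a = Isr("A", [])
    b = Isr("B", [a])
    c = Isr("C", [a, b])            # two upstream paths toward egress C exist
    c.establish("egress-C", via_vc=5, path=[])
    assert a.downstream_vc == b.downstream_vc == {"egress-C": 5}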
Conclusions

There are a number of concerns surrounding these various IP switching implementations. A significant issue is scalability, and the granularity of what gets switched determines in large part how scalable a given solution is. In Ipsilon's IP Switching, an application-to-application conversation gets its own VC and, at least currently, this is the only granularity provided. Fast IP, it seems, can also provide this application-to-application granularity, but would use it only for traffic needing better than best-effort QoS; for best-effort traffic, Fast IP provides node-to-node granularity, aggregating all the applications communicating between two nodes into one circuit. ARIS provides a few different granularities, ranging from node-to-node to switch-to-switch, through its egress identifier types; switch-to-switch granularity aggregates traffic to all nodes reachable from an egress switch into one circuit. IP Navigator provides only switch-to-switch granularity.

Other issues include achievable QoS, which protocols an implementation is tied to, which backbone technologies it can use, how well it interoperates with other equipment, and how easily it can be upgraded. This, combined with the fact that the standards are still evolving, makes these proposals difficult to evaluate. Having said this, the interoperability initiative taken by 3Com, Cascade, and IBM is encouraging: it is based on external standards rather than a single company's technology, and it provides a promising end-to-end solution.

References

"Draft Standard for Traffic Class Expediting and Dynamic Multicast Filtering", IEEE 802.1p/D6, April 1997.
"Draft Standard for Virtual Bridged Local Area Networks", IEEE 802.1Q/D5, February 1997.
"IP Navigator White Paper", Cascade Communications Corp., December 1996.
"LAN Emulation over ATM Version 2 - LUNI Specification - Straw Ballot", ATM Forum/STR-LANE-LUNI-02.00, April 1997.
"Multiprotocol Over ATM Version 1.0 - Straw Ballot", ATM Forum/STR-MPOA-MPOA-01.00, February 1997.
R. Bellman, "IP Switching -- Which Flavor Works for You?", Business Communications Review 27(4), April 1997, 41-46.
J. Hart, "Fast IP: The Foundation for 3D Networking", 3Com Corporation PN 501312-001, January 1997.
J. Heinanen, "Multiprotocol Encapsulation over ATM Adaptation Layer 5", IETF RFC 1483, July 1993.
J. Luciani, et al., "NBMA Next Hop Resolution Protocol (NHRP)", IETF Internet Draft, draft-ietf-rolc-nhrp-11.txt, March 1997.
R. McGee, Keynote Speech at Networks Expo.
P. Newman, et al., "Flow Labeled IP: A Connectionless Approach to ATM", Proc. IEEE Infocom, San Francisco, March 1996, 1251-1260.
P. Newman, et al., "Ipsilon Flow Management Protocol Specification for IPv4", IETF RFC 1953, May 1996.
P. Newman, et al., "Ipsilon's General Switch Management Protocol Specification", IETF RFC 1987, August 1996.
A. Viswanathan, et al., "ARIS: Aggregate Route-Based IP Switching", IETF Internet Draft, draft-viswanathan-aris-overview-00.txt, March 1997.