BGP TTL "hack"

At the NANOG 26 meeting in October 2002, Dave Meyer presented a very simple proposal to protect BGP sessions against attacks: set the TTL to 255 on outgoing packets, and check whether the TTL in received packets is equal to 255. Since routers always decrement the Time To Live (or Hop Limit in IPv6) when forwarding a packet, and discard packets whose TTL reaches 0, there is no way for anyone who isn't attached to the subnet in question to inject packets with a TTL of 255 into that subnet. RFC 3682 was published in February and describes the details of the "Generalized TTL Security Mechanism (GTSM)".
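The sending half of this is easy to try out on an ordinary host. The Python sketch below only sets the TTL of outgoing packets on a TCP socket to 255, using the standard IP_TTL socket option. The helper name is made up for illustration, and the receive-side check (the part that actually provides the protection) is not shown.

```python
import socket

GTSM_TTL = 255  # TTL that GTSM expects on packets from a directly connected peer


def make_ttl255_socket() -> socket.socket:
    """Hypothetical helper: IPv4 TCP socket whose outgoing packets carry TTL 255."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, GTSM_TTL)
    return sock


if __name__ == "__main__":
    s = make_ttl255_socket()
    # Read the option back to confirm the kernel accepted it.
    print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    s.close()
```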

Cisco has now included GTSM in IOS release 12.3(7)T, as explained in the feature guide. It seems there are some interesting caveats. First of all, Cisco states that enabling the feature using the neighbor ... ttl-security command will only enable the check for incoming packets and not change any behavior for outgoing packets. So this must mean they always use a TTL of 255 for outgoing packets now. However, older IOS versions set the TTL for BGP packets to 1 (in the absence of any ebgp-multihop settings). If that is still the case, then detecting whether a neighbor is directly connected won't be very reliable right now. Then again, Cisco says the feature must be configured on both ends of an eBGP session (no support for iBGP as of yet), which seems to contradict this.

Another thing is that they look for a TTL of 254 or higher. This suggests that for incoming packets with a TTL of 255, they first decrease the TTL and then go on to process the TCP segment. Again, this is not how things work in older IOS versions, as it's perfectly possible to set the TTL to 0 and still interact with a Cisco router on the local subnet. So unless something changed in this regard as well, accepting a TTL of 254 means that there can still be a router in between!
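To make the difference between the two thresholds concrete, here is a small Python model. It assumes the receiver compares the TTL exactly as it arrived on the wire, without decrementing it first; the function names are illustrative and have nothing to do with how IOS implements the check.

```python
SEND_TTL = 255  # TTL set by a GTSM-aware sender


def ttl_on_arrival(routers_in_path: int, send_ttl: int = SEND_TTL) -> int:
    """TTL seen by the receiver: each forwarding router lowers it by one."""
    return max(send_ttl - routers_in_path, 0)


def ttl_check_passes(received_ttl: int, min_ttl: int) -> bool:
    """Accept the packet only if its TTL is at least min_ttl."""
    return received_ttl >= min_ttl


for routers in (0, 1, 2):
    ttl = ttl_on_arrival(routers)
    print(
        f"{routers} router(s) in between: TTL {ttl}, "
        f"passes >=255: {ttl_check_passes(ttl, 255)}, "
        f"passes >=254: {ttl_check_passes(ttl, 254)}"
    )
```

Under that assumption, a threshold of 255 only admits directly connected senders, while a threshold of 254 also admits packets that crossed one router. If the receiver does decrement before checking, 254 is the value a directly connected sender would produce.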

Note that this mechanism only offers protection against attacks on port 179 of a router from "far away". Anyone on the local subnets still gets to do whatever they please and the content of the BGP sessions isn't protected any better than before.

Permalink - posted 2004-04-08

BGP on Cisco 2500

It has been a while since the last news posting. My apologies for that. Here is something to hold you over until I can find some real news:

When perusing the HTTP referrer log, I noticed that a lot of people are finding this site in search of "bgp+2500" or something similar. So... is it possible to run BGP on a Cisco 2500 router?

The short answer is "yes". The IOS images for the 2500 support BGP, including BGP for IPv6. (They do not, however, support OSPF for IPv6 even though OSPF for IPv4 is supported and the "ipv6 router ospf" command may exist. Same thing for IS-IS: the command exists, but the protocol isn't present in any of the 2500 images.)

The slightly longer answer is that a 2500 is of limited use for inter-domain routing. Actually, way back when I got started with BGP, I used a 2514 with 16 MB RAM and it could hold the entire 35000 or so entry global routing table. From two upstreams even, if I remember correctly. However, in the meantime the global routing table has gotten four times as big, and the 2500's memory limit is still 16 MB. The fact that the 2500 series sports a 68030 CPU doesn't really help either. All of this means that you can only run BGP on a 2500 if you don't send it more than a few thousand routes. You also shouldn't send it a full feed and have the 2500 filter out the unwanted routes, as this will tax the CPU too much and make for many-minute convergence times.

Note that even 35000 routes wouldn't work anymore today, as modern IOS images need a lot more memory for their internal housekeeping. On bigger routers it gets even worse because unlike the 2500, those can't run their software from flash, so it must be copied to RAM. Additionally, the switching path of choice is CEF these days, which takes a lot of memory. (Without CEF you'll be using fast switching, which uses just as much memory but only when needed, so it's not only the CPU that melts down but you also run out of memory when a slammer-like worm hits.)

So you may be able to get away with a full table on a Cisco with 128 MB RAM (or you may not), but 256 MB gives you much more elbow room. Unfortunately, Cisco still makes boxes that can run BGP but won't take enough memory to do so properly. A good example is the 3550 series of multilayer switches.

Permalink - posted 2004-03-31

Apple Safari IPv6 hack

IPv6-enabled operating systems such as Windows, Linux and FreeBSD all come with a web browser that also supports IPv6 and prefers IPv6 when both IPv4 and IPv6 are available. Things are slightly different with Safari, Apple's browser application. Initially, Safari only supported literal IPv6 addresses and some corner case DNS names. In the current version, Safari will do IPv6 if no IPv4 address is available, but it won't prefer IPv6 over IPv4 or fall back to IPv6 when IPv4 doesn't work.

However, Nicholas Humfrey has come up with a trick. By enabling Safari's debug mode and switching off one of the two HTTP loaders that are normally used, you can make Safari prefer IPv6 when it's available. See the Mac OS X hints article that Nick posted for the details.

Permalink - posted 2004-01-28

Clearing the DF bit

As I wrote a few weeks ago in an article under the name "no ip unreachables", path MTU discovery doesn't work all that well across the internet in practice. Since then, I've noticed that people end up on this site looking for ways to clear the don't fragment bit in the IP header. So here is an example of how to do this on a Cisco router:

!
route-map nodf permit 10
 set ip df 0
!
interface FastEthernet2/0
 ip policy route-map nodf
!

Note that the "ip policy route-map nodf" command must be applied on the interface receiving the packets for which the DF bit must be cleared, and not the interface with the reduced MTU itself, where the packets are subsequently transmitted. See a page at Cisco for additional strategies.
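For those wondering what clearing the DF bit means at the packet level: the flag is bit 0x4000 of the 16-bit flags/fragment offset field in the IPv4 header, and changing it means the header checksum has to be recomputed. The Python sketch below illustrates that on a raw header. The function names are made up, and this says nothing about how IOS does it internally.

```python
import struct

DF_MASK = 0x4000  # "don't fragment" flag in the flags/fragment-offset field


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, inverted."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def clear_df(packet: bytes) -> bytes:
    """Return a copy of the IPv4 header with DF cleared and checksum updated."""
    ihl = (packet[0] & 0x0F) * 4           # header length in bytes
    hdr = bytearray(packet[:ihl])
    flags_frag = struct.unpack_from("!H", hdr, 6)[0] & ~DF_MASK
    struct.pack_into("!H", hdr, 6, flags_frag)
    struct.pack_into("!H", hdr, 10, 0)     # zero the checksum field first
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

A handy property for checking the result: running ipv4_checksum over a header that already contains a valid checksum yields 0.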

Permalink - posted 2004-01-12

New minimum allocation size at RIPE

The RIPE NCC has changed its policy regarding the initial allocation that new LIRs receive. The rule that efficient use for at least a /22 must be demonstrated is now off the table, and the minimum allocation is now a /21 rather than a /20. See the announcement. RIPE also maintains a list of minimum allocation and assignment sizes for their address blocks (linked from the announcement), but this is pretty much useless because filtering on allocation size is too restrictive while filtering on assignment size isn't restrictive enough for many address blocks. So be very careful when implementing prefix length filtering.

Without the jargon, please!

Right. Most of us get our IP addresses from our ISPs, and ISPs usually have one or more blocks of IP address space of their own. Having their own address space is important for ISPs because this allows them to be independent from their ISPs by allowing them to change ISPs without having to change addresses. (Obviously this is useful to end-users as well, but this changed policy applies to ISPs.) Until now, ISPs that wanted to get address space of their own needed to show that they and/or their customers would start using 1024 addresses (a /22) immediately. In this case, they would get a block of 4096 addresses (a /20). The advantage of having such a large block is that everyone in the world is prepared to store a pointer to it in their routers, making the addresses globally usable without limitations.
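The address counts that go with these prefix lengths follow directly from the 32-bit IPv4 address size, as this bit of Python arithmetic shows (the function name is just for illustration):

```python
def addresses_in(prefix_len: int) -> int:
    """Number of IPv4 addresses covered by a prefix of the given length."""
    return 2 ** (32 - prefix_len)


for plen in (20, 21, 22):
    print(f"/{plen}: {addresses_in(plen)} addresses")
# /20: 4096 addresses
# /21: 2048 addresses
# /22: 1024 addresses
```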

This matters because some networks only accept routing information for address blocks that are at least as large as the smallest blocks that RIPE and the other Regional Internet Registries (ARIN, APNIC and LACNIC) give out to ISPs. Smaller address blocks aren't entirely useless, but they may not be globally reachable without having to depend on the ISP the addresses came from, which of course limits ISP independence.

Since RIPE is now giving out blocks of 2048 addresses (/21) from some of their address blocks, networks are expected (and pretty much forced) to accept these blocks. This is good news for small ISPs that want their own independent block: they no longer have to jump through hoops trying to show they need 1024 addresses, or make do with only semi-independent addresses.
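A prefix length filter of the kind discussed here boils down to a single comparison. The Python sketch below uses the standard ipaddress module; the function name and the example prefixes (taken from private address space) are purely illustrative, and real filters are of course configured per address block on the router.

```python
import ipaddress


def accepted(prefix: str, longest_allowed: int) -> bool:
    """Accept a route only if its prefix is not longer than the filter allows."""
    return ipaddress.ip_network(prefix).prefixlen <= longest_allowed


for prefix in ("10.0.0.0/20", "10.0.8.0/21", "10.0.16.0/22"):
    print(
        f"{prefix}: old /20 filter -> {accepted(prefix, 20)}, "
        f"new /21 filter -> {accepted(prefix, 21)}"
    )
```

With the filter at /20, the /21 is dropped. Relaxing the filter to /21 lets it through, while the /22 is still rejected.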

Note that the other RIRs haven't changed their policies (or at least there are no announcements to be found). ARIN's policy, for instance, is even more restrictive than the old RIPE policy: multihomed networks must show efficient use of a /21 to get a /20, and single-homed ISPs must even show efficient use of a full /20 to get a /20. So for now the good news only applies to ISPs in the RIPE region, which is roughly Europe, the Middle East, Africa north of the Sahara and the former Soviet Union. For more info, see the RIR policy comparison matrix.

Permalink - posted 2004-01-10

IPv6 documentation prefix and IPv6 site/host list

There is now an official IPv6 prefix set aside for documentation purposes: 2001:0DB8::/32. (Leading zero courtesy of APNIC.)

The how and why is documented at a page at APNIC. Note that there is also a prefix set aside for documentation purposes in IPv4: 192.0.2.0/24. See RFC 3330 for more information and other special IPv4 prefixes.
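If you want to check whether an address in a config or document falls inside one of these two documentation prefixes, Python's standard ipaddress module will do. The helper name below is made up for illustration, and only the two prefixes mentioned above are covered.

```python
import ipaddress

# The two documentation prefixes mentioned in the text.
DOC_PREFIXES = [
    ipaddress.ip_network("2001:0DB8::/32"),
    ipaddress.ip_network("192.0.2.0/24"),
]


def is_documentation_address(addr: str) -> bool:
    """True if addr lies inside one of the documentation prefixes above."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in DOC_PREFIXES)


print(DOC_PREFIXES[0])                          # 2001:db8::/32
print(is_documentation_address("2001:db8::1"))  # True
print(is_documentation_address("192.0.2.55"))   # True
print(is_documentation_address("192.0.3.1"))    # False
```

Note that the module normalizes the prefix to 2001:db8::/32, so the leading zero and the capitalization make no difference.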

At prik.net there is now a list of IPv6-enabled hosts or sites. I have no idea how complete the list is, but it has more than 3000 entries so it's better than the manually maintained stuff in some other places. If the link doesn't work, this is probably because your browser doesn't understand compressed content. In that case, use the uncompressed version. The compression ratio is about 1 : 6.

Permalink - posted 2004-01-04
