
Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 16, 2010 21:14 UTC (Tue) by jengelh (subscriber, #33263)
Parent article: Ghosts of Unix past, part 3: Unfixable designs

Neil, will you, at some point, release short answers to the exercises?

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 17, 2010 4:28 UTC (Wed) by neilbrown (subscriber, #359) [Link] (13 responses)

Most of the exercises are research questions for which short answers do not exist.
Question 4 (on Xnotify) would result in an article that I would very much like to read. Question 5 could result in a potentially useful slab of code (though it is less clear whether it would be used).

I can tell you what I was thinking of in the "IP protocol suite" questions though, as no-one seems to have taken a stab at those in the comments.

The 'full exploitation' in IP relates to UDP. It is most nearly an application-layer (layer 7) protocol (as applications can use it to communicate), yet it is used at multiple levels of the stack - particularly for routing (at least back when we used RIP; BGP uses TCP), which is a layer 3 concern. It is used for VPNs and other network management, and even sometimes for application-level protocols.

The "conflated design" in IP is the fact that end-point addresses and rendezvous addresses are equivalent at the IP level. They aren't at higher levels. "lwn.net" is a rendezvous address, but the IP level you only see 72.51.34.34, which could (in a shared-hosting config) map from several rendezvous addresses. So upper level protocols (like http/1.1) need to communicate the *real* rendezvous address, because IP doesn't.

The "unfixable design" in IP is obviously the tiny address space, which we have attempted to fix by NAT and VPNs etc, but they aren't real fixes. Had IP used a distinct rendezvous address it would have only been needed in the first packet of a TCP connection, so it would have been cheap to make it variable-length and then we might not have needed IPv6 (though that doesn't really address UDP).

So those were my thoughts. I haven't spent as much time fighting with network protocols as I have with the Unix API so I'm a lot less confident of these ideas than of the ones I wrote formally about.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 17, 2010 4:40 UTC (Wed) by dlang (guest, #313) [Link] (4 responses)

just setting the address space to 64 bits instead of 32 bits would have been enough to eliminate the need for IPv6

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 17, 2010 10:00 UTC (Wed) by iq-0 (subscriber, #36655) [Link] (3 responses)

No, it would only eliminate the address-shortage part of the problem. There are other significant changes, long overdue, that IPv6 also addresses. The reasoning is: don't upgrade to half a solution when you know you will have to make another breaking upgrade soon after; the cost is in the breaking upgrade, not in the number of changes.
That is not to say that IPv6 is the holy grail. It's a design by committee, and as such is probably too different on one front and not different enough on another. And of course it's trial by jury with a terribly large jury, so there is probably not one protocol (now or ever) that would meet all the demands.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 17, 2010 23:23 UTC (Wed) by dlang (guest, #313) [Link] (2 responses)

what are the other problems that IPv6 solves?

at the time IPv6 was designed, it did a lot of things that were not possible in IPv4, but most (if not all) of the features people really care about have since been implemented in IPv4

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 8:11 UTC (Thu) by Cato (guest, #7643) [Link] (1 responses)

You are right about many IPv6 features, such as QoS, having been implemented in IPv4.

However, Mobile IP is much better implemented in IPv6 so you don't get inefficient 'triangular routing' - http://www.usipv6.com/ppt/MobileIPv6_tutorial_SanDiegok.pdf

The biggest benefit of course is not having to use NAT for IPv6 traffic.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 13:19 UTC (Thu) by vonbrand (guest, #4458) [Link]

Yep, that's why people are clamoring for NATv6 ;-) (Just as the idiotic firewalling going on has made everything run over HTTP.)

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 4:27 UTC (Thu) by paulj (subscriber, #341) [Link] (7 responses)

That the rendezvous address is (potentially) at a higher level than the end-point address is normal layering. For any given layer that provides some kind of addressing semantics, there can always be another layer above it that implements richer addressing and must map its richer addresses down to the lower layer. That's good and normal.

So to look for conflation in network addressing you probably need to stay within a layer. E.g. within IP, there is conflation in addressing because each address encodes both the identity of a node and its location in the network. Or perhaps more precisely: IP addressing really lacks the notion of identity, but an IP address is the closest thing you get, and so many things use it for that. This may be fixed in the future with things like Shim6 or ILNP, which separate IP addressing into location and identity. This would allow upper-layer protocols like TCP to bind their state to a host identity, and so decouple them from network location.
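
As a toy sketch of that split (the names and structure here are invented purely for illustration; this is not the actual Shim6 or ILNP machinery): upper-layer state is keyed by a stable identity, and the locator is only looked up at send time, so renumbering does not reset the session.

    # Toy illustration of the locator/identifier split.  The classes are
    # invented for illustration; real proposals (Shim6, ILNP, HIP) differ.

    class Host:
        def __init__(self, identity, locators):
            self.identity = identity        # stable, never changes
            self.locators = list(locators)  # network locations, may change

        def current_locator(self):
            return self.locators[0]

        def renumber(self, new_locators):
            # The network location changes; the identity does not.
            self.locators = list(new_locators)

    class Session:
        """Upper-layer (TCP-like) state bound to an identity, not a locator."""
        def __init__(self, peer):
            self.peer = peer
            self.peer_identity = peer.identity

        def send(self, data):
            # Look the locator up at send time instead of baking it into the
            # connection state, so renumbering does not break the session.
            locator = self.peer.current_locator()
            print(f"to {self.peer_identity} via {locator}: {data!r}")

    peer = Host("host-identity-1234", ["192.0.2.1"])
    session = Session(peer)
    session.send(b"hello")
    peer.renumber(["198.51.100.7"])    # peer moves; identity is unchanged
    session.send(b"still the same session")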

Variable length addresses would have been nice. The ISO packet protocol CLNP uses variable length NSAP addresses. However, hardware people tend to dislike having to deal with VL address fields. The tiny address space of IPv4 perhaps needn't have been unfixable - it could perhaps have been extended in a semi-compatible way. However it was decided (for better or worse) a long time ago to create IPv6.

Possibly another problem with IP, though I don't know where it fits in your list, is multicast. This is an error of foresight, due to the fact that multicast still had to be researched and it depended on first understanding unicast - i.e. IP first had to be deployed. The basic problem is that multicast is bolted on to the side of IP. It generally doesn't work, except in very limited scopes. One case is where it can free-ride on existing underlying network multicast primitives, i.e. ones provided by local link technologies. Another is where a network provider has gone to relatively great additional trouble to configure multicast to work within some limited domain - needless to say this is both very rare and even when done is usually limited to certain applications (i.e. not available generally to network users). In any new network scheme one hopes that multicast services would be better integrated into the design and be a first-class service alongside unicast.
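
For what it's worth, the free-ride-on-the-local-link case is also the one that is trivial to use at the socket level; a minimal sketch (group address and port picked arbitrarily from the administratively-scoped 239/8 range, sender and receiver assumed to be on the same link or host):

    import socket
    import struct

    GROUP, PORT = "239.1.2.3", 5007   # arbitrary administratively-scoped group

    # Receiver: join the group and wait for one datagram.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(("", PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                       socket.inet_aton("0.0.0.0"))
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    # Sender: TTL 1 keeps the packet on the local link.
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    tx.sendto(b"hello group", (GROUP, PORT))

    print(rx.recvfrom(1024))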

Another retrospectively clear error is IP fragmentation. It was originally decided that fragmentation was best done on a host-by-host basis, on the assumption that path MTU discovery could be done through path network control signalling and that fragmentation/reassembly was a reasonably expensive process that middle-boxes ought not to be obliged to do. IMO this was a mistake: path MTU signalling turned out to be very fragile in modern deployment (the IP designers didn't anticipate securo-idiocy), and fragmentation/reassembly turned out to be relatively cheap - routers routinely use links, both for internal buses and external connections, that require fragmenting packets into small fixed-size cells. As a consequence of the IP fragmentation choices, the IP internet is effectively limited to an (outer) path MTU of 1500 for evermore, regardless of changes in hardware capability. This causes problems for any IP packet protocol which wants to encapsulate itself or another. One imagines that any new network scheme would learn from the IP MTU mess, make different trade-offs and come up with something better and more robust.
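
To illustrate how little the sending host can actually do: it can ask for the don't-fragment bit and then hope the ICMP "fragmentation needed" messages make it back. A rough Linux-specific sketch (the option numbers are hard-coded because Python's socket module does not export them everywhere):

    import socket

    # Linux-specific option numbers from <linux/in.h>; hard-coded here
    # because Python's socket module does not export them on every version.
    IP_MTU_DISCOVER = 10
    IP_PMTUDISC_DO = 2    # always set the don't-fragment bit
    IP_MTU = 14           # read back the kernel's current path MTU estimate

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect(("192.0.2.1", 9))   # placeholder address (RFC 5737 test range)

    # The estimate only drops below the first-hop MTU if an on-path router
    # sends back an ICMP "fragmentation needed" message; if that message is
    # filtered, the sender simply never finds out.
    print(s.getsockopt(socket.IPPROTO_IP, IP_MTU))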

We should of course be careful to not overly condemn errors of foresight. Anticipating the future can be hard, particularly where people are busy designing cutting-edge new technology that will define the future. ;)

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 8:13 UTC (Thu) by Cato (guest, #7643) [Link] (5 responses)

One performance improvement in IPv6 is its much more regular IP header structure, which lowers the cost of hardware-based forwarding.

http://en.wikipedia.org/wiki/IPv6#Features_and_difference... has a good summary of the benefits of IPv6 including this one.
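
To make the "regular structure" point concrete: the IPv6 fixed header is always exactly 40 bytes with every field at a fixed offset, whereas an IPv4 header is 20-60 bytes and the IHL field has to be examined before a parser knows where anything beyond the fixed part lives. A rough sketch in software (the same property is what keeps the hardware simple):

    import struct

    def parse_ipv6_fixed_header(pkt):
        # 4 bytes version/traffic class/flow label, 2 bytes payload length,
        # 1 byte next header, 1 byte hop limit, then two 16-byte addresses:
        # always 40 bytes, no length field to consult first.
        vtf, payload_len, next_header, hop_limit = struct.unpack_from("!IHBB", pkt)
        return {
            "version":        vtf >> 28,
            "traffic_class": (vtf >> 20) & 0xff,
            "flow_label":     vtf & 0xfffff,
            "payload_length": payload_len,
            "next_header":    next_header,
            "hop_limit":      hop_limit,
            "src":            pkt[8:24],
            "dst":            pkt[24:40],
        }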

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 1:15 UTC (Fri) by dlang (guest, #313) [Link] (4 responses)

that sounds like something that was a really big deal when IPv6 was created, but with the increased processor speeds we have now it's not nearly as important.

this isn't just that clock speeds are higher, but that the ratio of clock speed to system bus speed is no longer 1:1; this means it's possible to execute far more steps without slowing the traffic down.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:15 UTC (Fri) by job (guest, #670) [Link]

If you're switching IP in hardware, it makes your design simpler and faster.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:41 UTC (Fri) by Cato (guest, #7643) [Link] (2 responses)

IP routers have not done CPU-based forwarding as the main path for a long time - the largest is probably the Cisco CRS-3, which forwards 322 terabits per second when fully scaled (http://newsroom.cisco.com/dlls/2010/prod_030910.html) - but even quite low-end routers now also use hardware forwarding (e.g. ASICs, network processors, etc., not a CPU).

You can probably manage to forward anything in hardware, but it helps somewhat that IPv6 has a regular header design.

IPv6 and hardware-parseable IP headers

Posted Nov 19, 2010 23:26 UTC (Fri) by giraffedata (guest, #1954) [Link]

I don't think CPU speed per se (how fast a single CPU is) is relevant. It's all about cost, since most IP networks are free to balance the number of CPUs, system buses, network links, etc.

And from what I've seen, as the cost of routing in a general purpose CPU has come down, so has the cost of doing it in a specialized network link processor (what we're calling "hardware" here) -- assuming the IP header structure is simple enough. So today, as ten years ago, people would rather do routing in an ASIC than allocate x86 capacity to it.

I think system designers balance system bus and CPU speed too, so it's not the case that there are lots of idle cycles in the CPU because the system bus can't keep up with it.

Ghosts of Unix past, part 3: Unfixable designs

Posted Dec 3, 2010 9:05 UTC (Fri) by paulj (subscriber, #341) [Link]

FWIW, major router vendors still use software routing for many of their lower-end enterprise routers, even some mid-range.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:18 UTC (Fri) by job (guest, #670) [Link]

Path MTU discovery is indeed broken in practice.

One thing I never really understood is why TCP MSS is a different setting from MTU. Given the belief that the MTU could be auto detected, MSS could be deduced from it.
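
Presumably the deduction would just be subtracting the fixed header sizes; a minimal sketch, assuming IPv4 with no IP or TCP options:

    # MSS = MTU - 20 (IPv4 header) - 20 (TCP header), so Ethernet's 1500
    # gives the familiar 1460.
    def mss_from_mtu(mtu, ip_header=20, tcp_header=20):
        return mtu - ip_header - tcp_header

    print(mss_from_mtu(1500))   # 1460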

Perhaps someone can enlighten me?

