> On Fri, 28 Aug 1998 00:57:26 -0700 (PDT), Curt Sampson <
> wrote:
>
> > Actually, it's not even academic. When the Linux stack receives a
> > fragmented 8K packet, it recopies all 8K of data to another area
> > of memory. Ow! That's a massive performance penalty in any world,
> > academic or not.
>
> Hmmm. I don't know how "clean" Linux's networking code is, but
> the memory bandwidth of a typical Linux CPU is much higher than
> the bandwidth of a typical Linux network interface so for us
> common folks that don't have 155 Mb/s access to a backbone
> there's no "massive performance penalty."
Linux has to copy the data because it has to allocate a buffer large enough to
hold all the fragments. The original buffers are discontiguous, yet Linux wants
the packet to be linear at all levels.

Does anyone here have any scientific analysis of sk_buffs versus mbufs, or is
this yet another religious debate? Linear buffers are easy to deal with, which
leads to efficient code in some places, at the cost of having to do extra
copies so that a fragmented packet can be linear. It's a trade-off.
The only way you can avoid the copying is to choose a data structure
that can represent a whole packet as a list of smaller buffers.
It's not clear whether such a representation gives you an overall performance
win. I personally think that the mbuf scheme in BSD is the C programming
equivalent of sado-masochism. Has it ever been submitted to the IOCCC?
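
To make the trade-off concrete, here is a rough sketch in C. It is not the
Linux or BSD code: struct frag and every function name below are made up for
illustration. It shows the one-time copy that the linear scheme pays, and the
extra list walking that every consumer pays in the chained scheme.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fragment descriptor: one received piece of a datagram. */
struct frag {
    const unsigned char *data;
    size_t len;
    struct frag *next;
};

/* Total payload length across the chain. */
static size_t chain_len(const struct frag *f)
{
    size_t n = 0;
    for (; f != NULL; f = f->next)
        n += f->len;
    return n;
}

/* Linear approach: allocate one buffer and copy every fragment into it.
 * Returns NULL on allocation failure; the caller frees the result. */
static unsigned char *linearize(const struct frag *f, size_t *out_len)
{
    size_t total = chain_len(f);
    unsigned char *buf = malloc(total ? total : 1);
    size_t off = 0;

    if (buf == NULL)
        return NULL;
    for (; f != NULL; f = f->next) {
        memcpy(buf + off, f->data, f->len);
        off += f->len;
    }
    *out_len = total;
    return buf;
}

/* Chained approach: no copy, but each consumer has to walk the list. */
static unsigned long chain_sum(const struct frag *f)
{
    unsigned long sum = 0;
    for (; f != NULL; f = f->next)
        for (size_t i = 0; i < f->len; i++)
            sum += f->data[i];
    return sum;
}

/* The same consumer over a linear buffer is a single flat loop. */
static unsigned long linear_sum(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```

A byte sum is about the simplest consumer there is. Anything that needs to
look at a header straddling two fragments gets uglier in the chained version,
which is where the linear layout earns back some of the cost of its copy.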
> For example, in my particular "academic world" data arrives at 10
> Mb/s so it takes 6.5 ms to receive those 8k. If the data is then
> copied at 200 ns per 32-bit word it takes 410 us to copy it: less
> than 10% of the time it took to receive it. Presumably this
> decision was made to simplify the code and thus make it *more*
> portable, not less.
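
The figures quoted above are easy to check. The two helpers below are only a
back-of-the-envelope sketch: the names are made up, 8K is assumed to mean
8192 bytes, and Mb/s is taken as 10^6 bits per second.

```c
/* Microseconds needed to receive `bytes` at `mbit_per_s` megabits/second.
 * One Mb/s is one bit per microsecond, so this is just bits / rate. */
static double receive_us(double bytes, double mbit_per_s)
{
    return bytes * 8.0 / mbit_per_s;
}

/* Microseconds needed to copy `bytes` at `ns_per_word` per 32-bit word. */
static double copy_us(double bytes, double ns_per_word)
{
    return (bytes / 4.0) * ns_per_word / 1000.0;
}
```

With these, receive_us(8192, 10) is about 6554 us and copy_us(8192, 200) is
about 410 us, so the copy is roughly 6% of the receive time, as claimed. At
155 Mb/s, however, receive_us(8192, 155) is only about 423 us, so the same
copy takes nearly as long as receiving the data in the first place.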
Also, the only time you will get a lot of fragmented packets is if you are
using some brain-dead UDP application that sends large datagrams, or if you
have TCP data going through an intermediate router whose MTU is lower than that
of either endpoint (but path MTU discovery should take care of that anyway).

Defragmentation only takes place if the packets are destined for the local
machine. Routed packets are not defragmented unless you compiled the kernel
with CONFIG_IP_ALWAYS_DEFRAG or some other option that implies it. Thus at
least this ``gross inefficiency'' doesn't impair Linux as an ordinary router.

It's good to see that people still care about memory copying inefficiencies
though. I currently maintain communication protocol software which, believe
it or not, does allocation and copying at every layer, quite unnecessarily. Now
that's revolting.