Kaz Kylheku (kaz@cafe.net)
Mon, 18 Jan 1999 09:17:08 -0800
> On Sat, 16 Jan 1999, Kaz Kylheku <kaz@cafe.net> wrote:
>
> > This is the main
> > raison d'etre for alloca: extending a stack frame is darn fast compared to
> > using a dynamic allocator.
>
> Why is that? I mean, what causes the speed difference?
Because extending the stack frame can be a one instruction job! All you have to
do is displace the stack pointer by so many bytes. Whereas true dynamic
allocation typically involves searching a data structure for a suitably sized
free block. You pay for the convenience of having fewer constraints in the
sense that with dynamic allocation you are permitted can allocate and free in
any order so long as freeing of a given block does not precede its allocation.
:) Whereas if you allocate off a stack structure, you have to free in order,
or use a ``mark/release'' mechanism whereby you set a mark in the stack which
is later specified to a release operation that will throw away everything up to
that mark. There is actually no free() equivalent for alloca. You have to
terminate the function; so what you have is a coarse grained ``mark/release''
where the original stack top in that frame is the mark.
I believe that implemenations of alloca may be sensitive to compiler settings
that pertain to calling conventions. For example, if alloca() relies on stack
frame pointers for its cleanup, what happens if you use gcc with
-fomit-frame-pointer? The idea is that if your stack frames have frame
pointers, the cleanup instruction sequence upon function return is invariant of
the state of the stack pointer; the stack pointer is simply recovered from the
frame pointer, e.g. on i386:
movl %ebp, %esp # restore stack pointer
popl %ebp # pop out saved frame link into frame ptr
ret # return
Thus alloca can decrement the stack pointer all it wants without destroying the
validity of the above cleanup code. Without frame pointers, the cleanup code
has to assume that the stack frame size hasn't changed, so that it can simply
add a constant quantity to the stack pointer; but alloca will destroy the
validity of this assumption.
I just tried the following test: I compiled a trivial function with GCC
-fomit-frame-pointer. Surely enough, the use of the frame pointer %ebp
disappeared. Then I added a call to alloca() inside the function and recompiled
with the same settings. Guess what? GCC behaved as if -fomit-frame-pointer were
turned off when compiling that function. What's more, not only does GCC
recognize alloca(), but it fully inlines it. My call to alloca, which had a
constant expression of 10000 as its argument was replaced by a simple
addl $-10000, %esp
instruction, where X is the amount that was requested. That's a fast way to get
some memory, indeed! It's also inherently thread-safe; nothing special has to
be done in a threaded program to make it work. No implementation of malloc
can beat this.
Also note that in GNU C, you can dynamically allocate in automatic storage even
without using alloca; local arrays can be declared with sizes which are
non-constant expressions. That's really getting non-portable! E.g.
void foo(char *x)
{
char tmp[strlen(x) + 1];
strcpy(tmp, x);
// ...
}
GCC bends over backwards to allow alloca to work, which is not surprising since
many GNU programs use it. And I believe that the users are better off, because
the programs perform better as a a result. But for me, the portability hit
outweighs the performance temptation.
This archive was generated by hypermail 2.0b3 on Mon 18 Jan 1999 - 08:24:26 PST