
VanLUG Email Archive

Re: aha!

Ted Powell
Sat, 3 Oct 1998 17:56:39 -0700

On Sat, Oct 03, 1998 at 05:22:45PM -0700, klindsay wrote:
>
> Hmm... I use this method on high-end production machines which produce
> large log files quite quickly. I just wrote a bash rotation script which
> has been working for the last few months without a hitch. I just :>
> /var/log/(logfile) and it seems to be working great.

I didn't mean to imply that zapping the content would _never_ work,
just that it doesn't _always_ work.

The safe way is to rename the currently active log file and create a
new file with the original name. Programs which keep their log file open
will keep writing to the renamed file.

Then you do whatever it takes to make the program close and reopen its
log file, typically sending it a SIGHUP, but for some you need to stop
it and restart it. This has to be configured on a case-by-case basis,
but should have been done by the builders of the distribution and/or
the individual package.

After waiting a decent interval for processes using the renamed log file
to respond to the SIGHUP or whatever by closing it and beginning to use
the new file, your script is then free to compress the renamed file,
move it elsewhere, or whatever it is configured to do.
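
As a rough illustration of the sequence above (rename, recreate, signal, wait, compress), here is a minimal sketch in Python. The function name rotate_log, the ".0" suffix, the daemon_pid parameter and the five-second grace period are all assumptions made up for this example. A real setup has to know, per daemon, whether SIGHUP is enough or a restart is needed.

```python
import gzip
import os
import shutil
import signal
import time


def rotate_log(log_path, daemon_pid=None, grace_seconds=5.0):
    """Rotate log_path by renaming it; return the path of the compressed copy.

    daemon_pid and grace_seconds are hypothetical knobs for this sketch.
    """
    rotated = log_path + ".0"

    # Rename first: a daemon that still holds the file open keeps writing
    # to the same file, now under the new name, so no entries are lost.
    os.rename(log_path, rotated)

    # Recreate the original name with the old permission bits (subject to
    # the umask) so the daemon finds it when it reopens its log.
    mode = os.stat(rotated).st_mode & 0o777
    os.close(os.open(log_path, os.O_WRONLY | os.O_CREAT, mode))

    if daemon_pid is not None:
        # Ask the daemon to close and reopen its log file, then give it
        # a decent interval to do so before touching the old file.
        os.kill(daemon_pid, signal.SIGHUP)
        time.sleep(grace_seconds)

    # Only now is it safe to compress (or move) the renamed file.
    compressed = rotated + ".gz"
    with open(rotated, "rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(rotated)
    return compressed
```

Calling rotate_log("/var/log/(logfile)", daemon_pid=...) with the PID of the process that owns the log would perform one rotation. Leaving daemon_pid as None skips the signal and the wait, which is only appropriate when nothing holds the file open.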

> And I've never
> experienced any gaping holes from daemons writing at the previous offset.
> Since syslog handles all of our logs I probably don't have to worry, but
> yes if a daemon wrote its own logs writing to the previous offset, (Which
> seems wacky to me, any idea why this would be done?) it would leave
> gaping holes.

Normal behaviour when writing to a sequential file in UNIX is to write at
an offset equal to previous_offset + length_of_the_previous_write. If a
file is opened in append mode, the i/o system is responsible for seeing
that all subsequent writes take place at the end of the file, wherever
it may have moved to, but if a program does an ordinary open and seeks
to the end of the file itself, for example, it won't have this guarantee.
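
The difference can be reproduced with a small experiment. This is a sketch, assuming a POSIX system and Python 3. The helper name size_after_truncate is invented for the example, and os.truncate stands in for the ":>" in the rotation script.

```python
import os
import tempfile


def size_after_truncate(use_append):
    """Write, truncate the file from outside, write again; return final size."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    flags = os.O_WRONLY | (os.O_APPEND if use_append else 0)
    writer = os.open(path, flags)
    try:
        os.write(writer, b"x" * 1000)     # descriptor offset is now 1000
        os.truncate(path, 0)              # same effect as ":> logfile"
        os.write(writer, b"new entry\n")  # 10 more bytes
        return os.stat(path).st_size
    finally:
        os.close(writer)
        os.remove(path)


print(size_after_truncate(use_append=True))   # 10
print(size_after_truncate(use_append=False))  # 1010
```

Without append mode the second write lands at the old offset of 1000, so the file comes back with a 1000-byte hole in front of the new entry.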

I observed the described behaviour with the log files written by the
NCSA web server (from which Apache was derived), running under SunOS
4.3 on a Sparc 10. If you're lucky, you may never experience it in your
environment. If you're a sysadmin, you don't rely on luck, at least not
when it's easily avoidable.

-- 



http://psg.com/~ted/ (Ted Powell) If your hard drive crashes, perhaps you have a recent backup. If Earth crashes, what then? We need off-site backup: Luna, L5, Mars, wherever.