Logrotate Successful, original file goes back to original size
Solution 1:
This is probably because even though you truncate the file, the process writing to the file will continue writing at whichever offset it were at last. So what's happening is that logrotate truncates the file, size is zero, process writes to the file again, continuing at the offset it left off, and you now have a file with NULL-bytes up to the point where you truncated it plus the new entries written to the log.
od -c after truncate + sudden growth, generated output along the lines of:
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
33255657600 \0 C K B - s e r v e r [ h t t
33255657620 <more log output>
What this says is from offset 0 to 33255657600 your file consists of null bytes, and then some legible data. Getting to this state doesn't take the same amount of time it would take to actually write all those null-bytes. The ext{2,3,4} filesystems support something called sparse files, so if you seek past a region of a file that doesn't contain anything, that region will be assumed to contain null-bytes and won't take up space on disk. Those null bytes won't actually be written, just assumed to be there, hence the time it takes to go to 0 to 3.5GB don't take a lot of time to. (You can test the amount of time it takes by doing something like dd if=${HOME}/.bashrc of=largefile.bin seek=3432343264 bs=1
, this should generate a file of over 3GB in a few milliseconds).
If you run ls -ls
on your logfiles after they've been truncated and had a sudden growth again, it should now report a number at the beginning of the line which represents the actual size (in blocks occupied on disk), which probably is orders of magnitude smaller than the size reported by just ls -l
.
Solution 2:
I am extremely confident that Kjetil has hit it. Drew, you may not yet be convinced by his explanation but I urge you to read carefully what he's said.
If you accept it, the fix is either to stop and restart your application when the logs are rotated, or use a tool like apache's "rotatelogs", where you feed the log output to the tool via a pipe, and the tool takes care of rotating the logfile every so often. For example, one of my apache instances logs with
ErrorLog "|/usr/sbin/rotatelogs /www/logs/error_log 604800"
which causes a lot of logfiles with names like
-rw-r--r-- 1 root root 4078 Dec 21 01:04 error_log.1292457600
-rw-r--r-- 1 root root 4472 Dec 29 08:41 error_log.1293062400
-rw-r--r-- 1 root root 78630 Jan 4 12:57 error_log.1293667200
-rw-r--r-- 1 root root 15753 Jan 12 01:10 error_log.1294272000
to appear without restarting apache; I can then compress them manually after the fact. Note how the rotation is done every week, which is every 604800 seconds, that being the argument passed to rotatelogs
.
If you can't stop and restart the app, and it can't log via a pipe, then I think you've got a real problem. Perhaps others will have suggestions.