Splitting gzipped logfiles without storing the ungzipped splits on disk
A script like the following might suffice. It reads the uncompressed log on standard input, so feed it from zcat.
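#!/usr/bin/perl
use strict;
use warnings;
use PerlIO::gzip;

my $filename = 'out';     # prefix for the output files
my $limit    = 500000;    # maximum number of lines per output file
my $fileno   = 1;         # suffix of the next output file
my $line     = 0;         # lines written to the current output file
my $fh;

while (<>) {
    # Open a new gzipped output file for the first line and whenever the limit is reached.
    if (!$fh || $line >= $limit) {
        close $fh if $fh;
        # The braces keep Perl from reading this as the variable $filename_.
        open $fh, '>:gzip', "${filename}_$fileno"
            or die "Cannot open ${filename}_$fileno: $!";
        $fileno++;
        $line = 0;
    }
    print $fh $_;
    $line++;
}
close $fh if $fh;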
You can use the split --filter option, as explained in the manual, e.g.:
zcat biglogfile.gz | split -l500000 --filter='gzip > $FILE.gz'
Edit: I don't know when the --filter option was introduced, but according to the comments it does not work in coreutils 8.4.
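To try the split --filter approach on something small before running it on a real log, a sketch like the following may help. It assumes a GNU split that supports --filter; the names sample.log.gz and part_ are made up for illustration, and gzip -dc is used in place of zcat only because it behaves the same on more systems.

```shell
#!/bin/sh
# Sketch: split a small gzipped sample into gzipped chunks of 3 lines each.
# Assumes GNU split with --filter support; file names are illustrative only.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Build a 10-line gzipped sample log.
seq 1 10 | gzip > sample.log.gz

# split reads stdin ("-") and pipes each chunk to the filter command.
# split sets $FILE to the chunk name (part_aa, part_ab, ...); the single
# quotes stop the calling shell from expanding $FILE too early.
gzip -dc sample.log.gz | split -l 3 --filter='gzip > "$FILE.gz"' - part_

# Report the line count of each compressed chunk.
for f in part_*.gz; do
    printf '%s %s\n' "$f" "$(gzip -dc "$f" | wc -l)"
done
```

With 10 input lines and -l 3 this should leave four compressed chunks, the last holding a single line, and no uncompressed chunk is ever written to disk.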