Read gzip stream line by line
It might be easier to use readline for this:
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');
let lineReader = readline.createInterface({
  input: fs.createReadStream('test.gz').pipe(zlib.createGunzip())
});
let n = 0;
lineReader.on('line', (line) => {
  n += 1;
  console.log("line: " + n);
  console.log(line);
});
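One caveat with the snippet above: `.pipe()` does not forward errors from one stream to the next, so a missing file or a corrupt archive can surface as an unhandled `'error'` event. Here is a rough sketch of the same idea using `stream.pipeline()`, which reports errors from every stage to a single callback. The helper name `eachGzipLine` and its `onLine` callback are made up for illustration, and wrapping everything in a Promise is just one way to do it.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');
const { pipeline } = require('stream');

// Hypothetical helper (not part of any library): calls onLine(line, n) for
// each line, resolves with the line count, rejects if reading or
// decompression fails.
function eachGzipLine(file, onLine) {
  return new Promise((resolve, reject) => {
    // pipeline() reports an error from any stage to the callback and
    // returns the last stream in the chain, here the gunzip stream.
    const gunzip = pipeline(
      fs.createReadStream(file),
      zlib.createGunzip(),
      (err) => { if (err) reject(err); }
    );
    const lineReader = readline.createInterface({
      input: gunzip,
      crlfDelay: Infinity
    });
    let n = 0;
    lineReader.on('line', (line) => {
      n += 1;
      onLine(line, n);
    });
    // Some Node versions re-emit input errors on the interface itself,
    // so listen there too; a second reject() call is a no-op.
    lineReader.on('error', reject);
    lineReader.on('close', () => resolve(n));
  });
}

eachGzipLine('test.gz', (line, n) => console.log(`line ${n}: ${line}`))
  .then((n) => console.log(`read ${n} lines`))
  .catch((err) => console.error(`could not read test.gz: ${err.message}`));
```

If `test.gz` does not exist, this should end up in the `.catch()` branch instead of crashing the process.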
If anyone is still looking into how to do this years later and wants a solution that works with async/await, here's what I'm doing (TypeScript, but you can just ditch the type annotations).
import fs from "fs";
import zlib from "zlib";
import readline from "readline";
const line$ = (path: string) => readline.createInterface({
  input: fs.createReadStream(path).pipe(zlib.createGunzip()),
  crlfDelay: Infinity
});
const yourFunction = async () => {
  for await (const line of line$("/path/to/file.txt.gz")) {
    // do stuff with line
  }
};
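For anyone who wants to try this without TypeScript, here is a plain JavaScript sketch of the same pattern that writes a small gzip file to a temporary directory and reads it back. The `collectLines` and `demo` helpers and the sample contents are invented for the demonstration; only `line$` comes from the code above.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const readline = require('readline');

// Same helper as above, minus the type annotation.
const line$ = (file) => readline.createInterface({
  input: fs.createReadStream(file).pipe(zlib.createGunzip()),
  crlfDelay: Infinity
});

// Hypothetical helper: gather every line of a gzip file into an array.
const collectLines = async (file) => {
  const lines = [];
  for await (const line of line$(file)) {
    lines.push(line);
  }
  return lines;
};

// Round trip: write a small gzip file with mixed line endings, read it
// back, then remove the temporary directory.
const demo = async () => {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gz-lines-'));
  const file = path.join(dir, 'sample.txt.gz');
  fs.writeFileSync(file, zlib.gzipSync('alpha\nbeta\r\ngamma\n'));
  try {
    return await collectLines(file);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
};

demo().then((lines) => console.log(lines));
```

With `crlfDelay: Infinity`, the `\r\n` after `beta` is treated as a single line break, so the array should hold three entries.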