Logstash - csv output headers

The reason you are getting multiple headers in the output is that Logstash has no concept of global/shared state between events. Each event is handled in isolation, so every time the CSV output plugin runs it behaves as if it were the first and writes the headers again.

I had the same issue and found a solution using the `init` option of the ruby filter, which executes code once at Logstash startup.

Here is an example logstash config:

# csv-headers.conf

input {
    stdin {}
}
filter {
    ruby {
        init => "
            require 'csv'
            @@csv_file    = 'output.csv'
            @@csv_headers = ['A','B','C']
            # Write the header row only if the file is missing or empty
            if !File.exist?(@@csv_file) || File.zero?(@@csv_file)
                CSV.open(@@csv_file, 'w') do |csv|
                    csv << @@csv_headers
                end
            end
        "
        code => "
            event['@metadata']['csv_file']    = @@csv_file
            event['@metadata']['csv_headers'] = @@csv_headers
        "
    }
    csv {
        columns => ["a", "b", "c"]
    }
}
output {
    csv {
        fields => ["a", "b", "c"]
        path   => "%{[@metadata][csv_file]}"
    }
    stdout {
        codec => rubydebug {
            metadata => true
        }
    }
}
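Note that the config above uses the legacy event API (`event['field']`), which works on Logstash 2.x; on Logstash 5+ you would use `event.set`/`event.get` instead. The write-headers-once guard from the `init` block can be sketched standalone like this (the `write_headers_once` helper name and the temp-dir setup are illustrative, not part of the original config):

```ruby
require 'csv'
require 'tmpdir'

# Write the header row only when the file is missing or empty,
# mirroring the guard in the ruby filter's init block.
def write_headers_once(csv_file, headers)
  if !File.exist?(csv_file) || File.zero?(csv_file)
    CSV.open(csv_file, 'w') { |csv| csv << headers }
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'output.csv')
  write_headers_once(path, %w[A B C])  # first call: file missing, writes header
  write_headers_once(path, %w[A B C])  # second call: file non-empty, no-op
  puts File.read(path)                 # prints "A,B,C"
end
```

Because the guard checks both `File.exist?` and `File.zero?`, restarting Logstash against an already-populated file does not write a second header row.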

If you run Logstash with that config:

printf '1,2,3\n4,5,6\n7,8,9\n' | ./bin/logstash -f csv-headers.conf

You will get an output.csv file with this content:

A,B,C
1,2,3
4,5,6
7,8,9

This is also safe with multiple pipeline workers, because the header is written once at startup rather than per event.

Hope it helps!

Tags: csv, ruby, logstash