Ruby: How can I read a CSV file that contains two headers in Ruby?
It looks like your CSV file was produced from an Excel spreadsheet that has columns grouped like this:
... | Rushing | Passing | ...
... |Rushes|Gain|Loss|Net|TD|Att|Cmp|Int|Yards|TD|Conv| ...
(Not sure if I restored the groups properly.)
There are no standard tools for working with this kind of CSV file, AFAIK. You have to do the job manually:
- Read the first line and treat it as the first header line.
- Read the second line and treat it as the second header line.
- Read the third line and treat it as the first data line.
- ...
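The steps above can be sketched with Ruby's standard csv library. This is only a minimal sketch: the sample text is a made-up, shortened stand-in for the real file, and it assumes that a blank cell in the first header line belongs to the group named to its left, which is how merged Excel cells usually export. The combined header names ("Rushing TD" and so on) are just one possible naming scheme.

```ruby
require 'csv'

# Hypothetical, shortened sample: a group header line, a column header line,
# then the data rows.
sample = <<~TEXT
  ,,Rushing,,Passing,
  ID,Institution,Rushes,TD,Att,TD
  721,Air Force,10,1,5,2
TEXT

rows = CSV.parse(sample)
group_header  = rows.shift # first line: the first header line
column_header = rows.shift # second line: the second header line

# Carry each group name forward over the blank cells that follow it
# (assumption: blank means "same group as the previous column").
current_group = nil
headers = group_header.zip(column_header).map do |group, column|
  current_group = group unless group.nil? || group.empty?
  current_group ? "#{current_group} #{column}" : column
end

# Everything left in rows is data; pair each row with the combined headers.
data = rows.map { |row| headers.zip(row).to_h }
```

Combining the two header lines keeps duplicate column names such as TD apart, so each data row can be turned into a hash without one value overwriting another.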
I'd recommend using the smarter_csv gem and manually providing the correct headers:
require 'smarter_csv'
options = {:user_provided_headers => ["Institution ID","Institution","Game Date","Uniform Number","Last Name","First Name", ... provide all headers here ... ],
:headers_in_file => false}
data = SmarterCSV.process(filename, options)
data.shift # drop the first header line (it was parsed as a data row)
data.shift # drop the second header line
# data now contains an array of hashes with your data
Please check the GitHub page for the options and examples: https://github.com/tilo/smarter_csv
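One option you should use is :user_provided_headers: simply specify the headers you want in an array. This way you can work around cases like this.
Because the two header lines in the file are then parsed as ordinary rows, you will have to call data.shift twice to drop them from the start of the result (data.pop would remove rows from the end instead).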
You'll have to write your own logic. CSV is really just rows and columns; by itself it has no inherent idea of what each column or row means. It's just raw data. CSV has no concept of two header rows (that's a human convention), so you'll need to build your own heuristics.
Given that your data rows look like:
"721","Air Force","09/01/12",
When you start parsing your data, check the first column: if it represents an integer and, once converted to an int, it's > 0, then you know you're dealing with a valid "row" and not a header.
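A minimal sketch of that heuristic, using Ruby's standard csv library. The helper name data_row? and the sample text are made up for illustration, and the sketch assumes the first column of every real data row holds a positive integer ID, as in the "721" row above.

```ruby
require 'csv'

# Hypothetical helper: true when the first cell parses as a positive
# integer (e.g. "721"), false for header text, blank cells, or empty rows.
def data_row?(row)
  Integer(row.first.to_s, 10) > 0
rescue ArgumentError
  false
end

# Made-up sample: two header lines followed by one data row.
sample = <<~TEXT
  ,,,Rushing,Passing
  Institution ID,Institution,Game Date,Rushes,Att
  "721","Air Force","09/01/12","10","5"
TEXT

# Keep only the rows that pass the heuristic; the header lines are dropped.
data_rows = CSV.parse(sample).select { |row| data_row?(row) }
```

Integer() is used instead of to_i because to_i quietly returns 0 for non-numeric text, while Integer() raises ArgumentError, which makes the header case explicit.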