Reading a CSV file with SQL querys from linux shell

Take a look at https://github.com/harelba/q, a Python tool for treating text as a database. By default it uses spaces to delimit fields, but the -d , parameter will allow it to process CSV files.

Alternatively you can import the CSV file into SQLite and then run SQL commands against it. This is scriptable, with a bit of effort.


There is also csvsql (part of csvkit)!

It can not only run sql on given csv (converting it into sqlite behind scenes), but also convert and insert into one of many supported sql databases!

Here you have example command (also in csvsql_CDs_join.sh):

csvsql --query 'SELECT CDTitle,Location,Artist FROM CDs JOIN Artists ON CDs.ArtistID=Artists.ArtistID JOIN Locations ON CDs.LocID = Locations.LocID' "$@"

showing how to join three tables (available in csv_inputs in csv_dbs_examples).

(formatting with csvlook also part of csvkit)

Inputs

$ csvlook csv_inputs/CDs.csv 

| CDTitle  | ArtistID | LocID |
| -------- | -------- | ----- |
| CDTitle1 | A1       | L1    |
| CDTitle2 | A1       | L2    |
| CDTitle3 | A2       | L1    |
| CDTitle4 | A2       | L2    |

$ csvlook csv_inputs/Artists.csv 

| ArtistID | Artist  |
| -------- | ------- |
| A1       | Artist1 |
| A2       | Artist2 |

$ csvlook csv_inputs/Locations.csv 

| LocID | Location  |
| ----- | --------- |
| L1    | Location1 |
| L2    | Location2 |

csvsql

$ csvsql --query 'SELECT CDTitle,Location,Artist FROM CDs JOIN Artists ON CDs.ArtistID=Artists.ArtistID JOIN Locations ON CDs.LocID = Locations.LocID' "$@" | csvlook

Produces:

| CDTitle  | Location  | Artist  |
| -------- | --------- | ------- |
| CDTitle1 | Location1 | Artist1 |
| CDTitle2 | Location2 | Artist1 |
| CDTitle3 | Location1 | Artist2 |
| CDTitle4 | Location2 | Artist2 |

Tags:

Linux

Sql

Csv

Shell