Copying specific columns into Amazon Redshift from an S3 bucket
This is for the case where the file has fewer columns than the target table.
Assuming that CustomerName and ProductName are nullable columns, you have two options.
Option #1 - Load directly into the table
COPY main_tablename
(Date
,CustomerID
,ProductID
,Price)
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
ANALYZE main_tablename;
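For Option #1 to work, the columns omitted from the COPY column list must be nullable (or have defaults), since COPY leaves them unset. A hypothetical definition of main_tablename that satisfies this - the types and lengths are assumptions, not taken from the original question:

```sql
CREATE TABLE main_tablename (
    Date         DATE         NOT NULL,
    CustomerID   INTEGER      NOT NULL,
    ProductID    INTEGER      NOT NULL,
    Price        DECIMAL(10,2),
    CustomerName VARCHAR(100),  -- not in the file; left NULL by the COPY
    ProductName  VARCHAR(100)   -- not in the file; left NULL by the COPY
);
```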
Option #2 - Load the data into a staging table, then join the staging table with the reference tables to fill in the name columns when inserting into the main table. (Underscores are used in the table names below because hyphens are not valid in unquoted SQL identifiers.)
COPY staging_tablename
(Date
,CustomerID
,ProductID
,Price)
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
INSERT INTO main_tablename
(CustomerID
,CustomerName
,ProductID
,ProductName
,Price
,Date)
SELECT st.CustomerID
,cust.CustomerName
,st.ProductID
,prod.ProductName
,st.Price
,st.Date
FROM staging_tablename st
INNER JOIN customer_tablename cust ON (cust.CustomerID = st.CustomerID)
INNER JOIN product_tablename prod ON (prod.ProductID = st.ProductID);
TRUNCATE TABLE staging_tablename;
ANALYZE main_tablename;
Note that the explicit column list on the INSERT guards against a column-order mismatch with the table definition. Also, if some IDs may be missing from the reference tables, use LEFT JOINs instead so those rows are not dropped; their name columns will simply be NULL.
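Option #2 assumes a staging table already exists with the same four columns as the file. A possible definition - the types are assumptions, and underscores replace the hyphens because hyphens are not valid in unquoted SQL identifiers:

```sql
CREATE TABLE staging_tablename (
    Date       DATE,
    CustomerID INTEGER,
    ProductID  INTEGER,
    Price      DECIMAL(10,2)
);
```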
This is for the case where the file has more columns than the target table.
Here there is no join against reference data: load the full file into a staging table whose columns match the file, then insert only the columns the main table needs.
COPY staging_tablename
FROM 's3://<<YOUR-BUCKET>>/<<YOUR-FILE>>'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
INSERT INTO main_tablename
(Date
,CustomerID
,ProductID
,Price)
SELECT Date
,CustomerID
,ProductID
,Price
FROM staging_tablename;
TRUNCATE TABLE staging_tablename;
ANALYZE main_tablename;
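Because this COPY names no columns, the staging table must mirror the file's full column layout, in order. A sketch, assuming the file carries two extra columns beyond what the main table needs - the extra column names and all types are hypothetical:

```sql
CREATE TABLE staging_tablename (
    Date         DATE,
    CustomerID   INTEGER,
    ProductID    INTEGER,
    Price        DECIMAL(10,2),
    CustomerName VARCHAR(100),  -- extra column present in the file (assumed)
    ProductName  VARCHAR(100)   -- extra column present in the file (assumed)
);
```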