Bulk Insert records into Active Record table
Thanks to Chris Heald @cheald for his 2009 article, with showed me that the best way to go was the multi-row insert command.
Added the following code to my initializers/active_record.rb
file, changed my Model.create!(...)
calls to Model.import!(...)
and away it goes. A couple caveats:
1) It does not validate the data.
2) It uses the form of the SQL INSERT command that reads like ...
INSERT INTO <table> (field-1, field-2, ...)
VALUES (value-1-1, value-1-2, ...), (value-2-1, value-2-2, ...), ...`
... which may not be the correct syntax for all databases, but it works with Postgres. It would not be difficult to alter the code for the appropriate syntax for your SQL version.
In my particular case, inserting 19K+ records into a simple table on my development machine (MacBook Pro with 8GB RAM, 2.4GHz Intel Core i5 and and SSD) went from 223 seconds using 'model.create!' to 7.2 seconds using a 'model.import!'.
class ActiveRecord::Base
def self.import!(record_list)
raise ArgumentError "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
key_list, value_list = convert_record_list(record_list)
sql = "INSERT INTO #{self.table_name} (#{key_list.join(", ")}) VALUES #{value_list.map {|rec| "(#{rec.join(", ")})" }.join(" ,")}"
self.connection.insert_sql(sql)
end
def self.convert_record_list(record_list)
key_list = record_list.map(&:keys).flatten.uniq.sort
value_list = record_list.map do |rec|
list = []
key_list.each {|key| list << ActiveRecord::Base.connection.quote(rec[key]) }
list
end
return [key_list, value_list]
end
end
Use the activerecord-import gem. Let us say you are reading a CSV file and generating a Product
catalogue and you want to insert records in batches of 1000:
batch,batch_size = [], 1_000
CSV.foreach("/data/new_products.csv", :headers => true) do |row|
batch << Product.new(row)
if batch.size >= batch_size
Product.import batch
batch = []
end
end
Product.import batch