DEPRECATION WARNING: Dangerous query method: Random Record in ActiveRecord >= 5.2
If you want to continue using order by random()
then just declare it safe by wrapping it in Arel.sql
like the deprecation warning suggests:
Model.order(Arel.sql('random()')).first # PostgreSQL
Model.order(Arel.sql('rand()')).first # MySQL
There are lots of ways of selecting a random row and they all have advantages and disadvantages but there are times when you absolutely must use a snippet of SQL in an order by
(such as when you need the order to match a Ruby array and have to get a big case when ... end
expression down to the database) so using Arel.sql
to get around this "attributes only" restriction is a tool we all need to know about.
Edited: The sample code is missing a closing parentheses.
I'm a fan of this solution:
Model.offset(rand(Model.count)).first
With many records, and not many deleted records, this may be more efficient. In my case I have to use .unscoped
because default scope uses a join. If your model doesn't use such a default scope, you can omit the .unscoped
wherever it appears.
Patient.unscoped.count #=> 134049
class Patient
def self.random
return nil unless Patient.unscoped.any?
until @patient do
@patient = Patient.unscoped.find rand(Patient.unscoped.last.id)
end
@patient
end
end
#Compare with other solutions offered here in my use case
puts Benchmark.measure{10.times{Patient.unscoped.order(Arel.sql('RANDOM()')).first }}
#=>0.010000 0.000000 0.010000 ( 1.222340)
Patient.unscoped.order(Arel.sql('RANDOM()')).first
Patient Load (121.1ms) SELECT "patients".* FROM "patients" ORDER BY RANDOM() LIMIT 1
puts Benchmark.measure {10.times {Patient.unscoped.offset(rand(Patient.unscoped.count)).first }}
#=>0.020000 0.000000 0.020000 ( 0.318977)
Patient.unscoped.offset(rand(Patient.unscoped.count)).first
(11.7ms) SELECT COUNT(*) FROM "patients"
Patient Load (33.4ms) SELECT "patients".* FROM "patients" ORDER BY "patients"."id" ASC LIMIT 1 OFFSET 106284
puts Benchmark.measure{10.times{Patient.random}}
#=>0.010000 0.000000 0.010000 ( 0.148306)
Patient.random
(14.8ms) SELECT COUNT(*) FROM "patients"
#also
Patient.unscoped.find rand(Patient.unscoped.last.id)
Patient Load (0.3ms) SELECT "patients".* FROM "patients" ORDER BY "patients"."id" DESC LIMIT 1
Patient Load (0.4ms) SELECT "patients".* FROM "patients" WHERE "patients"."id" = $1 LIMIT 1 [["id", 4511]]
The reason for this is because we're using rand()
to get a random ID and just do a find on that single record. However the greater the number of deleted rows (skipped ids) the more likely the while loop will execute multiple times. It might be overkill but could be worth a the 62% increase in performance and even higher if you never delete rows. Test if it's better for your use case.