Rails :include vs. :joins
It appears that the :include
functionality was changed with Rails 2.1. Rails used to do the join in all cases, but for performance reasons it was changed to use multiple queries in some circumstances. This blog post by Fabio Akita has some good information on the change (see the section entitled "Optimized Eager Loading").
The difference between joins and include is that using the include statement generates a much larger SQL query loading into memory all the attributes from the other table(s).
For example, if you have a table full of comments and you use a :joins => users to pull in all the user information for sorting purposes, etc it will work fine and take less time than :include, but say you want to display the comment along with the users name, email, etc. To get the information using :joins, it will have to make separate SQL queries for each user it fetches, whereas if you used :include this information is ready for use.
Great example:
http://railscasts.com/episodes/181-include-vs-joins
.joins
will just joins the tables and brings selected fields in return. if you call associations on joins query result, it will fire database queries again
:includes
will eager load the included associations and add them in memory. :includes
loads all the included tables attributes. If you call associations on include query result, it will not fire any queries
I was recently reading more on difference between :joins
and :includes
in rails. Here is an explaination of what I understood (with examples :))
Consider this scenario:
A User has_many comments and a comment belongs_to a User.
The User model has the following attributes: Name(string), Age(integer). The Comment model has the following attributes:Content, user_id. For a comment a user_id can be null.
Joins:
:joins performs a inner join between two tables. Thus
Comment.joins(:user)
#=> <ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">,
#<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,
#<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">]>
will fetch all records where user_id (of comments table) is equal to user.id (users table). Thus if you do
Comment.joins(:user).where("comments.user_id is null")
#=> <ActiveRecord::Relation []>
You will get a empty array as shown.
Moreover joins does not load the joined table in memory. Thus if you do
comment_1 = Comment.joins(:user).first
comment_1.user.age
#=>←[1m←[36mUser Load (0.0ms)←[0m ←[1mSELECT "users".* FROM "users" WHERE "users"."id" = ? ORDER BY "users"."id" ASC LIMIT 1←[0m [["id", 1]]
#=> 24
As you see, comment_1.user.age
will fire a database query again in the background to get the results
Includes:
:includes performs a left outer join between the two tables. Thus
Comment.includes(:user)
#=><ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">,
#<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,
#<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">,
#<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>
will result in a joined table with all the records from comments table. Thus if you do
Comment.includes(:user).where("comment.user_id is null")
#=> #<ActiveRecord::Relation [#<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>
it will fetch records where comments.user_id is nil as shown.
Moreover includes loads both the tables in the memory. Thus if you do
comment_1 = Comment.includes(:user).first
comment_1.user.age
#=> 24
As you can notice comment_1.user.age simply loads the result from memory without firing a database query in the background.