Which Java Type do you use for JPA collections and why?
I think the choice between a Set and a List is much harder, at least when you use Hibernate as the JPA implementation. If you use a List in Hibernate (without an index column such as @OrderColumn), it automatically switches to the "bag" paradigm, where duplicates CAN exist.
That decision has a significant influence on the queries Hibernate executes. Here is a small example:
There are two entities, Employee and Company, in a typical many-to-many relation. A join table (let's call it "employeeCompany") maps these entities to each other.
You choose the data type List on both entities (Company/Employee).
If you now decide to remove Employee Joe from CompanyXY, Hibernate executes the following queries:
delete from employeeCompany where employeeId = Joe;
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXA);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXB);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXC);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXD);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXE);
And now the question: why the hell doesn't Hibernate just execute this single query?
delete from employeeCompany where employeeId = Joe AND companyId = CompanyXY;
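The answer is simple (and thanks a lot to Nirav Assar for his blog post): it can't. A bag may contain the same link more than once, and the join table rows have no identity of their own, so Hibernate cannot target exactly one row. In a world of bags, deleting everything and re-inserting the remaining rows is the only proper way! Read this for more clarification: http://assarconsulting.blogspot.fr/2009/08/why-hibernate-does-delete-all-then-re.html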
Now the big conclusion:
If you choose a Set instead of a List in your Employee/Company entities, you don't have that problem and only one query is executed!
Why? Because Hibernate is no longer in a world of bags (as you know, a Set allows no duplicates), so executing a single query becomes possible.
So the decision between List and Set is not that simple, at least when it comes to queries and performance!
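To make the bag problem concrete, here is a minimal plain-Java sketch. It does not use Hibernate at all: the `Row` record, the `deleteMatching` helper, and the in-memory "tables" are hypothetical stand-ins for the employeeCompany join table and for a `delete ... where employeeId = ? and companyId = ?` statement. It only illustrates why a targeted delete is ambiguous when duplicates are allowed and exact when they are not.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BagVsSetDemo {
    // One row of the (hypothetical) employeeCompany join table; no surrogate key.
    record Row(String employeeId, String companyId) {}

    // Simulates: delete from employeeCompany where employeeId = ? and companyId = ?
    // Returns the number of rows removed.
    static int deleteMatching(Collection<Row> table, Row target) {
        int before = table.size();
        table.removeIf(row -> row.equals(target));
        return before - table.size();
    }

    public static void main(String[] args) {
        Row joeAtXY = new Row("Joe", "CompanyXY");
        Row joeAtXA = new Row("Joe", "CompanyXA");

        // Bag semantics: the same link may appear twice.
        List<Row> bagTable = new ArrayList<>(List.of(joeAtXY, joeAtXY, joeAtXA));
        // Removing ONE occurrence from the bag cannot be expressed as a targeted
        // delete, because the delete matches every duplicate row.
        System.out.println("bag rows deleted: " + deleteMatching(bagTable, joeAtXY)); // 2

        // Set semantics: each link exists at most once, so the delete is exact.
        Set<Row> setTable = new LinkedHashSet<>(List.of(joeAtXY, joeAtXA));
        System.out.println("set rows deleted: " + deleteMatching(setTable, joeAtXY)); // 1
    }
}
```

With the bag, the only way to end up with exactly one remaining duplicate is to clear the rows for that owner and re-insert what is left, which matches the delete-all-then-insert pattern shown above.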
As your own question suggests, the key is the domain, not JPA. JPA is just a framework which you can (and should) use in the way that best fits your problem. Choosing a suboptimal solution because of a framework (or its limits) is usually a warning sign.
When I need a set and never care about order, I use a Set. When for some reason order is important (ordered list, ordering by date, etc.), then a List.
You seem to be well aware of the difference between Collection, Set, and List. Which one to use depends only on your needs. You can use them to communicate to users of your API (or your future self) the properties of your collection (which may be subtle or implicit).
This follows exactly the same rules as using different collection types anywhere else in your code. You could use Object or Collection for all your references, yet in most cases you use more concrete types.
For example, when I see a List, I know it comes sorted in some way, and that duplicates are either acceptable or irrelevant for this case. When I see a Set, I usually expect it to have no duplicates and no specific order (unless it's a SortedSet). When I see a Collection, I don't expect anything more from it than to contain some entities.
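As a small illustration of what each declared type promises, here is a sketch using only java.util. The class and method names (`CollectionContractsDemo`, `toList`, `toSet`, `toSortedSet`) are made up for this example and have nothing to do with JPA itself.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class CollectionContractsDemo {
    // List: keeps encounter order and keeps duplicates.
    static List<String> toList(Collection<String> input) {
        return new ArrayList<>(input);
    }

    // Set: drops duplicates; HashSet promises no particular iteration order.
    static Set<String> toSet(Collection<String> input) {
        return new HashSet<>(input);
    }

    // SortedSet: drops duplicates and iterates in natural (here alphabetical) order.
    static SortedSet<String> toSortedSet(Collection<String> input) {
        return new TreeSet<>(input);
    }

    public static void main(String[] args) {
        List<String> input = List.of("beta", "alpha", "beta");

        System.out.println(toList(input));        // [beta, alpha, beta]
        System.out.println(toSet(input).size());  // 2
        System.out.println(toSortedSet(input));   // [alpha, beta]

        // Collection: the caller may only assume "some elements".
        Collection<String> anything = toList(input);
        System.out.println(anything.isEmpty());   // false
    }
}
```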
Regarding list ordering... Yes, it can be preserved. And even if it's not and you just use @OrderBy, it can still be useful. Think of an event log sorted by timestamp by default. Artificially reordering the list makes little sense, but it is still useful that it comes sorted by default.
I generally use a List. I find the List API far more useful and compatible with other libraries than Set. List is easier to iterate and generally more efficient for most operations and memory.
The fact that a relationship cannot have duplicates and is not normally ordered should not require usage of a Set; you can use whatever Collection type is most useful to your application.
It depends on your model, though. If it is something you are going to do a lot of contains checks on, then a Set would be more efficient.
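A rough way to see the difference in contains checks is to count how often equals() gets called. The sketch below is plain Java; the `Employee` class with its counting equals() is a hypothetical stand-in for an entity with an id-based equals/hashCode, and ArrayList and HashSet are just the usual standard implementations, not what a JPA provider necessarily hands you.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsCostDemo {
    // Counts equals() invocations so the cost of contains() becomes visible.
    static int equalsCalls = 0;

    // Hypothetical entity identified by its id.
    static final class Employee {
        final int id;

        Employee(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Employee && ((Employee) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    // Returns how many equals() calls one contains() lookup needed.
    static int equalsCallsFor(Collection<Employee> employees, Employee probe) {
        equalsCalls = 0;
        employees.contains(probe);
        return equalsCalls;
    }

    public static void main(String[] args) {
        List<Employee> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(new Employee(i));
        }
        Set<Employee> set = new HashSet<>(list);

        Employee probe = new Employee(999); // equal to the last element added

        // ArrayList scans linearly: one equals() call per element until a match.
        System.out.println("List equals() calls: " + equalsCallsFor(list, probe));
        // HashSet jumps to the right bucket via hashCode(), so very few equals() calls.
        System.out.println("Set equals() calls:  " + equalsCallsFor(set, probe));
    }
}
```

For a handful of elements the difference is negligible, so this only matters when the relationship is large or the lookups are frequent.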
You can order a relationship in JPA, either using an @OrderBy or an @OrderColumn.
See http://en.wikibooks.org/wiki/Java_Persistence/Relationships#Ordering
Duplicates are not generally supported in JPA, but some mappings such as ElementCollections may support duplicates.