Hibernate Performance Best Practice?
I believe you want to review this section in the Hibernate manual.
I expect that your original problem of "...unreasonably many db-calls..." is an instance of what they call the "N+1 selects problem". If so, they've got options on how to deal with it.
- Make the fetch type Join. Then you'll have a single select with several joins, assuming no intermediate collections.
- Do lazy loading.
- Probably some others, eg FetchProfiles which I've got no experience with.
The first two can be specified at the association level, and fetch type can be overridden at the query level. You should be able to get your query to do only what you need, no more, and do it with a 'good' SQL query with these tools.
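A join fetch can also be requested at the query level in HQL; here's a minimal sketch (the `Order` entity and its `items` collection are made-up names for illustration):

```java
// One select with a join instead of 1 (orders) + N (one per collection)
// selects. "Order" and "items" are hypothetical mapped names.
List<Order> orders = session
    .createQuery("select distinct o from Order o join fetch o.items")
    .list();
```

The `distinct` keeps each Order from appearing once per joined item row.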
- Don't use joins unless really needed. They rule out both lazy loading and use of the 2nd level cache for associations
- Use lazy="extra" for large collections; it won't retrieve the elements until you ask for them, and you can, for instance, call size() without fetching the elements from the DB
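A sketch of what lazy="extra" looks like in an hbm mapping (collection, column, and class names are illustrative):

```xml
<!-- With lazy="extra", author.getBooks().size() issues a
     select count(*) instead of loading the whole collection. -->
<set name="books" lazy="extra" inverse="true">
    <key column="author_id"/>
    <one-to-many class="com.example.Book"/>
</set>
```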
Use the load() method where possible, since it doesn't issue a select query until one is required. E.g. if you have a Book and an Author and you want to associate them, this won't issue any selects, only a single insert:

    Book b = (Book) session.load(Book.class, bookId);
    Author a = (Author) session.load(Author.class, authorId);
    b.setAuthor(a);
    session.save(b);
Use named queries (in your hbm files or via @NamedQuery) so that they are not parsed on each execution. Don't use the Criteria API unless it's required (it makes it impossible to use the PreparedStatement cache)
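A sketch of the annotation form (entity and query names are illustrative):

```java
// Parsed and validated once at startup, not on every execution.
@Entity
@NamedQuery(name = "Book.byAuthor",
            query = "select b from Book b where b.author.id = :authorId")
public class Book { /* ... */ }
```

Look it up with session.getNamedQuery("Book.byAuthor") and bind :authorId as usual.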
- Use OSIV (Open Session In View) in your web app, since it will load data only when/if it's needed
- Use read-only mode for select-only operations:

    session.setReadOnly(object, true)

This makes Hibernate skip keeping an original snapshot of the selected entity in the persistence context for later dirty checks.
- Use the 2nd level cache and the Query Cache for read-mostly and read-only data.
- Use FlushMode.COMMIT instead of AUTO so that Hibernate doesn't issue selects before updates, but be aware that this may lead to stale data being written (though Optimistic Locking can help you out).
- Take a look at batch fetching (batch-size) in order to select several entities/collections at one time instead of issuing separate queries for each one.
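A sketch of batch fetching with annotations (@BatchSize is org.hibernate.annotations.BatchSize; entity and field names are illustrative):

```java
@Entity
public class Author {
    // When one lazy 'books' collection is touched, Hibernate loads the
    // collections of up to 10 Authors in a single
    // "... where author_id in (?, ?, ...)" query instead of 10 separate ones.
    @OneToMany(mappedBy = "author")
    @BatchSize(size = 10)
    private Set<Book> books;
}
```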
- Do queries like 'select new Entity(id, someField) from Entity' in order to retrieve only required fields. Take a look at result transformers.
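A sketch of such a projection query (BookSummary is a hypothetical DTO with a matching (Long, String) constructor):

```java
// Selects only two columns instead of hydrating full Book entities.
List<BookSummary> summaries = session
    .createQuery("select new com.example.BookSummary(b.id, b.title) "
               + "from Book b")
    .list();
```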
- Use batch operations (like delete) if needed
- If you use native queries, specify explicitly what cache regions should be invalidated (by default - all).
- Take a look at materialized path and nested sets for tree-like structures.
- Set

    c3p0.max_statements

in order to enable the PreparedStatement cache in the pool, and enable the statement cache of your DB if it's switched off by default.
- Use StatelessSession if possible; it bypasses dirty checks, cascading, interceptors, etc.
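A sketch of the c3p0 statement-cache setting when c3p0 is configured through Hibernate properties (the value is an illustrative starting point, not a recommendation):

```properties
# Size of c3p0's PreparedStatement cache; 0 disables it.
hibernate.c3p0.max_statements=200
```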
- Do not use pagination (setMaxResults(), setFirstResult()) along with queries that contain joins to collections; this results in all the records being pulled from the database, with pagination happening in memory in Hibernate. If you want pagination, ideally you shouldn't use such joins. If you can't avoid them, again, use batch fetching.
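One common workaround is to paginate over scalar ids first and fetch the collections only for that page; a sketch with hypothetical names:

```java
// Step 1: paginate over ids - safe with setFirstResult/setMaxResults,
// since no collection join is involved.
List<Long> ids = session
    .createQuery("select o.id from Order o order by o.id")
    .setFirstResult(0)
    .setMaxResults(20)
    .list();

// Step 2: fetch just that page, now with the collection join-fetched.
List<Order> page = session
    .createQuery("select distinct o from Order o "
               + "join fetch o.items where o.id in (:ids)")
    .setParameterList("ids", ids)
    .list();
```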
Actually there are a lot of tricks, but I can't recall more at the moment.
There are many things you can do to speed up Hibernate performance, such as:
- Enabling SQL statement logging so that you can validate all statements and even detect N+1 query problems during testing.
- Database connection management and monitoring using FlexyPool
- JDBC batching to reduce the number of round-trips needed to submit INSERT, UPDATE, and DELETE statements.
- JDBC Statement caching
- JPA identifier optimizers like pooled or pooled-lo
- Choosing compact column types
- Use the right relationships: a bidirectional @OneToMany instead of a unidirectional one, @MapsId for @OneToOne, and a Set for @ManyToMany
- Using inheritance the right way and preferring SINGLE_TABLE for performance reasons
- Minding the Persistence Context size and avoiding long-running transactions
- Using OS caching, DB caching before jumping to the 2nd-level cache which is also useful to off-load the Primary node when doing database replication
- Unleash database query capabilities via SQL native queries
- Split writes among multiple one-to-one entities to reduce optimistic locking false positives and get a better chance of hitting the database cache even when modifying certain entities.
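As a sketch of the @MapsId point above (entity and field names are illustrative), the child entity shares the parent's primary key, so no separate foreign-key column plus unique constraint is needed and the child can be looked up by the parent's id:

```java
@Entity
public class BookDetails {

    @Id
    private Long id; // takes the same value as the associated Book's id

    // @MapsId tells Hibernate to reuse this association's id as the
    // entity's primary key instead of generating a separate one.
    @OneToOne(fetch = FetchType.LAZY)
    @MapsId
    private Book book;
}
```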