In this entry, I want to take a look at using HQL to reduce unwanted database queries. As with any optimization, the first thing to do is identify the problem. So lets take a look at how Hibernate does things by default.
When you create a relationship between entities, the default fetching behavior for the relationship is called "lazy". Which means that Hibernate won't try to load up the associated objects until something actually asks for them. And in some cases, this is perfectly fine. But in other cases, it can cause a lot of overhead.
Consider the relationship between a BlogEntry and a BlogComment. If you recall, this is a bidirectional relationship, where a BlogEntry has many BlogComments, and a BlogComment has one BlogEntry. That means if I load up a BlogComment, by default the associated BlogEntry is not loaded yet. Only when some code tries to access the comment's BlogEntry does Hibernate actually create that object. And that may be fine if I'm only dealing with one comment.
But consider a situation where I want to show all of the comments that have been posted by a specific user, along with the BlogEntry where the comment was posted. I might have a simple DAO method that looks like:
function commentsByUser( id )
{
return OrmExecuteQuery( "SELECT c
FROM
BlogComment c
WHERE
c.createdBy.id=:id",
{id=arguments.id}
);
}
Which will generate SQL like this:
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogcommen0_.id as id2_,
blogcommen0_.content as content2_,
blogcommen0_.dateCreated as dateCrea3_2_,
blogcommen0_.entryId as entryId2_,
blogcommen0_.createdBy as createdBy2_
from
BlogComment blogcommen0_
where
blogcommen0_.createdBy=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
As I iterate over each comment to show the blog entry information, Hibernate must execute separate queries for each entry. This is how the lazy loading works. The associated objects are built on an as-needed basis.
As you can see, if I were showing 30 comments, that would be one query to get the comments, and then 30 queries to create each related BlogEntry. That's a lot of separate queries, especially if I know in advance that I will need them all.
One option is to modify the association to specify fetch="join", like this:
property name="blogEntry" fieldtype="many-to-one" fetch="join" fkcolumn="entryId" cfc="BlogEntry" notnull="true";
This tells Hibernate "every time you load a BlogComment, I want you to load the associated BlogEntry at the same time". This will cause the objects to be loaded in one SQL statement, eliminating the extra queries. Unfortunately, it's a very brute-force approach. I'd better be really sure that I always want the associated blog entry with every comment. Otherwise, I'm just switching one problem for another: I'll be loading a bunch of extra data if all I actually want are the comments!
To be honest, I'm not happy with this all or nothing decision. The good news is, I can leave the association lazy by default, but use HQL to perform JOIN fetching for situations where I want to load all of the data at once. The method in my DAO might look like this:
function commentsByUser( id )
{
return OrmExecuteQuery( "SELECT c
FROM
BlogComment c
JOIN FETCH c.blogEntry b
JOIN c.createdBy u
WHERE
u.id=:id",
{id=arguments.id}
);
}
As you can see, I'm telling hibernate to fetch the blog entry data along with the comments. The resulting SQL would look like:
09/20 22:30:11 [jrpp-1] HIBERNATE DEBUG -
select
blogcommen0_.id as id2_0_,
blogentry1_.id as id0_1_,
blogcommen0_.content as content2_0_,
blogcommen0_.dateCreated as dateCrea3_2_0_,
blogcommen0_.entryId as entryId2_0_,
blogcommen0_.createdBy as createdBy2_0_,
blogentry1_.content as content0_1_,
blogentry1_.dateCreated as dateCrea3_0_1_,
blogentry1_.title as title0_1_,
blogentry1_.createdBy as createdBy0_1_
from
BlogComment blogcommen0_
inner join
BlogEntry blogentry1_
on blogcommen0_.entryId=blogentry1_.id
inner join
`User` user2_
on blogcommen0_.createdBy=user2_.id
where
user2_.id=?
Which means the BlogEntry will be loaded up along with the BlogComments, all from a single query. In this context, that's going to be a great deal more efficient than executing separate queries for every association. I hope you can see that a big advantage of using HQL is the fine-grained control it offers when it comes to loading objects. I should have one or two more looks at some other scenarios where HQL can help you optimize your application, so stay tuned!
Comments (7) |
del.icio.us
|
Digg It!
|
Linking Blogs
| 5786 Views

