In this entry, I want to take a look at using HQL to reduce unwanted database queries. As with any optimization, the first thing to do is identify the problem. So lets take a look at how Hibernate does things by default.
When you create a relationship between entities, the default fetching behavior for the relationship is called "lazy". Which means that Hibernate won't try to load up the associated objects until something actually asks for them. And in some cases, this is perfectly fine. But in other cases, it can cause a lot of overhead.
Consider the relationship between a BlogEntry and a BlogComment. If you recall, this is a bidirectional relationship, where a BlogEntry has many BlogComments, and a BlogComment has one BlogEntry. That means if I load up a BlogComment, by default the associated BlogEntry is not loaded yet. Only when some code tries to access the comment's BlogEntry does Hibernate actually create that object. And that may be fine if I'm only dealing with one comment.
But consider a situation where I want to show all of the comments that have been posted by a specific user, along with the BlogEntry where the comment was posted. I might have a simple DAO method that looks like:
function commentsByUser( id )
{
return OrmExecuteQuery( "SELECT c
FROM
BlogComment c
WHERE
c.createdBy.id=:id",
{id=arguments.id}
);
}
Which will generate SQL like this:
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogcommen0_.id as id2_,
blogcommen0_.content as content2_,
blogcommen0_.dateCreated as dateCrea3_2_,
blogcommen0_.entryId as entryId2_,
blogcommen0_.createdBy as createdBy2_
from
BlogComment blogcommen0_
where
blogcommen0_.createdBy=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG -
select
blogentry0_.id as id0_0_,
blogentry0_.content as content0_0_,
blogentry0_.dateCreated as dateCrea3_0_0_,
blogentry0_.title as title0_0_,
blogentry0_.createdBy as createdBy0_0_
from
BlogEntry blogentry0_
where
blogentry0_.id=?
As I iterate over each comment to show the blog entry information, Hibernate must execute separate queries for each entry. This is how the lazy loading works. The associated objects are built on an as-needed basis.
As you can see, if I were showing 30 comments, that would be one query to get the comments, and then 30 queries to create each related BlogEntry. That's a lot of separate queries, especially if I know in advance that I will need them all.
One option is to modify the association to specify fetch="join", like this:
property name="blogEntry" fieldtype="many-to-one" fetch="join" fkcolumn="entryId" cfc="BlogEntry" notnull="true";
This tells Hibernate "every time you load a BlogComment, I want you to load the associated BlogEntry at the same time". This will cause the objects to be loaded in one SQL statement, eliminating the extra queries. Unfortunately, it's a very brute-force approach. I'd better be really sure that I always want the associated blog entry with every comment. Otherwise, I'm just switching one problem for another: I'll be loading a bunch of extra data if all I actually want are the comments!
To be honest, I'm not happy with this all or nothing decision. The good news is, I can leave the association lazy by default, but use HQL to perform JOIN fetching for situations where I want to load all of the data at once. The method in my DAO might look like this:
function commentsByUser( id )
{
return OrmExecuteQuery( "SELECT c
FROM
BlogComment c
JOIN FETCH c.blogEntry b
JOIN c.createdBy u
WHERE
u.id=:id",
{id=arguments.id}
);
}
As you can see, I'm telling hibernate to fetch the blog entry data along with the comments. The resulting SQL would look like:
09/20 22:30:11 [jrpp-1] HIBERNATE DEBUG -
select
blogcommen0_.id as id2_0_,
blogentry1_.id as id0_1_,
blogcommen0_.content as content2_0_,
blogcommen0_.dateCreated as dateCrea3_2_0_,
blogcommen0_.entryId as entryId2_0_,
blogcommen0_.createdBy as createdBy2_0_,
blogentry1_.content as content0_1_,
blogentry1_.dateCreated as dateCrea3_0_1_,
blogentry1_.title as title0_1_,
blogentry1_.createdBy as createdBy0_1_
from
BlogComment blogcommen0_
inner join
BlogEntry blogentry1_
on blogcommen0_.entryId=blogentry1_.id
inner join
`User` user2_
on blogcommen0_.createdBy=user2_.id
where
user2_.id=?
Which means the BlogEntry will be loaded up along with the BlogComments, all from a single query. In this context, that's going to be a great deal more efficient than executing separate queries for every association. I hope you can see that a big advantage of using HQL is the fine-grained control it offers when it comes to loading objects. I should have one or two more looks at some other scenarios where HQL can help you optimize your application, so stay tuned!
Comments (7) |
del.icio.us
|
Digg It!
|
Linking Blogs
| 5795 Views

# Posted By Daniel Schmid | 10/12/10 4:51 AM
Excellent! I should definetly dig deeper into HQL.
What about using criteria and restricitions objects to filter?
<a href="http://tinyurl.com/29qloup">http://tinyurl...;
Which approach would you recommand, HQL or Criteria Queries?
# Posted By Daniel Schmid | 10/12/10 4:52 AM
sorry for wrong tinyurl
http://preview.tinyurl.com/29qloup
# Posted By Josh | 1/19/11 3:30 PM
I'm working on an application that displays a lot of people with their state and congressional district and I'm having trouble some trouble...
In my Person object, I'm referencing the State object using both lazy="true" and fetch="join". I also have a method in the CFC for the Person object that concatenates State and District and returns a string like "(MD-5)" for Maryland district 5. (District is a property of the Person object, State is its own object)
The problem I'm running into is that Hibernate is still firing off individual queries to get each Person's State - if I turn lazy loading off, it runs all the "getState" queries when the object initializes; if I keep lazy loading on, then it runs the "getState" queries when I call the "getStateDistrict" function. No matter what I do, it seems like Hibernate is determined to run a billion queries when I load a page that lists all people with their StateDistrict value.
Any ideas how I can get the functions within an object's CFC to reference the info that is already queried and stored with the Person object instead of making a new call to the DB?
(P.S. Instead of being a string property of the Person object, State needs to be its own object because we have add'l info that is state-specific.)
getStateDistrict function from the Person.cfc:
<cfparam name="local.stateDist" default="">
<cfset local.stateDist = variables.state.getAbbr()>
<!--- Does not work: <cfset local.stateDist = variables.state.abbr> --->
<cfif Len(variables.district)>
<cfif Val(variables.district) EQ 0>
<cfset local.district = "AL"> <!--- Set 'zero' districts to "At Large" --->
<cfelse>
<cfset local.district = Val(variables.district)> <!--- Use Val() to get rid of leading zeroes --->
</cfif>
<cfset local.stateDist = local.stateDist & "-" & local.district>
</cfif>
<cfreturn local.stateDist />
# Posted By Brian Kotek | 1/19/11 5:19 PM
You probably want to ask on the CF ORM mailing list (http://groups.google.com/group/cf-orm-dev) for more responses. But it should break down like this:
If you're only using the default cache (the Hibernate session cache), while you will see separate queries to load the relationships, you should only have one actual database hit per State. If more than one relationship needs that State, the rest should all use the instance that was already loaded.
You can also enable the second-level cache, which will cache the instances across all Hibernate sessions. So in that case, you would only ever have one database hit per State, and all subsequent references to that State would come from the cache, even in later requests.
# Posted By Kevin R. | 8/30/11 4:19 PM
Hello, in one of the previous blog entries for this series it mentioned that at the end of the series, you would post the code for the final files.
I was wondering if you had done so or if you had changed your mind?
If they are available, could you give me a link to them?
I'm looking for a good example of a simple, working ORM application and yours looks the best of those I have looked at.
I'm hoping to get some ideas for what I am working on.
Best regards,
Kevin
# Posted By steve | 1/5/12 10:55 PM
Hi Brian, I've noticed you have some comments out there on various blogs about the cf9 orm/hibernate. I'm used to sql and entirely new to hibernate (e.g. cf9 orm). I'm trying to understand how the cf9 orm handles a left outer join, which seems more practical for a one to many relationship than a simple inner join. Could you show me the "preferred" way to handle this? It would be helpful to see an example of the actual cfc mappings as well as an attempt to utilize the ormExecuteQuery command.
Thanks!
# Posted By David | 2/16/12 4:25 AM
How would Fetch apply to something like this where there is a many to many set between brand and vendor?
public array function getProductsForVendor(required any vendorID) {
var params = [arguments.vendorId];
var hql = " SELECT p
FROM Product p
INNER JOIN p.brand b
INNER JOIN b.vendors v
WHERE v.vendorID = ? ";
return ormExecuteQuery(hql, params);
}