Using ColdFusion ORM and HQL, Part 5: Optimizing with HQL

In this entry, I want to take a look at using HQL to reduce unwanted database queries. As with any optimization, the first thing to do is identify the problem. So lets take a look at how Hibernate does things by default.

When you create a relationship between entities, the default fetching behavior for the relationship is called "lazy". Which means that Hibernate won't try to load up the associated objects until something actually asks for them. And in some cases, this is perfectly fine. But in other cases, it can cause a lot of overhead.

Consider the relationship between a BlogEntry and a BlogComment. If you recall, this is a bidirectional relationship, where a BlogEntry has many BlogComments, and a BlogComment has one BlogEntry. That means if I load up a BlogComment, by default the associated BlogEntry is not loaded yet. Only when some code tries to access the comment's BlogEntry does Hibernate actually create that object. And that may be fine if I'm only dealing with one comment.

But consider a situation where I want to show all of the comments that have been posted by a specific user, along with the BlogEntry where the comment was posted. I might have a simple DAO method that looks like:

function commentsByUser( id )
{
	return OrmExecuteQuery( "SELECT c
				 FROM 
				 	BlogComment c
				 WHERE 
				 	c.createdBy.id=:id",
				 {id=arguments.id} 
				);
}
		

Which will generate SQL like this:

09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG - 
    select
        blogcommen0_.id as id2_,
        blogcommen0_.content as content2_,
        blogcommen0_.dateCreated as dateCrea3_2_,
        blogcommen0_.entryId as entryId2_,
        blogcommen0_.createdBy as createdBy2_ 
    from
        BlogComment blogcommen0_ 
    where
        blogcommen0_.createdBy=?
		
The query gets all of the comments posted by the user. I can loop over them, showing the comment along with the name and date of the blog entry. However, the overhead caused by the lazy loading will increase as the number of comments I'm displaying goes up. For example, if a user has posted three comments, Hibernate will run the following queries when I try to display this information:

        
09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG - 
    select
        blogentry0_.id as id0_0_,
        blogentry0_.content as content0_0_,
        blogentry0_.dateCreated as dateCrea3_0_0_,
        blogentry0_.title as title0_0_,
        blogentry0_.createdBy as createdBy0_0_ 
    from
        BlogEntry blogentry0_ 
    where
        blogentry0_.id=?

09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG - 
    select
        blogentry0_.id as id0_0_,
        blogentry0_.content as content0_0_,
        blogentry0_.dateCreated as dateCrea3_0_0_,
        blogentry0_.title as title0_0_,
        blogentry0_.createdBy as createdBy0_0_ 
    from
        BlogEntry blogentry0_ 
    where
        blogentry0_.id=?

09/20 22:29:09 [jrpp-1] HIBERNATE DEBUG - 
    select
        blogentry0_.id as id0_0_,
        blogentry0_.content as content0_0_,
        blogentry0_.dateCreated as dateCrea3_0_0_,
        blogentry0_.title as title0_0_,
        blogentry0_.createdBy as createdBy0_0_ 
    from
        BlogEntry blogentry0_ 
    where
        blogentry0_.id=?
		

As I iterate over each comment to show the blog entry information, Hibernate must execute separate queries for each entry. This is how the lazy loading works. The associated objects are built on an as-needed basis.

As you can see, if I were showing 30 comments, that would be one query to get the comments, and then 30 queries to create each related BlogEntry. That's a lot of separate queries, especially if I know in advance that I will need them all.

One option is to modify the association to specify fetch="join", like this:

property name="blogEntry" fieldtype="many-to-one" fetch="join"
		fkcolumn="entryId" cfc="BlogEntry" notnull="true";
		

This tells Hibernate "every time you load a BlogComment, I want you to load the associated BlogEntry at the same time". This will cause the objects to be loaded in one SQL statement, eliminating the extra queries. Unfortunately, it's a very brute-force approach. I'd better be really sure that I always want the associated blog entry with every comment. Otherwise, I'm just switching one problem for another: I'll be loading a bunch of extra data if all I actually want are the comments!

To be honest, I'm not happy with this all or nothing decision. The good news is, I can leave the association lazy by default, but use HQL to perform JOIN fetching for situations where I want to load all of the data at once. The method in my DAO might look like this:

function commentsByUser( id )
{
	return OrmExecuteQuery( "SELECT c
				 FROM 
				 	BlogComment c 
				 	JOIN FETCH c.blogEntry b
				 	JOIN c.createdBy u
				 WHERE 
				 	u.id=:id",
				 {id=arguments.id} 
				);
}
		

As you can see, I'm telling hibernate to fetch the blog entry data along with the comments. The resulting SQL would look like:

09/20 22:30:11 [jrpp-1] HIBERNATE DEBUG - 
    select
        blogcommen0_.id as id2_0_,
        blogentry1_.id as id0_1_,
        blogcommen0_.content as content2_0_,
        blogcommen0_.dateCreated as dateCrea3_2_0_,
        blogcommen0_.entryId as entryId2_0_,
        blogcommen0_.createdBy as createdBy2_0_,
        blogentry1_.content as content0_1_,
        blogentry1_.dateCreated as dateCrea3_0_1_,
        blogentry1_.title as title0_1_,
        blogentry1_.createdBy as createdBy0_1_ 
    from
        BlogComment blogcommen0_ 
    inner join
        BlogEntry blogentry1_ 
            on blogcommen0_.entryId=blogentry1_.id 
    inner join
        `User` user2_ 
            on blogcommen0_.createdBy=user2_.id 
    where
        user2_.id=?
		

Which means the BlogEntry will be loaded up along with the BlogComments, all from a single query. In this context, that's going to be a great deal more efficient than executing separate queries for every association. I hope you can see that a big advantage of using HQL is the fine-grained control it offers when it comes to loading objects. I should have one or two more looks at some other scenarios where HQL can help you optimize your application, so stay tuned!

Comments Comments (8) | del.ico.us del.icio.us | Digg It! Digg It! | Linking Blogs Linking Blogs | 10473 Views

Using ColdFusion ORM and HQL, Part 4: Basic Filtering and Sorting

Now that we've got our domain model set up, associations created, and relationships managed, I want to delve into the topic of HQL. In case you aren't aware, HQL is an "object-centric" query syntax that gives you very fine-grained control over what Hibernate is doing when you ask it to retrieve objects from the database.

I'll say up front that I am a proponent of using data access objects (DAOs) to encapsulate interaction with the database. In essence, that means the only place that really should know about the database, Hibernate, or HQL is within the DAOs. So the examples I'm going to show are created as methods within a DAO.

We'll start with some simple sorting and filtering. This is a common requirement, and HQL makes it very easy to do:

	
function allEntries()
{
	return OrmExecuteQuery( "FROM 
					BlogEntry b 
				 ORDER BY 
					b.dateCreated DESC" );
}
		

That simply gets all of the blog entries, sorted by the date they were created.

function mostRecentEntry()
{
	var results = OrmExecuteQuery( "FROM 
						BlogEntry b 
					ORDER BY 
						b.dateCreated DESC", 
					false, {maxResults=1} );
	return results[1];
}
		

As the name implies, the above HQL would get the most recent blog entry.

function entriesByUser( id )
{
	return OrmExecuteQuery( "SELECT b 
				FROM 
					BlogEntry b 
					JOIN b.createdBy u
				WHERE 
					u.id=:id",
				{id=arguments.id} );
}
		

Here, we start leveraging the object relationships. In this case, I'm getting all of the blog entries that were created by the specified user.

function entriesWithCommentsFromUserName( userName )
{
	return OrmExecuteQuery( "SELECT DISTINCT b 
				FROM 
					BlogEntry b 
					JOIN b.comments c 
					JOIN c.createdBy u 
				WHERE 
					u.name LIKE :userName",
				{userName='%#arguments.userName#%'} );
}
		

And finally, the above HQL loads all BlogEntries that have a comment which was posted by a user matching a user name search string. This is a fairly contrived example, since I'm not sure a blog application would actually need to do this. But on the ORM mailing list, I've seen several people with the need to filter the objects based on a property from a collection of related objects.

To make it more clear, this is the actual SQL that Hibernate would generate under the hood for this HQL:

09/15 00:58:17 [jrpp-2] HIBERNATE DEBUG - 
select
        distinct blogentry0_.id as id0_,
                blogentry0_.content as content0_,
                blogentry0_.dateCreated as dateCrea3_0_,
                blogentry0_.title as title0_,
                blogentry0_.createdBy as createdBy0_ 
        from
                BlogEntry blogentry0_
                inner join
                BlogComment comments1_ 
                	on blogentry0_.id=comments1_.entryId 
                inner join
                	`User` user2_
                	on comments1_.createdBy=user2_.id
        where
        		user2_.name like ?
            		
09/15 00:58:17 [jrpp-2] HIBERNATE DEBUG - binding '%Tom%' to parameter: 1
		

The SQL itself may seem obvious to anyone familiar with SQL. But since Hibernate is dealing with objects and properties, it can sometimes take a bit of trial and error to work out how traverse properties that are collections of objects.

I will stop here for today. In the next entry, we'll focus on using HQL to optimize the loading of larger object graphs.

Comments Comments (14) | del.ico.us del.icio.us | Digg It! Digg It! | Linking Blogs Linking Blogs | 16302 Views

Using ColdFusion ORM and HQL, Part 3: Bidirectional Relationships

In this entry I want to take a closer look at the relationship I have set up between BlogEntry and BlogComment. Since I want to be able to ask an entry for its comments, and I also want to be able to ask a comment for its entry, this is a bidirectional relationship. As a result, this takes a bit of additional thought to get working optimally.

Recall that BlogEntry looks like this right now:

component displayname="BlogEntry" extends="Entity" output="false" 
		  persistent="true" accessors="true"
{
	property name="title" type="string" ormtype="string";
	property name="content" type="string" ormtype="text";
	property name="categories" singularname="category" fieldtype="many-to-many" 
  		     cfc="BlogCategory" linktable="entry_category" 
  		     fkcolumn="entryId" inversejoincolumn="categoryId";
	property name="comments" singularname="comment" fieldtype="one-to-many"
  		     cfc="BlogComment" fkcolumn="entryId" 
  		     orderby="dateCreated ASC" inverse="true";
	
	BlogEntry function init()
	{
		variables.categories = [];
		variables.comments = [];
		return this;
	}	
}
		

And that BlogComment looks like:

component displayname="BlogComment" extends="Entity" output="false"
		  persistent="true" accessors="true"
{
	property name="content" type="string" ormtype="text";
	property name="blogEntry" fieldtype="many-to-one" 
 	  	     fkcolumn="entryId" cfc="BlogEntry";  
}
		

The first thing to make note of is that the comments collection in BlogEntry is marked as inverse. As I (and others) have mentioned previously, it is important to designate one side of a bidirectional relationship as the owning side. By setting the comments collection as inverse, I am telling Hibernate that the other side of this relationship is the owning side. In other words, the blogEntry property in BlogComment is the owner of this relationship.

This has a very important effect on the relationship: Hibernate will ignore all persistence operations regarding this relationship on the side that is marked as inverse. Which means that in order to get Hibernate to properly save this relationship, I must set the blogEntry property in BlogComment. Put another way, I'm telling Hibernate: "when it comes to his relationship, I want you to use the blogEntry property of BlogComment to dictate what you persist to the database".

I need to emphasize one additional point, which is that even though Hibernate only cares about what is set into the blogEntry property when it comes to persisting my data, my domain model most certainly cares about both sides of the relationship. If I set the blogEntry property on one of my BlogComments, then that BlogComment better be present in the comments collection of my BlogEntry.

So it looks like we have two different issues. First, how do I make sure the owning side of the relationship (the blogEntry property) is always set correctly. And second, how do I make sure that the collection of BlogComments in a BlogEntry is also kept up to date?

At first glace, the solution might just be to dictate that anyone who wants to set up this bidirectional relationship needs to know that they should always do both comment.setBlogEntry( blogEntry ) and blogEntry.addComment( comment ). And that will certainly work. But it's a poor solution, and the reason why is probably obvious: if someone forgets to set up both sides, my domain model spirals into an invalid state.

The good news is that there is a simple best-practice way to enforce this relationship. It is called an association management method. Which just means I should create methods on my BlogEntry and my BlogComment to ensure that both sides are always set together.

I'll start with the owning side, which in this case is BlogComment.

component displayname="BlogComment" extends="Entity" output="false"
		  persistent="true" accessors="true"
{
	property name="content" type="string" ormtype="text";
	property name="blogEntry" fieldtype="many-to-one" 
 	  	     fkcolumn="entryId" cfc="BlogEntry" notnull="true";  

	function setBlogEntry( blogEntry )
	{
		if( !IsNull( arguments.blogEntry ) )
		{
			variables.blogEntry = arguments.blogEntry;		

			if( !arguments.blogEntry.hasComment( this ) )
			{
				ArrayAppend( arguments.blogEntry.getComments(), this );
			}
		}	
	}
}
		

The code is very straightforward, but just to make sure, I'll go through it. When a blogEntry is set on a BlogComment, the first thing I do is check to make sure the incoming blogEntry is not null. In my model, I can't have a comment without an entry. Next, I set the value of blogEntry to the incoming value. And finally, if the incoming blogEntry doesn't have this comment in its collection of comments, I add it to the comments collection.

Since BlogComment already has all of the logic I need to set up the relationship, all I need to do in BlogEntry is delegate to the setBlogEntry method in BlogComment:

component displayname="BlogEntry" extends="Entity" output="false" 
		  persistent="true" accessors="true"
{
	property name="title" type="string" ormtype="string";
	property name="content" type="string" ormtype="text";
	property name="categories" singularname="category" fieldtype="many-to-many" 
  		     cfc="BlogCategory" linktable="entry_category" 
  		     fkcolumn="entryId" inversejoincolumn="categoryId";
	property name="comments" singularname="comment" fieldtype="one-to-many"
  		     cfc="BlogComment" fkcolumn="entryId" 
  		     orderby="dateCreated ASC" inverse="true";
	
	BlogEntry function init()
	{
		variables.categories = [];
		variables.comments = [];
		return this;
	}	

	function addComment( comment )
	{
		comment.setBlogEntry( this );
	}
}
		

With these association management methods set up, I now know that any time someone sets a blogEntry on a comment, or adds a comment to a blogEntry, both sides of the relationship will always be set. I would strongly recommend that you adopt a strategy like this to enforce the rules for your bidirectional relationships.

In the next few entries, I'll move on to the topic of HQL. I want to show some of the ways you can leverage HQL to optimize the way Hibernate interacts with the database.

Comments Comments (4) | del.ico.us del.icio.us | Digg It! Digg It! | Linking Blogs Linking Blogs | 12510 Views

Using ColdFusion ORM and HQL, Part 2: ORM Event Handler

In the previous entry, I described the basic blog application I want to create. We looked at the model, the relationships needed, and whipped up the CFCs to implement that model.

One of the things I mentioned is that I want each entity to be able to keep track of when it was created and who created it. This could be done manually, but I like to avoid manual whenever I can. The good news is that we can easily do this automatically.

The solution is to use Hibernate's event model. In Hibernate terms, the classes that are notified about persistence activities are called interceptors. Happily, the ColdFusion development team have given us access to this capability by letting us specify event handler CFCs in our ORM configuration.

I should note that I am running CF 9.0.1 with the cumulative hotfix installed. I'll set up my Application.cfc like this:

component name="Application.cfc" output="false"
{
    this.name = Hash( GetCurrentTemplatePath() );
    
    this.sessionManagement = true;
    this.sessionTimeout = CreateTimeSpan( 0, 0, 0, 10 );
    this.setClientCookies = true;
    this.setDomainCookies = true;
    
    this.appRoot = ExpandPath( '.' );
    this.mappings = {};
    this.mappings["/ormblog"] = this.appRoot;
    
    this.datasource = "ormblog";
    this.ormEnabled = true;
    this.ormSettings.dialect = "MySQLwithInnoDB";
    this.ormSettings.logSQL = true;
    this.ormSettings.saveMapping = false;
    this.ormSettings.eventHandler = 'ormblog.model.orminterceptor.AuditInterceptor';
    this.ormSettings.flushAtRequestEnd = false;
    this.ormSettings.autoManageSession = false;
    this.ormsettings.dbcreate = "dropcreate";
}
		

You can see that I'm specifying an AuditInterceptor CFC as an ORM event handler. Any event handler CFC we define must implement the CFIDE.ORM.IEventHandler interface, so my AuditInterceptor looks like this:

import ormblog.model.domain.*;

component output="false" implements="CFIDE.ORM.IEventHandler"
{
    public void function preInsert( any entity )
    {
   	 entity.setDateCreated( Now() );
   	 if( IsNull( entity.getCreatedBy() ) 
             && StructKeyExists( session, 'currentUser' ) )
   	 {
   		 entity.setCreatedBy( EntityLoadByPK( 'User', session.currentUser ) );
   	 }
    }
    
    public void function preLoad( any entity )
    {
    }
    
    public void function postLoad( any entity )
    {
    }
    
    public void function postInsert( any entity )
    {
    }
    
    public void function preUpdate( any entity, Struct oldData )
    {
    }
    
    public void function postUpdate( any entity )
    {
    }
    
    public void function preDelete( any entity )
    {
    }
    
    public void function postDelete( any entity )
    {
    }

}
		

In my preInsert() method, I'm setting the dateCreated, and setting the createdBy to the currently logged in User. In a real application, I would probably avoid having my interceptor reach into the session scope by using ColdSpring to inject a Session Proxy object that encapsulates interaction with the session scope. But in the interest of keeping this example simple and focused, we'll just grab the current user's ID from the session scope.

The end result of this is that any time I persist a new entity, the dateCreated and createdBy properties will be set automatically. And this could easily be expanded, for example if I ever want to track dateUpdated or updatedBy, etc.

If I only wanted to perform this audit tracking on certain CFCs, rather than all subclasses of my Entity class, I could create another subclass of Entity called AuditableEntity and have my CFCs extend that, or I could create an Auditable interface and have my CFCs implement that interface. The AuditInterceptor could then be changed to only set the audit properties if the CFC being saved is of type AuditableEntity/Auditable. But I digress. ;-)

We've now completed the automatic population of dateCreated and updatedBy. In the next entry I'll dig deeper into the bi-directional relationship between BlogEntry and BlogComment.

Comments Comments (5) | del.ico.us del.icio.us | Digg It! Digg It! | Linking Blogs Linking Blogs | 12171 Views

Using ColdFusion ORM and HQL, Part 1

Now that the Hibernate integration in ColdFusion 9 has been in use for a while, we're seeing an increasing number of questions on the CF-ORM mailing list. This isn't surprising, Hibernate is a vast topic in and of itself, and the CF documentation can really only scratch the surface. There are entire books just dedicated to using Hibernate (my personal favorite being Java Persistence with Hibernate).

It's one thing to look at the documentation and understand the simple use cases, but quite another to actually get into real world application development with Hibernate. With this in mind, I'm going to be posting a few blog entries that explore using CF and Hibernate in some more complex scenarios.

The end result will be a rudimentary blog application, and I'll post that code in the last blog entry on this topic. But before we get to that, I want to back up and look at how one might approach the design of this application.

Here is a simple UML diagram of the domain model I want to create:

Contrary to good OO design, I'm not focusing on the behavior of these objects at all due to the fact that this is specifically an ORM example. I am also making the following assumptions, which affected the way the associations were defined, and will drive some later decisions when it comes to loading the objects:

  • A BlogEntry will typically be associated with a few BlogCategories (between one and four, but maybe a few more)
  • A large number of BlogEntries (hundreds or thousands) can be associated with a BlogCategory
  • A low to moderate number of BlogComments can be associated with a BlogEntry (probably up to a few dozen unless the entry triggers an unusual number of comments)
  • Each entity should keep track of when it was created and who created it

With this in mind, the relationships are defined as follows:

  • One User can be associated with many Entities via the createdBy property
  • One BlogCategory can be associated with many BlogEntries
  • Many BlogComments can be associated with a BlogEntry
  • One BlogEntry can be associated with a BlogComment

Translated into code, the entities look like this:

component displayname="Entity" output="false" mappedsuperclass="true" accessors="true"
{
	property name="id" fieldtype="id" type="numeric" ormtype="int" generator="native";
	property name="dateCreated" type="date" ormtype="timestamp";
	property name="createdBy" fieldtype="many-to-one" cfc="User" fkcolumn="createdBy";
}
		

component displayname="User" extends="Entity" persistent="true"
		  output="false" accessors="true"
{
	property name="name" type="string" ormtype="string";
}
		

component displayname="BlogEntry" extends="Entity" output="false" 
		  persistent="true" accessors="true"
{
	property name="title" type="string" ormtype="string";
	property name="content" type="string" ormtype="text";
	property name="categories" singularname="category" fieldtype="many-to-many" 
  		     cfc="BlogCategory" linktable="entry_category" 
  		     fkcolumn="entryId" inversejoincolumn="categoryId";
	property name="comments" singularname="comment" fieldtype="one-to-many"
  		     cfc="BlogComment" fkcolumn="entryId" 
  		     orderby="dateCreated ASC" inverse="true";
	
	BlogEntry function init()
	{
		variables.categories = [];
		variables.comments = [];
		return this;
	}	
}
		

component displayname="BlogCategory" extends="Entity" output="false"
		  persistent="true" accessors="true"
{
	property name="name" type="string" ormtype="string";
}
		

component displayname="BlogComment" extends="Entity" output="false"
		  persistent="true" accessors="true"
{
	property name="content" type="string" ormtype="text";
	property name="blogEntry" fieldtype="many-to-one" 
 	  	     fkcolumn="entryId" cfc="BlogEntry";  
}
		

As you can see, this is really a tiny amount of code when you consider everything that is going on under the hood. This gives me a solid basis for the application. I can create these objects, set properties and associations, and persist them to the database. But that's the easy part. I want to go further than the simple use cases.

In the upcoming entries, I'll look at how to best handle the bi-directional association between BlogEntry and BlogComment, as well as how to automatically set the createdBy and dateUpdated properties in Entity. I'll also be taking an in-depth look at using HQL for filtering, sorting, and optimizing the way the objects are created to eliminate unnecessary database queries.

Comments Comments (6) | del.ico.us del.icio.us | Digg It! Digg It! | Linking Blogs Linking Blogs | 11635 Views

More Entries