Thursday, August 24, 2006

Watch that SQL!

Long before I used object relational mapping tools like Hibernate or Toplink in Java or ActiveRecord in Ruby on Rails I used to write a fair amount of SQL (some real nasty queries too!). And in a lot of the shops where I worked it was standard practice to create an explain plan just to make sure that the database could execute the SQL efficiently. If it couldn't, I would dutifully restructure the query, add indexes, or go ask for some DBA help.

But nowadays in this wonderful land of ORMs people just let the framework crank out the queries and never ever look at them. To me, that's just plain wrong. If you're using Hibernate turn on the logging. If you're using Rails just go take a look at the development.log file and go watch what's going on! You may be surprised.

I started a new job a few weeks ago and just wanted to see what Hibernate was doing. And lo and behold, a JSP page was causing 69 queries to run in order to generate a simple list of rows from a table. The main problem was that every lazily instantiated relationship for each object in the collection was being traversed after the initial fetch.

So what could have been done? Well in this case, Hibernate has several options:

  1. Hibernate's query language is quite expressive and something like select e from Employee e left outer join fetch e.department d would have caused both objects to be fetched in a single query.
  2. The fetch keyword I used in the first example has one limitation (for good reason too). It can only be used to fetch one collection. So take advantage of Hibernate's caching to prefetch related objects instead. Hibernate will associate any newly instantiated objects with objects that already exist in cache. Even if you're not using something like EHCache you still have some caching built into your session so don't discount this tip right away.
  3. Use a DTO-like object and go old-school and only select the fields you need with something like: select new EmployeeListDTO( e.firstName, e.lastName, d.name) from Employee e left outer join e.department d
Now if you're using ActiveRecord you don't have the double edged sword of a cache so the second option I described above isn't available, but the other two are possibilities:

  1. Employee.find :all :include => :department
  2. Employee.find_by_sql "select e.first_name, e.last_name, d.name from employees e left outer join departments d on e.department_id = d.id"
The second option shows one distinct design difference between Hibernate and ActiveRecord. Hibernate gives you a SQL-like query language but does away with having to specify join constraints because that information is in the mapping metadata. Whereas ActiveRecord says why reinvent SQL and just let's you have at it. Oddly enough I like both approaches...

Finally, be sure to use the ad-hoc query tools at your disposal to try out your queries before you embed them in your app. If you're using Hibernate 2.x then go get a copy of Hibernate Console (part of the Hibernate Tools). It can be a bit of a pain to get it setup but I really appreciate being able to work out the query before running it as part of my app. If you're using Hibernate 3.x and Eclipse then go get the Hibernate Tools plugin. If you're using Ruby on Rails then you already have all the tools you need. Simply drop to the command line, cd to your project directory and then type script/console. Open up a second shell window (if you're running unix) and type tail -f log/development.log from your project directory and you can instantly watch what SQL Rails generates when you execute those find methods.

Now go out there and watch what you're doing for database access! Don't let all that ORM goodness make you lazy.

No comments: