In this blog post I want to illustrate how we can eager load child and parent objects inside memory using NHibernate and how to avoid the nasty problem of creating Cartesian products. I will show you how this can be achieved using the three different type of Query pattern implemented inside NHibernate.

For this example I am using the version 3.3 of NHibernate against a SQLite database to have some quick “in memory” tests.

The Domain Model

My model is quite straighforward, is composed by a Person entity and two child collections, Address and Phone, like illustrated in the following picture:

image

For the Id I am using a System.Guid data type, for the collections I am using an IList<T> and the mapping is achieved using <Bag> with the inverse=”true” attribute. I don’t write the remaining mapping for simplicity.

<class name="Person" abstract="false" table="Person">
  <id name="Id">
    <generator class="guid.comb" />
  </id>

  <property name="FirstName" />
  <property name="LastName" />

  <bag name="Addresses" inverse="true" table="Address" cascade="all">
    <key column="PersonId" />
    <one-to-many class="Address"/>
  </bag>

  <bag name="Phones" inverse="true" table="Phone" cascade="all">
    <key column="PersonId" />
    <one-to-many class="Phone"/>
  </bag>
</class>

NHibernate Linq

With the Linq extension for NHibernate, I can easily eager load the two child collections using the following syntax:

Person expectedPerson = session.Query<Person>()
    .FetchMany(p => p.Phones)
        .ThenFetch(p => p.PhoneType)
    .FetchMany(p => p.Addresses)
    .Where(x => x.Id == person.Id)
    .ToList().First();

The problem of this query is that I will receive a nasty Cartesian product. Why? Well let’s have a look at the SQL generated by this Linq using NHibernate  profiler:

image

In my case I have 2 Phone records and 1 Address record that belong to the parent Person. If I have a look at the statistics I can see that the total number of rows is wrong:

image

Unfortunately, if I write the following test, it passes, which means that my Root Aggregate entity is wrongly loaded:

// wrong because address is only 1
expectedPerson.Addresses
   .Count.Should().Be(2, "There is only one address");
expectedPerson.Phones
   .Count.Should().Be(2, "There are two phones");

The solution is to batch the collections into two different query, without affecting too much the Database performances. In order to achieve this goal I have to use the Future syntax and tell to NHibernate to build a Root Aggregate with three database batch calls:

// create the first query
var query = session.Query<Person>()
      .Where(x => x.Id == person.Id);
// batch the collections
query
   .FetchMany(x => x.Addresses)
   .ToFuture();
query
   .FetchMany(x => x.Phones)
   .ThenFetch(p => p.PhoneType)
   .ToFuture();
// execute the queries in one roundtrip
Person expectedPerson = query.ToFuture().ToList().First();

Now if I profile my query, I can see that the entities loaded are loaded using 3 SQL queries but batched together into one single database call:

image

Regarding the performances, this is the difference between a Cartesian product and a Batch call:

image

NHibernate QueryOver

The same mechanism is available also for the QueryOver<T> component, we can instruct NHibernate to create a left outer join, and get a Cartesian product, like the following statement:

Phone phone = null;
PhoneType phoneType = null;
// One query
Person expectedPerson = session.QueryOver<Person>()
    // Inner Join
    .Fetch(p => p.Addresses).Eager
    // left outer join
    .Left.JoinAlias(p => p.Phones, () => phone)
    .Left.JoinAlias(() => phone.PhoneType, () => phoneType)
    .Where(x => x.Id == person.Id)
    .TransformUsing(Transformers.DistinctRootEntity)
    .List().First();

As you can see here I am trying to apply the transformer DistinctRootEntity, but unfortunately the transformer does not work if you eager load more than 1 child collection, because the Database returns more than 1 instance of the same Root Aggregate.

Also in this case, the alternative is to Batch the collections and send 3 queries to the Database in one round trip:

Phone phone = null;
PhoneType phoneType = null;
// prepare the query
var query = session.QueryOver<Person>()
    .Where(x => x.Id == person.Id)
    .Future();
// eager load in one batch the first collection
session.QueryOver<Person>()
    .Fetch(x => x.Addresses).Eager
    .Future();
// second collection with grandchildren
session.QueryOver<Person>()
    .Left.JoinAlias(p => p.Phones, () => phone)
    .Left.JoinAlias(() => phone.PhoneType, () => phoneType)
    .Future();
// execute the query
Person expectedPerson = query.ToList().First();

Personally, the only thing that I don’t like about QueryOver<T> is the syntax, as you can see from my complex query I need to create some empty pointers to the object Phone and PhoneType. I don’t like it because when I batch 3,  4 collections I always come up with 3, 4 variables that are quite ugly and useless.

NHibernate HQL

HQL is a great query language, it allows you to really write any kind of complex query and the biggest advantage, compared to Linq or QueryOver<T> is the fully support by the framework.

The only downside is that it requires “magic strings”, so you must be very careful on what query you write because it is very easy to write wrong queries and get a nice runtime exception.

So, also in this case, I can eager load everything in one shot, and get again a Cartesian product:

Person expectedPerson =
    session.CreateQuery(@"
    from Person p 
    left join fetch p.Addresses a 
    left join fetch p.Phones ph 
    left join fetch ph.PhoneType pt
    where p.Id = :id")
        .SetParameter("id", person.Id)
        .List<Person>().First();

Or batch 3 different HQL query in one Database call:

// prepare the query
var query = session.CreateQuery("from Person p where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();
// eager load first collection
session.CreateQuery("from Person p 
                     left join fetch p.Addresses a where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();
// eager load second collection
session.CreateQuery("from Person p
                     left join fetch p.Phones ph 
                     left join fetch ph.PhoneType pt where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();

Eager Load vs Batch

Actually I run some tests in order to understand if the performances are better by:

  • Running an eager query and clean manually the duplicated records
  • Run a batch set of queries and get a clean Root Aggregate

These are my results:

image

And surprisingly the eager load + C# cleanup is slower than the batch call. Smile

One Thought on “NHibernate Fetch strategies

  1. Pingback: Dew Drop – July 7, 2014 (#1809) | Morning Dew

Post Navigation