Category: NHibernate

NHibernate Fetch strategies

In this blog post I want to illustrate how we can eager load child and parent objects into memory using NHibernate, and how to avoid the nasty problem of creating Cartesian products. I will show how this can be achieved using the three different query APIs available in NHibernate.

For this example I am using version 3.3 of NHibernate against a SQLite database, so I can run some quick "in memory" tests.

The Domain Model

My model is quite straightforward: it is composed of a Person entity and two child collections, Address and Phone, as illustrated in the following picture:

image

For the Id I am using a System.Guid data type, for the collections I am using an IList<T>, and the mapping is done with a <bag> using the inverse="true" attribute. I won't show the remaining mappings for simplicity.

<class name="Person" abstract="false" table="Person">
  <id name="Id">
    <generator class="guid.comb" />
  </id>

  <property name="FirstName" />
  <property name="LastName" />

  <bag name="Addresses" inverse="true" table="Address" cascade="all">
    <key column="PersonId" />
    <one-to-many class="Address"/>
  </bag>

  <bag name="Phones" inverse="true" table="Phone" cascade="all">
    <key column="PersonId" />
    <one-to-many class="Phone"/>
  </bag>
</class>

NHibernate Linq

With the Linq extension for NHibernate, I can easily eager load the two child collections using the following syntax:

Person expectedPerson = session.Query<Person>()
    .FetchMany(p => p.Phones)
        .ThenFetch(p => p.PhoneType)
    .FetchMany(p => p.Addresses)
    .Where(x => x.Id == person.Id)
    .ToList().First();

The problem with this query is that it produces a nasty Cartesian product. Why? Let's have a look at the SQL generated by this Linq query using NHibernate Profiler:

image

In my case I have 2 Phone records and 1 Address record that belong to the parent Person. If I have a look at the statistics I can see that the total number of rows is wrong:

image

Unfortunately, if I write the following test, it passes, which proves that my Root Aggregate is wrongly loaded:

// wrong: there is only one address, but the Cartesian product duplicates the row
expectedPerson.Addresses
   .Count.Should().Be(2, "There is only one address");
expectedPerson.Phones
   .Count.Should().Be(2, "There are two phones");

The solution is to batch the collections into separate queries, without affecting the database performance too much. To achieve this I have to use the Future syntax and tell NHibernate to build the Root Aggregate with three queries sent to the database in a single batched call:

// create the first query
var query = session.Query<Person>()
      .Where(x => x.Id == person.Id);
// batch the collections
query
   .FetchMany(x => x.Addresses)
   .ToFuture();
query
   .FetchMany(x => x.Phones)
   .ThenFetch(p => p.PhoneType)
   .ToFuture();
// execute the queries in one roundtrip
Person expectedPerson = query.ToFuture().ToList().First();

Now if I profile my query, I can see that the entities are loaded using 3 SQL queries, batched together into one single database call:

image

Regarding performance, this is the difference between the Cartesian product and the batched call:

image

NHibernate QueryOver

The same mechanism is also available for the QueryOver<T> API: we can instruct NHibernate to eager load the collections with joins, and get a Cartesian product, as in the following statement:

Phone phone = null;
PhoneType phoneType = null;
// One query
Person expectedPerson = session.QueryOver<Person>()
    // Inner Join
    .Fetch(p => p.Addresses).Eager
    // left outer join
    .Left.JoinAlias(p => p.Phones, () => phone)
    .Left.JoinAlias(() => phone.PhoneType, () => phoneType)
    .Where(x => x.Id == person.Id)
    .TransformUsing(Transformers.DistinctRootEntity)
    .List().First();

As you can see, here I am applying the DistinctRootEntity transformer, but unfortunately the transformer does not help when you eager load more than one child collection, because the database returns more than one row for the same Root Aggregate.

Also in this case, the alternative is to batch the collections and send 3 queries to the database in one round trip:

Phone phone = null;
PhoneType phoneType = null;
// prepare the query
var query = session.QueryOver<Person>()
    .Where(x => x.Id == person.Id)
    .Future();
// eager load the first collection in the same batch
session.QueryOver<Person>()
    .Where(x => x.Id == person.Id)
    .Fetch(x => x.Addresses).Eager
    .Future();
// second collection with its grandchild reference
session.QueryOver<Person>()
    .Where(x => x.Id == person.Id)
    .Left.JoinAlias(p => p.Phones, () => phone)
    .Left.JoinAlias(() => phone.PhoneType, () => phoneType)
    .Future();
// execute the query
Person expectedPerson = query.ToList().First();

Personally, the only thing I don't like about QueryOver<T> is the syntax: as you can see from this query, I need to declare empty alias variables for Phone and PhoneType. When I batch 3 or 4 collections I always end up with 3 or 4 of these variables, which are quite ugly and useless.

NHibernate HQL

HQL is a great query language: it lets you write almost any kind of complex query and, compared to Linq or QueryOver<T>, its biggest advantage is that it is fully supported by the framework.

The only downside is that it requires "magic strings", so you must be very careful with the queries you write, because it is very easy to write a wrong query and get a runtime exception.

So, also in this case, I can eager load everything in one shot, and get again a Cartesian product:

Person expectedPerson =
    session.CreateQuery(@"
    from Person p 
    left join fetch p.Addresses a 
    left join fetch p.Phones ph 
    left join fetch ph.PhoneType pt
    where p.Id = :id")
        .SetParameter("id", person.Id)
        .List<Person>().First();

Or batch 3 different HQL queries in one database call:

// prepare the query
var query = session.CreateQuery("from Person p where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();
// eager load first collection
session.CreateQuery("from Person p 
                     left join fetch p.Addresses a where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();
// eager load second collection
session.CreateQuery("from Person p
                     left join fetch p.Phones ph 
                     left join fetch ph.PhoneType pt where p.Id = :id")
        .SetParameter("id", person.Id)
        .Future<Person>();

Eager Load vs Batch

I ran some tests in order to understand whether the performance is better when:

  • Running an eager query and manually cleaning up the duplicated records
  • Running a batched set of queries and getting a clean Root Aggregate

These are my results:

image

Surprisingly, the eager load plus the C# cleanup is slower than the batched call.

NHibernate cache system. Part 3

In this series of articles we saw how the cache system is implemented in NHibernate and what we can do in order to use it. We also saw that we can choose a cache provider, but we haven't yet looked at which providers are available.

I have my own opinion about the 2nd level cache providers available for NHibernate 3.2, and I am more than happy if you would like to share your experience about them.

Below is a list of the major and most famous 2nd level cache providers I know:

image

I would personally suggest answering the following questions in order to understand which cache provider you need for your project:

  • The size of your project (small, standard, enterprise)
  • The cache technology you already have in place
  • The quality attributes required by your solution (scalability, security, …)

SysCache

SysCache, and the more recent SysCache2, is a cache provider built on top of the old ASP.NET cache system (NHibernate.Cache.SysCache.dll).

It is an abstraction over the ASP.NET cache; it can be used in a non-web application and it works, but Microsoft suggests not doing so.

The cache space is not configurable, so it is shared between different session factories … really dangerous in a web application that requires isolation between the various users (http://ayende.com/blog/1708/nhibernate-caching-the-secong-level-cache-space-is-shared).

Useful for: small in-house projects, better for web projects hosted in IIS.

NCache

NCache is a distributed in-memory object cache and a distributed ASP.NET Session State manager.

It is able to synchronize the cache across multiple servers, so it is designed for the enterprise as well. Its main features are:

  • Dynamic clustering and cache configuration for 100% uptime, for a truly scalable architecture
  • Cache reliability through data replication across servers
  • InProc/OutProc cache for multiple processes on the same machine
  • An API identical to the ASP.NET Cache

It is available for download and trial here: http://www.alachisoft.com/ncache/edition-comparison.html. It is a third party provider and it is not free.

Useful for: medium to big applications that are designed to be scalable.

MemCache

MemCache is a famous cache system born in the Linux world and designed for the enterprise. It is a distributed enterprise cache system, based on the Linux platform, that also provides a cache mechanism for NHibernate.
It is pretty easy to scale on a big server farm because it is designed to do exactly that.
It does not require licensing costs because it is open source, and it is a well known system with a big community.

The following picture represents the core logic of MemCache:

image

The only downside is that it requires a reasonable knowledge of the Linux OS in order to install and configure it.

Useful for: enterprise applications that are designed to be scalable.

Velocity, a.k.a. AppFabric

Velocity has now been integrated into AppFabric and it is the cache system implemented by Microsoft for the enterprise. It requires AppFabric and IIS and it can be used locally or with Azure (does it even make sense to cache in the cloud?).

  • AppFabric Caching provides local caching, bulk updates, callbacks for updates, and so on; this is why it is exciting compared to something like MemCache, which does not provide these features out of the box.
  • It is really scalable and suited for enterprise architectures, and it is a Microsoft product (there may be a license requirement).

Useful for: enterprise applications that are designed to be scalable.

NHibernate cache system. Part 2

In the previous post we saw how the cache system is structured in NHibernate and how it works. We saw that we have different methods to play with the cache (Evict, Clear, Flush …) and that they are all exposed by the ISession object, because the level 1 cache is tied to the lifecycle of an ISession.

In this second article we will see how the second level cache works and how it is associated with the ISessionFactory object, which is in charge of controlling this cache mechanism.

Second Level cache architecture

How does the second level cache work?

image

First of all, when an entity is cached in the second level cache, the entity is disassembled into a collection of key/value pairs, like a dictionary, and persisted in the cache repository. This works because most of the second level cache providers are able to persist serialized dictionary collections, and at the same time NHibernate does not force you to make your entities serializable (something that, IMHO, should never be done!!).

A second mechanism kicks in when we cache the result of a query (Linq, HQL, ICriteria), because these results can't be cached using the first level cache (see the previous blog post). When we cache a query result, NHibernate caches only the unique identifiers of the entities returned by the query.

Third, NHibernate has an internal mechanism that allows it to keep track of a timestamp for each table it writes to and for the sessions it works with. How does it work? It keeps track of when each table was last written to, and a series of mechanisms update this timestamp information; you can find a better explanation on Ayende's blog: http://ayende.com/blog/3112/nhibernate-and-the-second-level-cache-tips.

Configuration of the second level cache

By default the second level cache is disabled. If you want to use it you have to let NHibernate know about that. The hibernate.cfg.xml file has a dedicated section of parameters used to enable the second level cache:

First of all we specify the cache provider we are using; in this case I am using the standard hashtable provider, but in the next article I will show you which providers you should really use. Second, we say that the cache should be enabled; this part is really important because if you do not specify that the cache is enabled, it simply won't work.

Then you may give the cache a default expiration, expressed in seconds.
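As a minimal sketch, these are the properties involved, placed inside the <session-factory> element of hibernate.cfg.xml; I am assuming the built-in HashtableCacheProvider and an expiration of 120 seconds, which are just example values (the query cache property is needed only if you also want to cache query results, as shown later in this post):

<!-- which cache provider to use (the hashtable one is the built-in default, not suited for production) -->
<property name="cache.provider_class">NHibernate.Cache.HashtableCacheProvider, NHibernate</property>
<!-- the second level cache must be explicitly enabled -->
<property name="cache.use_second_level_cache">true</property>
<!-- needed only if you also want to cache query results -->
<property name="cache.use_query_cache">true</property>
<!-- default expiration of the cached entries, in seconds -->
<property name="cache.default_expiration">120</property>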

Any additional configuration properties you may add are cache provider specific!

Cache by mapping

One of the possible configurations is to enable the cache at the entity level. This means that we are marking our entity as “cacheable”.

In order to do that we have to introduce a new tag, the <cache> tag, in the mapping; a sketch of such a mapping follows the list below. In this tag we can specify different types of “usage”:

  • Read-write

    Use it if you also plan to update the cached data (it does not work with serializable transactions)

  • Read-only

    The simplest and best performing option, for read-only access

  • Nonstrict-read-write

    Use it if you only occasionally need to update the data; you must commit the transaction

  • Transactional

    Not documented/implemented yet, because no cache provider supports a transactional cache. It is implemented in the Java version, because J2EE allows a transactional second level cache
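As a minimal sketch, assuming the Person entity used in the previous post, marking it as cacheable just means adding the <cache> element right after the opening <class> tag:

<class name="Person" table="Person">
  <cache usage="read-write" />
  <id name="Id">
    <generator class="guid.comb" />
  </id>
  <property name="FirstName" />
  <property name="LastName" />
</class>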

Now, if we write a simple test that creates some entities and then retrieves one of them using two different ISession objects generated by the same ISessionFactory, we get the following behavior:
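A minimal sketch of such a test (sessionFactory is a hypothetical field holding the ISessionFactory, and the data values are just examples):

Guid personId;
// setup: save a person
using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var person = new Person { FirstName = "John", LastName = "Doe" };
    session.Save(person);
    tx.Commit();
    personId = person.Id;
}
// first session: Get<T> hits the database and puts the entity into the 2nd level cache
using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    session.Get<Person>(personId);
    tx.Commit();
}
// second session: the entity is resolved from the 2nd level cache, no SQL is issued
using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var cachedPerson = session.Get<Person>(personId);
    Assert.That(cachedPerson.FirstName, Is.EqualTo("John"));
    tx.Commit();
}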

The result will be the following:

image

As you can see, the second session accesses the 2nd level cache inside a transaction and does not use the database at all. This has been accomplished just by mapping the entity with the <cache> tag and by using the Get<T> method.

Let's make everything a little bit more complex. Let's assume for a second that our object is an aggregate root and is more complex than the previous one. If we also want to cache a child collection or a parent reference, we need to change our mapping in the following way:
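As a hedged sketch of the idea (I am assuming a Category aggregate with a Products collection, which matches the numbers mentioned in the statistics below), the child class and the collection itself must be marked as cacheable too:

<class name="Category" table="Category">
  <cache usage="read-write" />
  <id name="Id">
    <generator class="guid.comb" />
  </id>
  <property name="Name" />
  <bag name="Products" inverse="true" cascade="all">
    <!-- the collection gets its own cache entry and must be marked explicitly -->
    <cache usage="read-write" />
    <key column="CategoryId" />
    <one-to-many class="Product" />
  </bag>
</class>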

Now we can execute the following test (I am omitting some parts to save space, I hope you don't mind …)

And this is the result from the profiled SQL:

image

In this case the second ISession is calling the cache 4 times in order to resolve all the objects (2 products x 2 categories).

Cache a query result

Another way to cache our results is to create a cacheable query, which is slightly different from creating a cacheable entity.

Important note:

In order to cache a query we need to mark the query as “cacheable” and mark the corresponding entity as “cacheable” too. Otherwise NHibernate will cache only the IDs returned by the query and will then go back to the database to fetch each entity.

To write a cacheable query we need to configure an IQuery object in the following way:
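A minimal sketch, assuming the usual Person entity and a hypothetical sessionFactory; the key call is SetCacheable(true) on the IQuery:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    var persons = session
        .CreateQuery("from Person p where p.LastName = :lastName")
        .SetString("lastName", "Doe")
        // mark the query result as cacheable (requires cache.use_query_cache = true)
        .SetCacheable(true)
        .List<Person>();
    tx.Commit();
}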

Now, let’s try to write a unit test for this:

And this is the expected result:

image

In this case the cache is telling us that the second session has 1 query result cached and that we called it once.

Final advice

As you saw, using the 1st and 2nd level cache is a pretty straightforward process, but it requires time and a good understanding of the NHibernate cache mechanism. Below is some final advice you should keep in mind when working with the 2nd level cache:

  • The 2nd level cache is never aware of external database changes!
  • The default cache provider is the hashtable one; you must use a different one in production
  • A wrong implementation of the 2nd level cache may result in an unexpected performance degradation (see the hashtable provider documentation)
  • The first level cache is shared within the same ISession, the second level cache is shared across sessions from the same ISessionFactory

In the next article we will see what are the available cache providers.

NHibernate cache system. Part 1

In this series of articles I will try to explain how the NHibernate cache system works and how it should be used in order to get the best performance and configuration out of this product.

NHibernate Cache architecture

NHibernate has an internal cache architecture that I would describe as very well done. From an architectural point of view it is designed for the enterprise and it is 100% configurable. Consider that it even allows you to create your own custom cache provider!

The following picture shows the cache architecture overview of NHibernate (the version I am talking about is 3.2 GA).

image

The cache system is composed of two levels: the level 1 cache, which is enabled by default as soon as you work with the ISession object, and the level 2 cache, which is disabled by default.

The level 1 cache is provided by the ISession data context and is bound to the lifecycle of the ISession object. This means that as soon as you destroy (dispose) an ISession object, the level 1 cache is destroyed as well and all the corresponding objects are detached from the ISession. This cache works on a per-transaction basis and it is designed to reduce the number of database calls during the lifecycle of an ISession. As an example, you should rely on this cache if you need to access and modify an object multiple times within the same transaction.

The level 2 cache is provided by the ISessionFactory component and it is shared across all the sessions created by the same factory. Its lifecycle corresponds to the lifecycle of the session factory, and it provides a more powerful, but also more dangerous, set of features. It allows you to keep objects in cache across multiple transactions and sessions; the objects are available everywhere and not only on the client that is using a specific ISession.

Cache Level 1 mechanism

As soon as you create a new ISession (not an IStatelessSession!!) NHibernate starts to hold in memory, using a specific mechanism, all the objects involved with the current session. The methods NHibernate uses to load data into this cache are two: Get<T>(id) and Load<T>(id). This means that if you load one or more entities using Linq, HQL or ICriteria, NHibernate will not put them into the level 1 cache.

Another way to put an object into the level 1 cache is to use the persistence methods Save, Delete, Update and SaveOrUpdate.

image

As you can see from the previous picture, the ISession object is able to contain two different categories of entities: the ones we define as “loaded”, using Get or Load, and the ones we define as “dirty”, which means they were somehow modified and associated with the session.

Load and Get, what’s the difference?

A major source of confusion I have noticed while working with NHibernate is the incorrect usage of the two methods Get and Load, so let's see for a moment how they work and when they should or should not be used.

Get

  • Fetch method: retrieves the entire entity in one SELECT statement and puts the entity in the cache.
  • How it loads: it verifies whether the entity is in the cache, otherwise it executes a SELECT.
  • If the entity does not exist: it returns NULL.

Load

  • Fetch method: retrieves only the ID of the entity and returns a non-fetched proxy instance. As soon as you “hit” a property, the entity is loaded.
  • How it loads: it verifies whether the entity is in the cache, otherwise it executes a SELECT.
  • If the entity does not exist: it THROWS an exception.

I personally prefer Get because it returns NULL instead of throwing a nasty exception, but this is a personal choice; some of you may prefer Load because you want to avoid a database call until it is really needed.

Below are a couple of very simple tests that show how the Get and Load methods behave within the same ISession.

Get<T>()

In this test I created a list of Persons in one transaction and then cleared the session in order to be sure that nothing was left in the cache. Then I loaded one of the Person entities using the Get<T> method, and loaded it again with the same method call, in order to verify that the SELECT statement was issued only once.
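A hedged sketch of that test (sessionFactory is a hypothetical field holding the ISessionFactory, and the data values are just examples):

using (var session = sessionFactory.OpenSession())
{
    Guid personId;
    using (var tx = session.BeginTransaction())
    {
        var person = new Person { FirstName = "John", LastName = "Doe" };
        session.Save(person);
        tx.Commit();
        personId = person.Id;
    }

    // make sure nothing is left in the level 1 cache
    session.Clear();

    // first call: a SELECT is issued and the entity goes into the level 1 cache
    Person expectedPerson1 = session.Get<Person>(personId);
    // second call: no SQL, the entity comes straight from the level 1 cache
    Person expectedPerson2 = session.Get<Person>(personId);

    Assert.That(expectedPerson1, Is.Not.Null);
    Assert.That(expectedPerson2.FirstName, Is.Not.EqualTo(string.Empty));
}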

image

As you can see, NHibernate loads the entire entity from the database in the first call, and in the second one it simply returns it from the level 1 cache. Notice that NHibernate loads the entire entity already in the first Get<T> call.

Load<T>()

In this second test I am executing exactly the same steps as in the previous one, but this time I am using the Load<T> method, and the result is completely different! Look at the SQL log below:
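Again a hedged sketch of such a test, with the same setup and hypothetical names as before:

using (var session = sessionFactory.OpenSession())
{
    Guid personId;
    using (var tx = session.BeginTransaction())
    {
        var person = new Person { FirstName = "John", LastName = "Doe" };
        session.Save(person);
        tx.Commit();
        personId = person.Id;
    }

    session.Clear();

    // no SQL is issued here: Load<T> only returns an uninitialized proxy
    Person expectedPerson2 = session.Load<Person>(personId);

    // the SELECT is issued only now, when a property of the proxy is hit
    Assert.That(expectedPerson2.FirstName, Is.Not.EqualTo(string.Empty));
}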

image

Now NHibernate is not loading the entity from the database at all; it loads it only in the second call, when I try to hit one of the Person properties. If you debug this code you will notice that NHibernate issues the database call at the line Assert.That(expectedPerson2.FirstName, Is.Not.EqualTo(string.Empty)); and not before!

Session maintenance

If you are working with the ISession object in a client application, or if you are keeping it alive in a web application using some strange technique like storing it inside the HttpContext, you will realize sooner or later that sometimes the level 1 cache needs to be cleared.

Now, despite the fact that these methods (based on my personal experience) should never be used, because needing them usually means that your data layer is wrongly implemented, and despite the fact that their behavior may lead to unexpected results, NHibernate provides three different methods to clear the level 1 cache content.

 

  • Session.Clear: removes all the existing objects from the ISession without syncing them with the database.

  • Session.Evict: removes a specific object from the ISession without syncing it with the database.

  • Session.Flush: syncs the state of the objects in the session with the database by executing the pending statements.

I will probably write more about these three methods in a future post, but if you want to investigate them further I suggest you carefully read the NHibernate docs available here: http://www.nhforge.org/doc/nh/en/index.html

In the next article we will talk about the level 2 cache.

Book Review: Architecting Applications for the Enterprise.

Good morning everybody. First of all I want to apologize for my absence in December: I was ‘trying’ to deliver a component at the office and I was so busy and tired that I didn’t have time for the blog.

Let’s start these ‘holiday’ posts with an interesting review of a software architecture book written by two friends of mine: Andrea Saltarello and Dino Esposito.

image

The title is Microsoft .NET: Architecting Applications for the Enterprise (PRO-Developer) (Paperback), available at Amazon.com for 29.69 USD.

Chapters and sections.

The book is divided into two main sections: principles and design of the system.
The principles section talks about the architect and about architecture in software development; the design section explains how the application should be architected and developed.

The first part has these chapters:

  • Architects and architecture today
  • UML essentials
  • Design principles and patterns

And the second one has the following:

  • The business layer
  • The service layer
  • The Data access layer
  • The presentation layer

Description and overall.

This is the description provided on the back of the book and I completely agree with it.

“Make the right architectural decision up front – and improve the quality and reliability of your results. … you will learn how to apply the patterns and the techniques that help control project complexity …”

I am 100% satisfied with this book, as I already use the patterns and approaches explained in depth in it.

It’s a must-have for senior developers and software architects. You can’t miss it!!
First of all, this is the first book I have read that explains in depth what a Software Architect is and why this role is fundamental in the development of a complex application. Second, it explains in depth the different approaches you may use for the various layers of an application, starting from the DBMS and ending with the UI.

I hope every developer who works, or will work, with me reads this book, as it gives you a complete overview of how an application should be developed and when a particular layer should or should not be used.

There is also a complete open source project (NSK) on CodePlex where you can see all the patterns and methodologies explained in this book applied in practice. Of course, opening the Visual Studio solution and trying to understand everything on your own is not as easy as following this awesome book.

I am really satisfied and happy! Thanks Andrea and Dino for your effort!


NHibernate, collection with composite-id.

In the previous post we saw how to map an entity with a composite-id.

But now, if you have mapped an entity in this way and you need to create a collection of this entity, or maybe you have a related child class that uses this primary key … you are in trouble!

The way to solve the problem is very easy. First of all, here is a short example to understand what I'm talking about:
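A hedged sketch of what I mean, using hypothetical Author/Book classes (Author is identified by the composite key Name + LastName, and Book is the related child class):

// parent entity identified by the composite key (Name, LastName)
public class Author
{
    public virtual string Name { get; set; }
    public virtual string LastName { get; set; }
    public virtual IList<Book> Books { get; set; }
    // Equals() and GetHashCode() must be overridden, as shown in the previous post
}

// child entity that references the composite-id parent
public class Book
{
    public virtual Guid Id { get; set; }
    public virtual string Title { get; set; }
    public virtual Author Author { get; set; }
}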


Now the first class will be mapped in this way, assuming that “name and lastname are the primary keys” (also remember to override Equals() and GetHashCode()!!):
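A sketch of the corresponding mapping, under the same hypothetical Author/Book assumptions:

<class name="Author" table="Author">
  <composite-id>
    <key-property name="Name" />
    <key-property name="LastName" />
  </composite-id>
  <bag name="Books" inverse="true" cascade="all">
    <!-- the foreign key on the child table is made of the same two columns -->
    <key>
      <column name="AuthorName" />
      <column name="AuthorLastName" />
    </key>
    <one-to-many class="Book" />
  </bag>
</class>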


Now we can go back to the child mapping file and reference the composite key with a many-to-one mapping in this simple way:
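Again as a hedged sketch, the child side references the two key columns from its many-to-one:

<class name="Book" table="Book">
  <id name="Id">
    <generator class="guid.comb" />
  </id>
  <property name="Title" />
  <!-- many-to-one pointing back to the composite key of Author -->
  <many-to-one name="Author" class="Author">
    <column name="AuthorName" />
    <column name="AuthorLastName" />
  </many-to-one>
</class>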


So now my repository will be able to give me:

image

And only at that moment will my repository go back and execute the sub-select.

NHibernate and the composite-id.

I’m working with the new version (2.0 GA) of NHibernate. The problem I encountered today is about the composite-id.

Let’s say we have an entity that doesn’t have its own single id field. We don’t want to use a GUID or any other auto id generator class; we must implement the primary key logic on the corresponding table in the database. Now, imagine having this table:

Key   Column Name   SQL Type   Not null
yes   Field01       varchar    true
yes   Field02       varchar    true
yes   Field03       varchar    true
yes   Field04       varchar    true

At this point we will have an NHibernate mapping XML like this:
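A minimal sketch of such a mapping, assuming a hypothetical MyEntity class that exposes the four columns shown above as string properties:

<class name="MyEntity" table="MyEntity">
  <composite-id>
    <key-property name="Field01" />
    <key-property name="Field02" />
    <key-property name="Field03" />
    <key-property name="Field04" />
  </composite-id>
  <!-- remaining properties of the entity go here -->
</class>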


When we build the session factory for our DAL, we will receive a fancy error:

“composite-id class must override Equals()”

“composite-id class must override GetHashCode()”

The explanation is very simple. We are telling NHibernate that this class uses these fields as its composite identity, so NHibernate must know how to compare two instances … and of course the CLR doesn’t know how to do that by default.

This is the solution in our entity:
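A hedged sketch of the two overrides, again assuming the hypothetical MyEntity with the four string key properties used above:

public override bool Equals(object obj)
{
    var other = obj as MyEntity;
    if (other == null)
        return false;
    // two instances are the same entity if all the key columns match
    return Field01 == other.Field01
        && Field02 == other.Field02
        && Field03 == other.Field03
        && Field04 == other.Field04;
}

public override int GetHashCode()
{
    unchecked
    {
        // combine the hash codes of the key columns
        var hash = 17;
        hash = hash * 23 + (Field01 ?? string.Empty).GetHashCode();
        hash = hash * 23 + (Field02 ?? string.Empty).GetHashCode();
        hash = hash * 23 + (Field03 ?? string.Empty).GetHashCode();
        hash = hash * 23 + (Field04 ?? string.Empty).GetHashCode();
        return hash;
    }
}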


There we go!! Now you have implemented a full version of the method “Hey is this the entity I’m going to save or is this entity brand new?!?!”

Personal Consideration.

In my opinion NHibernate should be able to provide these overrides by itself via reflection, instead of asking you to modify thousands of entities when you want to use the standard Active Record pattern.