Shifty Responsibility of Repository Pattern

In many discussions about the Repository Pattern I have noticed that people split into two camps. For the purposes of this article I call them abstractionists and concretists. They differ on whether a repository should be part of a system’s design at all. The first camp considers it worth having because it lets us abstract away the details of physical data storage. The second camp holds that we cannot abstract away all of those details, so implementing a repository is just a waste of time. The polemic between the two camps tends to turn into a holy war.

What’s wrong with the Repository Pattern? Nothing is wrong with the pattern itself; the problem is the way developers understand it. I’ve tried to investigate this and noticed two main points that are probably the causes of the different attitudes toward the pattern. One of them is the shifty responsibility of the repository, and the other is the undervaluing of unit testing. Here I explain the first one.

Shifty Responsibility of Repository

When we discuss the architecture of a future system, it’s common to think in terms of three layers: Presentation Layer, Business Layer and Data Layer (see MSDN). In such systems, objects from the business logic layer (BLL) use repositories to retrieve data from a data store. Repositories return business objects instead of raw data records. The usual motivation is the ability to replace the underlying data storage. In that case a developer creates an abstract repository and implements a specific one for, say, an XML data store. Everybody knows this idea, which is illustrated below.

[Figure: repository-idea-old]
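
The classic idea can also be sketched in code. This is a minimal illustration in Python, not taken from the MSDN or Fowler material; the names `Customer`, `CustomerRepository`, `XmlCustomerRepository` and `greeting` are hypothetical, and the XML layout is an assumption made only for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Customer:
    """Business object returned to the business layer (hypothetical)."""
    id: int
    name: str


class CustomerRepository(ABC):
    """Abstract repository: the only type the business layer depends on."""

    @abstractmethod
    def get_by_id(self, customer_id: int) -> Optional[Customer]:
        ...


class XmlCustomerRepository(CustomerRepository):
    """Concrete repository for an XML data store (assumed layout:
    <customers><customer id="1" name="Ada"/></customers>)."""

    def __init__(self, xml_text: str) -> None:
        self._root = ET.fromstring(xml_text)

    def get_by_id(self, customer_id: int) -> Optional[Customer]:
        # Map the raw XML record to a business object.
        for node in self._root.findall("customer"):
            if int(node.get("id")) == customer_id:
                return Customer(id=customer_id, name=node.get("name"))
        return None


def greeting(repo: CustomerRepository, customer_id: int) -> str:
    """Business-layer code: it sees only the abstract repository."""
    customer = repo.get_by_id(customer_id)
    return f"Hello, {customer.name}!" if customer else "Unknown customer"
```

Replacing the XML store would mean writing another subclass of `CustomerRepository`, while `greeting` stays untouched. Note that in this sketch the repository itself reads the data store, which is exactly the responsibility that moves elsewhere once the model grows.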

You can see the same segregation in the articles by Martin Fowler and on MSDN. As usual, it’s only a simplified model. Although it looks correct for a small project, it misleads when you try to carry the pattern over to a more complicated one. Existing ORMs make things worse because they implement a lot of things out of the box. But assume a developer uses Entity Framework (or another ORM) only to retrieve data. Where, for example, should he put an L2 cache manager or a logger of all business operations in this scheme? Obviously, he’ll try to extract those routines according to the SRP (Single Responsibility Principle) from SOLID, and he might end up with something like this:

[Figure: repository-idea-old-ext]

Does the repository have the same purpose in this new model as in the previous one? The obvious answer is “NO”, since it no longer retrieves data from the data store. That responsibility has shifted to another object. This is one of the causes of the heated polemic between the camps.

Instead of Analysis

If we assume that a term must keep its meaning regardless of code refactoring, then it would be correct not to call the object a repository in the first step. Its only responsibility there is to hide the data storage interface used to retrieve data (LINQ, ADO.NET, etc.), so it actually implements the Data Access Object pattern. The repository, on the other hand, may not know about such details at all and instead plays the role of the DAL composition node. Such a node would orchestrate the work of all DAL subsystems.
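
A rough sketch of that split, under stated assumptions: all class names here (`CustomerDao`, `CustomerCache`, `OperationLog`, `CustomerRepository`) are hypothetical, a plain dictionary stands in for the real data store, and the `reads` counter exists only to make the behaviour observable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Customer:
    id: int
    name: str


class CustomerDao:
    """Data Access Object: the only class that touches the data store.
    A dict stands in for the real store (assumption for this sketch)."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = rows
        self.reads = 0  # counts store hits, for demonstration only

    def load(self, customer_id: int) -> Optional[Customer]:
        self.reads += 1
        name = self._rows.get(customer_id)
        return Customer(customer_id, name) if name is not None else None


class CustomerCache:
    """Stand-in for an L2 cache manager."""

    def __init__(self) -> None:
        self._items: Dict[int, Customer] = {}

    def get(self, customer_id: int) -> Optional[Customer]:
        return self._items.get(customer_id)

    def put(self, customer: Customer) -> None:
        self._items[customer.id] = customer


class OperationLog:
    """Stand-in for a logger of business operations."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    def record(self, message: str) -> None:
        self.entries.append(message)


class CustomerRepository:
    """Composition node: knows nothing about the store, only orchestrates."""

    def __init__(self, dao: CustomerDao, cache: CustomerCache,
                 log: OperationLog) -> None:
        self._dao, self._cache, self._log = dao, cache, log

    def get_by_id(self, customer_id: int) -> Optional[Customer]:
        self._log.record(f"get_by_id({customer_id})")
        cached = self._cache.get(customer_id)
        if cached is not None:
            return cached
        customer = self._dao.load(customer_id)  # only the DAO hits the store
        if customer is not None:
            self._cache.put(customer)
        return customer
```

In this arrangement the repository contains no storage-specific code at all: swapping LINQ for ADO.NET, to stay with the examples above, would only affect the DAO, while caching and logging policy stay in the composition node.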




1 Comment

Personally, I use the Repository pattern at least to explicitly define and narrow the contract between application code and “physical data storage” (read: the database). Let me explain a bit. I want to understand how my application code uses the database: which queries run, which scenarios must be backed by database indexes to provide decent performance, etc. I use repositories for that. Directly using an ORM’s “context” or “session” won’t give me those answers because these contracts are too generic. The worst case is using SQL statements directly in the code, but let’s pretend we never saw such code :)

So, as a team leader, I find it useful and convenient to keep an eye on repository contracts to understand who uses the database and how, and to prevent improper usage. Any implementation may exist behind the repository, depending on the case: an ORM, Dapper, hand-crafted magic code with hints for SQL Server, etc.
