Of all the ways software developers can be categorized and placed into one camp or another, the dichotomy between developers being comfortable with database technology and those that are not is one of the most significant influences in the evolution of a software application. By database technology, I don’t necessarily mean the installation and configuration of databases, but more so the use of databases as a means to an end – one of the many tools in the tool box that the application under development needs to interact with. The choice in the data access solution is paramount since it will certainly affect (and be affected by) many areas of the development process: the integration with the development frameworks currently in use, the amount of convention and/or configuration required, the type of database, and even down to the actual software language itself.
There are many types of database access solutions that can align with the strengths and weaknesses of a development team while satisfying any particular data storage requirements the application needs. Technologies can vary in many ways, but they all share some collective responsibility in abstracting the true database interactions from the application. Each solution varies in the degree of abstraction and this is important since the higher the level of abstraction, the less control you will have in the actual implementation. This can lead to too much reliance on the abstraction, instead of gaining a true understanding of the nature of the actual database interactions themselves. These interactions can include the actual SQL being generated and executed, transaction management, memory management, caching, threading, and even down to the actual design and creation of the database itself.
When one delves into the world of database access technologies, it can be a little overwhelming with all the unique terms and concepts. You will quickly learn that there are many different flavors of technologies: Object-Relation Mapping (ORM), SQL Mapping, Object-Oriented Database Management System (OODBMS), Object-Document Mappers (ODM) just to name a few. This blog will try and introduce you to the world of ORMs as well as dive in a little into the leading (hence, popular) packages currently being used. In fact, we will break out ORMs into two hugely overlapping groups: pure ORM and SQL Mappers. SQL Mappers tend to be categorized as an ORM, but from a purist stance, they really are not.
So where do you start?
Before you start plowing through ORM technologies, it is recommended that the developers should first understand the application data access requirements and assess the overall team maturity level with databases in general. If you have a strong team of database technologists and SQL query enthusiasts, then I would recommend looking into the packages that have a lower degree of database abstraction; if you find yourself on the other end of the spectrum, then there are plenty of tools that can automate most of the database schema design, SQL query generation, transaction management, and many other tasks a team must perform during this process. Why? First of all, if you know databases, then if the package abstracts you from the data access layer to the point where you have less control over schema design, query execution, transactions, memory/data caching, and performance analysis, then you will end up frustrated and will eventually look for another technology to use. This can get very costly. Obviously, the converse holds true as well: if you don’t know databases, then you certainly do not need to be designing schemas, building queries, configuring transactions, caching, and threading with your fingers crossed hoping it turns out alright in the end. (This actually happens more than anyone would like to admit.)
Object-Relation Mapping (ORM)
The ORM technology can arguably be the most popular among the solutions since these tools provide a higher level of abstraction from the database transactions and hence are more easily absorbed by the broadest number of people. What is an ORM exactly? Essentially, the “M” in ORM stands for mapping. Mapping in this sense is the relation (“R”) between the concept of an object (“O”) to specific database actions. This can mean many things, but for the most part, the object type dictates the database table design and interaction – from creating the table schema, creating search indexes, configuration of general database table integrity, to saving, updating, and possibly deleting information. For the most part, an instantiated object can be thought to correspond to a record in one or more tables in your schema.
Most of the packages out there offer a lot of functionality. The setup and configuration of these packages vary: most support a configuration file-based setup (usually XML or JSON) and some languages support an annotation-based configuration. Other packages can actually be baked into the software framework itself (ActiveRecord in Rails and GORM in Grails) which is more of a convention over configuration type of setup. Some packages have the ability to actually generate the data object code for you based on the schema definition. There are advantages and disadvantages with ORMs, but at the end of the day, they tend to be “weighty” and an ecosystem in and of themselves.
The popular vendors out there are Hibernate, Apache Cayenne, GORM (ok, it’s not a vendor per se, but it’s inextricably tied to Grails which is Hibernate under the covers), ActiveRecord (inextricably tied to Rails and PHP), just to name a few. They all have varying levels of language support for the more popular languages out there – Ruby, Java, Groovy, Python, iOS, C#, Scala, Flex, Perl (slight cough), PHP, C#, and so on – and it will pay off big time if they provide native support for your language of choice. Almost all of these packages have really good IDE support too. A good place to start looking at all the major players would be to look at Wikipedia.
SQL Mapper
When you look at the packages that support SQL Mapping, you are definitely sliding backwards on abstraction scale where you are more in the driver seat of the actual schema design and SQL query (or stored procedure) development. Where ORMs tie object types to tables, SQL Mappers tie object methods to SQL queries or stored procedures through the use of mapping files or annotations. Using this type of tool, allows you to create and tweak your own data manipulation routines. Essentially you will create data access objects that drive the querying and CRUD interactions with the database, but you will often create objects that define the specific return types that you will need – dictated by the column selections within the queries themselves. Mappers tend to be lightweight and very fast (if your team is good at writing queries).
The no-brainer leader in this group is MyBATIS. MyBATIS provides support for all sorts of databases and supports a ton of software languages. MyBATIS – along with the leading vendors in the ORM world – are integrated into the dominating web frameworks for easier integration. The creation and setup of the SQL Maps is fairly straightforward and you can get going pretty quickly.
Going Native
Well, you could be a total baller and throw all this database access solutions talk out the window and roll your own data access layer. I’ve done that in the past (with JDBC) and I have to say that it’s quite an investment up front in coding and configuration (because you manage/code literally all aspects of your database interactions), but you get a lot of gains in performance.
Scalable? Um, no. Community supported? Not really. Popular? Haha! No.
All this being said, in some cases where you have a small app with a small data footprint, I can see someone making a good point to go this route.
Additional Thoughts
So, choosing the right data access solution is vital to a well-oiled project down the road. Look deep and realize what your team’s strengths and weaknesses are and choose a package that aligns with those traits and abilities. As you delve into one package or many packages, you will find that they all have vibrant developer communities, lots of developer blogs, and some even have their own conferences. So there are literally tons of resources out there for you to start your journey.
DATA IS EVERYTHING.
Choose wisely…..