The state of @DataSourceDefinition in Java EE
Traditionally, Java EE favored deploying applications with so-called unresolved dependencies. I talked about this in some more detail in a previous blog entry, but practically it boils down to the fact that a Java EE application can rarely be run as-is on an Application Server (AS hereafter).
Instead, specifically for the application that you want to run, all kinds of things have to be created and configured on the AS. Of course, this is different for every other implementation of such an AS, like JBoss, GlassFish, WebLogic, etc. So Java EE applications that are in theory portable, aren’t so in practice because of this ‘little’ thorny issue.
Worse, even if the application is to be run only on, say, JBoss, you traditionally can’t just take the .war or .ear and deploy it. It has to be accompanied by some readme that explains which data sources, JMS queues, roles, etc. need to be created. In JBoss AS 7 for example, all this has to be added to a single .xml file inside the AS.
So even if you only want to quickly test a single application, you’ll have to make many modifications to your installed AS. Imagine that every time you wanted to run a Windows or OS X application, you first had to thoroughly study some readme, make a ton of changes to your internal OS configuration files and only then, after some trial and error, would be able to run the app. Without a shadow of a doubt, this would not fly with many people.
So do we really need to accept this for Java EE applications?
The short answer is “still a little, but it’s getting better”.
As an important step towards ready-to-run applications, Java EE 6 has standardized the way a data source is defined. This can be done via either an annotation (@DataSourceDefinition) or an element (data-source) in web.xml, ejb-jar.xml or application.xml. Although the Java EE specification is rather clear about what has to be supported, its weak point is in enforcing that it’s actually supported. This mainly happens via the often criticized TCK (Technology Compatibility Kit), which seems to spot check for features and behavior instead of exhaustively making sure each and every thing specified is present.
On top of that, some vendors (notably JBoss) didn’t seem particularly thrilled about this approach. As a result, adoption of standardized internally defined data sources was initially slow.
However, the situation has greatly improved lately. After initially flat out refusing to support embedded data sources, JBoss has given in to support it anyway (via their proprietary -ds.xml mechanism). Up until the latest JBoss AS release (7.1.1) the standardized data source definition didn’t work really well, but in the recently released JBoss EAP 6.0.0 (a branched JBoss AS 7.1.2) it finally seems to work. JBoss EAP 6.0.1 (a branched JBoss AS 7.1.3) was also tested and luckily it also still works there.
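To make this concrete, here is a minimal sketch of what such a standardized definition can look like with the annotation. The JNDI name, the credentials and the in-memory H2 URL are my own placeholder assumptions, not values taken from the test application, and I put the annotation on a stateless bean simply because it needs to sit on some container-managed class. Compiling it requires the Java EE 6 API on the classpath.

```java
import javax.annotation.sql.DataSourceDefinition;
import javax.ejb.Stateless;

// All attribute values below are illustrative assumptions.
@DataSourceDefinition(
    name = "java:app/MyApp/myDS",               // hypothetical JNDI name the data source is bound to
    className = "org.h2.jdbcx.JdbcDataSource",  // H2's DataSource implementation, shipped inside the .war
    url = "jdbc:h2:mem:test",                   // assumed in-memory database URL
    user = "sa",                                // assumed credentials
    password = "sa"
)
@Stateless
public class DataSourceConfig {
    // Intentionally empty: the class only carries the data source definition.
}
```

With something like this in place, persistence.xml would reference the same JNDI name as its data source, and the driver jar would go into WEB-INF/lib of the .war.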
To see how a variety of servers were doing, I tried to run the application I discussed in the previous blog entry mentioned above on them, without making any modifications. The following is the result:
| Application server | Data source works | Loads driver from .war | Remarks |
|---|---|---|---|
| JBoss EAP 6.0 (AS 7.1.2) | V | V | After AS shut-down, Hibernate tries to drop schema, but DB already closed at that point. |
| WebLogic 12.1.1 | V | V | Seems to start embedded DB twice, leading to locking errors in case of H2 on disk. Workaround by following strict shut-down/restart pattern. |
| TomEE 1.1 nightly | V | V | After AS shut-down, DB tries to close itself, but classes already unloaded at that point. Note that TomEE 1.1 hasn’t been released yet. |
| Geronimo v3.0 | X | V | Doesn’t deploy when persistence.xml references datasource. Seems to be able to load driver from war, but lots of exceptions and failures everywhere. |
As can be seen, two AS implementations struggle with an embedded DB that closes itself. Both JBoss EAP 6 and TomEE 1.1 threw exceptions at shut-down, though quite different ones. WebLogic had the reverse problem: instead of issues at shut-down time the issues were at start-up time, but only when restarting the server “the wrong way” (described here).
GlassFish was the only AS unable to load the DB driver from within the .war. This is a shame, as this is a very important feature for embedded databases (the driver jar -is- the DB in that case), and it means the user still has to change something in the installed AS.
Geronimo was the only AS that didn’t work at all. As soon as the datasource was referenced in persistence.xml, deployment of the application failed. Apparently the JPA processor can’t find the datasource when it’s setting up the persistence units. Removing JPA and attempting to inject the datasource directly semi-worked. Deployment succeeded and injection happened, but Geronimo appeared to be unable to set the vitally important url property. An issue has been created for this at their JIRA. On the plus side, it did seem Geronimo was able to load the driver from the war, but without anything else working this isn’t of much use in practice.
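For those who want to check by hand whether a server actually bound the data source and passed on its url, a small probe along the following lines can help. Note that this is a sketch and not the code from my test application: it uses a programmatic JNDI lookup instead of @Resource injection so that it compiles with nothing but the JDK, and both the class name DataSourceProbe and the JNDI name are hypothetical.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

class DataSourceProbe {

    // Hypothetical JNDI name; it has to match the name used in the data source definition.
    static final String JNDI_NAME = "java:app/MyApp/myDS";

    /** Looks up the data source and reports which database and URL it points to. */
    static String probe(String jndiName) {
        try {
            DataSource dataSource = (DataSource) new InitialContext().lookup(jndiName);
            try (Connection connection = dataSource.getConnection()) {
                DatabaseMetaData metaData = connection.getMetaData();
                return "OK: " + metaData.getDatabaseProductName() + " at " + metaData.getURL();
            }
        } catch (NamingException | SQLException e) {
            // Reached when the name is not bound (e.g. outside a container)
            // or when no connection can be obtained from the data source.
            return "FAILED: " + e;
        }
    }
}
```

Called from a servlet or bean inside the deployed application, DataSourceProbe.probe(DataSourceProbe.JNDI_NAME) would report the URL the server really used; outside a container there is no JNDI provider, so it just reports a failure.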
Finally, the perhaps somewhat lesser-known Resin 4.0.28 was surprisingly the only AS that didn’t exhibit a single problem with the embedded data source on my test application. It was, however, rather noisy when running this test application, throwing various kinds of “warning exceptions” about backing beans and even some internal JSF artifact not being serializable.
So overall the results are rather good. There are some minor issues, but no real show-stoppers as before. Unfortunately, JBoss still tries to scare its users away from these embedded data sources by explicitly calling them “for development & testing only”, but they do support them now and seem to support them rather well.
Java EE 7 will introduce various additional standardized embedded resources, like JMS queues and possibly even an embedded platform default database, which further widens the options for applications that need to be ready-to-run.
I would like to stress that Java EE applications with unresolved dependencies are absolutely useful for those situations where they need to integrate into existing infrastructure managed by operations teams. This is especially true if operations and developers don’t 100% trust or even know each other. Places like these are where Java EE was traditionally used a lot, but with the current crop of lightweight* application servers, Java EE has the opportunity to branch out to many more places, including those that don’t need or want a strict separation between application and resources.
(* To give an indication: the example app is small, but does use JSF, EJB, JPA and Bean Validation, which means the container needs to start up a lot of services. Yet a cold startup of every Java EE server tested, with the app already deployed to it, took somewhere between one and a few seconds at most.)