An aspiring software craftsman journey, By Mahmoud Ben Hassine
How I Reduced My Java App Code By 80% Using Easy Batch
21 January 2014
Easy Batch, Java
In this post, I will try to show you how Easy Batch can tremendously simplify batch application development
by taking care of the boilerplate code you would otherwise write yourself.
This will make your application more readable, understandable and maintainable.
The use case is a typical production application that loads data from a CSV file into a relational database table.
Here is the input file containing product data:
Let’s assume we have a JPA EntityManager used to persist Product objects to the database.
We would like to map each record of this file to an instance of the following Product POJO:
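As a rough sketch, a POJO of this kind could look like the class below. It assumes only the four fields implied by the validation rules in this post (id, name, price, last update date). The field names and types, including the use of `BigDecimal` and `LocalDate`, are assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.time.LocalDate;

// Illustrative Product POJO; field names and types are assumptions.
class Product {

    private Long id;              // nullable so that "id is specified" can be checked
    private String name;
    private BigDecimal price;
    private LocalDate lastUpdate;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public BigDecimal getPrice() { return price; }
    public void setPrice(BigDecimal price) { this.price = price; }

    public LocalDate getLastUpdate() { return lastUpdate; }
    public void setLastUpdate(LocalDate lastUpdate) { this.lastUpdate = lastUpdate; }

    @Override
    public String toString() {
        return "Product{id=" + id + ", name='" + name + "', price=" + price
                + ", lastUpdate=" + lastUpdate + "}";
    }
}
```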
Before persisting products to the database, data must be validated to ensure that:
product id and name are specified
product price is not negative
product last update date is in the past
Finally, records starting with # should be ignored: mainly the header record, and possibly comments and a trailer record.
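The filtering and validation rules above can be sketched in plain Java as follows. This is only an illustration of the requirements: the class name `ProductRules`, the method signatures and the error messages are hypothetical, and a missing price or date is treated as valid because the rules above do not require them.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hand-written versions of the filtering and validation rules (hypothetical helper).
final class ProductRules {

    private ProductRules() { }

    // Records starting with '#' (header, comments, trailer) are skipped.
    static boolean isIgnored(String record) {
        return record.startsWith("#");
    }

    // Returns one message per violated rule; an empty list means the data is valid.
    // 'today' is passed in so that the date check is deterministic and easy to test.
    static List<String> validate(Long id, String name, BigDecimal price,
                                 LocalDate lastUpdate, LocalDate today) {
        List<String> errors = new ArrayList<>();
        if (id == null) {
            errors.add("id is missing");
        }
        if (name == null || name.trim().isEmpty()) {
            errors.add("name is missing");
        }
        if (price != null && price.signum() < 0) {
            errors.add("price is negative");
        }
        if (lastUpdate != null && !lastUpdate.isBefore(today)) {
            errors.add("last update date is not in the past");
        }
        return errors;
    }
}
```

Even for four fields, each rule costs a few lines of code, and that cost grows with every field added to the POJO.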
To keep the example simple, I will write product data to the standard output and not to a database.
So let’s get started!
The following listing is a possible (horrible) solution that I have seen hundreds of times in production systems:
This solution actually works perfectly and implements the requirements above.
But it’s an obvious maintenance nightmare!
It could be worse if the Product POJO contained dozens of fields, which is often the case in production systems.
In this solution, there is only one line which represents the batch business logic. Do you see it? Here it is:
In production, this line would be persisting the object to the database.
All the rest is boilerplate: handling IO, reading, filtering, parsing and validating data, type conversion,
mapping records to Product instances, logging and reporting statistics at the end of execution.
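To make the amount of boilerplate concrete, here is a heavily compressed sketch of such a hand-written loop. It is not the 84-line listing discussed in this post. It assumes a hypothetical column layout `id,name,price,lastUpdate` with ISO dates, leaves out validation and logging, and uses made-up names (`HandWrittenLoader`, `Report`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Compressed sketch of a hand-written batch loop (assumed layout: id,name,price,lastUpdate).
final class HandWrittenLoader {

    // Counters reported at the end of the run.
    static final class Report {
        int total;
        int filtered;
        int rejected;
        int processed;
    }

    static Report load(Reader source) {
        Report report = new Report();
        try (BufferedReader reader = new BufferedReader(source)) {   // handling IO
            String line;
            while ((line = reader.readLine()) != null) {             // reading
                report.total++;
                if (line.startsWith("#")) {                          // filtering
                    report.filtered++;
                    continue;
                }
                String[] fields = line.split(",", -1);               // parsing
                if (fields.length != 4) {
                    report.rejected++;
                    continue;
                }
                try {                                                // type conversion
                    long id = Long.parseLong(fields[0].trim());
                    String name = fields[1].trim();
                    BigDecimal price = new BigDecimal(fields[2].trim());
                    LocalDate lastUpdate = LocalDate.parse(fields[3].trim());
                    // The only line of business logic in the whole method:
                    System.out.println(id + " " + name + " " + price + " " + lastUpdate);
                    report.processed++;
                } catch (NumberFormatException | DateTimeParseException e) {
                    report.rejected++;                               // malformed number or date
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return report;
    }
}
```

Everything except the `System.out.println` call is plumbing, and this version does not even validate the data yet.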
The idea behind Easy Batch is to handle all of this error-prone boilerplate code for you.
With Easy Batch, you focus only on your batch business logic. So let’s see what the solution looks like with Easy Batch.
First, I will create a RecordProcessor to implement the business logic:
Then, I will declare (rather than implement, as in the solution above) data validation constraints on the Product POJO
with the elegant Bean Validation API as follows:
Finally, I need to configure a job to:
Read data from the flat file products.csv
Filter records starting with #
Map each CSV record to an instance of the Product POJO
Validate product data
Process each record using the ProductProcessor implementation
This can be done with the following snippet:
That’s all. Apart from implementing the core business logic, all I have done is provide configuration metadata that Easy Batch cannot guess.
The framework will take care of all the boilerplate code of reading, filtering, parsing, validating and mapping data to domain objects.
Time to do some math by counting total lines of code.
Both solutions use the Product POJO, so I’ll ignore it. Imports are irrelevant too, so I’ll ignore them as well.
The first solution WithoutEasyBatch has 84 lines of code (empty lines have been ignored). Note that I have inlined all variables and tried to make it as compact as possible.
The second solution WithEasyBatch has:
6 lines for the ProductProcessor class
4 lines for Bean Validation API annotations added on the Product POJO
11 lines for the main class
In sum, 84 LOC vs 21 LOC (6 + 4 + 11), which is 75% less than the first solution: (84 - 21) / 84 = 0.75.
Oh wait, this is not 80% as claimed in the title of the post!
OK, you got me... But if I count monitoring, transaction processing and batching, which I get for free from Easy Batch,
I could actually put 90% or even 95% in the post title!
I hope you got the point and agree with me: the second solution is easier to read, understand, test and maintain.
In the end, this is what Easy Batch is all about: making your life easier when you have to deal with batch applications in Java.
The main motivation behind the framework is to let you stay focused on your business logic by taking care of the boilerplate code for you.