random thoughts on software engineering

An aspiring software craftsman journey

The Story Behind Java Query Language

This post is about the idea behind the JQL project.

I was working on a legacy code base and stumbled upon this method:

Fortunately, I had a 27-inch screen, so I could see it in full without scrolling right :smile: This is a test method, so it’s probably fine, but I’m not sure. Anyway, the first questions that came to my mind at that time were:

  • Are all test method names as long as that?
  • What is the longest method name in this code base?
  • What if I could ask questions like select length from method where name like 'test%' order by length?

Oh, that would be great! (at least for SQL lovers :wink:) This was the starting point of JQL.
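
To make that query concrete, here is a rough illustration in plain Java of what it would compute. This is not code from JQL: the class name MethodNameQuery is made up for the example, the list of method names is hypothetical, and "length" is assumed to mean the length of the method name.

```java
import java.util.List;
import java.util.stream.Collectors;

class MethodNameQuery {

    /**
     * Rough equivalent of:
     *   select length from method where name like 'test%' order by length
     * Assumption: "length" is the number of characters in the method name.
     */
    static List<Integer> testMethodNameLengths(List<String> methodNames) {
        return methodNames.stream()
                // where name like 'test%' (case-sensitive here; SQL LIKE may differ by database)
                .filter(name -> name.startsWith("test"))
                // select length
                .map(String::length)
                // order by length
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical method names, invented for illustration only.
        List<String> names = List.of(
                "testSave", "helper", "testFindByIdReturnsNothingWhenMissing", "testX");
        System.out.println(testMethodNameLengths(names));
    }
}
```

The Java version works, but it has to be written and compiled for every new question, which is exactly what an ad hoc SQL query avoids.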

So I started googling for tools that create a relational model from a Java code base. After some investigation, I found a few references. Some tools are commercial, others use weird languages to query code. So I decided to write a PoC with Java and SQL. Having already used java parser, it took me only a few minutes to write the Java equivalent of the following pseudocode:

for each java file
   model = parse file with java parser
   for each unit in model // unit can be a class, method, field etc
      insert unit in sql table

SQLite was my first choice for the PoC since it’s really easy to set up and work with. I started with two simple tables, CLASS and METHOD, each with a single column: name. I was able to generate a single-file SQLite database and start writing queries. Cool!
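
To give a feel for the loop above, here is a minimal sketch that runs on the JDK alone. It makes several assumptions that do not come from the PoC: it parses with the JDK's own compiler Tree API rather than java parser, it keeps the CLASS and METHOD rows in two in-memory lists rather than in SQLite (the JDK ships no SQLite driver), and the names CodeIndexer, Tables and StringSource are invented for the example.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class CodeIndexer {

    /** In-memory stand-ins for the CLASS and METHOD tables: one name per row. */
    static final class Tables {
        public final List<String> classNames = new ArrayList<>();
        public final List<String> methodNames = new ArrayList<>();
    }

    /** Wraps a source string so the compiler can read it like a .java file. */
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses one source file (no type checking) and records one row per class and method. */
    static Tables index(String fileName, String sourceCode) {
        // Requires a JDK: getSystemJavaCompiler() returns null when no compiler is available.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(fileName, sourceCode)));
        Tables tables = new Tables();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitClass(ClassTree node, Void p) {
                        // Classes, interfaces and enums all arrive as ClassTree nodes.
                        tables.classNames.add(node.getSimpleName().toString());
                        return super.visitClass(node, p);
                    }

                    @Override
                    public Void visitMethod(MethodTree node, Void p) {
                        // Constructors are reported as "<init>"; skip them here.
                        if (!node.getName().contentEquals("<init>")) {
                            tables.methodNames.add(node.getName().toString());
                        }
                        return super.visitMethod(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tables;
    }

    public static void main(String[] args) {
        Tables t = index("SampleTest.java",
                "class SampleTest { void testSomething() {} void helper() {} }");
        System.out.println("CLASS:  " + t.classNames);
        System.out.println("METHOD: " + t.methodNames);
    }
}
```

In a version closer to the PoC, the two list insertions would become INSERT statements against the CLASS and METHOD tables, and the input would come from walking the source directories instead of a hard-coded string.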

I shared the PoC with some colleagues of mine and asked them for feedback. I was expecting someone to say “who cares?” You know, those people who always say this. But I agree with Mitchell Hashimoto:

My expectations were wrong! According to my colleagues, it was a good idea, especially once I used the PoC to uncover some gems in the code base: methods with 15+ parameters, interfaces with 100+ methods, etc.

So I decided to share my thoughts with Lukas Eder, the SQL champion! Lukas told me the idea was excellent. This is when I got more confident :smile: Lukas also pointed me to JArchitect. I was not aware of it, what a fantastic product! JArchitect uses CQLinq to query code, a language very similar to SQL. To be honest, if JArchitect had been free or open source, I would have forgotten about my idea of JQL right away. But that’s not the case: it’s neither free (and it’s expensive, IMO) nor open source. So I believed that contributing a poor man’s open source alternative to the community would be welcome.

Lukas posted two links on Reddit and Hacker News, which drew some attention from interested people. Some users pointed out that a graph model with jQAssistant would be more appropriate than a relational model. I do agree, that’s a good point. But IMO it really depends on how you would like to model your code base and, more importantly, how you want to query it.

Other people tweeted about it, which made me even more confident:

At the time of writing this post, JQL is at version 0.1. It works fine but is still a PoC. I’m evaluating whether the relational model is a good fit for querying code. If you are interested, you are welcome to give it a try and tell me what you think. I would love to hear your thoughts!