

jsonschematypes: a Java code generator from JSON Schemas

Posted by jimblackler on Nov 7, 2020

I like to develop apps that run on different platforms and communicate across them. These apps often need to transmit data across the network (e.g. in the case of a web application with both server and JavaScript code), store data in files, or read configuration data. I’m a fan of using JSON for these tasks. It’s a popular meta-format based on JavaScript types and syntax, but its use extends well beyond JavaScript applications. There are many interchange formats, but JSON is very well supported; there are JSON libraries for every language you can think of. It’s stored as text, which means it is easily sent in web requests, and it’s easy for humans to read and edit (much friendlier than XML, or a binary format, in my opinion).

A JSON file to store or transmit a player’s details in a game might look like this:

{
    "name": "Jim",
    "score": 500,
    "alive": true
}

One thing that makes JSON easy to use is that you don’t need a formal definition of valid data. One system can write a JSON file and another can read it, without any other files needing to be shared. How the JSON should be interpreted is a matter of convention, determined by how the programs are written.

That’s great for making a quick start when you’re creating software. But it can cause a problem as the program grows in complexity, or more engineers get involved. Without a formal written structure (known as a schema) it’s easy to become confused about exactly how to read or write the data.

JSON Schema

Fortunately there is a standard schema format for JSON, appropriately called JSON Schema.

A JSON Schema is itself a JSON document. It specifies a set of tests that another JSON file must pass to be considered valid for a certain application. An application developer supplies a schema for their program’s data, and if an individual file passes the schema, it’s valid for use.

This is one schema to validate the above JSON:

{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "properties": {
    "name": {
      "type": "string"
    },
    "score": {
      "type": "integer",
      "minimum": 0
    },
    "alive": {
      "type": "boolean"
    }
  },
  "additionalProperties": false,
  "required": ["name", "score"]
}
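To make the rules in that schema concrete, here is a hand-written sketch of the same checks in Java. It is not a JSON Schema validator, and it is not part of any library: the class name `PlayerRules` is hypothetical, and a plain `Map` stands in for a parsed JSON object so that the sketch runs with only the JDK. The integer check is also simplified to Java's `Integer`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hand-written illustration of the rules in the schema above; it is not a
// JSON Schema validator. A Map stands in for a parsed JSON object.
class PlayerRules {
  private static final Set<String> KNOWN = Set.of("name", "score", "alive");

  static boolean isValid(Map<String, Object> player) {
    // "additionalProperties": false -> no keys beyond the declared ones.
    if (!KNOWN.containsAll(player.keySet())) {
      return false;
    }
    // "required": ["name", "score"]
    if (!player.containsKey("name") || !player.containsKey("score")) {
      return false;
    }
    // "name" must be a string.
    if (!(player.get("name") instanceof String)) {
      return false;
    }
    // "score" must be an integer with "minimum": 0 (simplified to Integer).
    Object score = player.get("score");
    if (!(score instanceof Integer) || (Integer) score < 0) {
      return false;
    }
    // "alive" is optional, but must be a boolean when present.
    return !player.containsKey("alive") || player.get("alive") instanceof Boolean;
  }

  public static void main(String[] args) {
    Map<String, Object> player = new HashMap<>();
    player.put("name", "Jim");
    player.put("score", 500);
    player.put("alive", true);
    System.out.println(isValid(player));  // the example document from earlier

    player.put("score", -1);              // violates "minimum": 0
    System.out.println(isValid(player));
  }
}
```

A real validator reads these rules from the schema file instead of hard-coding them, which is the point of having a schema in the first place.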

I’ve been working on an application that uses JSON extensively. Much of the system is written in Java, so I use the popular org.json library to read the data. The text of a JSON file is parsed into an org.json.JSONObject or JSONArray, then it’s read using string key accessors. For example:

  public static void read(JSONObject jsonObject) {
    String name = jsonObject.getString("name");
    int score = jsonObject.getInt("score");
    boolean alive = jsonObject.getBoolean("alive");
    // ....
  }

Having to use string key accessors means that there’s room for programming error, even when a schema is available. I found a few libraries that could copy JSON data into structured Java classes. However, they all required the whole JSON file to be copied into Java classes at once. This means that if schemas aren’t available or practical for part of the data, those libraries can’t be used. If the schema uses features that the library doesn’t support, it can’t be used either.
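The kind of error I mean can be sketched as follows. A misspelled key compiles without complaint and only fails when the code runs (with org.json, getInt throws a JSONException for a missing key). To keep the sketch runnable with only the JDK, a plain `Map` stands in for `JSONObject`, and the class `StringKeyDemo` and its methods are hypothetical names used for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain Map stands in for org.json.JSONObject so the
// sketch runs with just the JDK. The class and method names are hypothetical.
class StringKeyDemo {
  // Mimics a strict string-keyed getter: throws if the key is absent.
  static Object get(Map<String, Object> data, String key) {
    if (!data.containsKey(key)) {
      throw new IllegalArgumentException("No value for key: " + key);
    }
    return data.get(key);
  }

  static Map<String, Object> samplePlayer() {
    Map<String, Object> player = new HashMap<>();
    player.put("name", "Jim");
    player.put("score", 500);
    player.put("alive", true);
    return player;
  }

  public static void main(String[] args) {
    Map<String, Object> player = samplePlayer();
    System.out.println(get(player, "score"));  // correct key
    try {
      get(player, "scroe");  // typo: compiles fine, fails only when run
    } catch (IllegalArgumentException e) {
      System.out.println("Runtime failure: " + e.getMessage());
    }
  }
}
```

The compiler has no way to know that "scroe" is wrong, because to it the key is just a string.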

I wanted something that could allow me to gradually adapt a program written for org.json types into structured types, with the option to ‘fall back’ to regular unstructured data access for any reason.

Structured JSON access in Java

Something like this:

  public static void read(JSONObject jsonObject) {
    Player player = new Player(jsonObject);
    String name = player.getName();
    int score = player.getScore();
    boolean alive = player.isAlive();
    // ...
  }

This approach greatly reduces the chances of misinterpreting the structure. IDEs such as those published by JetBrains will be able to make suggestions as you type that match the intended format of the data you’re reading. In short, errors that would normally become evident at run time are made evident as you first write the code.

jsonschematypes

I wrote a library to generate Java wrappers around org.json objects that make this kind of structured access possible. It means that many format problems will be detected at build time, rather than run time.

It converts standard JSON Schemas into structured access classes that wrap the JSONObject. They look like this:

public class Player {
	private final JSONObject jsonObject;

	public Player(JSONObject jsonObject) {
		this.jsonObject = jsonObject;
	}

	public JSONObject getJSONObject() {
		return jsonObject;
	}

	public String getName() {
		return jsonObject.getString("name");
	}

	public int getScore() {
		return jsonObject.getInt("score");
	}

	public boolean isAlive() {
		return jsonObject.getBoolean("alive");
	}

	public boolean hasAlive() {
		return jsonObject.has("alive");
	}
}

Design

Classes are created according to the design of the JSON Schema from which they are derived. The library takes some license in creating Java classes that match the inferred intent of the schema, to keep them relatively lightweight and not overwhelming for humans to read. This means tactically omitting some variations of accessors that could potentially be supplied. For example, if a schema specifies a default value for a property, no has method will be generated to test for the presence of that property on an object, since a value can always be returned. One consequence is that changes to the schema could result in methods being removed from generated Java classes.
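As a rough sketch of the default-value case: suppose the schema declared "default": true for the alive property. A wrapper along these lines can always return a value, so a hasAlive method would add nothing. This is a hypothetical, hand-written class (`PlayerWithDefault`), not actual output from the generator, and a plain `Map` stands in for `JSONObject` so that it runs with only the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper, not actual jsonschematypes output. A Map stands in
// for org.json.JSONObject so the sketch runs with only the JDK.
class PlayerWithDefault {
  private final Map<String, Object> data;

  PlayerWithDefault(Map<String, Object> data) {
    this.data = data;
  }

  // Assumes the schema declared "default": true for "alive". Because a value
  // can always be returned, no hasAlive() method is needed.
  boolean isAlive() {
    Object value = data.get("alive");
    return value instanceof Boolean ? (Boolean) value : true;
  }

  public static void main(String[] args) {
    Map<String, Object> data = new HashMap<>();
    data.put("name", "Jim");
    // "alive" is absent, so the assumed default applies.
    System.out.println(new PlayerWithDefault(data).isAlive());

    data.put("alive", false);
    // An explicit value takes precedence over the default.
    System.out.println(new PlayerWithDefault(data).isAlive());
  }
}
```

If the default were later removed from the schema, the accessor set would change, which is the trade-off described above.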

Where the generator cannot identify any useful accessors to build, no class is generated and the containing class will supply JSONObjects directly.

The generated classes are designed to be as close as possible to idiomatic Java. For example, booleans are exposed through is getters such as isEnabled.

They are not designed to hide or completely abstract the fact that the objects they wrap are backed by JSONObjects and JSONArrays. It is a goal of the library that programmers can switch to unstructured access of the data (via the usual JSON accessors) where required.
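Mixing the two styles might look something like the sketch below. It assumes part of the data has no schema coverage: the "nickname" key is hypothetical and has no generated accessor. The class `PlayerView` is hand-written for illustration, with `getBacking` playing the role that getJSONObject plays in the generated class shown earlier, and a plain `Map` standing in for `JSONObject` so the sketch runs with only the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper, not actual jsonschematypes output. A Map stands in
// for org.json.JSONObject so the sketch runs with only the JDK.
class PlayerView {
  private final Map<String, Object> data;

  PlayerView(Map<String, Object> data) {
    this.data = data;
  }

  // Plays the role of getJSONObject(): exposes the backing object as-is.
  Map<String, Object> getBacking() {
    return data;
  }

  // Structured access for a property the schema describes.
  String getName() {
    return (String) data.get("name");
  }

  public static void main(String[] args) {
    Map<String, Object> data = new HashMap<>();
    data.put("name", "Jim");
    data.put("nickname", "jb");  // hypothetical key with no generated accessor

    PlayerView player = new PlayerView(data);
    // Structured access where an accessor exists.
    System.out.println(player.getName());
    // Unstructured fallback through the backing object.
    System.out.println(player.getBacking().get("nickname"));
  }
}
```

Because the wrapper holds a reference to the original object and does not copy it, both styles read the same underlying data.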

To build the classes I used a code generation library, com.helger.jcodemodel, which is an extended fork of Sun’s Java code generator.

As well as the Java library there’s a Gradle plugin, which means it can be used in IntelliJ IDEA and Android Studio to automatically generate the classes when your schema changes.

Library, plugin and demonstration

Instructions on how to use it are on the GitHub repository, where I’ve made the source code available under the Apache 2.0 license. Note there are separate instructions for the library and for the Gradle plugin.

There’s also an online demonstration where you can copy your own schemas into an editor and preview the Java code that would be created.

If you have questions, as always, feel free to contact me in the comments below, in GitHub issues on the project, or by email at jimblackler@gmail.com.
