Wednesday 22 November 2023

Schema evolution, message queues, functions, co-and-contravariance


I was pondering schema evolution in the context of temporally decoupled producers and consumers, exploring which aspects are governed by the robustness principle of communication: "be liberal in what you accept and conservative in what you produce".


It all started with wondering why adding a field to a JSON struct is generally considered harmless but adding a variant to an enum is not.

Meyer's Open–Closed principle = Robustness principle

Squint and "Software entities should be open for extension, but closed for modification" looks a lot like "be liberal in what you accept and conservative in what you produce". Both principles emphasise the idea of designing systems that can adapt to changes or variations without requiring modifications to their code.


Here is a list of schema modifications that are always safe from a consumer perspective (new and old consumers can handle the schema):

  • Adding optional fields to a product type (struct or record): Adding fields is safe as old consumers ignore the new field and new consumers are prepared for the field to be missing/optional. Old as well as new producers are compatible with the new schema.

  • Removing a variant from a sum type (enum): Removing a variant is safe as old consumers will simply no longer encounter it. Awkwardly, new consumers must retain the code/ability to handle the old variant. This modification will, however, break old producers trying to publish a message containing the now-removed variant.

List of schema modifications that are always safe for producers (new and old producers can handle the schema):

  • Adding optional fields to a product type (struct or record): Adding optional fields is safe for old producers as omitting an optional field is admissible. Consumers can happily ignore the new field as mentioned above.

  • Adding a new variant to a sum type (enum): Adding a new variant is safe as old producers will blissfully ignore the existence of a new variant. However, consumers' pattern-matching exhaustiveness checks will be violated by a new enum variant.
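
A small Scala sketch of both cases (type and field names invented for illustration):

// Product type: adding an optional field with a default.
// Old call sites like Order(1) still compile, and old consumers
// that never read `note` are unaffected.
case class Order(id: Long, note: Option[String] = None)

// Sum type: adding a variant.
sealed trait Animal
case object Cat   extends Animal
case object Dog   extends Animal
case object Mouse extends Animal // the newly added variant

// A consumer written before Mouse existed:
def speak(a: Animal): String = a match {
  case Cat => "meow"
  case Dog => "woof"
  // The compiler now warns that this match is not exhaustive,
  // and a Mouse arriving at runtime throws a MatchError.
}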

Note that direction matters when we talk about forward and backward compatibility. Forward compatibility is a design characteristic that allows a system to accept input intended for a later version of itself, e.g. the ability to consume messages published by producers with updated schemas. Backward compatibility allows for interoperability with an older legacy system, or with input designed for such a system, e.g. a producer with an updated schema publishing messages that consumers built for the old schema can still process.

Algebraic types / Liskov Substitution Principle

The one schema modification that both producers and consumers can handle seamlessly is adding optional fields to a product type. Adding a new variant to a sum type seemingly results in a new/different type, e.g. [ cat | dog ] != [ cat | dog | mouse ]. Strangely enough, adding a new field to a product type appears to be less disruptive. Reminiscent of inheritance and subtyping.

Message = Instance / Schema = Type / Queue = Function

Is it just me or is there a relationship between the concepts of covariance, contravariance, and which schema modifications are safe from a consumer and producer perspective? When working with functions, you can think of input arguments as being "consumed" by the function, and output values as being "produced" by the function.

In this context, the concepts of covariance and contravariance apply to compatibility of schema versions in the world of message queues:

  1. Covariant output (return values, producer): When a function returns a type, it is safe to substitute a more specific type (subtype) in place of a more general type (supertype). This follows the rule of covariance. For example, if a function is expected to return a Vehicle, it is safe for it to return a Car (assuming Car is a subtype of Vehicle). This is because the consumer of the function's output can handle a Vehicle and thus can handle a Car.

  2. Contravariant input (function arguments, consumer): When a function accepts a type as an argument, it is safe to substitute a more general type (supertype) in place of a more specific type (subtype). This follows the rule of contravariance. For example, if a function is expected to take a Car as an argument, it is safe for it to accept a Vehicle (supertype). This is because the function is designed to handle a Car, and if it can handle any Vehicle, it can certainly handle a Car.
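
Scala encodes exactly this in its function types, which are declared contravariant in their input and covariant in their output (trait Function1[-T1, +R] in the standard library). A quick sketch with the Vehicle/Car example from above:

class Vehicle
class Car extends Vehicle // Car is a subtype of Vehicle

// Covariant output: a producer of Cars may stand in wherever
// a producer of Vehicles is expected.
val makeCar: () => Car         = () => new Car
val makeVehicle: () => Vehicle = makeCar // compiles

// Contravariant input: a consumer of any Vehicle may stand in
// wherever a consumer of Cars is expected.
val handleVehicle: Vehicle => Unit = v => println(v.getClass.getName)
val handleCar: Car => Unit         = handleVehicle // compiles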

Backward compatible = Covariant output / Forward compatible = Contravariant input

Liskov Substitution Principle

Can we think about schema evolutions in terms of subtyping relationships?
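
A tentative sketch in Scala: extending a product yields a subtype (width subtyping), while extending a sum yields a supertype, which lines up with which side of the queue tolerates which change. Version names are invented for illustration.

// Products: a record with more fields can stand in for one with
// fewer, so the extended schema is the subtype.
trait MessageV1 { def id: Long }
trait MessageV2 extends MessageV1 { def note: String } // MessageV2 <: MessageV1

// Sums: every [ cat | dog ] value is also a [ cat | dog | mouse ]
// value, so the extended schema is the supertype.
sealed trait AnimalV2
sealed trait AnimalV1 extends AnimalV2 // AnimalV1 <: AnimalV2
case object Cat   extends AnimalV1
case object Dog   extends AnimalV1
case object Mouse extends AnimalV2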




https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)

https://en.wikipedia.org/wiki/Robustness_principle

https://en.wikipedia.org/wiki/Forward_compatibility

https://en.wikipedia.org/wiki/Backward_compatibility

https://en.wikipedia.org/wiki/Algebraic_data_type



Thursday 11 July 2013

Basic Datomic database example using Scala

I was reading and watching a lot about Rich Hickey's Datomic database lately and started a simple Scala application showing basic Datomic operations.

The repository can be found here: https://github.com/tinoadams/datomic_scala_test

I'm planning on adding more examples down the track.

The basic example demonstrates how to create an in-memory database, define a simple schema, add a custom partition, insert some entities and query the database to retrieve the entities again.
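
In condensed form the steps look roughly like this (the database name and the :person/name attribute are made up for illustration):

import datomic.Peer
import datomic.Util.{list, map, read}

// In-memory database; the name "scratch" is arbitrary.
val uri = "datomic:mem://scratch"
Peer.createDatabase(uri)
val conn = Peer.connect(uri)

// Schema: a :person/name string attribute plus a custom partition.
conn.transact(list(
  map(read(":db/id"), Peer.tempid(":db.part/db"),
      read(":db/ident"), read(":person/name"),
      read(":db/valueType"), read(":db.type/string"),
      read(":db/cardinality"), read(":db.cardinality/one"),
      read(":db.install/_attribute"), read(":db.part/db")),
  map(read(":db/id"), Peer.tempid(":db.part/db"),
      read(":db/ident"), read(":my/partition"),
      read(":db.install/_partition"), read(":db.part/db")))).get()

// Insert an entity into the custom partition.
conn.transact(list(
  map(read(":db/id"), Peer.tempid(":my/partition"),
      read(":person/name"), "Alice"))).get()

// Query the name back out.
val results = Peer.q("[:find ?n :where [_ :person/name ?n]]", conn.db())
println(results) // prints [[Alice]]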

Lists and maps

The Datomic library comes with convenience methods to create Java lists and maps:

 import datomic.Util._  
 val javaMap = map("key1", "value1", "key2", "value2")  
 val javaList = list("value1", "value2")  

Scala's lists and maps can also be used but need to be converted into Java types:

 import scala.collection.JavaConversions._  
 val javaMap: java.util.Map[_, _] = Map("key1" -> "value1", "key2" -> "value2")  
 val javaList: java.util.List[_] = List("value1", "value2")  


Thursday 31 May 2012

"Emulating" C#'s using keyword in Scala

Not quite the same, but better than dealing with closing the resource manually every time I'm using a DB connection, input stream or the like...
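
At its core it is the loan pattern. A minimal sketch, using a structural type so that anything with a close() method qualifies:

import java.io.{BufferedReader, FileReader}

// Loan pattern: hand the resource to f and close it on the way out,
// whether f returns normally or throws.
def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B =
  try f(resource) finally resource.close()

// The reader is closed even if readLine throws.
val firstLine = using(new BufferedReader(new FileReader("/etc/hosts"))) { reader =>
  reader.readLine()
}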


And here the most convoluted way of outputting a string:
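
Presumably something in this spirit, routing everything through the helper above:

import java.io.{PrintWriter, StringWriter}

// Loan out a StringWriter, wrap it in a loaned PrintWriter, print,
// and read the buffer back out.
val text = using(new StringWriter()) { sw =>
  using(new PrintWriter(sw)) { pw =>
    pw.print("Hello, world!")
  }
  sw.toString
}
println(text)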

Saturday 26 May 2012

Akka 2 and setReceiveTimeout

The Akka documentation describes setReceiveTimeout as follows:
"A timeout mechanism can be used to receive a message when no initial message is received within a certain time."
I was unsure about the "initial message" part, and it turns out that the underlying actor receives a timeout every time there hasn't been a message within the specified timeframe, i.e. not just when no initial message has been received.

Small code sample to show the behaviour of setReceiveTimeout:
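
Reconstructed against the Akka 2.0-era API, it looked roughly like this:

import java.util.Date
import akka.actor._
import akka.util.duration._

class TimeoutActor extends Actor {
  // Ask for a ReceiveTimeout message whenever the mailbox has been
  // quiet for 5 seconds, not just before the first message.
  context.setReceiveTimeout(5 seconds)
  println(new Date + " / Started")

  def receive = {
    case ReceiveTimeout => println(new Date + " / No message received since 5 seconds")
    case msg            => println(new Date + " / Received: " + msg)
  }
}

object ReceiveTimeoutDemo extends App {
  val system = ActorSystem("demo")
  val actor = system.actorOf(Props[TimeoutActor])

  Thread.sleep(14000) // stay quiet long enough for two timeouts to fire
  for (i <- 1 to 10) {
    actor ! "Message: " + i
    Thread.sleep(1000) // messages arrive faster than the timeout
  }
  Thread.sleep(11000) // two more timeouts fire after the last message
  system.shutdown()
}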


Output:


Sun May 27 13:23:57 EST 2012 / Started
Sun May 27 13:24:02 EST 2012 / No message received since 5 seconds
Sun May 27 13:24:07 EST 2012 / No message received since 5 seconds
Sun May 27 13:24:12 EST 2012 / Received: Message: 1
Sun May 27 13:24:13 EST 2012 / Received: Message: 2
Sun May 27 13:24:14 EST 2012 / Received: Message: 3
Sun May 27 13:24:15 EST 2012 / Received: Message: 4
Sun May 27 13:24:16 EST 2012 / Received: Message: 5
Sun May 27 13:24:17 EST 2012 / Received: Message: 6
Sun May 27 13:24:18 EST 2012 / Received: Message: 7
Sun May 27 13:24:19 EST 2012 / Received: Message: 8
Sun May 27 13:24:20 EST 2012 / Received: Message: 9
Sun May 27 13:24:21 EST 2012 / Received: Message: 10
Sun May 27 13:24:26 EST 2012 / No message received since 5 seconds
Sun May 27 13:24:31 EST 2012 / No message received since 5 seconds

Tuesday 15 November 2011

Config & code snippets

This is my notepad post containing some config and code snippets.

Enabling JRebel in the Eclipse run configuration


Select the JRE tab and paste the following (replace path with your local jrebel.jar location) in the VM arguments field:

-noverify -javaagent:/Applications/ZeroTurnaround/JRebel/jrebel.jar -Drebel.lift_plugin=true


Set maximum Java heap size when starting Eclipse (Windows)


Drag "eclipse.exe" onto the task bar to create an icon -> right-click the icon and select "Properties" -> choose the "Shortcut" tab and paste the following (replace with your local Eclipse path) into the target field:

"C:\Program Files (x86)\eclipse-jee-indigo-SR1-scala\eclipse Scala.exe" -vm "C:\Program Files (x86)\Java\jdk1.7.0_01\bin\javaw.exe" -vmargs -Xmx1152m


Fixing Pentaho data integration app on Mac OSX


Starting the data integration app results in:


LSOpenURLsWithRole() failed with error -10810 for the file /Applications/pentaho/data-integration/Data Integration 32-bit.app.


Can be resolved by:


chmod +x ~/Downloads/data-integration/Data\ Integration\ 32-bit.app/Contents/MacOS/JavaApplicationStub


The same fix works for the 64-bit version.

Wednesday 21 September 2011

Scala case classes and DDD value objects

I was looking for a neat way to use the Scala case class construct when implementing value objects as advertised by the domain driven design approach.

The only problem with using case classes, which come with all the convenience features like "unapply" and "copy", was that I wanted to operate on arguments before the class constructor assigns them to the immutable class members.

In case of the example below I simply wanted to trim the given postcode. Similar use cases, where I want to bring value object arguments into a particular format, pop up all the time.

The best pointer to a suitable solution was this Stackoverflow post: http://stackoverflow.com/questions/5827510/how-to-override-apply-in-a-case-class-companion

Since we cannot override the default "apply" method of case class companion objects, I changed the method to be called "parse". That is not as concise as the "apply" approach, but with time I actually started to appreciate the more explicit naming.

The following code shows the way I'm currently implementing DDD value objects in Scala:
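
A trimmed-down sketch (the value object reduced to the single postcode field):

// The private constructor forces construction through the companion.
case class Postcode private (value: String)

object Postcode {
  // Factory that brings the argument into shape before construction.
  def parse(raw: String): Postcode = new Postcode(raw.trim)
}

// new Postcode(" 2000 ")   -- does not compile outside the companion
// Postcode.parse(" 2000 ") -- yields Postcode("2000")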


Making the case class constructor private ensures that the factory method "parse" of the companion object must be used to instantiate a new postcode object.

In addition to the version above I just recently found another way of tackling the problem.

There is a related discussion happening here:
https://issues.scala-lang.org/browse/SI-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel

According to the Scala spec, the definition of apply is omitted if the case class is abstract. Unfortunately, this also prevents the compiler from generating the copy method, which has to be implemented manually.

Sample case class with companion object that sanitizes parameters before instantiating a new case class instance:
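
A sketch of that abstract-case-class variant:

// abstract suppresses the synthetic apply (and copy); sealed prevents
// anyone outside this file from subclassing it.
sealed abstract case class Postcode(value: String) {
  // copy is not generated for an abstract case class; written manually.
  def copy(value: String = this.value): Postcode = Postcode(value)
}

object Postcode {
  // The only way to construct a Postcode: sanitize, then instantiate
  // via an anonymous subclass.
  def apply(raw: String): Postcode = new Postcode(raw.trim) {}
}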

Wednesday 14 September 2011

Scala 2.9.1 Circumflex ORM 2.0.3

I just upgraded my Scala Eclipse plugin to Scala IDE 2.0.0-beta10 which ships with Scala 2.9.1.final.

When running my sample Circumflex ORM project I received this exception:

java.lang.NoSuchMethodError: scala.collection.mutable.Stack$.apply(Lscala/collection/Seq;)Lscala/collection/mutable/Stack;

Fixing it required me to recompile Circumflex against Scala 2.9.1.

I forked the project on GitHub and made the necessary amendments. I'm currently using the unstable 2.1 version. 

Clone and install the forked Circumflex version 2.1 into the local Maven repository:
git clone git@github.com:tinoadams/circumflex.git
cd circumflex
git checkout master
mvn clean install

With the newly compiled Circumflex version in place you should now be able to build and run the sample application using the version 2.1 dependency:
		
<dependency>
	<groupId>ru.circumflex</groupId>
	<artifactId>circumflex-orm</artifactId>
	<version>2.1</version>
	<scope>provided</scope>
</dependency>