Friday, July 21, 2017

The need for immutability

Immutable objects are entities whose state, in the eyes of the external observer, doesn't change. This doesn't strictly mean that the object never changes internally. It means that, as far as the API consumer is aware, any exposed method can be called without affecting the outcome of any future call to any exposed method of the same instance.

This might be a confusing definition at first but let's take it bit by bit. First, let's look at the seemingly heretic statement that an immutable object can change internally.

Consider the String class in Java. It is without doubt an immutable object. It encapsulates a char array, which it protects by copying it every time the API consumer requests it. Any instance method of the String class that does string operations gives a new String. If you look closely at the source code though, there is one field (in Java 8) that changes: the hash.

Calling the hashCode method of a string has a side effect. It caches the hashCode result so it won't have to compute it again. This is invisible to the external observer as all the future method calls will return the same result both before and after calling the hashCode method. Even the memory usage remains constant as the 4 bytes of the primitive int are already reserved.

This is not even a problem for multi-threading. The hash field doesn't strictly need to be volatile: a thread will either see 0 and recompute the value (it's not a real problem if it's computed several times in parallel, or before other threads see the updated value) or see the cached result. There is no middle state since writing to an int is atomic.
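
The idea can be sketched with a small class of our own. This is not the JDK's String source; Word and its members are hypothetical names used only to illustrate a lazily cached hash behind an immutable API:

import java.util.Arrays;

// Hypothetical example class: immutable from the outside, with one
// internal field that is written lazily, in the style of String's hash.
final class Word {
    private final char[] letters;
    private int hash; // 0 means "not computed yet"; deliberately not volatile

    Word(String text) {
        this.letters = text.toCharArray(); // private copy, never exposed
    }

    char[] letters() {
        return letters.clone(); // defensive copy protects the internal array
    }

    @Override
    public int hashCode() {
        int h = hash; // read the field once into a local
        if (h == 0) {
            h = Arrays.hashCode(letters);
            hash = h; // benign race: every thread computes the same value
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Word && Arrays.equals(letters, ((Word) other).letters);
    }
}

Whether hashCode is called once or a million times, the caller sees the same number, so the write to hash is invisible from the outside.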

Immutable objects enjoy such optimisation delights specifically because they are immutable. You can call hashCode a million times and you'll get the same result, cached or not.

Strings can also be interned (kept in a pool of re-usable strings) and Integers (Integer.class) are cached (from -128 to 127). Yes, immutability can also enable easy memory optimisations.
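
A quick way to see both pools at work is to compare references. PoolingDemo is a hypothetical helper written for this post; it uses only standard JDK calls:

final class PoolingDemo {

    // A string built at runtime is a separate object, but intern()
    // hands back the pooled instance shared with the literal.
    static boolean internReturnsPooledInstance() {
        String built = new String("dog");
        return built != "dog" && built.intern() == "dog";
    }

    // Integer.valueOf must return cached instances for -128..127,
    // so both calls yield the very same object.
    static boolean smallIntegersAreShared() {
        Integer a = Integer.valueOf(127);
        Integer b = Integer.valueOf(127);
        return a == b;
    }
}

Sharing instances like this is only safe because neither a String nor an Integer can be modified by whoever receives it.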

API considerations


Who is the consumer of the object? There are three types of consumers: the developer that interacts with an object via its API, the object itself (including internally defined classes), and a special case of consumers that break the immutability contract because "they know what they are doing".


The developer as a user


The first consumer is the one you need to worry about the most. Every public method you define in your class is a contract between you and the API user. If the public methods change the state, you need not only to maintain them but also to handle all state errors at every entry point.

To demonstrate this, let's have a look at my favourite example:

final class Dog {
    public final int barkLevel;
    public final String name;
    public Dog(int barkLevel, String name) {
        if (barkLevel < 0)
            throw new IllegalArgumentException("Bark level cannot be a negative number");
        if (name == null || name.isEmpty())
            throw new IllegalArgumentException("A dog needs a name");
        this.name = name;
        this.barkLevel = barkLevel;
    }
}

This is an immutable dog. Once it's created neither its name nor its barkLevel can change. Of course this is not true in real life and we'll come back to that.

The benefit here is that given any Dog instance, it can be safely used forever. There is no way, as far as the external observer is aware, that you have a Dog that doesn't have a name or the bark level is negative. So the rest of the codebase need not worry about any validations of any Dog property.

Another benefit is that we don't need getters for every field; no protection is needed since both int and String are immutable. Reading a field is also at least as cheap as invoking a method, although the JIT compiler usually inlines trivial getters. Naturally, when a field holds a mutable object you still need a getter method to protect the encapsulated property.

In real life, the bark level can change (even the name in rare cases). But despite the fact that you can easily model real life in OOP, a computer program remains a different world with its own domain and semantics. Here the semantics clash a bit. Having no setter, the model says this dog will forever have this bark level. In the software world we can model this with a with method and give a new dog, preserving the real life semantics.

Also remember that this is not a real dog but it's the idea of what our program thinks of a dog. Thus, you can think of getting a new idea of what the dog is, instead of thinking in terms of physically getting a new dog because the bark level changed.
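
A minimal sketch of such a with method, added to the Dog class from above. The method name withBarkLevel is my own choice for illustration:

final class Dog {
    public final int barkLevel;
    public final String name;

    public Dog(int barkLevel, String name) {
        if (barkLevel < 0)
            throw new IllegalArgumentException("Bark level cannot be a negative number");
        if (name == null || name.isEmpty())
            throw new IllegalArgumentException("A dog needs a name");
        this.name = name;
        this.barkLevel = barkLevel;
    }

    // Hypothetical "with" method: returns a new idea of the dog and
    // leaves this instance untouched. Validation is reused because
    // the copy goes through the constructor.
    public Dog withBarkLevel(int newBarkLevel) {
        return new Dog(newBarkLevel, name);
    }
}

Anyone still holding the old reference keeps a valid dog with the old bark level, and an invalid bark level fails before a broken instance can exist.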

Now consider this type of implementation, which is the most common in Java codebases:

class Dog {
    private int barkLevel;
    private String name;
 
    public int getBarkLevel() { return barkLevel;}
    public String getName() { return name;}
    public void setBarkLevel(int barkLevel) { this.barkLevel = barkLevel;}
    public void setName(String name) {this.name = name;}

}

You can argue that you can add validation in every setter method. Even so, that means that at any point your dog instance can break and you'll need a new dog or to go back to the previous one. How do you manage failures? You need to remember previous states and recover the dog, or add logic that prevents the state from changing to an erroneous one. All of this just adds more technical debt.

Builders

The second consumer which is also important is the object's class definition itself. It can incorporate all the mutability needs of the object in order to create the pre-defined immutable object.

For instance, consider that you have quite a few object properties. Calling a constructor with more than 3 parameters (or any method in fact) is inconvenient. What you need is a builder.

The builder will manage the mutability, you can call method after method defining the desirable state of the object. This is the concept of the string builder as well. That way you also tackle some performance considerations where you won't need to create and destroy N objects for N properties.
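
Here is a hand-written sketch of that idea. DogProfile and its extra properties (breed, ageInYears) are hypothetical; the point is that all mutation lives in the builder, and the object it produces is validated once and then frozen:

final class DogProfile {
    public final String name;
    public final String breed;
    public final int barkLevel;
    public final int ageInYears;

    // Only the builder can create instances, and only in a valid state.
    private DogProfile(Builder b) {
        if (b.name == null || b.name.isEmpty())
            throw new IllegalArgumentException("A dog needs a name");
        if (b.barkLevel < 0)
            throw new IllegalArgumentException("Bark level cannot be a negative number");
        this.name = b.name;
        this.breed = b.breed;
        this.barkLevel = b.barkLevel;
        this.ageInYears = b.ageInYears;
    }

    static Builder builder() {
        return new Builder();
    }

    // The mutable part: one object collects the desired state step by step.
    static final class Builder {
        private String name;
        private String breed = "unknown";
        private int barkLevel;
        private int ageInYears;

        private Builder() {
        }

        Builder name(String name) { this.name = name; return this; }
        Builder breed(String breed) { this.breed = breed; return this; }
        Builder barkLevel(int barkLevel) { this.barkLevel = barkLevel; return this; }
        Builder ageInYears(int ageInYears) { this.ageInYears = ageInYears; return this; }

        DogProfile build() {
            return new DogProfile(this);
        }
    }
}

Usage reads as a chain: DogProfile.builder().name("Rex").barkLevel(3).build(). Only one intermediate object (the builder) is created, regardless of how many properties are set.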

Creating a builder though is often a burden since you need to write a lot of boilerplate code. Fear not: there are libraries that generate that code for you at compile time (e.g. Lombok for Java).

What about changing a single property? For POJOs it's tempting to have setter methods because they seem cheap and they change the state of a single object, no copies involved. Writing a method that gives you the same object with a single property changed again involves a lot of boilerplate code, which also doesn't seem efficient.

All of this can be automated though, either by macros or by using libraries such as Lombok. In Lombok's case you can just use @Wither (renamed to @With in later Lombok releases), which gives you a with method for every field that you want to be changeable. The performance overhead is minimal, since the copy of the properties is shallow.

Deserialising objects


We finally have the case of the third consumer. The one that breaks our immutability contract because it knows what it's doing. There is a way to even avoid that but we'll discuss it at the end.

Such consumers are normally serialisation libraries, such as Gson for JSON or Hibernate for database entities. With the most common configuration, they instantiate an object and, for each field, try to find a setter method that matches the field name prefixed with set in camel case. Configured appropriately for immutable objects, they will instantiate the object and reflectively assign a value to each object field. Even final fields can be altered at runtime, in the case of Java at least.

Now the assumption here is that the immutability breaks in a limited scope; the method that does the deserialisation. At the end you will get a reference of a deserialised object and not a reference of an object which is being deserialised.
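
To make the reflective assignment concrete, here is a small sketch of what such a library does under the hood. Collar and ReflectiveWrite are hypothetical names, and the code relies on JVM behaviour for final instance fields that newer Java releases may restrict, so treat it as an illustration rather than a recommended technique:

import java.lang.reflect.Field;

// Hypothetical immutable class with a single final field.
final class Collar {
    public final String tag;

    Collar(String tag) {
        this.tag = tag;
    }
}

final class ReflectiveWrite {

    // Overwrites the final field the way a deserialiser configured for
    // field access would: look the field up, lift the access check, set it.
    static Collar overwriteTag(Collar collar, String newTag) throws ReflectiveOperationException {
        Field field = Collar.class.getDeclaredField("tag");
        field.setAccessible(true);
        field.set(collar, newTag);
        return collar;
    }
}

The mutation is confined to overwriteTag; callers only ever receive the finished object, which is the limited scope described above.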

Given the right configuration, some libraries allow calling the constructor directly with all the arguments needed. For example, Jackson has a set of annotations that you can use to map each JSON field to a constructor parameter.

Semantics


We've seen the three consumers; now we need to go back to the most important one: the human developers. So far we've seen that the benefit we give them is not having to worry about an object being in an invalid state.

Immutable objects give great semantics to the external observer as well. Consider the following definitions of an Exception:

final class JsonTypeCastingDecodingFailure extends Exception {
    public JsonTypeCastingDecodingFailure(String fieldName, Class expectedType, Class found) {
        super(String.format("%s cannot be casted to %s from %s", fieldName, expectedType.getName(), found.getName()));
    }
}

class JsonTypeCastingDecodingFailure extends Exception {
    private String message;
    public JsonTypeCastingDecodingFailure(String fieldName, Class expectedType, Class found) {
        this.message = String.format("%s cannot be casted to %s from %s", fieldName, expectedType.getName(), found.getName());
    }
    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}


The second definition makes no sense. What are the semantics of an error whose message anyone is allowed to alter?

Semantics are often ignored during programming. Most developers have been using the same wrong things over and over again until they have become the norm; adding getters and setters is one of them, suffixing every exception class name with Exception is another (but this is a story for another time).


Summary


Every public method provided is a contract. It has a purpose and a meaning, and it allows interpretations which are sometimes the wrong ones; semantics are the thing that everyone cares about only when they have inherited legacy code.

State changes need to be managed in a restricted area where a Facade provides the minimum API to do one thing and the internal state is encapsulated and protected.

My final advice is an old but often neglected one: Make something public if and only if it needs to be public. A setter method will never pass that condition.

Friday, May 6, 2016

Memory efficient serialization for Android



VSerializer

A library to serialize and deserialize objects with minimum memory usage.

Gradle dependencies

allprojects {
        repositories {
            ...
            maven { url "https://jitpack.io" }
        }
}
dependencies {
    compile 'com.github.vaslabs:VSerializer:1.0'
}

Example

VSerializer vSerializer = new AlphabeticalSerializer();
TestUtils.AllEncapsulatedData allEncapsulatedData = new TestUtils.AllEncapsulatedData();
allEncapsulatedData.a = -1L;
allEncapsulatedData.b = 1;
allEncapsulatedData.c = 127;
allEncapsulatedData.d = -32768;
allEncapsulatedData.e = true;
allEncapsulatedData.f = 'h';

byte[] data = vSerializer.serialize(allEncapsulatedData);

TestUtils.AllEncapsulatedData recoveredData = 
    vSerializer.deserialise(data, TestUtils.AllEncapsulatedData.class);

Motivation

Memory on Android is precious. Every application should use as little memory as possible, both volatile and persistent. However, doing so is too complex for the average developer who wants to ship the application as fast as possible. The aim of this library is to automate the whole process and, ideally, replace the default serialization mechanism.
That can achieve:
  • Lazy compression and decompression on the fly to keep volatile memory usage low for objects that are not used frequently (e.g. cached objects with low hit/miss ratio).
  • Occupying less persistent memory when saving objects on disk.

How does it work?

This project is under development and very young. However, you can use it if you are curious or you want to be a step ahead by following the examples in the unit test classes.

Advantages

  • A lot less memory usage when serializing objects compared to the default JVM serialization or JSON.
  • Faster processing for serialization/deserialization
  • Extensible: will be able to easily encrypt and decrypt your serialized objects

Disadvantages

  • Less forgiving for changed classes. A mechanism to manage changes will be in place, but since the metadata for the classes won't be carried over it will never be as forgiving as the defaults.
  • Does not maintain the object graph, meaning that a cyclic data structure cannot be serialized.

Use case

  • Any data structure that pairs a timestamp with other primitive values would be highly optimised in terms of space when saved using this approach. You can save millions of key/value pairs for data like a timestamp/location history graph.
  • Short-lived cache data are less likely to cause problems when you change classes. You can benefit by reducing the memory usage of your caching mechanism without worrying much about versioning problems.
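
To get a feel for why timestamp/primitive pairs pack so well, here is a rough sketch using plain java.nio. It is not VSerializer's actual wire format or API; LocationSample and its layout are assumptions made for illustration only:

import java.nio.ByteBuffer;

// Hypothetical record of a timestamp/location history entry.
final class LocationSample {
    // One long plus two doubles: 8 + 8 + 8 = 24 bytes, no class metadata.
    static final int ENCODED_SIZE = Long.BYTES + 2 * Double.BYTES;

    final long timestampMillis;
    final double latitude;
    final double longitude;

    LocationSample(long timestampMillis, double latitude, double longitude) {
        this.timestampMillis = timestampMillis;
        this.latitude = latitude;
        this.longitude = longitude;
    }

    // Writes the fields back to back in a fixed order.
    byte[] toBytes() {
        return ByteBuffer.allocate(ENCODED_SIZE)
                .putLong(timestampMillis)
                .putDouble(latitude)
                .putDouble(longitude)
                .array();
    }

    // Reads the fields in the same fixed order they were written.
    static LocationSample fromBytes(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        return new LocationSample(buffer.getLong(), buffer.getDouble(), buffer.getDouble());
    }
}

Because no field names or class descriptors are stored, each entry costs a fixed 24 bytes. The flip side is the disadvantage listed above: if the class changes, old bytes can no longer be interpreted without extra versioning logic.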

Get the code from: https://github.com/vaslabs/VSerializer

Sunday, January 10, 2016

Trackpa: Never lose your grandparents

Or that was the initial concept. You can track a phone's location via sms and you have the option for encryption, so your location won't leak here and there.








Get it from:

https://play.google.com/store/apps/details?id=com.vaslabs.trackpa

Receiver:
https://play.google.com/store/apps/details?id=com.vaslabs.trackpa_receiver


I would have made the receiver free as well, but I have to use the Google Maps API service, so the price is mainly there to avoid spam and API charges, or to cover them if the need arises. You can get it for free by compiling the source code (see below) using your own API keys.

Source code:

https://github.com/vaslabs/trackpa
https://github.com/vaslabs/trackpa_receiver

Sunday, October 18, 2015

Java: When the compiler crashes the plane

Software design principles for compiled programming languages tend to have one rule in common: compiler errors over runtime errors. It is a religiously followed rule, a dogma (although based on legitimate reasons). You should always program in a way that lets the compiler report most errors, with your logic tested by unit tests. Sometimes little things slip through though.

The following scenario describes an easy mistake to make in Java and highlights some good practice to avoid crashing that plane.

Pre-requisites:



Open IntelliJ and set up a Java project with the following (adapt to match your system):





The example code is this:

import java.util.concurrent.ConcurrentHashMap;


public class FunWithMaps {

    public static void main(String[] args) {
        ConcurrentHashMap map = new ConcurrentHashMap();

        map.put("1", 1);

        System.out.println(map.keySet().getClass());
    }
}

Run. You'll get the following error:

Exception in thread "main" 
java.lang.NoSuchMethodError: 
java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;

So, the compiler used the Java 8 API and didn't complain, despite the fact that we set it to compile to Java 7. That setting only controlled the bytecode version; the Java 8 API that was on the compile classpath didn't cause any problems at compile time.

Consider that this scenario is what your continuous integration (CI) might have. This imaginary CI system builds your production-level code but you, as a programmer, have no control over it. The ConcurrentHashMap code would succeed in your IDE (because you would be compiling with Java 7, targeting Java 7), but the CI would be compiling with Java 8, generating Java 7 bytecode, without having the Java 7 API on the classpath. The runtime environment would use Java 7.

You wouldn't know, the compiler wouldn't know, and you would be relying on the testing environment to catch the crash. That scenario might cause you a late runtime crash in a live environment.

Let's see the bytecode a bit and see if we get what we expect, i.e. a call to the keySet method that returns the KeySetView.

Open your generated class file with Java bytecode editor.



On line 13, this is quite obvious. When running with Java 7 you don't get any error until line 13 is executed by the JVM and it tries to find that method, which doesn't exist on Java 7. So how can we avoid this with a bit of good practice? Having an API on the compile-time classpath that differs in version from the one at runtime is the most obvious mistake that needs fixing. But we also want our code to be as safe as possible and to keep working even when simple API changes like this one occur.

Let's change the left hand side to the interface definition. The Map.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;


public class FunWithMaps {

    public static void main(String[] args) {
        Map map = new ConcurrentHashMap();

        map.put("1", 1);

        System.out.println(map.keySet().getClass());
    }
}

Before running it, make a build and see the bytecode. It's now a bit different.



As expected, the compiler now generated the bytecode according to the interface visibility of the method.
If you run it now with java 7 it will not fail and will output:
class java.util.concurrent.ConcurrentHashMap$KeySet

Switch run configurations to jre 8. Run again and you get what you expected:
class java.util.concurrent.ConcurrentHashMap$KeySetView
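
If you don't have a bytecode viewer at hand, reflection shows the same difference. The compiler records the return type declared on the static type of the variable, and those declared types can be queried directly. KeySetReturnTypes is a hypothetical helper written for this post:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class KeySetReturnTypes {

    // What the compiler records when the variable is declared as Map.
    static Class<?> declaredOnMap() throws NoSuchMethodException {
        return Map.class.getMethod("keySet").getReturnType();
    }

    // What the compiler records when the variable is declared as
    // ConcurrentHashMap; the answer depends on the JDK on the classpath.
    static Class<?> declaredOnConcurrentHashMap() throws NoSuchMethodException {
        return ConcurrentHashMap.class.getMethod("keySet").getReturnType();
    }
}

The first method returns java.util.Set on every Java version, which is why the Map-typed call site links against both runtimes. The second returns whatever the compile-time JDK declares, and that is the type that ends up baked into the class file.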


It's always good practice to declare your variables with the highest superclass or interface possible, especially if you are using an external API (which is pretty much always the case). Interfaces rarely change, or at least they change less frequently than implementation code.

That should also settle the argument over whether the left hand side of a variable definition should use the instantiating class or its superclass (the interface it implements).

Sunday, August 30, 2015

Programming with the metric system - Draft ideas

In a paper called 'Software Development for Infrastructure' Bjarne Stroustrup presented the new features of C++11 with some interesting examples. The most fascinating one was derived from the NASA accident in September of 1999. The root of the accident was a mismanagement of the metric system units due to a poorly designed API that basically was relying on comments.

I'm currently working on a project that requires managing metric units correctly. The language used is Java. Also, I have developed an external, open source library, that provides the essential API for managing metric units. It's a very young project so it supports a very small range of metric units (only those needed by the bigger project) but it is sufficient to demonstrate the basic design principles for managing metric units.

You can find the library here.

Let's examine its usage in a few examples. Let's say we want to keep track of velocity.

Actually, this is already provided in the library.

public final class VelocityUnit {
    public final DistanceUnit DISTANCE_UNIT;
    public final TimeUnit TIME_UNIT;

    public final double DISTANCE_VALUE;

    public VelocityUnit(DistanceUnit distance_unit, TimeUnit time_unit, double distance_value) {
        DISTANCE_UNIT = distance_unit;
        TIME_UNIT = time_unit;
        DISTANCE_VALUE = distance_value;
    }

    public VelocityUnit convert(DistanceUnit distance_unit, TimeUnit time_unit) {

        if (distance_unit == DISTANCE_UNIT && TIME_UNIT == time_unit)
            return this;

        double newTimeUnitWorthOfCurrentTimeUnit = 1/time_unit.convert(1, TIME_UNIT);

        double newTotalDistance = DISTANCE_VALUE*newTimeUnitWorthOfCurrentTimeUnit;

        double newDistanceValue = distance_unit.convert(DISTANCE_UNIT, newTotalDistance);

        return new VelocityUnit(distance_unit, time_unit, newDistanceValue);

    }


    public String getMetricSignature() {
        return DISTANCE_UNIT.signature + "/" + TIME_UNIT.signature;
    }

    public String toString() {
        return String.format("%.2f%s",DISTANCE_VALUE, getMetricSignature());
    }

}

In the constructor, the programmer passes the unit of distance and the unit of time. This implementation is quite simple, so it takes the distance covered as a third parameter and assumes the time value is a single unit of whatever time unit is passed. You can then convert it to use different distance units or time units accordingly.

But that doesn't tell us much about differences between two people. What about hiding the information from the third party developer? Let's see another example of how developers get what they expect by interacting with a black box class. In the example below, the developer can put values in whatever unit they like and get the result back in whatever unit they like. The magic is in telling the class to work internally with only one unit.

package com.vaslabs.units.examples;

import com.vaslabs.units.DistanceUnit;

public class ExampleDistanceCalculation {

    private final DistanceUnit PREF_DISTANCE_UNIT = DistanceUnit.METERS;

    private double pointA;
    private double pointB;

    public ExampleDistanceCalculation() {

    }

    // Store point A internally in the preferred unit, whatever unit the caller used.
    public void setPointA(double value, DistanceUnit distanceUnit) {
        pointA = PREF_DISTANCE_UNIT.convert(distanceUnit, value);
    }


    // Store point B internally in the preferred unit, whatever unit the caller used.
    public void setPointB(double value, DistanceUnit distanceUnit) {
        pointB = PREF_DISTANCE_UNIT.convert(distanceUnit, value);
    }

    // Return the distance converted from the preferred unit to the requested one.
    public double getDistance(DistanceUnit distanceUnit) {
        return distanceUnit.convert(PREF_DISTANCE_UNIT, (pointB - pointA));
    }
}

The ExampleDistanceCalculation will work with meters while third party developers can choose their own unit. For instance, you can have a sensor and some software that give you values in centimeters. You can have a class like the above as middleware (with CM instead of METERS) and allow all the other developers to work in the unit of their preference. It is also useful when delivering to end users, as users may have different preferences on units.
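
The sensor scenario can be sketched without the library. LengthUnit and SensorMiddleware below are hypothetical stand-ins, not the library's DistanceUnit API; only the convert(fromUnit, value) argument order mirrors the examples above:

// Hypothetical unit enum: each constant knows its size in meters.
enum LengthUnit {
    MILLIMETERS(0.001), CENTIMETERS(0.01), METERS(1.0), KILOMETERS(1000.0);

    private final double meters;

    LengthUnit(double meters) {
        this.meters = meters;
    }

    // Converts a value expressed in `from` into this unit.
    double convert(LengthUnit from, double value) {
        return value * from.meters / this.meters;
    }
}

// Hypothetical middleware: the sensor's unit is fixed in one place and
// never leaks to callers, who always name the unit they want.
final class SensorMiddleware {
    private static final LengthUnit SENSOR_UNIT = LengthUnit.CENTIMETERS;

    private final double readingInSensorUnit;

    SensorMiddleware(double readingInSensorUnit) {
        this.readingInSensorUnit = readingInSensorUnit;
    }

    double reading(LengthUnit wanted) {
        return wanted.convert(SENSOR_UNIT, readingInSensorUnit);
    }
}

A raw reading of 250 from the sensor comes back as roughly 2.5 when a caller asks for METERS, and no caller ever has to read a comment to find out which unit the sensor uses.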

Saturday, April 26, 2014

Pi-web-agent Quokka

The pi-web-agent version 0.2, codenamed Quokka, was released on the 24th of April. It provides a better user interface, which is faster and more interactive, and some extra cool features such as:

  • Pi camera controller (take snapshots or watch a live stream).
  • File manager - browse and download files. 
  • Radio - stream from internet radio or other audio by providing the URL.
The firewall management was also improved and now allows you to control access from various protocols and IP addresses.



How to get it:

Download the application from pi store: http://store.raspberrypi.com/projects/pi-web-agent

Give a like to the developers and their project: https://www.facebook.com/pages/Raspberry-Pi-Web-Agent/481006072007776

Monday, April 7, 2014

The next day for your business: Windows XP?

I should have written this article about a year ago, to give anyone who cares the time to plan ahead. Frankly, I don't care much. If you have a business and you are technologically impaired it's your fault.

Computers are not an unnecessary "shit that I have to buy, just don't spend much". They are your records, data center, analysis tools and your professional image all in one box. When you installed or bought computers with Windows XP you did the right thing. They were the best you could find, a value for money deal like no other. The main reason for that was the Microsoft monopoly. Linux distributions were good but you couldn't easily find the tools you needed, and Apple oh Apple. . .

But in 2014 things are different. You can have a free office solution that may lack the User Interface eye candy of Microsoft's, but it does the job and guess what: it's free. You also have a large range of free Linux based Operating Systems you can use. People tend to agree that Ubuntu and Linux Mint are the most user friendly ones.

But if you feel that open source and free software is "insecure and vulnerable and Jesus everyone can see the code, is that even safe?" you can buy from Red Hat and have the support you used to have with Microsoft. Which in fact you didn't have, but this time it will be a real thing.

So before upgrading to Windows 7 or 8 (Jesus, are you thinking "what about Vista?" now?) or buying new machines, think about spending a fifth of that money to install a free operating system (which is also safer) and train your employees to use it. If they can't learn it, fire them and get new ones.

So what are the benefits of Linux based Operating Systems?

Remember when you needed to manually update Firefox, Google Chrome and a bunch of other applications that weren't Microsoft's? Well, no more. Every application (assuming you've installed them correctly, which requires an IQ roughly above the 20's) gets updated automatically along with the system updates. And guess what: if you screw up or something breaks, you can roll back (again, if you have the IQ index mentioned before).

"But why do I need to keep updating?". Well that's the reason you are switching from XP to something else right? Anyway, most of your employees play solitaire or they are on Facebook. So get them something that's free and actually works.