API Design: In The Wild Skip to main content
samwho logo, a small keyboard arranged in a 3x2 grid, with 'sam' at the top, and 'who' at the bottom. A cable runs from the top of the keyboard to off the top of the screen.

API Design: In The Wild

We've explored some guiding principles in previous posts, but we're yet to use our new-found skills. Let's take a break and look at some examples from real-world code you can find in use today, and how we might improve them.

# Go's math/big package

Most languages have a library for representing really big numbers, and Go is no exception.

Here's an example of it in action:

package main

import (
  "fmt"
  "math"
  "math/big"
  "os"
)

func main() {
  r := &big.Float{}
  _, ok := r.SetString("2.5")
  if !ok {
    fmt.Fprintln(os.Stderr, "Could not parse float")
    return
  }

  r2 := &big.Float{}
  r2.Mul(r, r)

  pi := big.NewFloat(math.Pi)

  result := &big.Float{}
  result.Mul(pi, r2)

  fmt.Printf("%v\n", result)
}

This prints out the area of a circle with radius 2.5.

A couple of things about this API bother me.

# SetString returns a boolean, not an error

It's conventional in Go to return two values from a function: the result, and an error. The caller then checks first if there's an error, and if there isn't they use the result. Go's math/big returns a result and a boolean.

Returning a boolean instead of an error isn't entirely without precedent. Indexing in to a map uses this pattern, for example.

m := make(map[string]string)
m["hello"] = "world"
s, ok := m["world"]
// `ok` will be false to represent that the key was not found

Everywhere other than map indexing, I assume an error is returned and spend a few seconds reading an error before realising my mistake.

In SetString, I'd prefer to see an error with some information about what went wrong in it. The error could say if the string passed in was empty, or not a floating point number, or anything that isn't "computer says no."

# Confusing use of function receivers

Operations in math/big use the function receiver as the result of the computation. In a.Add(b, c), a is set to the result of b + c. The library justifies this in the documentation:

By always passing in a result value via the receiver, memory use can be much better controlled. Instead of having to allocate new memory for each result, an operation can reuse the space allocated for the result value, and overwrite that value with the new result in the process.

But I would argue that this was a misguided trade-off. It makes a niche use-case easier at the expense of the common use-case. Imagine if the receiver was instead used as an argument, and the return value were the result.

package main

import (
  "fmt"
  "math"
  "math/big"
  "os"
)

func main() {
  r := &big.Float{}
  _, ok := r.SetString("2.5")
  if !ok {
    fmt.Fprintln(os.Stderr, "Could not parse float")
    return
  }

  result := r.Mul(r).Mul(big.NewFloat(math.Pi))
  fmt.Printf("%v\n", result)
}

Less mutation, less visual noise, less confusing use of a function receiver.

If it turns out there's still a real need to control the number of allocations, that could be a different set of functions on the package. Something like this:

package main

import (
  "fmt"
  "math"
  "math/big"
  "os"
)

func main() {
  result := &big.Float{}
  _, ok := result.SetString("2.5")
  if !ok {
    fmt.Fprintln(os.Stderr, "Could not parse float")
    return
  }

  big.MulFloat(result, result, result)
  big.MulFloat(result, result, big.NewFloat(math.Pi))

  fmt.Printf("%v\n", result)
}

The first argument is the result, and the second and third arguments are the things being operated on. This keeps the common use-case simple while still offering tools for more niche use-cases.

# Java's old File.exists() method

This is one of my favourites. What's wrong with this?

import java.io.File;

public final class Exists {
    public static void main(String... args) {
        if (args.length != 1) {
            System.err.println("usage: java Exists filepath");
            System.exit(1);
        }

        if (new File(args[0]).exists()) {
            System.out.println("File \"" + args[0] + "\" exists!");
        } else {
            System.out.println("File \"" + args[0] + "\" does not exist.");
        }
    }
}

You might guess that it's something to do with the .exists() method, and you would be right. I mentioned in the title that this is an older Java API, but why was it changed?

.exists() returns a boolean, and throws no exceptions. That last point is key. On Linux, Java issues a stat system call to figure out if a file exists or not. If it doesn't, stat fails with error code ENOENT. stat can also fail for a variety of other reasons.

Save the above code in a file called Exists.java and run the following commands:

$ javac Exists.java
$ java Exists Exists.java
File "Exists.java" exists!
$ java Exists DoesntExist.java
File "DoesntExist.java" does not exist.

So far, so good. Now let's use a tool called strace to simulate stat failing.

$ strace -f -qq -P Exists.java -e fault=stat:error=ENOMEM -- java Exists Exists.java
[pid  7199] stat("Exists.java", 0x7f7227c2b780) = -1 ENOMEM (Cannot allocate memory) (INJECTED)
File "Exists.java" does not exist.

We make it seem like out computer has run out of memory, and in this case .exists() erroneously reports that the file does not exist!

This is actually a bug and in the new Files.exists() it is more clear that false doesn't always mean the file does not exist. There's also readAttributes(), which will throw an IOException if there's stat fails, and this is what you should use if you need certainty.

The API design problem here was trying to represent dozens of possible outcomes in a data type that can only represent two. Whenever you're considering returning a boolean, stop and think if an enum would be better. Though, if you do this, try and avoid the classic tri-state boolean.

As an aside: Java's Math.abs() function has a similar design problem, points if you can spot it.

# Apache's EntityUtils.consume()

It's not so much EntityUtils.consume() that's the problem here, it's the fact that it was considered necessary at all. Let's look at an example and then talk through what's wrong.

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClients;

public final class App {
  public final static void main(String[] args) throws Exception {
    HttpClient client = HttpClients.createDefault();

    while (true) {
      HttpGet req = new HttpGet("http://example.com/");
      HttpResponse res = client.execute(req);
      System.out.println(res.getStatusLine());
    }
  }
}

If you were to run this, you wouldn't get more than a couple of successful responses back before the program hangs indefinitely. Why? Because we've forgotten to close our HTTP responses, of course. We need to add the following line after our System.out.println() call:

EntityUtils.consume(res.getEntity());

I've had to track this one down in production code more than once, and it always gets me thinking: why does this keep happening? Is the language at fault for not giving us better tools to manage closeable resources? Could the library have a better interface? Could it output a helpful message or exception when it detects we've made this mistake?

The answer is a mix of these. Java has an interface called Closeable, which is the best way for library authors to signal to a user that a thing needs closing. The thing that needs closing in an HttpResponse is buried inside of it. This means there's no way for an IDE to statically analyse the code and alert the user to the problem.

The authors of the library noticed this, and this is the code they now recommend you write:

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public final class App {
    public final static void main(String[] args) throws Exception {
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            while (true) {
                HttpGet req = new HttpGet("http://example.com/");
                try (CloseableHttpResponse res = client.execute(req)) {
                    System.out.println(res.getStatusLine());
                }
            }
        }
    }
}

This surfaces the Closeable interface, and now IDEs can alert you when you have forgotten to close the response. This is better, but does nothing to help the poor sod debugging their hanging process. What I'd love to see is a log message that's output before the program hangs saying that the user may have forgotten to close their responses.

Go handles resource reclamation well with the defer keyword.

package main

import (
  "os"
)

func main() {
  f, err := os.Create("myfile")
  if err != nil {
    panic(err)
  }
  defer f.Close()

  // do other stuff with `f`
}

The execution of f.Close() won't happen until the end of the function. This feature keeps creation and clean up close to each other, instead of screens apart. It's not perfect, but it does help.

# Assert equal in most languages

Lastly, the classic equality assertion in most language's unit testing libraries. I'll pick on Ruby, but this is present in almost all of them:

require "minitest/autorun"

class MyTest < Minitest::Test
  def test_equality
    a = 1
    b = 2
    assert_equal a, b
  end
end

assert_equal takes two parameters here: expected, and actual. But which is which? It's usually expected and then actual, but there are examples where that isn't true.

I don't know about you, but I always have to take a second to remember which way around it should be. "Does it matter, though?" you might ask. Yes, it often does. Each testing library will generate test failure output of the form: "expected X to be equal to Y." It's not always obvious which argument is X and which is Y, especially if you're comparing the output of two function calls.

Can we do better? Yes!

"Fluent" APIs do a really good job here, and is the way a lot of unit testing frameworks are going.

Here's an example in Ruby using the rspec library:

require 'spec_helper'

describe "a test" do
  it "should be equal" do
    a = 1
    b = 2
    a.should eql(b)
  end
end

And in Java using Google's Truth library:

int a = 1;
int b = 2;

assertThat(a).isEqualTo(b);
assertThat(a).isLessThan(b);

Fluent APIs read better and make it more obvious that actual always comes first. In API design terms, the interface does work to make it more difficult for the user to make mistakes. This is a good API design principle that we'll explore in a future post.

# Conclusion

Not much to conclude here. I hope you got some insight from looking at real examples, and I hope my suggested improvements make sense. If you don't agree, or you think things could be even better, I'm extremely excited to hear about it in the comments. :)