Checking Types Against the Real World in TypeScript

Jesse Hallett · Apr 12, 2018

This is a follow-up to Type-Driven Development with TypeScript.

The shape of data defines a program. There are important benefits to writing out types for your data.

Let's consider a Hacker News client, which consumes stories and other items from the Hacker News API. This is a TypeScript type that describes the format for stories:

type Story = {
  type: "story"
  by: string // username
  dead?: boolean
  deleted?: boolean
  descendants: number
  id: number
  kids?: number[] // numeric IDs of comments
  score: number
  text?: string // HTML content if the story is a text post
  time: number // seconds since Unix epoch
  title: string
  url?: string // URL of linked article if the story is not a text post
}

In JavaScript and other dynamically-typed languages, it is common to write a program without any explicit description of a data structure like Story. The shape of the data is implied in the code that manipulates the data. But that means anyone reading the code has to mentally reconstruct that shape from context, or refer to documentation outside of the program itself.

Catching mistakes

TypeScript provides the option of documenting data structures in the form of types. An obvious advantage is that the type checker can identify mistakes like typos in property names when accessing data. The Hacker News API uses the property name descendants; for some reason every time I try to type descendants I end up typing descendents by mistake. If I did not have a type checker to point out that Story does not have a property named descendents I could end up wasting a lot of time debugging!
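
As a small illustration of that kind of check, here is a sketch that uses a hypothetical, cut-down StorySummary type instead of the full Story type. The @ts-expect-error directive marks the line that the type checker rejects.

```typescript
// `StorySummary` is a hypothetical, cut-down version of `Story` used only
// for this illustration.
type StorySummary = {
  title: string
  descendants: number // number of comments
}

const example: StorySummary = { title: "Show HN: a thing", descendants: 3 }

function commentCount(story: StorySummary): number {
  return story.descendants
}

// The misspelled name is rejected at compile time. The directive on the next
// line tells the compiler that an error is expected on the line after it.
// @ts-expect-error Property 'descendents' does not exist on type 'StorySummary'.
const misspelled = example.descendents
```

Without the type annotation the misspelled access would quietly evaluate to undefined at runtime.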

List all changes

But this just scratches the surface. Types for data structures help keep programmers oriented. When a data structure has to change, you update its type, and the type checker lists every place in the program that needs to change to work with the new type.

Reducing cognitive load

When you come back to a program after you have been away from it long enough to forget how everything works, having descriptions of data structures right there in the code makes it much easier to understand what is going on. The same is true if more than one person is working on the project. Every detail that can be captured in types is one less detail that programmers have to carry in their heads. Reduced cognitive load leaves programmers with more energy for writing important business logic.

Bridging the gap with validators

But what if you make a mistake when you write the type? I mentioned that I had problems mixing up descendants and descendents. I actually made the same mistake the first time I wrote the Story type. The type checker cannot help me if I give it bad information from the start! Unfortunately, a static type checker cannot check types against data from an external API. But what you can do is to write a validator that will check at runtime that incoming data has the shape that you expect. Then you can extract a static type from the validator that is guaranteed to match any values that pass validation.
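
To make the idea of a runtime check concrete, here is a minimal hand-written sketch. The TitledItem type and the isTitledItem function are hypothetical names made up for this illustration. Note that in this version the type and the check are written separately, so nothing guarantees that they agree.

```typescript
// Hypothetical, cut-down shape used only for this illustration.
type TitledItem = {
  id: number
  title: string
}

// A user-defined type guard: the runtime checks are meant to justify the
// static claim `value is TitledItem`, but the compiler takes that on trust.
function isTitledItem(value: unknown): value is TitledItem {
  if (typeof value !== "object" || value === null) {
    return false
  }
  const record = value as Record<string, unknown>
  return typeof record["id"] === "number" && typeof record["title"] === "string"
}

// `JSON.parse` gives no static information about the shape of its result.
const incoming: unknown = JSON.parse('{"id": 1, "title": "Hello"}')

if (isTitledItem(incoming)) {
  // Inside this branch TypeScript treats `incoming` as a `TitledItem`.
  console.log(incoming.title.toUpperCase()) // prints "HELLO"
}
```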

There is a nifty library called io-ts that works like magic. Instead of the Story type above, we can define a validator using io-ts combinators:

import * as t from "io-ts"

// The `V` in `StoryV` is short for `Validator`
const StoryV = t.type({
  type: t.literal("story"), // value of property called `type` is the exact string `"story"`
  by: t.string, // username
  dead: optional(t.boolean),
  deleted: optional(t.boolean),
  descendants: t.number, // number of comments
  id: t.number,
  kids: optional(t.array(t.number)), // IDs of comments on an item
  score: t.number,
  text: optional(t.string), // HTML content if story is a text post
  time: t.number, // seconds since Unix epoch
  title: t.string,
  url: optional(t.string), // URL of linked article if the story is not text post
})

// The `optional()` combinator is defined later in the article

This looks similar to the Story type from the beginning of the post. StoryV expresses the properties of objects coming from the Hacker News API with a type for each property. (The t.type() combinator produces a validator that expects an object with the given property names and types.) But this time the "types" for each property are actually values supplied by io-ts: t.number, t.string, t.boolean, etc. Values can be referenced at runtime; types cannot. With StoryV we can validate any arbitrary JavaScript value by calling StoryV.decode(whateverValue). If the given value is not an object with the expected properties then decode will return an error value.

From validator to type

What makes io-ts uniquely valuable is that it simultaneously defines a runtime validator and a static type.

If StoryV.decode() returns a success result, then TypeScript knows that the resulting value has a descendants property and does not have a descendents property.

If a value passes validation, then it is guaranteed to match that static type, and we can use it to check the correctness of the rest of the program. If a value does not pass, then you will get a failure with a clear point in the program where it should be handled.

For example:

import fetch from "node-fetch"

async function fetchTitle(storyId: number): Promise<string> {
  const res = await fetch(
    `https://hacker-news.firebaseio.com/v0/item/${storyId}.json`,
  )
  const data = await res.json()

  // If the data that is fetched does not match the `StoryV` validator then this
  // line will result in a rejected promise.
  const story = await decodeToPromise(StoryV, data)

  // This line does not type-check because TypeScript can infer from the
  // definition of `StoryV` that `story` does not have a property called
  // `descendents`.
  const ds = story.descendents

  // TypeScript infers that `story` does have a `title` property with a value of
  // type `string`, so this line passes type-checking.
  return story.title
}

// `decodeToPromise` is defined later in the article

Validating incoming data at runtime allows the program to fail fast if there is a mismatch between the data and the program's expectations. In development, that makes it easy to catch bugs early: any mismatch is identified immediately at the point where you call decodeToPromise. You don't need fixtures or unit tests to check data ingress. Yes, the validation step could lead to failures in production that you would not have seen otherwise if data comes in some unexpected shape under some condition - but the alternative is for the program to limp along with unknown data leading to possibly-undefined behavior. Failing fast is better!

To minimize unnecessary validation errors it is a good idea to make your validators permissive in what they accept. For example, err on the side of marking properties as optional if there is any possibility that those properties will be absent in some cases. And you can exclude properties from the validator that you are not going to use in your program.
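
Here is a rough sketch of what "permissive" means in practice, written by hand rather than with io-ts. The LinkInfo type and isLinkInfo function are hypothetical. The check names only the properties the program uses, treats url as optional, and ignores everything else on the object.

```typescript
// Hypothetical shape: only the properties this imaginary program reads.
type LinkInfo = {
  title: string
  url?: string
}

function isLinkInfo(value: unknown): value is LinkInfo {
  if (typeof value !== "object" || value === null) {
    return false
  }
  const record = value as Record<string, unknown>
  const url = record["url"]
  return (
    typeof record["title"] === "string" &&
    // `url` may be absent, but if it is present it must be a string.
    (url === undefined || typeof url === "string")
  )
}

// Extra properties such as `score` and `text` are not mentioned in the
// check, so they do not cause a failure.
const textPost: unknown = { title: "Ask HN: ...", text: "<p>...</p>", score: 12 }
const linkPost: unknown = { title: "An article", url: "https://example.com", score: 40 }
// A property that is present with the wrong type is still rejected.
const malformed: unknown = { title: "Bad", url: 42 }

console.log(isLinkInfo(textPost), isLinkInfo(linkPost), isLinkInfo(malformed))
// prints: true true false
```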

Referencing types produced using io-ts

StoryV replaces the hand-written Story type - so we no longer have a way to refer to the type of story objects. But we can get that type back! The io-ts library provides a type operator called t.TypeOf that extracts a static type from a validator. We can define a new Story type like this:

type Story = t.TypeOf<typeof StoryV>

Every TypeScript value has a type. You can reference and manipulate the value at runtime. Likewise, you can reference and manipulate the type at type check time. The expression typeof StoryV uses TypeScript's built-in typeof operator to get the typecheck-time representation of StoryV which conveniently holds a complete description of the shape of story objects. That description is wrapped in a validator type; t.TypeOf pulls the shape description out into an independent type.
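
If the extraction step feels mysterious, the following sketch may help. It is a much-simplified stand-in that I am assuming for illustration, not the actual io-ts implementation: the Validator interface, the TypeOf operator, and the object() helper are all hypothetical. The point is only that a validator's type can carry the shape as a type parameter, and a conditional type can pull it back out.

```typescript
// A hypothetical, much-simplified stand-in for an io-ts validator. The type
// parameter `A` records the static type that the runtime check vouches for.
interface Validator<A> {
  is(value: unknown): value is A
}

// Type-level extraction: given a validator type, recover its `A`.
type TypeOf<V> = V extends Validator<infer A> ? A : never

const str: Validator<string> = {
  is: (value: unknown): value is string => typeof value === "string",
}
const num: Validator<number> = {
  is: (value: unknown): value is number => typeof value === "number",
}

// Build an object validator from a record of property validators.
function object<P extends Record<string, Validator<any>>>(
  props: P,
): Validator<{ [K in keyof P]: TypeOf<P[K]> }> {
  const validators: Record<string, Validator<unknown>> = props
  return {
    is: (value: unknown): value is { [K in keyof P]: TypeOf<P[K]> } => {
      if (typeof value !== "object" || value === null) {
        return false
      }
      const record = value as Record<string, unknown>
      return Object.keys(validators).every((key) => {
        const validator = validators[key]
        return validator !== undefined && validator.is(record[key])
      })
    },
  }
}

const AuthorV = object({ name: str, karma: num })

// `typeof AuthorV` is the validator's type; `TypeOf` pulls out the shape,
// which here is { name: string; karma: number }.
type Author = TypeOf<typeof AuthorV>

const author: Author = { name: "pg", karma: 1 }
```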

You can use the computed Story type in annotations in the rest of your program:

function formatStory(story: Story): string {
  return `"${story.title}" submitted by ${story.by}`
}

When data comes in different shapes

The Hacker News API publishes more than just stories. The /v0/item/ endpoint alone also provides comments, job postings, polls, and poll options, which all have different shapes. We want to be able to fetch an item from that endpoint and use a runtime check on the type property in the returned object to determine what type of item it is. And we want the type checker to verify the correctness of the whole process.

Let's use io-ts to create some more item definitions. These will be similar to the definition of StoryV. Here are abbreviated definitions (see the accompanying code for complete definitions):

const CommentV = t.type(
  {
    type: t.literal("comment"),
    parent: t.number,
    text: t.string, // HTML content
    /* ... */
  },
  "Comment",
) // The second argument is a label that makes validation messages nicer.
type Comment = t.TypeOf<typeof CommentV>

const JobV = t.type(
  {
    type: t.literal("job"),
    text: optional(t.string), // HTML content if job is a text post
    url: optional(t.string), // URL of linked page if the job is not text post
    /* ... */
  },
  "Job",
)
type Job = t.TypeOf<typeof JobV>

const PollV = t.type(
  {
    type: t.literal("poll"),
    descendants: t.number, // number of comments
    parts: t.array(t.number),
    /* ... */
  },
  "Poll",
)
type Poll = t.TypeOf<typeof PollV>

const PollOptV = t.type(
  {
    type: t.literal("pollopt"),
    poll: t.number, // ID of poll that includes this option
    score: t.number,
    text: t.string, // HTML content
    /* ... */
  },
  "PollOpt",
)
type PollOpt = t.TypeOf<typeof PollOptV>

The Hacker News item API could return a story or any of these types, which means that the type of values from the item API is a union of all five types. More specifically the type is a tagged union: the type property in API responses is a tag that we can use to distinguish between types within the union. A tagged union validator looks like this:

const ItemV = t.taggedUnion(
  "type", // the name of the tag property
  [CommentV, JobV, PollV, PollOptV, StoryV],
  "Item", // a label to make validation messages nicer
)
type Item = t.TypeOf<typeof ItemV>

This is why it was important to use the t.literal() combinator instead of t.string for the type of the type property in each item validator: using t.literal() with a literal string makes the exact string value available to the type checker. With that information, TypeScript can use type guards to narrow the type of an item to a specific item type based on the value of item.type. For example:

function formatItem(item: Item): string {
  switch (item.type) {
    case "story":
      // Stories have titles, so this is ok.
      return `"${item.title}" submitted by ${item.by}`
    case "job":
      return `job posting: ${item.title}`
    case "poll":
      // Only polls have a `parts` property - this would not pass type checking
      // without the type guard.
      const numOpts = item.parts.length
      return `poll: "${item.title}" - choose one of ${numOpts} options`
    case "pollopt":
      // In some item types `text` can be undefined, but not in poll options.
      return `poll option: ${item.text}`
    case "comment":
      const excerpt =
        item.text.length > 60 ? item.text.slice(0, 60) + "..." : item.text
      return `${item.by} commented: ${excerpt}`

    // TypeScript would report an error if some code path could reach the end
    // of the function without returning a string. No `default` case is needed
    // here because TypeScript infers that all possible item types have been
    // handled.
  }
}
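
If you prefer to make the exhaustiveness check explicit, one common pattern is a helper that accepts only the never type. The sketch below uses hypothetical, cut-down item types and a hypothetical assertNever helper; it does not depend on io-ts.

```typescript
// Hypothetical, cut-down item types used only for this illustration.
type StoryItem = { type: "story"; title: string }
type CommentItem = { type: "comment"; text: string }
type PollItem = { type: "poll"; title: string; parts: number[] }
type AnyItem = StoryItem | CommentItem | PollItem

// Accepting `never` means a call only type-checks where TypeScript has
// ruled out every member of the union.
function assertNever(value: never): never {
  throw new Error(`unhandled item: ${JSON.stringify(value)}`)
}

function summarize(item: AnyItem): string {
  switch (item.type) {
    case "story":
      return `story: ${item.title}`
    case "comment":
      return `comment: ${item.text}`
    case "poll":
      return `poll with ${item.parts.length} options`
    default:
      // If a new member is added to `AnyItem` without a matching case, this
      // call stops type-checking because `item` is no longer `never`.
      return assertNever(item)
  }
}

console.log(summarize({ type: "poll", title: "Best editor?", parts: [1, 2] }))
// prints: poll with 2 options
```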

By the way, io-ts also supports intersections, untagged unions, and other fun combinators. Oh, and io-ts supports Flow too - not just TypeScript!

Next steps

This was just a quick introduction to what io-ts is capable of, and techniques for applying type-checking to external data. The concepts here are not limited to consuming API data: I recommend similar use of io-ts validators when working with data loaded from a database, serialized messages between micro-services, user input, or any other case where data can come in from outside the program.

The best way to cement your understanding of a pattern is to experiment with it. I encourage you to check out the accompanying code and try adding some features. One idea is to display ID numbers with story titles and add an option so that if the user passes an ID as a command-line argument when running the script it displays a link and some comments on the corresponding story.

Appendix A: definition for optional

In a hand-written definition for an object type you can use a question mark to indicate that a property might be absent:

type Story = {
  text?: string
  url?: string
}

There is no easy way to do that with io-ts because the argument to t.type() is an actual object, and object properties are either present or not present. There is another combinator, t.partial(), that describes an object where all properties are optional. The idiomatic way to represent an object where some properties are optional is to use an intersection of t.type() for required properties, and t.partial() for optional properties:

const StoryV = t.intersection(
  [
    t.type({
      // required properties
      type: t.literal("story"),
      descendants: t.number, // number of comments
    }),
    t.partial({
      // optional properties
      text: t.string, // HTML content if story is a text post
      url: t.string, // URL of linked article if the story is not text post
    }),
  ],
  "Story",
)

I used a different pattern in this article. I didn't want to introduce too many concepts all at once; so I didn't introduce intersections and nested definitions right away.

My optional() combinator is a union of the given type with undefined. Technically this implies that we expect the given property to be present in every case, but that the value might be undefined. In practice, that distinction often does not matter, and io-ts will validate an object that is missing a required property if the type of that property is allowed to be undefined. But note that io-ts might make object validation more strict in the future!
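
The distinction between "absent" and "present but undefined" can be seen in plain TypeScript. The two type names below are made up for this illustration.

```typescript
// Property may be left out entirely.
type UrlOptional = { url?: string }
// Property must be written out, but its value may be `undefined`.
type UrlOrUndefined = { url: string | undefined }

const missing: UrlOptional = {}
const explicit: UrlOrUndefined = { url: undefined }
// const rejected: UrlOrUndefined = {} // does not type-check: `url` is required

// Reading the property gives the same result in both cases...
console.log(missing.url === explicit.url) // prints: true
// ...but the objects differ in whether the key exists at all.
console.log("url" in missing) // prints: false
console.log("url" in explicit) // prints: true
```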

This is the definition of optional:

function optional<RT extends t.Any>(
  type: RT,
  name: string = `${type.name} | undefined`,
): t.UnionType<
  [RT, t.UndefinedType],
  t.TypeOf<RT> | undefined,
  t.OutputOf<RT> | undefined,
  t.InputOf<RT> | undefined
> {
  return t.union<[RT, t.UndefinedType]>([type, t.undefined], name)
}

That is adapted from the maybe combinator given in the io-ts README. It is pretty dense for readers who do not have much experience with advanced TypeScript use cases. This is the sort of function that should be put into a library, and I might do that in the future.

Appendix B: definition for decodeToPromise

The built-in io-ts method StoryV.decode() returns an Either value, which is a type from the package fp-ts that can hold either an error or a successful result. It is similar to a promise except that it represents an immediate result, not an asynchronous one. The examples in this article use promises; so I wrote a function, decodeToPromise, to put validation results into the more familiar Promise type. Here is the definition:

import { reporter } from "io-ts-reporters"

// Apply a validator and get the result in a `Promise`
function decodeToPromise<T, O, I>(
  validator: t.Type<T, O, I>,
  input: I,
): Promise<T> {
  const result = validator.decode(input)
  return result.fold(
    (errors) => {
      const messages = reporter(result)
      return Promise.reject(new Error(messages.join("\n")))
    },
    (value) => Promise.resolve(value),
  )
}

fold() is a method on the Either type. It is used to collapse a possibility of success and a possibility of error into one definite value. TypeScript checks that the error-case callback and the value-case callback have compatible return types. One callback or the other will run depending on whether the result is an error or a success value.
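
For readers who have not met Either before, here is a simplified stand-in that shows what folding does. It is a sketch for illustration only and not the fp-ts definition: the Either type, the free-standing fold function, parsePositive, and toPromise are all hypothetical.

```typescript
// A simplified stand-in for an Either type. This is not the fp-ts definition.
type Either<E, A> =
  | { tag: "left"; error: E }
  | { tag: "right"; value: A }

// Collapse both possibilities into a single value of type `R`. Both
// callbacks must return `R`, which is what lets the type checker confirm
// that they are compatible.
function fold<E, A, R>(
  result: Either<E, A>,
  onError: (error: E) => R,
  onSuccess: (value: A) => R,
): R {
  return result.tag === "left" ? onError(result.error) : onSuccess(result.value)
}

// An immediate computation that can fail, reported as an Either.
function parsePositive(input: string): Either<string, number> {
  const n = Number(input)
  return n > 0
    ? { tag: "right", value: n }
    : { tag: "left", error: `not a positive number: ${input}` }
}

// The same idea as `decodeToPromise`: errors reject, successes resolve.
function toPromise<A>(result: Either<string, A>): Promise<A> {
  return fold(
    result,
    (error) => Promise.reject<A>(new Error(error)),
    (value) => Promise.resolve(value),
  )
}

console.log(fold(parsePositive("42"), () => -1, (n) => n)) // prints: 42
console.log(fold(parsePositive("abc"), () => -1, (n) => n)) // prints: -1
```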

decodeToPromise also invokes an io-ts reporter to translate a set of validation errors into a readable message.
