Writing Typeclass Derivations with Magnolia

Magnolia may be used to write methods which materialize typeclass instances for arbitrary case classes and sealed traits, by combining existing typeclasses Magnolia finds in scope. However, for any given typeclass, Magnolia needs a definition for how it should combine the typeclasses it finds for each of the parameters in a case class, and how it chooses between the typeclasses it finds for each subtype of a sealed trait.

This tutorial explains how to write that definition.

The Derivation Object

Magnolia’s design expects each typeclass derivation to be defined in its own object, bundling the methods which will be used to build the new typeclasses alongside the method which gets bound to the Magnolia macro.

A basic derivation object will follow this structure,

import magnolia._
import scala.language.experimental.macros

object MyDerivation {
  type Typeclass[T] = ???
  def combine[T](caseClass: CaseClass[Typeclass, T]): Typeclass[T] = ???
  def dispatch[T](sealedTrait: SealedTrait[Typeclass, T]): Typeclass[T] = ???
  implicit def gen[T]: Typeclass[T] = macro Magnolia.gen[T]
}

We will need to provide implementations for the combine and dispatch methods, and the Typeclass type constructor, though these implementations will depend on the nature of the typeclass interface we are deriving for.

If you are also the author of the typeclass, the typeclass’s companion object makes a reasonable choice for the derivation object. An implicit method defined on the companion object, such as gen above, will automatically be in scope during any implicit search for an instance of the typeclass. If you are not the author of the typeclass, then you will not have access to the companion object, and users of your derivation will need to import the implicit gen method into scope to use Magnolia to generate typeclass instances, or call it directly every time.
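The scoping difference can be seen without Magnolia at all. The sketch below is only an illustration: Show here is a locally defined typeclass whose companion supplies an instance for Int, and ExternalInstances is a hypothetical object standing in for a derivation written by someone who does not own the typeclass.

```scala
// A locally defined typeclass; its companion object is part of the implicit
// scope of every search for a Show[...] instance.
trait Show[T] { def show(value: T): String }

object Show {
  // Found automatically: no import is needed at the use site.
  implicit val showInt: Show[Int] = new Show[Int] {
    def show(value: Int): String = value.toString
  }
}

// Hypothetical object written by someone who does not own Show.
object ExternalInstances {
  implicit val showBoolean: Show[Boolean] = new Show[Boolean] {
    def show(value: Boolean): String = if (value) "yes" else "no"
  }
}

object ScopeDemo {
  def render[T](value: T)(implicit s: Show[T]): String = s.show(value)

  // Resolved through the companion object of Show.
  val viaCompanion: String = render(42)

  // Resolved only because of the explicit import.
  val viaImport: String = {
    import ExternalInstances._
    render(true)
  }
}
```

Removing the import from viaImport would make that line fail to compile, which is the situation users of a derivation outside the companion object are in until they import gen.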

You cannot derive for more than one typeclass in the same object.

The Type Constructor

Firstly, Magnolia needs to know how to construct the typeclass type. Most typeclasses have just a single type parameter, which will be the generic type Magnolia will derive the typeclass for, and in these cases, such as for Show, we can simply write,

type Typeclass[T] = Show[T]

though some typeclasses may take more than one parameter. If, for example, we had an Encoder[F, T] typeclass, parameterized on both the type it is encoding (T), and the format it is encoding to (F), we would need to provide a derivation for each format, fixing the format parameter in the Typeclass definition, like so,

type Typeclass[T] = Encoder[Json, T]
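As a rough sketch of what fixing the format parameter achieves, the following defines a hypothetical Encoder and a placeholder Json type (neither comes from Magnolia or any real JSON library) and shows that the alias behaves as a one-parameter type constructor.

```scala
// Placeholder format type for this sketch only.
final case class Json(text: String)

// Hypothetical two-parameter typeclass: F is the output format,
// T is the type being encoded.
trait Encoder[F, T] { def encode(value: T): F }

object JsonDerivation {
  // Fixing F = Json leaves a type constructor with exactly one parameter,
  // which is the shape a derivation object needs.
  type Typeclass[T] = Encoder[Json, T]

  val intEncoder: Typeclass[Int] = new Encoder[Json, Int] {
    def encode(value: Int): Json = Json(value.toString)
  }

  // Deliberately naive: string contents are not escaped.
  val stringEncoder: Typeclass[String] = new Encoder[Json, String] {
    def encode(value: String): Json = Json("\"" + value + "\"")
  }
}
```

A second format would get its own derivation object with its own alias, for example one fixing F to an XML type.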

The combine Method

The combine method is typically the most difficult part of the derivation object definition, and its implementation will depend heavily on whether the typeclass is covariant or contravariant. This distinction will sometimes apply to the variance annotation on the generic parameter to the typeclass. But even in the common cases where typeclasses are invariant in their generic parameter, that generic parameter will typically appear in either covariant or contravariant positions in the typeclass interface.

For example, Show is a contravariant typeclass, because its abstract method takes a value of the generic type, and returns a fixed type (String),

trait Show[T] { def show(value: T): String }

whereas Default, which returns an instance of the generic type, is a covariant typeclass (whose method, incidentally, takes no parameters),

trait Default[T] { def default: T }
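One way to see the difference in practice is to adapt each typeclass to a new type. The sketch below uses the two traits from the text; the helper names contramap and map, and the Meters type, are introduced here purely for illustration.

```scala
trait Show[T] { def show(value: T): String } // T is consumed
trait Default[T] { def default: T }          // T is produced

object VarianceDemo {
  // To adapt a consumer, convert the new input type back to the old one.
  def contramap[A, B](s: Show[A])(f: B => A): Show[B] = new Show[B] {
    def show(value: B): String = s.show(f(value))
  }

  // To adapt a producer, convert the old output into the new type.
  def map[A, B](d: Default[A])(f: A => B): Default[B] = new Default[B] {
    def default: B = f(d.default)
  }

  val showInt: Show[Int] = new Show[Int] {
    def show(value: Int): String = value.toString
  }
  val defaultInt: Default[Int] = new Default[Int] { def default: Int = 0 }

  // Illustrative wrapper type.
  final case class Meters(value: Int)

  val showMeters: Show[Meters] = contramap(showInt)((m: Meters) => m.value)
  val defaultMeters: Default[Meters] = map(defaultInt)(Meters(_))
}
```

The functions point in opposite directions: Show needs a way to take a Meters apart, while Default needs a way to build one. The same asymmetry shapes the two styles of combine.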

Let’s start with the contravariant case, using Show as an example.

We need to implement a method which returns a new instance of Show[T], having been provided with a single value: a CaseClass instance. For a Show typeclass, we need to implement,

def combine[T](caseClass: CaseClass[Show, T]): Show[T] = ???

The CaseClass value provides everything we know about the particular case class we need to derive a new Show for. But as we are programming to a generic interface, we actually know very little concretely. Looking at the type of CaseClass[Show, T], only the type constructor, Show, is known; T is universally quantified, so our code must work for any case class.

CaseClass provides a method,

def parameters: Seq[Param[Show, T]]

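which gives us access to a sequence of objects corresponding to each of the parameters in the case class we are deriving for. The Param type has several useful members, including label (the name of the parameter), typeclass (the typeclass instance for the parameter’s type) and dereference (which extracts the parameter’s value from an instance of the case class).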

An interesting point to be aware of is that the Param instance has methods which take or return values with a type corresponding to the parameter type. But any code we write will not know anything about that type, and in a sequence of Param values (of unknown length) these types will be abstract, and likely different. Yet we must find a way to make use of them to implement the new typeclass!

Each Param instance has a type member, called PType, which is the abstract type corresponding to that parameter in the case class. Thankfully, given any particular value of Param[Show, T], say p, the compiler knows that the typeclass method will return an instance of Show[p.PType], which means that it has a method, show, which takes a value of type p.PType (and returns, invariably, a String). We can apply an instance of T, say t, to p.dereference to get a p.PType value back, so by combining these two, p.typeclass.show(p.dereference(t)) now gives us an instance of String for an instance of the case class, T. So we have eliminated the existentially-quantified p.PType type.
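The reasoning about PType can be tried out on a small model. MiniParam below is a simplified stand-in written for this sketch, reduced to the members discussed in the text; it is not Magnolia’s real Param definition, and the two instances are written by hand where Magnolia’s macro would generate them.

```scala
trait Show[T] { def show(value: T): String }

// Simplified stand-in for Param. NOT Magnolia's actual definition.
trait MiniParam[T] {
  type PType
  def label: String
  def typeclass: Show[PType]
  def dereference(value: T): PType
}

object PTypeDemo {
  final case class Person(name: String, age: Int)

  val showString: Show[String] = new Show[String] {
    def show(value: String): String = value
  }
  val showInt: Show[Int] = new Show[Int] {
    def show(value: Int): String = value.toString
  }

  // Ascribing MiniParam[Person] hides the concrete PType from callers.
  val nameParam: MiniParam[Person] = new MiniParam[Person] {
    type PType = String
    val label: String = "name"
    val typeclass: Show[String] = showString
    def dereference(value: Person): String = value.name
  }
  val ageParam: MiniParam[Person] = new MiniParam[Person] {
    type PType = Int
    val label: String = "age"
    val typeclass: Show[Int] = showInt
    def dereference(value: Person): Int = value.age
  }

  // p.PType is unknown here, but typeclass and dereference agree on it,
  // so the expression type-checks and the abstract type never escapes.
  def render[T](p: MiniParam[T], t: T): String =
    p.typeclass.show(p.dereference(t))

  val rendered: Seq[String] =
    Seq(nameParam, ageParam).map(p => render(p, Person("Ada", 36)))
}
```

The two parameters have different PType members, yet the same render function handles both, because the only thing it relies on is that each parameter is consistent with itself.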

We can, of course, apply this same function to each element of the sequence of Params, and given an instance of the case class type, T, we can produce a Seq[String]. Each Param also has a label, so we could choose to prefix each parameter with its label. That code might be,

val paramStrings = caseClass.parameters.map { p =>
  p.label+"="+p.typeclass.show(p.dereference(t))
}

A reasonable Show instance might join all of these named parameters, separated by commas, inside parentheses, and prefixed with the name of the case class, which we can obtain from the typeName member of our CaseClass instance, like so,

caseClass.typeName+paramStrings.mkString("(", ",", ")")

and we now have a single String.

But you may have noticed that I have not yet explained where the instance of the case class, t, comes from. Remember, we are not deriving a String; we are deriving a typeclass which converts a T instance into a String, and the t is simply the parameter to the show method in the new typeclass we are constructing. Putting this all together, here is a full implementation of combine for the Show typeclass.

def combine[T](caseClass: CaseClass[Show, T]): Show[T] = new Show[T] {
  def show(t: T): String = {
    val paramStrings = caseClass.parameters.map { p =>
      p.label+"="+p.typeclass.show(p.dereference(t))
    }

    caseClass.typeName+paramStrings.mkString("(", ",", ")")
  }
}

Most contravariant typeclass derivations will take a similar form. But this doesn’t work for covariant typeclasses, such as a Decoder, where we must implement a method which takes a String and constructs a new instance of a case class from that String. In these cases, we must do some similar type-gymnastics, but mostly with different methods from the Magnolia API.

Our implementation of combine will look similar to that of Show, except that we now need to find a way to implement the abstract decode method of a new Decoder[T] typeclass.

def combine[T](caseClass: CaseClass[Decoder, T]): Decoder[T] = new Decoder[T] {
  def decode(s: String): T = ???
}

Given that we have no T-typed inputs, we have only one way to produce a T, which is to use the construct method on our CaseClass[Decoder, T] instance.

construct takes a lambda of type Param[Decoder, T] => Return. This lambda needs to operate on each parameter in the case class in turn, producing a value of the appropriate type for each parameter, and then construct returns an instance of the case class type, T, composed of those parameters.

The return type of the parameter lambda is Return, and is, of course, dependent on the Param value. Unfortunately, Scala’s type system cannot represent a Function type where the return type is dependent on the input type, and still maintain reasonable type inference, so Magnolia compromises on the type safety of the return value for the lambda. It is very important to ensure that the lambda returns a value of the right type, but it should nonetheless be obvious during the first usage of a derived typeclass instance if the implementation is incorrect, as a ClassCastException will be almost guaranteed to occur. A future version of Scala may support better typing of function types.
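The consequence of that compromise can be imitated in a few lines. The construct function below is a toy written for one fixed case class, with the lambda’s result typed as Any; Magnolia’s real construct is generic and generated by the macro, so treat this only as a model of the failure mode.

```scala
object ConstructDemo {
  final case class Person(name: String, age: Int)

  // Toy stand-in for construct: the lambda's result is typed as Any,
  // so the casts below are not checked at compile time.
  def construct(makeParam: String => Any): Person = {
    val values = Seq("name", "age").map(makeParam)
    Person(values(0).asInstanceOf[String], values(1).asInstanceOf[Int])
  }

  // Each parameter receives a value of the right type.
  val ok: Person = construct {
    case "name" => "Ada"
    case _      => 36
  }

  // This also compiles, but age receives a String, which fails at runtime.
  val failure: Option[ClassCastException] =
    try { construct(_ => "oops"); None }
    catch { case e: ClassCastException => Some(e) }
}
```

The mistake surfaces on the first call rather than at compile time, which matches the behaviour the text warns about.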

For now, though, we need to construct a new case class instance, by specifying how each parameter should be constructed. In the case of Decoder, we start with a String, so we will assume that we have elsewhere implemented a method, say parse, which will read the contents of the string (in whatever format we are decoding) and convert its contents into a Map of keys and values, where the keys are, by convention, the parameter names from the case class.

For each parameter, p, we can then look up its String value in the map, and use the typeclass corresponding to the parameter to decode it to the appropriate type, like so,

def combine[T](caseClass: CaseClass[Decoder, T]): Decoder[T] =
  new Decoder[T] {
    def decode(str: String): T = {
      val valueMap = parse(str)
      caseClass.construct { p => p.typeclass.decode(valueMap(p.label)) }
    }
  }
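The parse method is left to the reader above. As one possible sketch, assuming a made-up flat format such as name=Ada,age=36 with no escaping or nesting, it could look like the following. The hand-written personDecoder shows roughly what the derived decoder amounts to for one concrete case class; Decoder is defined locally to match the signature used in the text.

```scala
trait Decoder[T] { def decode(str: String): T }

object DecodeDemo {
  // Hypothetical format: "name=Ada,age=36". No escaping or nesting.
  def parse(str: String): Map[String, String] =
    str.split(",").iterator.map(_.trim).filter(_.nonEmpty).map { pair =>
      pair.split("=", 2) match {
        case Array(key, value) => key.trim -> value.trim
        case _ =>
          throw new IllegalArgumentException(s"Expected key=value but found: $pair")
      }
    }.toMap

  val stringDecoder: Decoder[String] = new Decoder[String] {
    def decode(str: String): String = str
  }
  val intDecoder: Decoder[Int] = new Decoder[Int] {
    def decode(str: String): Int = str.toInt
  }

  final case class Person(name: String, age: Int)

  // Written by hand: look up each parameter by its label, then delegate
  // to the decoder for that parameter's type.
  val personDecoder: Decoder[Person] = new Decoder[Person] {
    def decode(str: String): Person = {
      val valueMap = parse(str)
      Person(
        stringDecoder.decode(valueMap("name")),
        intDecoder.decode(valueMap("age"))
      )
    }
  }
}
```

A missing key makes valueMap(...) throw NoSuchElementException; a production decoder would more likely report errors through its return type.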

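Typically, the combine method for a covariant typeclass will parse its input into some intermediate form, call construct on the CaseClass instance, and, inside the lambda, use each parameter’s label to find the relevant part of the input and that parameter’s typeclass to turn it into a value.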

Note that the construct API doesn’t expose the sequence of parameters it operates on. Given that for any given case class, the user has no control over the length of the sequence of parameters, and no way to distinguish between different parameters, for simplicity and safety, Magnolia takes care of mapping the function over all of the parameters.

A full implementation of a Decoder derivation is given in the Magnolia examples.

As well as case classes, Magnolia will derive for value classes (exactly one parameter) and case objects (zero parameters). Derivations for these types will work without any special modifications to the above code. But if there is a need to distinguish them from ordinary case classes, the CaseClass type has methods isValueClass and isObject which will be true for each of those cases, respectively.

The dispatch Method

If you only care to derive for case classes, and don’t need to handle sealed traits, dispatch may be left unimplemented. However, in most cases, it can be implemented more easily than the combine method, so it’s worth including for completeness.

Its signature looks like this,

def dispatch[T](sealedTrait: SealedTrait[Typeclass, T]): Typeclass[T]

The purpose of the dispatch method, regardless of the variance of the typeclass, is to choose the correct, or “best”, subtype to use to handle the input. Magnolia’s API presents a sequence of Subtype[Typeclass, T] instances to choose from.

In the case of contravariant typeclasses, where we have an instance of the sealed trait type as input, this choice is very easy: the instance of the sealed trait will have exactly one matching Subtype instance, which will be the only possibility. The SealedTrait API provides a convenience method, also called dispatch, for dealing with exactly these cases. It takes an instance of the sealed trait type to choose which Subtype to use, and a lambda from that Subtype instance to a return value. Because the trait is marked sealed, we know that exactly one of the Subtype instances will match.

The API for Subtype is very similar to that of Param. It provides access to the type name of the subtype (label), and the corresponding typeclass instance (typeclass), but in place of a dereference method, Subtype has a partial function called cast which will cast a value with the type of the sealed trait to the type of that subtype.

An instance of Scala’s PartialFunction may or may not be defined for the inputs it accepts, and normally we would want to check before passing one a value it can’t handle, but inside the lambda passed to SealedTrait#dispatch, given that we have already selected a Subtype instance on the basis of an input value, it is safe to call subtype.cast on that same value and know that it will return a value with that subtype’s type. This is important because the typeclass value for that Subtype will need an instance of this same type, and not the sealed trait’s type.
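For readers less familiar with PartialFunction, here is a small standalone illustration using only the Scala standard library. The Shape hierarchy and castToCircle are invented for this sketch; castToCircle is a hand-written analogue of what cast does for a single subtype.

```scala
object CastDemo {
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Square(side: Double) extends Shape

  // Hand-written analogue of cast for the Circle subtype.
  val castToCircle: PartialFunction[Shape, Circle] = { case c: Circle => c }

  val shape: Shape = Circle(1.0)

  // The function is defined for circles and undefined for squares.
  val definedForCircle: Boolean = castToCircle.isDefinedAt(shape)
  val definedForSquare: Boolean = castToCircle.isDefinedAt(Square(2.0))

  // lift gives a total function returning Option instead of throwing.
  val lifted: Option[Circle] = castToCircle.lift(Square(2.0))

  // Safe, because we know this value is a Circle.
  val narrowed: Circle = castToCircle(shape)
}
```

Applying castToCircle directly to a Square would throw a MatchError, which is why selecting the subtype first matters.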

We can therefore implement dispatch for a Show typeclass like so,

def dispatch[T](sealedTrait: SealedTrait[Show, T]): Show[T] = new Show[T] {
  def show(t: T): String = sealedTrait.dispatch(t) { subtype =>
    // Narrow t to the matching subtype before handing it to that subtype's Show.
    subtype.typeclass.show(subtype.cast(t))
  }
}

Most implementations for contravariant type classes won’t deviate far from this pattern.

For covariant typeclasses, however, we generally have a freer choice of which subtype to use, as that choice is not dictated by our input type, and different typeclasses may make that choice in different ways. One possibility, used in the example that follows, is to let the input name the subtype it represents.

For our covariant Decoder typeclass, our input is a String, so we will assume the existence of a getTypeName method which can extract the name of a type from the string, to be used to match against the subtypes names. Having found a matching Subtype instance, we will then use its typeclass value to decode the input String. The string will be exactly the same input as was passed to the sealed trait typeclass, so there’s no need to process or cast it; we are simply delegating processing it to a different typeclass.

Here is what the implementation of dispatch looks like for a Decoder,

def dispatch[T](sealedTrait: SealedTrait[Decoder, T]): Decoder[T] =
  new Decoder[T] {
    def decode(str: String): T = {
      val name = getTypeName(str)
      // Throws NoSuchElementException if no subtype has this name.
      val subtype = sealedTrait.subtypes.find(_.label == name).get
      subtype.typeclass.decode(str)
    }
  }
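The getTypeName method is assumed rather than defined above. The sketch below supplies one possible version for a made-up format such as Circle(1.5), and models the subtype lookup with a plain sequence of label and decoder pairs standing in for Magnolia’s Subtype instances. It also reports an unknown name explicitly instead of calling get on the Option.

```scala
trait Decoder[T] { def decode(str: String): T }

object DispatchDemo {
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Square(side: Double) extends Shape

  // Hypothetical format: "Circle(1.5)". The type name precedes "(".
  def getTypeName(str: String): String = str.takeWhile(_ != '(').trim

  // Helper for this sketch: the text between the parentheses.
  private def argument(str: String): String =
    str.dropWhile(_ != '(').drop(1).takeWhile(_ != ')').trim

  // Stand-ins for the label and typeclass of each Subtype.
  val subtypes: Seq[(String, Decoder[Shape])] = Seq(
    "Circle" -> new Decoder[Shape] {
      def decode(str: String): Shape = Circle(argument(str).toDouble)
    },
    "Square" -> new Decoder[Shape] {
      def decode(str: String): Shape = Square(argument(str).toDouble)
    }
  )

  val shapeDecoder: Decoder[Shape] = new Decoder[Shape] {
    def decode(str: String): Shape = {
      val name = getTypeName(str)
      subtypes.find(_._1 == name) match {
        // The whole input is passed on unchanged to the chosen decoder.
        case Some((_, decoder)) => decoder.decode(str)
        case None =>
          throw new IllegalArgumentException(s"Unknown subtype: $name")
      }
    }
  }
}
```

Whether to throw, return an Option, or fall back to a default subtype when no name matches is a design decision for the typeclass author; this sketch throws only to keep the Decoder signature from the text.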

As you can see, even this implementation is quite simple.

Summary

With a definition for Typeclass[T], and implementations of combine and dispatch, we have everything required for Magnolia to provide derivation. Several examples of typeclasses with their derivation objects exist in the examples directory of the Magnolia source.

If these members are put into the same object, say Show or Decoder in the examples above, alongside a gen[T] method, bound to macro Magnolia.gen[T], it should be possible to call Show.gen[SomeType] or Decoder.gen[SomeType] and have Magnolia derive an appropriate typeclass for SomeType.

If the definition of SomeType refers to types for which no typeclass can be found or derived, Magnolia should report which typeclass could not be found, including a “stack trace” if it is deeply nested in the ADT structure.

Note that Magnolia will only produce debugging output for explicit calls to the gen method; not for invocations through implicit search. The reason for this is that implicit search may invoke the Magnolia macro, which may fail to derive a suitable implicit, but implicit search may subsequently continue and find a matching implicit elsewhere. Magnolia has no way of knowing whether implicit search will ultimately fail, and any failure output it produces may turn out to be a false-positive.

Getting more help

Magnolia is still being actively developed, and has so far only had exposure to a limited number of typeclass derivations, so everyone is still exploring its capabilities. If you get stuck implementing a derivation for a particular typeclass, ask on Gitter, or send a tweet to @propensive.