GHC 9.2 introduces two new language extensions called OverloadedRecordDot
and
OverloadedRecordUpdate
, which overload record field access (myRecord.x.y)
and record field update (myRecord{x.y = a})
, respectively. In this short blog
post we will discuss these two extensions in the context of two libraries,
large-anon
and large-records
. Of course, OverloadedRecordDot
and
OverloadedRecordUpdate
can be useful in many other contexts too; the
principles we explain in this blog post should apply universally.
We will start by discussing large-anon, which is a library
that provides support for scalable anonymous records (records that do not have
to be defined before they can be used), along with a rich API for manipulating
such records. We will see how we can take advantage of OverloadedRecordDot
and
OverloadedRecordUpdate
to access and update fields of such records.
We then briefly discuss large-records, which provides
“regular” records but with a linear compilation time, as opposed to the
quadratic compilation time for standard records; this can really matter for
large records (records with more than say 20 or 30 fields). The usage of
OverloadedRecordDot
and OverloadedRecordUpdate
is pretty similar to their
usage in large-anon
, but as we will see, there is a small technical
complication.
OverloadedRecordDot
When the OverloadedRecordDot language extension
is enabled, then r.b
for some record r
and field b
is interpreted as
getField @"b" r
, where getField
is a method of the HasField
class defined
in GHC.Records
:
class HasField x r a | x r -> a where
getField :: r -> a
For example, the following code1, which uses the
large-anon library for anonymous records, will print
True
:
{-# LANGUAGE OverloadedRecordDot #-}
{-# OPTIONS_GHC -fplugin=Data.Record.Anon.Plugin #-}
import Data.Record.Anon
import Data.Record.Anon.Simple qualified as S
test :: IO ()
= print r.b
test where
r :: S.Record [ "a" := Int, "b" := Bool ]
= ANON { a = 5, b = True } r
The syntax also extends to nested fields; for example this code will print
Just 'x'
:
import Data.Record.Anon.Advanced qualified as A
test :: IO ()
= print r.b.d
test where
r :: S.Record [ "a" := Int, "b" := A.Record Maybe [ "c" := Bool, "d" := Char ] ]
= ANON { a = 5, b = ANON_F { c = Nothing, d = Just 'x' } } r
(The difference between “simple” anonymous records and “advanced” anonymous
records is that the latter support an additional functor argument; S.Record ["a" := Int, "b" := Bool ]
is a record with one field or type Int
and one
field of type Bool
, whereas A.Record Maybe [ "c" := Bool, "d" := Char ]
is a
record with one field of type Maybe Bool
and one field of type Maybe Char
.
For an in-depth introduction, see large-anon: Practical scalable anonymous
records for Haskell.)
OverloadedRecordUpdate
Unfortunately, the situation with
OverloadedRecordUpdate is not as clean as
with OverloadedRecordDot
. The current situation is that the syntax r{a = 6}
translates to setField @"a" r 6
, without specifying what setField
is,
precisely. The design and implementation of the corresponding class is not yet
finalized; the GHC wiki entry on overloaded
record fields contains a bunch of proposals.
In an attempt to guarantee forwards compatibility, therefore,
OverloadedRecordUpdate
currently requires
RebindableSyntax. This language extension changes
GHC so certain pieces of syntax no longer refer to their
standard definition, but rather to whatever definition is in scope; for example,
it makes it possible to change what
OverloadedStrings
does (by using a different
definition of fromString
) or to change how do
notation is desugared (by using a different definition of
>>=
). For records, it will use whatever definition of
getField
is in scope instead of using the HasField
class; and it will also
whatever definition of setField
is in scope, except that in this case there
is no standard definition yet.
The large-anon
library offers a module Data.Record.Anon.Overloading
which
can be used in combination with RebindableSyntax
: it restores all definitions
to their standard definition again, and it defines a setField
function which
depends on the HasField
class from the
record-hasfield package:
class HasField x r a | x r -> a where
hasField :: r -> (a -> r, a)
For example, the following code prints ANON { a = 6, b = True }
:
{-# LANGUAGE OverloadedRecordUpdate #-}
{-# LANGUAGE RebindableSyntax #-}
import Data.Record.Anon.Overloading
test :: IO ()
= print r { a = 6 }
test where
r :: S.Record [ "a" := Int, "b" := Bool ]
= ANON { a = 5, b = True } r
Like record field access, OverloadedRecordUpdate
also works for nested fields
(in this case OverloadedRecordDot
is also required):
test :: IO ()
= print r { b.d = Just 'a' }
test where
r :: S.Record [ "a" := Int, "b" := A.Record Maybe [ "c" := Bool, "d" := Char ] ]
= ANON { a = 5, b = ANON_F { c = Just True, d = Nothing } } r
large-records
The above examples all used the large-anon library for
anoynmous records; the situation with the large-records
is similar, with one somewhat annoying technical difference. The
large-records
library provides records whose compilation time scales linearly
with the size of the record, as opposed to the quadratic compilation time of
standard records. As discussed in Avoiding quadratic core code size with large records, it
needs to jump through quite a few hoops to achieve this. When a module
contains a record declaration such as
{-# OPTIONS_GHC -fplugin=Data.Record.Plugin #-}
{-# ANN type Person largeRecord #-}
data Person = Person { name :: String }
then large-records
turns it into something like this:
data Person = forall a. a ~ String => Person { name :: a }
This unusual representation achieves two goals: the strange existential type parameter prevent GHC from generating field accessors (one of the many sources of quadratic code), but we can still construct and pattern match on these records in a normal way.
As discussed in the GHC manual, the existential type
parameter will also prevent GHC from automatically derive a HasField
instance for fields such as name
. This wouldn’t be a problem (large-records
could generate the instance instead), if it wasn’t for this proviso in the
manual:
If a field has a higher-rank or existential type, the corresponding HasField constraint will not be solved automatically (..), but in the interests of simplicity we do not permit users to define their own instances either.
There is a GHC proposal to relax this
condition; until that proposal is implemented,
however, the use of OverloadedDot
with large-records
will depend on
RebindableSyntax
(which then bypasses the use of the standard HasField
class
entirely). Of course, as we saw in the previous section,
OverloadedRecordUpdate
depends on RebindableSyntax
anyway, so this isn’t
as much of a limitation as it might seem.
Conclusions
Record dot syntax has been available since GHC 8.6 through the use of the
record-dot-preprocessor. From 9.2 onwards
this syntax is now available without a preprocessor or plugin; for record field
access this is achieved through the use of the HasField class from
GHC.Records. For record field update the design is not
yet finalized, and we instead rely on RebindableSyntax
.
As of their latest releases, the large-anon and
large-records libraries, as well as the supporting
libraries large-generics and
typelet, are now compatible with GHC 9.2 and 9.4 and make use
of these extensions. It is also still possible to use record-dot-preprocessor
(you will need at least version 0.2.16 for GHC 9.4). The only complication is
that GHC 9.4 changed the order in which plugins are
loaded. This matters because the
large-records
plugin must be loaded before RecordDotPreprocessor
; to avoid
users having to use CPP to distinguish between GHC versions, large-records
now
offers Data.Record.Plugin.WithRDP
which combines both plugins (for
large-anon
the order does not matter).
Here and elsewhere we will only show the most important language pragmas and includes.↩︎