Over the last few weeks I have been finishing and improving the implementation of support for Multiple Home Units. A lot of the preliminary work was completed by Fendor. In short, multiple home units allow you to load different packages which may depend on each other into one GHC session. This will allow both GHCi and HLS to support multi-component projects more naturally. For a more complete overview of the motivation, you should first consult Fendor’s excellent introduction.
This post will concentrate on the interface and implementation of the feature. In particular we will talk about the solution to two of the issues he summarises at the end of the post, all the flags you need to know about and other limitations of the current implementation.
I originally implemented support for multiple components in Haskell Language Server (HLS) at the start of 2020 but the implementation has always been hacky. Stack also has rudimentary support for loading multiple home units into one GHCi session but doesn’t allow different options or dependencies per component. Given the increasing importance of HLS, fundamental issues which affect it are now being given more priority under the normal GHC maintenance budget. Multiple Home Units is one of the first bigger projects which we hope will allow the language server to be implemented more robustly.
Interface of Multiple Home Units
Imagine that you have a project which contains two libraries, named lib-core and lib. lib-core contains some utility functions which are used by lib, so when editing lib you are often also editing lib-core. Multiple home units can make this less painful by allowing a build tool to compile lib and lib-core with one command line invocation. How would a build tool make use of this feature?
In order to specify multiple units, the -unit @⟨filename⟩ flag is given multiple times with a response file containing the arguments for each unit. The response file contains a newline separated list of arguments.
ghc -unit @unitLibCore -unit @unitLib
where the unitLibCore response file contains the normal arguments that cabal would pass to --make mode.
-this-unit-id lib-core-0.1.0.0
-i
-isrc
LibCore.Utils
LibCore.Types
The response file for lib can specify a dependency on lib-core, so that modules in lib can use modules from lib-core.
-this-unit-id lib-0.1.0.0
-package-id lib-core-0.1.0.0
-i
-isrc
Lib.Parse
Lib.Render
Then when the compiler starts in --make mode it will compile both units lib and lib-core.
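To make the build tool's side of this concrete, here is a minimal sketch of how a tool might generate the two response files and the command line shown above. The UnitSpec type and the function names are hypothetical and exist only for this illustration; they are not part of GHC or Cabal.

```haskell
-- | Hypothetical description of one home unit, as a build tool might hold it.
data UnitSpec = UnitSpec
  { unitId      :: String     -- value passed to -this-unit-id
  , unitDeps    :: [String]   -- one -package-id per dependency
  , sourceDirs  :: [FilePath] -- one -i<dir> per source directory
  , unitModules :: [String]   -- modules to compile in this unit
  }

-- | Render the newline separated argument list for one response file.
responseFile :: UnitSpec -> String
responseFile u = unlines $
     ["-this-unit-id " ++ unitId u]
  ++ ["-package-id " ++ d | d <- unitDeps u]
  ++ ["-i"]                          -- bare -i, as in the files above
  ++ ["-i" ++ d | d <- sourceDirs u]
  ++ unitModules u

-- | Build the command line: one "-unit @file" pair per response file.
ghcCommand :: [FilePath] -> [String]
ghcCommand files = "ghc" : concat [["-unit", '@' : f] | f <- files]

-- The two units from the running example.
libCore :: UnitSpec
libCore = UnitSpec "lib-core-0.1.0.0" [] ["src"]
            ["LibCore.Utils", "LibCore.Types"]

lib :: UnitSpec
lib = UnitSpec "lib-0.1.0.0" ["lib-core-0.1.0.0"] ["src"]
        ["Lib.Parse", "Lib.Render"]
```

Writing responseFile libCore to unitLibCore and responseFile lib to unitLib, then running the arguments from ghcCommand ["unitLibCore", "unitLib"], reproduces the invocation above.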
There is also very basic support for multiple home units in GHCi: at the moment you can start a GHCi session with multiple units but only the :reload command is supported. Most commands in GHCi assume a single home unit, and so it is additional work (#20889) to modify the interface to support multiple loaded home units.
Options used when working with Multiple Home Units
There are a few extra flags which have been introduced specifically for working with multiple home units. The flags allow a home unit to pretend it’s more like an installed package, for example, specifying the package name, module visibility and reexported modules.
-working-dir ⟨dir⟩
It is common to assume that a package is compiled in the directory where its cabal file resides. Thus, all paths used in the compiler are assumed to be relative to this directory. When there are multiple home units the compiler is often not operating in that directory but instead in the directory where the cabal.project file is located. In this case the -working-dir option can be passed, which specifies the path from the current directory to the directory the unit assumes to be its root, normally the directory which contains the cabal file.
When the flag is passed, any relative paths used by the compiler are offset by the working directory. Notably this includes the -i and -I⟨dir⟩ flags.
-this-package-name ⟨name⟩
This flag papers over the awkward interaction between the PackageImports extension and multiple home units. When using PackageImports you can specify the name of the package in an import to disambiguate between modules with the same name which appear in multiple packages.
This flag allows a home unit to be given a package name so that you can also disambiguate between multiple home units which provide modules with the same name.
This solves one problem that Fendor described in his blog post.
-hidden-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which modules in a home unit should not be visible outside of the unit they belong to.
The main use of this flag is to recreate the difference between exposed and hidden modules for installed packages.
Fendor talked about the issue of module visibility in his blog post, and this flag solves the issue.
-reexported-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which modules are not defined in a unit but should be reexported. The effect is that other units will see this module as if it were defined in this unit.
The use of this flag is to replicate the reexported modules feature of packages when using multiple home units.
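As a rough mental model of how -hidden-module and -reexported-module combine, the set of modules that other units can import from a home unit can be sketched as below. This is a simplified illustration, not how GHC represents module visibility internally; the HomeUnit record and the Lib.Internal module are hypothetical.

```haskell
import Data.List (nub, (\\))

-- | Hypothetical summary of the module-related flags given to one home unit.
data HomeUnit = HomeUnit
  { definedModules    :: [String] -- modules compiled as part of the unit
  , hiddenModules     :: [String] -- one entry per -hidden-module flag
  , reexportedModules :: [String] -- one entry per -reexported-module flag
  }

-- | Modules another unit may import from this one: everything that is
-- defined and not hidden, plus the reexported modules.
visibleModules :: HomeUnit -> [String]
visibleModules u =
  nub ((definedModules u \\ hiddenModules u) ++ reexportedModules u)

-- | The lib unit with a hypothetical internal module hidden and one
-- module from lib-core reexported.
libUnit :: HomeUnit
libUnit = HomeUnit
  { definedModules    = ["Lib.Parse", "Lib.Render", "Lib.Internal"]
  , hiddenModules     = ["Lib.Internal"]
  , reexportedModules = ["LibCore.Types"]
  }
```

Under this model a unit depending on lib can import Lib.Parse, Lib.Render and LibCore.Types, but not Lib.Internal.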
Offsetting Paths in Template Haskell splices
When using Template Haskell to embed files into your program, traditionally the paths have been interpreted relative to the directory where the .cabal file resides. This causes problems for multiple home units as we are compiling many different libraries at once which have .cabal files in different directories.
For this purpose we have introduced a way to query the value of the -working-dir flag to the Template Haskell API. By using this function we can implement a makeRelativeToProject function which offsets a path which is relative to the original project root by the value of -working-dir.
{-# LANGUAGE TemplateHaskell #-}
import Data.FileEmbed ( embedFile )
import Language.Haskell.TH.Syntax ( makeRelativeToProject )
foo = $(makeRelativeToProject "./relative/path" >>= embedFile)
If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units.
A similar function already exists in the file-embed library. The version in template-haskell is more robust because it honours the -working-dir flag rather than searching the file system.
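The offsetting itself is simple. The sketch below is a pure model of what it means to offset a path by the value of -working-dir; it is not the template-haskell implementation, and the name offsetBy and the example paths are made up for illustration.

```haskell
import System.FilePath (isRelative, (</>))

-- | Offset a path by a unit's working directory.
-- Relative paths are placed under the working directory;
-- absolute paths are returned unchanged.
offsetBy :: FilePath -> FilePath -> FilePath
offsetBy workingDir path
  | isRelative path = workingDir </> path
  | otherwise       = path

-- | A file that lib-core refers to relative to its own cabal file,
-- as seen from the directory containing cabal.project.
logoPath :: FilePath
logoPath = offsetBy "lib-core" "data/logo.png"
```

The same rule applies to the relative paths given to -i and -I⟨dir⟩ when -working-dir is passed.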
Closure Property for Home Units
For tools or libraries using the GHC API there is one very important closure property which must be adhered to:
Any dependency which is not a home unit must not (transitively) depend on a home unit.
For example, if you have three packages p, q and r, and p depends on q which depends on r, then it is illegal to load both p and r as home units but not q, because q is a dependency of the home unit p which depends on another home unit r.
If you are using GHC from the command line then this property is checked for you, but if you are using the GHC API then you need to check this property yourself. If you get it wrong you will probably get some very confusing errors about overlapping instances.
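For GHC API clients, a check along the following lines may be a useful starting point. It is a minimal sketch over a plain dependency map, assuming you can already compute the direct dependencies of each unit; it does not use the GHC API, and all of the names are hypothetical.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type UnitId   = String
type DepGraph = Map.Map UnitId [UnitId] -- direct dependencies of each unit

-- | Every unit reachable from the given unit by following dependencies.
transitiveDeps :: DepGraph -> UnitId -> Set.Set UnitId
transitiveDeps g start = go Set.empty (direct start)
  where
    direct u = Map.findWithDefault [] u g
    go seen []       = seen
    go seen (u : us)
      | u `Set.member` seen = go seen us
      | otherwise           = go (Set.insert u seen) (direct u ++ us)

-- | Dependencies of the home units which are not home units themselves
-- but (transitively) depend on a home unit. Empty when the closure
-- property holds.
closureViolations :: DepGraph -> Set.Set UnitId -> [UnitId]
closureViolations g home =
  [ d
  | d <- Set.toList externalDeps
  , not (Set.null (transitiveDeps g d `Set.intersection` home))
  ]
  where
    reachable    = Set.unions [transitiveDeps g h | h <- Set.toList home]
    externalDeps = reachable `Set.difference` home

-- | The example from the text: p depends on q, which depends on r.
exampleGraph :: DepGraph
exampleGraph = Map.fromList [("p", ["q"]), ("q", ["r"]), ("r", [])]
```

With exampleGraph, loading p and r as home units reports q as a violation, while loading all three, or only p, reports none.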
Limitations of Multiple Home Units
There are a few limitations of the initial implementation which will be smoothed out on user demand.
- Package thinning/renaming syntax is not supported (#20888)
- More complicated reexports/renaming are not yet supported.
- It’s more common to run into existing linker bugs when loading a large number of packages in a session (for example #20674, #20689)
- Backpack is not yet supported when using multiple home units (#20890)
- Dependency chasing can be quite slow with a large number of modules and packages (#20891).
- Loading wired-in packages as home units is currently not supported (this only really affects GHC developers attempting to load template-haskell).
- Barely any normal GHCi features are supported (#20889). It would be good to support enough for ghcid to work correctly.
Despite these limitations, the implementation already works for nearly all packages. It has been tested on large dependency closures, including the whole of head.hackage, which is a total of 4784 modules from 452 packages.
Conclusion
With the first iteration of this implementation the necessary foundational aspects have been implemented to allow GHC API clients such as HLS to load multiple home units at once. The next steps are for the maintainers of build tools such as Cabal and Stack to modify their repl commands to support the new interface.
Well-Typed is able to work on GHC, HLS, Cabal and other core Haskell infrastructure thanks to funding from various sponsors. If your company is interested in contributing to this work, sponsoring maintenance efforts, or funding the implementation of other features, please get in touch.