This seventh edition of our GHC activities report marks the one-year anniversary of when we started sending out these regular updates on the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of June and July 2021.
You can find the previous editions collected under the ghc-activities-report tag.
A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Facebook, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.
If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch. If you are interested in working with us, we recently announced a hiring round.
Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!
Team
Currently, Ben Gamari, Andreas Klebinger, Matthew Pickering and Zubin Duggal are working primarily on GHC-related tasks. In addition, Alfredo Di Napoli has been doing some work on GHC in the last two months, next to other projects he is working on. Many others within Well-Typed, including Adam Gundry, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, contributed to GHC more occasionally.
Release management
Ben has been handling backports and release planning for the 9.2.1 and 9.0.2 releases.
Matt worked on the structure of the bindists produced by hadrian; they are now much more like the ones produced by make. This also fixes some other issues with the Windows packaging from the 9.0.1 release. (!6133)
Matt worked on fixing some issues with the 8.10.5 Darwin packaging which caused several tickets to be reported about hangs (#19950, #19968, #20004, !5992, !6003).
Zubin fixed some bugs with LLVM version detection in the HEAD and 8.10.5 releases (#19973, #19828, #19959).
Zubin has a patch in progress (!5965) that will allow the GHC library to be re-installed, so that GHC API clients (e.g. Haskell Language Server) are not restricted to the boot library versions that shipped with the compiler. This paves the way for smaller binary distributions of GHC, since users would be able to recompile the GHC library to have access to things like profiling and debug information, instead of having to ship all these configurations in the binary distribution.
Compiler error messages
Alfredo continued his work on GHC’s new diagnostic API (#18516, #19905). After completing the foundational work, he started to port existing GHC errors and warnings to the new API as well as fine-tuning the design (!6087, !6249, !6165, !6129, !5924, !5872, !5719). He also created a few newcomer-friendly tickets to help with the conversion work: these tickets have a lot of context to guide first-time GHC contributors towards their first merged MR. See for example #20118 and #20119.
Alfredo is also finalising an introductory blog post to the new GHC diagnostic API which will be published in the next few weeks.
Frontend
Matt has started preliminary work on fixing a long-standing bug where mixing optimisation levels would lead to optimisations not firing in some cases (#12847, #13002, #20021, #8635, #9370). With the patch (!6080), the pragmas are always read from interface files, but we are careful not to consult them when optimisation is turned off. It turns out that using some information from interface files improves compiler performance because simpler code is produced.
Matt has continued on his crusade to refactor and modernise GHC’s driver code (!5987, !6178). This time the code that drives --make has been in his sights. Amongst other things, the patch tries to separate the specification of the build graph from the execution of the build graph, so it is possible to describe different execution strategies. The patch also simplifies (and specifies) how module cycles are compiled, which has long been a pain point for people modifying this area.
Matt fixed the -Wunused-packages warning to work correctly with reexported packages (!6130).
Zubin fixed a bug affecting Backpack users that resulted in a compiler panic instead of a type error in certain cases (#19244).
Ben introduced driver support for Clang’s --target flag, improving robustness of builds in multi-architecture environments (e.g. Darwin with Rosetta, #20162).
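The idea of separating the specification of a build graph from its execution can be illustrated with a small, self-contained sketch. This is not GHC’s driver code: the types ModuleNode and BuildStep and the functions buildPlan, dryRun and runSequentially are hypothetical names chosen for illustration. The sketch relies only on Data.Graph from the containers package, whose stronglyConnComp function returns components with dependencies before their dependents and makes module cycles explicit.

```haskell
module Main (main) where

import Data.Graph (SCC (..), stronglyConnComp)

-- | Hypothetical description of a module: its name and the modules it imports.
data ModuleNode = ModuleNode
  { modName    :: String
  , modImports :: [String]
  }

-- | One step of a build plan: a single module, or a group of mutually
-- recursive modules that has to be handled as a unit.
data BuildStep
  = CompileOne String
  | CompileCycle [String]
  deriving (Eq, Show)

-- | Specification: compute an ordered plan from the module graph.
-- Dependencies appear before the modules that import them.
buildPlan :: [ModuleNode] -> [BuildStep]
buildPlan nodes = map toStep (stronglyConnComp edges)
  where
    edges = [ (modName n, modName n, modImports n) | n <- nodes ]
    toStep (AcyclicSCC m) = CompileOne m
    toStep (CyclicSCC ms) = CompileCycle ms

-- | Execution strategy 1: only describe what would be done.
dryRun :: [BuildStep] -> [String]
dryRun = map describe
  where
    describe (CompileOne m)    = "compile " ++ m
    describe (CompileCycle ms) = "compile cycle: " ++ unwords ms

-- | Execution strategy 2: run a caller-supplied action for each step, in order.
runSequentially :: (BuildStep -> IO ()) -> [BuildStep] -> IO ()
runSequentially = mapM_

main :: IO ()
main = do
  let plan = buildPlan
        [ ModuleNode "Main" ["A", "B"]
        , ModuleNode "A" ["B"]
        , ModuleNode "B" []
        ]
  mapM_ putStrLn (dryRun plan)
  runSequentially print plan
```

Because the plan is an ordinary value, a different strategy (for instance a parallel one) could consume the same plan without changing how it is computed.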
Haddock and documentation
- Zubin rebased and improved the long-pending hi Haddock work, which should allow Haddock to generate documentation using only GHC interface (.hi) files (!6224). This greatly simplifies Haddock’s implementation, and allows it to skip parsing, renaming and type-checking files if the appropriate information already exists in the interface files, speeding it up greatly in such cases. This also reduces Haddock’s peak memory consumption. Identifiers in Haddock comments will also be renamed by GHC itself, and the results are also serialized into .hi files for tooling to make use of. A number of Haddock bugs were fixed along the way (#20034, haddock 30, haddock 665, haddock 921).
GHCi and developer experience
Zubin improved GHCi completion to better support Unicode characters and operators, fixing a bug in the 9.2 pre-release, which erased the entire line the user typed if completion was triggered on an operator name (#20101, !6160).
Matt has fixed a number of 9.2 regressions involving GHCi (!6032, !6090).
Profiling and debugging
Matt took ghc-debug for a test on a puzzling profile presented by a user (#20065), which seemed to have a large discrepancy between live bytes and the information reported in the profile. It turned out that the application had a severely fragmented heap, which was easy to diagnose and observe using ghc-debug.
Andreas is still working on ways to make perf and similar tools work well on Haskell code. He wrote a blog post with more details for the curious.
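A gap between live data and memory actually held by the runtime can be spotted cheaply before reaching for a heavier tool. The following sketch does not use ghc-debug; it only reads the statistics exposed by GHC.Stats in base and compares the live bytes after the most recent GC with the memory in use. The helper overheadRatio is a hypothetical name, and no particular threshold is implied. The program assumes it was compiled with -rtsopts and started with +RTS -T so that statistics are collected.

```haskell
module Main (main) where

import Data.Word (Word64)
import GHC.Stats
  ( GCDetails (gcdetails_live_bytes, gcdetails_mem_in_use_bytes)
  , RTSStats (gc)
  , getRTSStats
  , getRTSStatsEnabled
  )
import System.Mem (performMajorGC)

-- | Ratio of memory held by the RTS to live data.
-- Returns Nothing when there is no live data to compare against.
overheadRatio :: Word64 -> Word64 -> Maybe Double
overheadRatio memInUse live
  | live == 0 = Nothing
  | otherwise = Just (fromIntegral memInUse / fromIntegral live)

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are disabled; run with +RTS -T"
    else do
      -- Force a major collection so the figures describe the current heap.
      performMajorGC
      stats <- getRTSStats
      let details = gc stats
          live    = gcdetails_live_bytes details
          inUse   = gcdetails_mem_in_use_bytes details
      putStrLn ("live bytes:   " ++ show live)
      putStrLn ("mem in use:   " ++ show inUse)
      putStrLn ("in use / live: " ++ maybe "n/a" show (overheadRatio inUse live))
```

A ratio that stays high across many collections is only a hint; confirming fragmentation still requires inspecting the heap itself.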
Compiler performance
- Matt squashed a leak in the simplifier which should reduce maximum residency for all programs, and in particular reduced maximum residency in the test from 2GB to 1.3GB (!6202).
- Matt found a very subtle space leak caused by a reference being retained on the stack longer than necessary (!6185).
- Andreas improved register allocation performance under high register pressure (!6209).
Runtime performance
- Ben characterised and worked to resolve a number of runtime performance regressions observed in GHC 9.0.1 (#19557, #19701, #19822, #19727, #19769, #19790).
Compiler correctness
Ben wrote a blog post motivating the keepAlive# operation introduced in GHC 9.0, as well as several of the considerations relevant to its design.
Ben performed a refactoring of GHC’s “adjustor” mechanism used by some foreign calls, fixing a bug manifesting with some newer libffi versions (#20051) while fixing a few nearby libffi-related bugs (#19869).
Ben collected and characterised a number of issues manifesting on AArch64/Darwin which were ultimately found to be due to the rather peculiar ABI of that platform (#20079). He performed an audit (#20085) of Hackage packages looking for similar issues in common packages and wrote a blog post providing advice to users for writing portable, robust foreign library bindings.
Ben carried out a thorough refactoring of the internals of the process library, fixing a subtle correctness bug manifesting under Darwin (#19994) while reducing process spawn cost in many cases.
Matt started looking into an old static pointers correctness issue (#16981) which a few users had commented on recently. We know what the problem is, but it seems that to fix the ticket robustly a more invasive change will be needed to how static pointers are compiled.
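User code rarely calls keepAlive# directly; it is usually reached through wrappers in base. As a minimal sketch, the following uses withForeignPtr, which, to our understanding, is implemented in terms of keepAlive# in the base library shipped with GHC 9.0 and later. The function roundTrip is a hypothetical example, not taken from the blog post.

```haskell
module Main (main) where

import Data.Int (Int32)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr)
import Foreign.Storable (peek, poke)

-- | Write a value into freshly allocated memory and read it back.
-- withForeignPtr guarantees that the ForeignPtr (and hence the memory
-- behind the raw pointer) stays alive for the whole continuation.
roundTrip :: Int32 -> IO Int32
roundTrip x = do
  fp <- mallocForeignPtr :: IO (ForeignPtr Int32)
  withForeignPtr fp $ \p -> do
    poke p x
    peek p

main :: IO ()
main = roundTrip 42 >>= print
```

The point of the wrapper is that the raw pointer p is only valid while fp is reachable, so keeping fp alive across the continuation is what makes the poke and peek safe.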
Runtime system
Andreas enabled the pthread-based RTS ticker implementation by default for the single-threaded RTS (!6158), improving compatibility with foreign libraries using signal-based alarms.
Ben diagnosed and fixed a bug in the GHC runtime’s threading abstraction leading to severe GC performance regressions in 9.2 and master (#20144).
Ben diagnosed and fixed a subtle bug in the non-moving garbage collector due to an inconsistency in size units in the array write barrier implementation (#19715).
CI and infrastructure
Matt has added support to head.hackage to run a test-suite of programs. This replaces tests in GHC’s testsuite which depended on external packages and hence were never executed during normal test runs. Now it will be straightforward to add tests with more complicated dependencies.
Matt worked on GHC’s performance dashboard infrastructure, using the data collected during head.hackage and validation builds to monitor GHC’s compilation performance.
Ben migrated GHC build artifacts and Docker images to local storage to improve service availability.
Ben refactored the GHC CI infrastructure on Darwin to make it uniform with other platforms and to reduce the potential for nix paths leaking into binary distributions (#20131).