This is the third edition of our new GHC activities report, which is intended to provide regular updates on the work on GHC and related projects that we are doing at Well-Typed. This edition covers roughly the months of October and November 2020.
The previous editions are here:
A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and Facebook, are providing us with funding to do this work. We also recently announced a partnership with Hasura with a focus on working toward better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.
If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.
Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!
Release management
GHC 9.0.1 continues to move forward since the alpha1 release at the end of September. We are currently finishing the work to resolve the long-standing issue #17760, which we have deemed a blocker for the release due to the large number of similar-looking reports that we have seen in recent months. Work on this ticket is proceeding apace.
Concurrently, GHC 8.10.3 is moving ahead, with debugging on Windows to try to fix the remaining known linking issues on that platform (#18496, #18548, #18946, #17236, #18634).
Runtime performance and measurement
Ben Gamari has finished his bottom-up profiling (!3871) branch, which allows the user to easily add cost-centers to all call-sites of a set of functions. This is particularly useful in conjunction with a global view provided by the ticky-ticky profiler: ticky-ticky can spotlight functions which are entered more often than expected under the user’s cost model. Bottom-up profiling can then be used to work up from these functions to determine by which paths through the call graph these functions were called.
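To make the idea concrete, here is a minimal sketch of what adding cost-centers at call-sites looks like when done by hand with ordinary SCC pragmas. It is not taken from the !3871 branch; the function names and cost-centre labels are hypothetical, and the branch is described above as automating this kind of annotation rather than requiring it to be written manually.

```haskell
import Data.List (foldl')

-- Hypothetical function that ticky-ticky might show as entered
-- more often than expected.
expensive :: Int -> Int
expensive n = foldl' (+) 0 [1 .. n]

-- Two hypothetical call-sites, each wrapped in its own cost centre so
-- that a profile can attribute the cost of 'expensive' to its caller.
fromParser :: Int -> Int
fromParser n = {-# SCC "expensive_from_parser" #-} (expensive n)

fromRenderer :: Int -> Int
fromRenderer n = {-# SCC "expensive_from_renderer" #-} (expensive (2 * n))
```

When such a module is compiled with profiling enabled (-prof) and the program is run with +RTS -p, the two labels show up as separate entries in the profile. Without profiling the pragmas are simply ignored.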
Meanwhile, Andreas Klebinger has been putting the final touches on his tag inference (#16970) patch. This large patch introduces a new pass to GHC’s pipeline, allowing tag checks to be elided in cases where the nature of the closure can be statically determined.
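As a rough illustration of the kind of code where the nature of a closure is known statically, consider a value taken out of a strict constructor field. The types and names below are hypothetical, and whether the new pass elides the check in this exact program is an assumption rather than something stated in the patch.

```haskell
-- Hypothetical example types.
data Shape = Circle !Double | Square !Double

-- Strict field: a Shape is evaluated to weak head normal form
-- before it is stored in a Box.
data Box = Box !Shape

-- 's' comes out of a strict field, so it is already evaluated when we
-- scrutinise it. In principle the "is this closure evaluated?" check
-- before the case could be skipped here.
area :: Box -> Double
area (Box s) =
  case s of
    Circle r -> pi * r * r
    Square w -> w * w
```

The program behaves the same with or without the optimisation; only the generated code for the case expression would differ.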
Andreas also noticed and fixed an instance of suboptimal code generation in an inner loop of the garbage collector (#12416).
Compiler performance
Ben has finished a patch (#17958, !2952) improving GHC’s treatment of nullary type synonyms and the Type kind during typechecking. Happily, this refactoring reduces compiler allocations by several percent in some tests of the performance testsuite and improves compiler runtime measurably when building the Cabal library.
Ben introduced a set of new compile-time performance testing jobs into GHC’s CI pipeline, using the compilation of a selection of GHC’s core libraries as an instrument for characterising GHC compilation performance. This, in addition to the head.hackage infrastructure (described in a previous post), will allow us to build a long-term record of the evolution of GHC’s compile-time performance.
Andreas pushed through a refactoring of GHC’s TyCon data structures, improving the efficiency of some very common operations.
Andreas made small improvements to GHC’s internal Map and Set types (!4230). The main benefit is code clarity, but there is also a (very small) compile-time benefit, as we can often avoid an indirection on look-up.
Portability and infrastructure
Ben has finished and merged his branch fixing the runtime system’s treatment of memory-ordering (!2642). This work has taken on renewed urgency with the coming of Apple’s ARM silicon, which is known to take advantage of the ARM architecture’s weak memory ordering guarantees. For this reason we will be backporting this branch to the 8.10 and 9.0 release series.
Ben has implemented a refactoring (!4234) of the Hadrian build system making its notion of flavours more compositional, easing configuration for GHC developers.
Ben, Andreas and Tamar Christina have cleaned up the known Windows CI failures and have removed the accept_failure flag from the Windows validation jobs. This is a major milestone in the life of our Windows CI infrastructure and will allow us to catch regressions sooner.
Debugging tools and observability
Matthew Pickering and David Eichmann have been working on ghc-debug, a debugging framework for GHC programs which provides a facility for traversing the heap of another Haskell process. The goal of this work is to enable rich profiling of live production systems with minimal performance overheads.
Matt is also continuing his work on source provenance information for info tables. This will allow for more flexible profiling and could serve as the basis for a portable stack backtrace mechanism in the future.
Ben has been working on making the compiler itself more observable by improving timings output (!4449), enabling eventlogging by default (!4448), adding eventlog integration into the Ticky-Ticky profiler (!3085), and exploring some ideas for reducing the cost of eventlogging with an eye towards enabling eventlogging in the default runtime system.
Error messages refactoring
Alfredo Di Napoli has been picking up the work that Alp started this summer on structured error messages, with the aim of rebasing Alp’s changes on top of the current HEAD and carrying forward the conversion of errors from textual values to richer types. Apart from the wiki, a good place to track progress is the GHC ticket (#18516) containing the technical discussion.
Records
Adam Gundry’s proposal to simplify the handling of DuplicateRecordFields has been accepted by the GHC Steering Committee. Adam has also been working on a refactoring that will fix various bugs with DuplicateRecordFields (!4467).
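For readers unfamiliar with the extension, here is a minimal sketch of what DuplicateRecordFields permits: two record types in one module sharing a field name. The types and helper functions are hypothetical, and the example deliberately sticks to pattern matching, which does not depend on how selector disambiguation is handled.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

-- Two hypothetical records that both have a field called 'name'.
-- Without the extension, the second declaration is rejected.
data Person = Person { name :: String, age :: Int }

data Company = Company { name :: String, employees :: [Person] }

-- Matching on the constructor makes it unambiguous which 'name' is meant.
personName :: Person -> String
personName Person { name = n } = n

companyName :: Company -> String
companyName Company { name = n } = n
```

Using a bare name as a selector function is where ambiguity arises, because GHC then has to work out which record type is intended.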
Bug fixes
Ben finished and merged a set of patches re-enabling code unloading and dynamic linking support in GHC’s runtime linker (#16525, !3842). This should significantly improve the usability of Template Haskell splicing in long-running compiler processes (e.g. as used by ghc-ide, see ghc-ide#854).
Andreas diagnosed and fixed a bug (!4362) in the native code generator’s lowering of 64-bit integer comparisons on i386. Not only does the new approach fix a glaring bug, but it should also be a bit faster.
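To show why this lowering is easy to get wrong, here is a sketch in Haskell of how a signed 64-bit less-than can be expressed with 32-bit comparisons only, as a 32-bit target must do. This is the textbook scheme and is not claimed to be the instruction sequence chosen in !4362; the names split64 and lt64 are made up for this example.

```haskell
import Data.Bits (shiftR)
import Data.Int (Int32, Int64)
import Data.Word (Word32)

-- Split a signed 64-bit value into a signed high word and an
-- unsigned low word, mirroring a register pair on a 32-bit machine.
split64 :: Int64 -> (Int32, Word32)
split64 x = (fromIntegral (x `shiftR` 32), fromIntegral x)

-- Signed 64-bit (<) using only 32-bit comparisons: the high words are
-- compared as signed values, and only if they are equal are the low
-- words compared, as unsigned values.
lt64 :: Int64 -> Int64 -> Bool
lt64 a b =
  let (ah, al) = split64 a
      (bh, bl) = split64 b
  in ah < bh || (ah == bh && al < bl)
```

The subtle part is the mixed signedness: comparing the low words as signed values, or the high words as unsigned values, gives wrong answers for some inputs.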
Ben continues work on fixing the long-standing soundness issue, #17760. After working through several possible designs, we have arrived at a final compromise design which has largely been implemented.
Ben has fixed a few bugs afflicting the Windows linker, resolving a major source of segmentation faults when compiling Template Haskell splices on Windows (#15808). However, there is still more to be done in this area (#18496).
Ben has fixed a long-standing bug (#18733) in recompilation checking which could result in incorrect code generation for some programs containing recursive binding groups.
Ben and a contributor worked to identify a critical soundness bug (#18919) in the garbage collector’s treatment of MVars and in the process fixed (!4408) some invariants in the runtime system’s heap sanity checker.
Ben refactored the bytecode interpreter to allow it to compile programs containing very large functions (#14334).
Ben has fixed a long-standing bug (#18043) in the flush behavior of GHC’s event-log which could result in loss of events in some cases.
Ben worked to fix a bug (#18857) in the LLVM code generator breaking the memcmp primitive when compiling with the -fllvm flag.