This is the twenty-third edition of our GHC activities report, which describes the work Well-Typed are doing on GHC, Cabal, HLS and other parts of the core Haskell toolchain. The current edition covers roughly the months of March to May 2024. You can find the previous editions collected under the ghc-activities-report tag.
Sponsorship
We are delighted to offer new Haskell Ecosystem Support Packages to provide commercial users with access to Well-Typed’s experts while investing in the Haskell community and its technical ecosystem. If your company is using Haskell, read more about our offer, or get in touch with us today, so we can help you get the most out of the toolchain, and continue our essential maintenance work.
Many thanks to our existing sponsors who make this work possible: Anduril and Juspay. In addition, we are grateful to Mercury for funding specific work on improved performance for developer tools on large codebases, to the Sovereign Tech Fund for funding work on Cabal, and to the HLS Open Collective for funding work on HLS. Of course, Haskell tooling is a large community effort, of which Well-Typed’s contributions are just one part. We are immensely grateful to everyone contributing to the Haskell ecosystem!
Team
The GHC/Cabal/HLS team at Well-Typed currently consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal, Sam Derbyshire, Rodrigo Mesquita, Hannes Siebenhandl and Mikolaj Konarski. In addition, many others within Well-Typed are contributing to GHC more occasionally: Adam Gundry is secretary to the GHC Steering Committee, and this month’s report includes contributions from Duncan Coutts and Finley McIlwaine.
GHC Releases
Ben released GHC 9.10.1 in May. This includes some significant steps forward, including:
- The new GHC2024 language edition
- The implementation of Ben’s exception backtrace proposal
- Improved mechanisms for debugging and performance analysis
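As a small illustration of the new language edition, here is a minimal sketch of a module that opts into GHC2024 and relies on two of the extensions it enables by default (LambdaCase and DerivingStrategies). The Shape type and area function are hypothetical names chosen for this example, and the file needs GHC 9.10 or later to compile.

```haskell
{-# LANGUAGE GHC2024 #-}

-- Hypothetical example type; not taken from GHC itself.
data Shape
  = Circle Double
  | Rect Double Double
  deriving stock (Show, Eq) -- DerivingStrategies is part of GHC2024

-- LambdaCase is part of GHC2024, so no extra pragma is needed for \case.
area :: Shape -> Double
area = \case
  Circle r -> pi * r * r
  Rect w h -> w * h

main :: IO ()
main = print (area (Rect 2 3)) -- prints 6.0
```

Under GHC2021 the same module would need explicit LambdaCase and DerivingStrategies pragmas.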
Zubin released GHC 9.6.5 in April, and published a blog post for the GHC developer blog on GHC release plans. Check out the GHC status page for up-to-date information on releases.
HLS
Thanks to ongoing support from the HLS Open Collective, Zubin released HLS 2.8.0.0 in May, and Well-Typed continue working on keeping HLS maintained and up to date with new GHC releases. In particular, Zubin and Hannes are working towards supporting GHC 9.10 and preparing to release HLS 2.9.0.0.
Cabal
Mikolaj is working as a maintainer of Cabal, supporting users and contributors. He coordinated the release of Cabal 3.12 as part of GHC 9.10 and assisted in releasing it as a standalone library, updating, documenting and streamlining the release process. He’s taking part in the release effort for version 3.12 of the cabal-install build tool.
Matthew, Rodrigo and Sam have been working to address longstanding architectural and maintenance issues in the Cabal library and the cabal-install build tool, thanks to support from the Sovereign Tech Fund. See our introductory blog post and the previous activities report for more details. This has included a wide range of bug fixes and code refactorings, as well as the development of specific new features.
A new home for GHC’s internals
Ben has been working for some time on creating the ghc-internal package to clearly distinguish user-facing APIs (in base) from compiler implementation details (in ghc-internal). This saw its first public release alongside GHC 9.10.1.
As far as possible, we want to make implementation details such as the existence of the ghc-internal package invisible to end users, but perhaps inevitably, the split exposed various issues where this was not the case, particularly in Haddock. In addition, compiler plugins that mistakenly hard-code references to identifiers in base may break due to internal identifiers moving to ghc-internal. Ben fixed several ghc-typelits-* plugins to resolve identifier locations correctly, thereby avoiding this problem (#24680).
More work is needed to gradually disentangle implementation details from user-facing APIs, and deprecate the parts of base that are not intended for direct use by users, in collaboration with the Core Libraries Committee.
Specialisation
Finley published a two-part series of blog posts on Choreographing a dance with the GHC specializer:
- Part 1 acts as a reference manual documenting exactly how, why, and when specialization works in GHC.
- Part 2 introduces new tools and techniques we’ve developed to help make more precise, evidence-based decisions regarding the specialization of our programs.
Andreas added a new -fexpose-overloaded-unfoldings flag to GHC (!9940), allowing specialisations to fire without the full overhead of -fexpose-all-unfoldings.
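To make the idea of specialisation concrete, here is a minimal sketch of an overloaded function together with the pragmas that have long been used to control its specialisation by hand. The function sumSquares is a hypothetical example, not code from GHC or from the blog posts. INLINABLE exposes the function's unfolding so that other modules can specialise it at their own types, and SPECIALIZE requests a dedicated copy at a known type. As we understand it, the new flag aims to expose unfoldings of overloaded functions without such per-function annotations.

```haskell
-- An overloaded function: unspecialised calls pass a Num dictionary at runtime.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
-- Expose the unfolding so that importing modules can specialise it.
{-# INLINABLE sumSquares #-}
-- Ask GHC for a dictionary-free copy at Int in this module.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

main :: IO ()
main = print (sumSquares [1, 2, 3 :: Int]) -- prints 14
```

The pragmas only affect the generated code, not the result, so the program prints the same value with or without them.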
Haddock merged into GHC tree!
A longstanding pain point for GHC development has been that Haddock is closely coupled to GHC, but was being developed in its own repository and included via a git submodule, which complicated making changes that span both GHC and Haddock. Ben recently helped Hécate, the Haddock maintainer, merge the submodule into the main GHC tree (#24834, !11058). This allows for subsequent simplifications to Haddock (!12743).
Profiled dynamic way
Matthew has been working on adding support for building dynamic libraries with profiling in GHC and Cabal (#15394, !12595, Cabal MR 9900).
Deterministic object code
Thanks to a lot of past work by dedicated GHC contributors, GHC produces deterministic interface files (#4012), so compiling the same source code with the same compiler will always produce the same ABI. However, GHC does not yet produce deterministic object files (#12935), so compiling identical source code may produce object files that are not bit-for-bit identical (in particular this arises when compiling multiple modules concurrently).
This is an issue for build systems that rely on hashing compilation outputs to improve performance or ensure reproducibility. Rodrigo has started work on a new effort towards deterministic object code, and has made some promising initial progress.
Cost centre profiling
Andreas modified GHC to avoid adding cost centres to static data (#24103, !12498), resulting in much smaller code sizes with -fprof-late. For a profiled build of GHC the size of build artifacts goes down by about 25% in total and we expect similar benefits for other projects.
This is a step towards making it feasible to distribute libraries compiled for profiling with late cost centres included (#21732, !10930), which will improve the profiling and debugging experience.
Segfaults / backend soundness
- Andreas investigated and fixed a segfault due to a tag inference bug (#24870).
- Andreas fixed a serious but thankfully hard to trigger soundness bug due to an unsound pattern match optimization (#24507, !12256).
- Andreas fixed an issue with the FMA primop generating a wrong result on x86_64 (#24496).
- Andreas investigated an Arm codegen issue with jumps being out of range (#24648) when linking large projects on Mac. This turned out to be a linker bug/deficiency on newer Mac linkers.
process library
Ben released two new versions of the core process library, to address several issues:
- HSEC-2024-0003, a security advisory relating to potential command injection via argument lists on Windows.
- The introduction of a new API System.Process.CommunicationHandle for platform-independent interprocess communication, the need for which came out of our work on Cabal.
- Various other bug fixes and API improvements.
A new I/O manager based on io_uring
Duncan is gradually working on a long-term project to introduce a new RTS I/O manager based on the io_uring Linux kernel system call interface. This will allow asynchronous I/O for block devices such as SSDs to make significantly greater use of parallelism, improving performance for applications that make heavy use of disk I/O.
As a preparatory step, Duncan has been refactoring and improving the RTS code for I/O managers (!9676) with review support from Ben and other GHC developers.
Compiler performance and memory usage
Hannes, Zubin and Matthew have been working on reducing memory usage of GHC, GHCi and HLS and improving their performance on very large codebases, thanks to support from Mercury. This includes:
- Using more efficient representations of interface files (!12263, !12346, !12371). This is particularly helpful when using the -fwrite-if-simplified-core option to include Core definitions in interface files for better performance. A new -fwrite-if-compression option makes it possible to select different space/time trade-offs.
- Choosing appropriate memory-efficient data structures (!12140, !12142, !12170).
- Making sure that -fwrite-if-simplified-core causes recompilation when appropriate (!12484).
- Many other memory usage improvements (!12345, !12347, !12348, !12582, !12442, !12222, !12200, !12070).
- Using a more efficient algorithm for checkHomeUnitsClosed (!12162).
Rodrigo significantly improved the performance of the dynamic linker on MacOS (#23415), finishing off and landing a patch by Alexis King to reduce dependency-loading time by looking up symbols only in the relevant dynamic libraries (!12264). GHCi load time for a client project affected by this issue went down from 35 seconds to 2 seconds.
Runtime performance
- Andreas made the magic inline function work in the presence of casts, so that it now looks through coerce to find a function it can inline (#24808).
- Andreas fixed an issue where the bottomness of an unreachable branch was affecting performance (#24806).
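The following is a minimal sketch of the kind of code the inline change is about, under the assumption that the pattern of interest is inline applied to a coerced function. The names Metres, scale and scaleMetres are hypothetical. The program behaves the same on any recent GHC; the difference is only whether the optimiser can see through the coercion and inline scale at the call site.

```haskell
import Data.Coerce (coerce)
import GHC.Exts (inline)

-- Hypothetical newtype wrapper; at runtime it is represented as a Double.
newtype Metres = Metres Double
  deriving Show

scale :: Double -> Double
scale x = 2 * x + 1

-- 'coerce scale' is 'scale' wrapped in a cast. The request here is for
-- 'inline' to look through that cast and inline the body of 'scale'.
scaleMetres :: Metres -> Metres
scaleMetres = inline (coerce scale)

main :: IO ()
main = print (scaleMetres (Metres 1.5)) -- prints Metres 4.0
```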
Foreign function interface
- Andreas added support for (# #) as an FFI argument (#24818).
- Andreas opened a GHC proposal around time profiling of safe FFI calls.
- Ben added support for State# RealWorld in foreign import prim (#24598).
Software transactional memory
Andreas completed his deep dive into STM and identified various improvements, including making starvation less likely in some cases (#24142, #24446, !12194).
Continuous integration and testing
While producing alpha releases for 9.10, it became clear that more validation was needed to detect problems earlier.
- Matthew improved the monitoring setup with a Grafana nightly pipeline dashboard with the ability to send alerts on nightly job failures.
- Hannes picked up earlier work by Ben to collect CI performance metrics via perf (!7414), which will allow more precise performance analysis.
- Matthew made various other improvements to the CI pipelines, including upgrading the runners to GHC 9.6, and extending GHC’s CI infrastructure for testing installation with ghcup to test a variety of explicit linker configurations (ghcup-ci MR 14).