Skip to content

Keep a running InstalledPackageIndex to skip expensive per-package ghc-pkg calls#11767

Draft
sheaf wants to merge 1 commit intohaskell:masterfrom
sheaf:cached-ipi
Draft

Keep a running InstalledPackageIndex to skip expensive per-package ghc-pkg calls#11767
sheaf wants to merge 1 commit intohaskell:masterfrom
sheaf:cached-ipi

Conversation

@sheaf
Copy link
Copy Markdown
Collaborator

@sheaf sheaf commented Apr 27, 2026

Building on the in-library refactor that landed in 6867dd5, this patch extracts computePackageInfoFromIndex from the Cabal computePackageInfo function, which allows cabal-install to use a running InstalledPackageIndex instead of having to repeatedly query ghc-pkg, once per package.

Several cabal-install functions now pass around a TVar InstalledPackageIndex which keeps track of the InstalledPackageIndex used for building a project. Each time a dependency gets registered, we update this TVar, and read from it to obtain an InstalledPackageIndex to pass to the Cabal configure function, skipping slow calls to ghc-pkg.


Template Α: This PR modifies behaviour or interface

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • No manual QA notes necessary (no change to the command line interface).
  • Testing approach: the existing tests continue to work even though we query ghc-pkg far less.

@sheaf sheaf force-pushed the cached-ipi branch 6 times, most recently from f27441c to 06795ae Compare April 29, 2026 11:57
@sheaf sheaf marked this pull request as ready for review April 29, 2026 11:59
@sheaf
Copy link
Copy Markdown
Collaborator Author

sheaf commented Apr 29, 2026

This was a bit subtle to write, and getting it wrong means that we pass an incomplete or incorrect InstalledPackageIndex when configuring, which can lead to "Package not found" errors.

@ulysses4ever
Copy link
Copy Markdown
Collaborator

Since the title of this PR mentions replacing "expensive" ghc-pkg calls, I am curious if there is any fast, back-of-envelope timings of what sorts of gains one can expect from this change.

@sheaf
Copy link
Copy Markdown
Collaborator Author

sheaf commented Apr 29, 2026

Since the title of this PR mentions replacing "expensive" ghc-pkg calls, I am curious if there is any fast, back-of-envelope timings of what sorts of gains one can expect from this change.

I will get back to you with more precise measurements but for typical packages this can save over half of the time spent in configure. Taken with #11768 I am seeing typical configure times go from ~3s to ~0.5s.

@sheaf
Copy link
Copy Markdown
Collaborator Author

sheaf commented Apr 30, 2026

I have put up a spreadsheet here, compiling pandoc from scratch and using the --build-timings feature from #11769.

Base average time spent in configure: 2.9 seconds.
With this patch: it goes down to 1.7 seconds.
With #11768: it goes down to 1.2 seconds.
With both: it goes down to 0.15 seconds.

Copy link
Copy Markdown
Member

@Mikolaj Mikolaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Looks plausible to me.

Copy link
Copy Markdown
Collaborator

@ulysses4ever ulysses4ever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caching strategy is straightforward and looks good. The relation to some ghc-pkg calls is indirect (you can't see it in the diff), so it's unclear to me. I'd appreciate if it was clarified, to which end I leave a comment for the changelog below. Thanks!

Comment thread changelog.d/cached-ipi.md Outdated
Comment thread cabal-install/src/Distribution/Client/ProjectBuilding/PackageFileMonitor.hs Outdated
Comment on lines +199 to +200
--------------------------------------------------------------------------------

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this separator between imports and the rest of the module needed?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it helpful to clearly demarcate the module header from the rest of the file, especially in this case with the CPP just above. I'd prefer to keep it if that's OK with you.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This position is just after the last import. I'm a vim user and I can get to that location with "/^import" then "N".

I wonder if HLS provides a fold/unfold for all of the imports?

@sheaf sheaf marked this pull request as draft May 4, 2026 10:20
@sheaf
Copy link
Copy Markdown
Collaborator Author

sheaf commented May 4, 2026

Re-marking as draft, as I have been using a version of cabal-install with these changes and have been running into some issues, with errors like:

Error: [Cabal-5000]
The following package dependencies were requested
    --dependency='th-abstraction=th-abstraction-0.7.2.0-f7bd9111da30b5c4189a982e4515bcca407b3baf'
however the given installed package instance does not exist.

@sheaf
Copy link
Copy Markdown
Collaborator Author

sheaf commented May 5, 2026

I have attempted to fix the issue I was seeing with the latest push to the branch. I will do a bit more testing locally to make sure things are working as expected.

We extract 'computePackageInfoFromIndex' from the Cabal 'computePackageInfo'
function, which allows cabal-install to use a running 'InstalledPackageIndex'
instead of having to repeatedly query ghc-pkg (once per package).

Several cabal-install functions now pass around a 'TVar InstalledPackageIndex'
which keeps track of the 'InstalledPackageIndex' used for building a
project. Each time a dependency gets registered, we update this 'TVar',
and read from it to obtain an 'InstalledPackageIndex' to pass to the
Cabal configure function, skipping slow calls to 'ghc-pkg'.

For more details, please see Note [Per-project InstalledPackageIndex]
in Distribution.Client.ProjectBuilding.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants