Explore relational database state storage #1465
bcardiff wants to merge 34 commits into haskell:master
Conversation
seed the database
Depends on TemplateHaskell
Not really needed since they run in a different directory
So there is no need to override the convention
To have a more controlled migration
To use UserId directly we need to change it to a non-machine-dependent size for Beam. MemSize and Pretty are not defined on Int32, but other places start to complain after that
Database.SQLite.Simple.execute_ is not able to run multiple statements at once
Enums are saved using Show and parsed with Read. They currently abort the request if the data is malformed
danbornside left a comment:
I'm not any more familiar with the hackage-server codebase than Brian, only that we paired on this a bit and I'm more familiar with Beam. Even with the requested changes I'm not confident enough to approve as-is without some input from maintainers.
  } madetails
let cdetails = change adetails

accountDetailsUpsert AccountDetailsRow {
This is a bit ORM-ish: it should really use Beam's fine-grained update syntax to affect only the relevant fields, without needing to read the row out of the DB first.
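A minimal sketch of what that could look like, assuming a Database.runUpdate wrapper analogous to the Database.runInsert defined elsewhere in this PR; the table accessor and column names (_tblAccountDetails, _adUserId, _adRealName, _adContactEmail) are illustrative, not taken from the PR:

```haskell
-- Update only the touched columns, filtering by user id; no prior SELECT
-- of the existing row is needed.
updateAccountDetails :: UserId -> Text -> Text -> Transaction ()
updateAccountDetails uid newRealName newEmail =
  Database.runUpdate $
    update (_tblAccountDetails Database.hackageDb)
           (\row -> (_adRealName row <-. val_ newRealName)
                 <> (_adContactEmail row <-. val_ newEmail))
           (\row -> _adUserId row ==. val_ uid)
```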
  deriving (Eq, Ord, Read, Show, MemSize)

instance FromBackendRow Sqlite AuthToken where
  fromBackendRow = AuthToken . BSS.toShort <$> fromBackendRow
Does this need to be encoded? The corresponding column in the schema is TEXT, and I rather think that's not a binary-safe column type. SQLite supports BLOB and Postgres has BYTEA, which could probably work, although for a small value like this they're pretty inconvenient compared to encoding to hex or base64 or something.
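One way to read the base64 suggestion, sketched with hypothetical helper names (authTokenToText and authTokenFromText are mine; the AuthToken newtype over a ShortByteString comes from the PR):

```haskell
import qualified Data.ByteString.Base64 as B64
import qualified Data.ByteString.Short as BSS
import Data.Text (Text)
import qualified Data.Text.Encoding as TE

-- Render the raw token bytes as base64 so a TEXT column stays binary-safe.
authTokenToText :: AuthToken -> Text
authTokenToText (AuthToken sbs) = TE.decodeUtf8 (B64.encode (BSS.fromShort sbs))

-- Decode on the way back, reporting malformed input instead of crashing.
authTokenFromText :: Text -> Either String AuthToken
authTokenFromText =
  fmap (AuthToken . BSS.toShort) . B64.decode . TE.encodeUtf8
```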
As mentioned elsewhere, you probably want to implement FromField rather than FromBackendRow.
@@ -0,0 +1,47 @@
{-# LANGUAGE DeriveAnyClass #-}

type AccountDetailsRow = AccountDetailsT Identity

deriving instance Show AccountDetailsRow
instances for type aliases are a bit dodgy
This is actually the idiomatic way to provide instances for Beam table types, for what it's worth
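For reference, this is roughly the pattern the Beam documentation uses: the table type is parameterised over a column functor, and instances are derived for the concrete Identity alias via standalone deriving. The field names below are illustrative, not the PR's:

```haskell
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}

import Data.Functor.Identity (Identity)
import Data.Int (Int32)
import Data.Text (Text)
import Database.Beam (Beamable, Columnar)
import GHC.Generics (Generic)

-- The table declared over a column functor f.
data AccountDetailsT f = AccountDetailsRow
  { _adUserId :: Columnar f Int32
  , _adKind   :: Columnar f Text
  } deriving (Generic, Beamable)

-- The "plain row" alias; instances are derived on the fully applied type.
type AccountDetailsRow = AccountDetailsT Identity

deriving instance Show AccountDetailsRow
deriving instance Eq AccountDetailsRow
```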
  sqlValueSyntax = autoSqlValueSyntax

instance FromBackendRow Sqlite AccountDetailsKind where
  fromBackendRow = read . unpack <$> fromBackendRow
read will throw an exception if there's a parsing problem; either fail pure . Text.Read.readEither would be better.
Supplying a meaningful error message instead of Prelude.read's "no parse" would be better still.
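A sketch of that suggestion, assuming FromBackendRowM's MonadFail instance (which the "either fail pure" pattern relies on); AccountDetailsKind and its Read instance come from the PR:

```haskell
import Text.Read (readEither)
import qualified Data.Text as Text

instance FromBackendRow Sqlite AccountDetailsKind where
  fromBackendRow =
    -- Parse with readEither and surface a descriptive error on failure.
    either (fail . ("unrecognised AccountDetailsKind: " <>)) pure
      . readEither . Text.unpack
      =<< fromBackendRow
```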
You should not have to define explicit instances for FromBackendRow. The class BeamBackend ultimately states that reading columns from Sqlite is done via Database.SQLite.Simple.FromField.FromField.
Therefore, you should define an instance of Database.SQLite.Simple.FromField.FromField AccountDetailsKind instead
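A sketch of the FromField route (the error wording is mine; AccountDetailsKind comes from the PR). With the FromField instance in place, the FromBackendRow instance can be left empty and its default picks the field parser up:

```haskell
import Database.SQLite.Simple.FromField
  (FromField (..), ResultError (ConversionFailed), returnError)
import Text.Read (readEither)
import qualified Data.Text as Text

instance FromField AccountDetailsKind where
  fromField f = do
    txt <- fromField f
    case readEither (Text.unpack txt) of
      Right kind -> pure kind
      Left err ->
        -- Report the offending value instead of aborting with "no parse".
        returnError ConversionFailed f
          ("unrecognised AccountDetailsKind " <> Text.unpack txt <> ": " <> err)

-- The backend-specific instance now needs no body.
instance FromBackendRow Sqlite AccountDetailsKind
```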
LaurentRDC left a comment:
I help maintain Beam. I provided some comments on idiomatic Beam usage. Feel free to summon me in this PR and others if this helps
runSelectReturningOne :: forall a. (FromBackendRow Sqlite a) => SqlSelect Sqlite a -> Transaction (Maybe a)
runSelectReturningOne q =
  Transaction $ ReaderT $ \(SqlLiteConnection conn) -> runBeamSqlite conn $ Database.Beam.runSelectReturningOne q

runInsert :: forall (table :: (Type -> Type) -> Type). SqlInsert Sqlite table -> Transaction ()
runInsert q =
  Transaction $ ReaderT $ \(SqlLiteConnection conn) -> runBeamSqlite conn $ Database.Beam.runInsert q

newtype Transaction a = Transaction {unTransaction :: ReaderT Connection IO a}
  deriving (Functor, Applicative, Monad)

runTransaction :: Transaction a -> Connection -> IO a
runTransaction t = runReaderT (unTransaction t)
I understand that the plan is to support both Sqlite and Postgres. I think this can be done by abstracting over the backend instead of using Sqlite specifically here. Then, either via ExistentialQuantification or some typeclass, the only database-specific bit is the runBeamPostgres vs runBeamSqlite functions
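One possible shape of that abstraction, sketched with ExistentialQuantification (all names below are illustrative, not the PR's): the backend-specific part reduces to how the runner is constructed.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}

import Database.Beam (DatabaseSettings, MonadBeam)
import Database.Beam.Backend.SQL (BeamSqlBackend)
import Database.Beam.Postgres (Postgres, runBeamPostgres)
import Database.Beam.Sqlite (Sqlite, runBeamSqlite)
import qualified Database.PostgreSQL.Simple as Pg
import qualified Database.SQLite.Simple as SqliteSimple

-- A database handle that hides which backend it runs on: the checked
-- schema plus a way to run Beam actions against that backend.
data SomeDb db
  = forall be m. (BeamSqlBackend be, MonadBeam be m)
  => SomeDb (DatabaseSettings be db) (forall a. m a -> IO a)

sqliteDb :: DatabaseSettings Sqlite db -> SqliteSimple.Connection -> SomeDb db
sqliteDb settings conn = SomeDb settings (runBeamSqlite conn)

postgresDb :: DatabaseSettings Postgres db -> Pg.Connection -> SomeDb db
postgresDb settings conn = SomeDb settings (runBeamPostgres conn)
```

The trade-off is that queries then have to be written polymorphically in the backend, with whatever FromBackendRow and serialization constraints they need either added to the existential or taken as arguments, rather than mentioning Sqlite directly as in this hunk.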
forM_ (Users.enumerateAllUsers users) $ \(uid, uinfo) -> do
  let (status, authInfo) =
        case userStatus uinfo of
          AccountEnabled a -> (Enabled, Just a)
          AccountDisabled ma -> (Disabled, ma)
          AccountDeleted -> (Deleted, Nothing)

  Database.runInsert $
    insert
      (_tblUsers Database.hackageDb)
      (insertValues [UsersRow {
        _uId = uid,
        _uUsername = userName uinfo,
        _uStatus = status,
        _uAuthInfo = authInfo,
        _uAdmin = Group.member uid admins
      }])

  forM_ (Map.toList (userTokens uinfo)) $ \(token, desc) -> do
    Database.runInsert $
      insert
        (_tblUserTokens Database.hackageDb)
        (insertExpressions [UserTokensRow {
          _utId = default_,
          _utUserId = val_ uid,
          _utToken = val_ token,
          _utDescription = val_ desc
        }])
It might be more performant to execute runInsert fewer times, but each time with more than a single value.
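A sketch of that batching, reusing uid and userTokens uinfo from the loop in the hunk above; unless (from Control.Monad) just avoids issuing an insert for users with no tokens:

```haskell
unless (Map.null (userTokens uinfo)) $
  Database.runInsert $
    insert
      (_tblUserTokens Database.hackageDb)
      (insertExpressions
         -- One INSERT with one row per token, instead of one INSERT per token.
         [ UserTokensRow
             { _utId = default_
             , _utUserId = val_ uid
             , _utToken = val_ token
             , _utDescription = val_ desc
             }
         | (token, desc) <- Map.toList (userTokens uinfo)
         ])
```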
instance FromBackendRow Sqlite UsersStatus where
  fromBackendRow = read . unpack <$> fromBackendRow
You should define an instance of FromField rather than FromBackendRow
instance FromBackendRow Sqlite UserId where
  fromBackendRow = UserId . fromIntegral @Int32 <$> fromBackendRow
You should derive an instance of FromField rather than FromBackendRow
instance FromBackendRow Sqlite UserName where
  fromBackendRow = UserName . unpack <$> fromBackendRow
Same here, FromField instead of FromBackendRow
instance FromBackendRow Sqlite UserAuth where
  fromBackendRow = UserAuth . PasswdHash . unpack <$> fromBackendRow
FromField instead of FromBackendRow
Thanks for the feedback @LaurentRDC. I wasn't aware of FromField; tutorial 3 and the other examples I found online always used FromBackendRow. Before jumping in: maybe this is not the right place to have this conversation, since it's a "how to use Beam" kind of question. If the answer is not trivial we can move this bit to the Haskell forum or somewhere else.
You're right that the Beam tutorial shows an explicit implementation of fromBackendRow.
If you can express this as a newtype, then this is where things are most ergonomic. For example:

newtype UserId = MkUserId Int32
  deriving stock (Show)
  deriving newtype (Postgres.FromField, Sqlite.FromField)

instance FromBackendRow Sqlite UserId
instance FromBackendRow Postgres UserId

I think this is the simplest way to have the representation of UserId handled by both backends.
This is the exploration done during AmeriHac to migrate the acid-state store to a relational database. I will leave it as a draft, since it's an exploration that will probably trigger discussion rather than a patch ready to merge. I also hope to continue soon, migrating more state to the database to keep validating the approach.
The goals were
Things not covered
Other general doubts
To keep the exploration shorter, we assumed that DB backups would be done using the usual database backup tools and not via CSV files, and that it would be simpler to aim for a full code migration with no choice but to move to the database. If these assumptions are not right, then we will need some abstraction to keep reading and writing to either store (acid-state or DB), which increases the effort.
This is my first time using Beam, so there might be better ways of doing things.
Thanks to @danbornside for all the pairing on this