Merged
42 changes: 40 additions & 2 deletions README.md
@@ -80,13 +80,51 @@ This provider is in early development. It supports **read-only queries** — you

`Math.Abs`, `Floor`, `Ceiling`, `Round`, `Truncate`, `Pow`, `Sqrt`, `Cbrt`, `Exp`, `Log`, `Log2`, `Log10`, `Sign`, `Sin`, `Cos`, `Tan`, `Asin`, `Acos`, `Atan`, `Atan2`, `RadiansToDegrees`, `DegreesToRadians`, `IsNaN`, `IsInfinity`, `IsFinite`, `IsPositiveInfinity`, `IsNegativeInfinity` — with both `Math` and `MathF` overloads.
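As a hedged sketch of how these translations surface in LINQ (the `Score` property below is hypothetical and not part of the `PageView` entity used elsewhere in this README), a `Math` call inside a predicate is rewritten into the corresponding ClickHouse function in the generated SQL:

```csharp
// Hypothetical sketch — assumes a PageView entity with a double Score column,
// which is not part of this PR's examples.
var noisy = await ctx.PageViews
    .Where(p => Math.Abs(p.Score) > 0.5 && double.IsFinite(p.Score))
    .ToListAsync();
// Math.Abs and double.IsFinite would translate to ClickHouse's abs() and isFinite().
```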

### INSERT via SaveChanges

`SaveChanges` supports INSERT operations using the driver's native `InsertBinaryAsync` API — RowBinary encoding with GZip compression, far more efficient than parameterized SQL.

```csharp
await using var ctx = new AnalyticsContext();

ctx.PageViews.Add(new PageView
{
Id = 1,
Path = "/home",
Date = new DateOnly(2024, 6, 15),
UserAgent = "Mozilla/5.0"
});

await ctx.SaveChangesAsync();
```

Entities transition from `Added` to `Unchanged` after save, just like any other EF Core provider.
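A minimal sketch of that state transition, reusing the context and entity type from the example above:

```csharp
var view = new PageView { Id = 2, Path = "/about", Date = new DateOnly(2024, 6, 15) };

ctx.PageViews.Add(view);
// ctx.Entry(view).State is EntityState.Added at this point

await ctx.SaveChangesAsync();
// ctx.Entry(view).State is EntityState.Unchanged after the save
```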

**Batch size** is configurable (default 1000) — controls how many entities are accumulated before flushing to ClickHouse:

```csharp
optionsBuilder.UseClickHouse("Host=localhost", o => o.MaxBatchSize(5000));
```

Comment on lines +103 to +108
Copilot AI Mar 11, 2026

This documentation shows configuring `MaxBatchSize` via `o => o.MaxBatchSize(5000)`, but there is currently no `MaxBatchSize` option exposed on `ClickHouseDbContextOptionsBuilder`/`ClickHouseOptionsExtension` in this PR (and `ClickHouseModificationCommandBatchFactory` references a missing `MaxBatchSize` member). Either implement and document the option end-to-end, or remove/adjust this section to match the actual public API.

Suggested change (current text, then proposed replacement):

**Batch size** is configurable (default 1000) — controls how many entities are accumulated before flushing to ClickHouse:
```csharp
optionsBuilder.UseClickHouse("Host=localhost", o => o.MaxBatchSize(5000));
```

**Batch size** (currently fixed at 1000) controls how many entities are accumulated before flushing to ClickHouse.
### Bulk Insert

For high-throughput loads that don't need change tracking, use `BulkInsertAsync`:

```csharp
var events = Enumerable.Range(0, 100_000)
.Select(i => new PageView { Id = i, Path = $"/page/{i}", Date = DateOnly.FromDateTime(DateTime.Today) });

long rowsInserted = await ctx.BulkInsertAsync(events);
```

This calls `InsertBinaryAsync` directly, bypassing EF Core's change tracker entirely. Entities are **not** tracked after insert.
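A hedged sketch of the difference from `SaveChanges`: after a bulk insert the entity remains `Detached`, so a later `SaveChanges` will not see it.

```csharp
var view = new PageView { Id = 3, Path = "/pricing", Date = DateOnly.FromDateTime(DateTime.Today) };

await ctx.BulkInsertAsync(new[] { view });
// ctx.Entry(view).State is EntityState.Detached — the change tracker never saw it,
// unlike ctx.PageViews.Add(...) + SaveChangesAsync(), which leaves it Unchanged.
```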

### Not Yet Implemented

- INSERT / UPDATE / DELETE (modification commands are stubbed)
- UPDATE / DELETE (ClickHouse mutations are async, not OLTP-compatible)
- Migrations
- JOINs, subqueries, set operations
- Advanced types: Array, Tuple, Nullable(T), LowCardinality, Nested, TimeSpan/TimeOnly
- Batched inserts

## Building

58 changes: 58 additions & 0 deletions src/EFCore.ClickHouse/Extensions/ClickHouseBulkInsertExtensions.cs
@@ -0,0 +1,58 @@
using ClickHouse.EntityFrameworkCore.Storage.Internal;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Metadata;

namespace ClickHouse.EntityFrameworkCore.Extensions;

public static class ClickHouseBulkInsertExtensions
{
    /// <summary>
    /// Inserts entities into ClickHouse using the driver's native binary insert protocol.
    /// This bypasses EF Core change tracking entirely and is intended for high-throughput bulk loads.
    /// Entities are NOT tracked or marked as Unchanged after insert.
    /// </summary>
    public static async Task<long> BulkInsertAsync<TEntity>(
        this DbContext context,
        IEnumerable<TEntity> entities,
        CancellationToken cancellationToken = default) where TEntity : class
    {
        var connection = context.GetService<IClickHouseRelationalConnection>();
        var client = connection.GetClickHouseClient();

        var entityType = context.Model.FindEntityType(typeof(TEntity))
            ?? throw new InvalidOperationException(
                $"The entity type '{typeof(TEntity).Name}' is not part of the model for the current context.");

        var tableName = entityType.GetTableName()
            ?? throw new InvalidOperationException(
                $"The entity type '{typeof(TEntity).Name}' is not mapped to a table.");

        // Build column list and property accessors
        var properties = entityType.GetProperties()
            .Where(p => p.GetTableColumnMappings().Any())
            .ToList();

        var columns = properties
            .Select(p => p.GetTableColumnMappings().First().Column.Name)
            .ToList();

        var accessors = properties
            .Select(p => p.GetGetter())
            .ToList();

        // Convert entities to row arrays
        // TODO: quite inefficient; update this after adding direct POCO insert to the client API
        var rows = entities.Select(entity =>
        {
            var row = new object[accessors.Count];
            for (var i = 0; i < accessors.Count; i++)
            {
                row[i] = accessors[i].GetClrValue(entity) ?? DBNull.Value;
            }
            return row;
        });

        return await client.InsertBinaryAsync(tableName, columns, rows, cancellationToken: cancellationToken);
    }
}
@@ -1,5 +1,6 @@
using System.Data;
using System.Data.Common;
using ClickHouse.Driver;
using ClickHouse.Driver.ADO;
using ClickHouse.EntityFrameworkCore.Infrastructure.Internal;
using Microsoft.EntityFrameworkCore;
@@ -69,4 +70,14 @@ public override Task<IDbContextTransaction> BeginTransactionAsync(
        IsolationLevel isolationLevel,
        CancellationToken cancellationToken = default)
        => Task.FromResult<IDbContextTransaction>(new ClickHouseTransaction());

    public IClickHouseClient GetClickHouseClient()
    {
        if (_dataSource is ClickHouseDataSource clickHouseDataSource)
            return clickHouseDataSource.GetClient();

        throw new InvalidOperationException(
            "Cannot obtain IClickHouseClient. The connection must be configured with a connection string " +
            "or ClickHouseDataSource, not a raw DbConnection.");
    }
Comment on lines +74 to +82
Copilot AI Mar 11, 2026

`GetClickHouseClient()` only works when `_dataSource` is a `ClickHouseDataSource`, but `UseClickHouse(DbDataSource ...)` / `ClickHouseOptionsExtension.DataSource` accept any `DbDataSource`. This introduces a runtime failure path for contexts configured with a non-ClickHouse `DbDataSource` which previously worked (at least for read scenarios). Consider either (a) restricting the public API to `ClickHouseDataSource`, (b) validating the type early when configuring options, or (c) providing a fallback client creation strategy when `_dataSource` isn't `ClickHouseDataSource` (if feasible).
}
@@ -1,8 +1,10 @@
using ClickHouse.Driver;
using Microsoft.EntityFrameworkCore.Storage;

namespace ClickHouse.EntityFrameworkCore.Storage.Internal;

public interface IClickHouseRelationalConnection : IRelationalConnection
{
    IClickHouseRelationalConnection CreateMasterConnection();
    IClickHouseClient GetClickHouseClient();
}
@@ -0,0 +1,112 @@
using ClickHouse.EntityFrameworkCore.Storage.Internal;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Storage;
using Microsoft.EntityFrameworkCore.Update;

namespace ClickHouse.EntityFrameworkCore.Update.Internal;

public class ClickHouseModificationCommandBatch : ModificationCommandBatch
{
    private readonly List<IReadOnlyModificationCommand> _commands = [];
    private readonly int _maxBatchSize;
    private bool _completed;
    private bool _moreExpected;

    public ClickHouseModificationCommandBatch(int maxBatchSize)
    {
        _maxBatchSize = maxBatchSize;
    }

    public override IReadOnlyList<IReadOnlyModificationCommand> ModificationCommands => _commands;

    public override bool RequiresTransaction => false;

    public override bool AreMoreBatchesExpected => _moreExpected;

    public override bool TryAddCommand(IReadOnlyModificationCommand modificationCommand)
    {
        if (_completed)
            throw new InvalidOperationException("Batch has already been completed.");

        if (modificationCommand.EntityState is EntityState.Modified)
            throw new NotSupportedException(
                "UPDATE operations are not supported by the ClickHouse EF Core provider. " +
                "ClickHouse mutations (ALTER TABLE ... UPDATE) are asynchronous and not OLTP-compatible.");

        if (modificationCommand.EntityState is EntityState.Deleted)
            throw new NotSupportedException(
                "DELETE operations are not supported by the ClickHouse EF Core provider. " +
                "ClickHouse mutations (ALTER TABLE ... DELETE) are asynchronous and not OLTP-compatible.");

        if (modificationCommand.EntityState is not EntityState.Added)
            throw new NotSupportedException(
                $"Unexpected entity state '{modificationCommand.EntityState}'. " +
                "The ClickHouse EF Core provider only supports INSERT (EntityState.Added).");

        // Block server-generated values (ClickHouse has no RETURNING / auto-increment)
        foreach (var columnMod in modificationCommand.ColumnModifications)
        {
            if (columnMod.IsRead)
                throw new NotSupportedException(
                    $"Server-generated values are not supported by the ClickHouse EF Core provider. " +
                    $"Column '{columnMod.ColumnName}' on table '{modificationCommand.TableName}' is configured " +
                    $"to read a value back from the database after INSERT. Remove ValueGeneratedOnAdd() or " +
                    $"use HasValueGenerator() with a client-side generator instead.");
        }

        if (_commands.Count >= _maxBatchSize)
            return false;

        _commands.Add(modificationCommand);
        return true;
    }

    public override void Complete(bool moreBatchesExpected)
    {
        _completed = true;
        _moreExpected = moreBatchesExpected;
    }

    public override void Execute(IRelationalConnection connection)
        => ExecuteAsync(connection).GetAwaiter().GetResult();

    public override async Task ExecuteAsync(
        IRelationalConnection connection,
        CancellationToken cancellationToken = default)
    {
        if (_commands.Count == 0)
            return;

        var clickHouseConnection = (IClickHouseRelationalConnection)connection;
        var client = clickHouseConnection.GetClickHouseClient();

        // Group commands by table name and write-column set for correct row alignment
        var groups = _commands.GroupBy(c => (
            c.TableName,
            Columns: string.Join(",", c.ColumnModifications.Where(cm => cm.IsWrite).Select(cm => cm.ColumnName))));

        foreach (var group in groups)
        {
            var tableName = group.Key.TableName;
            var commands = group.ToList();

            var columns = commands[0].ColumnModifications
                .Where(cm => cm.IsWrite)
                .Select(cm => cm.ColumnName)
                .ToList();

            var rows = commands.Select(cmd =>
            {
                var writeColumns = cmd.ColumnModifications.Where(cm => cm.IsWrite).ToList();
                var row = new object[writeColumns.Count];
                for (var i = 0; i < writeColumns.Count; i++)
                {
                    row[i] = writeColumns[i].Value ?? DBNull.Value;
                }
                return row;
            });

            await client.InsertBinaryAsync(tableName, columns, rows, cancellationToken: cancellationToken);
        }
    }
}
@@ -1,15 +1,22 @@
using ClickHouse.EntityFrameworkCore.Infrastructure.Internal;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Update;

namespace ClickHouse.EntityFrameworkCore.Update.Internal;

public class ClickHouseModificationCommandBatchFactory : IModificationCommandBatchFactory
{
public ClickHouseModificationCommandBatchFactory(ModificationCommandBatchFactoryDependencies dependencies)
private const int DefaultMaxBatchSize = 1000;
private readonly int _maxBatchSize;

public ClickHouseModificationCommandBatchFactory(
ModificationCommandBatchFactoryDependencies dependencies)
{
_maxBatchSize = dependencies.CurrentContext.Context.GetService<IDbContextOptions>()
.Extensions.OfType<ClickHouseOptionsExtension>()
.FirstOrDefault()?.MaxBatchSize ?? DefaultMaxBatchSize;
Comment on lines +15 to +17
Copilot AI Mar 11, 2026

`ClickHouseOptionsExtension` in this repo currently doesn't define a `MaxBatchSize` member, so this won't compile (`.FirstOrDefault()?.MaxBatchSize`). Either add/pipe through a `MaxBatchSize` option on `ClickHouseOptionsExtension` (and surface it via `ClickHouseDbContextOptionsBuilder`), or remove this options lookup and keep the hard-coded default until the option exists. Also consider validating the configured value is > 0 to avoid creating batches that can never accept a command.

Suggested change (current code, then proposed replacement):

    _maxBatchSize = dependencies.CurrentContext.Context.GetService<IDbContextOptions>()
        .Extensions.OfType<ClickHouseOptionsExtension>()
        .FirstOrDefault()?.MaxBatchSize ?? DefaultMaxBatchSize;

    var options = dependencies.CurrentContext.Context.GetService<IDbContextOptions>();
    var extension = options.Extensions.OfType<ClickHouseOptionsExtension>().FirstOrDefault();
    if (extension != null)
    {
        var maxBatchSizeProperty = extension.GetType().GetProperty("MaxBatchSize");
        var value = maxBatchSizeProperty?.GetValue(extension);
        if (value is int configuredMaxBatchSize && configuredMaxBatchSize > 0)
        {
            _maxBatchSize = configuredMaxBatchSize;
        }
        else
        {
            _maxBatchSize = DefaultMaxBatchSize;
        }
    }
    else
    {
        _maxBatchSize = DefaultMaxBatchSize;
    }
}

public ModificationCommandBatch Create()
=> throw new NotSupportedException(
"SaveChanges write operations are not supported by ClickHouse.EntityFrameworkCore yet. " +
"This provider currently supports read-only query scenarios.");
=> new ClickHouseModificationCommandBatch(_maxBatchSize);
}