[Feature] Select specific node by parameters when proxy #523

@zghong

Is your feature request related to a problem? Please describe.
No.

Describe the solution you'd like
For a ClickHouse cluster with 2 shards and 2 replicas, the chproxy configuration could be as follows:

replicas:
  - name: "replica1"
    nodes: ["127.0.1.1:8123", "127.0.1.2:8123"]
  - name: "replica2"
    nodes: ["127.0.2.1:8123", "127.0.2.2:8123"]

By default, chproxy selects one of these 4 nodes through load balancing and proxies the request to it. Sometimes, however, I want to direct a query to one or more specific nodes, for example:

  • Scenario 1: Data that has already been split according to the sharding_key of the distributed table needs to be written directly into the local table of a specific shard (e.g., the 2nd shard), because this is more efficient.
  • Scenario 2: Querying logs or running maintenance commands on a specific node (e.g., node 127.0.1.2:8123).

In these cases, could chproxy's node selection be controlled with query parameters such as /?replica=2 and /?replica=1&node=2?
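
To make the request concrete, here is a minimal Python sketch of the selection semantics I have in mind. It is not chproxy code, and chproxy does not support these parameters today. The function name `candidate_nodes`, the 1-based indexing, and the rule that `node` requires `replica` are all my assumptions for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Topology mirroring the configuration above: two groups of two nodes each.
REPLICAS = [
    {"name": "replica1", "nodes": ["127.0.1.1:8123", "127.0.1.2:8123"]},
    {"name": "replica2", "nodes": ["127.0.2.1:8123", "127.0.2.2:8123"]},
]


def _index(params, key, size):
    """Convert a 1-based query parameter to a 0-based index, or None if absent."""
    values = params.get(key)
    if not values:
        return None
    n = int(values[0])  # raises ValueError for non-numeric input
    if not 1 <= n <= size:
        raise ValueError(f"{key}={n} is out of range 1..{size}")
    return n - 1


def candidate_nodes(url, replicas=REPLICAS):
    """Return the nodes a request may be proxied to (hypothetical semantics).

    No parameters   -> every node (the load balancer picks among them)
    ?replica=R      -> all nodes of the R-th group
    ?replica=R&node=N -> exactly one node
    """
    params = parse_qs(urlsplit(url).query)
    r = _index(params, "replica", len(replicas))
    if r is None:
        if "node" in params:
            # Assumption: a node index is meaningless without its group.
            raise ValueError("node requires replica")
        return [n for rep in replicas for n in rep["nodes"]]
    nodes = replicas[r]["nodes"]
    k = _index(params, "node", len(nodes))
    return list(nodes) if k is None else [nodes[k]]
```

With this sketch, `candidate_nodes("/?replica=2")` yields both nodes of the second group (Scenario 1), and `candidate_nodes("/?replica=1&node=2")` yields only 127.0.1.2:8123 (Scenario 2). Load balancing would still apply whenever more than one candidate remains.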

Describe alternatives you've considered
By the way, why does the configuration use the keywords replica and node instead of shard and replica? The latter naming might make the cluster topology clearer, for example:

shards:
  - name: "shard1"
    replicas: ["127.0.1.1:8123", "127.0.1.2:8123"]
  - name: "shard2"
    replicas: ["127.0.2.1:8123", "127.0.2.2:8123"]

With this naming, a specific node could be selected with parameters such as /?shard=2 and /?shard=1&replica=2. If the feature is developed, which naming is better? Alternatively, would it be possible to change the configuration specification?
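
If changing the specification is a concern, one option might be to accept both spellings and map them to a single internal layout. The Python sketch below is only an illustration under that assumption: `normalize_topology` is a hypothetical helper, and it works on the already-parsed configuration (plain dicts and lists), not on chproxy's real config loader.

```python
def normalize_topology(config):
    """Map either naming scheme to one internal form: [{"name", "nodes"}, ...].

    Accepts the existing layout (replicas -> nodes) or the proposed
    one (shards -> replicas). Mixing both in one config is rejected.
    """
    if "shards" in config and "replicas" in config:
        raise ValueError("use either 'shards' or 'replicas', not both")
    if "shards" in config:
        # Proposed naming: each shard lists its replicas.
        return [
            {"name": s["name"], "nodes": list(s["replicas"])}
            for s in config["shards"]
        ]
    # Existing naming: each replica group lists its nodes.
    return [
        {"name": r["name"], "nodes": list(r["nodes"])}
        for r in config.get("replicas", [])
    ]


# The two example configurations from this issue, as parsed data.
current = {"replicas": [
    {"name": "replica1", "nodes": ["127.0.1.1:8123", "127.0.1.2:8123"]},
    {"name": "replica2", "nodes": ["127.0.2.1:8123", "127.0.2.2:8123"]},
]}
proposed = {"shards": [
    {"name": "shard1", "replicas": ["127.0.1.1:8123", "127.0.1.2:8123"]},
    {"name": "shard2", "replicas": ["127.0.2.1:8123", "127.0.2.2:8123"]},
]}
```

Both inputs normalize to the same node groups and differ only in the group names, so existing configurations would keep working while the new keywords are introduced.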

Additional context
No.
