HIVE-29391: Enable Independent Scaling of HMS REST Catalog from HMS #6270
deniskuzZ merged 2 commits into apache:master
The intention sounds great! I have one challenge: Do we really need active-passive with ZooKeeper? A RESTful API should always be able to make use of a load balancer, whose configuration is typically easier than ZK.
@okumin Active-passive mode is not necessary for scaling, but active-active seems to be what is needed. I used ZooKeeper for consistency and code reuse, because it is already used in several places in Hive.
@difin FYI there is ongoing work to decommission ZooKeeper. BTW, why do we need a coordinator here? I would envision the following flow:
While we're looking for native Kubernetes alternatives for things we're currently doing with ZK, ZK is still a valid choice, as getting rid of it in the whole Hive codebase would be too much in one go, especially because it's battle-tested, so reusing
I also think this is more than enough in most cases.
Thanks, everyone! I am changing the implementation to support this flow, without ZooKeeper:
if (configExtWarehouse != null) {
  properties.put("external-warehouse", configExtWarehouse);
  // HiveCatalog reads this property directly from Configuration, not from properties map
  configuration.set("hive.metastore.warehouse.external.dir", configExtWarehouse);
You might not have an answer. I'm just curious what causes the drift between Iceberg's HiveCatalog and our HiveCatalog.
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
Please use asf.header at this point. I locally have a follow-up patch of HIVE-29245, but it's not complete.
https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=6270&issueStatuses=OPEN,CONFIRMED&sinceLeakPeriod=true
...st-catalog/src/main/java/org/apache/iceberg/rest/standalone/StandaloneRESTCatalogServer.java
 *
 * <p>Multiple instances can run behind a Kubernetes Service for load balancing.
 */
public class StandaloneRESTCatalogServer {
would be nice to use spring-boot or something, but that's not a blocker
...etastore/metastore-rest-catalog/src/main/java/org/apache/iceberg/rest/HMSCatalogFactory.java
What changes were proposed in this pull request?
Adds a standalone HMS REST Catalog Server that can scale independently from HMS.
Currently, HMS REST Catalog Server is tied to HMS and can only be started together with HMS in a single instance.
This PR introduces a standalone REST Catalog server that:
Architecture:
Client → Kubernetes Load Balancer/API Gateway → Standalone REST Catalog Server → HMS
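To illustrate why this architecture needs no coordinator, here is a minimal sketch of what the load-balancer hop does conceptually: because each standalone REST Catalog instance is stateless, any instance can serve any request, so simple rotation is enough. This is not code from the PR; the class name `RoundRobinSketch` and the backend addresses are hypothetical, and a real deployment would rely on the Kubernetes Service or API gateway rather than on client-side selection.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: rotates requests across stateless REST Catalog instances. */
public class RoundRobinSketch {

  /** Thread-safe round-robin selector over a fixed list of backends. */
  public static final class RoundRobin {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobin(List<String> backends) {
      if (backends.isEmpty()) {
        throw new IllegalArgumentException("at least one backend is required");
      }
      this.backends = List.copyOf(backends);
    }

    /** Returns the next backend in rotation. */
    public String pick() {
      // floorMod keeps the index non-negative even after the counter overflows.
      int index = Math.floorMod(next.getAndIncrement(), backends.size());
      return backends.get(index);
    }
  }

  public static void main(String[] args) {
    // Hypothetical instance addresses; the host names and port are assumptions.
    RoundRobin lb = new RoundRobin(List.of(
        "rest-catalog-0:9001", "rest-catalog-1:9001", "rest-catalog-2:9001"));
    for (int i = 0; i < 6; i++) {
      System.out.println("request " + i + " -> " + lb.pick());
    }
  }
}
```

Adding or removing an instance only changes the backend list. No instance needs to know about its peers, which is the property the review discussion relies on when dropping ZooKeeper.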
Why are the changes needed?
Allows independent scaling of the HMS REST Catalog Server from HMS, enabling:
Does this PR introduce any user-facing change?
Yes. Adds a new standalone server mode:
How was this patch tested?
New integration tests.