Description
The current implementation of the operator uses a Kubernetes Job workload running spark-submit in order to create Spark applications.
While this works fine for most cases, it has several drawbacks that cannot be easily addressed.
The primary cause of these drawbacks is that the Job pod adds a level of indirection between the operator and the application: the operator loses control over parts of the resources the application owns.
This architecture makes it unnecessarily complex to run applications in "client" mode, even though that mode has several benefits over "cluster" mode:
- It makes application monitoring more robust and reliable.
- It makes the application life cycle management more robust. This is particularly important for streaming applications.
- It makes application owned resource cleanup configurable and transparent.
Note
In client mode, the driver runs in the same Pod as the submit command and requires a headless service through which the executors connect back to the driver. To create this service, the operator needs to know the name of the driver Pod. The operator cannot know this name until it creates the Job object, at which point it no longer has control over any other resources (most notably the executor Pods). This lack of control can then lead to application startup failures.
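To make the ordering problem concrete, the headless service in question might look like the sketch below. This is an illustrative assumption, not the operator's actual code: the names (`my-app`, the label keys, the port numbers) are hypothetical placeholders, while the overall layout follows the standard Kubernetes Service schema. The selector must match labels on a driver Pod that, in the current architecture, does not exist until the Job has been created.

```python
# Sketch of the headless Service the operator would need in client mode.
# All names and labels here are hypothetical placeholders; in the current
# Job-based architecture they are not known until after the Job pod exists,
# which is exactly the ordering problem described above.

def driver_headless_service(app_name: str, namespace: str) -> dict:
    """Build a headless Service manifest that selects the driver Pod."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": f"{app_name}-driver-svc",
            "namespace": namespace,
        },
        "spec": {
            # clusterIP: None makes the Service headless, so executors
            # resolve the driver Pod's address directly via DNS.
            "clusterIP": "None",
            "selector": {"spark-app-name": app_name, "spark-role": "driver"},
            "ports": [
                {"name": "driver-rpc", "port": 7078},
                {"name": "blockmanager", "port": 7079},
            ],
        },
    }

svc = driver_headless_service("my-app", "default")
```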
Further improvements:
- Allow users to configure automatic deletion of application CRs on a condition (always, onSuccess) to free up resources and make application names reusable.
Proposal
Summary: Introduce a new version of the SparkApplication CRD that omits the spec.job field and runs applications in client mode only.
- How does it make monitoring more robust?
When the operator is in control of driver and executor pods, it can take all of them into consideration when updating the application status.
Currently only the driver pod is taken into consideration.
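A minimal sketch of what such aggregation could look like once the operator owns all pods. The phase names are standard Kubernetes Pod phases; the aggregation policy itself is an assumption for illustration, not the operator's actual logic:

```python
# Sketch: derive an application-level status from ALL pods, not just the
# driver. The aggregation policy below is an illustrative assumption.

def application_status(driver_phase: str, executor_phases: list[str]) -> str:
    """Collapse the driver and executor Pod phases into one app status."""
    if driver_phase == "Failed" or "Failed" in executor_phases:
        return "Failing"
    if driver_phase == "Succeeded":
        return "Succeeded"
    if driver_phase == "Running" and all(p == "Running" for p in executor_phases):
        return "Running"
    return "Pending"
```

With only the driver in view, the second case below would go unnoticed; seeing all pods lets the operator surface it in the application status.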
- How does it make application life cycle more robust?
The operator can recognize failure states of both driver and executors and can act accordingly.
Currently the Job workload resubmits applications without much control from the operator.
In some cases, failure states are not propagated properly from executors to the Job pod, which leaves the application "hanging" in a failure state.
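Once the operator observes failures directly instead of the Job controller resubmitting on its own, it can apply an explicit restart policy. A sketch under stated assumptions: the policy names mirror Kubernetes restartPolicy values, and the decision logic is illustrative, not the operator's actual implementation.

```python
# Sketch: explicit life cycle decisions, instead of the opaque resubmission
# behavior of the Job workload. Policy names mirror Kubernetes restartPolicy
# values; the logic is an illustrative assumption.

def decide_action(app_status: str, restart_policy: str,
                  restarts: int, max_restarts: int) -> str:
    """Decide what the operator should do on the next reconcile."""
    if app_status != "Failing":
        return "none"
    if restart_policy == "Never":
        return "mark-failed"
    if restart_policy in ("Always", "OnFailure") and restarts < max_restarts:
        return "resubmit"
    return "mark-failed"
```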
- How does it make resource cleanup better?
All resources owned by the application are created by the operator and not by the submit job.
Therefore the operator knows exactly what it needs to clean up and when.
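Because every resource would be created by the operator, cleanup can also lean on standard Kubernetes garbage collection by attaching an ownerReference pointing at the application CR. The sketch below uses the standard ownerReferences schema; the API group/version string is a hypothetical placeholder.

```python
# Sketch: attach an ownerReference so that deleting the SparkApplication CR
# garbage-collects the owned resource automatically. The apiVersion below is
# a hypothetical placeholder; the ownerReferences fields are standard.

def with_owner(resource: dict, app_uid: str, app_name: str) -> dict:
    """Mark `resource` as owned by the SparkApplication CR."""
    ref = {
        "apiVersion": "example.com/v1alpha1",  # hypothetical group/version
        "kind": "SparkApplication",
        "name": app_name,
        "uid": app_uid,
        "controller": True,
        "blockOwnerDeletion": True,
    }
    meta = resource.setdefault("metadata", {})
    meta.setdefault("ownerReferences", []).append(ref)
    return resource
```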