OpenVINO BK fallback to Llama.CPP CPU BK #27
+305
−7
This PR enables the OpenVINO backend to fall back to the llama.cpp CPU backend.
Below is a summary of the main process:
1. Dynamic Dimension Computation
Function: compute_cgraph_dynamic_dims()
Purpose: Determines the dynamic dimension of each node in the computation graph. This is required to handle nodes whose shapes vary at runtime.
Process:
Traverses the computation graph.
Assigns dynamic dimension indices to nodes based on their operation type and dependencies.
Handles specific operations like [GGML_OP_VIEW], [GGML_OP_RESHAPE], and others to propagate dynamic dimensions.
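The propagation idea in step 1 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Op`, `Node`, and `compute_dynamic_dims` are hypothetical stand-ins for the ggml types and for compute_cgraph_dynamic_dims(), and the rule "inherit the dynamic dimension index from the first source that has one" is an assumed simplification. In particular, a real reshape may move the dynamic dimension to a different index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for ggml ops and tensors; not the real ggml types.
enum class Op { Input, View, Reshape, Add };

struct Node {
    Op op;
    std::vector<int> srcs;  // indices of source nodes, each smaller than this node's index
    int dynamic_dim;        // index of the dynamic dimension, or -1 if the shape is static
};

// Walks the graph in topological order. Graph inputs keep the value seeded by
// the caller; every other node inherits the dynamic dimension index of its
// first source that has one (assumed rule, see the lead-in).
void compute_dynamic_dims(std::vector<Node>& graph) {
    for (std::size_t i = 0; i < graph.size(); ++i) {
        Node& node = graph[i];
        if (node.op == Op::Input) {
            continue;
        }
        node.dynamic_dim = -1;
        for (int s : node.srcs) {
            // Sources must already have been visited.
            assert(s >= 0 && static_cast<std::size_t>(s) < i);
            if (graph[s].dynamic_dim >= 0) {
                node.dynamic_dim = graph[s].dynamic_dim;
                break;
            }
        }
    }
}
```

In this sketch a node fed only by static-shaped sources stays static, which is what lets later stages tell apart the tensors whose shapes must be resolved at runtime.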
2. Adding Extra Model Outputs
Function: add_extra_model_outputs_for_fallback()
Purpose: Ensures that all relevant nodes in the computation graph are included as model outputs for fallback scenarios.
Process:
Maps tensor data addresses to their corresponding nodes, excluding [GGML_OP_VIEW] nodes.
Adds nodes to the [m_model_outputs] map if they are not already present.
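As a rough illustration of step 2, the sketch below builds an address-to-node map that skips view nodes and then registers each mapped node as an output unless one with the same name is already present. `Tensor`, `Op`, and `add_extra_outputs` are hypothetical simplifications, and keying the output map by tensor name is an assumption; the actual key and value types of m_model_outputs are not shown in this description.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified tensor; not the real ggml_tensor.
enum class Op { View, Other };

struct Tensor {
    std::string name;
    Op op;
    const void* data;  // address of the buffer the tensor writes to
};

// Registers every non-view node as a model output unless it is already present.
void add_extra_outputs(const std::vector<Tensor>& nodes,
                       std::map<std::string, const Tensor*>& model_outputs) {
    // Map each data address to the node that owns it. View nodes alias
    // another node's buffer, so they are excluded.
    std::map<const void*, const Tensor*> address_to_node;
    for (const Tensor& t : nodes) {
        if (t.op == Op::View) {
            continue;
        }
        assert(t.data != nullptr);    // a node without a buffer cannot be keyed by address
        address_to_node[t.data] = &t; // a later node on the same buffer replaces an earlier one
    }
    for (const auto& entry : address_to_node) {
        // emplace leaves an existing entry with the same name untouched.
        model_outputs.emplace(entry.second->name, entry.second);
    }
}
```

Exposing every such node as an output is what would let a fallback path read any intermediate result, at the cost of a larger output set.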
3. Adding Extra Model Inputs
Function: add_extra_model_inputs_for_fallback()
Purpose: Ensures that all necessary input nodes are included as model inputs for fallback scenarios.
Process:
Iterates through the source nodes of each computation graph node.
Skips nodes already in [m_model_weights] or [m_model_inputs].
Excludes intermediate nodes from [m_node_info_list].
Creates OpenVINO parameter nodes for eligible source nodes and updates the [m_inputs] and [m_model_inputs] maps.
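The filtering in step 3 can be sketched as below. This is an assumption-laden illustration rather than the PR's code: `Tensor`, `Parameter`, and `Decoder` are hypothetical, `Parameter` merely stands in for an OpenVINO parameter node, and the member names mirror m_model_weights, m_model_inputs, m_inputs, and m_node_info_list only loosely.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified types; not the real ggml or OpenVINO classes.
struct Tensor {
    std::string name;
    std::vector<const Tensor*> srcs;
};

struct Parameter {  // stand-in for an OpenVINO parameter node
    std::string name;
};

struct Decoder {
    std::set<std::string> weights;                             // cf. m_model_weights
    std::map<std::string, const Tensor*> model_inputs;         // cf. m_model_inputs
    std::map<std::string, std::shared_ptr<Parameter>> inputs;  // cf. m_inputs
    std::vector<const Tensor*> node_list;                      // cf. m_node_info_list

    void add_extra_inputs() {
        // Nodes computed inside the graph are intermediates, never inputs.
        std::set<const Tensor*> intermediates(node_list.begin(), node_list.end());
        for (const Tensor* node : node_list) {
            for (const Tensor* src : node->srcs) {
                assert(src != nullptr);
                if (weights.count(src->name) || model_inputs.count(src->name)) {
                    continue;  // already known as a weight or an input
                }
                if (intermediates.count(src)) {
                    continue;  // produced by another node in this graph
                }
                inputs[src->name] = std::make_shared<Parameter>(Parameter{src->name});
                model_inputs[src->name] = src;
            }
        }
    }
};
```

With these assumptions, only sources that come from outside the graph and are not yet registered end up as new parameters.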
For example: