server: preserve context checkpoint coverage #22826
Instead of always removing the oldest context checkpoint, remove the one that appears most redundant based on the distance between its neighbors.
Hi @jacekpoplawski, thanks for your contribution! Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:
Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.
The idea is OK, but it is still a "poor man's" solution. The optimal way to do the checkpoints is to leverage the changes in #21885 and take into account the structure of the conversation.
If I understand correctly, #21885 would tell us where the important positions are, and checkpoint removal should prefer keeping checkpoints around those positions. Or do you mean that this information should be used when creating checkpoints instead?
Yes, the information should be used for creating the checkpoints right before user inputs.
Instead of always removing the oldest context checkpoint when the checkpoint limit is reached, remove the checkpoint that appears most redundant based on the distance between its neighbors.
Overview
This is my attempt to fix `forcing full prompt re-processing due to lack of cache data`.
This changes the checkpoint removal policy: when the limit is reached, it removes an interior checkpoint whose neighboring checkpoints are closest together.
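To make the removal rule concrete, here is a minimal Python sketch of the selection step. It is not the code from this PR: the function name `pick_redundant_checkpoint`, the representation of checkpoints as a sorted list of token positions, and the tie-breaking rule (the oldest candidate wins) are all assumptions made for illustration.

```python
from typing import List, Optional


def pick_redundant_checkpoint(positions: List[int]) -> Optional[int]:
    """Return the index of the interior checkpoint whose two neighbors
    are closest together, or None if there is no interior checkpoint.

    `positions` is assumed to be sorted in ascending order (hypothetical
    representation; the server's real data structure may differ).
    """
    if len(positions) < 3:
        return None  # only endpoints exist, nothing interior to remove

    best_idx: Optional[int] = None
    best_gap: Optional[int] = None
    for i in range(1, len(positions) - 1):
        # Distance that would remain uncovered if checkpoint i were removed.
        gap = positions[i + 1] - positions[i - 1]
        if best_gap is None or gap < best_gap:  # strict '<': oldest wins ties
            best_idx, best_gap = i, gap
    return best_idx


# Made-up positions: regular 8192-token checkpoints plus a dense cluster.
checkpoints = [0, 8192, 16384, 16500, 16600, 24576]
idx = pick_redundant_checkpoint(checkpoints)
print(idx, checkpoints[idx])
```

In this made-up list, the checkpoint at 16500 sits between neighbors only 216 tokens apart, so it is the one selected. The first and last checkpoints are never candidates, which is what keeps the oldest coverage alive.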
Additional information
I use the following arguments:
`--ctx-checkpoints 24 --checkpoint-every-n-tokens 8192 --cache-ram 65536`
After just a few prompts in a pi coding agent, I see: the server needed a checkpoint around `n_past = 3579`, but all available checkpoints were much later, from `20479` to `32656`, causing full prompt re-processing.
The root cause seems to be that checkpoints are not only created at the `--checkpoint-every-n-tokens` interval. Additional checkpoints can be created near prompt/request boundaries, and with the previous FIFO removal policy these dense recent checkpoints can erase older checkpoints.
I first tried disabling the additional checkpoint creation, but that did not work well.
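The failure mode described above can be reproduced in a small simulation. The sketch below is an illustration only: the checkpoint positions are invented (a few sparse early checkpoints followed by dense clusters), the helper names are hypothetical, and it assumes eviction happens just before a new checkpoint is appended. Only the query position `3579` is taken from the report.

```python
from typing import Callable, List


def evict_fifo(cps: List[int]) -> None:
    """Previous policy: always drop the oldest checkpoint."""
    cps.pop(0)


def evict_most_redundant(cps: List[int]) -> None:
    """Sketch of the new policy: drop the interior checkpoint whose
    neighbors are closest together (first minimum wins ties)."""
    if len(cps) < 3:
        cps.pop(0)  # no interior checkpoint, fall back to FIFO
        return
    idx = min(range(1, len(cps) - 1), key=lambda i: cps[i + 1] - cps[i - 1])
    cps.pop(idx)


def simulate(positions: List[int], limit: int,
             evict: Callable[[List[int]], None]) -> List[int]:
    """Add checkpoints in order, evicting one whenever the limit is hit."""
    cps: List[int] = []
    for pos in positions:
        if len(cps) >= limit:
            evict(cps)
        cps.append(pos)
    return cps


def usable(cps: List[int], n_past: int) -> List[int]:
    """Checkpoints that can serve a request resuming at n_past."""
    return [p for p in cps if p <= n_past]


# Invented positions: sparse early checkpoints, then dense recent clusters.
positions = [512, 3000, 8192, 16384, 20000, 20100, 20200,
             24576, 24600, 24700, 32000, 32100, 32200]

fifo = simulate(positions, 8, evict_fifo)
redundant = simulate(positions, 8, evict_most_redundant)

print("FIFO:     ", fifo, "usable at 3579:", usable(fifo, 3579))
print("Redundant:", redundant, "usable at 3579:", usable(redundant, 3579))
```

With these invented positions and a limit of 8, FIFO ends up with no checkpoint at or before 3579, while the gap-based policy still holds one at 3000. The real server's behavior depends on where checkpoints are actually created, so this only shows the shape of the problem.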
I tested this change with `--ctx-checkpoints 8` to trigger checkpoint removal sooner, and I could not reproduce the `forcing full prompt re-processing due to lack of cache data` message.
Requirements