Describe the feature you'd like to request
Perhaps an environment variable that either sets the embedding batch size explicitly or else falls back to the batch size defined in the model's config.
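For illustration only, a minimal sketch of what such an override could look like in a compose file. The variable name `CC_EMBEDDING_BATCH_SIZE` is purely hypothetical; no such option exists today, this is just the shape of the feature being requested:

```yaml
# Hypothetical sketch: CC_EMBEDDING_BATCH_SIZE does not exist yet.
services:
  context_chat_backend:
    environment:
      # Explicit override of the embedding batch size
      - CC_EMBEDDING_BATCH_SIZE=2048
      # Or, unset / set to a sentinel to defer to the model's own config:
      # - CC_EMBEDDING_BATCH_SIZE=auto
```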
Describe the solution you'd like
We have tried adjusting physical_batch_size in a dozen different places across various configs (in the ccb data, in LocalAI itself, and in our model YAML files), but Context Chat always hits the same limit no matter what we do: batch size: 512.
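For reference, one of the model YAML variants we tried looked roughly like the sketch below. Whether LocalAI's `batch` field actually maps to the embedding batch limit (llama.cpp's n_batch) is our assumption; in any case it did not lift the 512 limit for us:

```yaml
# LocalAI model definition (sketch of one attempt).
# Assumption: `batch` maps to llama.cpp's n_batch; it had no effect on the 512 limit.
name: text-embedding
backend: llama-cpp
embeddings: true
batch: 2048
parameters:
  model: embedding-model.gguf
```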
Describe alternatives you've considered
Perhaps the built-in embedding from ccb doesn't have this problem, in contrast to the LocalAI embedding, but I'm not ready to try it just yet: it took me a month to get ccb + LocalAI running on one machine and NC AIO on the other.