Python: fix(redis): wrap IndexDefinition prefix in list; use _get_redis_key in JSON delete#13970
Conversation
…y in JSON delete
Two bugs in the Redis connector:
1. ensure_collection_exists passed prefix as a bare str to IndexDefinition.
redis-py iterates over it character by character, producing a PREFIX
clause with one entry per character instead of a single collection prefix.
Fix: wrap in a list.
2. RedisJsonCollection._inner_delete called json().delete(key) with the raw
caller-supplied key, but upsert stores keys as {collection_name}:{key}.
The delete always targeted a non-existent key and silently returned 0.
Fix: use self._get_redis_key(key), consistent with RedisHashsetCollection.
Fixes microsoft#13894, microsoft#13904
There was a problem hiding this comment.
Pull request overview
This PR fixes two Redis connector bugs in the Python vector store implementation (RedisCollection / RedisJsonCollection) to ensure RediSearch indexes are created with the intended key prefix and JSON deletions target the actual stored keys.
Changes:
- Wrap
IndexDefinition.prefixin a list soredis-pydoesn’t iterate the prefix string character-by-character. - Apply
_get_redis_key()when deleting JSON records so deletes work when key prefixing is enabled.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| fields = _definition_to_redis_fields(self.definition, self.collection_type) | ||
| index_definition = IndexDefinition( | ||
| prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| ) | ||
| await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs) |
| index_definition = IndexDefinition( | ||
| prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| ) | ||
| await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs) |
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 94%
✓ Correctness
Both fixes are correct and well-motivated. Bug 1 wraps the IndexDefinition prefix string in a list, which is required because redis-py's IndexDefinition iterates over the prefix argument — a bare string would be iterated character-by-character. Bug 2 adds the missing _get_redis_key() call in RedisJsonCollection._inner_delete, making it consistent with RedisHashsetCollection._inner_delete (line 581) and RedisJsonCollection._inner_get (line 699), both of which already use _get_redis_key to prepend the collection name prefix.
✓ Security Reliability
Both changes are correct and well-scoped bug fixes. Fix 1 wraps the IndexDefinition prefix in a list to prevent character-by-character iteration of the string — consistent with redis-py's expected API. Fix 2 adds the missing _get_redis_key() call in RedisJsonCollection._inner_delete, making it consistent with _inner_get (line 699) and the hashset collection's _inner_delete (line 581). No security or reliability concerns are introduced by these changes.
✓ Test Coverage
Both fixes are correct and address real bugs, but neither has unit test coverage that would actually catch a regression. Bug 1 (IndexDefinition prefix as list):
test_create_indexmockscreate_indexentirely, so theIndexDefinitionis constructed but never inspected — the test passes whether prefix is a string or a list. Bug 2 (JSON delete with_get_redis_key):test_deleteonly uses non-prefix collections (prefix_collection_name_to_key_names=False), making_get_redis_keya no-op. The fix is only meaningful with prefixing enabled, but no such test exists for delete. Compare withtest_get, which does parametrize overprefix=True/Falsefor both collection types. Both fixes should have corresponding tests that verify the corrected behavior.
✗ Design Approach
The JSON delete change follows the existing key-normalization pattern correctly, but the index-definition fix is still incomplete: it wraps the prefix in a list, yet the auto-created index prefix remains unconditional and does not match the default write path when
prefix_collection_name_to_key_namesis left at its defaultFalse.
Flagged Issues
-
python/semantic_kernel/connectors/redis.py:281hard-codes an index prefix ofcollection_name:even though both concrete collections defaultprefix_collection_name_to_key_names=False(redis.py:524, redis.py:652) and store keys via_get_redis_key, which returns the raw key when that flag is false (redis.py:250-253, :599, :723). The defaultupsertwritesid1while the index only matchestest:-prefixed keys. The auto-generated index definition should derive its prefix fromprefix_collection_name_to_key_namesinstead of always forcingcollection_name:.
Automated review by nileshpatil6's agents
| @override | ||
| async def _inner_delete(self, keys: Sequence[str], **kwargs: Any) -> None: | ||
| await asyncio.gather(*[self.redis_database.json().delete(key, **kwargs) for key in keys]) | ||
| await asyncio.gather(*[self.redis_database.json().delete(self._get_redis_key(key), **kwargs) for key in keys]) |
There was a problem hiding this comment.
Missing test coverage. This fix is correct — RedisHashsetCollection._inner_delete (line 581) already uses _get_redis_key. However, the existing test_delete (test_redis_store.py:274) only uses non-prefix collections (collection_hash/collection_json), where _get_redis_key returns the key unchanged. The test passes identically with the old and new code. Please add a test with collection_with_prefix_json that asserts json().delete() receives 'test:id1', not 'id1'.
| fields = _definition_to_redis_fields(self.definition, self.collection_type) | ||
| index_definition = IndexDefinition( | ||
| prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type] |
There was a problem hiding this comment.
Design mismatch & missing test coverage. Wrapping the prefix in a list fixes the redis-py API usage, but the index prefix is still unconditionally set to collection_name:. Both RedisHashsetCollection and RedisJsonCollection default prefix_collection_name_to_key_names=False (redis.py:524, redis.py:652), so _get_redis_key returns the raw key (redis.py:250-253). Default upsert writes id1 while the index only matches test:-prefixed keys. The prefix should be derived from prefix_collection_name_to_key_names. Additionally, test_create_index (test_redis_store.py:294) mocks create_index without inspecting its arguments — consider asserting on the mock's call_args to verify the definition was constructed with the correct prefix.
…x_collection_name_to_key_names When prefix_collection_name_to_key_names=False (the default for both RedisHashsetCollection and RedisJsonCollection), _get_redis_key() returns the raw key, so upsert stores e.g. "id1". The previous fix still unconditionally set prefix=["collection_name:"] in ensure_collection_exists, meaning the RediSearch index only matched prefixed keys while records were stored under raw keys — vector search would return nothing by default. Fix: only set the prefix when prefix_collection_name_to_key_names=True; otherwise create the IndexDefinition without a prefix so the index matches all keys regardless of naming scheme.
…elete key - test_create_index_respects_prefix_flag: asserts that ensure_collection_exists passes a list prefix ["test:"] to create_index when prefix_collection_name_to_key_names=True, and no prefix when False. - test_delete_json_with_prefix: asserts that RedisJsonCollection._inner_delete calls json().delete() with the prefixed key "test:id1" when prefix_collection_name_to_key_names=True.
Summary
Two bugs in
python/semantic_kernel/connectors/redis.py:Bug 1 —
IndexDefinitionprefix iterates character-by-character (fixes #13894)ensure_collection_existspassed a barestras theprefixargument:redis-py'sIndexDefinitionaccepts an iterable of prefixes. When passed a plain string it iterates character by character, generating aPREFIX N c o l l e c t i o n ...clause instead ofPREFIX 1 collection_name:. This causes the index to match nothing (or incorrectly).Bug 2 —
RedisJsonCollection._inner_deletesilently no-ops (fixes #13904)_inner_deletepassed the raw caller key tojson().delete():But
_single_upsertstores keys as{collection_name}:{key}(via_get_redis_key). The delete always targeted a non-existent key and returned 0 without raising.RedisHashsetCollection._inner_deletealready uses_get_redis_keycorrectly.Testing
Both fixes are straightforward and can be verified with a live Redis instance by:
RedisJsonCollection, upserting a record, then deleting it and confirming it's gone.FT.INFO <collection_name>to confirm thePREFIXclause is1 <collection_name>:afterensure_collection_exists.