-
-
Notifications
You must be signed in to change notification settings - Fork 25
Add an improved backend for the Cause search #658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,7 @@ | |
|
|
||
| from donations.models.donors import Donor | ||
| from donations.models.ngos import Cause, Ngo | ||
| from utils.text.registration_number import probable_registration_number | ||
|
|
||
|
|
||
| class ConfigureSearch: | ||
|
|
@@ -99,7 +100,7 @@ def get_search_results(cls, queryset: QuerySet, query: str, language_code: str) | |
| return ngos | ||
|
|
||
|
|
||
| class CauseSearchMixin(CommonSearchMixin): | ||
| class DeprecatedCauseSearchMixin(CommonSearchMixin): | ||
| @classmethod | ||
| def get_search_results(cls, queryset: QuerySet, query: str, language_code: str) -> QuerySet[Cause]: | ||
| search_fields = ["name"] | ||
|
|
@@ -126,6 +127,44 @@ def get_search_results(cls, queryset: QuerySet, query: str, language_code: str) | |
| return causes | ||
|
|
||
|
|
||
| class CauseSearchMixin(CommonSearchMixin): | ||
| @classmethod | ||
| def get_search_results(cls, queryset: QuerySet, query: str, language_code: str) -> QuerySet[Cause]: | ||
| if settings.ENABLE_CAUSE_SEARCH_EXACT_MATCH: | ||
| query_filter = Q(name__icontains=query) | ||
| # If the query looks like a registration number then also try to find the main causes owned by | ||
| # organisations which have that registration number | ||
| registration_number = probable_registration_number(query) | ||
| if registration_number: | ||
| query_filter = query_filter | (Q(ngo__registration_number=registration_number) & Q(is_main=True)) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't we remove the initial
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't know if we have a real registration number or just some plain text. |
||
|
|
||
| exact_causes: QuerySet[Cause] = queryset.filter(query_filter).order_by("id").distinct("id") | ||
| if exact_causes.count(): | ||
| return exact_causes | ||
|
|
||
| search_vector: SearchVector = ConfigureSearch.vector(("name",), language_code) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should the vector be configured by what we're searching? |
||
| search_query: SearchQuery = ConfigureSearch.query(query, language_code) | ||
|
|
||
| if settings.ENABLE_CAUSE_SEARCH_WORD_SIMILARITY: | ||
| trigram_similarity = TrigramWordSimilarity(query, "name") | ||
| similarity_threshold = 0.4 | ||
| else: | ||
| trigram_similarity = TrigramSimilarity("name", query) | ||
| similarity_threshold = 0.3 | ||
|
|
||
| fuzzy_causes: QuerySet[Cause] = ( | ||
| queryset.annotate( | ||
| rank=SearchRank(search_vector, search_query), | ||
| similarity=trigram_similarity, | ||
| ) | ||
| .filter(Q(rank__gte=0.3) | Q(similarity__gt=similarity_threshold)) | ||
| .order_by("id") | ||
| .distinct("id") | ||
| ) | ||
|
|
||
| return fuzzy_causes | ||
|
|
||
|
|
||
| class NgoCauseMixedSearchMixin(CommonSearchMixin): | ||
| @classmethod | ||
| def get_search_results(cls, queryset: QuerySet, query: str, language_code: str) -> QuerySet[Cause]: | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Search is a bit annoying, but we should take into account one path if we get a registration number (CIF) and a different one if it isn't that.
It seems a bit more straightforward if it's a registration number and we may need to just run a filter with
Q(ngo__registration_number=registration_number) & Q(is_main=True)and not much more (we probably don't need search vectors or other things.In this
ifstatement I'd branch out things and directlyreturn Cause.objects.filter(Q(ngo__registration_number=registration_number) & Q(is_main=True))or something like thatSimplifies the flow and leaves a clearer trail to follow once you only have the name.