Fix trending and most-popular returning identical results #12

Merged
rainxchzed merged 2 commits into main from category-specific-scoring
May 5, 2026

@rainxchzed rainxchzed commented May 5, 2026

Summary

A user reported that `/v1/categories/trending/android` and `/v1/categories/most-popular/android` return ~99% identical top-10 results. Confirmed:

```
trending → 2dust/v2rayNG, ReVanced/revanced-manager, MetaCubeX/ClashMetaForAndroid, ...
most-popular → 2dust/v2rayNG, ReVanced/revanced-manager, MetaCubeX/ClashMetaForAndroid, ...
```

7-8 of 10 are identical. Same root cause for topics across multiple categories.

Root cause

`RepoRepository.findByCategory` always sorted by the global `Repos.searchScore` regardless of category. Combined with category membership overlap (34 of 49 trending/android repos are also in most-popular/android), the same global score on a heavily-overlapping set produces near-identical lists.

The Python fetcher already populates per-category scores:

  • 722 repos have `trending_score` (velocity-flavoured)
  • 500 repos have `popularity_score` (absolute-flavoured)

These columns existed but were never used for sort ordering. The query used them only for response decoration (the `trendingScore` / `popularityScore` fields on `RepoResponse`).
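The collapse described above can be reproduced in a few lines. This is a minimal in-memory sketch, not the repository code: `ScoredRepo`, `topByGlobalScore`, and the sample rows are hypothetical names and data invented for illustration. It only shows that two heavily overlapping member sets ranked by the same global score yield the same top-N.

```kotlin
// Hypothetical stand-in for a repo row; only the global score matters here.
data class ScoredRepo(val name: String, val searchScore: Double)

// Ranks any member set by the same global score, as the old query did.
fun topByGlobalScore(members: List<ScoredRepo>, n: Int): List<String> =
    members.sortedByDescending { it.searchScore }.take(n).map { it.name }

// Invented sample data: three repos belong to both categories.
val shared = listOf(
    ScoredRepo("example/a", 0.9),
    ScoredRepo("example/b", 0.8),
    ScoredRepo("example/c", 0.7),
)
val trendingMembers = shared + ScoredRepo("example/only-trending", 0.2)
val popularMembers = shared + ScoredRepo("example/only-popular", 0.3)

fun main() {
    println(topByGlobalScore(trendingMembers, 3)) // [example/a, example/b, example/c]
    println(topByGlobalScore(popularMembers, 3))  // [example/a, example/b, example/c]
}
```

The category-only members never reach the top because the shared, globally high-scoring repos dominate both lists.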

Fix

Pick the primary sort column based on category:

| Category | Primary sort | Fallback | Tie-breaker |
|---|---|---|---|
| trending | `trending_score` DESC NULLS LAST | global `search_score` | static `rank` |
| most-popular | `popularity_score` DESC NULLS LAST | global `search_score` | static `rank` |
| new-releases | `latest_release_date` DESC NULLS LAST | global `search_score` | static `rank` |
| (other) | global `search_score` | -- | static `rank` |

Per-category scores are written by the daily Python fetcher only for repos in that category, so the global `search_score` fallback handles newly-ingested rows that haven't been ranked yet.
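The ordering in the table can be sketched with plain Kotlin comparators. This is an in-memory approximation under stated assumptions, not the Exposed query: the `Repo` data class, `descNullsLast`, `categoryComparator`, and the sample rows are hypothetical. Note that a multi-key sort is lexicographic, so rows whose category score is NULL land after every scored row and are ordered among themselves by `search_score`.

```kotlin
import java.time.Instant

// Hypothetical in-memory row; field names mirror the columns in the PR.
data class Repo(
    val name: String,
    val rank: Int,
    val searchScore: Double?,
    val trendingScore: Double? = null,
    val popularityScore: Double? = null,
    val latestReleaseDate: Instant? = null,
)

// Descending order on a nullable key, with NULLs sorted last.
fun <K : Comparable<K>> descNullsLast(selector: (Repo) -> K?): Comparator<Repo> =
    compareBy(nullsLast(reverseOrder<K>()), selector)

// Primary key depends on the category; search_score and rank follow.
fun categoryComparator(category: String): Comparator<Repo> {
    val primary: Comparator<Repo> = when (category) {
        "trending" -> descNullsLast<Double> { it.trendingScore }
        "most-popular" -> descNullsLast<Double> { it.popularityScore }
        "new-releases" -> descNullsLast<Instant> { it.latestReleaseDate }
        else -> descNullsLast<Double> { it.searchScore }
    }
    return primary
        .then(descNullsLast<Double> { it.searchScore })
        .thenBy { it.rank }
}

// Invented sample rows; "c" and "d" have no per-category scores yet.
val sampleRepos = listOf(
    Repo("a", rank = 1, searchScore = 0.9, trendingScore = 0.1, popularityScore = 0.95),
    Repo("b", rank = 2, searchScore = 0.8, trendingScore = 0.7, popularityScore = 0.5),
    Repo("c", rank = 3, searchScore = 0.7),
    Repo("d", rank = 4, searchScore = 0.95),
)

fun orderFor(category: String): List<String> =
    sampleRepos.sortedWith(categoryComparator(category)).map { it.name }

fun main() {
    println(orderFor("trending"))     // [b, a, d, c]
    println(orderFor("most-popular")) // [a, b, d, c]
    println(orderFor("topics"))       // [d, a, b, c]
}
```

With this sample data the trending and most-popular lists diverge at the top, while the unscored rows `d` and `c` keep their global-score order at the bottom of both.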

`new-releases` previously also collapsed onto `search_score`. It now sorts by the actual release date, which matches user intent ("newest releases first").

The topics endpoint stays on the global `search_score` -- topics are flat lists, not flavour-segmented like the categories.

Test plan

  • Local: cannot compile -- the dev workstation's Gradle cache lives on an unplugged external volume. Relying on CI for this PR.
  • After merge + deploy, hit `/v1/categories/trending/android` and `/v1/categories/most-popular/android` and confirm the top-10 sets diverge meaningfully.
  • Confirm `/v1/categories/new-releases/{platform}` returns repos sorted by recent release date.
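The two post-deploy checks could be scripted roughly as follows. This sketch assumes the caller has already fetched each endpoint and extracted the repo names and release dates; the response shape is not shown in this PR, so `topNOverlap` and `isNewestFirst` are hypothetical helpers that operate on plain lists.

```kotlin
import java.time.Instant

// Counts how many repos appear in the top-n of both lists.
fun topNOverlap(a: List<String>, b: List<String>, n: Int = 10): Int =
    a.take(n).toSet().intersect(b.take(n).toSet()).size

// True when dates are non-increasing and every NULL trails the dated rows,
// i.e. the list is consistent with DESC NULLS LAST.
fun isNewestFirst(dates: List<Instant?>): Boolean =
    dates.zipWithNext().all { (prev, next) ->
        when {
            next == null -> true   // a NULL may follow anything
            prev == null -> false  // a date after a NULL breaks NULLS LAST
            else -> prev >= next
        }
    }

fun main() {
    // Invented sample inputs standing in for parsed endpoint responses.
    val trending = listOf("example/a", "example/b", "example/c")
    val popular = listOf("example/b", "example/c", "example/d")
    println(topNOverlap(trending, popular, 3)) // 2

    val dates = listOf(
        Instant.parse("2026-05-02T00:00:00Z"),
        Instant.parse("2026-05-01T00:00:00Z"),
        null,
    )
    println(isNewestFirst(dates)) // true
}
```

What counts as diverging "meaningfully" is a judgment call; the reported bug corresponds to an overlap of 7-8 out of 10, so a noticeably lower number after deploy would be the signal to look for.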

Summary by CodeRabbit

  • Improvements
    • Enhanced search result sorting to be category-aware—trending, popular, and new release searches now use optimized sorting algorithms for improved result relevance.

coderabbitai Bot commented May 5, 2026

Warning

Rate limit exceeded

@rainxchzed has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 36 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d231231-995b-4995-a31a-d043ed694f7c

📥 Commits

Reviewing files that changed from the base of the PR and between 0512856 and af75e41.

📒 Files selected for processing (1)
  • CLAUDE.md

Walkthrough

The findByCategory method in RepoRepository.kt now selects its primary sort field dynamically based on the category parameter: trending uses trendingScore, most-popular uses popularityScore, new-releases uses latestReleaseDate, and others use searchScore, with RepoCategories.rank as a secondary tie-breaker.

Changes

Category-Specific Query Ordering

| Layer / File(s) | Summary |
|---|---|
| Sort Logic<br>`src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt` | `findByCategory` applies category-dependent primary sort fields (`trendingScore`, `popularityScore`, `latestReleaseDate`, or `searchScore`) with `RepoCategories.rank` as ascending tie-breaker, replacing fixed `searchScore` ordering. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

Poem

🐰 Hop, hop—the queries now spring,
Each category finds its own sorting wing,
Trending trends high, popularity glows,
Freshest releases—wherever it goes!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically identifies the main fix: resolving identical results between trending and most-popular categories by implementing category-specific sorting logic. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt`:
- Around line 22-49: The findByCategory() implementation currently uses a
category-specific 'primary' expression (defined in the when block) as the first
ORDER BY key; change it so the ORDER BY follows the repository contract: use
Repos.searchScore DESC NULLS LAST as the first sort key and RepoCategories.rank
ASC as the second. Concretely, remove or stop using the local primary variable
in the orderBy call and update the orderBy(...) in the
Repos.innerJoin(...).selectAll().where(...) chain to order first by
Repos.searchScore to SortOrder.DESC_NULLS_LAST and then by RepoCategories.rank
to SortOrder.ASC, ensuring any category-specific columns are not promoted ahead
of searchScore.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c1150716-25ab-4b16-babd-1e03b942ac35

📥 Commits

Reviewing files that changed from the base of the PR and between 7791188 and 0512856.

⛔ Files ignored due to path filters (1)
  • .claude/scheduled_tasks.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt

Comment on lines +22 to 49

```kotlin
// Primary sort is category-specific: trending velocity for the
// trending list, absolute popularity for the popular list, release
// recency for new-releases. Without category-specific primary, both
// trending and most-popular collapse onto the same global
// search_score and return ~99% identical top-N results -- the bug
// this query previously had.
//
// Each category falls back to the global behavioral search_score
// when its category-specific column is NULL, then to the static
// rank the Python fetcher writes once a day. The fetcher populates
// the category-specific scores for repos in that category, so the
// fallback is mostly a no-op except for newly-ingested rows that
// haven't been reranked yet.
val primary: org.jetbrains.exposed.sql.Expression<*> = when (category) {
    "trending" -> Repos.trendingScore
    "most-popular" -> Repos.popularityScore
    "new-releases" -> Repos.latestReleaseDate
    else -> Repos.searchScore
}
Repos.innerJoin(RepoCategories, { id }, { repoId })
    .selectAll()
    .where {
        (RepoCategories.category eq category) and (RepoCategories.platform eq platform)
    }
    .orderBy(
        primary to SortOrder.DESC_NULLS_LAST,
        Repos.searchScore to SortOrder.DESC_NULLS_LAST,
        RepoCategories.rank to SortOrder.ASC,
```

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

findByCategory ordering now violates the repository sorting contract

At lines 35-49, `findByCategory` no longer uses `Repos.searchScore` DESC NULLS LAST as the primary sort key; it now prioritizes category-specific columns. This conflicts with the required repository behavior and can create inconsistent ranking semantics across endpoints.

Please align ordering back to:

  1. Repos.searchScore DESC_NULLS_LAST
  2. RepoCategories.rank ASC

As per coding guidelines, "**/*Repository.kt: RepoRepository.findByCategory() and findByTopicBucket() must sort by searchScore DESC NULLS LAST, rank ASC—static rank is only a tie-breaker; behavioral signals dominate".

@rainxchzed rainxchzed merged commit 0ece28a into main May 5, 2026
2 checks passed
@rainxchzed rainxchzed deleted the category-specific-scoring branch May 5, 2026 10:37