Add book_slate_recommendation template (Graph + Prescriptive CSP) #59
Three-pillar Graph + Paths + Prescriptive (MIP) template. Sketch state -- not yet validated end-to-end. Next step is an E2E run against a live RAI account to debug the pipeline.

Pipeline:
- Graph: PageRank over Movie.similar_to graph -> structural prior.
- Paths: bounded explanation-path enumeration (<=3 hops) -> path-counts-by-type as integer features.
- Prescriptive (HiGHS MIP): K-item slate per user under genre / director / actor diversity, freshness floor, originals exposure floor, cold-start cap, explanation-path floor; objective blends PageRank prior with path signal.

Lead dataset: small hand-crafted sample modelled on the MovieLens-1M-KG schema (KGAT distribution). README points at the KGAT distribution for the realistic-instance build.

Production precedent: Pinterest Pixie (random-walk recsys at production scale); Alibaba iGraph, eBay KPRN, LinkedIn Career Explorer, GE Healthcare KARE; regulatory drivers (GDPR Art. 22, EU AI Act Art. 86, ECJ C-203/22).

Plan: ~/plans/csp-templates-coverage-epic.md "kg_aware_slate_recommendation" deferred-candidate entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Working pipeline: status OPTIMAL, objective 9779.76, all 9 ICs verified on the bundled sample. Watched-exclusion confirmed (user 3 correctly skipped la_la_land, which they had already watched).

Changes vs. the initial sketch:
- Item supertype + Item.connected_to(Item, Item) unified edge. Workaround for the v1.1.0 paths-lib gap on multi-edge path() (paths/README.md "Currently unsupported patterns" §1, design epic RAI-44166). Preserves real heterogeneous KG bounded walks (User -> Movie -> Director -> Movie ...) instead of falling back to a Movie.similar_to-only walk.
- PageRank stays Float (HiGHS handles float coefficients on binary decisions natively, same pattern as supply_chain). Dropped the (score * SCALE).cast(Integer) rescale that doesn't lift in 1.1.0.
- Problem(model, Float); Candidate.pick is a Float-typed bin var to match supply_chain's binary-on-Float-problem pattern.
- Watched-exclusion via a pick == 0 IC at the prescriptive layer (negation in rules not yet supported in 1.1.0; same gap compliance_rule_audit documents).
- User.watched ingest uses a named Integer.ref() for rating to avoid the unground-variable typer error.
- Slate size K reduced from 8 to 3 to fit the bundled 24-movie / 4-user sample (production K is 8-12 with MovieLens-1M-KG).
- Renamed path_count_via_similar -> path_count_via_kg_walk to reflect that this count is now the heterogeneous KG-walk count, not a similar_to-only count.
- README pipeline section updated to document the Item supertype + Item.connected_to layer and call out the v1.1.0 gap workaround.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the bundled dataset and template domain from synthetic movies to a deterministic slice of Open Library (~60 books, ~58 authors, 12 subjects). Open Library publishes its bibliographic catalogue under CC0, so the template ships in full without licensing exposure (MovieLens / Goodreads / Amazon-Book all carry non-commercial clauses incompatible with shippable customer templates).
- Add data/fetch_open_library_slice.py: a deterministic, cached fetcher with sm/md/lg size profiles. Pulls works + authors + subjects from the public Open Library API and emits the 10-CSV bundle. Synthetic users / read events / similar_to edges are generated on top.
- Rename Movie -> Book, Director -> Author, Genre -> Subject. Drop the Actor concept; the four-type heterogeneous KG (User, Book, Author, Subject) is plenty for the KG-walks story.
- Apply the documented `| 0` default-value pattern to every count(...).per(c) expression so path_count_via_author / _via_subject / _via_kg_walk are defined for *every* Candidate, not just those with at least one match. Without this, the composite path_count_total = via_a + via_s + via_kg drops any Candidate missing one operand, which collapses the explanation-floor MIP constraint to zero and renders feasible problems infeasible. Confirmed via problem.display(): every user now has a richly-grounded explanation IC (~30 terms / user) vs the prior 0-1 terms.
- Final E2E: status OPTIMAL, objective 9971.95, all 8 ICs verified (slate-size, exclude-read, subject-diversity, author-uniqueness, freshness, originals, cold-start cap, explanation floor).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
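The missing-operand collapse that the `| 0` pattern fixes can be sketched in plain Python. This is an illustrative analog, not the PyRel API: the dict names and data are hypothetical, and `.get(c, 0)` plays the role of the `| 0` default on `count(...).per(c)`.

```python
# Per-candidate typed-evidence counts. Candidate "c2" has no shared-subject
# rows at all, so it is simply absent from via_subject.
via_author  = {"c1": 2, "c2": 1}
via_subject = {"c1": 3}            # c2 missing: zero matching rows
via_kg_walk = {"c1": 1, "c2": 4}

# Join-style composition: a candidate missing ANY operand drops entirely --
# the analog of path_count_total losing rows without the | 0 default.
joined_total = {
    c: via_author[c] + via_subject[c] + via_kg_walk[c]
    for c in via_author
    if c in via_subject and c in via_kg_walk
}

# Densified composition: default each count to 0 (the `| 0` pattern),
# so every candidate keeps a defined total.
densified_total = {
    c: via_author.get(c, 0) + via_subject.get(c, 0) + via_kg_walk.get(c, 0)
    for c in set(via_author) | set(via_subject) | set(via_kg_walk)
}

print(joined_total)     # c2 vanished from the composite
print(densified_total)  # c2 kept with total 5
```

With the joined form, any IC summing `path_count_total` over picks silently sees fewer rows, which is how a feasible explanation-floor constraint turns infeasible.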
…n-dedup pitfall

Add experiments/count_variants.py + README probing six formulations of the per-(user, candidate) typed path-count features. Confirms (with problem.display) that the production form (variant A: three counts each | 0, arithmetic sum) is correct, and that sum(model.union(propA, propB, propC)) silently undercounts under value collisions because union deduplicates on projected values.

No prescriptive ≠ pyrel divergence found: the PR #1117 / #1118 / #1213 stack already pins all observed behaviors via the iff suite (u_same_prop pins the dedup spec; empty_body_semantics pins the cascade-drop; arithmetic_filtered pins scope isolation across sibling aggregates).

An inline comment in the production file points future readers to the experimental harness so the choice of arithmetic over sum-of-union is discoverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
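The undercount mechanism is easy to reproduce outside PyRel: a union over projected values deduplicates collisions before summation, so mass is lost whenever two operands contribute the same value. A minimal plain-Python analog (Python `set` standing in for union-on-projected-values):

```python
# Three typed counts for one (user, candidate) pair. Note the value
# collision: via_author and via_kg_walk both contribute 2.
via_author, via_subject, via_kg_walk = 2, 3, 2

# Variant A (the production form): arithmetic sum over the three operands.
arithmetic_total = via_author + via_subject + via_kg_walk

# Sum-of-union analog: the union deduplicates the colliding 2s before
# summing, silently undercounting whenever values collide.
union_total = sum({via_author, via_subject, via_kg_walk})

print(arithmetic_total, union_total)  # 7 vs 5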
Address findings from a multi-round review of the template:
- Utility now blends pagerank with path_count_total (was 2*via_author + via_subject, dropping via_kg_walk despite docs calling it the headline paths-pillar signal).
- Subject-distribution inspection rewritten to use aggs.sum().per(User, Subject) -- the prior form was Cartesian over Candidate × Subject (~13740 rows on sm; now 25 × 12 = 300).
- Honest description of MAX_HOPS=2 walker reach in the module docstring, README pillar 2, and an inline comment: User -> read_Book (length 1) + User -> read_Book -> similar_Book (length 2). Per-typed counts (via_author, via_subject) clarified as direct shared-entity joins, not path walks.
- README frontmatter brought in line with other v1 templates (quoted industry, reasoning_types block, Title-Case tags).
- Data-precondition comment block before slate_size_ic explaining the joint-feasibility requirements (cold users, over-read users, books missing author/subject, fresh/in-house floors).
- Fetcher emits WARNING summary lines when author-name resolve falls back to the OL key tail, and when first_publish_date is synthesised.
- current_year hardcode replaced with date.today().year.
- Stale "10-CSV bundle" -> "8-CSV", "~30-40 authors" -> "~58 authors", "two+ subjects" -> "at least one shared subject".
- README + fetcher now correctly describe similar_to as derived (not synthetic) -- the GDPR Art. 22 explainability framing requires the graph to be evidence, not fabrication.
- Removed RAI Jira IDs (RAI-44166), internal pytest node ids, and references to nonexistent sibling templates from customer-facing prose. Cleaned the same class of leak from experiments/README.md and experiments/count_variants.py.
- count_variants module docstring now lists six variants (it claimed five but defined six); added an entry for variant F.
- Fetcher error message preserves the original error class for diagnosis.
- FRESH_WINDOW_DAYS comment expanded to explain catalogue-vs-streaming tuning.
- EXPLANATION_FLOOR / WEAK_EXPLANATION_THRESHOLD scaling note added to the README customise section for md/lg slices.

E2E status: OPTIMAL on the bundled sm slice; objective ~15619 (was ~9779 before the utility wired path_count_total in -- expected to rise).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fixes

- Module docstring + L207 unified-edge comment now describe per-typed counts as the explanation surface (not "top-aggregate-relevance path"); explicitly state that typed-evidence joins are direct shared-entity joins, not per-hop edge introspection.
- Fetcher: REFERENCE_YEAR = 2026 frozen constant for deterministic age_days across calendar years (cached reruns now produce identical CSVs in any year).
- Fetcher: drop works with no resolvable authors (with a WARNING) so the runner's author-coverage precondition holds.
- README: regulatory section softened from "are required" to transparency-obligation framing with a compliance-team caveat.
- README intro + Pipeline summary: separate reach (the walker generates candidates) from evidence (typed joins score them); no more "two reach signals" ambiguity.
- README + runner: the LLM-explanation Customise bullet now references per-typed counts plus a documented small extension to materialise top paths; "eliminating hallucination" softened to "reducing hallucination risk".
- experiments/README.md reframed as engineering notes useful to advanced customers; runner comment removed the internal pick_5_12 debug coefficient name.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
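The frozen-reference idea can be sketched in a few lines. REFERENCE_YEAR = 2026 is from the commit above; the Jan-1 anchor and the helper name are assumptions for illustration, not the fetcher's actual code.

```python
from datetime import date

# Frozen reference (from the commit): cached reruns emit identical age_days
# in any calendar year, unlike date.today()-based ages.
REFERENCE_YEAR = 2026

def age_days(first_publish_year: int) -> int:
    """Illustrative: age in days against a fixed Jan-1 reference date,
    so the output is deterministic no matter when the fetcher runs."""
    reference = date(REFERENCE_YEAR, 1, 1)
    return (reference - date(first_publish_year, 1, 1)).days

print(age_days(2020))
```

A `date.today()`-based version would produce different CSVs on every rerun, breaking cache determinism.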
…dation
The previous name stacked two technical qualifiers ("kg_aware",
"slate"). The new name anchors to the lead instance (Open Library
books) while keeping the K-items "slate" shape that distinguishes
this template from a single-best recommendation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Rename module-level data path to uppercase DATA_DIR to match the recent CSP cart templates (synthetic_eligibility_records, product_configurator, synthetic_order_lifecycle).
- Rewrite Quickstart to the canonical 6-step shape (Download / venv / Install / Configure / Run / Expected output) and add a Solve result block keyed to the bundled --size sm slice.
- Document experiments/ in the template structure block so the ZIP layout matches the README.
- Replace remaining kg_aware_* docstring and User-Agent strings in data/ and experiments/ left over from the rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…add troubleshooting

- Section markers, Model() placement, docstring Run:/Output: blocks, per-CSV / per-concept comments, US spellings (matches recently merged v1 templates).
- Drop the misleading "PyRel doesn't support not in rules" rationale comment; cite the actual prescriptive-rewriter constraint and the Stock.is_non_representative cart precedent in portfolio_balancing.
- Remove the GNN/Predictive customise mention (replaced with a generic "custom scoring signal" bullet) since the predictive reasoner is not yet public.
- Add a Troubleshooting section to the README with INFEASIBLE-diagnostic, slow-solve, and fetcher-network details blocks. Expand the runner's data-preconditions block to enumerate the cold-start + explanation-floor joint-feasibility relation and the author-uniqueness x slate-size interaction.
- Reorder inspection blocks so the chosen slate is printed second (after Users); diagnostic candidate-set and PageRank dumps move to the end.
- Reference fixes: drop the GE Healthcare KARE bullet (wrong attribution and method); correct the KPRN attribution (NUS / eBay / USTC); disambiguate Alibaba iGraph from AliCoCo; drop a dead Alibaba blog URL; trim the Pinterest engagement claim to its publication-date framing; trim the LinkedIn precedent to the skills graph (no platform-total numbers).
- Drop the unused aggs alias; bare sum used uniformly for aggregates.
- Annotate Book.in_house as Integer 0/1 in the README schema block; add demo-grade tuning rationale to the utility weights.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oc fixes

- Add a pre-solve "Candidate count per user" inspect block so customers can spot users with zero candidates before infeasibility. Document the per-user IC anchoring caveat: each floor IC fires only for users with at least one matching Candidate row, so sparse customer data may pass vacuously rather than becoming infeasible.
- Open Library fetcher: pad the User-Agent with a contact-stub note (Open Library API guidance asks for contact info), bump the inter-request sleep from 0.2s to 1.0s to honour the documented unidentified rate limit, make cache writes atomic via temp-file rename, and treat a JSONDecodeError on cache read as a miss (interrupted writes or stale error bodies no longer poison the cache forever).
- Tighten the utility-weights comment to acknowledge that with the default scales, PageRank effectively serves as a tie-breaker rather than a co-equal blend, and point production deployments at min-max normalisation before applying these weights.
- Correct the K-RagRec / ItemRAG citation in the README and the matching mention in the runner Customise block.
- Sharpen the Troubleshooting INFEASIBLE recipe: lead with the new pre-solve diagnostic, and note that problem.verify already prints per-IC violations on a failing solve.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r fixups

- Add a Python-level pre-solve assertion that materialises the Candidate set, anti-joins against User.read, and refuses to solve if any user has fewer than SLATE_SIZE_K unread candidates. The per-user floor ICs anchor on Candidate rows, so without this guard customers with sparse data get a silent missing-row contract violation rather than an explicit infeasibility signal. The check also prints unread counts per user so reach can be inspected pre-solve.
- The fetcher cache temp file now embeds the process pid so concurrent fetcher runs against the same _cache directory don't race on a shared .tmp path.
- Replace the duplicated K-RagRec entry in the LLM+KG hybrid-pattern list with GraphRAG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
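The atomic, pid-tagged cache-write pattern and the JSONDecodeError-as-miss read can be sketched in stdlib Python. Function names are illustrative, not the fetcher's actual helpers; `tempfile.mkstemp` plus `os.replace` gives the same guarantees described above (no partial reads, no shared `.tmp` path between concurrent processes).

```python
import json
import os
import tempfile

def atomic_cache_write(cache_dir: str, key: str, payload: dict) -> str:
    """Write JSON via a temp file in the same directory, then os.replace.
    The pid in the temp-file prefix keeps concurrent writers from racing
    on a shared .tmp path; os.replace makes the publish step atomic."""
    os.makedirs(cache_dir, exist_ok=True)
    final_path = os.path.join(cache_dir, f"{key}.json")
    fd, tmp_path = tempfile.mkstemp(
        prefix=f"{key}.{os.getpid()}.", suffix=".tmp", dir=cache_dir
    )
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_path, final_path)  # atomic rename within one filesystem
    finally:
        if os.path.exists(tmp_path):      # only on a failure before replace
            os.unlink(tmp_path)
    return final_path

def cache_read(path: str):
    """Treat a JSONDecodeError as a cache miss instead of a permanent
    poison: interrupted writes or cached HTML error bodies just refetch."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

Readers either see the old complete file or the new complete file, never a half-written one.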
… in-house floors

The Candidate-anchored per-user floor ICs cover three thresholds: SLATE_SIZE_K (cardinality), FRESHNESS_FLOOR (fresh items), and ORIGINALS_FLOOR (in-house items). Without per-floor pre-solve checks, a user whose unread candidates contain zero fresh items would still pass the SLATE_SIZE_K guard but produce a slate that silently violates the freshness floor (the where(...) filter on freshness_ic removes the user entirely from the IC's row set).

Extend the assertion to also check per-user unread fresh count >= FRESHNESS_FLOOR and per-user unread in-house count >= ORIGINALS_FLOOR; surface affected users in the error message. Tighten the README troubleshooting block to describe the assertion accurately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
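The extended per-floor guard amounts to three necessary-condition checks per user. A minimal plain-Python sketch (data shapes and the function name are hypothetical; the real assertion works on the materialised Candidate set):

```python
def pre_solve_floor_check(unread, K, fresh_floor, originals_floor):
    """Per-user necessary conditions: enough unread candidates overall,
    enough fresh ones, enough in-house ones. `unread` maps user ->
    list of (book_id, is_fresh, is_in_house). Returns {user: [failures]}."""
    failures = {}
    for user, books in unread.items():
        shortfalls = []
        if len(books) < K:
            shortfalls.append("slate_size")
        if sum(1 for _, fresh, _ in books if fresh) < fresh_floor:
            shortfalls.append("freshness")
        if sum(1 for _, _, inh in books if inh) < originals_floor:
            shortfalls.append("originals")
        if shortfalls:
            failures[user] = shortfalls
    return failures

unread = {
    "u1": [("b1", True, False), ("b2", True, True), ("b3", False, True)],
    "u2": [("b4", False, False), ("b5", False, False)],  # no fresh, no in-house
}
print(pre_solve_floor_check(unread, K=3, fresh_floor=1, originals_floor=1))
```

u2 fails all three conditions and would be surfaced in the error message; u1 passes and reaches the solver.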
- Move CSV loads to the top of the file and add pre-solve invariants (unique-key, no-dangling-FK, non-negative age_days) using the _assert_* helpers + the declarative FK-edges table that patient_cohort_recruitment establishes for v1.
- Drop the "Background and precedent" deep-dive from the README; keep a short "Where this fits" framing.
- Trim the academic References block to just Open Library.
- Pin pandas>=2.0 to match other v1 templates.
- Drop reasoning_types: Paths in favor of the canonical [Graph, Prescriptive] vocabulary; description re-framed accordingly.
- Compress essay comments around the tuning constants, the path walker, the Item.connected_to relationship, and the Data-preconditions block (joint-feasibility detail already lives in README Troubleshooting).
- Drop the bottom "Customize-section variants" comment block (it duplicates the README "Customize this template" section).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-references were still spelled "Paths" / "three-pillar" after the front-matter switched to [Graph, Prescriptive]:
- Module docstring header and the "Three-pillar pipeline" line
- "Pillar 2: Paths" section header in the .py
- "What's included" line + "Pipeline" step 2 header in the README
- v1 index description

Re-framed as "Multi-reasoner" and "Graph (bounded KG walks)" so the labels match the canonical reasoning_types vocabulary throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The docs preview for this pull request has been deployed to Vercel!
…gle-count embeddedness floor

The old shape framed Graph + Paths + Prescriptive as three coequal pillars, but PageRank was just a per-Book Float that any retrieval-stage scalar could substitute. That made the Graph contribution swappable and effectively optional, and the "Paths" pillar read as a sidecar to PageRank rather than the centerpiece. Restructure so Paths visibly leads:
- Reorder the .py: Pillar 1 = Paths (Candidate concept + per-typed counts), Pillar 2 = Graph, Pillar 3 = Prescriptive. Reorder the README "Pipeline" section to match.
- Replace PageRank with `Graph.triangle_count()` per Book. Triangle count is a topological measure of where each Book sits in the similarity neighborhood; it cannot be supplied externally without reconstructing the graph, which is what makes it a Graph-pillar contribution rather than a data-layer input.
- Add `embeddedness_ic`: at least EMBEDDEDNESS_FLOOR picks per user must have triangle_count >= EMBEDDEDNESS_THRESHOLD. The Graph pillar now drives a structural-diversity *constraint*, not just an objective term.
- Drop the utility blend and the PAGERANK_WEIGHT / PATH_SIGNAL_WEIGHT constants. The objective collapses to `sum(path_count_total * pick)` -- pure path-driven, integer-only.
- Update the README "Why MIP, not CSP" rationale (no longer about float coefficients), the "Customize this template" "Custom scoring signal" bullet (now framed as adding an *additive* Float term, not swapping out PageRank), the troubleshooting section (added the embeddedness-floor infeasibility cause), the constants list, and the v1 index entry to match.

Three Slack signals motivated picking triangle_count over Louvain (which would have been the cleaner community-diversity story): recent louvain() test failures locally, an open question about re-implementing Louvain via loops in PyRel, and ambiguity about its deprecation timeline. WCC was the other option, but the bundled similarity graph is one connected component (60/60), so a "slate spans >= N components" IC is trivially satisfied or trivially infeasible. Triangle count has a real per-book distribution (0-107, two isolates, varied mid-tail) on the bundled slice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
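Per-node triangle count is small enough to sketch directly. This is a naive plain-Python version (not the `Graph.triangle_count()` implementation) that also collapses duplicate edge rows first, since duplicates in the similarity edge list would otherwise inflate the counts:

```python
from itertools import combinations

def triangle_counts(edges):
    """Per-node triangle count on an undirected simple graph.
    Duplicate or reversed edge rows are collapsed by the set-based
    adjacency before counting."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue                      # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    counts = {n: 0 for n in adj}
    for n, nbrs in adj.items():
        # a triangle at n is a pair of neighbors that are themselves adjacent
        for u, v in combinations(sorted(nbrs), 2):
            if v in adj.get(u, set()):
                counts[n] += 1
    return counts

# K4 minus one edge: nodes 1 and 2 sit in two triangles; 3 and 4 in one each.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 3)]  # note duplicate row
print(triangle_counts(edges))
```

The per-node distribution (here 2/2/1/1) is what the embeddedness floor thresholds against: a book's triangle count says how tightly knit its similarity neighborhood is.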
…/path story
- Switch solver from HiGHS MIP to MiniZinc CSP. Pure-integer model:
binary picks, integer coefficients, no float blend.
Problem(model, Integer) and solve("minizinc", ...).
- Update reasoning_types tag and template description to reflect CSP.
- Strengthen the graph/path narrative through three new ICs and
a derived property:
* subject_span_ic: each user's slate must touch
>= MIN_DISTINCT_SUBJECTS distinct subjects, expressed via
count(Subject, Candidate.pick == 1) -- distinct-value counting
that sum-of-indicators cannot express directly.
* Candidate.primary_evidence: derived integer property
(1=author, 2=subject, 3=walker) from argmax of the three typed
path counts. Three mutually exclusive define rules.
* path_evidence_diversity_ic: each slate must touch >= MIN_EVIDENCE_TYPES
distinct primary-evidence types, expressed via
count(Integer.ref(), Candidate.pick == 1) over distinct
primary_evidence values. CSP-native distinct-value counting.
* strong_walker_ic: at least MIN_STRONG_WALKERS picks must have
path_count_via_kg_walk >= STRONG_WALKER_THRESHOLD. Anchors at
least one pick to the headline Paths-pillar signal rather than
the cheaper shared-author / shared-subject joins.
- Drop subject_diversity_ic (subsumed by subject_span_ic + cardinality).
- Inspect output now also surfaces primary_evidence per picked item.
README, troubleshooting, and v1 index updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ent-decay objective
Switch the prescriptive layer from binary pick to multi-valued integer
slot in {1..K, K+1} (K+1 = unpicked sentinel). Slot order matches the
canonical recsys position-decay engagement model -- top-of-row picks
dominate impressions -- and lets the objective directly weight items
by position via sum((K+1-slot) * path_count_total).
Pillar-2 (Graph) contribution moves from a "somewhere in the slate"
embeddedness floor to a hero-slot pin: slot 1 must come from a Book
whose triangle count clears HERO_EMBEDDEDNESS_THRESHOLD, concentrating
the structural-quality signal at the highest-engagement position.
Drop the demo-shaped extras (primary_evidence Integer property and
its 3 mutually-exclusive define rules, path_evidence_diversity_ic,
strong_walker_ic). They were added to showcase count-distinct CSP
syntax, but PyRel's prescriptive rewriter does not currently support
distinct aggregates in IC compilation, and the underlying constraints
weren't real product rules. The CSP idioms that DO get exercised here:
multi-valued integer decisions, GCC-style per-pair count caps (slot,
author, subject), slot-equality reification (hero pin), and the
reified domain rule for the already-read exclusion.
Replace author_diversity_ic / subject_span_ic count-of-Concept forms
with per-pair count caps (count(Candidate, slot<=K).per(user, X) <=
N) -- the shape PyRel actually compiles. Add slot_uniqueness_ic
(count Candidates per (user, Slot.pos) <= 1) to enforce that each
slate position is filled exactly once; combined with slate_size_ic
this is a bijection between picks and positions 1..K.
Strengthen the pre-solve assertion to also flag users with no
hero-eligible candidate or fewer than K distinct unread authors.
E2E (sm slice, MiniZinc): OPTIMAL, objective 648, all ten ICs
verify clean, slate ordered 1..K per user.
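The slot encoding's arithmetic can be checked in a few lines of plain Python. This evaluates the objective for a fixed assignment (names are illustrative; the real decision variable lives in the CSP model): the K+1 sentinel contributes weight 0, so no separate picked-indicator is needed.

```python
K = 3
SENTINEL = K + 1  # slot K+1 == unpicked; position weight (K+1 - slot) is then 0

def slate_objective(slots, path_count_total):
    """Position-decay objective: sum((K+1 - slot) * score).
    Hero slot 1 gets weight K; unpicked (sentinel) gets weight 0."""
    return sum(
        (SENTINEL - slot) * path_count_total[cand]
        for cand, slot in slots.items()
    )

# One user's assignment: b1 is the hero, b4/b5 are unpicked.
slots  = {"b1": 1, "b2": 2, "b3": 3, "b4": SENTINEL, "b5": SENTINEL}
scores = {"b1": 5, "b2": 4, "b3": 7, "b4": 9, "b5": 2}
print(slate_objective(slots, scores))  # 3*5 + 2*4 + 1*7 + 0 + 0 = 30
```

Note that b4's high score contributes nothing while unpicked, and the solver can raise the objective only by pulling it into a slot <= K.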
…ighten docs; drop experiments dir
Pre-solve guard now catches the three remaining IC infeasibility modes
that previously surfaced as silent INFEASIBLE solves:
- cold_start: users with fewer than K - COLD_START_CAP strongly-explained candidates
- subject_span: users whose unread pool spans fewer than ceil(K / MAX_PER_SUBJECT) subjects
- explanation: users whose top-K position-weighted score upper bound is below EXPLANATION_FLOOR
ValueError repair hint is now keyed by which condition fired (densify
reach vs lower a per-IC floor vs lower SLATE_SIZE_K vs lower
EXPLANATION_FLOOR), replacing the generic "densify Book.similar_to"
hint that was misleading for short-author shortfalls.
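The keyed-hint guard reduces to per-user necessary conditions, one per infeasibility mode. A hedged sketch (function and parameter names are illustrative; the bounds mirror the three modes listed above):

```python
from math import ceil

def guard_user(n_strong, n_subjects, topk_score_bound, K,
               cold_start_cap, max_per_subject, explanation_floor):
    """Return the list of fired conditions for one user; each key maps to
    a distinct repair hint instead of a generic 'densify similar_to'."""
    fired = []
    if n_strong < K - cold_start_cap:
        fired.append("cold_start")    # hint: densify reach or raise COLD_START_CAP
    if n_subjects < ceil(K / max_per_subject):
        fired.append("subject_span")  # hint: lower MAX_PER_SUBJECT or SLATE_SIZE_K
    if topk_score_bound < explanation_floor:
        fired.append("explanation")   # hint: lower EXPLANATION_FLOOR
    return fired

# A user failing all three conditions vs. one passing cleanly.
print(guard_user(n_strong=1, n_subjects=1, topk_score_bound=40, K=4,
                 cold_start_cap=2, max_per_subject=2, explanation_floor=50))
print(guard_user(n_strong=3, n_subjects=3, topk_score_bound=100, K=4,
                 cold_start_cap=2, max_per_subject=2, explanation_floor=50))
```

Because each condition is only necessary, a clean guard does not prove feasibility; but any fired condition proves infeasibility, which is what makes the per-condition hint safe to emit.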
Data-domain validations:
- in_house must be in {0, 1}; values outside silently disqualify books from the originals pool
- book_similar (src_book_id, dst_book_id) must be unique; duplicates would inflate triangle counts
Doc tightening:
- New "How this template differs from other CSP templates" README section names
the three architectural choices that follow from encoding an ordered slate
(unified super-edge, K+1 sentinel, per-pair count caps + per-user existential hero pin)
- Sections reordered: differences before count-idioms note (architectural orientation
precedes the implementation caveat)
- Version-neutralized "paths-lib limitation" wording at 4 sites (was pinned to v1.1.0)
- Corrected via-author/via-subject path-count comments: counts are bag-style
(one per join row), not distinct
- Fixed objective wording so "lower slot indices = top of row = hero" lands
unambiguously
- Updated solve-time claim from ~1 minute to "a few seconds" for the bundled slice
- Quickstart now mentions re-running the solver after fetching a larger slice
Drop experiments/ directory (engineering scratch, was the only v1/*/experiments
across all templates; load-bearing insight retained in README).
Regenerated v1/README.md index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Singular form aligns with the other reasoning_types entries (Graph, Prescriptive, Predictive, Rules-based) and with the customer-facing class/method names (PathTraversal, model.path()). The plural-form KG-Paths tag stays, since it refers to graph paths as a data structure rather than to the reasoner.

Updates: front-matter reasoning_types, description string, README prose pillar headers, script docstring and pillar comment, v1 index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r-pillar framing

Per the customer-facing reasoner taxonomy, Path is taxonomically subordinate to Graph (treated as a subset, not a peer of Graph / Predictive / Prescriptive / Rules-based). Adjusted framing throughout so the paths library is positioned as a load-bearing technique rather than a separate reasoner pillar:
- Front-matter reasoning_types narrowed to [Graph, Prescriptive].
- Description reframed as "Graph + Prescriptive (CSP) recsys template ... bounded knowledge-graph walks via the paths library generate the candidate set".
- Pillar bullets in README + script docstring renamed: bounded KG walks (paths library, central) is the architectural centerpiece; the Graph reasoner and Prescriptive reasoner are the two reasoner pillars. Code section markers renumbered: Pillar 1 = Graph, Pillar 2 = Prescriptive.
- KG-Paths tag retained (it refers to graph paths as a data structure, not the reasoner taxonomy).
- v1 README index regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-internal 'first showcase' positioning from customer-facing prose

The "Removing X collapses Y" rhetorical construction in the README and docstring added no information beyond what the surrounding bullets already convey. Replaced it with direct descriptions of what the path walks and the Graph contribution each produce. Also dropped the "first showcase in v1" and "first ordered slate" phrasings from the README -- those are cart-positioning notes that belong in the PR description, not in customer-facing prose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace "pillar" with "reasoner" or "stage" to match the customer- facing reasoner taxonomy and the language other multi-reasoner templates (e.g. telco_network_recovery) use. Remove the "Sibling CSP templates" comparison and the PyRel-roadmap reference -- both are internal positioning that doesn't help the reader. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mponent)

Same gating pattern used for the predictive (GNN) templates: the paths-library template stays in the private docs site while the underlying capability stabilises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cafzal (Collaborator) approved these changes on May 8, 2026 and left a comment:
Ship with nits. The pre-solve assertion (lines 694-846) is the best in v1 -- per-IC necessary conditions with actionable per-user error lists. K+1 sentinel verified by construction (every count IC gates slot <= K; the objective uses (K+1-slot) so K+1 contributes 0 naturally). Aggregate densification (| 0) consistent. The fetch script is robust: atomic writes, JSON validation with cache invalidation, polite UA + retry, frozen REFERENCE_YEAR for determinism. The Customization section names real off-domain retargets (e-commerce, courses, news).

Issues
- IMPORTANT -- data/authors.csv carries Open Library noise that surfaces in user-visible output (lines 5 "TC", 12 "Aurora Irvine", 42 "Bible", 51 "Alex Goody", 59 "Booking", 4 "Les éditions du Rey" -- edition/publisher records mis-classified as authors). Author names appear in per-pick explanations. Suggest filtering in data/fetch_open_library_slice.py:301-323: drop authors with names < 3 chars, all-caps without punctuation, or matching a publisher denylist; fall back to dropping the work if its sole author is filtered.
- NIT -- data/subjects.csv has near-duplicates that dilute the diversity dial (rows 4/5/6/8 are flavors of "adventure"; row 11 is the literal Dewey "823/.8"). With MAX_PER_SUBJECT=2 these are seen as distinct subjects. Normalize in fetch_open_library_slice.py:404-411 (strip Dewey codes, collapse "adventure*" variants).
- NIT -- README.md:3 description is 447 chars / one stream-of-consciousness sentence with five em-dash clauses. Trim to ~150 chars, business framing first, then a colon to the technique tagline.
- NIT -- README.md:151-194 ("Expected output") mixes "what you'll see" with "how to scale up". Move the scaling guidance into the existing "Scaling the bundled data" section at line 422.
- NIT -- book_slate_recommendation.py:127-132 mixed CSV-var naming (users_csv, books_csv, then read_csv_data, ba_csv, bs_csv, bsim_csv). Make consistent.

py_compile and ruff check clean.
- Add publisher/imprint and Dewey-code filters in fetch_open_library_slice.py
so authors.csv excludes corporate / single-token noise (TC, Bible, Booking,
"Les éditions du Rey", ...) and subjects.csv collapses "adventure" variants
and drops Dewey codes ("823/.8"). Regenerate the bundled sm slice.
- Trim the front-matter description to a business framing + technique tagline
and move "scaling the bundled data" guidance out of the Quickstart's
expected-output step into the dedicated section.
- Rename CSV-load locals (read_csv_data/ba_csv/bs_csv/bsim_csv) to a
consistent <name>_csv pattern.
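The suggested author-noise cascade can be sketched in a few lines. Thresholds follow the review (names under 3 chars, bare all-caps tokens, publisher denylist); the denylist entries here are illustrative examples, not the shipped list.

```python
# Illustrative publisher/imprint denylist (assumed entries, lowercased).
PUBLISHER_DENYLIST = {"booking", "bible", "les éditions du rey"}

def is_noise_author(name: str) -> bool:
    """Cascade: length check, then bare all-caps token, then denylist."""
    stripped = name.strip()
    if len(stripped) < 3:                          # e.g. "TC"
        return True
    if stripped.isalpha() and stripped.isupper():  # bare all-caps token
        return True
    return stripped.lower() in PUBLISHER_DENYLIST  # corporate/imprint records

authors = ["TC", "Bible", "Booking", "Les éditions du Rey", "Ursula K. Le Guin"]
kept = [a for a in authors if not is_noise_author(a)]
print(kept)  # only the real author survives
```

A heuristic cascade like this will still miss person-like noise (the review's "Alex Goody" case), which is why the fallback of dropping a work whose sole author is filtered matters.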
- fetch_open_library_slice.py: replace the regex-chain normalization in _normalize_subject with an explicit canonical-tag map (drop / strip / merge phases). Variants like "action & adventure" now actually fold to "adventure" (the prior chain claimed to but did not). Collapse internal whitespace before lookup so multi-space variants normalize identically. Catch json.JSONDecodeError in _http_get_json so HTML rate-limit pages returned with HTTP 200 trigger the retry loop instead of escaping it.
- README.md: rewrite the front-matter description in plain language; correct the bundled-data counts to 59 books / 52 authors.
- Regenerate the bundled sm slice with the new normalization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
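The canonical-tag-map shape can be sketched as follows. The map entries are illustrative assumptions (the real `_SUBJECT_CANONICAL_MAP` lives in the fetcher); the whitespace collapse before lookup is the fix described above.

```python
import re

# Illustrative phases: DROP kills non-genre tags (Dewey codes),
# MERGE folds variants onto one canonical tag. Entries are examples.
DROP = {"823/.8"}
MERGE = {
    "action & adventure": "adventure",
    "adventure stories":  "adventure",
    "adventure fiction":  "adventure",
}

def normalize_subject(raw: str):
    """Strip, lowercase, collapse internal whitespace, then apply
    drop/merge. Returns None for tags that should be discarded."""
    tag = re.sub(r"\s+", " ", raw.strip().lower())
    if tag in DROP:
        return None
    return MERGE.get(tag, tag)

print(normalize_subject("Action &  Adventure"))  # multi-space variant folds too
print(normalize_subject("823/.8"))               # Dewey code dropped
```

An explicit map is easy to audit against the emitted subjects.csv, unlike a regex chain whose effective behavior has to be traced pattern by pattern.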
- Module docstrings (book_slate_recommendation.py, fetch_open_library_slice.py): correct the bundled-data count to 59 books / 52 authors.
- _is_publisher_or_noise_author docstring: list the cascade in the same order the implementation runs (length -> denylist -> token-set).
- Runner docstring: clarify that the per-(user, candidate) explanation evidence feeds the Prescriptive reasoner; the Graph reasoner runs separately on the similarity graph for triangle_count.
- _SUBJECT_CANONICAL_MAP comment: record why the merge map is kept narrow -- aggressive genre-merging makes the shared-subject similarity graph dense enough that the MiniZinc CSP can't reach OPTIMAL within the time budget on the bundled slice.
- problem.solve: bump time_limit_sec from 60 to 180. The bundled instance solves to OPTIMAL well within 180s; the wider budget gives margin so the runner doesn't error out on slightly slower cloud queries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t numbers

- README: update the two stale `time_limit_sec=60` references in the scaling and troubleshooting sections to match the runner's current `time_limit_sec=180`.
- fetch_open_library_slice.py: update the Usage block to say `~59 books` (the actual count after the publisher-noise filter).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p comment

Trim the _SUBJECT_CANONICAL_MAP block comment to a neutral one-liner that describes what the map does. The previous version named specific genre families and described iteration history that doesn't belong in a public-facing template.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What this template adds
A Graph + Prescriptive (CSP) recsys template that picks K books per reader from a heterogeneous knowledge graph and orders them by slate position. Slot 1 is the hero (top of row, highest engagement); position-decay is the canonical recsys engagement model, so the order matters as much as the selection. Constraints: cardinality, slot uniqueness, already-read exclusion, author uniqueness, subject concentration cap, freshness floor, in-house exposure floor, cold-start cap, hero pin, and explanation-path floor (10 ICs). Objective maximizes `sum((K + 1 - slot) * path_count_total)`.

Three things make this template distinctive:
- `Item.connected_to.repeat(1, MAX_HOPS).all_paths()` (relationalai.semantics.std.paths) walks generate the Candidate concept and its per-(user, candidate) typed-evidence counts.
- `Graph.triangle_count()` over the book-similarity graph drives the slot-1 hero pin to a structurally-central pick.
- `Candidate.slot ∈ {1, ..., K, K+1}`, where K+1 is the unpicked sentinel, so the position weight `(K+1 - slot)` is 0 at unpicked and no auxiliary picked-indicator is needed. The same encoding handles cardinality, position decay, and the per-pick explanation weighting in one decision variable. (`all_different` would conflict with the shared K+1 sentinel.)

Modeling patterns this surfaces
- `Item` super-concept with typed sub-concepts (User/Book/Author/Subject), plus a single 2-arity `Item.connected_to` super-edge populated as the symmetric union of typed edges. The unified-edge layer is needed because a `path()` call walks one 2-arity relationship at a time.
- Direct shared-entity join counts (`path_count_via_author`, `path_count_via_subject`) sit alongside a true bounded-walk count (`path_count_via_kg_walk`). Three integer features, blended via `path_count_total` into both IC clauses and the objective.
- `count(...).per(c).where(...) | 0` densification so every Candidate has every typed-evidence property defined (otherwise sum-over-pick aggregates silently undercount). Arithmetic sum (a + s + w), not `sum(model.union(a, s, w))` -- union inside an aggregate body deduplicates on projected values.
- `(K+1-slot)` evaluates to 0 at unpicked, K at the hero slot, and decays monotonically.
- `count(distinct ...)` is rejected by the prescriptive rewriter today; the per-pair cap form compiles to MiniZinc GCC propagation.
- A Python-level pre-solve guard materialises the Candidate set, anti-joins against `User.read`, and refuses to solve if any user fails any of the per-IC feasibility necessary conditions. The error message lists affected users per shortfall plus a strategy block keyed by which condition fired -- sparse customer data hits a clear `ValueError` rather than a quiet INFEASIBLE solve.
- A `--size sm|md|lg` fetch script that caches under `data/_cache/`, atomic on write, JSON-validated on read, and process-pid-tagged so concurrent runs don't race.

Privacy
Marked `private: true` so it ships only on the private docs site for now -- same gating pattern used for the predictive (GNN) templates while the paths library matures.

Verification
- `relationalai==1.1.0`, MiniZinc backend: status OPTIMAL, objective 648, num_points 1; `problem.verify()` re-evaluates all 10 ICs in the returned solution clean.
- Data-domain validations (`in_house ∈ {0, 1}` domain check, non-negative `age_days`).

References
Eksombatchai et al., Pixie (WWW 2018); Wang et al., KGAT (KDD 2019); Wang et al., KPRN (AAAI 2019); Xian et al., PGPR (SIGIR 2019); Ying et al., PinSage (KDD 2018); Wang et al., K-RagRec (ACL 2025).