Remove StructuralAnnotationSet model and simplify architecture #801
The StructuralAnnotationSet model was designed to share structural annotations across corpuses, but analysis showed this benefit was never realized since each corpus duplicates structural annotations for corpus-specific embeddings.

Changes:
- Remove StructuralAnnotationSet model and all FK references
- Add 5 reversible migrations for safe data migration
- Simplify query_optimizer.py (remove union queries)
- Update parser.py (structural annotations go to documents directly)
- Update add_document() to copy structural annotations per-corpus
- Fix export/import V2 document hash matching bug
- Delete 5 obsolete test files, update remaining tests
- Add backward compatibility for legacy V2 imports

Migration strategy ensures safe rollback if needed:
1. Remove XOR constraints
2. Migrate data from structural_set to document
3. Remove FK fields
4. Delete model

All tests pass (24 export/import, 32 permanent deletion).
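To illustrate why the ordering in the migration strategy allows a rollback, here is a minimal sketch of step 2 (moving annotations from a structural set onto a document) and its reverse. It is not the PR's actual migration code: the table and column names (`structural_set`, `document`, `annotation`, `document_id`, `structural_set_id`) are hypothetical, it uses plain `sqlite3` rather than the project's migration framework, and it assumes each structural set is referenced by exactly one document. The idea is that the old `structural_set_id` column is kept until step 3, so the data move can still be undone.

```python
import sqlite3

# Hypothetical, simplified schema. Names are illustrative only and are not
# taken from the project's real models.
SCHEMA = """
CREATE TABLE structural_set (id INTEGER PRIMARY KEY);
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    structural_set_id INTEGER REFERENCES structural_set(id)
);
CREATE TABLE annotation (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    document_id INTEGER REFERENCES document(id),
    structural_set_id INTEGER REFERENCES structural_set(id)
);
"""


def migrate_forward(conn):
    """Point every set-owned annotation at the document that uses that set.

    structural_set_id is left in place (the XOR constraint was dropped in
    step 1), which is what keeps this step reversible.
    """
    conn.execute(
        """
        UPDATE annotation
        SET document_id = (
            SELECT d.id FROM document d
            WHERE d.structural_set_id = annotation.structural_set_id
        )
        WHERE structural_set_id IS NOT NULL
        """
    )


def migrate_backward(conn):
    """Undo the forward step: set-owned annotations lose their document link."""
    conn.execute(
        "UPDATE annotation SET document_id = NULL "
        "WHERE structural_set_id IS NOT NULL"
    )


def snapshot(conn):
    """Return (id, document_id, structural_set_id) for all annotations."""
    return conn.execute(
        "SELECT id, document_id, structural_set_id FROM annotation ORDER BY id"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO structural_set (id) VALUES (1)")
    conn.execute("INSERT INTO document (id, structural_set_id) VALUES (10, 1)")
    # One structural annotation owned by the set, one ordinary annotation
    # that already belongs to the document.
    conn.execute(
        "INSERT INTO annotation (id, label, structural_set_id) "
        "VALUES (100, 'header', 1)"
    )
    conn.execute(
        "INSERT INTO annotation (id, label, document_id) "
        "VALUES (101, 'user note', 10)"
    )

    before = snapshot(conn)
    migrate_forward(conn)
    after = snapshot(conn)
    migrate_backward(conn)
    restored = snapshot(conn)
    conn.close()
    return before, after, restored
```

Calling `demo()` returns the annotation rows before the forward step, after it, and after the backward step. In this sketch the restored rows match the original ones, and the annotation that already belonged to a document is left untouched in both directions.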
Pull Request Review - PR #801
Summary
This PR removes the
✅ Strengths
1. Excellent Migration Strategy
2. Architecture Simplification
3. Critical Bug Fix
4. Backward Compatibility
5. Testing & Documentation
Summary
StructuralAnnotationSet model, which was designed to share structural annotations across corpuses but never provided actual benefits (each corpus duplicates them for corpus-specific embeddings anyway)
Test plan