Summary
During our migration of OntoMathPRO v8 to a Neo4j graph database, we identified 32 data quality issues:
- 23 cyclic rdfs:subClassOf relationships (causing circular hierarchies)
- 9 orphan nodes (missing parent relationships)
Impact: These issues prevent direct DAG (Directed Acyclic Graph) implementation, which is essential for graph databases and ontology reasoning systems.
Issue #1: Cyclic rdfs:subClassOf Relationships (23 cycles)
Severity
HIGH - Prevents proper hierarchy traversal and reasoning
Description
Cycles occur when rdfs:subClassOf relationships form circular paths, violating the DAG property required for proper ontology hierarchies.
Example cycle:
- A rdfs:subClassOf B
- B rdfs:subClassOf C
- C rdfs:subClassOf A ← Cycle!
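A cycle like the one above can be found with a depth-first search over the subclass edges. The following is a minimal sketch, not the script we used for this report; the function name `find_cycle` and the `(subclass, superclass)` pair format are assumptions made for illustration.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes (start node repeated at the end), or None.

    `edges` is an iterable of (subclass, superclass) pairs (assumed input format).
    Recursive sketch; a very deep hierarchy may need an iterative version.
    """
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, []).append(parent)
        graph.setdefault(parent, [])

    state = {}  # node -> "active" (on the current DFS path) or "done"
    path = []   # nodes on the current DFS path, in visit order

    def visit(node):
        state[node] = "active"
        path.append(node)
        for parent in graph[node]:
            if state.get(parent) == "active":
                # Back edge: the path from `parent` to `node` closes a cycle.
                return path[path.index(parent):] + [parent]
            if parent not in state:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = "done"
        return None

    for node in graph:
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


example = [("A", "B"), ("B", "C"), ("C", "A")]
print(find_cycle(example))
```

The sketch stops at the first cycle it finds, so it would need to be re-run after each fix to surface the remaining ones.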
Detected Major Cycles
- E34 Cycle (Length: 5)
  - Path: E34 → E1660 → E4830 → E5122 → E6214 → E34
  - Concepts: Mathematical knowledge object chain
- E2844 Cycle (Length: 3)
  - Path: E2844 → E1660 → E34 → E2844
  - Concepts: Element of mathematical analysis chain
- Matrix Cycle (Length: 4)
  - Path: MatrixOperation → SquareMatrix → DiagonalMatrix → MatrixOperation
Plus 20 more cycles (mostly 2-node bidirectional relationships)
Reproduction Steps
Using Protégé:
- Open `ontomathpro_v8.owl` in Protégé
- Select "Tools" → "Reasoner" → "HermiT"
- Run "Start Reasoner"
- Navigate to E34 class
- Expand "SubClass Of" hierarchy
- Observe circular reference
Using Neo4j Cypher (after import):
```cypher
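// Match every path that starts and ends at the same node (a cycle).
// The pattern matches from each node in turn, so a cycle is returned once for each node on it.
MATCH path = (n:ObjectType)-[:GENERALIZES*]->(n)
RETURN [node IN nodes(path) | node.name] AS cycle_path,
       length(path) AS cycle_length
ORDER BY cycle_length DESC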
```
Recommended Fix
For the E34 cycle specifically, we recommend removing the E34 → E1660 relationship.
Rationale:
- "Mathematical knowledge object" (E34) should not be a subclass of "Value" (E1660)
- Counter-examples: Theorem, Operator, and Formula are mathematical knowledge objects but not values
- Keeping E1660 → E34 (Value is-a Mathematical knowledge object) is semantically correct
General approach:
- Analyze semantic correctness of each rdfs:subClassOf in the cycle
- Remove the weakest relationship (least semantically justified)
- Re-validate hierarchy
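The re-validation step can be sketched with Python's standard `graphlib` module (Python 3.9+), whose `TopologicalSorter.prepare()` raises `CycleError` when the graph is not acyclic. The helper name `is_dag` is hypothetical, and the edge list contains only the five edges of the E34 cycle as listed in this report, not the full ontology.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(edges):
    """Return True if the (subclass, superclass) edges contain no cycle."""
    sorter = TopologicalSorter()
    for child, parent in edges:
        sorter.add(child, parent)  # record `parent` as a predecessor of `child`
    try:
        sorter.prepare()  # raises CycleError if any cycle exists
    except CycleError:
        return False
    return True


# Edges of the E34 cycle exactly as listed in this report.
e34_cycle = [
    ("E34", "E1660"),
    ("E1660", "E4830"),
    ("E4830", "E5122"),
    ("E5122", "E6214"),
    ("E6214", "E34"),
]

# Apply the recommended fix: drop the E34 -> E1660 relationship.
fixed = [edge for edge in e34_cycle if edge != ("E34", "E1660")]

print(is_dag(e34_cycle), is_dag(fixed))
```

On the full ontology the same check would have to pass over all edges at once, since a single class can take part in more than one cycle.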
Issue #2: Orphan Nodes (9 nodes)
Severity
MEDIUM - Reduces hierarchy completeness
Description
Nine nodes have no parent relationship because of encoding or naming mismatches in the OWL file.
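One way such a mismatch can surface is as a parent reference that does not match any declared class name. The sketch below checks for that case; it assumes class names and `(subclass, superclass)` pairs have already been extracted from the OWL file, and both the function name `dangling_parents` and the sample data are hypothetical.

```python
def dangling_parents(declared, edges):
    """Return the (subclass, superclass) pairs whose superclass is not declared."""
    declared = set(declared)
    return [(child, parent) for child, parent in edges if parent not in declared]


# Hypothetical sample: the declared name uses Latin 'c', while the
# reference uses Cyrillic 'с' (U+0441), so the two strings differ.
declared = {"ElementMatricesTheory", "TraceMatrix"}
edges = [("TraceMatrix", "ElementMatri\u0441esTheory")]

print(dangling_parents(declared, edges))
```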
Detected Orphans
Emden-Fowler Family (5 nodes):
- `Emden–FowlerEquation` (expected parent: E1897)
- `Emden–FowlerTypeEquation`
- `EmdenEquation`
- `Thomas–FermiEquation`
- `Euler–Poisson–DarbouxEquation`
Root Cause: Encoding mismatch (`â` vs `-`)
ElementMatrices Family (4 nodes):
- `ElementMatriсesTheory` (Cyrillic 'с')
- `DeterminantMatrix`
- `MatrixOperation`
- `TraceMatrix`
Root Cause: Cyrillic character in parent name (`с` instead of `c`)
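Both root causes involve characters that look like ASCII but are not. A minimal sketch for flagging them with the standard `unicodedata` module follows; the function name `non_ascii_chars` is an assumption, and the sample identifiers are taken from the lists above.

```python
import unicodedata


def non_ascii_chars(identifier):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in identifier
        if ord(ch) > 127
    ]


samples = [
    "Emden\u2013FowlerEquation",    # contains an en dash
    "ElementMatri\u0441esTheory",   # contains Cyrillic 'с'
    "TraceMatrix",                  # plain ASCII
]
for name in samples:
    print(name, non_ascii_chars(name))
```

Running a check like this over all class names would list every identifier that needs a manual look, not only the nine reported here.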
Recommended Fix
- Normalize encoding: Convert all en dashes (`–`) in class names to regular hyphens (`-`)
- Fix Cyrillic characters: Replace Cyrillic 'с' with Latin 'c' in `ElementMatricesTheory`
- Add missing relationships:
```xml
<owl:Class rdf:about="EmdenEquation">
  <rdfs:subClassOf rdf:resource="Emden-FowlerEquation"/>
</owl:Class>
```
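The first two fixes can be applied mechanically. The sketch below uses `str.translate` with a replacement table that covers only the two characters named in this report; the table and the function name `normalize_identifier` are assumptions, and any other look-alike characters in the ontology would need their own entries.

```python
# Hypothetical replacement table: only the two characters named in this report.
REPLACEMENTS = str.maketrans({
    "\u2013": "-",  # EN DASH -> ASCII hyphen
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES -> Latin 'c'
})


def normalize_identifier(name):
    """Return `name` with the known look-alike characters replaced."""
    return name.translate(REPLACEMENTS)


print(normalize_identifier("Euler\u2013Poisson\u2013DarbouxEquation"))
print(normalize_identifier("ElementMatri\u0441esTheory"))
```

The same normalization has to be applied to both class declarations and the references to them, otherwise the names will still not match.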
Impact Analysis
Current State
- Total Classes: 4,052
- With cycles: 23 classes affected
- Orphaned: 9 classes
- Effective completeness: ~99.2% (32 of 4,052 classes affected)
Consequences
- ❌ Cannot be loaded into Neo4j as a DAG without manual fixes
- ❌ Reasoners may produce incorrect inferences
- ❌ Hierarchy visualization tools fail
- ❌ SPARQL queries return incomplete results
Full Details
For complete analysis including all 23 cycles, reproduction scripts, and detailed recommendations, see our full report:
Repository: [Our internal analysis repository]
Report File: `palantir/docs/ontomathpro_issues_report.md`
Environment
- OntoMathPRO Version: v8 (`ontomathpro_v8.owl`)
- Detection Method: Neo4j graph database migration + Python OWL parsing
- Analysis Date: 2025-11-08
- Reporter: Math Ontology Migration Team
We're happy to provide additional details or collaborate on fixes. Thank you for maintaining this valuable resource!