Summary
The population sampler assigns partners via households.py but does not reconcile marital_status or household_size afterward. These attributes are sampled independently from the spec's distributions before household pairing occurs, leading to contradictions.
Evidence (from ASI Announcement study, 5,000 agents)
- 896 agents have marital_status = 'Single' but a non-null partner_id
- 1,114 agents have household_size = 1 but a non-null partner_id
- 78 married/divorced teenagers (age 18-19)
- 25 agents with Graduate Degree under age 22
- 86 retired agents under age 50
Root Cause
In extropy/population/sampler/core.py, the sampling loop processes attributes in topological order based on depends_on. marital_status depends on [age] and household_size depends on [marital_status, age]. Both are sampled from their distributions with age-based modifiers.
Then households.py pairs agents into partner relationships and assigns dependents — but it never updates the already-sampled marital_status or household_size to reflect the actual household structure.
Expected Behavior
After household assignment:
- Agents with a partner_id should have marital_status in {'Married', 'Domestic Partner'} (not 'Single')
- household_size should reflect the actual household membership count (at minimum >= 2 if partnered)
- Age-implausible combinations (married teen, retired 30-year-old) should be filtered or reconciled
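The first two invariants could be checked with something like the sketch below. It assumes agents are plain dicts with id, partner_id, marital_status, and household_size keys. That record shape and the function name find_household_violations are illustrative assumptions, not the sampler's actual data model. Age-plausibility checks are left out because their thresholds are not settled here.

```python
from typing import Any

# Statuses treated as consistent with having a partner (from the list above).
PARTNERED_STATUSES = {"Married", "Domestic Partner"}


def find_household_violations(agents: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Group agent ids by the post-assignment invariant they violate.

    Hypothetical helper: the dict-based agent layout is an assumption.
    """
    violations: dict[str, list[Any]] = {"status": [], "household_size": []}
    for agent in agents:
        if agent.get("partner_id") is None:
            continue  # these invariants only constrain partnered agents
        if agent.get("marital_status") not in PARTNERED_STATUSES:
            violations["status"].append(agent["id"])
        # A missing or None household_size is treated as 0, i.e. a violation.
        if (agent.get("household_size") or 0) < 2:
            violations["household_size"].append(agent["id"])
    return violations
```

Running a check like this on the sampler output, before and after any fix, would give counts that are directly comparable to the Evidence figures.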
Suggested Fix
Add a post-household reconciliation step in the sampler that:
- Sets marital_status = 'Married' for all agents with a partner_id and marital_status = 'Single'
- Recomputes household_size based on actual household membership
- Optionally: validates age-plausibility constraints (no married under-18, no grad degrees under 22, etc.)
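A minimal sketch of the first two steps follows, under stated assumptions: agents are dicts, and every household member is an agent record carrying a household_id. Both the household_id key and the function name reconcile_households are hypothetical, and the real structures in households.py may differ. If dependents are not stored as agent records, the membership count would need to include them separately.

```python
from collections import Counter
from typing import Any


def reconcile_households(agents: list[dict[str, Any]]) -> int:
    """Align marital_status and household_size with the assigned households.

    Mutates the agent dicts in place and returns how many agents changed.
    Assumes (hypothetically) that each agent has a 'household_id' key.
    """
    # Count members per household from the actual assignment.
    members = Counter(
        a["household_id"] for a in agents if a.get("household_id") is not None
    )
    changed = 0
    for agent in agents:
        before = (agent.get("marital_status"), agent.get("household_size"))
        # Step 1: a partnered agent sampled as 'Single' becomes 'Married'.
        if agent.get("partner_id") is not None and agent.get("marital_status") == "Single":
            agent["marital_status"] = "Married"
        # Step 2: household_size is recomputed from membership, not sampled.
        household_id = agent.get("household_id")
        if household_id is not None:
            agent["household_size"] = members[household_id]
        if (agent.get("marital_status"), agent.get("household_size")) != before:
            changed += 1
    return changed
```

Only 'Single' is overwritten, as the fix proposes. How to treat partnered agents sampled with other statuses (for example 'Divorced') is left open here.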