Clarify Migrations

I think the documentation of migrations could use a bit more detail here: [migration table section of the docs](https://tskit.readthedocs.io/en/latest/data-model.html#sec-migration-table-definition). Specifically, the docs read:

> Migrations are performed by individual ancestors, but most likely not by an individual whose genome is tracked as a node (as in a discrete-deme model they are unlikely to be both a migrant and a most recent common ancestor). So, tskit records when a segment of ancestry has moved between populations.

and

> The node column records the ID of the node that was associated with the ancestry segment in question at the time of the migration event.

I think a clearer way of explaining this is that **migrations chart how nodes are associated with a population**. Specifically, the migration node is the child node of the edge on which the migration occurred. A migration from source population _x_ to destination population _y_ at a given time, node, and left/right coordinates therefore means that, on the edge (or edges) identified by that node and those coordinates, ancestors younger than the migration time belong to population _x_ and older ancestors belong to population _y_ (at least until an intervening migration occurs). Furthermore, all older ancestors of the edge's parent node that exist between the left and right coordinates also belong to population _y_ (until they are affected by an older migration), and all descendants of this edge belong to population _x_ over the left/right coordinates (until they are affected by a more recent migration).
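
To make that reading concrete, here is a minimal sketch in plain Python (not the tskit API). The `Migration` tuple just mirrors the migration table's column names, and `population_at` is a hypothetical helper I made up for illustration. It assumes my interpretation above is right, and it treats the lineage as already being in the destination population at exactly the migration time, which is an arbitrary choice on my part.

```python
from typing import NamedTuple


class Migration(NamedTuple):
    # Field names mirror the migration table columns; the class itself is a toy.
    left: float
    right: float
    node: int
    source: int
    dest: int
    time: float


def population_at(node_population, migrations, node, position, time):
    """Population of the lineage above `node` at `position`, `time` units ago.

    Hypothetical helper: start from the node's own population and apply, oldest
    last, every migration on this node that covers `position` and is no older
    than `time`.
    """
    pop = node_population
    relevant = sorted(
        (
            m
            for m in migrations
            if m.node == node and m.left <= position < m.right and m.time <= time
        ),
        key=lambda m: m.time,
    )
    for m in relevant:
        pop = m.dest  # going back in time, the lineage moves source -> dest
    return pop


# Node 0 sits in population 0 (x); its lineage moves to population 1 (y)
# at time 4, but only over the interval [0, 50).
migrations = [Migration(left=0, right=50, node=0, source=0, dest=1, time=4)]

print(population_at(0, migrations, node=0, position=25, time=2))  # younger than the migration
print(population_at(0, migrations, node=0, position=25, time=6))  # older than the migration
print(population_at(0, migrations, node=0, position=75, time=6))  # outside [left, right)
```

The three calls cover the cases described above: younger than the migration (still _x_), older than the migration (_y_), and outside the left/right coordinates (unaffected).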

Here's an example of when this is important: if you want to know which tracts of ancestry in modern samples are the result of a historic migration (note that these do not necessarily correspond to haplotypes, since they don't depend on variant sites), you would look at the migration node and the marginal trees between the left and right coordinates, and then find the relevant leaf nodes. In the absence of intervening migrations, this gives the "ancestry segments" carried by samples as a result of the migration. Note that these left/right segments do _not_ always correspond to the breakpoints between edges, which I found surprising at first.
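
A rough sketch of that lookup, again in plain Python rather than tskit: `Edge` mirrors the edge table columns, and `sample_tracts` is a hypothetical helper that walks down from the migration node, clipping the interval at each edge. It deliberately ignores intervening (more recent) migrations, so it only covers the simple case described above.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    # Field names mirror the edge table columns; the class itself is a toy.
    left: float
    right: float
    parent: int
    child: int


def sample_tracts(edges, samples, node, left, right):
    """Return sorted (sample, left, right) tracts inherited through `node`.

    Hypothetical helper: descends from `node` over [left, right), intersecting
    the interval with each child edge. More recent migrations are NOT handled.
    """
    tracts = []
    stack = [(node, left, right)]
    while stack:
        u, lo, hi = stack.pop()
        if u in samples:
            tracts.append((u, lo, hi))
        for e in edges:
            if e.parent == u:
                new_lo, new_hi = max(lo, e.left), min(hi, e.right)
                if new_lo < new_hi:  # keep only overlapping pieces
                    stack.append((e.child, new_lo, new_hi))
    return sorted(tracts)


# Samples 0, 1, 2 on a sequence of length 100. Node 3 is the parent of
# sample 0 everywhere, and of sample 1 only on [0, 60).
edges = [
    Edge(0, 100, 3, 0),
    Edge(0, 60, 3, 1),
    Edge(60, 100, 4, 1),
    Edge(0, 100, 4, 3),
    Edge(0, 100, 4, 2),
]

# A migration recorded on node 3 over [40, 80): note that these coordinates
# do not line up with any edge breakpoint.
print(sample_tracts(edges, samples={0, 1, 2}, node=3, left=40, right=80))
```

With real data, I'd expect a comparable walk to use the marginal trees from `ts.trees()` and `tree.samples(node)`, restricted to the trees that overlap the migration's left/right coordinates.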

If I have all that right, then I think we should clarify a few things: (1) that migrations explain how and why ancestral nodes are associated with a population, (2) that the migration node is the child node of the edge where the migration occurred, and (3) that, barring multiple migrations on an edge, the child node of the edge belongs to the source population and the parent node belongs to the destination population.

If others agree, I'll make a PR to document this a bit more clearly.

Clarify Migrations #1157
