Commit 3808fda

Tidy up docstrings, esp for union (#3316)
1 parent dacbf38 commit 3808fda

1 file changed: +33 -19 lines changed

python/tskit/trees.py

Lines changed: 33 additions & 19 deletions
@@ -4771,7 +4771,8 @@ def nodes(self, *, order=None):
 Returns an iterable sequence of all the :ref:`nodes <sec_node_table_definition>`
 in this tree sequence.
 
-.. note:: Although node ids are commonly ordered by node time, this is not a
+.. note::
+Although node ids are commonly ordered by node time, this is not a
 formal tree sequence requirement. If you wish to iterate over nodes in
 time order, you should therefore use ``order="timeasc"`` (and wrap the
 resulting sequence in the standard Python :func:`python:reversed` function
@@ -5620,7 +5621,8 @@ def alignments(
 By default ``L`` is therefore equal to the :attr:`.sequence_length`,
 and ``a[j]`` is the nucleotide value at genomic position ``j``.
 
-.. note:: This is inherently a **zero-based** representation of the sequence
+.. note::
+This is inherently a **zero-based** representation of the sequence
 coordinate space. Care will be needed when interacting with other
 libraries and upstream coordinate spaces.
 
@@ -6585,13 +6587,13 @@ def write_vcf(
 to the sites in the tree sequence object.
 
 .. note::
-Older code often uses the ``ploidy=2`` argument, because old
-versions of msprime did not output individual data. Specifying
-individuals in the tree sequence is more robust, and since tree
-sequences now typically contain individuals (e.g., as produced by
-``msprime.sim_ancestry( )``), this is not necessary, and the
-``ploidy`` argument can safely be removed as part of the process
-of updating from the msprime 0.x legacy API.
+Older code often uses the ``ploidy=2`` argument, because old
+versions of msprime did not output individual data. Specifying
+individuals in the tree sequence is more robust, and since tree
+sequences now typically contain individuals (e.g., as produced by
+``msprime.sim_ancestry( )``), this is not necessary, and the
+``ploidy`` argument can safely be removed as part of the process
+of updating from the msprime 0.x legacy API.
 
 :param io.IOBase output: The file-like object to write the VCF output.
 :param int ploidy: The ploidy of the individuals to be written to
@@ -7359,7 +7361,7 @@ def decapitate(self, time, *, flags=None, population=None, metadata=None):
 is its associated ``time`` value, or the time of its node if the
 mutation's time was marked as unknown (:data:`UNKNOWN_TIME`).
 
-Migrations are not supported, and a LibraryError will be raise if
+Migrations are not supported, and a LibraryError will be raised if
 called on a tree sequence containing migration information.
 
 .. seealso:: This method is implemented using the :meth:`.split_edges`
@@ -7517,8 +7519,8 @@ def union(
 1. Individuals whose nodes are new to ``self``.
 2. Edges whose parent or child are new to ``self``.
 3. Mutations whose nodes are new to ``self``.
-4. Sites which were not present in ``self``, if the site contains a newly
-added mutation.
+4. Sites whose positions are not present in the site positions in
+``self``, if the site contains a newly added mutation.
 
 This can be thought of as a "node-wise" union: for instance, it can not
 be used to add new edges between two nodes already in ``self`` or new
@@ -7555,13 +7557,19 @@ def union(
 ``all_mutations=True, check_shared_equality=False`` can be used
 to add mutations to ``self``.
 
-If the resulting tree sequence is invalid (for instance, a node is
-specified to have two distinct parents on the same interval),
-an error will be raised.
+.. warning::
+If an equivalent node is specified in ``other``, the
+version in ``self`` is used without checking the node
+properties are the same. Similarly, if the same site position
+is present in both ``self`` and ``other``, the version in
+``self`` is used without checking that site properties are
+the same. In these cases metadata and e.g. node times or ancestral
+states in ``other`` are simply ignored.
 
-Note that this operation also sorts the resulting tables, so the
-resulting tree sequence may not be equal to ``self`` even if nothing
-new was added (although it would differ only in ordering of the tables).
+.. note::
+This operation also sorts the resulting tables, so the resulting
+tree sequence may not be equal to ``self`` even if nothing new
+was added (although it would differ only in ordering of the tables).
 
 :param TreeSequence other: Another tree sequence.
 :param list node_mapping: An array of node IDs that relate nodes in
@@ -7579,6 +7587,11 @@ def union(
 assigned new population IDs.
 :param bool record_provenance: Whether to record a provenance entry
 in the provenance table for this operation.
+:return: The union of the two tree sequences.
+:rtype: tskit.TreeSequence
+:raises: **tskit.LibraryError** -- If the resulting tree sequence is invalid
+(for instance, a node is specified to have two distinct
+parents on the same interval)
 """
 tables = self.dump_tables()
 other_tables = other.dump_tables()
@@ -10374,7 +10387,8 @@ def genealogical_nearest_neighbours(self, focal, sample_sets, num_threads=0):
 
 For an precise mathematical definition of GNN, see https://doi.org/10.1101/458067
 
-.. note:: The reference sets need not include all the samples, hence the most
+.. note::
+The reference sets need not include all the samples, hence the most
 recent common ancestral node of the reference sets, :math:`a`, need not be
 the immediate ancestor of the focal node. If the reference sets only comprise
 sequences from relatively distant individuals, the GNN statistic may end up
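
As a rough illustration of the warning added to the ``union`` docstring in this commit, the sketch below models the "version in ``self`` wins" rule for shared nodes. It is a toy model using plain Python dicts, not tskit's implementation: ``toy_union_nodes`` and the row layout are hypothetical, and using -1 to mark a node with no equivalent in ``self`` is an assumption that mirrors ``tskit.NULL``.

```python
# Toy illustration only: plain dicts stand in for node-table rows, and
# toy_union_nodes is a hypothetical helper, not part of tskit.
NULL = -1  # assumed marker for "no equivalent node in self" (mirrors tskit.NULL)


def toy_union_nodes(self_nodes, other_nodes, node_mapping):
    """Merge node rows so that the copy in ``self_nodes`` always wins.

    node_mapping[j] is the index in ``self_nodes`` of the node equivalent
    to other_nodes[j], or NULL if that node is new.
    """
    if len(node_mapping) != len(other_nodes):
        raise ValueError("node_mapping needs one entry per node in other")
    merged = [dict(row) for row in self_nodes]
    for row, mapped in zip(other_nodes, node_mapping):
        if mapped == NULL:
            merged.append(dict(row))  # new node: added after self's nodes
        # Shared node: self's row is kept unchanged. The time and metadata
        # in other's row are never compared and are silently dropped.
    return merged


self_nodes = [
    {"time": 0.0, "metadata": "a"},
    {"time": 1.0, "metadata": "b"},
]
other_nodes = [
    {"time": 0.0, "metadata": "a"},
    {"time": 5.0, "metadata": "changed"},  # conflicts with self's node 1
    {"time": 2.0, "metadata": "new"},      # not present in self
]
merged = toy_union_nodes(self_nodes, other_nodes, [0, 1, NULL])
for node_id, row in enumerate(merged):
    print(node_id, row)
```

In this toy run the conflicting row from ``other`` leaves no trace in the result, which is the behaviour the warning describes: any consistency check on node times or metadata has to happen before the two inputs are combined.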
