ToyTree objects¶

The main class users interact with in toytree is the ToyTree. This object provides a number of functions for interacting with the underlying TreeNode structure (e.g., rooting, dropping tips), for drawing trees, and for adding data from the tree (e.g., support values) to plots. Tree structure and the data used to build tree drawings are tightly linked in toytree, with the goal of making it very difficult for users to accidentally plot tip or node labels in an incorrect order. This section of the tutorial is primarily about how ToyTree objects store data, and how to access it easily using their functions.

[1]:

import toytree
import toyplot
import numpy as np

[2]:

# load a tree for this tutorial
tre = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")

Selecting parts of a tree¶

Toytree provides many functions for modifying the tree structure (e.g., rooting a tree, dropping tips), as well as methods for applying styles to specific parts of the tree (e.g., coloring edges differently). Both of these require an easy and reliable method for selecting specific parts of the tree while also minimizing the chance for user error.

Selecting subtrees with tip labels¶

In toytree we recommend using tip labels to select the location in the tree where it should be manipulated. Why use tip labels instead of node names or indices? Using node indices (e.g., idx labels) would be a reasonable alternative, and it is allowed as an option, but it is likely to be more error prone for users. This is because if the tree is modified (e.g., if tips are dropped or the tree is re-rooted) the node indices will change. In contrast, the relationships among tips (i.e., who shares a more recent common ancestor with whom) do not change with any of these tree modifications. Node names are another option, but in most trees internal nodes are not named.

The plot below shows how node idx labels change as the tree is modified. This is why using idx labels as selectors is more error prone.

[3]:

# store a rooted copy of tre (more on this later...)
rtre = tre.root(['33588_przewalskii', '32082_przewalskii'])

[4]:

rtre.draw();

[25]:

# a multitree storing the unrooted and rooted toytrees
mtre = toytree.mtree([tre, rtre])

# plot shows that idx labels change with rerooting
mtre.draw(
    node_labels='idx',
    node_sizes=15,
);

Fuzzy tip label matching¶

Many toytree functions accept several kinds of input for selecting the list of tip labels that represents a clade. To build the name list without typing each name out by hand, you can use fuzzy name matching. The three options are to pass a list of full names using the names argument; to select samples that share a unique substring in their names using wildcard; or to use a regex (regular expression) to match samples with more complex name patterns.
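
To make the three selectors concrete, here is a minimal sketch in plain Python that does not call toytree at all. The helper select_tips is hypothetical, and the matching rules are assumptions on my part (wildcard as a substring test, regex matched from the start of the label); toytree's own matching may differ in detail. The tip labels are the ones from the example tree.

```python
import re

# tip labels copied from the example tree used in this tutorial
tips = [
    "38362_rex", "39618_rex", "35236_rex", "35855_rex", "40578_rex",
    "30556_thamno", "33413_thamno",
    "33588_przewalskii", "32082_przewalskii",
    "30686_cyathophylla", "29154_superba",
    "41478_cyathophylloides", "41954_cyathophylloides",
]


def select_tips(labels, names=None, wildcard=None, regex=None):
    """Hypothetical helper: return the labels chosen by one selector."""
    if names is not None:
        # exact names: keep labels that appear in the list
        return [label for label in labels if label in names]
    if wildcard is not None:
        # assumed behavior: keep labels that contain the substring
        return [label for label in labels if wildcard in label]
    if regex is not None:
        # assumed behavior: keep labels matching the pattern from the start
        return [label for label in labels if re.match(regex, label)]
    return list(labels)


by_names = select_tips(tips, names=["33588_przewalskii", "32082_przewalskii"])
by_wildcard = select_tips(tips, wildcard="prz")
by_regex = select_tips(tips, regex="[0-9]*_przewalskii")

# all three selectors pick the same two przewalskii samples
print(by_names == by_wildcard == by_regex)
```

A short wildcard such as "prz" only works because no other tip label contains that substring; a string like "cyatho" would select three tips from two different species.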

Get node idx label from tip labels¶

In the example below I use the function .get_mrca_idx_from_tip_labels(), which returns the node index of the most recent common ancestor (mrca) of the tips entered as arguments. The names, wildcard, and regex arguments all return the same node idx for the clade that includes the two przewalskii samples (see the figure above), although that idx differs between the unrooted and rooted trees.

[26]:

# get an idx label of przewalskii clade using names, wildcard or regex
print('tre: ', tre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('tre: ', tre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('tre: ', tre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))

# the same three selectors on the rooted tree return a different idx label
print('rtre:', rtre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))

tre:  19
tre:  19
tre:  19
rtre: 23
rtre: 23
rtre: 23

Get TreeNode object from node idx label¶

[27]:

tre.idx_dict[19]

[27]:

<toytree.TreeNode.TreeNode at 0x7f2b105d0320>

Get tip labels from a node idx label¶

If you really want to select parts of the tree using nodes, perhaps because the tip names are very hard to match, you can use the get_tip_labels() function to build a list of tip names from a node idx label. If you pass an idx argument, the function returns a list of the tip names descended from that node. If no idx argument is entered, the root node idx is used, so all tip labels are returned.

[28]:

# get list of tips descended from a specific node in the tree
tre.get_tip_labels(idx=19)

[28]:

['33588_przewalskii', '32082_przewalskii']

[29]:

# get list of all tips in the tree
tre.get_tip_labels()

[29]:

['38362_rex',
 '39618_rex',
 '35236_rex',
 '35855_rex',
 '40578_rex',
 '30556_thamno',
 '33413_thamno',
 '33588_przewalskii',
 '32082_przewalskii',
 '30686_cyathophylla',
 '29154_superba',
 '41478_cyathophylloides',
 '41954_cyathophylloides']

The .get_tip_labels() function can be combined with the .get_mrca_idx_from_tip_labels() function to get a list of names that are all descended from a common ancestor. For example, to get a list of all tip labels in the ingroup clade of the rooted tree above, I can select just one sample from each of its two subclades with .get_mrca_idx_from_tip_labels() to get the node idx of their common ancestor, and then pass this idx to .get_tip_labels() to return the full list of descendants. This is an efficient way to build a list of tip labels for a large clade without having to write them all out by hand.

[30]:

# get node index (idx) of mrca
idx = rtre.get_mrca_idx_from_tip_labels(["29154_superba", "40578_rex"])

# get tip labels descended from node idx
rtre.get_tip_labels(idx=idx)

[30]:

['38362_rex',
 '39618_rex',
 '35236_rex',
 '35855_rex',
 '40578_rex',
 '30556_thamno',
 '33413_thamno',
 '41478_cyathophylloides',
 '41954_cyathophylloides',
 '30686_cyathophylla',
 '29154_superba']
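
The logic behind this two-step lookup can be sketched without toytree by writing the rooted topology as nested tuples. This is only an illustration under stated assumptions: the tuple encoding and the helpers tips_of and mrca are hypothetical names of mine, not part of toytree, and branch lengths are ignored. The topology is copied from the rooted newick string shown at the end of this page.

```python
# rooted topology of the example tree as nested tuples (str = tip, tuple = clade)
rooted = (
    ("32082_przewalskii", "33588_przewalskii"),
    (
        (
            ("29154_superba", "30686_cyathophylla"),
            ("41954_cyathophylloides", "41478_cyathophylloides"),
        ),
        (
            "33413_thamno",
            (
                "30556_thamno",
                (
                    ("40578_rex", "35855_rex"),
                    ("35236_rex", ("39618_rex", "38362_rex")),
                ),
            ),
        ),
    ),
)


def tips_of(node):
    """Return the tip labels below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tips_of(child)]


def mrca(node, query):
    """Return the smallest clade that contains every label in query."""
    children = node if isinstance(node, tuple) else ()
    for child in children:
        # descend while a single child still holds all queried tips
        if set(query) <= set(tips_of(child)):
            return mrca(child, query)
    return node


# one tip from each ingroup subclade is enough to find their common ancestor
clade = mrca(rooted, ["29154_superba", "40578_rex"])
ingroup = tips_of(clade)
print(len(ingroup))  # 11
```

The order of the labels returned by this sketch follows the tuple layout, so it will not necessarily match the plot order that toytree returns.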

Modifying ToyTrees¶

ToyTrees provide a number of functions for modifying the tree structure. All of these methods return a modified copy of the object; they do not change your original tree in place. This is useful because you can reliably chain together multiple tree modification functions (see Chaining functions and arguments). As discussed above, it is generally good practice to use tip name selectors to identify the clades to modify. In some cases, if you are both modifying a tree and applying plotting styles that rely on the tree structure, it may be easier and clearer to split the code into several separate function calls. Chaining makes for elegant code, but use whichever method is most comfortable for you. See the Cookbook gallery for more examples.

Rooting trees¶

You can root toytrees using the .root() function call. This takes as an argument either a single tip name, or a list of tip names. You can use the fuzzy name matching options to match multiple tip names, as shown below.

[31]:

# three ways to do the same re-rooting
rtre = tre.root(names=["32082_przewalskii", "33588_przewalskii"])
rtre = tre.root(wildcard="prz")
rtre = tre.root(regex="[0-9]*_przewalskii")

# draw the rooted tree
rtre.draw(node_labels='idx', node_sizes=15);

There is also a function .unroot() to remove the root node from trees. This creates a polytomy at the root. Technically there still exists a point on the treenode structure that we refer to as the root, but it does not appear in drawings.

[32]:

# an unrooted tree
rtre.unroot().draw();

Drop tips¶

Dropping tips from a tree retains the structure of the remaining nodes in the tree. Here again you can use fuzzy name matching to select the tips you wish to drop. The selected names do not have to form a monophyletic clade. However, selecting every tip in the tree for removal will raise an error.

[33]:

rtre.drop_tips(wildcard="cyatho").draw();
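
What "retains the structure of the remaining nodes" means can be shown with a small sketch on nested tuples. This is not toytree's implementation: the tuple encoding and the helpers tips_of, _prune, and drop_tips below are hypothetical, branch lengths are ignored, and I assume that a node left with a single child is simply collapsed into that child.

```python
# rooted topology of the example tree as nested tuples (str = tip, tuple = clade)
rooted = (
    ("32082_przewalskii", "33588_przewalskii"),
    (
        (
            ("29154_superba", "30686_cyathophylla"),
            ("41954_cyathophylloides", "41478_cyathophylloides"),
        ),
        (
            "33413_thamno",
            (
                "30556_thamno",
                (
                    ("40578_rex", "35855_rex"),
                    ("35236_rex", ("39618_rex", "38362_rex")),
                ),
            ),
        ),
    ),
)


def tips_of(node):
    """Return the tip labels below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tips_of(child)]


def _prune(node, drop):
    """Return a pruned copy of node, or None if no tips remain below it."""
    if isinstance(node, str):
        return None if node in drop else node
    kept = [sub for sub in (_prune(child, drop) for child in node) if sub is not None]
    if not kept:
        return None
    # assumption: a node left with one child collapses into that child
    return kept[0] if len(kept) == 1 else tuple(kept)


def drop_tips(tree, drop):
    """Return a copy of tree without the dropped tips; refuse to drop them all."""
    pruned = _prune(tree, set(drop))
    if pruned is None:
        raise ValueError("cannot drop every tip from the tree")
    return pruned


# the substring "cyatho" selects three tips that are not a monophyletic clade
to_drop = [tip for tip in tips_of(rooted) if "cyatho" in tip]
pruned = drop_tips(rooted, to_drop)
print(tips_of(pruned))
```

In the pruned result 29154_superba loses all of its former clade mates and becomes the direct sister of the thamno and rex clade, while every other relationship is unchanged.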

Ladderize¶

By default toytrees are ladderized unless you change the tip order in some way: by entering a fixed_order for tip labels, by dropping tips from the tree, or by rotating nodes. If you want to return a tree to being ladderized you can do so with the .ladderize() function.

[34]:

# dropping tips unladderizes the tree, so we re-ladderize it before plotting
rtre.drop_tips(wildcard="cyatho").ladderize().draw();

Rotate nodes¶

Rotating nodes of the tree does not affect the actual tree structure (e.g., the newick structure does not change); it only affects the order of tips when the tree is drawn. You can rotate nodes by entering tip names as in the previous examples using either names, wildcard, or regex. The names must form a monophyletic clade for one of the descendants of the node you wish to rotate. Rotating nodes for plotting is usually done for some aesthetic reason, such as aligning tips better with geography or trait values plotted on the tips of the tree.

[35]:

rtre.rotate_node(wildcard="prz").draw();
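
The claim that rotation changes only the drawing order can be checked with a small sketch. It uses a reduced five-tip version of the example tree written as nested tuples; the helpers tips_of, clades, and rotate are hypothetical and only mimic the behavior described above, in which the selected tips identify a child clade of the node that gets rotated.

```python
# a reduced five-tip version of the rooted example tree (str = tip, tuple = clade)
tree = (
    ("32082_przewalskii", "33588_przewalskii"),
    ("33413_thamno", ("40578_rex", "35855_rex")),
)


def tips_of(node):
    """Return the tip labels below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tips_of(child)]


def clades(node):
    """Return every clade in the tree as a frozenset of tip labels."""
    if isinstance(node, str):
        return {frozenset([node])}
    found = {frozenset(tips_of(node))}
    for child in node:
        found |= clades(child)
    return found


def rotate(node, query):
    """Reverse the children of the node that has the queried clade as a child."""
    if isinstance(node, str):
        return node
    if any(set(tips_of(child)) == set(query) for child in node):
        return tuple(reversed(node))
    return tuple(rotate(child, query) for child in node)


rotated = rotate(tree, ["32082_przewalskii", "33588_przewalskii"])

print(tips_of(tree))
print(tips_of(rotated))
# the tip order differs, but the set of clades (the topology) is identical
print(clades(tree) == clades(rotated))
```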

Resolve polytomy¶

Use this method sparingly. The problem is that you usually don’t know what branch length to assign to the new edge that is created when you split a polytomy. If the tree is simply unrooted, use .root() to root it instead. If the tree contains a hard polytomy that you need to resolve, note that this method resolves all polytomies in the tree. You can change what the default .dist and .support values will be on the new node.

[36]:

toytree.tree("((a,b,c),d);").resolve_polytomy(dist=1.).draw();
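
To see why the result of resolving a polytomy is arbitrary, here is a sketch on nested tuples. The helper resolve is hypothetical; it joins the first two children of any node with more than two children, which is just one possible choice and not necessarily the pairing that toytree makes. The new internal node it creates has no meaningful branch length, which is the problem described above.

```python
def resolve(node):
    """Return a strictly bifurcating copy of a nested-tuple tree."""
    if isinstance(node, str):
        return node
    children = [resolve(child) for child in node]
    while len(children) > 2:
        # arbitrary choice: group the first two children under a new node
        children = [(children[0], children[1])] + children[2:]
    return tuple(children)


# the same topology as the "((a,b,c),d);" example above
polytomy = (("a", "b", "c"), "d")
print(resolve(polytomy))
```

Grouping ("b", "c") or ("a", "c") first would be equally valid, so the resolved clade should not be interpreted as a supported relationship.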

Chaining functions and arguments¶

Because the tree modification calls in toytrees always return a copy of the object, you can chain together many of these functions when building a plot. This is especially nice if you are only modifying the tree temporarily for the purpose of plotting (e.g., rotating nodes), and so you don’t need to store the intermediate trees. It’s somewhat analogous to using pipes in bash programming.

When chaining many function calls and plotting styles together in toytree code it is best to use good coding practices. In the example below I split each function call and style option over a separate line. This makes the code more readable and easier to debug, since you can comment out one line at a time to examine its effect without breaking the rest of the command. The parentheses surrounding the main function calls make this possible.

[37]:

# readable style for writing long draw functions
canvas, axes, mark = (
    tre
    .root(wildcard="prz")
    .drop_tips(wildcard="superba")
    .rotate_node(wildcard="30686")
    .draw(
        tip_labels_align=True,
        edge_style={
            "stroke": toytree.colors[3],
        }
    )
)

Attributes and functions¶

[38]:

rtre.get_tip_labels()             # list of labels in node-plot order
rtre.get_tip_coordinates()        # array of tip plot coordinates in idx order
rtre.get_node_values()            # list in node-plot order
rtre.get_node_dict()              # dict mapping idx:name for each tip
rtre.get_node_coordinates()       # array of node plot coordinates in idx order
rtre.get_edge_values()            # list of edge values in edge plot order
rtre.get_edge_values_mapped();    # list of edge values with mapped dict in edge plot order

[39]:

rtre.is_bifurcating()             # boolean
rtre.is_rooted();                 # boolean

[40]:

rtre.nnodes                       # number of nodes in the tree
rtre.ntips                        # number of tips in the tree
rtre.newick                       # the newick representation of the tree
rtre.features                     # list of node features that can be accessed
rtre.style;                       # dict of plotting style of tree

Saving/writing ToyTrees¶

[41]:

# if no file handle is entered then the newick string is returned
rtre.write()

[41]:

'((32082_przewalskii:0.00259326,33588_przewalskii:0.00247134)100:0.0179371,(((29154_superba:0.00634237,30686_cyathophylla:0.00669945)100:0.00237995,(41954_cyathophylloides:8.88803e-05,41478_cyathophylloides:5.28218e-05)100:0.00941021)100:0.00297626,(33413_thamno:0.00565358,(30556_thamno:0.00653218,((40578_rex:0.00335406,35855_rex:0.00339963)100:0.00223,(35236_rex:0.00580525,(39618_rex:0.000962081,38362_rex:0.00109218)100:0.00617527)96:0.0007389)99:0.000783365)100:0.0010338)100:0.00538723)100:0.0179371);'

[42]:

# the tree_format option selects among different newick formats.
rtre.write(tree_format=9)

[42]:

'((32082_przewalskii,33588_przewalskii),(((29154_superba,30686_cyathophylla),(41954_cyathophylloides,41478_cyathophylloides)),(33413_thamno,(30556_thamno,((40578_rex,35855_rex),(35236_rex,(39618_rex,38362_rex)))))));'
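
The difference between the two outputs above is that the second string keeps only the topology. As a rough illustration of what is removed, the sketch below strips branch lengths and support values from a newick string with regular expressions. The helper strip_to_topology is hypothetical and works purely on text; it is not how toytree writes its formats, and it assumes tip names contain no colons or parentheses.

```python
import re

# a fragment of the newick string returned by rtre.write() above,
# with a closing semicolon added so that it is a complete tree
newick = (
    "(41954_cyathophylloides:8.88803e-05,"
    "41478_cyathophylloides:5.28218e-05)100:0.00941021;"
)


def strip_to_topology(text):
    """Hypothetical helper: remove branch lengths and support values."""
    # drop ':<number>' branch lengths, including scientific notation
    text = re.sub(r":[0-9.eE+-]+", "", text)
    # drop support values written directly after a closing parenthesis
    return re.sub(r"\)[0-9.]+", ")", text)


print(strip_to_topology(newick))
# (41954_cyathophylloides,41478_cyathophylloides);
```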

[43]:

# write to file
rtre.write("/tmp/mytree.tre", tree_format=0)