Welcome to Toytree¶
Documentation¶
The toytree ethos¶
Welcome to toytree, a minimalist tree manipulation and plotting library. Toytree combines a popular Tree data structure based on the ete3 library with modern plotting tools based on the toyplot plotting library. The goal of toytree is to provide a light-weight Python equivalent to commonly used tree manipulation and plotting libraries in R, and in doing so, to promote further development of phylogenetic and other tree-based methods in Python.
Toytree Goals:
- style: beautiful “out-of-the-box” figures that require minimal styling.
- customization: extensive style options using CSS semantics.
- simplicity: several pre-defined plotting styles (e.g., coalescent, dark-mode).
- tree manipulation: easily traversable class object. Rooting, ordering, etc.
- tree statistics: edge lengths, node heights.
- tree comparisons: cloud tree plotting, Robinson-Foulds distance calculations.
- transparency: interactive plots make raw data available as pop-ups.
- reproducibility: code and plots in jupyter notebooks.
- extendability: combine with toyplot scatterplots, barplots, colormapping, etc.
- minimalism: lightweight, easy installation.
Try it now¶
If you are not familiar with jupyter-notebooks, or just want to try out toytree before installing, you can do so easily by connecting to an interactive notebook running in the cloud. Use the following link to connect to a tutorial notebook running through a free service called binder:
Launch binder: toytree binder
Installation¶
Toytree can be installed using pip or conda. I recommend the conda version. Either should pull in all dependencies including toyplot.
Conda install¶
conda install toytree -c eaton-lab
Pip install¶
pip install toytree
Dependencies:¶
Toytree dependencies:
- toyplot
- numpy
- future
- requests
- notebook
Additional toyplot dependencies:
- multipledispatch
- pandas >=0.14.1
- pypng >=0.0.18
- pillow
- reportlab
- graphviz
- networkx
- six
- mock
- custom_inherit
- ghostscript
- arrow
Quick Guide¶
Toytree is a Python tree plotting library designed for use inside jupyter notebooks. In fact, this entire tutorial was created using notebooks, and assumes that you are following along in a notebook of your own. To begin, we will import toytree, and the plotting library it is built on, toyplot, as well as numpy for generating some numerical data.
[1]:
import toytree # a tree plotting library
import toyplot # a general plotting library
import numpy as np # numerical library
[2]:
print(toytree.__version__)
print(toyplot.__version__)
print(np.__version__)
2.0.0
0.20.0-dev
1.17.3
Load and draw your first tree¶
The main class object in toytree is a ToyTree, which provides plotting functionality in addition to a number of useful functions and attributes for returning values and statistics about trees. As we’ll see below, you can generate a ToyTree object in many ways, but generally it is done by reading in a newick formatted string of text. The example below shows the simplest way to load a ToyTree, which is to use the toytree.tree() convenience function to parse a file, URL, or string.
[3]:
# load a toytree from a newick string at a URL
tre = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")
[4]:
# root and draw the tree (more details on this coming up...)
rtre = tre.root(wildcard="prz")
rtre.draw(tip_labels_align=True);
Parsing Newick/Nexus data¶
ToyTrees can be flexibly loaded from a range of text formats. Below are two newick strings in different tree_formats. The first has edge lengths and support values, the second has edge lengths and node labels. These are two different ways of writing tree data in a serialized format. Format 0 expects the internal node values to be integers or floats that represent support values; format 1 expects internal node values to be strings used as node labels.
[5]:
# newick with edge-lengths & support values
newick = "((a:1,b:1)90:3,(c:3,(d:1, e:1)100:2)100:1)100;"
tre0 = toytree.tree(newick, tree_format=0)
# newick with edge-lengths & string node-labels
newick = "((a:1,b:1)A:3,(c:3,(d:1, e:1)B:2)C:1)root;"
tre1 = toytree.tree(newick, tree_format=1)
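The difference between the two formats can be illustrated without toytree at all. The sketch below is not toytree's parser; it is a small regular-expression helper (the names internal_labels and interpret are made up for this illustration) that pulls out the label following each closing parenthesis and interprets it either as a numeric support value or as a string name.

```python
import re

def internal_labels(newick):
    """Return the raw label that follows each closing parenthesis."""
    return re.findall(r"\)([^:;,()\[\]]*)", newick)

def interpret(label, tree_format):
    """Hypothetical helper: format 0 reads supports, format 1 reads names."""
    if tree_format == 0:
        return float(label) if label else None
    return label

supports = [
    interpret(label, 0)
    for label in internal_labels("((a:1,b:1)90:3,(c:3,(d:1, e:1)100:2)100:1)100;")
]
names = [
    interpret(label, 1)
    for label in internal_labels("((a:1,b:1)A:3,(c:3,(d:1, e:1)B:2)C:1)root;")
]
print(supports)
print(names)
```

The same position in the string carries a number in one case and a name in the other, which is why the parser needs to be told (or to guess) which interpretation applies.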
To parse either format you can tell toytree the format of the newick string following the tree parsing formats in ete. The default option, and most common format, is 0. If you don’t enter a tree_format argument the default format will usually parse it just fine. Toytree can also parse extended newick format (NHX) files, which store additional metadata, as well as mrbayes formatted files (tree_format=10), which are a variant of NHX. Any of these formats can be parsed from a NEXUS file automatically.
[6]:
# parse an NHX format string with node supports and names
nhx = "((a:3[&&NHX:name=a:support=100],b:2[&&NHX:name=b:support=100]):4[&&NHX:name=ab:support=60],c:5[&&NHX:name=c:support=100]);"
ntre = toytree.tree(nhx)
# parse a mrbayes format file with NHX-like node and edge info
mb = "((a[&prob=100]:0.1[&length=0.1],b[&prob=100]:0.2[&length=0.2])[&prob=90]:0.4[&length=0.4],c[&prob=100]:0.6[&length=0.6]);"
mtre = toytree.tree(mb, tree_format=10)
# parse a NEXUS formatted file containing a tree of any supported format
nex = """
#NEXUS
begin trees;
translate;
1 apple,
2 blueberry,
3 cantaloupe,
4 durian,
;
tree tree0 = [&U] ((1,2),(3,4));
end;
"""
xtre = toytree.tree(nex)
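To make the NHX metadata concrete, here is a minimal sketch, independent of toytree, that extracts the key/value pairs stored in each [&&NHX:...] comment of the string used above. The function name nhx_annotations is an assumption for this illustration, and the real parser does considerably more (it attaches these values to nodes).

```python
import re

nhx = (
    "((a:3[&&NHX:name=a:support=100],b:2[&&NHX:name=b:support=100])"
    ":4[&&NHX:name=ab:support=60],c:5[&&NHX:name=c:support=100]);"
)

def nhx_annotations(newick):
    """Return one dict of key/value strings per [&&NHX:...] comment."""
    records = []
    for body in re.findall(r"\[&&NHX:([^\]]*)\]", newick):
        # fields are separated by ':' and each field is 'key=value'
        pairs = (item.split("=", 1) for item in body.split(":"))
        records.append({key: value for key, value in pairs})
    return records

annotations = nhx_annotations(nhx)
for record in annotations:
    print(record)
```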
Accessing tree data¶
You can use tab-completion by typing the name of the tree variable (e.g., rtre below) followed by a dot and then pressing <tab> to see the many attributes of ToyTrees. Below I print a few of them as examples.
[7]:
rtre.ntips
[7]:
13
[8]:
rtre.nnodes
[8]:
25
[9]:
tre.is_rooted(), rtre.is_rooted()
[9]:
(False, True)
[10]:
rtre.get_tip_labels()
[10]:
['38362_rex',
'39618_rex',
'35236_rex',
'35855_rex',
'40578_rex',
'30556_thamno',
'33413_thamno',
'41478_cyathophylloides',
'41954_cyathophylloides',
'30686_cyathophylla',
'29154_superba',
'33588_przewalskii',
'32082_przewalskii']
[11]:
rtre.get_edges()
[11]:
array([[13, 0],
[13, 1],
[14, 2],
[15, 3],
[15, 4],
[17, 5],
[20, 6],
[18, 7],
[18, 8],
[19, 9],
[19, 10],
[23, 11],
[23, 12],
[14, 13],
[16, 14],
[16, 15],
[17, 16],
[20, 17],
[21, 18],
[21, 19],
[22, 20],
[22, 21],
[24, 22],
[24, 23]])
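The edge array is enough to recover basic facts about the tree. The sketch below copies the rows printed above into a plain Python list and assumes each row is a (parent, child) pair, which is consistent with the tip indices 0 to 12 appearing only in the second column. It does not call toytree.

```python
# rows copied from the get_edges() output above, read as (parent, child)
edges = [
    (13, 0), (13, 1), (14, 2), (15, 3), (15, 4), (17, 5), (20, 6), (18, 7),
    (18, 8), (19, 9), (19, 10), (23, 11), (23, 12), (14, 13), (16, 14),
    (16, 15), (17, 16), (20, 17), (21, 18), (21, 19), (22, 20), (22, 21),
    (24, 22), (24, 23),
]

parents = {parent for parent, child in edges}
children = {child for parent, child in edges}

root = (parents - children).pop()   # the only node that is never a child
tips = sorted(children - parents)   # nodes that are never a parent
nnodes = len(parents | children)

print(root, len(tips), nnodes)
```

The counts agree with the ntips and nnodes values shown earlier for rtre.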
Tree Classes¶
The main class objects in toytree exist as a nested hierarchy. The core of any tree is the TreeNode object, which stores the tree structure in memory and allows fast traversal over nodes of the tree to describe its structure. This object is wrapped inside of ToyTree objects, which provide convenient access to TreeNodes while also providing plotting and tree modification functions. Multiple ToyTrees can in turn be grouped together into MultiTree objects, which are useful for iterating over multiple trees, or for generating plots that overlay and compare trees.
The underlying TreeNode object of ToyTrees will be familiar to users of the ete3 Python library, since it is pretty much a stripped-down forked version of their TreeNode class object. This is useful since ete has great documentation. You can access the TreeNode of any ToyTree using its .treenode attribute, like below. Beginner toytree users are unlikely to need to access TreeNode objects directly, and instead will mostly access the tree structure through ToyTree objects.
[12]:
# a TreeNode object is contained within every ToyTree at .treenode
tre.treenode
[12]:
<toytree.TreeNode.TreeNode at 0x7fa1903b5fd0>
[13]:
# a ToyTree object
toytree.tree("((a, b), c);")
[13]:
<toytree.Toytree.ToyTree at 0x7fa162387160>
[14]:
# a MultiTree object
toytree.mtree([tre, tre, tre])
[14]:
<toytree.Multitree.MultiTree at 0x7fa162387b38>
Drawing trees: basics¶
When you call .draw() on a tree it returns three objects: a Canvas, a Cartesian axes object, and a Mark. This follows the design principle of the toyplot plotting library on which toytree is based. The Canvas describes the plot space, and the Cartesian coordinates define how to project points onto that space. One canvas can have multiple cartesian coordinates, and each cartesian object can have multiple Marks. This will be demonstrated more later.
As you will see below, I end many toytree drawing commands with a semicolon (;). This simply hides the printed return statement showing that the Canvas, Cartesian, and Mark objects were returned. The Canvas will automatically render in the cell below the code even if you do not save the returned Canvas as a variable. Below I do not use a semicolon and so the three returned objects are shown as text (e.g., <toyplot.canvas.Canvas…>), and the plot is displayed.
[15]:
rtre.draw()
[15]:
(<toyplot.canvas.Canvas at 0x7fa16238eda0>,
<toyplot.coordinates.Cartesian at 0x7fa16239d4a8>,
<toytree.Render.ToytreeMark at 0x7fa16239dba8>)
[16]:
# the semicolon hides the returned text of the Canvas and Cartesian objects
rtre.draw();
[17]:
# or, we can store them as variables (this allows more editing on them later)
canvas, axes, mark = rtre.draw()
Drawing trees: styles¶
There are innumerable ways in which to style ToyTree drawings. We provide a number of pre-built tree_styles (normal, dark, coalescent, multitree), but users can also create their own style dictionaries that can be easily reused. Below are some examples. You can use tab-completion within the draw function to see the docstring for more details on available arguments to toggle, or you can see which styles are available on ToyTrees by accessing their .style dictionary. See the Styling chapter for more details.
[18]:
# drawing with pre-built tree_styles
rtre.draw(tree_style='n'); # normal-style
rtre.draw(tree_style='d'); # dark-style
# 'ts' is also a shortcut for tree_style
rtre.draw(ts='o'); # umlaut-style
[19]:
# define a style dictionary
mystyle = {
"layout": 'd',
"edge_type": 'p',
"edge_style": {
"stroke": toytree.colors[2],
"stroke-width": 2.5,
},
"tip_labels_align": True,
"tip_labels_colors": toytree.colors[0],
"tip_labels_style": {
"font-size": "10px"
},
"node_labels": False,
"node_sizes": 8,
"node_colors": toytree.colors[2],
}
[20]:
# use your custom style dictionary in one or more tree drawings
rtre.draw(height=400, **mystyle);
Drawing trees: nodes¶
Plotting node values on a tree is a useful way of representing additional information about trees. Toytree tries to make this process fool-proof, in the sense that the data you plot on nodes will always be the correct data associated with that node. This is done through simple shortcut methods for plotting node features, as well as a convenience function called .get_node_values() that draws the values explicitly from the same tree structure that is being plotted (this avoids making a list of values from a tree and then plotting them on that tree, only to find that the order of tips or nodes in the tree has changed). Finally, toytree also provides interactive features that allow you to explore many features of your data by simply hovering over nodes with your cursor. This is made possible by the HTML+JS framework in which toytrees are displayed in jupyter notebooks, or in web-pages.
[21]:
# hover over nodes to see pop-up elements
rtre.draw(height=350, node_hover=True, node_sizes=10, tip_labels_align=True);
In the example above the labels on each node indicate their “idx” value, which is simply a unique identifier given to every node. We could alternatively select one of the features that you could see listed on the node when you hovered over it and toytree will display that value on the node instead. In the example below we plot the node support values. You’ll notice that in this context no values were shown for the tip nodes, but instead only for internal nodes. More on this below.
[22]:
rtre.draw(node_labels='support', node_sizes=15);
You can also create plots with the nodes shown, but without node labels. This is often most useful when combined with mapping different colors to nodes to represent different classes of data. In the example below we pass a single color and size for all nodes.
[23]:
# You can do the same without printing the 'idx' label on nodes.
rtre.draw(
node_labels=None,
node_sizes=10,
node_colors='grey'
);
You can draw values on all the nodes, or only on non-tip nodes, or only on internal nodes (not tips or root). Use the .get_node_values() function of ToyTrees to build a list of values for plotting on the tree. Because the data are extracted from the same tree they will be plotted on, the values will always be ordered properly.
[24]:
tre0.get_node_values("support", show_root=1, show_tips=1)
[24]:
array([100, 90, 100, 100, 0, 0, 0, 0, 0])
[25]:
tre0.get_node_values("support", show_root=1, show_tips=0)
[25]:
array(['100', '90', '100', '100', '', '', '', '', ''], dtype='<U21')
[26]:
tre0.get_node_values("support", show_root=0, show_tips=0)
[26]:
array(['', '90', '100', '100', '', '', '', '', ''], dtype='<U3')
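The effect of show_root and show_tips can be mimicked with a few lines of plain Python. This is only a sketch of the masking behaviour visible in the outputs above, not toytree's implementation: the helper name mask_node_values is made up, and it assumes that the first entry belongs to the root and the last five entries belong to the tips, as the printed arrays suggest.

```python
def mask_node_values(values, root_index, tip_indices, show_root=False, show_tips=False):
    """Blank out the root and/or tip entries, keeping the original order."""
    hidden = set()
    if not show_root:
        hidden.add(root_index)
    if not show_tips:
        hidden.update(tip_indices)
    return ["" if i in hidden else str(v) for i, v in enumerate(values)]

support = [100, 90, 100, 100, 0, 0, 0, 0, 0]   # values shown in the first output
tips = range(4, 9)                              # assumed positions of the five tips

no_tips = mask_node_values(support, 0, tips, show_root=True, show_tips=False)
internal_only = mask_node_values(support, 0, tips)
print(no_tips)
print(internal_only)
```

Masked entries become empty strings rather than being removed, so the list keeps one entry per node and stays aligned with the nodes of the drawing.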
[27]:
# show support values
tre0.draw(
node_labels=tre0.get_node_values("support", 0, 0),
node_sizes=20,
);
[28]:
# show support values on all nodes, including the root and tips
tre0.draw(
node_labels=tre0.get_node_values("support", 1, 1),
node_sizes=20,
);
Because .get_node_values() returns values in node plot order, it is especially useful for building lists of values for color mapping on nodes. Here we map different colors to nodes depending on whether the support value is 100 or not.
[29]:
# build a color list in node plot order with different values based on support
colors = [
toytree.colors[0] if i==100 else toytree.colors[1]
for i in rtre.get_node_values('support', 1, 1)
]
# draw the tree with nodes colored by the list built above
rtre.draw(
node_sizes=10,
node_colors=colors
);
Drawing: saving figures¶
Toytree drawings can be saved to disk using the render functions of toyplot. This is where it is useful to store the Canvas object as a variable when it is returned during a toytree drawing. You can save toyplot figures in a variety of formats, including HTML (which is actually an SVG figure wrapped in HTML with additional javascript to provide interactivity), SVG, PDF, and PNG.
[30]:
# draw a plot and store the Canvas object to a variable
canvas, axes, mark = rtre.draw(width=400, height=300);
HTML rendering is the default format. This will save the figure as a vector graphic (SVG) wrapped in HTML with additional optional javascript wrapping for interactive features. You can share the file with others and anyone can open it in a browser. You can embed it on your website, or even display it in emails!
[31]:
# for sharing through web-links (or even email!) html is great!
toyplot.html.render(canvas, "/tmp/tree-plot.html")
Optional formats: If you want to do additional styling of your figures in Illustrator or Inkscape (recommended) then SVG is likely your best option. You can save figures in SVG by importing the toyplot.svg module.
[32]:
# for creating scientific figures SVG is often the most useful format
import toyplot.svg
toyplot.svg.render(canvas, "/tmp/tree-plot.svg")
Despite the advantages of working with the SVG or HTML formats (e.g., vector graphics and interactive pop-ups), if you’re like me you still sometimes love to have an old-fashioned PDF. Again, you can import this from toyplot.
[33]:
import toyplot.pdf
toyplot.pdf.render(canvas, "/tmp/tree-plot.pdf")
Drawing: The Canvas, Axes, and coordinates¶
When you call the .draw() function of a ToyTree it returns a Canvas, a Cartesian axes object, and a Mark. The first two are Toyplot objects used to display the figure: the Canvas is the HTML element that holds the figure, and the Cartesian axes object represents the coordinates for the plot. You can store these objects when they are returned by the draw() function to further manipulate the plot. Storing the Canvas is necessary in order to save the plot.
The Canvas and Axes¶
If you wish to combine multiple toytree figures into a single figure then it is easiest to first create instances of the toyplot Canvas and Axes objects and then to add the toytree drawing to this plot by using the .draw(axes=axes) argument. In the example below we first define the Canvas size, then define two coordinate axes inside of this Canvas, and then we pass these coordinate axes objects to two separate toytree drawings.
[34]:
# set dimensions of the canvas
canvas = toyplot.Canvas(width=700, height=250)
# dissect canvas into multiple cartesian areas (x1, x2, y1, y2)
ax0 = canvas.cartesian(bounds=('10%', '45%', '10%', '90%'))
ax1 = canvas.cartesian(bounds=('55%', '90%', '10%', '90%'))
# call draw with the 'axes' argument to pass it to a specific cartesian area
style = {
"tip_labels_align": True,
"tip_labels_style": {
"font-size": "9px"
},
}
rtre.draw(axes=ax0, **style);
rtre.draw(axes=ax1, tip_labels_colors='indigo', **style);
# hide the axes (e.g, ticks and splines)
ax0.show=False
ax1.show=False
The Coordinates¶
Toytree drawings are designed to use a set coordinate space within the axes to make it easy to situate additional plots to align with tree drawings. Regardless of whether the tree drawing is oriented ‘right’ or ‘down’, the farthest tip of the tree (not the tip label but the tip itself) will align at the zero-axis. For right-facing trees this means at x=0, for down-facing trees this means y=0. On the other axis, tree tips are spaced one unit apart, starting at zero. For tips on aligning additional plotting methods (barplots, scatterplots, etc.) with toytree drawings see the Cookbook gallery. Below I add a grid to overlay tree plots in both orientations to highlight the coordinate space.
[35]:
# store the returned Canvas and Axes objects
canvas, axes, mark = rtre.draw(
width=300,
height=300,
tip_labels_align=True,
tip_labels=False,
)
# show the axes coordinates
axes.show = True
axes.x.ticks.show = True
axes.y.ticks.show = True
# overlay a grid
axes.hlines(np.arange(0, 13, 2), style={"stroke": "red", "stroke-dasharray": "2,4"})
axes.vlines(0, style={"stroke": "blue", "stroke-dasharray": "2,4"});
[36]:
# store the returned Canvas and Axes objects
canvas, axes, mark = rtre.draw(
width=300,
height=300,
tip_labels=False,
tip_labels_align=True,
layout='d',
)
# show the axes coordinates
axes.show = True
axes.x.ticks.show = True
axes.y.ticks.show = True
# overlay a grid
axes.vlines(np.arange(0, 13, 2), style={"stroke": "red", "stroke-dasharray": "2,4"})
axes.hlines(0, style={"stroke": "blue", "stroke-dasharray": "2,4"});
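The coordinate convention described above can be written down as a tiny function. This is a sketch of the convention only, under the assumption that tips sit at integer positions 0, 1, 2, and so on along one axis and at zero on the other (as for tips aligned with tip_labels_align=True); tip_coordinates is a hypothetical helper and is not part of toytree.

```python
def tip_coordinates(ntips, layout="r"):
    """Return assumed (x, y) positions of aligned tips for a layout."""
    if layout == "r":
        # right-facing tree: tips line up at x=0 and are stacked along y
        return [(0, i) for i in range(ntips)]
    if layout == "d":
        # down-facing tree: tips line up at y=0 and are spread along x
        return [(i, 0) for i in range(ntips)]
    raise ValueError("layout must be 'r' or 'd' in this sketch")

right = tip_coordinates(13, "r")
down = tip_coordinates(13, "d")
print(right[:3], down[:3])
```

Knowing these positions in advance is what makes it straightforward to place a barplot or scatterplot so that each mark lines up with its tip.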
ToyTree objects¶
The main class object users interact with in toytree is called a ToyTree. This object contains a number of useful functions for interacting with the underlying TreeNode structure (e.g., rooting, dropping tips) and for drawing trees and adding data from the tree (e.g., support values) to the plots. The tree structure and the data used to build tree drawings are tightly linked in toytree, with the goal of making it very difficult for users to accidentally plot tip or node labels in an incorrect order. This section of the tutorial is primarily about how ToyTree objects store data, and how to access it easily using their functions.
[1]:
import toytree
import toyplot
import numpy as np
[2]:
# load a tree for this tutorial
tre = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")
Selecting parts of a tree¶
Toytree provides many functions for modifying the tree structure (e.g., rooting a tree, dropping tips), as well as methods for applying styles to specific parts of the tree (e.g., coloring edges differently). Both of these require an easy and reliable method for selecting specific parts of the tree while also minimizing the chance for user error.
Selecting subtrees with tip labels¶
In toytree we recommend using tip labels to select the location in the tree where it should be manipulated. Why use tip labels instead of node names or indices? Well, using node indices (e.g., idx labels) would be a reasonable alternative, but it turns out this would likely be more error prone for users (although it is also allowed as an option). This is because if the tree is modified (e.g., if tips are dropped or the tree is re-rooted) the node indices will change. In contrast, the relationships among tips (i.e., who shares a more recent common ancestor with whom) do not change with any of these tree modifications. Node names are another option, but in most trees internal nodes are not named.
The plot below shows how node idx labels change as the tree is modified. This is the reason why using idx labels as selectors is more error prone.
[3]:
# store a rooted copy of tre (more on this later...)
rtre = tre.root(['33588_przewalskii', '32082_przewalskii'])
[4]:
rtre.draw();
[25]:
# a multitree storing the unrooted and rooted toytrees
mtre = toytree.mtree([tre, rtre])
# plot shows that idx labels change with rerooting
mtre.draw(
node_labels='idx',
node_sizes=15,
);
Fuzzy tip label matching¶
Many toytree functions allow for a variety of input methods to select the list of tip labels to represent a clade. To create the name list without having to type each name out by hand, you can use fuzzy name matching. The three options are to write each name into a list using the names argument; to select samples based on a shared unique string sequence in their names with wildcard; or to use a regex (regular expression) statement to match samples using more complex name patterns.
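The three selection modes can be sketched in plain Python. The function match_tips below is a hypothetical stand-in written for this illustration: it assumes that wildcard means a substring match and that regex is applied from the start of each name, which may differ in detail from what toytree does internally.

```python
import re

tip_labels = [
    "38362_rex", "39618_rex", "35236_rex", "35855_rex", "40578_rex",
    "30556_thamno", "33413_thamno", "41478_cyathophylloides",
    "41954_cyathophylloides", "30686_cyathophylla", "29154_superba",
    "33588_przewalskii", "32082_przewalskii",
]

def match_tips(labels, names=None, wildcard=None, regex=None):
    """Select labels by explicit names, by substring, or by regular expression."""
    if names is not None:
        return [label for label in labels if label in names]
    if wildcard is not None:
        return [label for label in labels if wildcard in label]
    if regex is not None:
        return [label for label in labels if re.match(regex, label)]
    return list(labels)

by_names = match_tips(tip_labels, names=["33588_przewalskii", "32082_przewalskii"])
by_wildcard = match_tips(tip_labels, wildcard="prz")
by_regex = match_tips(tip_labels, regex="[0-9]*_przewalskii")
print(by_names == by_wildcard == by_regex)
```

All three calls select the same two przewalskii samples; the wildcard form is the shortest to type, provided the substring is unique to the clade.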
Get node idx label from tip labels¶
In the example below I use the function .get_mrca_idx_from_tip_labels(), which returns the node index (idx) of the most recent common ancestor (mrca) of the tips entered as arguments. You can see in the example below that the names, wildcard, and regex arguments return the correct node label for the clade that includes the two przewalskii samples (see the figure above) for each tree.
[26]:
# get an idx label of przewalskii clade using names, wildcard or regex
print('tre: ', tre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('tre: ', tre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('tre: ', tre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))
# the same three queries on the rooted tree return a different idx label
print('rtre:', rtre.get_mrca_idx_from_tip_labels(names=['33588_przewalskii', '32082_przewalskii']))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(wildcard="prz"))
print('rtre:', rtre.get_mrca_idx_from_tip_labels(regex="[0-9]*_przewalskii"))
tre: 19
tre: 19
tre: 19
rtre: 23
rtre: 23
rtre: 23
Get TreeNode object from node idx label¶
[27]:
tre.idx_dict[19]
[27]:
<toytree.TreeNode.TreeNode at 0x7f2b105d0320>
Get tip labels from a node idx label¶
If you really want to select parts of the tree using nodes, perhaps because the tip names are very hard to match, you can use the get_tip_labels() function to build a list of tip names from a node idx label. If you enter an idx argument, the function returns the list of tip names descended from that node. If no idx argument is entered then the root node idx is used so that all tip labels are returned.
[28]:
# get list of tips descended from a specific node in the tree
tre.get_tip_labels(idx=19)
[28]:
['33588_przewalskii', '32082_przewalskii']
[29]:
# get list of all tips in the tree
tre.get_tip_labels()
[29]:
['38362_rex',
'39618_rex',
'35236_rex',
'35855_rex',
'40578_rex',
'30556_thamno',
'33413_thamno',
'33588_przewalskii',
'32082_przewalskii',
'30686_cyathophylla',
'29154_superba',
'41478_cyathophylloides',
'41954_cyathophylloides']
The .get_tip_labels() function can be combined with the .get_mrca_idx_from_tip_labels() function to get a list of names that are all descended from a common ancestor. For example, in the rooted tree above, if I wanted to get a list of all tip labels in the ingroup clade I could select just one sample from each of the two subclades in it with .get_mrca_idx_from_tip_labels() to get the node idx of their common ancestor, and then pass this to .get_tip_labels() to return the full list of descendants. This is an efficient way to build a list of tip label names for a large clade without having to write them all out by hand.
[30]:
# get node index (idx) of mrca
idx = rtre.get_mrca_idx_from_tip_labels(["29154_superba", "40578_rex"])
# get tip labels descended from node idx
rtre.get_tip_labels(idx=idx)
[30]:
['38362_rex',
'39618_rex',
'35236_rex',
'35855_rex',
'40578_rex',
'30556_thamno',
'33413_thamno',
'41478_cyathophylloides',
'41954_cyathophylloides',
'30686_cyathophylla',
'29154_superba']
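The two-step lookup (find the mrca, then list its descendants) can be reproduced from the edge list alone. The sketch below reuses the (parent, child) rows printed by rtre.get_edges() in the Quick Guide and works purely with node idx values; the helper names ancestors, mrca, and descendant_tips are made up for this illustration and are not toytree functions.

```python
# (parent, child) rows copied from the rtre.get_edges() output shown earlier
edges = [
    (13, 0), (13, 1), (14, 2), (15, 3), (15, 4), (17, 5), (20, 6), (18, 7),
    (18, 8), (19, 9), (19, 10), (23, 11), (23, 12), (14, 13), (16, 14),
    (16, 15), (17, 16), (20, 17), (21, 18), (21, 19), (22, 20), (22, 21),
    (24, 22), (24, 23),
]
parent_of = {child: parent for parent, child in edges}

def ancestors(node):
    """Return the path of ancestors from a node up to the root."""
    path = []
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path

def mrca(*tips):
    """Return the most recent ancestor shared by all of the given tips."""
    common = set(ancestors(tips[0]))
    for tip in tips[1:]:
        common &= set(ancestors(tip))
    # walking up from any one tip, the first shared ancestor is the mrca
    return next(node for node in ancestors(tips[0]) if node in common)

def descendant_tips(node):
    """Return the sorted tip idx values found below a node."""
    children = [child for parent, child in edges if parent == node]
    if not children:
        return [node]
    return sorted(tip for child in children for tip in descendant_tips(child))

print(mrca(11, 12))
print(descendant_tips(mrca(10, 4)))
```

The first call gives the same idx that the przewalskii query reported for the rooted tree, and the second lists eleven tips, matching the length of the ingroup list above.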
Modifying ToyTrees¶
ToyTrees provide a number of functions for modifying the tree structure. All of these methods return a modified copy of the object – they do not change your original tree by modifying it in place. This is useful because you can reliably chain together multiple tree modification functions (e.g., see Chaining many functions and arguments). As discussed above, it is generally good practice to use tip name selectors to identify clades that should be modified on the tree. In some cases, if you are modifying a tree and using plotting styles that both rely on the tree structure, it may be easier and more clear to separate the code into multiple separate function calls. The process of chaining arguments together makes for elegant code, but use whichever method is most comfortable for you. See the Cookbook gallery for more examples.
Rooting trees¶
You can root toytrees using the .root() function call. This takes as an argument either a single tip name, or a list of tip names. You can use the fuzzy name matching options to match multiple tip names, as shown below.
[31]:
# three ways to do the same re-rooting
rtre = tre.root(names=["32082_przewalskii", "33588_przewalskii"])
rtre = tre.root(wildcard="prz")
rtre = tre.root(regex="[0-9]*_przewalskii")
# draw the rooted tree
rtre.draw(node_labels='idx', node_sizes=15);
There is also a function .unroot() to remove the root node from trees. This creates a polytomy at the root. Technically there still exists a point on the treenode structure that we refer to as the root, but it does not appear in drawings.
[32]:
# an unrooted tree
rtre.unroot().draw();
Drop tips¶
Dropping tips from a tree retains the structure of the remaining nodes in the tree. Here again you can use fuzzy name matching to select the tips you wish to drop from the tree. In this case the names that are selected with matching do not have to form a monophyletic clade, however, if you select to remove all tips in the tree then it will raise an error.
[33]:
rtre.drop_tips(wildcard="cyatho").draw();
Ladderize¶
By default toytrees are ladderized unless you change the tip order in some way, by either entering a fixed_order for tip labels, by dropping tips from the tree, or by rotating nodes. If you want to return a tree to being ladderized you can do so with the .ladderize() function.
[34]:
# dropping tips unladderized the tree, so we re-ladderized it before plotting
rtre.drop_tips(wildcard="cyatho").ladderize().draw();
Rotate nodes¶
Rotating nodes of the tree does not affect the actual tree structure (e.g., the newick structure does not change), it simply affects the order of tips when the tree is drawn. You can rotate nodes by entering tip names as in the previous examples using either names, wildcard, or regex. The names must form a monophyletic clade for one of the descendants of the node you wish to rotate. Rotating nodes for plotting is usually done for some aesthetic reason, such as aligning tips better with geography or trait values plotted on the tips of the tree.
[35]:
rtre.rotate_node(wildcard="prz").draw();
Resolve polytomy¶
This method should generally not be used unless needed. The problem is that you usually don’t know what to set the branch length to for the new edge when you split a polytomy. If the tree is unrooted then you should use .root() instead to root it. If you have a hard polytomy in the tree and need to resolve it, be aware that this method will resolve all polytomies in the tree. You can change what the default .dist and .support values will be on the new node.
[36]:
toytree.tree("((a,b,c),d);").resolve_polytomy(dist=1.).draw();
Chaining functions and arguments¶
Because the tree modification calls in toytrees always return a copy of the object, you can chain together many of these functions when building a plot. This is especially nice if you are only modifying the tree temporarily for the purpose of plotting (e.g., rotating nodes), and so you don’t need to store the intermediate trees. It’s kind of analogous to using pipes in bash programming.
When chaining many function calls and plotting styles together in toytree code it is best to use good coding practices. In the example below I split each function call and style option over a separate line. This makes the code more readable, and easier to debug, since you can comment out a line at a time to examine its effect without it breaking the rest of the command. The parentheses surrounding the main function calls makes this possible.
[37]:
# readable style for writing long draw functions
canvas, axes, mark = (
tre
.root(wildcard="prz")
.drop_tips(wildcard="superba")
.rotate_node(wildcard="30686")
.draw(
tip_labels_align=True,
edge_style={
"stroke": toytree.colors[3],
}
)
)
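Chaining works because each method hands back a new object instead of changing the one it was called on. The toy class below shows that pattern in isolation; MiniTree and its methods are hypothetical and only track a tuple of tip names, so this is an illustration of the design, not of toytree's code.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MiniTree:
    """Hypothetical immutable stand-in that only stores tip names."""
    tips: tuple

    def drop_tips(self, wildcard):
        # return a modified copy; the original object is left untouched
        kept = tuple(tip for tip in self.tips if wildcard not in tip)
        return replace(self, tips=kept)

    def reverse(self):
        # another copy-returning method, so calls can be chained
        return replace(self, tips=self.tips[::-1])

original = MiniTree(("a_rex", "b_rex", "c_superba", "d_prz"))
result = (
    original
    .drop_tips("superba")
    .reverse()
)
print(original.tips)
print(result.tips)
```

Since the original is never modified, the intermediate objects can be discarded safely and the starting tree remains available for other plots.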
Attributes and functions¶
[38]:
rtre.get_tip_labels() # list of labels in node-plot order
rtre.get_tip_coordinates() # array of tip plot coordinates in idx order
rtre.get_node_values() # list in node-plot order
rtre.get_node_dict() # dict mapping idx:name for each tip
rtre.get_node_coordinates() # array of node plot coordinates in idx order
rtre.get_edge_values() # list of edge values in edge plot order
rtre.get_edge_values_mapped(); # list of edge values with mapped dict in edge plot order
[39]:
rtre.is_bifurcating() # boolean
rtre.is_rooted(); # boolean
[40]:
rtre.nnodes # number of nodes in the tree
rtre.ntips # number of tips in the tree
rtre.newick # the newick representation of the tree
rtre.features # list of node features that can be accessed
rtre.style; # dict of plotting style of tree
Saving/writing ToyTrees¶
[41]:
# if no file handle is entered then the newick string is returned
rtre.write()
[41]:
'((32082_przewalskii:0.00259326,33588_przewalskii:0.00247134)100:0.0179371,(((29154_superba:0.00634237,30686_cyathophylla:0.00669945)100:0.00237995,(41954_cyathophylloides:8.88803e-05,41478_cyathophylloides:5.28218e-05)100:0.00941021)100:0.00297626,(33413_thamno:0.00565358,(30556_thamno:0.00653218,((40578_rex:0.00335406,35855_rex:0.00339963)100:0.00223,(35236_rex:0.00580525,(39618_rex:0.000962081,38362_rex:0.00109218)100:0.00617527)96:0.0007389)99:0.000783365)100:0.0010338)100:0.00538723)100:0.0179371);'
[42]:
# the tree_format option writes different newick formats.
rtre.write(tree_format=9)
[42]:
'((32082_przewalskii,33588_przewalskii),(((29154_superba,30686_cyathophylla),(41954_cyathophylloides,41478_cyathophylloides)),(33413_thamno,(30556_thamno,((40578_rex,35855_rex),(35236_rex,(39618_rex,38362_rex)))))));'
[43]:
# write to file
rtre.write("/tmp/mytree.tre", tree_format=0)
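To see what separates the full newick string from the topology-only string written with tree_format=9, the sketch below strips branch lengths and numeric support values with two regular expressions. It is a plain-Python illustration on a small made-up string, not how toytree serializes trees; the function name topology_only is an assumption, and it would also remove internal node names made only of digits.

```python
import re

def topology_only(newick):
    """Remove ':length' fields, then numeric labels that follow ')'."""
    no_lengths = re.sub(r":[0-9.eE+-]+", "", newick)
    return re.sub(r"\)[0-9.]+", ")", no_lengths)

full = "((a:1,b:1)90:3,(c:3,(d:1,e:1)100:2)100:1)100;"
stripped = topology_only(full)
print(stripped)
```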
TreeNode objects¶
The .treenode attribute of ToyTrees allows users to access the underlying TreeNode structure directly. This is where you can traverse the tree and query the parent/child relationships of nodes. While this is used extensively within the code of toytree, most users will likely not need to interact with TreeNodes in order to do most things they want toytree for (i.e., drawing). However, for power users, the TreeNode structure of toytrees provides a lot of additional functionality, especially for doing scientific computation and research on trees. The TreeNode object in toytree is a modified fork of the TreeNode in ete3. Thus, you can read the very detailed ete documentation if you want a detailed understanding of the object.
[1]:
import toytree
import toyplot
import numpy as np
# generate a random tree
tre = toytree.rtree.unittree(ntips=10, seed=12345)
TreeNode objects are always nested inside of ToyTree objects, and accessed from ToyTrees. When you use .treenode to access a TreeNode from a ToyTree you are actually accessing the top level node of the tree structure, the root. The root TreeNode is connected to every other TreeNode in the tree, and together they describe the tree structure.
[2]:
# the .treenode attribute of the ToyTree returns its root TreeNode
tre.treenode
[2]:
<toytree.TreeNode.TreeNode at 0x7f2b1f711978>
[3]:
# the .idx_dict of a toytree makes TreeNodes accessible by index
tre.idx_dict
[3]:
{18: <toytree.TreeNode.TreeNode at 0x7f2b1f711978>,
17: <toytree.TreeNode.TreeNode at 0x7f2b1f7e14e0>,
16: <toytree.TreeNode.TreeNode at 0x7f2b2fc31198>,
15: <toytree.TreeNode.TreeNode at 0x7f2b1f71a710>,
14: <toytree.TreeNode.TreeNode at 0x7f2b1f71a278>,
13: <toytree.TreeNode.TreeNode at 0x7f2b1f71ada0>,
12: <toytree.TreeNode.TreeNode at 0x7f2b2fc26ac8>,
11: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7828>,
10: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7898>,
9: <toytree.TreeNode.TreeNode at 0x7f2b2fc2dc18>,
8: <toytree.TreeNode.TreeNode at 0x7f2b1f71ae48>,
7: <toytree.TreeNode.TreeNode at 0x7f2b2fbc78d0>,
6: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7908>,
5: <toytree.TreeNode.TreeNode at 0x7f2b2fbc77f0>,
4: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7940>,
3: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7978>,
2: <toytree.TreeNode.TreeNode at 0x7f2b2fbc7860>,
1: <toytree.TreeNode.TreeNode at 0x7f2b2fbc79b0>,
0: <toytree.TreeNode.TreeNode at 0x7f2b2fbc79e8>}
Traversing TreeNodes¶
To traverse a tree means to move from node to node to visit every node of the tree. In this case, we move from TreeNode to TreeNode. Depending on your reason for traversing the tree, the order in which nodes are visited may be arbitrary, or, it may actually be very important. For example, if you wish to calculate some new value on a node that depends on the values of its children, then you will want to visit the child nodes before you visit their parents. TreeNodes can be traversed in three ways. Below I print the order that nodes are visited for each. You can see the node index labels plotted on the tree which toytree uses to order nodes for plotting.
[4]:
print('levelorder:', [node.idx for node in tre.treenode.traverse("levelorder")])
print('preorder: ', [node.idx for node in tre.treenode.traverse("preorder")])
print('postorder: ', [node.idx for node in tre.treenode.traverse("postorder")])
tre.draw(node_labels=True, node_sizes=16);
levelorder: [18, 17, 16, 9, 15, 14, 13, 8, 12, 5, 11, 2, 10, 7, 6, 4, 3, 1, 0]
preorder: [18, 17, 9, 15, 8, 12, 7, 6, 16, 14, 5, 11, 4, 3, 13, 2, 10, 1, 0]
postorder: [9, 8, 7, 6, 12, 15, 17, 5, 4, 3, 11, 14, 2, 1, 0, 10, 13, 16, 18]
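The three visiting orders can be reproduced on a toy example without toytree. The sketch below is only an illustration: the `Node` class is a hypothetical stand-in for a TreeNode, and the three generator functions are written from scratch rather than taken from the library.

```python
from collections import deque

class Node:
    """A bare-bones tree node (hypothetical stand-in for a TreeNode)."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def levelorder(root):
    """Visit nodes level by level, starting from the root."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)

def preorder(node):
    """Visit each parent before its children."""
    yield node
    for child in node.children:
        yield from preorder(child)

def postorder(node):
    """Visit all children before their parent."""
    for child in node.children:
        yield from postorder(child)
    yield node

# the tree ((a,b)ab,(c,d)cd)root
root = Node("root", [
    Node("ab", [Node("a"), Node("b")]),
    Node("cd", [Node("c"), Node("d")]),
])
orders = {
    "levelorder": [n.name for n in levelorder(root)],
    "preorder": [n.name for n in preorder(root)],
    "postorder": [n.name for n in postorder(root)],
}
for key, names in orders.items():
    print(key, names)
```

Postorder is the one to use when a value on a parent depends on values already computed for its children.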
TreeNodes have a large number of attributes and functions available to them which you can explore using tab-completion in a notebook and from the ete3 tutorial. In general, only advanced users will need to access attributes of the TreeNodes directly. For example, it is easier to access node idx and name labels from ToyTrees than from TreeNodes, since ToyTrees will return the values in the order they will be plotted.
[5]:
# traverse the tree and access node attributes
for node in tre.treenode.traverse(strategy="levelorder"):
    print("{:<5} {:<5} {:<5} {:<5}".format(
        node.idx, node.name, node.is_leaf(), node.is_root()
        )
    )
18 18 0 1
17 17 0 0
16 16 0 0
9 r9 1 0
15 15 0 0
14 14 0 0
13 13 0 0
8 r8 1 0
12 12 0 0
5 r5 1 0
11 11 0 0
2 r2 1 0
10 10 0 0
7 r7 1 0
6 r6 1 0
4 r4 1 0
3 r3 1 0
1 r1 1 0
0 r0 1 0
Adding features to TreeNodes¶
For the purposes of plotting, there are cases where accessing TreeNode attributes can be particularly powerful. For example, when you want to build a list of values for plotting that are based on the tree structure itself (number of children, edge length, is_root, etc.). You can traverse through the tree and calculate these attributes for each node.
When doing so, I have a recommended best practice that once again is intended to help users avoid accidentally plotting values in an incorrect order. This recommended practice is to add new features to the TreeNodes by traversing the tree, but then to retrieve and plot the features from the TreeNodes using ToyTree, since ToyTrees are the objects that organize the coordinates for plotting.
[6]:
# see available features on a ToyTree
tre.features
[6]:
{'dist', 'height', 'idx', 'name', 'support'}
Let’s say we wanted to plot a value on each node of a toytree. You can use the toytree function .set_node_values()
to set a value to each node. This takes the feature name, a dictionary mapping idx labels to values, and optionally a default value that is assigned to all other nodes. You can modify existing features or set new features.
[18]:
# set a new name on a few nodes
tre = tre.set_node_values(
    feature="name",
    values={0: 'tip-0', 1: 'tip-1', 2: 'tip-2'},
)
[19]:
# set a feature on every node to a random integer in 1-4 (the upper bound of randint is exclusive)
tre = tre.set_node_values(
    feature="randomint",
    values={idx: np.random.randint(1, 5) for idx in tre.idx_dict},
)
Another potentially useful ‘feature’ to access includes statistics about the tree. For example, we may want to measure the number of extant descendants of each node on a tree. Such things can be measured directly from TreeNode objects. Below I use get_leaves()
as an example. You can see the ete3 docs for more info on TreeNode functions and attributes.
[20]:
# set a feature to every node for the number of descendants
tre = tre.set_node_values(
feature="ndesc",
values={
idx: len(node.get_leaves())
for (idx, node) in tre.idx_dict.items()
}
)
The set_node_values()
function of toytrees operates similarly to the loop below which visits each TreeNode of the tree and adds a feature. The .traverse()
function of treenodes is convenient for accessing all nodes.
[21]:
# add a new feature to every node
for node in tre.treenode.traverse():
    node.add_feature("ndesc", len(node.get_leaves()))
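The number of tips below each node can also be computed in a single pass if children are visited before their parents. The sketch below shows that idea on a plain dictionary of child lists. It is an assumption-laden illustration (the `children` mapping and `count_tips` helper are hypothetical), not how toytree or ete3 implement get_leaves().

```python
# child lists for the tree ((a,b)ab,(c,d,e)cde)root; tips have no entry
children = {
    "root": ["ab", "cde"],
    "ab": ["a", "b"],
    "cde": ["c", "d", "e"],
}

def count_tips(node, ndesc):
    """Fill `ndesc` with the number of tips at or below each node."""
    kids = children.get(node, [])
    if not kids:
        ndesc[node] = 1  # a tip counts itself
    else:
        # children are finished before the parent's value is stored
        ndesc[node] = sum(count_tips(kid, ndesc) for kid in kids)
    return ndesc[node]

ndesc = {}
count_tips("root", ndesc)
print(ndesc)
```

As in the ndesc feature above, tips get a value of 1 and the root gets the total number of tips.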
Modifying features of TreeNodes¶
Note: Use caution when modifying features of TreeNode objects because you can easily corrupt the data that toytree needs in order to correctly plot trees and orient nodes, tips, etc. This is why interacting with TreeNode objects directly should be considered an advanced method for toytree users. In contrast to ToyTree functions, which do not modify the tree structure in place but instead return a copy, modifications to TreeNodes do occur in place and therefore affect the current tree. Be
aware that if you modify the parent/child relationships in the TreeNode it will change the tree. Similarly, if you change the .dist
or .idx
values of nodes it will affect the edge lengths and the order in which nodes are plotted.
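The difference between returning a copy and modifying in place can be seen in a small stand-alone sketch. The `Node` class and both helper functions are hypothetical and only mimic the two behaviors described above; they are not toytree code.

```python
import copy

class Node:
    """A bare-bones node with an edge length (hypothetical stand-in for a TreeNode)."""
    def __init__(self, name, dist=1.0):
        self.name = name
        self.dist = dist

def rescaled_copy(nodes, factor):
    """Return modified copies and leave the originals untouched."""
    new_nodes = copy.deepcopy(nodes)
    for node in new_nodes:
        node.dist *= factor
    return new_nodes

def rescale_in_place(nodes, factor):
    """Modify the nodes that were passed in; nothing is returned."""
    for node in nodes:
        node.dist *= factor

nodes = [Node("a"), Node("b")]

# copy-style: the original edge lengths are unchanged
doubled = rescaled_copy(nodes, 2)
before = [n.dist for n in nodes]

# in-place style: the original objects themselves change
rescale_in_place(nodes, 2)
after = [n.dist for n in nodes]

print(before, [n.dist for n in doubled], after)
```

If you need to experiment with in-place changes, working on a copy first keeps the original tree available for comparison.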
Accessing features from ToyTrees¶
The recommended workflow for adding features to TreeNodes and including them in toytree drawings is to use ToyTrees to retrieve the features, since ToyTrees ensure the correct order. When you add a new feature to TreeNodes it can then be accessed by ToyTrees just like other default features: “height”, “idx”, “name”, etc. You can use .get_node_values()
to retrieve them in the proper order, and to censor values for the root or tips if wanted. This also allows you to further build color mappings
based on these values, calculate further statistics, etc.
[22]:
# ndesc is now an available feature alongside the defaults
tre.features
[22]:
{'dist', 'height', 'idx', 'name', 'ndesc', 'randomint', 'support'}
[23]:
# it can be accessed from the ToyTree object using .get_node_values()
tre.get_node_values('ndesc', True, True)
[23]:
array([10, 4, 6, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1])
[24]:
# and can be accessed by shortcut using just the feature name to 'node_labels'
tre.draw(node_labels=("ndesc", 1, 0), node_sizes=15);
Here is another example where color values are stored on TreeNodes and then retrieved from the ToyTree, and then used as draw argument to color nodes based on their TreeNode attribute. The nodes are colored based on whether the TreeNode was True or False for the .is_leaf()
. We use the default color palette of toytree accessed from toytree.colors
.
[25]:
# traverse the tree and modify nodes (add new 'color' feature)
for node in tre.treenode.traverse():
    if node.is_leaf():
        node.add_feature('color', toytree.colors[1])
    else:
        node.add_feature('color', toytree.colors[2])
# store color list with values for tips and root
colors = tre.get_node_values('color', show_root=1, show_tips=1)
# draw tree with node colors
tre.draw(node_labels=False, node_colors=colors, node_sizes=15);
Keep in mind that for many lists of attributes you wish to plot on nodes of a tree, or to use for color mapping, such as support values or names you likely will not need to add features to the tree since the features are already available by default. In that case you can get far using just the get_node_values()
function from ToyTrees.
MultiTree objects¶
Toytree supports the use of MultiTree objects to store lists of linked trees, such as bootstrap replicates or trees sampled from a posterior distribution. MultiTree objects can be generated from a list of Toytrees or newick strings, or by parsing a file, url, or string of text that includes newick trees separated by newlines. The convenience function for creating MultiTrees is toytree.mtree()
.
[1]:
import toytree
import toyplot
import numpy as np
Parsing data into MultiTrees¶
An example string or file representing multiple trees as newick strings:
[2]:
string = """\
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((a:1,d:1):1,(b:1,e:1):1):1,c:3);
(((a:1.5,b:1.5):1,(d:1,e:1):1.5):1,c:3.5);
(((a:1.25,b:1.25):0.75,(d:1,e:1):1):1,c:3);
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((b:1,a:1):1,(d:1.5,e:1.5):0.5):2,c:4);
(((a:1.5,b:1.5):0.5,(d:1,e:1):1):1,c:3);
(((b:1.5,d:1.5):0.5,(a:1,e:1):1):1,c:3);
"""
Create a MultiTree object by passing the input data to toytree.mtree()
. The treelist
attribute of mtree
objects provides a list of the trees in it. These can be indexed like any list and each individual item is a ToyTree, which can be drawn or manipulated like any normal ToyTree
class object.
[3]:
# create an mtree from a string, list of strings, url, or file.
mtre0 = toytree.mtree(string)
# access the treelist
mtre0.treelist
[3]:
[<toytree.Toytree.ToyTree at 0x7f433300aeb8>,
<toytree.Toytree.ToyTree at 0x7f4332f9a358>,
<toytree.Toytree.ToyTree at 0x7f4332f9a588>,
<toytree.Toytree.ToyTree at 0x7f4332f9a780>,
<toytree.Toytree.ToyTree at 0x7f4332f9a978>,
<toytree.Toytree.ToyTree at 0x7f4332f9ab70>,
<toytree.Toytree.ToyTree at 0x7f4332f9ad68>,
<toytree.Toytree.ToyTree at 0x7f4332f9af60>]
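The input format itself is simple: one newick string per line. The sketch below, which uses only the standard library and a hypothetical `tip_names` helper, splits such a text into individual newick strings and lists the tip names of each. It assumes tip names contain no parentheses, commas, colons, or semicolons, and it is not the parser that toytree uses.

```python
import re

text = """\
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((a:1,d:1):1,(b:1,e:1):1):1,c:3);
(((b:1.5,d:1.5):0.5,(a:1,e:1):1):1,c:3);
"""

# one newick string per non-empty line
newicks = [line.strip() for line in text.splitlines() if line.strip()]

def tip_names(newick):
    """Return tip names in the order they appear in a newick string."""
    # a tip name directly follows "(" or "," and ends at ":", ",", or ")"
    return re.findall(r"[(,]([^(),:;]+)", newick)

for newick in newicks:
    print(tip_names(newick))
```

The differing name orders across lines already hint at the topological discordance examined in the rest of this chapter.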
[4]:
# create an mtree from a list of ToyTrees (can be sliced like any list)
mtre1 = toytree.mtree(mtre0.treelist[:5])
# access the treelist
mtre1.treelist
[4]:
[<toytree.Toytree.ToyTree at 0x7f433300aeb8>,
<toytree.Toytree.ToyTree at 0x7f4332f9a358>,
<toytree.Toytree.ToyTree at 0x7f4332f9a588>,
<toytree.Toytree.ToyTree at 0x7f4332f9a780>,
<toytree.Toytree.ToyTree at 0x7f4332f9a978>]
[5]:
# access an individual toytree from the treelist
mtre1.treelist[0].draw();
Consensus trees¶
Before we get into the plotting features of MultiTrees, let’s first explore several useful functions that toytree provides for analyzing groups of trees. First, we can infer a majority-rule consensus tree from a group of input topologies. The .get_consensus_tree
function will return a ToyTree with the consensus topology and clade supports stored on nodes as the “support” feature.
[6]:
ctre = mtre0.get_consensus_tree().root("c")
ctre.draw(node_labels='support', use_edge_lengths=False, node_sizes=16);
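The idea behind a majority-rule consensus can be sketched by counting how often each clade occurs across the input trees and keeping those found in more than half of them. The code below is a simplified illustration on the eight example trees, with a hypothetical `clades` helper that assumes rooted trees without internal node labels. It reports support as a proportion, which may differ from how toytree scales its support values.

```python
import re
from collections import Counter

NEWICKS = """\
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((a:1,d:1):1,(b:1,e:1):1):1,c:3);
(((a:1.5,b:1.5):1,(d:1,e:1):1.5):1,c:3.5);
(((a:1.25,b:1.25):0.75,(d:1,e:1):1):1,c:3);
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((b:1,a:1):1,(d:1.5,e:1.5):0.5):2,c:4);
(((a:1.5,b:1.5):0.5,(d:1,e:1):1):1,c:3);
(((b:1.5,d:1.5):0.5,(a:1,e:1):1):1,c:3);
""".split()

def clades(newick):
    """Return the clades of a rooted newick string as a set of frozensets."""
    # drop branch lengths such as ":1.5" and the trailing semicolon
    text = re.sub(r":[0-9.eE+-]+", "", newick.strip().rstrip(";"))
    stack, found, name = [], set(), ""
    for char in text:
        if char not in "(),":
            name += char              # still reading a tip name
            continue
        if name:
            stack[-1].add(name)       # finished a tip name
            name = ""
        if char == "(":
            stack.append(set())       # a new clade opens
        elif char == ")":
            tips = stack.pop()        # the current clade closes
            found.add(frozenset(tips))
            if stack:
                stack[-1] |= tips     # its tips also belong to the parent
    return found

counts = Counter()
for newick in NEWICKS:
    found = clades(newick)
    found.discard(max(found, key=len))  # ignore the root clade holding every tip
    counts.update(found)

# proportion of trees containing each clade, keyed by its sorted tip names
support = {"".join(sorted(c)): n / len(NEWICKS) for c, n in counts.items()}
majority = {key: val for key, val in support.items() if val > 0.5}
print(majority)
```

Only the clades (a, b), (d, e), and (a, b, d, e) pass the 50% threshold, so together they define the consensus topology for these trees.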
Access ToyTrees in the treelist¶
You can access each item in a treelist and plot it to examine the variation in topologies individually. Or you can do any other calculations you wish using the underlying TreeNode objects. Below we iterate over all toytree objects in the treelist and extract the underlying TreeNode and pass each to the consensus tree’s TreeNode object which has a function to calculate Robinson-Foulds distances. This is a measure of the topological mismatch between the trees.
[7]:
# get Robinson-Foulds distances between consensus and each tree in list
[ctre.treenode.robinson_foulds(i.treenode)[0] for i in mtre0.treelist]
[7]:
[0, 4, 0, 0, 0, 0, 0, 4]
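For rooted trees on the same set of tips, a Robinson-Foulds distance can be sketched as the number of clades found in one tree but not the other. The code below applies that definition to the eight example trees, using a hypothetical `clades` helper and a reference topology written out by hand. It is a simplified stand-in and not the ete3 implementation, which handles unrooted trees and other cases.

```python
import re

NEWICKS = """\
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((a:1,d:1):1,(b:1,e:1):1):1,c:3);
(((a:1.5,b:1.5):1,(d:1,e:1):1.5):1,c:3.5);
(((a:1.25,b:1.25):0.75,(d:1,e:1):1):1,c:3);
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((b:1,a:1):1,(d:1.5,e:1.5):0.5):2,c:4);
(((a:1.5,b:1.5):0.5,(d:1,e:1):1):1,c:3);
(((b:1.5,d:1.5):0.5,(a:1,e:1):1):1,c:3);
""".split()

def clades(newick):
    """Return the clades of a rooted newick string as a set of frozensets."""
    # drop branch lengths such as ":1.5" and the trailing semicolon
    text = re.sub(r":[0-9.eE+-]+", "", newick.strip().rstrip(";"))
    stack, found, name = [], set(), ""
    for char in text:
        if char not in "(),":
            name += char              # still reading a tip name
            continue
        if name:
            stack[-1].add(name)       # finished a tip name
            name = ""
        if char == "(":
            stack.append(set())       # a new clade opens
        elif char == ")":
            tips = stack.pop()        # the current clade closes
            found.add(frozenset(tips))
            if stack:
                stack[-1] |= tips     # its tips also belong to the parent
    return found

# reference topology, written by hand for this example
reference = clades("(((a,b),(d,e)),c);")

# symmetric difference: clades present in exactly one of the two trees
rf = [len(reference ^ clades(newick)) for newick in NEWICKS]
print(rf)  # [0, 4, 0, 0, 0, 0, 0, 4]
```

The two trees with a distance of 4 each have two clades absent from the reference and lack two clades that the reference contains.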
[8]:
# iterate over treelist and plot each tree on a separate canvas
for tre in mtre0.treelist[:4]:
    tre.draw()
TreeGrid plot¶
Simpler still, when you call .draw()
from a multitree it will return a grid drawing with multiple toytrees spaced on a canvas. A similar plot can be made by using toyplot Canvas and Axes arguments, as explained in the Quick Guide, but this is a quick shortcut for examining a number of trees.
[9]:
mtre0.draw(nrows=2, ncols=4, ts='o');
Fixing the tip order¶
With MultiTree plots the goal is often to view discordance among trees, which can be made more apparent by fixing the tip order so it is the same in all trees. You can fix the tip order for a toytree drawing by using the fixed_order
argument, and similarly for multitree drawings this will fix the order across multiple tree drawings. By default the spacing between tips is 1, so when you provide a list of five names it will arrange them onto the coordinates (0, 1, 2, 3, 4). Below I show the
axes so you can see the tip placement more clearly.
[10]:
# draw the second tree in the treelist with fixed_order
canvas, axes, mark = mtre0.treelist[1].draw(
    fixed_order=['e', 'd', 'b', 'a', 'c'],
    ts='o',
);
# show the axes
axes.show = True
Similarly, you can also set an explicit spacing between tips using the fixed_position
argument to draw. This is less often used since we usually want the tips to be spaced evenly, but there are cases where you can create interesting drawings by providing a list of tip labels in fixed_order
and a corresponding list of positions in fixed_position
.
[11]:
canvas, axes, mark = mtre0.treelist[1].draw(
fixed_order=['e', 'd', 'b', 'a', 'c'],
fixed_position=[0, 1, 3, 6, 12],
edge_type='c',
);
axes.show = True
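The relationship between the two arguments can be written out as a plain mapping from tip name to coordinate. This is only a sketch of the bookkeeping described above; how toytree stores these values internally is not shown here, and the variable names `default_coords` and `custom_coords` are my own.

```python
fixed_order = ['e', 'd', 'b', 'a', 'c']

# default: tips are spaced one unit apart, starting at zero
default_coords = dict(zip(fixed_order, range(len(fixed_order))))

# explicit positions, one per name in fixed_order
fixed_position = [0, 1, 3, 6, 12]
custom_coords = dict(zip(fixed_order, fixed_position))

print(default_coords)
print(custom_coords)
```

The two lists must have the same length, since each position is paired with the name at the same index.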
This applies similarly to multitree drawings. In the example below I use a boolean True argument to fixed_order
which will infer the consensus tree tip order and use that to order the tip labels for all of the tree drawings.
[12]:
mtre0.draw(nrows=2, ncols=4, ts='o', fixed_order=True);
CloudTree plot¶
It can be more informative still to plot a number of trees on top of each other. These are sometimes called “densitree” plots, or here, “cloud tree plots”.
[13]:
# draw cloud tree
canvas, axes, mark = mtre0.draw_cloud_tree(
edge_style={
"stroke-opacity": 0.1,
"stroke-width": 3,
},
);
Styling tip labels in cloud trees¶
In cloud tree plots a fixed order of the tips will always be enforced, which allows for the discordance among trees to be visualized. Because each tree within the multitree object may have a different ordering of its tips, we only print the tip labels once. The order of the tips of the tree can be changed by using the fixed_order
argument, otherwise a consensus tree is quickly inferred and used for the tip order. To style the tip labels or change them, like below, you can provide a list of
new names in the same order as in the first tree in the treelist.
[14]:
# draw cloud tree (here with some extra styling)
mtre0.draw_cloud_tree(
width=250,
fixed_order=['c', 'd', 'e', 'b', 'a'],
edge_style={"stroke-opacity": 0.1, "stroke-width": 2},
tip_labels=["tip-{}".format(i) for i in mtre0.treelist[0].get_tip_labels()],
);
Example: Xiphophorus fishes¶
Data set for reconstructing a densitree figure from Cui et al. (2013). I’ve taken the nexus file from the paper’s dryad repository and converted it to newick and saved it online so it can be easily downloaded. The file contains 160 trees representing mrbayes consensus trees inferred for different genomic regions.
[15]:
fish = toytree.mtree("https://eaton-lab.org/data/densitree.nex")
print(len(fish))
160
Tree grid styling¶
Trees in TreeGrid drawings can be styled individually by setting the style
dictionary attribute of each ToyTree in the treelist. Additionally, most styles can be applied as arguments to the draw_tree_grid()
function to apply styles to all trees at once.
[16]:
fish.draw(nrows=1, ncols=4, height=300);
[17]:
# make a copy of the multitree since we will modify the style of each tree
cfish = fish.copy()
# set different 'tip_labels_colors' for each tree
for tree in cfish.treelist[:4]:
    tree.style.tip_labels_colors = next(toytree.icolors2)
# draw several trees
cfish.draw(1, 4, height=300);
Setting a fixed_order
to the tips of the tree makes it easier to see discordance among multiple trees. Here I first infer a consensus tree and then use the tip order of the consensus tree to order and plot the first few trees from the treelist.
[18]:
# get majority-rule consensus tree
consfish = fish.get_consensus_tree()
# draw tree grid and use consensus tree order as a fixed_order of tips
cfish.draw(
nrows=2,
ncols=3,
height=600,
width=600,
fixed_order=True,
edge_type='c',
shared_axes=True,
);
Fixed tip order¶
When drawing CloudTrees
the order of names at the tips is always fixed across individual ToyTrees; this is required in order to see disagreement among topologies. For example, we can fix the order to match the first tree in the treelist by using fixed_order=True
. But more often you will want to set it to some specific order, like we did above by using a consensus tree order, fixed_order=[list-of-names]
.
If you want to change the names of the tips of the CloudTree, or style them at the time of plotting, this can be done by providing a list of new names in the same order as the tip labels of the first tree in the treelist. In the example below I create a list argument to tip_labels
that takes the labels from the first tree in the treelist (fish.treelist[0]) and modifies each string.
[19]:
# draw a cloud tree which enforces a fixed tip order
fish.draw_cloud_tree(
fixed_order=consfish.get_tip_labels(),
tip_labels_style={"font-size": "11px"},
tip_labels=[
'{}. {}'.format(i[0], i[1:])
for i in fish.treelist[0].get_tip_labels()
],
);
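The string formatting used for the tip labels splits each name after its first character. A quick stand-alone check on three names from this data set shows the effect:

```python
names = ["Xmayae", "Xhellerii", "Xclemenciae_G"]

# first character, then a period and a space, then the rest of the name
labels = ["{}. {}".format(name[0], name[1:]) for name in names]
print(labels)  # ['X. mayae', 'X. hellerii', 'X. clemenciae_G']
```

Note that the rule is purely positional, so a name that does not start with a genus initial, such as "Priapella", becomes "P. riapella".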
Custom tip order¶
If the fixed_order
argument is provided as a list of names then tips of the tree will be ordered according to the list. Take note: the structure of the relationships in the tree (e.g., the newick representation) does not change with fixed_order, this is simply changing the order that tips are presented when plotting. For example, the tip order below was used in the published paper by Cui et al. since it shows the geographic distributions of clades nicely ordered from north to south. When
entering names as a list the order of names is plotted from bottom (x axis=0) to the top location on a right-facing tree.
[20]:
customorder = [
"Priapella",
"Psjonesii",
"Xmayae",
"Xalvarezi",
"Xhellerii",
"Xsignum",
"Xmonticolus",
"Xclemenciae_G",
"Xbirchmanni_GARC",
"Xmalinche_CHIC2",
"Xcortezi",
"Xnezahuacoyotl",
"Xmontezumae",
"Xcontinens",
"Xpygmaeus",
"Xmultilineatus",
"Xnigrensis",
"Xgordoni",
"Xmeyeri",
"Xcouchianus",
"Xxiphidium",
"Xvariatus",
"Xevelynae",
"Xmilleri",
"Xandersi",
"Xmaculatus_JpWild",
]
[21]:
# set fixed tip order
fish.draw(
height=300,
width=700,
fixed_order=customorder,
edge_type='c',
);
CloudTree Styling¶
CloudTree drawings can use the same style arguments as ToyTree drawings. For example, the edge_style
dictionary can be used to modify the edge colors (stroke) and opacity. Here I fancy up the tip names a bit as well and add some additional points at the tips of the tree using the toyplot scatterplot function. The finished figure looks quite similar to the published figure in Cui et al. (2013).
[22]:
# draw the cloudtree
canvas, axes, mark = fish.draw_cloud_tree(
height=450,
edge_style={
'stroke': toyplot.color.brewer.palette("BlueGreen")[4],
'stroke-opacity': 0.05,
},
fixed_order=customorder,
tip_labels=[
"{}. {}".format(i[0], i[1:]) for i in customorder
]
);
# add colored nodes at the tips (x-axis=0) (y-axis=0-ntips)
xlocs = np.zeros(fish.ntips)
ylocs = np.arange(fish.ntips)
colors = np.concatenate([
[toytree.colors[2]] * 2,
[toytree.colors[1]] * 6,
[toytree.colors[5]] * 9,
[toytree.colors[0]] * 9,
])
axes.scatterplot(
xlocs + 0.05,
ylocs,
color=colors,
mstyle={"stroke": "#262626", "stroke-width": 0.75},
size=6,
);
Save to disk as PDF¶
[23]:
# write as PDF
import toyplot.pdf
toyplot.pdf.render(canvas, "/tmp/fish-cloudtree.pdf")
[24]:
# write the newick trees to file
fish.write("/tmp/fish-trees.tre")
Styling individual trees: color edges differently¶
MultiTree objects are really useful for comparing tree topologies, especially when combined with CloudTree drawings and TreeNode calculations for comparing trees. In the examples below I create plots that color trees differently depending on their topology.
[25]:
# a linear color map
colormap = toyplot.color.brewer.map("BlueRed")
colormap
[25]:
Using the colormap above, we’ll color each tree as a continuous variable where darker indicates that the topology has a greater number of differences compared to the majority-rule consensus topology. We can calculate this using the Robinson-Foulds distance function.
[26]:
# get consensus tree
fishcons = fish.get_consensus_tree()
# calculate RF distances relative to consensus tree
rfdists = np.array([
fishcons.treenode.robinson_foulds(i.treenode, unrooted_trees=True)[0]
for i in fish.treelist
])
# print the first 10 values
print(rfdists[:10], '...')
# broadcast values into colorspace
colors = toyplot.color.broadcast(
colors=(rfdists, colormap),
shape=rfdists.shape,
)
# return the first ten values as colors
colors[:10]
[10 16 20 14 25 13 18 9 21 20] ...
[26]:
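Mapping numeric values onto a linear colormap amounts to rescaling the values to the range 0-1 and interpolating between colors. The sketch below does this by hand for the ten distances printed above. The two endpoint colors and the `lerp_color` helper are assumptions for illustration only; they are not the values or the method used by the toyplot "BlueRed" map.

```python
def lerp_color(low, high, frac):
    """Linearly interpolate between two RGB tuples; frac is in [0, 1]."""
    return tuple(lo + (hi - lo) * frac for lo, hi in zip(low, high))

# the first ten RF distances printed above
rfdists = [10, 16, 20, 14, 25, 13, 18, 9, 21, 20]

# rescale distances to the range 0-1
dmin, dmax = min(rfdists), max(rfdists)
fracs = [(d - dmin) / (dmax - dmin) for d in rfdists]

# assumed endpoint colors (pure blue to pure red)
blue, red = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
colors = [lerp_color(blue, red, frac) for frac in fracs]

for dist, color in zip(rfdists, colors):
    print(dist, color)
```

The smallest distance receives the first endpoint color and the largest receives the second, with everything else falling in between.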
To set styles on individual ToyTrees in the cloud plot we will need to set the styles in the .style
attribute of each tree. In the example below we apply a linear colormap to the trees in which trees that have greater difference from the consensus tree are lighter in color. You could imagine many such interesting ways in which to investigate tree variation using color mapping.
[27]:
# set edge stroke (color) using colors list created above
for idx, tree in enumerate(fish.treelist):
    tree.style.edge_style["stroke"] = colors[idx]
# draw fish tree with a list of arguments for edge_style
draw = fish.draw_cloud_tree(
height=450,
width=350,
edge_style={"stroke-opacity": 0.03}
);
Reset tree styles¶
In the example above we modified the tree style for the ToyTrees in the .treelist of the MultiTree object which will affect all drawings made using these same trees. To reset the tree styles of those Toytrees back to the defaults you can use the function reset_tree_styles()
. Alternatively you could also just reload the MultiTree object from a file.
Here we reset the tree styles and then draw a new plot where we color each tree’s edges based on whether a certain clade was observed (monophyletic) or not. We’ll ask whether the clade (clemenciae, monticolus) exists by using the .check_monophyly
function from the TreeNode of each tree in the treelist.
[28]:
# clear the styling that we added to the ToyTrees in the treelist above
fish.reset_tree_styles()
# get boolean list checking monophyly of clade I in each subtree
monophyly = [
i.treenode.check_monophyly(('Xmonticolus', 'Xclemenciae_G'), "name")[0]
for i in fish.treelist
]
# set tree edge stroke colors using monophyly list
for idx, tree in enumerate(fish.treelist):
    if monophyly[idx]:
        tree.style.edge_colors = toytree.colors[0]
    else:
        tree.style.edge_colors = toytree.colors[1]
# plot cloud tree using edge_styles
draw = fish.draw_cloud_tree(edge_style={"stroke-opacity": 0.025});
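In a rooted tree, a set of tips is monophyletic when some internal node has exactly those tips as its descendants. The sketch below checks this on the small five-tip example trees from earlier in this chapter rather than on the fish data. The `clades` and `is_monophyletic` helpers are hypothetical and assume rooted newick strings without internal node labels; the ete3 check_monophyly function covers more cases and returns additional information.

```python
import re

NEWICKS = """\
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((a:1,d:1):1,(b:1,e:1):1):1,c:3);
(((a:1.5,b:1.5):1,(d:1,e:1):1.5):1,c:3.5);
(((a:1.25,b:1.25):0.75,(d:1,e:1):1):1,c:3);
(((a:1,b:1):1,(d:1.5,e:1.5):0.5):1,c:3);
(((b:1,a:1):1,(d:1.5,e:1.5):0.5):2,c:4);
(((a:1.5,b:1.5):0.5,(d:1,e:1):1):1,c:3);
(((b:1.5,d:1.5):0.5,(a:1,e:1):1):1,c:3);
""".split()

def clades(newick):
    """Return the clades of a rooted newick string as a set of frozensets."""
    # drop branch lengths such as ":1.5" and the trailing semicolon
    text = re.sub(r":[0-9.eE+-]+", "", newick.strip().rstrip(";"))
    stack, found, name = [], set(), ""
    for char in text:
        if char not in "(),":
            name += char              # still reading a tip name
            continue
        if name:
            stack[-1].add(name)       # finished a tip name
            name = ""
        if char == "(":
            stack.append(set())       # a new clade opens
        elif char == ")":
            tips = stack.pop()        # the current clade closes
            found.add(frozenset(tips))
            if stack:
                stack[-1] |= tips     # its tips also belong to the parent
    return found

def is_monophyletic(newick, tips):
    """True if the given tips form exactly one clade of the rooted tree."""
    return frozenset(tips) in clades(newick)

monophyly = [is_monophyletic(newick, ("a", "b")) for newick in NEWICKS]
print(monophyly)
```

A boolean list like this one can then be used to pick one of two colors per tree, as in the cell above.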
Xiphophorus consensus tree¶
You can order any normal Toytree as well by setting fixed_order
when initiating the tree. Below I re-order a consensus tree with fixed_order so that it uses the same custom order that we used above. You can see that the Xiphophorus cloud tree above has fairly low agreement across most subtrees. The consensus tree mostly matches the customorder tree except for X. xiphidium, which you can see has an edge crossing over others.
[29]:
# get a consensus tree
fishcons = fish.get_consensus_tree()
# re-order consensus tree into the custom tip order used by Cui et al.
fishcons = toytree.tree(fishcons, fixed_order=customorder)
# plot the drawing
fishcons.draw(node_labels='support', use_edge_lengths=False, node_sizes=15);
Styling toytree drawings¶
The number of styling options available in toytree is enormous and will continue to grow as development on the project continues. If you have a specific styling option that does not appear to be supported please raise an issue on GitHub and we can discuss adding support for it. Below I try to demonstrate the options and usage of each styling option with at least one example.
[1]:
import toytree
import toyplot
import numpy as np
# a tree to use for examples
url = "https://eaton-lab.org/data/Cyathophora.tre"
rtre = toytree.tree(url).root(wildcard='prz')
Tip label styling¶
tip_labels¶
[2]:
# hide tip labels
rtre.draw(tip_labels=False);
[3]:
# get tip labels from tree
tipnames = rtre.get_tip_labels()
# modify list so that html italic tags surround text
italicnames = ["<i>{}</i>".format(i) for i in tipnames]
# enter the list of names to tip_labels
rtre.draw(tip_labels=italicnames);
tip_labels_align¶
[4]:
rtre.draw(tip_labels_align=True);
tip_labels_colors¶
Default: "#262626" (near black). A list of colors is applied in the order in which .get_tip_labels()
returns the labels. This order (the plot order) starts from the tip located on the zero-axis (e.g., x=0 for right-facing trees) and continues until the last name. If both tip_labels_colors
and tip_labels_style["fill"]
are used, tip_labels_colors
overrides the other. In contrast to the fill style, only this option can be used to apply multiple colors.
[5]:
# use color from favored toytree color scheme
rtre.draw(
tip_labels_align=True,
tip_labels_colors=toytree.colors[1],
);
# enter a list of colors by name
rtre.draw(
tip_labels_align=True,
tip_labels_colors=(['goldenrod'] * 11) + (["mediumseagreen"] * 2),
);
# make list of hex color values based on tip labels
colorlist = ["#d6557c" if "rex" in tip else "#5384a3" for tip in rtre.get_tip_labels()]
rtre.draw(
tip_labels_align=True,
tip_labels_colors=colorlist
);
tip_labels_style¶
[6]:
rtre.draw(
tip_labels_style={
"fill": "#262626",
"font-size": "11px",
"-toyplot-anchor-shift": "15px",
}
);
Node labels styling¶
node_labels¶
[7]:
# shows node idx labels on all nodes
rtre.draw(node_labels=True);
[8]:
# suppresses nodes
rtre.draw(node_labels=False);
[9]:
# suppresses node labels, sizes ensures nodes are still shown
rtre.draw(node_labels=False, node_sizes=10);
[10]:
# shortcut for 'default' features always present in TreeNodes, suppresses tip nodes.
rtre.draw(node_labels="support");
[11]:
# build a list of values in the correct node plot order
sups = rtre.get_node_values("support", show_root=True, show_tips=True)
rtre.draw(node_labels=sups);
node_labels_style¶
[12]:
rtre.draw(
node_labels='idx',
node_labels_style={
"fill": "#262626",
"font-size": "8px",
}
);
Node styling¶
node_sizes¶
[13]:
rtre.draw(
node_labels=False,
node_sizes=10,
);
[14]:
# draw random values to use for node sizes
np.random.seed(1234)
sizes = np.random.uniform(5, 15, rtre.nnodes)
rtre.draw(
node_labels=False,
node_sizes=sizes,
);
node_colors¶
[15]:
# set a single color for all nodes
rtre.draw(
node_labels=False,
node_sizes=10,
node_colors=toytree.colors[1],
);
[20]:
rtre.get_node_values("support", 1, 0)
[20]:
array(['100', '100', '100', '100', '100', '100', '100', '100', '99',
'100', '96', '100', '', '', '', '', '', '', '', '', '', '', '', '',
''], dtype='<U21')
[22]:
# get list of sizes and colors in node plot order with tip nodes suppressed
sizes = [10 if i else 0 for i in rtre.get_node_values('support', 1, 0)]
colors = ['black' if i=='100' else 'red' for i in rtre.get_node_values('support', 1, 0)]
# enter lists of values
rtre.draw(
node_labels=None,
node_sizes=sizes,
node_colors=colors,
);
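It can help to look at what the two list comprehensions above produce. The sketch below rebuilds the support array shown in the earlier output as a plain Python list and applies the same rules; it needs neither toytree nor numpy.

```python
# the support values printed above: 12 internal nodes, then 13 empty strings
supports = ['100'] * 8 + ['99', '100', '96', '100'] + [''] * 13

# same rules as in the cell above
sizes = [10 if s else 0 for s in supports]
colors = ['black' if s == '100' else 'red' for s in supports]

# nodes that are actually visible (size > 0) with their colors
shown = [(s, c) for s, size, c in zip(supports, sizes, colors) if size]
print(shown)
```

Nodes with an empty support value are also assigned 'red', but because their size is 0 they are not drawn, so only the nodes with support 99 and 96 appear in red.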
node_markers¶
See toyplot markers for available options.
[23]:
# enter a single marker to apply to all nodes
rtre.draw(
node_labels="support",
node_markers="o"
);
[24]:
# enter lists or single values
rtre.draw(
height=350,
node_labels=None,
node_sizes=[0 if i else 8 for i in rtre.get_node_values(None, 1, 0)],
node_markers="s",
node_colors=toytree.colors[1],
);
Rectangular markers can be drawn in many dimensions. Designate “r2x1” for a box that is twice as wide as it is tall.
[25]:
# rectangles for nodes
rtre.draw(
width=600,
height=400,
node_labels="support",
node_labels_style={"font-size": "11px"},
node_markers="r2x1.25",
node_sizes=12,
);
node_style¶
[26]:
# the classic "ape" style tree plot look
rtre.draw(
width=600,
height=400,
node_labels="support",
node_labels_style={"font-size": "10px"},
node_markers="r2x1.25",
node_sizes=12,
node_style={
"fill": "lightgrey",
"stroke": "black",
"stroke-width": 0.75,
}
);
node_hover¶
Enables interactive hover over nodes so that you can see all features associated with each.
[27]:
rtre.draw(node_hover=True, node_labels=True, node_sizes=15);
Layout¶
layout¶
[28]:
ttre = toytree.rtree.coaltree(20, seed=123)
ttre.draw(
layout='d',
tip_labels_align=True,
node_sizes=[8 if i else 0 for i in ttre.get_node_values()],
node_style={"stroke": "black"},
width=350,
height=300,
);
[37]:
ttre = toytree.rtree.unittree(40, seed=123)
ttre.draw(
layout='c',
edge_type='c',
node_sizes=[8 if i else 0 for i in ttre.get_node_values()],
node_style={"stroke": "black"},
width=400,
height=400,
);
[38]:
ttre = toytree.rtree.unittree(40, seed=123)
ttre.draw(
layout='c',
edge_type='p',
width=400,
height=400,
);
Aligned Edge Styling¶
edge_align_style¶
[39]:
rtre.draw(
tip_labels_align=True,
edge_align_style={
"stroke": "violet",
"stroke-width": 1.5,
"stroke-dasharray": "2,5" # size of dash, spacing of dashes
});
Styles¶
[40]:
rtre.draw(tree_style='n');
[41]:
# generate a random coalescent tree and draw in 'coalescent' style
randomtre = toytree.rtree.coaltree(ntips=10, seed=666)
randomtre.draw(tree_style='c');
Scalebar¶
You can add a scalebar to any tree plot by simply adding scalebar=True
. Alternatively, you can add or customize axes by saving the returned variables from the .draw()
function and modifying the axes.
[42]:
rtre.draw(scalebar=True);
Cookbook gallery¶
This chapter is simply for displaying beautiful and creative plots made using toytree and toyplot. If you have one of your own please reach out (or make a github pull request) to contribute it. You can use simulated data (see examples below) or show examples with real data. For simulated data please limit data generation to the use of numpy and pandas. If real data, please make sure that the trees you use are available in an archived location (reliable URL) so that the plot can be easily re-created.
[1]:
import numpy as np
import toytree
import toyplot
1. ToyTree + barplot¶
Aligning a tree with data is sometimes easier on one axis versus two. See (#1) and (#2) for comparison. When plotting on one axis, the tree coordinates, which map to the tree height and the number of tips, can be difficult to align with data (e.g., a barplot) since the data values may be much greater than the tree height. This can be fixed by transforming the data and the axis labels. The example on two axes is a bit easier in this case.
[36]:
# generate a random tree and data
ntips = 20
rseed = 123456
np.random.seed(rseed)
rtre = toytree.rtree.unittree(ntips=ntips, seed=rseed)
randomdata = np.random.uniform(20, 200, ntips)
# set up a toyplot Canvas with 2 axes: (x1, x2, y1, y2)
canvas = toyplot.Canvas(width=375, height=350)
ax0 = canvas.cartesian(bounds=(50, 200, 50, 300), padding=15, ymin=0, ymax=20)
ax1 = canvas.cartesian(bounds=(225, 325, 50, 300), padding=15, ymin=0, ymax=20)
# add tree to first axes
rtre.draw(axes=ax0);
ax0.show = False
# plot the barplot on the second axes
# (y-axis is range 0-ntips);
# (x-axis is bar values transformed to be 0-1)
# baseline is the space between tipnames and bars
ax1.bars(
np.arange(ntips),
randomdata,
along='y',
);
# style axes
ax1.show = True
ax1.y.show = False
ax1.x.ticks.show = True
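The transformation mentioned above, rescaling the data and relabeling the axis, can be sketched without any plotting. The example below is an assumption-based illustration: it draws values with the standard library `random` module (so the numbers differ from the numpy values above), scales them to the range 0-1, and builds tick labels that report the original units.

```python
import random

# random bar values between 20 and 200, one per tip
random.seed(123456)
data = [random.uniform(20, 200) for _ in range(20)]

# transform: bar lengths in the range 0-1, comparable to a unit-height tree
dmax = max(data)
scaled = [value / dmax for value in data]

# tick positions on the transformed axis and the original values they stand for
tick_locations = [0.0, 0.5, 1.0]
tick_labels = ["{:.0f}".format(loc * dmax) for loc in tick_locations]

print(tick_locations, tick_labels)
```

The scaled values would be passed to the plotting call, while the tick locations and labels would be used to annotate the axis in the original units.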
2. Spacing tree vs. tip names¶
The ratio of tree to tipnames on a plot is automatically adjusted to try to fit the tip names depending on their font and size, but only to an extent before the tipnames are eventually cut off. If you want to manually adjust this ratio by squeezing the tree to take up less space this can be done by using the shrink
parameter, as demonstrated below.
In the plot below I show the x-axis tick marks to highlight where the data are located on the x, and where the domain is by default and when extended. You can see that in both cases the treeheight is between 0 and -1 on the x-axis. But in the latter we extend the max domain from 1 to 3 which better accomodates the really long tipnames. Of course you can also increase the width
of the entire canvas as well to increase spacing.
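The effect of extending the domain can be worked out with simple arithmetic, assuming the tree spans -1 to 0 on the x-axis as described above and ignoring axis padding. The helper `tree_fraction` is a hypothetical illustration, not a toytree function.

```python
def tree_fraction(tree_height, domain_max):
    """Share of the x-axis occupied by a tree spanning [-tree_height, 0]
    when the axis domain runs from -tree_height to domain_max."""
    return tree_height / (tree_height + domain_max)


default_share = tree_fraction(1, 1)    # domain max of 1
extended_share = tree_fraction(1, 3)   # domain max of 3
print(default_share, extended_share)
```

With the domain max at 1 the tree takes up half of the axis; with the domain max at 3 it takes up a quarter, leaving the remainder for tip names.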
[3]:
# generate a random tree and data
ntips = 20
rseed = 123456
rtre = toytree.rtree.unittree(ntips=ntips, seed=rseed)
# names of different lengths
names = ["".join(np.random.choice(list("abcd"), i + 1)) for i in range(ntips)]
# make a canvas and coords for two plots
canvas = toyplot.Canvas(width=600, height=350)
ax0 = canvas.cartesian(grid=(1, 2, 0), yshow=False)
ax1 = canvas.cartesian(grid=(1, 2, 1), yshow=False)
# plot the tree with its default spacing for tree and names
rtre.draw(tip_labels=names, axes=ax0);
# plot the tree on the second axis
rtre.draw(tip_labels=names, axes=ax1, shrink=10);
3. Node size/color from features¶
By default TreeNodes have a number of features associated with them (support, height, dist, idx, name), and these are often useful for styling nodes. You can also add custom features to nodes (see the TreeNodes chapter). Here I set the size and color of nodes in a random tree based on a node feature (a randomly generated 'ancstate' value).
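A feature's raw values are not always suitable as marker sizes. A minimal sketch of one way to map them onto a size range is shown here in plain Python. The helper `scale_sizes` is hypothetical, and the treatment of missing values (None or an empty string mapped to size 0) is an assumption made for illustration.

```python
def scale_sizes(values, min_size=5, max_size=15):
    """Map numeric feature values linearly onto [min_size, max_size].
    Missing values (None or empty string) are given size 0."""
    present = [float(v) for v in values if v not in (None, "")]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    sizes = []
    for v in values:
        if v in (None, ""):
            sizes.append(0.0)
        else:
            sizes.append(min_size + (float(v) - lo) / span * (max_size - min_size))
    return sizes


features = [2, "", 6, 10]      # hypothetical feature values, one missing
sizes = scale_sizes(features)  # one size per node, in the same order
```

A list built this way has one entry per node, so it can be passed wherever a list of node sizes in plot order is expected.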
[4]:
# generate a random tree
ntips = 20
rseed = 123
np.random.seed(rseed)
rtre = toytree.rtree.coaltree(ntips=ntips, seed=rseed)
# assign new feature 'ancstate' as the random integer
rtre = rtre.set_node_values(
    feature="ancstate",
    values={i: np.random.randint(5, 15) for i in rtre.idx_dict},
)
# get values in node plot order
sizes = rtre.get_node_values("ancstate", True, True)
# use a boolean of whether 'ancstate' is >10 to set color
colors = [toytree.colors[0] if (i and int(i)>10) else toytree.colors[1] for i in sizes]
# draw tree with styles
rtre.draw(
    node_labels=False,
    node_sizes=sizes,
    node_colors=colors,
    node_style={"stroke": "black"}
);
4. Variable edge colors and widths¶
The function .get_edge_values_mapped() is the most convenient way to apply a style value to parts of the tree. It returns a list with each value (e.g., color or width) placed at the index of the edge it applies to, so that the list can be entered directly as an argument to draw. This is convenient for applying styles to clades. If instead you want to apply a style to individual edges, it is best to use .get_edge_values() with the ‘idx’ argument to return the index order in which edges are plotted. You can then create a list of edge values based on this order. Both examples are shown below, as well as a way of shifting node labels so they are arranged over edges. The ‘idx’ label of nodes is used to refer to edges subtending nodes.
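The idea of building a list of values in edge plot order can be sketched without a tree at all. In the plain-Python example below, `values_in_plot_order` is a hypothetical helper, and the edge indices and widths are made up for illustration.

```python
def values_in_plot_order(order, overrides, default):
    """Return one value per edge, following the order in which edges are plotted."""
    return [overrides.get(idx, default) for idx in order]


edge_order = list(range(12))             # e.g., the idx order of a tree with 12 edges
wide_edges = {3: 5.0, 6: 5.0, 10: 5.0}   # hypothetical edges to emphasize
widths = values_in_plot_order(edge_order, wide_edges, default=2.5)
```

Because the list follows the same order as the edge ‘idx’ values, each width lines up with the edge it is meant to style.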
[12]:
rtre.get_edge_values('idx')
[12]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
[14]:
rtre.get_node_values('idx', 1, 1)
[14]:
array([12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
[20]:
# generate a random tree
ntips = 7
rseed = 12345
rtre = toytree.rtree.coaltree(ntips=ntips, seed=rseed)
# edge mapping 1: enter a dictionary mapping clade members to colors
ecolors = rtre.get_edge_values_mapped({
    ("r0", "r1", "r2"): toytree.colors[0],  # <- using tips to define a clade
    ("r3"): toytree.colors[1],              # <- using tips to define a clade
    11: toytree.colors[2],                  # <- using node idx to define a clade
})
# edge mapping 2: map specific edges (here 3,6,10,11,12) to edge width value
elabels = rtre.get_edge_values('idx')
ewidths = [5.0 if i in (3, 6, 10, 11, 12) else 2.5 for i in elabels]
# draw tree with edge colors, edge_widths, and node idx labels shifted to edges
c, a, m = rtre.draw(
    width=200,
    edge_colors=ecolors,
    edge_widths=ewidths,
    node_labels=rtre.get_node_values('idx', True, False),
    node_labels_style={
        "-toyplot-anchor-shift": "-10px",
        "baseline-shift": "5px",
        "font-size": "10px",
    },
);
5. Colored rectangles to highlight clades¶
The easiest way to add colored shapes to a plot is with the Toyplot .rectangle() or .fill() functions of cartesian axes objects. For this you simply need to know the coordinates of the area that you wish to fill (see Coordinates for details on this). The example below draws two rectangles in the coordinate space and then adds a tree on top of these. You could make more complex polygon shapes using the fill function (see Toyplot docs). Remember you can use axes.show=True to see the axes coordinates if you need a reminder of how to set the x and y coordinates of the rectangles.
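Rectangle coordinates for a block of adjacent tips can be computed rather than read off the axes. The sketch below assumes tips sit at integer y positions starting at 0, and pads half a unit on either side so the box sits between neighboring tips. The helper `clade_rectangle` is hypothetical and not part of toytree or toyplot.

```python
def clade_rectangle(first_tip, last_tip, x_min, x_max, pad=0.5):
    """Return (x1, x2, y1, y2) for a box around tips first_tip..last_tip,
    assuming tips are spaced one unit apart on the y-axis."""
    return (x_min, x_max, first_tip - pad, last_tip + pad)


lower_box = clade_rectangle(0, 4, x_min=-0.75, x_max=0.35)
upper_box = clade_rectangle(5, 8, x_min=-0.75, x_max=0.35)
```

The two tuples give the same four coordinates that are passed to the rectangle calls in the example in this section.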
[22]:
# generate a random tree
rtre = toytree.rtree.unittree(20, seed=12345)
# make the canvas and axes
canvas = toyplot.Canvas(width=250, height=400)
axes = canvas.cartesian()
axes.show = True
# draw a rectangle (x1, x2, y1, y2)
axes.rectangle(
    -0.75, 0.35, -0.5, 4.5,
    opacity=0.25,
    color=toytree.colors[0],
)
# draw a rectangle (x1, x2, y1, y2)
axes.rectangle(
    -0.75, 0.35, 4.5, 8.5,
    opacity=0.25,
    color=toytree.colors[1],
)
# add tree to the axes
rtre.draw(axes=axes);
6. Plot histograms associated with tip trait values (ridge plot)¶
You can use the .hist() or .fill() functions of toyplot to plot histograms. Here we will generate a distribution of data for each tip and plot the distributions in order from top to bottom, so that the histograms overlap in a “ridge plot” fashion. An analogous function in ggtree seems to have merited an entire publication: https://academic.oup.com/mbe/article/35/12/3041/5142656.
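The overlap in a ridge plot comes from scaling every curve to the same peak height, slightly greater than the one-unit spacing between tips. The following is a rough sketch of that arithmetic in plain Python, assuming each curve is drawn upward from a baseline at its tip's y position. The helpers `normal_pdf` and `ridge` are hypothetical.

```python
import math


def normal_pdf(x, loc, scale):
    """Probability density of a normal distribution at x."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2 * math.pi))


def ridge(points, loc, scale, baseline, peak_height=1.25):
    """Return the upper boundary of one ridge: the density rescaled so its
    maximum equals peak_height, then shifted up to sit on the baseline."""
    dens = [normal_pdf(x, loc, scale) for x in points]
    peak = max(dens)
    return [baseline + d / peak * peak_height for d in dens]


points = list(range(-10, 11))                     # x values from -10 to 10
tops = ridge(points, loc=0, scale=2, baseline=3)  # ridge for the tip at y=3
```

A peak height above 1 means each ridge reaches into the row of the tip above it, which is why the example draws the tips from top to bottom so that lower ridges are layered over higher ones.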
[198]:
# we'll use scipy.stats to get prob. density func. of normal dist
import scipy.stats as sc
# generate a random tree with N tips
ntips = 40
tre = toytree.rtree.baltree(ntips).mod.node_slider(seed=123)
# generate a distribution between -10 and 10 for each tip in the tree
points = np.linspace(-10, 10, 50)
dists = {}
for tip in tre.get_tip_labels():
    dists[tip] = sc.norm.pdf(points, loc=np.random.randint(-5, 5, 1), scale=2)
[199]:
# set up canvas for two panel plot
canvas = toyplot.Canvas(width=300, height=400)
# add tree to canvas
ax0 = canvas.cartesian(bounds=(50, 180, 50, 350), ymin=0, ymax=ntips, padding=15)
tre.draw(axes=ax0, tip_labels=False)
ax0.show = False
# add histograms to canvas
ax1 = canvas.cartesian(bounds=(200, 275, 50, 350), ymin=0, ymax=ntips, padding=15)
# iterate from top to bottom (ntips to 0)
for tip in range(tre.ntips)[::-1]:
    # select a color for hist
    color = toytree.colors[int((tip) / 10)]
    # get tip name and get hist from dict
    tipname = tre.get_tip_labels()[tip]
    probs = dists[tipname]
    # fill histogram with slightly overlapping histograms
    ax1.fill(
        points, probs / probs.max() * 1.25,
        baseline=[tip] * len(points),
        style={"fill": color, "stroke": "white", "stroke-width": 0.5},
        title=tipname,
    )
    # add horizontal line at base
    ax1.hlines(tip, opacity=0.5, color="grey", style={"stroke-width": 0.5})
# hide y axis, show x
ax1.y.show = False
ax1.x.label.text = "Trait value"
ax1.x.ticks.show = True
7. Plot tree with matrix/heatmap¶
[37]:
# load tree with variable name lengths
tree = toytree.tree("https://eaton-lab.org/data/Cyathophora.tre")
tree = tree.root(wildcard="prz")
Method 1:¶
The simplest method is to plot the tree and markers on shared coordinate axes. To make it easy to space items on the x-axis I set the tree to be 2X the width of the data (matrix), which allows me to use units of x=1 to space items on the x-axis. Then I generate a canvas and axes by drawing a tree, as usual, and here I add the data as square scatterplot markers with different opacities to represent the (randomly generated) data.
The only tricky thing here is that you need to use tip_labels_style to offset the x-location of the tree tip labels, and also to extend the x-axis max domain if the names are long, to prevent them from getting cut off.
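The example in this section encodes each data value as marker opacity using 0.1 + value / max, which gives 1.1 for the largest value. A minimal alternative, sketched here in plain Python, rescales values so that they stay within the range from a chosen floor up to 1. The helper `opacities` is hypothetical and not part of toytree or toyplot.

```python
def opacities(values, floor=0.1):
    """Rescale non-negative values so zero maps to floor and the largest
    value maps to 1.0, never exceeding 1.0."""
    vmax = max(values)
    return [min(1.0, floor + (1.0 - floor) * (v / vmax)) for v in values]


column = [1, 5, 9]          # hypothetical data for one matrix column
alpha = opacities(column)   # one opacity per tip
```

Keeping a non-zero floor means that even the smallest values remain faintly visible.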
[148]:
# generate some random data for the columns
spdata = np.random.randint(low=1, high=10, size=(tree.ntips, 5))
spdata
[148]:
array([[1, 5, 8, 9, 2],
[5, 4, 6, 1, 7],
[7, 7, 5, 7, 4],
[4, 6, 8, 5, 1],
[1, 1, 4, 9, 6],
[5, 8, 1, 1, 9],
[1, 3, 9, 9, 3],
[5, 1, 6, 3, 8],
[1, 7, 1, 8, 5],
[2, 4, 7, 7, 9],
[8, 3, 2, 6, 7],
[6, 1, 1, 9, 4],
[9, 8, 2, 8, 8]])
[153]:
# scale tree to be 2X length of number of matrix cols
ctree = tree.mod.node_scale_root_height(spdata.shape[1] * 2)
# get canvas and axes with tree plot
canvas, axes, mark = ctree.draw(
    width=500,
    height=300,
    tip_labels_align=True,
    tip_labels_style={"-toyplot-anchor-shift": "80px"}
);
# add n columns of data (here random data)
ncols = 5
xoffset = 1
for col in range(ncols):
    # select the column of data
    data = spdata[:, col]
    # plot the data column
    axes.scatterplot(
        np.repeat(col, tree.ntips) + xoffset,
        np.arange(tree.ntips),
        marker='s',
        size=10,
        color=toytree.colors[col],
        opacity=0.1 + data[::-1] / data.max(),
        title=data,
    );
# stretch domain to fit long tip names
axes.x.domain.max = 20
Method 2:¶
Using both a matrix and cartesian axes in toyplot. The key to aligning the two is that matrices have a margin of 50px by default. There aren’t as many options to style matrix cells as there are in the option above. Here I used the right-side matrix labels to add and align tip names.
[154]:
# an example rectangular matrix (sequential values, one row per tip)
matrix = np.arange(tree.ntips * 5).reshape(tree.ntips, 5)
matrix.shape
[154]:
(13, 5)
[196]:
# create a canvas
canvas = toyplot.Canvas(width=500, height=350);
# add tree
axes = canvas.cartesian(bounds=(50, 150, 70, 250))
tree.draw(axes=axes, tip_labels=False, tip_labels_align=True)
# add matrix
table = canvas.table(
    rows=13,
    columns=5,
    margin=0,
    bounds=(175, 250, 65, 255),
)
colormap = toyplot.color.brewer.map("BlueRed")
# apply a color to each cell in the table
for ridx in range(matrix.shape[0]):
    for cidx in range(matrix.shape[1]):
        cell = table.cells.cell[ridx, cidx]
        cell.style = {
            "fill": colormap.colors(matrix[ridx, cidx], 0, 100),
        }
# style the gaps between cells
table.body.gaps.columns[:] = 3
table.body.gaps.rows[:] = 3
# hide axes coordinates
axes.show = False
[116]:
# create a canvas
canvas = toyplot.Canvas(width=500, height=350);
# add tree
axes = canvas.cartesian(bounds=(50, 150, 70, 250))
tree.draw(axes=axes, tip_labels=False, tip_labels_align=True)
# add matrix
colormap = toyplot.color.brewer.map("BlueRed")
table = canvas.matrix(
    (matrix, colormap),
    bounds=(120, 300, 25, 295),
    tshow=True,
    tlabel="Traits",
    lshow=False,
    rshow=True,
    margin=0,
    rlocator=toyplot.locator.Explicit(range(tree.ntips), tree.get_tip_labels()[::-1])
)
# hide axes coordinates
axes.show = False