📋 Tree Demonstration
Here are some codes to get started.
Construct Tree
Nodes can have attributes if they are initialized from Node
, dictionary, pandas DataFrame, or polars DataFrame.
1. From Node
Nodes can be linked to each other in the following ways:
- Using
parent
andchildren
setter methods - Directly passing
parent
orchildren
argument - Using bitshift operator with the convention
parent >> child
orchild << parent
- Using
.append(child)
or.extend([child1, child2])
methods
from bigtree import Node, tree_to_dot
root = Node("a")
b = Node("b")
c = Node("c")
d = Node("d")
root.children = [b, c]
d.parent = b
root.show()
# a
# ├── b
# │ └── d
# └── c
root.hshow()
# ┌─ b ─── d
# ─ a ─┤
# └─ c
graph = tree_to_dot(root, node_colour="gold")
graph.write_png("assets/demo/tree.png")
2. From str
Construct nodes only. Newick string notation supports parsing attributes.
3. From list
Construct nodes only. List can contain either full paths or tuples of parent-child names.
4. From nested dictionary
Construct nodes with attributes. Dictionary can be in a flat structure where key
is path and value
is
dictionary of node attribute names and values, or in a recursive structure where key
is node attribute
names and value
is node attribute values, and list of children (recursive).
from bigtree import nested_dict_to_tree
path_dict = {
"name": "a",
"age": 90,
"children": [
{
"name": "b",
"age": 65,
"children": [
{"name": "d", "age": 40},
],
},
{"name": "c", "age": 60},
],
}
root = nested_dict_to_tree(path_dict)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# │ └── d [age=40]
# └── c [age=60]
5. From pandas/polars DataFrame
Construct nodes with attributes. DataFrame can contain either path column or parent-child columns. Other columns can be used to specify attributes.
import pandas as pd
from bigtree import dataframe_to_tree_by_relation
data = pd.DataFrame(
[
["a", None, 90],
["b", "a", 65],
["c", "a", 60],
["d", "b", 40],
],
columns=["child", "parent", "age"],
)
root = dataframe_to_tree_by_relation(data)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# │ └── d [age=40]
# └── c [age=60]
import polars as pl
from bigtree import polars_to_tree_by_relation
data = pl.DataFrame(
[
["a", None, 90],
["b", "a", 65],
["c", "a", 60],
["d", "b", 40],
],
schema=["child", "parent", "age"],
)
root = polars_to_tree_by_relation(data)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# │ └── d [age=40]
# └── c [age=60]
Note
If tree is already created, nodes can still be added using path string, dictionary, and pandas/polars DataFrame!
Attributes can be added to existing nodes using a dictionary or pandas/polars DataFrame.
View Tree
1. Print Tree
After tree is constructed, it can be viewed by printing to console using show
or hshow
method directly,
for vertical and horizontal orientation respectively.
Alternatively, the print_tree
or hprint_tree
method can be used.
from bigtree import Node, print_tree, hprint_tree
root = Node("a", alias="alias-a", age=90, gender="F")
b = Node("b", age=65, gender="M", parent=root)
c = Node("c", alias="alias-c", age=60, gender="M", parent=root)
d = Node("d", age=40, gender="F", parent=b)
e = Node("e", age=35, gender="M", parent=b)
print_tree(root) # (1)!
# a
# ├── b
# │ ├── d
# │ └── e
# └── c
hprint_tree(root) # (2)!
# ┌─ d
# ┌─ b ─┤
# ─ a ─┤ └─ e
# └─ c
- Alternatively,
root.show()
can be used - Alternatively,
root.hshow()
can be used
Other customizations for printing are also available, such as:
- Printing alias instead of node name, if present
- Printing subtree
- Printing tree with attributes
- Different built-in or custom style
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# │ ├── d [age=40]
# │ └── e [age=35]
# └── c [age=60]
root.show(attr_list=["age"], attr_bracket=["*(", ")"])
# a *(age=90)
# ├── b *(age=65)
# │ ├── d *(age=40)
# │ └── e *(age=35)
# └── c *(age=60)
root.show(all_attrs=True)
# a [age=90, gender=F]
# ├── b [age=65, gender=M]
# │ ├── d [age=40, gender=F]
# │ └── e [age=35, gender=M]
# └── c [age=60, gender=M]
2. Plot Tree
Tree can also be plotted using plot
method directly with the help of matplotlib
library.
Alternatively, the plot_tree
method can be used, but remember to run the reingold_tilford
algorithm
first to retrieve the x and y coordinates.
Arguments and keyword arguments can be passed in as long as they are compatible with the plt.plot()
function. A plt.Figure object is returned if you want to do further customizations such as add title or
save the figure to image.
from bigtree import Node, reingold_tilford, plot_tree
root = Node("a", age=90, gender="F")
b = Node("b", age=65, gender="M", parent=root)
c = Node("c", age=60, gender="M", parent=root)
d = Node("d", age=40, gender="F", parent=b)
e = Node("e", age=35, gender="M", parent=b)
reingold_tilford(root)
fig = plot_tree(root, "-ok") # (1)!
fig.axes[0].set_title("Tree Plot Demonstration")
fig.show() # Show figure
fig.savefig("assets/demo/tree_plot.png") # Save figure
- Alternatively,
root.plot("-ok")
can be used
Tree Attributes and Operations
Note that using BaseNode
or Node
as superclass inherits the default class attributes (properties)
and operations (methods).
from bigtree import str_to_tree
# Initialize tree
tree_str = """
a
├── b
│ ├── d
│ ├── e
│ └── f
│ ├── h
│ └── i
└── c
└── g
"""
root = str_to_tree(tree_str)
# Accessing children
node_b = root["b"]
node_e = root["b"]["e"]
Below are the tables of attributes available to BaseNode
and Node
classes.
Attributes wrt self | Code | Returns |
---|---|---|
Check if root | root.is_root |
True |
Check if leaf node | root.is_leaf |
False |
Check diameter of tree | node_b.diameter |
3 |
Check depth of node | node_b.depth |
2 |
Check depth of tree | node_b.max_depth |
4 |
Get root of tree | node_b.root |
Node(/a, ) |
Get node path | node_b.node_path |
(Node(/a, ), Node(/a/b, )) |
Get node name (only for Node ) |
node_b.node_name |
'b' |
Get node path name (only for Node ) |
node_b.path_name |
'/a/b' |
Attributes wrt structure | Code | Returns |
---|---|---|
Get child/children | root.children |
(Node(/a/b, ), Node(/a/c, )) |
Get parent | node_e.parent |
Node(/a/b, ) |
Get siblings | node_e.siblings |
(Node(/a/b/d, ), Node(/a/b/f, )) |
Get left sibling | node_e.left_sibling |
Node(/a/b/d, ) |
Get right sibling | node_e.right_sibling |
Node(/a/b/f, ) |
Get ancestors (lazy evaluation) | list(node_e.ancestors) |
[Node(/a/b, ), Node(/a, )] |
Get descendants (lazy evaluation) | list(node_b.descendants) |
[Node(/a/b/d, ), Node(/a/b/e, ), Node(/a/b/f, ), Node(/a/b/f/h, ), Node(/a/b/f/i, )] |
Get leaves (lazy evaluation) | list(node_b.leaves) |
[Node(/a/b/d, ), Node(/a/b/e, ), Node(/a/b/f/h, ), Node(/a/b/f/i, )] |
Below is the table of operations available to BaseNode
and Node
classes.
Operations | Code | Returns |
---|---|---|
Visualize tree (only for Node ) |
root.show() |
None |
Visualize tree (horizontally) (only for Node ) |
root.hshow() |
None |
Get node information | root.describe(exclude_prefix="_") |
[('name', 'a')] |
Find path from one node to another | root.go_to(node_e) |
[Node(/a, ), Node(/a/b, ), Node(/a/b/e, )] |
Add child to node | root.append(Node("j")) |
None |
Add multiple children to node | root.extend([Node("k"), Node("l")]) |
None |
Set attribute(s) | root.set_attrs({"description": "root-tag"}) |
None |
Get attribute | root.get_attr("description") |
'root-tag' |
Copy tree | root.copy() |
None |
Sort children | root.sort(key=lambda node: node.node_name, reverse=True) |
None |
Plot tree | root.plot("-ok") |
plt.Figure() |
Traverse Tree
Tree can be traversed using the following traversal methods.
from bigtree import (
levelorder_iter,
levelordergroup_iter,
postorder_iter,
preorder_iter,
str_to_tree,
zigzag_iter,
zigzaggroup_iter,
)
root = str_to_tree("""
a
├── b
│ ├── d
│ └── e
└── c
""")
[node.node_name for node in preorder_iter(root)]
# ['a', 'b', 'd', 'e', 'c']
[node.node_name for node in postorder_iter(root)]
# ['d', 'e', 'b', 'c', 'a']
[node.node_name for node in levelorder_iter(root)]
# ['a', 'b', 'c', 'd', 'e']
[[node.node_name for node in node_group] for node_group in levelordergroup_iter(root)]
# [['a'], ['b', 'c'], ['d', 'e']]
[node.node_name for node in zigzag_iter(root)]
# ['a', 'c', 'b', 'd', 'e']
[[node.node_name for node in node_group] for node_group in zigzaggroup_iter(root)]
# [['a'], ['c', 'b'], ['d', 'e']]
Modify Tree
Nodes can be shifted (with or without replacement) or copied (without replacement) from one path to another, this changes the tree in-place. Nodes can also be copied (with or without replacement) between two different trees.
There are various other configurations for performing copying/shifting, refer to code documentation for more examples.
from bigtree import list_to_tree, shift_nodes, shift_and_replace_nodes
root = list_to_tree(
["Downloads/Pictures", "Downloads/photo1.jpg", "Downloads/file1.doc"]
)
root.show()
# Downloads
# ├── Pictures
# ├── photo1.jpg
# └── file1.doc
shift_nodes(
tree=root,
from_paths=["photo1.jpg", "Downloads/file1.doc"],
to_paths=["Downloads/Pictures/photo1.jpg", "Downloads/Files/file1.doc"],
)
root.show()
# Downloads
# ├── Pictures
# │ └── photo1.jpg (1)
# └── Files
# └── file1.doc (2)
shift_and_replace_nodes(
tree=root,
from_paths=["Downloads/Files"],
to_paths=["Downloads/Pictures/photo1.jpg"],
)
root.show()
# Downloads
# └── Pictures
# └── Files (3)
# └── file1.doc
- The first shift to destination
Downloads/Pictures/photo1.jpg
- The second shift to destination
Downloads/Files/file1.doc
, this creates intermediate NodeFiles
as well - Shift and replace
photo1.jpg
withFiles
folder
from bigtree import list_to_tree, copy_nodes
root = list_to_tree(
["Downloads/Pictures", "Downloads/photo1.jpg", "Downloads/file1.doc"]
)
root.show()
# Downloads
# ├── Pictures
# ├── photo1.jpg
# └── file1.doc
copy_nodes(
tree=root,
from_paths=["photo1.jpg", "Downloads/file1.doc"],
to_paths=["Downloads/Pictures/photo1.jpg", "Downloads/Files/file1.doc"],
)
root.show()
# Downloads
# ├── Pictures
# │ └── photo1.jpg (1)
# ├── photo1.jpg (2)
# ├── file1.doc (4)
# └── Files
# └── file1.doc (3)
- The first copy to destination
Downloads/Pictures/photo1.jpg
- Original
photo1.jpg
still remains - The second copy to destination
Downloads/Files/file1.doc
, this creates intermediate NodeFiles
as well - Original
file1.doc
still remains
from bigtree import (
Node,
copy_nodes_from_tree_to_tree,
copy_and_replace_nodes_from_tree_to_tree,
list_to_tree,
str_to_tree,
)
root = list_to_tree(
["Downloads/Pictures", "Downloads/photo1.jpg", "Downloads/file1.doc"]
)
root.show()
# Downloads
# ├── Pictures
# ├── photo1.jpg
# └── file1.doc
root_other = Node("Documents")
copy_nodes_from_tree_to_tree(
from_tree=root,
to_tree=root_other,
from_paths=["Downloads/Pictures", "photo1.jpg", "file1.doc"],
to_paths=[
"Documents/Pictures",
"Documents/Pictures/photo1.jpg",
"Documents/Files/file1.doc",
],
)
root_other.show()
# Documents
# ├── Pictures (1)
# │ └── photo1.jpg (2)
# └── Files
# └── file1.doc (3)
root_other = str_to_tree("""
Documents
├── Pictures
│ └── photo2.jpg
└── file2.doc
""")
copy_and_replace_nodes_from_tree_to_tree(
from_tree=root,
to_tree=root_other,
from_paths=["Downloads/photo1.jpg", "Downloads/file1.doc"],
to_paths=["Documents/Pictures/photo2.jpg", "Documents/file2.doc"],
)
root_other.show()
# Documents
# ├── Pictures
# │ └── photo1.jpg (4)
# └── file1.doc (5)
- The first copy to destination
Documents/Pictures
- The second copy to destination
Documents/Pictures/photo1.jpg
- The third copy to destination
Documents/Files/file1.doc
, this creates intermediate NodeFiles
as well - The first copy and replace of
Documents/Pictures/photo2.jpg
withphoto1.jpg
- The second copy and replace of
Documents/file2.doc
withfile1.doc
Tree Search
One or multiple nodes can be searched based on name, path, attribute value, or user-defined condition. It is also possible to search for one or more child node(s) based on attributes, this search will be faster as it does not require traversing the whole tree to find the node(s).
from bigtree import Node, find, find_name, find_path, find_relative_path, find_full_path, find_attr
root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)
d = Node("d", age=40, parent=c)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# └── c [age=60]
# └── d [age=40]
find(root, lambda node: node.age == 60)
# Node(/a/c, age=60)
find_name(root, "d")
# Node(/a/c/d, age=40)
find_relative_path(c, "../b") # relative path
# Node(/a/b, age=65)
find_path(root, "/c/d") # partial path
# Node(/a/c/d, age=40)
find_full_path(root, "a/c/d") # full path
# Node(/a/c/d, age=40)
find_attr(root, "age", 40)
# Node(/a/c/d, age=40)
from bigtree import Node, findall, find_names, find_relative_paths, find_paths, find_attrs
root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)
d = Node("c", age=40, parent=c)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# └── c [age=60]
# └── c [age=40]
findall(root, lambda node: node.age >= 65)
# (Node(/a, age=90), Node(/a/b, age=65))
find_names(root, "c")
# (Node(/a/c, age=60), Node(/a/c/c, age=40))
find_relative_paths(c, "../*") # relative path
# (Node(/a/b, age=65), Node(/a/c, age=60))
find_paths(root, "/c") # partial path
# (Node(/a/c, age=60), Node(/a/c/c, age=40))
find_attrs(root, "age", 40)
# (Node(/a/c/c, age=40),)
from bigtree import Node, find_children, find_child, find_child_by_name
root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)
d = Node("c", age=40, parent=c)
root.show(attr_list=["age"])
# a [age=90]
# ├── b [age=65]
# └── c [age=60]
# └── c [age=40]
find_children(root, lambda node: node.age >= 60)
# (Node(/a/b, age=65), Node(/a/c, age=60))
find_child(root, lambda node: node.node_name == "c")
# Node(/a/c, age=60)
find_child_by_name(root, "c")
# Node(/a/c, age=60)
find_child_by_name(c, "c")
# Node(/a/c/c, age=40)
Helper Utility
1. Clone tree
Trees can be cloned to another Node type. If the same type is desired, use tree.copy()
instead.
from bigtree import BaseNode, Node, clone_tree
root = BaseNode(name="a")
b = BaseNode(name="b", parent=root)
clone_tree(root, Node) # clone from `BaseNode` to `Node` type
# Node(/a, )
2. Get subtree
Subtree refers to a smaller tree with a different tree root.
from bigtree import str_to_tree, get_subtree
root = str_to_tree("""
a
├── b
│ ├── d
│ └── e
└── c
└── f
""")
root_subtree = get_subtree(root, "b")
root_subtree.show()
# b
# ├── d
# └── e
3. Prune tree
Pruned tree refers to a smaller tree with the same tree root. Trees can be pruned by one or more of the following filters:
- Path: keep all descendants by default, set
exact=True
to prune the path exactly - Depth: prune tree by depth
4. Get tree differences
View the differences in structure and/or attributes between two trees. The changes reflected are relative to the first tree. By default, only the differences are shown. It is possible to view the full original tree with the differences.
To compare tree attributes:
(+)
: Node is added in second tree(-)
: Node is removed in second tree(~)
: Node has different attributes, only available when comparing attributes
For more details, (moved from)
, (moved to)
, (added)
, and (removed)
can
be indicated instead if (+)
and (-)
by passing detail=True
.
For aggregating the differences at the parent-level instead of having (+)
and
(-)
at every child node, pass in aggregate=True
. This is useful if
subtrees are shifted, and if you want to view the shifting at the parent-level.
Note
For more custom processing and handling of the tree differences, the interim
dataframe of the tree differences can be retrieved with get_tree_diff_dataframe
.
from bigtree import str_to_tree, get_tree_diff
root = str_to_tree("""
a
├── b
│ ├── d
│ └── e
└── c
└── f
""")
root_other = str_to_tree("""
a
├── b
│ └── d
└── c
└── g
""")
tree_diff = get_tree_diff(root, root_other, only_diff=False)
tree_diff.show()
# a
# ├── b
# │ ├── d
# │ └── e (-)
# └── c
# ├── f (-)
# └── g (+)
from bigtree import str_to_tree, get_tree_diff
root = str_to_tree("""
a
├── b
│ ├── d
│ │ └── g
│ └── e
└── c
└── f
""")
root_other = str_to_tree("""
a
├── b
│ └── h
└── c
├── d
│ └── g
└── f
""")
tree_diff = get_tree_diff(root, root_other, detail=True)
tree_diff.show()
# a
# ├── b
# │ ├── d (moved from)
# │ │ └── g (moved from)
# │ ├── e (removed)
# │ └── h (added)
# └── c
# └── d (moved to)
# └── g (moved to)
from bigtree import str_to_tree, get_tree_diff
root = str_to_tree("""
a
├── b
│ ├── d
│ │ └── g
│ └── e
└── c
└── f
""")
root_other = str_to_tree("""
a
├── b
│ └── h
└── c
├── d
│ └── g
└── f
""")
tree_diff = get_tree_diff(root, root_other, detail=True, aggregate=True)
tree_diff.show()
# a
# ├── b
# │ ├── d (moved from)
# │ │ └── g
# │ ├── e (removed)
# │ └── h (added)
# └── c
# └── d (moved to)
# └── g
from bigtree import Node, get_tree_diff
root = Node("a")
b = Node("b", parent=root)
c = Node("c", tags="original c", parent=b)
d = Node("d", tags="original d", parent=root)
root.show(attr_list=["tags"])
# a
# ├── b
# │ └── c [tags=original c]
# └── d [tags=original d]
root_other = Node("a")
b = Node("b", parent=root_other)
c = Node("c", tags="new c", parent=b)
e = Node("e", tags="new e", parent=b)
d = Node("d", tags="new d", parent=root_other)
root_other.show(attr_list=["tags"])
# a
# ├── b
# │ ├── c [tags=new c]
# │ └── e [tags=new e]
# └── d [tags=new d]
tree_diff = get_tree_diff(root, root_other, attr_list=["tags"])
tree_diff.show(attr_list=["tags"])
# a
# ├── b
# │ ├── c (~) [tags=('original c', 'new c')]
# │ └── e (+)
# └── d (~) [tags=('original d', 'new d')]
Export Tree
Tree can be exported to other data types:
- Newick string notation
- Nested dictionary (flat structure and recursive structure)
- pandas DataFrame
- polars DataFrame
- Dot (can save to .dot, .png, .svg, .jpeg files)
- Pillow (can save to .png, .jpg)
- Mermaid Flowchart (can display on .md)
from bigtree import Node
root = Node("a", age=90)
b = Node("b", age=65, parent=root)
c = Node("c", age=60, parent=root)
d = Node("d", age=40, parent=b)
e = Node("e", age=35, parent=b)
root.show()
# a
# ├── b
# │ ├── d
# │ └── e
# └── c
from bigtree import tree_to_dict
tree_to_dict(
root,
name_key="name",
parent_key="parent",
attr_dict={"age": "person age"}
)
# {
# '/a': {'name': 'a', 'parent': None, 'person age': 90},
# '/a/b': {'name': 'b', 'parent': 'a', 'person age': 65},
# '/a/b/d': {'name': 'd', 'parent': 'b', 'person age': 40},
# '/a/b/e': {'name': 'e', 'parent': 'b', 'person age': 35},
# '/a/c': {'name': 'c', 'parent': 'a', 'person age': 60}
# }
from bigtree import tree_to_nested_dict
tree_to_nested_dict(root, all_attrs=True)
# {
# 'name': 'a',
# 'age': 90,
# 'children': [
# {
# 'name': 'b',
# 'age': 65,
# 'children': [
# {
# 'name': 'd',
# 'age': 40
# },
# {
# 'name': 'e',
# 'age': 35
# }
# ]
# },
# {
# 'name': 'c',
# 'age': 60
# }
# ]
# }
from bigtree import tree_to_polars
tree_to_polars(
root,
name_col="name",
parent_col="parent",
path_col="path",
attr_dict={"age": "person age"}
)
# shape: (5, 4)
# ┌────────┬──────┬────────┬────────────┐
# │ path ┆ name ┆ parent ┆ person age │
# │ --- ┆ --- ┆ --- ┆ --- │
# │ str ┆ str ┆ str ┆ i64 │
# ╞════════╪══════╪════════╪════════════╡
# │ /a ┆ a ┆ null ┆ 90 │
# │ /a/b ┆ b ┆ a ┆ 65 │
# │ /a/b/d ┆ d ┆ b ┆ 40 │
# │ /a/b/e ┆ e ┆ b ┆ 35 │
# │ /a/c ┆ c ┆ a ┆ 60 │
# └────────┴──────┴────────┴────────────┘
- dot.png
- pillow.png
- Mermaid flowchart
%%{ init: { 'flowchart': { 'curve': 'basis' } } }%% flowchart TB 0("a") --> 0-0("b") 0-0 --> 0-0-0("d") 0-0 --> 0-0-1("e") 0("a") --> 0-1("c") classDef default stroke-width:1