
Save dendrogram to newick format


How can I save a dendrogram generated by scipy into Newick format?


Solution

  • Convert the linkage matrix Z (the same matrix you pass to scipy's dendrogram function) into a tree with scipy.cluster.hierarchy.to_tree(), then walk that tree to build the Newick string. You also need a list, leaf_names, that holds the leaf names in the order of the original observations. The following function does the job:

    from scipy.cluster import hierarchy

    def get_newick(node, parent_dist, leaf_names, newick='') -> str:
        """
        Convert scipy.cluster.hierarchy.to_tree()-output to Newick format.
    
        :param node: output of scipy.cluster.hierarchy.to_tree()
        :param parent_dist: output of scipy.cluster.hierarchy.to_tree().dist
        :param leaf_names: list of leaf names
        :param newick: leave empty, this variable is used in recursion.
        :returns: tree in Newick format
        """
        if node.is_leaf():
            return "%s:%.2f%s" % (leaf_names[node.id], parent_dist - node.dist, newick)
        else:
            if len(newick) > 0:
                newick = "):%.2f%s" % (parent_dist - node.dist, newick)
            else:
                newick = ");"
            newick = get_newick(node.get_left(), node.dist, leaf_names, newick=newick)
            newick = get_newick(node.get_right(), node.dist, leaf_names, newick=",%s" % (newick))
            newick = "(%s" % (newick)
            return newick
    
    # Z is the linkage matrix; to_tree(Z, False) returns only the root ClusterNode.
    tree = hierarchy.to_tree(Z, False)
    newick_str = get_newick(tree, tree.dist, leaf_names)
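
For reference, here is a minimal end-to-end sketch of how the function might be used. The toy data, the leaf names A, B, C, the choice of single linkage, and the output file name tree.nwk are assumptions made for illustration only; substitute your own observations and labels.

```python
import numpy as np
from scipy.cluster import hierarchy


def get_newick(node, parent_dist, leaf_names, newick=""):
    """Same recursive conversion as above, repeated so this sketch runs on its own."""
    if node.is_leaf():
        # Branch length of a leaf is the height of its parent (leaves sit at height 0).
        return "%s:%.2f%s" % (leaf_names[node.id], parent_dist - node.dist, newick)
    if len(newick) > 0:
        newick = "):%.2f%s" % (parent_dist - node.dist, newick)
    else:
        newick = ");"
    newick = get_newick(node.get_left(), node.dist, leaf_names, newick=newick)
    newick = get_newick(node.get_right(), node.dist, leaf_names, newick=",%s" % newick)
    return "(%s" % newick


# Hypothetical toy data: three one-dimensional observations.
X = np.array([[0.0], [1.0], [5.0]])
leaf_names = ["A", "B", "C"]  # one name per row of X, in the same order

# Build the linkage matrix (single linkage is an arbitrary choice here).
Z = hierarchy.linkage(X, method="single")

# Convert the linkage matrix to a tree and then to a Newick string.
tree = hierarchy.to_tree(Z, False)
newick_str = get_newick(tree, tree.dist, leaf_names)
print(newick_str)

# Save the result; "tree.nwk" is a placeholder file name.
with open("tree.nwk", "w") as fh:
    fh.write(newick_str + "\n")
```

The printed string is a nested, parenthesised tree terminated by a semicolon, with branch lengths rounded to two decimals by the "%.2f" format. Raise the precision in the format string if your distances are small.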