I'm trying to teach myself how to do phylogenetics for historical linguistics in R. I've found a public data set (https://www.cs.rice.edu/~nakhleh/CPHL/IEDATA_112603), and I want to get a Newick-format tree from it so that I can visualize it following these instructions: https://www.r-phylo.org/wiki/HowTo/InputtingTrees. I'm running R 3.4.1 on Mac OS 10.12.6.
Here's what I've done so far. I copied the data and used R and a text editor to transform it into a Nexus data file. Since Nexus (as I understand it) can't distinguish between the two separate values 1 and 2 and the single value 12, I replaced every value over 9 in the original data set with a letter of the alphabet, in sequence (a-q). Anyone can download it from here: https://ucla.box.com/s/i4fbeagcw8lombg3xuhczfk3h0y7v54m
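For anyone wanting to reproduce the recoding step, here is a minimal sketch in base R. The function name `recode_state` is made up for illustration, and it assumes the codes are whole numbers no larger than 26 (so that 10 to 26 map onto a-q); it is not the exact code I ran.

```r
# Hypothetical helper: turn numeric state codes into single-character symbols.
# Values 0-9 are kept as digits; 10, 11, ... become "a", "b", ...
# Assumption: no code is larger than 26, so the letters stop at "q".
recode_state <- function(x) {
  x <- as.integer(x)
  out <- as.character(x)
  big <- !is.na(x) & x > 9
  out[big] <- letters[x[big] - 9]
  out
}

recode_state(c(1, 2, 9, 10, 12, 26))
# "1" "2" "9" "a" "c" "q"
```

The function works on a plain vector, so for a whole matrix it would need to be applied column by column (or the dimensions restored afterwards).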
The problem is, I can't find any instructions or code or guidance to interpret the raw data as a tree. I've found one Python script (Convert csv to Newick tree), but I don't know Python. Can anyone point me in the direction of the right software/library/tutorial, or otherwise help me figure out what my next step should be?
I finally found a colleague who could help me. I did not need to convert the data to Newick or Nexus to make a tree from it; I needed to convert it to a phyDat object (see the phangorn package for R). I used the as.phyDat() function in phangorn to convert the linguistic data into "phylogenetic data", specifying type = "USER", which let me define my own levels for the data. There's a more detailed example at cran.r-project.org/web/packages/phangorn/vignettes/…. Then I could create trees from it using the regular phangorn functions.
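Roughly, the workflow looks like the sketch below. The character matrix is a tiny made-up example (four hypothetical languages, five characters), not the real data set, and the tree-building step (Hamming distances plus neighbor joining) is just one of the regular phangorn options, chosen here as an assumption for illustration.

```r
library(phangorn)

# Toy data, invented for illustration: rows are languages, columns are characters.
chars <- matrix(
  c("1", "1", "2", "a", "1",
    "1", "2", "2", "a", "1",
    "2", "2", "1", "b", "2",
    "2", "1", "1", "b", "2"),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("LangA", "LangB", "LangC", "LangD"), NULL)
)

# type = "USER" lets you list your own character states as levels.
states <- c("1", "2", "a", "b")
ling <- as.phyDat(chars, type = "USER", levels = states)

# One possible way to get a tree: Hamming distances, then neighbor joining.
dm <- dist.hamming(ling)
tree <- NJ(dm)

# If a Newick string is still wanted, ape can write one from the tree.
newick <- ape::write.tree(tree)
```

After this, `plot(tree)` draws the tree, and `newick` holds the Newick string in case another tool needs it.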