Using MacOS terminal, is there away to take a directory of text files and have them all combined into one spreadsheet (either CSV or Numbers format) so:
Example 1: here is how my text files look before combining:
Example 2: here is how my text files should look in a spreadsheet after combining:
(These examples are a partial extract. I actually have 100s of files to combine).
Steps I have tried:
I searched Stack Overflow for an answer, but all the other questions about this task use Python or Panda. I would prefer a solution that could be done directly from MacOS terminal without needing to install packages like Python or Panda.
From researching, I believe the paste
command could be used:
paste -d '\t' *.txt > ^0-merged.csv
However, when I try this, it produces the following error message: paste: Too many open files
. It also produces a CSV file that is completely blank.
Ruby is part of MacOS.
Given:
head -n 3 *.txt
==> GOOD THINGS IN LIFE.txt <==
Art
Fun
Hugs
==> IN THE BACKYARD.txt <==
Hose
Tree
Soil
==> KITCHEN CUPBOARD ESSENTIALS.txt <==
Tea
Rice
Milk
==> KNITTING STITCHES.txt <==
Rib
Dip
Seed
# and the rest of your lines in each case...
You can do:
ruby -e '
a=[]
ARGV.sort.each{|fn|
a<<[fn]+File.open(fn).read.split(/\R/)
}
a.transpose.each{|sa|
puts sa.join(",")
}
' *.txt
Prints:
GOOD THINGS IN LIFE.txt,IN THE BACKYARD.txt,KITCHEN CUPBOARD ESSENTIALS.txt,KNITTING STITCHES.txt
Art,Hose,Tea,Rib
Fun,Tree,Rice,Dip
Hugs,Soil,Milk,Seed
Earth,Fence,Salt,Tile
Honor,Porch,Pesto,Linen
Space,Patio,Flour,Cable
Sport,Grass,Honey,Wicker
Intelligence,Wading Pool,Baking Powder,Knotted Boxes
Innovation,Welcome Mat,Vegetable Oil,Chinese Wave
Confidence,Back Stoop,Tomato Paste,Checkerboard
Good Deeds,Fruit Tree,Black Pepper,Herringbone
Creativity,Downspout,Baking Soda,Stockinette
Education,Birdbath,Ketchup,Garter
Kindness,Terrace,Surer,Waffle
Integrity,Planter,Sugar,Puri Ridge
Faith,Carport,Coffee,Netted
Friends,Flowerbed,Cinnamon,Elongated
Respect,Shovel,Cheese,Farrow Rib
People,Hedges,Bread,Plaited
Yourself,Rocks,Olive Oil,Clamshell
Happiness,Lawnmower,Crackers,Bamboo
Heart,Hot Tub,Pasta,English Rib
Religion,Garden,Scissors,Basket
Wisdom,Stoop,Garlic,Raspberry
If you want a 'proper' csv with quoted fields that works better with Excel, you can use the CSV module included with Ruby:
ruby -r csv -e '
a=[]
ARGV.sort.each{|fn|
a<<[fn]+File.open(fn).read.split(/\R/)
}
a=a.transpose
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
csv<<a[0]
a[1..].each{|row|
csv<<row
}
}
' *.txt
Prints:
"GOOD THINGS IN LIFE.txt","IN THE BACKYARD.txt","KITCHEN CUPBOARD ESSENTIALS.txt","KNITTING STITCHES.txt"
"Art","Hose","Tea","Rib"
"Fun","Tree","Rice","Dip"
"Hugs","Soil","Milk","Seed"
"Earth","Fence","Salt","Tile"
"Honor","Porch","Pesto","Linen"
"Space","Patio","Flour","Cable"
"Sport","Grass","Honey","Wicker"
"Intelligence","Wading Pool","Baking Powder","Knotted Boxes"
"Innovation","Welcome Mat","Vegetable Oil","Chinese Wave"
"Confidence","Back Stoop","Tomato Paste","Checkerboard"
"Good Deeds","Fruit Tree","Black Pepper","Herringbone"
"Creativity","Downspout","Baking Soda","Stockinette"
"Education","Birdbath","Ketchup","Garter"
"Kindness","Terrace","Surer","Waffle"
"Integrity","Planter","Sugar","Puri Ridge"
"Faith","Carport","Coffee","Netted"
"Friends","Flowerbed","Cinnamon","Elongated"
"Respect","Shovel","Cheese","Farrow Rib"
"People","Hedges","Bread","Plaited"
"Yourself","Rocks","Olive Oil","Clamshell"
"Happiness","Lawnmower","Crackers","Bamboo"
"Heart","Hot Tub","Pasta","English Rib"
"Religion","Garden","Scissors","Basket"
"Wisdom","Stoop","Garlic","Raspberry"
Comments:
It also puts the file name at the top of every column. Is there any way to omit the file name? Also, it seems to treat Uppercase A-Z and lowercase a-z as separate (e.g. so file names with A-Z will come first and then file names with a-z) Thanks!
If you have files of different length, you can pad the end of the shorter files so that you still have a proper matrix to transpose:
ruby -r csv -e '
a=[]
ARGV.sort_by{|s| s.downcase}.each{|fn|
a<<File.open(fn).read.split(/\R/)
}
max_length=a.max_by{|sa| sa.length}.length
a.each.with_index{|sa,i|
if sa.length<max_length then a[i].concat [""]*(max_length-sa.length) end }
a=a.transpose
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
csv<<a[0]
a[1..].each{|row|
csv<<row
}
}
' *.txt