Sorry if this doesn't make much sense, I'm not much of a programmer.
I am using PowerShell to concatenate all of the files in a folder into a single larger file, but when I do this, the text comes out 'corrupted'.
I have a folder of Ancient Greek texts that all have a .tess extension. The files come from https://github.com/cltk/grc_text_tesserae/tree/master/texts (I'm not sure how this extension works, but the files open fine in Notepad).
I used:
Get-Content *.tess | Set-Content greekcorpus.tess
However, the text comes out scrambled. For example:
Σιδὼν ἐπὶ θαλάττῃ πόλις
Comes out as:
Σιδὼν ἐπὶ θαλαÌττῃ ποÌλιÏ"
Anyone know what could be going wrong? Thanks!
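This should do the job. The files are UTF-8, but in Windows PowerShell (5.1 and earlier) Get-Content reads files without a byte order mark using the system's ANSI code page, and Set-Content writes with that same code page, unless you tell them otherwise. Specify the encoding on both ends of the pipeline:
Get-Content *.tess -Encoding UTF8 | Set-Content greekcorpus.tess -Encoding UTF8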
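If you want something reusable, here is a minimal sketch that wraps the read-as-UTF-8, write-as-UTF-8 idea in a function. The function name Merge-TessFiles, its parameters, and the default output name greekcorpus.txt are my own placeholders, not anything built into PowerShell. I gave the output a different extension on the assumption that you don't want the merged file to be picked up as an input if you run it again in the same folder.

```powershell
# Hypothetical helper (not a built-in cmdlet): merges every .tess file
# in a folder into one UTF-8 file.
function Merge-TessFiles {
    param(
        [string]$Folder = '.',                    # folder holding the .tess files
        [string]$OutputPath = 'greekcorpus.txt'   # assumed name; not *.tess on purpose
    )

    # Sort by name so the order of texts in the merged file is predictable.
    $files = Get-ChildItem -Path $Folder -Filter *.tess -File | Sort-Object Name

    # Decode every input as UTF-8, then encode the output as UTF-8 too.
    $files |
        Get-Content -Encoding UTF8 |
        Set-Content -Path $OutputPath -Encoding UTF8
}
```

You would call it from the folder with the texts as Merge-TessFiles, or pass -Folder and -OutputPath explicitly. Note that in Windows PowerShell 5.1, -Encoding UTF8 on Set-Content writes a byte order mark at the start of the file, while PowerShell 7 writes UTF-8 without one; most editors handle either.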