Is there a trick to be able to use file paths with spaces in Mallet through the terminal on mac?
For example, all of the following give me errors:
escaping the space
./bin/mallet import-dir --input /Volumes/Macintosh\ HD/Users/MY_NAME/Desktop/en --output /Users/MY_NAME/Desktop/en.mallet --remove-stopwords TRUE --keep-sequence TRUE
double quotes, no escapes
./bin/mallet import-dir --input "/Volumes/Macintosh HD/Users/MY_NAME/Desktop/en" --output /Users/MY_NAME/Desktop/en.mallet --remove-stopwords TRUE --keep-sequence TRUE
and, with double quotes
./bin/mallet import-dir --input "/Volumes/Macintosh\ HD/Users/MY_NAME/Desktop/en" --output /Users/MY_NAME/Desktop/en.mallet --remove-stopwords TRUE --keep-sequence TRUE
and finally with single quotes
./bin/mallet import-dir --input '/Volumes/Macintosh\ HD/Users/MY_NAME/Desktop/en' --output /Users/MY_NAME/Desktop/en.mallet --remove-stopwords TRUE --keep-sequence TRUE
They all want to treat the folder as multiple folders, split on the space:
Labels =
/Volumes/Macintosh\
HD/Users/MY_NAME/Desktop/en
Exception in thread "main" java.lang.IllegalArgumentException: /Volumes/Macintosh\ is not a directory.
at cc.mallet.pipe.iterator.FileIterator.<init>(FileIterator.java:108)
at cc.mallet.pipe.iterator.FileIterator.<init>(FileIterator.java:145)
at cc.mallet.classify.tui.Text2Vectors.main(Text2Vectors.java:322)
Is there anyway around this, other than renaming all of my files with spaces to underscores? (I understand that I don't need to type /Volumes/Macintosh\ HD/... but can just start at /Users. This was just an example.)
The issue is that import-dir
is designed to take multiple directories as input. The argument parser would need a way to distinguish this use case from the "escaped space" use case, keeping in mind that Windows paths can end in \
.
The best way to support both cases might be to add a --single-input
option that would take its argument as a single string.
I also find that the spreadsheet-style import-file
command is almost always preferable to working with directories.