Unzip Contents Based On Wildcard Negative Matching


I have the following folder contents in a zip archive:

ca_ES
cs_CZ
da_DK
de_CH
de_DE
el_GR
en_GB
es_ES
fi_FI
fr_FR
gl_ES
it_IT
lv_LV
mt_MT
nb_NO
nl_NL
pt_PT
ro_RO
sk_SK
sl_SI
sq_AL
sv_SE
tr_TR
vi_VN
get-new-strings.sh
lang_check.php
lang_check.txt
language_list.txt

I am unzipping the file using this command where I exclude directories and files I do not want:

unzip -q ${ZIP_FILE} -x ${FOLDER_PATH}/documentation/* ${FOLDER_PATH}/library/pdf/help/* ${FOLDER_PATH}/library/pdf/samples/*

Now I am trying to exclude ALL files and folders in the above directory except the directories en_GB and es_ES.

I tried the following after reading this:

unzip -q ${ZIP_FILE} -x ${FOLDER_PATH}/documentation/* ${FOLDER_PATH}/library/pdf/help/* ${FOLDER_PATH}/library/pdf/samples/* ${FOLDER_PATH}/lang/[!e][!ns]_*

But the unzipped directory ends up containing:

cs_CZ
el_GR
en_GB
es_ES
get-new-strings.sh
lang_check.php
lang_check.txt
language_list.txt
  • Why are cs_CZ, el_GR and the plain files not matched by that expression, and so still extracted?

Solution

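  • Neither cs_CZ nor el_GR matches the pattern [!e][!ns]_*, so neither is excluded.

    unzip excludes every file that matches the pattern [!e][!ns]_*. To match it, a file name (or directory path) must satisfy all of the following conditions:

    • The first character is not e
    • The second character is not n or s
    • The third character is _ (underscore)

    Only names that meet all three conditions are excluded from the extraction. cs_CZ, for example, meets only the first and third (its second character is s), and el_GR fails the first, so both are extracted. The plain files, such as lang_check.php, fail the third condition because their third character is not an underscore.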
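The three conditions can be checked without touching the archive. The following is a minimal sketch that uses a bash case statement as a stand-in for unzip's matcher; matches_exclude is a hypothetical helper, and the sketch assumes the pattern is tested against the bare entry name rather than the full path inside the archive.

```bash
#!/usr/bin/env bash
# matches_exclude is a hypothetical helper, not part of unzip.
# It reports whether a bare name matches the exclude pattern [!e][!ns]_*
# using bash's own glob matching as an approximation of unzip's.
matches_exclude() {
  case "$1" in
    [!e][!ns]_*) echo "excluded" ;;  # all three conditions hold
    *)           echo "kept" ;;      # at least one condition fails
  esac
}

# A few names taken from the listing in the question.
for name in ca_ES cs_CZ el_GR en_GB es_ES lang_check.php; do
  printf '%s: %s\n' "$name" "$(matches_exclude "$name")"
done
```

Of these names, only ca_ES is reported as excluded; the rest are kept, which agrees with the directory listing shown in the question.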

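    UPDATE (solution proposal): An ugly solution that may work is to exclude with three patterns, one per character position, so that a name is excluded as soon as any one of its first three characters is wrong. Assuming that all of the folders are in the root of the zip file: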

    unzip -q "${ZIP_FILE}" -x '[!e]?*' '?[!ns]*' '??[!_]*'
    

    UPDATE 2 (relative paths): This one should work:

    unzip -q "${ZIP_FILE}" -x "${FOLDER_PATH}/documentation/*" "${FOLDER_PATH}/library/pdf/help/*" "${FOLDER_PATH}/library/pdf/samples/*" "${FOLDER_PATH}/lang/[!e]?*" "${FOLDER_PATH}/lang/?[!ns]*" "${FOLDER_PATH}/lang/??[!_]*"
    

    Note that this keeps every file and folder in the ${FOLDER_PATH}/lang directory whose name starts with en_ or es_, not only en_GB and es_ES.
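As a dry run of the three-pattern approach, the sketch below applies the same three globs to the names from the question and collects the survivors. is_excluded is a hypothetical helper, and bash's case matching is again assumed to behave like unzip's for these simple patterns; the names are tested relative to the lang directory, without the ${FOLDER_PATH}/lang/ prefix.

```bash
#!/usr/bin/env bash
# is_excluded is a hypothetical helper, not part of unzip.
# It succeeds (returns 0) when a name matches any of the three exclude
# patterns, i.e. when any of its first three characters is "wrong".
is_excluded() {
  case "$1" in
    [!e]?*|?[!ns]*|??[!_]*) return 0 ;;
    *)                      return 1 ;;
  esac
}

# Entries listed in the question.
names=(ca_ES cs_CZ da_DK de_CH de_DE el_GR en_GB es_ES fi_FI fr_FR gl_ES
       it_IT lv_LV mt_MT nb_NO nl_NL pt_PT ro_RO sk_SK sl_SI sq_AL sv_SE
       tr_TR vi_VN get-new-strings.sh lang_check.php lang_check.txt
       language_list.txt)

kept=()
for name in "${names[@]}"; do
  is_excluded "$name" || kept+=("$name")
done

printf '%s\n' "${kept[@]}"
```

With the names above, only en_GB and es_ES survive. A name such as en_US would survive as well, which is the caveat already noted.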