How to ignore certain files when branching / checking out?

I'd like to compare a few files from the Bazaar branch lp:ubuntu/nvidia-graphics-drivers. I'm mainly interested in the debian subdirectory inside that branch, but because of the binary blob in http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/oneiric/nvidia-graphics-drivers/oneiric/files, it takes ages to get just the text files. I've already downloaded 555MB and it's still counting.

Is it possible to retrieve a Bazaar branch while including or excluding certain files by one of the following properties:

file size
file extension
file name (include only debian/ for example)

I do not need to push back any changes, nor do I need to view the history of a file. I just want to compare two files in the debian/ directory, files with the .in extension and files without.

Solution

I ended up doing some dirty grep-ing on the HTTP response, since bzr info "$branch" and bzr ls -d "$branch" "$directory" did not give me enough information.
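To illustrate the kind of grep-ing meant here, the sketch below runs the same extraction pipeline on a made-up listing fragment. The HTML in sample_listing is only an assumption about what a file row looks like (a link line directly followed by the download icon); the real markup may differ, and the paths are invented.

```shell
# Hypothetical fragment of a directory listing page; real markup may differ.
sample_listing() {
    cat <<'EOF'
<td class="downloadcell">
<a href="/example/download/head:/control" title="Download control">
<img src="/static/images/ico_file_download.gif" alt="Download File" />
</a>
</td>
<td class="downloadcell">
<a href="/example/download/head:/blob.run" title="Download blob.run">
<img src="/static/images/ico_file_download.gif" alt="Download File" />
</a>
</td>
EOF
}

# Take the line preceding each download icon, keep only the anchor lines,
# cut out the href value, then drop anything ending in .run.
extract_urls() {
    grep -F -B 1 '<img src="/static/images/ico_file_download.gif" alt="Download File" />' |
        grep '^<a' | cut -d '"' -f 2 | grep -v '\.run$'
}

sample_listing | extract_urls
```

With this sample, only the link to control survives; the .run link is filtered out.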

The Bash script below relies on the HTML produced by Loggerhead, Launchpad's front end for browsing branches. It recursively downloads from a given URL, which must end with a slash. Currently, it ignores *.run files. Save it as bzrdl in a directory listed in $PATH, make it executable, and run it with bzrdl http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/oneiric/nvidia-graphics-drivers/oneiric/files/head:/debian/. All files will be saved in the current directory, so make sure it is empty to avoid conflicts.

#!/bin/bash
max_retries=5
rooturl="$1"
if ! [[ $rooturl =~ /$ ]]; then
    echo "Usage: ${0##*/} URL"
    echo "URL must end with a slash. Example URL:"
    echo "http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/oneiric/nvidia-graphics-drivers/oneiric/files/head:/"
    exit 1
fi
tmpdir="$(mktemp -d)"
target="$(pwd)"
# used for holding HTTP response before extracting data
tmp="$(mktemp)"
# url_filter reads download URLs from stdin (piped)
url_filter() {
    grep -v '\.run$'
}
get_files_from_dir() {
    local slash=/
    local dir="$1"
    # to avoid name collision: a/b/c/ -> a.d/b.d/c.d/
    local storedir="${dir//$slash/.d${slash}}"
    mkdir -p "$tmpdir/$storedir" "$target/$dir"
    local i subdir
    for ((i=0; i<$max_retries; i++ )); do
        if wget -O "$tmp" "$rooturl$dir"; then
            # store file list
            grep -F -B 1 '<img src="/static/images/ico_file_download.gif" alt="Download File" />' "$tmp" |\
                grep '^<a' | cut -d '"' -f 2 | url_filter \
                > "$tmpdir/$storedir/files"
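            # split the list of directory names on newlines only
            IFS=$'\n'
            for subdir in $(grep -F -B 1 '<img src="/static/images/ico_folder.gif" ' "$tmp" | \
                grep -F '<a ' | rev | cut -d / -f 2 | rev); do
                IFS=$' \t\n'
                get_files_from_dir "$dir$subdir/"
            done
            # restore the default even when no subdirectory was found
            IFS=$' \t\n'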
            return
        fi
    done
    echo "Failed to download directory listing of: $dir" >> "$tmpdir/errors"
}
download_files() {
    local slash=/
    local dir="$1"
    # to avoid name collision: a/b/c/ -> a.d/b.d/c.d/
    local storedir="${dir//$slash/.d${slash}}"
    local done=false
    local i subdir
    # bail out rather than run wget in the wrong directory
    cd "$tmpdir/$storedir" || return
    for ((i=0; i<$max_retries; i++)); do
        if wget -B "$rooturl$dir" -nc -i files -P "$target/$dir"; then
            done=true
            break
        fi
    done  
    $done || echo "Failed to download all files from $dir" >> "$tmpdir/errors"
    for subdir in *.d; do 
        download_files "$dir${subdir%%.d}/"
    done
}
get_files_from_dir ''
# make *.d expand to nothing if no directories are found
shopt -s nullglob
download_files ''
echo "TMP dir: $tmpdir"
# count logged errors; prints 0 when the errors file does not exist
echo "Errors : $(cat "$tmpdir/errors" 2>/dev/null | wc -l)"

The temporary directory and file are not removed afterwards; that must be done manually. Any errors (failed downloads) are written to $tmpdir/errors.
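Since url_filter is the only place where URLs are dropped, filtering by extension or by name comes down to changing its pattern. The variant below is a sketch, not part of the tested script; the extra extensions and the changelog name are arbitrary examples, and the URLs in the check are invented. Filtering by file size is not covered, because the script never looks at sizes.

```shell
# Hypothetical replacement for url_filter: reads URLs from stdin, one per line.
url_filter() {
    # Drop by extension (.run, .tar.gz, .deb), then by file name (changelog).
    grep -v -E '\.(run|tar\.gz|deb)$' | grep -v -E '(^|/)changelog$'
}

# Quick check with invented URLs.
printf '%s\n' \
    '/x/download/head:/debian/control' \
    '/x/download/head:/debian/changelog' \
    '/x/download/head:/NVIDIA-Linux-x86.run' \
    '/x/download/head:/debian/rules.in' |
    url_filter
```

To keep only matching files instead of excluding them, drop the -v option from the relevant grep.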

It's confirmed to work with:

bzrdl http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/oneiric/nvidia-settings/oneiric/files/head:/debian/
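Once the files are in the current directory, the comparison from the question could be done along these lines. This is a sketch that assumes the goal is to compare each X.in against a file X sitting next to it in the same directory; compare_in_files is a name made up for this example.

```shell
# Print a unified diff for every X.in that has a sibling X in the given directory.
compare_in_files() {
    local dir="${1:-.}"
    local template plain
    for template in "$dir"/*.in; do
        # skip the unexpanded pattern when there are no *.in files
        [ -f "$template" ] || continue
        plain="${template%.in}"
        if [ -f "$plain" ]; then
            # diff exits with 1 when the files differ; that is not an error here
            diff -u "$template" "$plain" || true
        else
            echo "No counterpart for $template" >&2
        fi
    done
    return 0
}

compare_in_files .
```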

Feel free to correct any mistakes or add improvements.