
std::fs::canonicalize for files that don't exist


I'm writing a program in Rust that creates a file at a user-defined path. I need to be able to normalize intermediate components (~/ should become $HOME/, ../ should go up a directory, etc.) in order to create the file in the right place. std::fs::canonicalize does almost exactly what I want, but it returns an error if the path does not already exist.

Is there a function that normalizes components the same way as std::fs::canonicalize but doesn't fail if the file doesn't already exist?


Solution

  • There are good reasons such a function isn't standard:

    1. There's no unique answer when you're dealing with both links and non-existent files. If a/b is a link to c/d/e, then a/b/../f could mean either a/f (resolving .. lexically) or c/d/f (following the link first, as the OS does).

    2. The ~ shortcut is a shell feature. You may want to generalize it (I do), but that's a non-obvious choice, especially when you consider that ~ is a valid file name on most systems.
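The first point can be checked with the standard library alone. The following is a minimal sketch, not the code used in this answer: `normalize_lexically` and `resolve_through_link` are hypothetical names, the demo helper is Unix-only, and it assumes `root` is an absolute, writable directory that does not exist yet.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Purely lexical resolution: `..` cancels the previous name, with no
/// filesystem access, so links are never followed.
pub fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let ends_with_name =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                if ends_with_name {
                    out.pop();
                } else if !out.has_root() {
                    // nothing to cancel in a relative path: keep the `..`
                    out.push("..");
                }
                // at the root, `..` is dropped
            }
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical demo helper (Unix only): builds the layout from the example
/// above under `root` (assumed absolute) and returns what the OS resolves
/// `a/b/../f` to.
#[cfg(unix)]
pub fn resolve_through_link(root: &Path) -> io::Result<PathBuf> {
    fs::create_dir_all(root.join("c/d/e"))?;
    fs::create_dir_all(root.join("c/d/f"))?;
    fs::create_dir_all(root.join("a/f"))?;
    // a/b is a link to c/d/e
    std::os::unix::fs::symlink(root.join("c/d/e"), root.join("a/b"))?;
    fs::canonicalize(root.join("a/b/../f"))
}
```

With this layout, the lexical version answers `a/f` while `std::fs::canonicalize` follows the link and answers `c/d/f`. Both directories exist, so neither answer is "wrong"; they are just different questions.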

    That said, such a function is sometimes useful, in cases where those ambiguities aren't a problem because of the nature of your application.

    Here's what I do in such a case:

    use {
        directories::UserDirs,
        lazy_regex::*,
        log::warn, // assumed: the `warn!` macro used below comes from the `log` crate
        std::path::{Component, Path, PathBuf},
    };
    
    /// build a usable path from a user input which may be absolute
    /// (if it starts with / or ~) or relative to the supplied base_dir.
    /// (we might want to try detect windows drives in the future, too)
    pub fn path_from<P: AsRef<Path>>(
        base_dir: P,
        input: &str,
    ) -> PathBuf {
        let tilde = regex!(r"^~(/|$)");
        if input.starts_with('/') {
            // if the input starts with a `/`, we use it as is
            input.into()
        } else if tilde.is_match(input) {
            // if the input starts with `~` as first token, we replace
            // this `~` with the user home directory
            PathBuf::from(
                &*tilde
                    .replace(input, |c: &Captures| {
                        if let Some(user_dirs) = UserDirs::new() {
                            format!(
                                "{}{}",
                                user_dirs.home_dir().to_string_lossy(),
                                &c[1],
                            )
                        } else {
                            warn!("no user dirs found, no expansion of ~");
                            c[0].to_string()
                        }
                    })
            )
        } else {
            // we put the input behind the source (the selected directory
            // or its parent) and we normalize so that the user can type
            // paths with `../`
            normalize_path(base_dir.as_ref().join(input))
        }
    }
    
    
    /// Normalize the path by resolving `..` components lexically,
    /// without touching the filesystem.
    ///
    /// This assumes that `a/b/../c` is `a/c`, which might be different from
    /// what the OS would have chosen when b is a link. This is OK
    /// for broot verb arguments but can't be generally used elsewhere.
    ///
    /// This function ensures a given path ending with '/' still
    /// ends with '/' after normalization.
    pub fn normalize_path<P: AsRef<Path>>(path: P) -> PathBuf {
        let ends_with_slash = path.as_ref()
            .to_str()
            .map_or(false, |s| s.ends_with('/'));
        let mut normalized = PathBuf::new();
        for component in path.as_ref().components() {
            match &component {
                Component::ParentDir => {
                    if !normalized.pop() {
                        normalized.push(component);
                    }
                }
                _ => {
                    normalized.push(component);
                }
            }
        }
        if ends_with_slash {
            normalized.push("");
        }
        normalized
    }
    

    (This uses the directories crate to get the home directory in a cross-platform way, but other crates exist, and on most platforms you could also just read the $HOME environment variable.)
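For the `$HOME` route, here is a minimal std-only sketch. The names `expand_tilde` and `expand_tilde_with` are hypothetical, it assumes a Unix-like environment where `HOME` is set, and unlike the regex version above it drops the trailing slash of a bare `~/`. It does not handle the shell's `~user` form.

```rust
use std::env;
use std::path::PathBuf;

/// Expands a leading `~` or `~/` using `home`.
/// Any other input (including `~user/...`) is returned unchanged,
/// as is every input when `home` is `None`.
pub fn expand_tilde_with(input: &str, home: Option<PathBuf>) -> PathBuf {
    if let (Some(rest), Some(home)) = (input.strip_prefix('~'), home) {
        // only `~` alone or `~/...` counts as the home shortcut
        if rest.is_empty() || rest.starts_with('/') {
            let rest = rest.trim_start_matches('/');
            return if rest.is_empty() { home } else { home.join(rest) };
        }
    }
    PathBuf::from(input)
}

/// Same as `expand_tilde_with`, reading the home directory from `$HOME`.
pub fn expand_tilde(input: &str) -> PathBuf {
    expand_tilde_with(input, env::var_os("HOME").map(PathBuf::from))
}
```

Taking the home directory as a parameter keeps the logic testable without depending on the environment; `expand_tilde("~/notes.txt")` is then just the convenience wrapper.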