python, pathlib, os.path

Removing the leading "back dots" (..) from a path's prefix


I have two paths.

path1 = /users/fida/data_lake/data/archived/09142023

path2 = /users/fida/data_lake/data/localpublished/SPTTES/End_to_End_Edit_Schedule/2022-08-03_11_22_03.kp

The output I'm trying to get after combining both paths is: /users/fida/data_lake/data/archived/09142023/localpublished/SPTTES/End_to_End_Edit_Schedule/2022-08-03_11_22_03.kp

I have tried os.path.relpath, but I get dots in the prefix, which mess up the path:

..\..\localpublished\SPTTES\End_to_End_Edit_Schedule\2022-08-03_11_22_03.kp


Solution

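  • The "back dots" (..) can be avoided by using the second parameter of os.path.relpath, start. It defaults to os.curdir, and whenever start is not an ancestor of the target path, the result has to climb up with "..", which is where the "back dots" come from.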


    You could use os.path.commonpath (introduced in Python 3.5) to determine the common parent path.

    Next, use the second argument of os.path.relpath to get the path of path2 relative to the common parent path.

    And finally, you join path1 with the relative path of path2 using os.path.join.

    import os
    
    path1 = "/users/fida/data_lake/data/archived/09142023"
    path2 = "/users/fida/data_lake/data/localpublished/S/S/2022-08-03_11_22_03.kp"
    
    common_path = os.path.commonpath([path1, path2])
    
    relative_path2 = os.path.relpath(path2, common_path)
    
    new_path = os.path.join(path1, relative_path2)
    
    print(new_path)
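
To make the effect of the second argument visible, here is a minimal sketch that computes the relative path twice, once from path1 and once from the common parent. It uses posixpath (the POSIX flavour of os.path) only so that the output looks the same on every operating system; that choice and the variable names climbing and clean are mine, not part of the original answer.

```python
import posixpath  # POSIX flavour of os.path, so separators do not depend on the OS

path1 = "/users/fida/data_lake/data/archived/09142023"
path2 = "/users/fida/data_lake/data/localpublished/S/S/2022-08-03_11_22_03.kp"

# start is not an ancestor of path2, so relpath has to climb up with ".."
climbing = posixpath.relpath(path2, start=path1)

# start is the common parent, so there is nothing to climb and no ".." prefix
common_path = posixpath.commonpath([path1, path2])
clean = posixpath.relpath(path2, start=common_path)

print(climbing)  # ../../localpublished/S/S/2022-08-03_11_22_03.kp
print(clean)     # localpublished/S/S/2022-08-03_11_22_03.kp
```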
    

    If you prefer to use pathlib, then you could also leverage Jean-François T's common_parent function.

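    pathlib.Path.relative_to plays the role of os.path.relpath here, although by default it raises a ValueError instead of emitting ".." when the path is not located under the given parent. Joining the paths is done by simply using a slash (/). new_path is a pathlib.Path object here and can be converted to a string using str().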

    from pathlib import Path
    
    path1 = Path("/users/fida/data_lake/data/archived/09142023")
    path2 = Path("/users/fida/data_lake/data/localpublished/S/S/2022-08-03_11_22_03.kp")
    
    def common_parent(p1: Path, p2: Path) -> Path:
    
        """ Find the common parent of two paths
    
            The author of function common_parent is Jean-François T. and
            the original can be found here 
            https://stackoverflow.com/a/76419489/42659 """
    
        common_paths = sorted(set(p1.parents).intersection(set(p2.parents)))
        if not common_paths:
            raise ValueError(f"No common parent found between {p1} and {p2}")
        return common_paths[-1]
    
    common_path = common_parent(path1, path2)
    
    relative_path2 = path2.relative_to(common_path)
    
    new_path = path1 / relative_path2
    
    print(new_path)  # convert with `str()` if necessary
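
One difference between the two approaches is worth seeing in code: when asked for path2 relative to path1 directly, os.path.relpath climbs with "..", while relative_to (with its default arguments) raises a ValueError. The sketch below uses PurePosixPath and posixpath purely to keep the result OS-independent; the names via_relpath and via_relative_to are illustrative.

```python
import posixpath
from pathlib import PurePosixPath

path1 = PurePosixPath("/users/fida/data_lake/data/archived/09142023")
path2 = PurePosixPath("/users/fida/data_lake/data/localpublished/S/S/2022-08-03_11_22_03.kp")

# relpath accepts path-like objects and climbs with ".." when needed
via_relpath = posixpath.relpath(path2, start=path1)

# relative_to() refuses to climb: path2 is not located under path1
try:
    via_relative_to = str(path2.relative_to(path1))
except ValueError:
    via_relative_to = None

print(via_relpath)      # ../../localpublished/S/S/2022-08-03_11_22_03.kp
print(via_relative_to)  # None
```

This is why the common parent has to be determined first in the pathlib variant: relative_to only succeeds against a path that really is an ancestor.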
    

    Note: path2 was shortened in both examples to avoid horizontal scrolling on Stack Overflow.
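
If the combination is needed in more than one place, the three steps (common parent, relative path, join) can be wrapped in a small function. The following is only a sketch: the name rebase is hypothetical, it mixes os.path-style commonpath with pathlib as a shortcut instead of the common_parent helper, and it uses the POSIX flavours so the result matches the paths from the question on any OS.

```python
import posixpath
from pathlib import PurePosixPath


def rebase(target: str, new_root: str) -> PurePosixPath:
    """Hypothetical helper: re-root `target` under `new_root`.

    Only the part of `target` below the common parent of both
    paths is kept and appended to `new_root`.
    """
    # commonpath raises ValueError if absolute and relative paths are mixed
    common = posixpath.commonpath([target, new_root])
    return PurePosixPath(new_root) / PurePosixPath(target).relative_to(common)


path1 = "/users/fida/data_lake/data/archived/09142023"
path2 = "/users/fida/data_lake/data/localpublished/SPTTES/End_to_End_Edit_Schedule/2022-08-03_11_22_03.kp"

new_path = rebase(path2, path1)
print(new_path)
```

With the full paths from the question, this should print the combined path the question asks for.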