Edit a list to remove a defined variable and everything after and including a character


I have the following set up:

variable = /XXX/XXX/XXX/
list = [/XXX/XXX/XXX/INFO_RANDOM_STRING_HERE.file, etc...]

I want a copy of the list with the leading variable removed and everything from the first underscore onward dropped (i.e. the _ before RANDOM and all that follows), so that only the "INFO" segment remains. The INFO segment and RANDOM_STRING_HERE differ for every item, but the variable is constant.

How can I achieve this?

To clarify, I have:

variable = /users/me/folder/
list = [/users/me/folder/file1_001_134543_X5_6MGFS.txt, /users/me/folder/file2_231_234233_Y5_6MGFFAS.txt, etc...]

I want to keep list intact and build a new list:

newlist = [file1, file2, etc...]

Solution

  • You could use split() inside a list comprehension:

    [x.split(variable)[1].split('_')[0] for x in the_list]
    

    See the full code:

    variable = "/users/me/folder/"
    the_list = ["/users/me/folder/file1_001_134543_X5_6MGFS.txt", "/users/me/folder/file2_231_234233_Y5_6MGFFAS.txt"]
    
    print([x.split(variable)[1].split('_')[0] for x in the_list])
    

    Outputs:

    ['file1', 'file2']
    

    Here is another example (with comments) in case you prefer a simple for loop to a list comprehension:

    variable = "/users/me/folder/"
    the_list = ["/users/me/folder/file1_001_134543_X5_6MGFS.txt", "/users/me/folder/file2_231_234233_Y5_6MGFFAS.txt"]
    
    results_list = []
    
    for full_path in the_list:
        # Splitting on the prefix gives ["", "file1_001_134543_X5_6MGFS.txt"]:
        # an empty string before the prefix and the file name after it.
        _, file_name = full_path.split(variable)
        # Splitting on '_' gives ["file1", "001", "134543", "X5", "6MGFS.txt"];
        # index 0 is the part we want.
        file_name = file_name.split('_')[0]
        # Add e.g. "file1" to results_list.
        results_list.append(file_name)
    
    print(results_list)
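
If some paths might not start with variable, the split-based approach raises an error (for example, split(variable)[1] raises IndexError when the prefix is absent). Below is a minimal sketch of a more defensive variant, assuming such paths should simply be skipped. The helper name extract_info and its sep parameter are hypothetical, introduced here only for illustration:

```python
def extract_info(paths, prefix, sep="_"):
    """Return the text between `prefix` and the first `sep` for each path."""
    results = []
    for path in paths:
        if not path.startswith(prefix):
            # Assumption: paths outside the prefix are skipped, not treated as errors.
            continue
        remainder = path[len(prefix):]
        # partition() returns (head, sep, tail); if sep is absent,
        # head is the whole remainder.
        info, _, _ = remainder.partition(sep)
        results.append(info)
    return results


variable = "/users/me/folder/"
the_list = [
    "/users/me/folder/file1_001_134543_X5_6MGFS.txt",
    "/users/me/folder/file2_231_234233_Y5_6MGFFAS.txt",
]

new_list = extract_info(the_list, variable)
print(new_list)  # ['file1', 'file2']
```

The original the_list is left untouched. Slicing off the prefix by length also avoids surprises if the prefix text happens to occur a second time later in a path.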