python image exif gopro

Copy images from several directories to a new destination, grouped by EXIF time


I have a problem. Below is the directory tree under my root folder:

 ---dir1---sub1(images exif time 10:05:05 to 10:09:55)
        ---sub2(images exif time 10:11:15 to 10:15:42)
        ---sub3(images exif time 10:22:15 to 10:24:41)
        ---sub4(images exif time 10:28:15 to 10:35:40)


 ---dir2---sub1(images exif time 10:05:06 to 10:09:57)
        ---sub2(images exif time 10:11:15 to 10:15:40)
        ---sub3(images exif time 10:22:15 to 10:24:43)
        ---sub4(images exif time 10:28:15 to 10:35:40)
        ---sub5(images exif time 10:40:15 to 10:43:40)

 ---dir3---sub1(images exif time 10:05:05 to 10:09:54)
        ---sub2(images exif time 10:11:15 to 10:15:40)
        ---sub3(images exif time 10:22:15 to 10:24:41)
        ---sub4(images exif time 10:28:15 to 10:35:40)
        ---sub5(images exif time 10:40:15 to 10:43:42)

 ---dir4---sub1(images exif time 10:05:06 to 10:09:57)
        ---sub2(images exif time 10:11:15 to 10:15:40)
        ---sub3(images exif time 10:22:15 to 10:24:43)
        ---sub4(images exif time 10:28:15 to 10:35:40)
        ---sub5(images exif time 10:40:15 to 10:43:40)

 ---dir5---sub1(images exif time 10:05:05 to 10:09:54)
        ---sub2(images exif time 10:11:15 to 10:15:40)
        ---sub3(images exif time 10:22:15 to 10:24:41)
        ---sub4(images exif time 10:28:15 to 10:35:40)
        ---sub5(images exif time 10:40:15 to 10:43:42)

My root contains 5 directories, each with sub-folders of images; the number of sub-folders is not always the same. I want to take sub1 from dir1 and copy it to a new destination folder (newdir1), then scan the sub-folders of dir2 for the one whose EXIF times match sub1 from dir1 and copy it to the same destination, and so on for dir3 and all the remaining directories. After that I create newdir2, take sub2 from dir1, and repeat the same loop until the end.

Something like:

   ---newdir1---sub1(from dir1)
             ---sub1(from dir2)
             ---sub1(from dir3)
             ---sub1(from dir4)
             ---sub1(from dir5)

   ---newdir2---sub2(from dir1)
             ---sub2(from dir2)
             ---sub2(from dir3)
             ---sub2(from dir4)
             ---sub2(from dir5)

   ---newdir3---sub3(from dir1)
             ---sub3(from dir2)
             ---sub3(from dir3)
             ---sub3(from dir4)
             ---sub3(from dir5)

   ---newdir4---sub4(from dir1)
             ---sub4(from dir2)
             ---sub4(from dir3)
             ---sub4(from dir4)
             ---sub4(from dir5)

   ---newdir5---sub5(from dir2)
             ---sub5(from dir3)
             ---sub5(from dir4)
             ---sub5(from dir5)

I have part of a script that sorts my image folders into a dictionary keyed by time interval. How can I join it to my copy script, so that sub-folders with the same key end up in the same directory?

import os
import exifread
from datetime import datetime, timedelta

TIME_RANGE = 2

src_root = r'F:\gopro_egouts\gopro_img_test\2018-03-06'

dst_root = src_root + '-copie'

src_dirs_dict = {}

for cam_dir in os.listdir(src_root):
    laps_root = os.path.join(src_root, cam_dir)
    for lap_dir in os.listdir(laps_root):
        files_root = os.path.join(laps_root, lap_dir)
        min_time = None
        max_time = None
        for cam_file in os.listdir(files_root):
            with open(os.path.join(files_root, cam_file), 'rb') as f:
                tags = exifread.process_file(f, details=False, stop_tag="EXIF DateTimeOriginal")
                time_taken = tags.get("EXIF DateTimeOriginal")
                if time_taken:
                    file_time = datetime.strptime(str(time_taken), '%Y:%m:%d %H:%M:%S')
                    if min_time is not None:
                        if file_time < min_time:
                            min_time = file_time
                    else:
                        min_time = file_time

                    if max_time is not None:
                        if file_time > max_time:
                            max_time = file_time
                    else:
                        max_time = file_time


        if min_time is None:
            # No image with an EXIF DateTimeOriginal tag in this folder; skip it
            continue

        is_key = None
        for key in src_dirs_dict.keys():
            if (min_time >= key[0] and min_time < key[1]) \
                    or (max_time >= key[0] and max_time < key[1]):
                is_key = key
                break
        min_time = min_time.replace(second=0)
        max_time = min_time + timedelta(minutes=TIME_RANGE)

        if is_key:

            key_min, key_max = is_key
            if min_time < key_min:
                key_min = min_time
            if max_time > key_max:
                key_max = max_time

            new_key = (key_min, key_max)

            if new_key == is_key:
                src_dirs_dict[new_key].append(files_root)
            else:

                src_dirs_dict[new_key] = src_dirs_dict.pop(is_key) + [files_root]
        else:
            new_key = (min_time, max_time)
            src_dirs_dict[new_key] = [files_root]


print(src_dirs_dict)

The print shows:

{(datetime.datetime(2018, 3, 6, 10, 31), datetime.datetime(2018, 3, 6, 10, 32)): ['F:\\gopro_egouts\\gopro_img_test\\2018-03-06\\CAM0101 1\\Time Lapse 3',...

I have a script that works, but it takes the sub-folders one by one, in order. When a time lapse is missing from one camera, it mixes up my sub-folders: it automatically takes the next one from the next directory, which has the wrong time. Where should I add the EXIF script from above, and how should I modify it, so the two work together?
Any help will be appreciated.

from collections import defaultdict
import shutil
import os
import re
src_root = r'F:\gp\gp_test\2018-03-06'

dst_root = src_root + '-copie'

#os.makedirs(dst_root, exist_ok=True)

src_dirname, src_folders, _ = next(os.walk(src_root))
src_folders = sorted(src_folders)

src_folders = [os.path.join(src_root, folder) for folder in src_folders]
print(src_folders)
job = defaultdict(list)

print('my {} camera folders'.format(len(src_folders)))

for folder in src_folders:
    print()
    dirname, src_sub_folders, _ = next(os.walk(os.path.join(src_dirname, folder)))
    src_sub_folders = sorted(src_sub_folders, key=lambda x: [re.search(r'(\D+)', x).group(1)] + list(map(int, re.findall(r'\d+', x))))
    print("my 5 CAMs with {} time lapses '{}'".format(len(src_sub_folders), folder))

    for index, sub_folder in enumerate(src_sub_folders, start=1):
        job['Time Lapse-{}'.format(index)].append(os.path.join(dirname, sub_folder))

#print()

for dst_folder, src_folders in sorted(job.items()):
    for index, src_folder in enumerate(src_folders, start=1):
        dst_new_folder = os.path.join(dst_root, dst_folder, 'CAM-{}'.format(index))
        print('{} -> {}'.format(src_folder, dst_new_folder))
        shutil.copytree(src_folder, dst_new_folder)
#shutil.rmtree(src_root)

for root, dirs, files in os.walk(dst_root):
    for f in files:
        prefix = os.path.basename(root)
        prefix1 = os.path.basename(src_root)
        os.rename(os.path.join(root, f), os.path.join(root, "{}-{}-{}".format(prefix1, prefix, f)))
        print("images renamed")

print("done")
print("folder deleted")

I'm really sorry if this is not very clear; English is not my strongest language.


Solution

  • In a nutshell, you have images of the same set of events shot on a number of cameras.

    Currently, they are grouped first by camera, then by event:

    ├── Camera1
    │   ├── Event1
    │   ├── Event2
    │   ├── Event3
    │   ├── Event4
    │   └── Event5
    ├── Camera2
    │   ├── Event1
    │   ├── Event2
    │   ├── Event3
    │   ├── Event4
    │   └── Event5
    ├── Camera3
    │   ├── Event1
    │   ├── Event2
    │   ├── Event3
    │   ├── Event4
    │   └── Event5
    ├── Camera4
    │   ├── Event1
    │   ├── Event2
    │   ├── Event3
    │   ├── Event4
    │   └── Event5
    └── Camera5
        ├── Event1
        ├── Event2
        ├── Event3
        ├── Event4
        └── Event5
    

    ... where some events may be missing, so the event numbering may not match across cameras, because not every camera recorded every event.

    And you want the same set of images grouped first by event, then by camera:

    ├── Event1
    │   ├── Camera1
    │   ├── Camera2
    │   ├── Camera3
    │   ├── Camera4
    │   └── Camera5
    ├── Event2
    │   ├── Camera1
    │   ├── Camera2
    │   ├── Camera3
    │   ├── Camera4
    │   └── Camera5
    ├── Event3
    │   ├── Camera1
    │   ├── Camera2
    │   ├── Camera3
    │   ├── Camera4
    │   └── Camera5
    ├── Event4
    │   ├── Camera1
    │   ├── Camera2
    │   ├── Camera3
    │   ├── Camera4
    │   └── Camera5
    └── Event5
        ├── Camera1
        ├── Camera2
        ├── Camera3
        ├── Camera4
        └── Camera5
    

    Here's an idea... I am kind of "thinking aloud" in pseudo-code:

    Create the output directories {Event1..EventN}/Camera{1..N}
    MissingDirectory=false
    for each input directory Camera{1..N}
        if this directory has the full number of subdirectories
            copy all subdirectories to output area
        else
            MissingDirectory=true
        end if
    end for
    
    if MissingDirectory
        for each output Event directory
            get times of all files from all cameras for current event
            sort list and take median time of current event
        end for
        for each un-copied input directory
            get the mean time of all the files in it
            assign this directory's files to output directory with nearest median time
        end for
    end if
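
    The second half of the pseudo-code (assigning each un-copied directory to the event with the nearest median time) might look roughly like this in Python. This is only a sketch: `assign_to_nearest_event` and the two dictionaries are hypothetical names, and the times are assumed to be seconds since midnight that have already been read from EXIF.

```python
from statistics import mean, median


def assign_to_nearest_event(event_times, uncopied_dirs):
    """Map each un-copied directory to the event with the nearest median time.

    event_times:   {event_name: [seconds, ...]} gathered from the cameras
                   that were already copied.
    uncopied_dirs: {directory_path: [seconds, ...]} for the remaining directories.
    """
    medians = {event: median(times) for event, times in event_times.items()}
    assignment = {}
    for path, times in uncopied_dirs.items():
        centre = mean(times)  # mean time of the files in this directory
        assignment[path] = min(medians, key=lambda event: abs(medians[event] - centre))
    return assignment


# Hypothetical data built from the first and last times listed in the question
event_times = {
    "Event4": [37695, 38140],  # 10:28:15 to 10:35:40
    "Event5": [38415, 38620],  # 10:40:15 to 10:43:40
}
uncopied_dirs = {"dir1/sub4": [37695, 38140]}
print(assign_to_nearest_event(event_times, uncopied_dirs))
```

    The returned mapping could then drive the copy step, for example with `shutil.copytree` as in the question's script.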
    

    You can convert your EXIF times to pure seconds since midnight (s) with:

    s = (hours*3600) + (minutes*60) + seconds
    

    Here's a way to get the time (in seconds since midnight) that an image was taken:

    import exifread

    def getImageTime(filename):
        "Read EXIF data of given file and return time in seconds since midnight"
        # Open in binary mode; the with-block closes the file afterwards
        with open(filename, 'rb') as f:
            # details=False skips thumbnails and maker notes, which are not needed here
            tags = exifread.process_file(f, details=False)
        DateTime = tags["EXIF DateTimeOriginal"].printable
        # DateTime looks like: "2013:03:09 08:59:50"
        Time = DateTime.split()[-1]
        # Time looks like: "08:59:50"
        h, m, s = Time.split(":")
        # Return seconds since midnight: 32390
        return (int(h)*3600) + (int(m)*60) + int(s)

    s = getImageTime("image.jpg")
    print(s)
    

    After some more thought, the nearest-median approach won't work very well if one camera's clock is set, say, 20 minutes off from the others, since all its images from every sequence will tend to end up in the first or last directory. Needs some more thought...
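
    One possible alternative, sketched here and not tested on real data, is to skip the medians and group directories whose time ranges overlap, which is close to what the script in the question attempts. `group_by_overlap` and its `(path, start, end)` tuples are made-up names for illustration, with times in seconds since midnight. It copes with missing events, but it still assumes the camera clocks agree to within the gap between events.

```python
def group_by_overlap(dir_ranges):
    """Group directories whose [start, end] time ranges overlap.

    dir_ranges: iterable of (path, start, end), times in seconds since midnight.
    Returns a list of groups, each a list of paths, ordered by start time.
    """
    groups = []  # each entry: [group_start, group_end, [paths]]
    for path, start, end in sorted(dir_ranges, key=lambda r: (r[1], r[2])):
        if groups and start <= groups[-1][1]:
            # Overlaps the current group: extend its end and add the path
            groups[-1][1] = max(groups[-1][1], end)
            groups[-1][2].append(path)
        else:
            # Starts after the current group ended: open a new group
            groups.append([start, end, [path]])
    return [paths for _, _, paths in groups]


# A subset of the time ranges listed in the question
ranges = [
    ("dir1/sub1", 36305, 36595),  # 10:05:05 to 10:09:55
    ("dir1/sub2", 36675, 36942),  # 10:11:15 to 10:15:42
    ("dir2/sub1", 36306, 36597),  # 10:05:06 to 10:09:57
    ("dir2/sub2", 36675, 36940),  # 10:11:15 to 10:15:40
    ("dir2/sub5", 38415, 38620),  # 10:40:15 to 10:43:40
]
for number, paths in enumerate(group_by_overlap(ranges), start=1):
    print("group {}: {}".format(number, paths))
```

    Each resulting group would become one output event directory, with one sub-folder per source camera.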