Search code examples
mysqlsubstringfilenames

MySQL search for duplicate filenames with different extensions


I have a table with ,among other columns, a column called filename, which contains file names with their extension.

I need to find all filenames that have been inserted into the table with the same filename but with different extensions. For example I'm looking for files like

01_ceremony_10021.flv
01_ceremony_10021.mp4

I need the list of files that exist in both mp4 and flv formats.

I've tried several things like :

SELECT 
    archivo, 
    substr(archivo,1, locate(".flv", archivo)-1) as NomSinExt 
FROM videoscampus 
where (substr(archivo, -4) = '.flv' || substr(archivo, -4) = '.mp4' )
order by archivo asc

But that just gives me the whole list of videos regardless.

The structure of the table is as follows:

Column      Type
-----------------------------
id          int(11) Auto Incr    
filename    varchar(250)     
deleted     int(1)   
dateStart   datetime     
dateEnd     datetime    

And some examples of filenames that I need to detect:

agai_cpal_t6_cristinacoca.mp4    
agai_cpal_t7_ent_ricardomartino_pacoarango.mp4   
agai_cpal_t8_alejandrogonzalez.mp4   
agai_cpal_t9_ent_ricardomartino_barbaradefranceschi.mp4  
AHE_INF_MMijanco_T3_Macronutrientes.flv <---- DUPLICATE I'm looking for
AHE_INF_MMijanco_T3_Macronutrientes.mp4 <---- DUPLICATE I'm looking for
ahe_inf_prim_t10_maitemijancos.mp4   
ahe_inf_t1_maitemijancos.mp4     
ahe_inf_t10_maitemijancos.mp4    
ahe_inf_t10.1_maitemijancos.mp4  
ahe_inf_t10.2_maitemijancos.mp4  
ahe_inf_t11_maitemijancos.mp4    
ahe_inf_t12_javiertejedor.mp4

Thanks in advance for your time and help.

-Orallo


Solution

  • Give this a try. It splits the table into the mp4 and flv rows, then compares for matches on the first part of each filename.

    SELECT
        SUBSTR(a.`filename`,1,LOCATE('.flv',a.`filename`)) as `base_name`,
        a.`filename` as `flv_file`,
        b.`filename` as `mp4_file`
    FROM `videoscampus` a
    JOIN `videoscampus` b
    ON b.`filename` LIKE '%.mp4'
        AND SUBSTR(b.`filename`,1,LOCATE('.mp4',b.`filename`)) = SUBSTR(a.`filename`,1,LOCATE('.flv',a.`filename`))
    WHERE a.`filename` LIKE '%.flv'
        AND a.`deleted` = 0
        AND b.`deleted` = 0