Search code examples
mysqlsqlhql

Is there a better/faster way to compute an average delta between dates?


It's a work in medical records. Goal is computing average value in days between two medical consultations, per patient, per care-unit, per year. I'm stuck with big records : for small units with less than 50 patients / 200 consultations, the below HQL query (for one care-unit/one year) is functional and relatively quick, but for greater medical activity, there is a "combinatory explosion" with a heavy load on database ... And my wish is to analyze 10 years for some 80 care-units... in one launch. If you have any advice I would be very grateful!

SELECT
HB3 patient.pati_nip AS NIPP,
UPPER(cufm.cufm_libelle) AS CAT_UFM,
grp.unfo_libelle AS SECTEUR_DISP,
uf_ex.codeLibelle AS UNITE,
COUNT(DISTINCT raa.id) AS RAA,
COUNT(DISTINCT patient.id) AS PATIENTS,
ROUND(AVG(raa2.traa_date-raa.traa_date),1) AS DELAIMOY_J_INTER_RAA

FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN patient.pms_edgars AS redg2
JOIN redg2.bas_uf AS uf_ex2
JOIN redg2.pms_edgar_actes AS acte2
JOIN acte2.bas_catalogue_gen_by_Edgr_id_cage_nature AS type2
JOIN acte2.pms_raas as raa2
JOIN uf_ex.bas_etablissement AS etab
JOIN uf_ex.bas_uf_by_Unfo_id_unfo_grp as grp
JOIN uf_ex.bas_categorie_ufm AS cufm

WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: Actes exportés effectués entre le ) AND INVITE(D: et le )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: File active+nouveaux patients pour cette UF exécutante)
AND raa.traa_dat_export IS NOT NULL
AND raa2.traa_date = (SELECT MIN(raa3.traa_date)
    FROM patient.pms_edgars AS redg3
    JOIN redg3.bas_uf AS uf_ex3
    JOIN redg3.pms_edgar_actes AS acte3
    JOIN acte3.bas_catalogue_gen_by_Edgr_id_cage_nature AS type3
    JOIN acte3.pms_raas as raa3
    WHERE raa3.traa_dat_export IS NOT NULL
    AND raa3.traa_date > raa.traa_date
    AND uf_ex3.id = uf_ex
    AND type3.cage_code NOT LIKE 'R%')

ORDER BY
patient.pati_nip, UPPER(cufm.cufm_libelle), grp.unfo_libelle, uf_ex.codeLibelle

https://stackoverflow.com/users/1766831/rick-james, here is the minimal query, with no delta computing, no "agregate" functions

SELECT
HB3 patient.id AS PATI_ID,
uf_ex.codeLibelle AS UNITE,
raa.traa_date AS DATE_CONSULT_DATE

FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN uf_ex.bas_etablissement AS etab

WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: consultations between ) AND INVITE(D: and )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: consultations done in this care-unit)
AND raa.traa_dat_export IS NOT NULL

ORDER BY
GROUP BY uf_ex.codeLibelle, patient.id, raa.traa_date

=> First letter of type.cage_code means "type of consultations" IN ('E','D','G','A','R'), and 'R' is excluded because patient is not present (meeting of the medical team)

=> goal is computing, for all consultations (except R) of a same patient, the delta betwen two contiguous consultations in a time interval. Date Format for raa.traa_date includes hours,minutes,seconds.

=> uf_ex.id is the ID of the medical care-unit for the actual consultation


Solution

  • Step 1. CREATE TABLE tbl with pati_id, unite, and consult_date. Also, have a 4th column that is AUTO_INCREMENT PRIMARY KEY; let's call it id. (If you are using 8.0 or 10.2, use a "CTE" and WITH.)

    Step 2. Use the above 'minimal' query to populate the 3 columns, letting id populate itself. Be sure to include ORDER BY pati_id, consult_date. (and maybe unite?)

    Step 3. ALTER TABLE tbl ADD INDEX(pati_id, id)

    Step 4. Do a self-join of that table with itself, but offset the id:

    SELECT  pati_id,
            DATEDIFF(t2.consult_date, t1.consult_date) AS gap
        FROM tbl AS t1
        JOIN tbl AS t2  ON t2.pati_id = t1.pati_id
                       AND t2.id = t1.id + 1
    

    (I leave it to you to decide how UNITE fits in.)