It's a work in medical records. Goal is computing average value in days between two medical consultations, per patient, per care-unit, per year. I'm stuck with big records : for small units with less than 50 patients / 200 consultations, the below HQL query (for one care-unit/one year) is functional and relatively quick, but for greater medical activity, there is a "combinatory explosion" with a heavy load on database ... And my wish is to analyze 10 years for some 80 care-units... in one launch. If you have any advice I would be very grateful!
SELECT
HB3 patient.pati_nip AS NIPP,
UPPER(cufm.cufm_libelle) AS CAT_UFM,
grp.unfo_libelle AS SECTEUR_DISP,
uf_ex.codeLibelle AS UNITE,
COUNT(DISTINCT raa.id) AS RAA,
COUNT(DISTINCT patient.id) AS PATIENTS,
ROUND(AVG(raa2.traa_date-raa.traa_date),1) AS DELAIMOY_J_INTER_RAA
FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN patient.pms_edgars AS redg2
JOIN redg2.bas_uf AS uf_ex2
JOIN redg2.pms_edgar_actes AS acte2
JOIN acte2.bas_catalogue_gen_by_Edgr_id_cage_nature AS type2
JOIN acte2.pms_raas as raa2
JOIN uf_ex.bas_etablissement AS etab
JOIN uf_ex.bas_uf_by_Unfo_id_unfo_grp as grp
JOIN uf_ex.bas_categorie_ufm AS cufm
WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: Actes exportés effectués entre le ) AND INVITE(D: et le )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: File active+nouveaux patients pour cette UF exécutante)
AND raa.traa_dat_export IS NOT NULL
AND raa2.traa_date = (SELECT MIN(raa3.traa_date)
FROM patient.pms_edgars AS redg3
JOIN redg3.bas_uf AS uf_ex3
JOIN redg3.pms_edgar_actes AS acte3
JOIN acte3.bas_catalogue_gen_by_Edgr_id_cage_nature AS type3
JOIN acte3.pms_raas as raa3
WHERE raa3.traa_dat_export IS NOT NULL
AND raa3.traa_date > raa.traa_date
AND uf_ex3.id = uf_ex
AND type3.cage_code NOT LIKE 'R%')
ORDER BY
patient.pati_nip, UPPER(cufm.cufm_libelle), grp.unfo_libelle, uf_ex.codeLibelle
https://stackoverflow.com/users/1766831/rick-james, here is the minimal query, with no delta computing, no "agregate" functions
SELECT
HB3 patient.id AS PATI_ID,
uf_ex.codeLibelle AS UNITE,
raa.traa_date AS DATE_CONSULT_DATE
FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN uf_ex.bas_etablissement AS etab
WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: consultations between ) AND INVITE(D: and )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: consultations done in this care-unit)
AND raa.traa_dat_export IS NOT NULL
ORDER BY
GROUP BY uf_ex.codeLibelle, patient.id, raa.traa_date
=> First letter of type.cage_code means "type of consultations" IN ('E','D','G','A','R'), and 'R' is excluded because patient is not present (meeting of the medical team)
=> goal is computing, for all consultations (except R) of a same patient, the delta betwen two contiguous consultations in a time interval. Date Format for raa.traa_date includes hours,minutes,seconds.
=> uf_ex.id is the ID of the medical care-unit for the actual consultation
Step 1. CREATE TABLE tbl
with pati_id
, unite
, and consult_date
. Also, have a 4th column that is AUTO_INCREMENT PRIMARY KEY
; let's call it id
. (If you are using 8.0 or 10.2, use a "CTE" and WITH
.)
Step 2. Use the above 'minimal' query to populate the 3 columns, letting id
populate itself. Be sure to include ORDER BY pati_id, consult_date
. (and maybe unite
?)
Step 3. ALTER TABLE tbl ADD INDEX(pati_id, id)
Step 4. Do a self-join of that table with itself, but offset the id:
SELECT pati_id,
DATEDIFF(t2.consult_date, t1.consult_date) AS gap
FROM tbl AS t1
JOIN tbl AS t2 ON t2.pati_id = t1.pati_id
AND t2.id = t1.id + 1
(I leave it to you to decide how UNITE
fits in.)