I have stored some hierarchical data of categories where each category is related to others, the trick is a single category can have multiple parents (Maximum 3, Minumum 0).
The table structures are:
category table id - Primary Key
name - Name of the Category
ref_id - Reference ID that is being used for relationship
id | name | ref_id |
---|---|---|
1 | everything | -1 |
2 | computing | 0 |
3 | artificial intelligence | 1 |
4 | data science | 2 |
5 | machine learning (ML) | 3 |
6 | programming | 4 |
7 | web technologies | 5 |
8 | programming languages | 7 |
9 | content technologies | 8 |
10 | operating systems | 9 |
11 | algorithms | 10 |
12 | software development systems | 102 |
category_relation table
id | child_ref_id | parent_ref_id |
---|---|---|
1 | 0 | -1 |
2 | 1 | 0 |
3 | 2 | 0 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 102 |
7 | 5 | 0 |
8 | 7 | 4 |
9 | 8 | 0 |
10 | 9 | 0 |
11 | 10 | 0 |
12 | 10 | 4 |
13 | 102 | 0 |
as you can see in the diagram, the relationship is pretty complicated, algorithms has two parents computing and programming, similarly machine learning (ML) also has two parents artificial intelligence and data science
How can I retrieve all the children of a specific category, e.g. computing, I need to retrieve all the children till the third level, i.e. programming languages and algorithms.
MySQL dump of the database: https://github.com/codersrb/multi-parent-hierarchy/blob/main/taxonomy.sql
Assuming the data structure is fixed with a good PK, in MySQL 8.x you can do:
with recursive
n (id, name, ref_id, lvl) as (
select id, name, ref_id, 1 from category where id = 2 -- starting node
union all
select c.id, c.name, c.ref_id, n.lvl + 1
from n
join category_relation r on r.parent_ref_id = n.ref_id
join category c on c.ref_id = r.child_ref_id
)
select * from n where lvl <= 3
Result:
id name ref_id lvl
---- --------------------------------------- ------- ---
2 computing 0 1
3 artificial intelligence 1 2
4 data science 2 2
7 web technologies 5 2
9 content technologies 8 2
10 operating systems 9 2
11 algorithms 10 2
62 information science 61 2
103 software / systems development 102 2
165 scientific computing 165 2
296 image processing 316 2
297 text processing 317 2
301 Google 321 2
322 computer vision 343 2
5 machine learning (ML) 3 3
5 machine learning (ML) 3 3
6 programming 4 3
18 models 17 3
21 classification 20 3
27 data preparation 26 3
28 data analysis 27 3
29 imbalanced datasets 28 3
50 visualization 49 3
61 information retrieval 60 3
68 k-means 67 3
71 Random Forest algorithm 70 3
104 project management 103 3
105 software development methodologies 104 3
107 web development 106 3
113 kNN model 112 3
132 CRISP-DM methodology 131 3
143 data 142 3
153 SMOTE 153 3
154 MSMOTE 154 3
157 backward feature elimination 157 3
158 forward feature selection 158 3
176 deep feature synthesis (DFS) 177 3
196 unsupervised learning 197 3
210 mean-shift 211 3
212 DBSCAN 213 3
246 naïve Bayes algorithm 247 3
248 decision tree algorithm 249 3
249 support vector machine (SVM) algorithm 250 3
251 neural networks 252 3
252 artificial neural networks (ANN) 253 3
281 deep learning 300 3
281 deep learning 300 3
285 image classification 304 3
285 image classification 304 3
286 natural language processing (NLP) 305 3
286 natural language processing (NLP) 305 3
288 text representation 307 3
294 visual recognition 314 3
295 optical character recognition (OCR) 315 3
295 optical character recognition (OCR) 315 3
296 image processing 316 3
298 machine translation (MT) 318 3
299 speech recognition 319 3
300 TensorFlow 320 3
302 R 322 3
304 Android 324 3
322 computer vision 343 3
323 object detection 344 3
324 instance segmentation 345 3
325 edge detection 346 3
326 image filters 347 3
327 feature maps 348 3
328 stride 349 3
329 padding 350 3
335 text preprocessing 356 3
336 tokenization 357 3
337 case normalization 358 3
338 removing punctuation 359 3
339 stop words 360 3
340 stemming 361 3
341 lemmatization 362 3
342 Porter algorithm 363 3
350 word2vec 371 3
351 Skip-gram 372 3
364 convnets 385 3
404 multiplicative update algorithm 716 3
If you want to remove duplicates you can use DISTINCT
. For example:
with recursive
n (id, name, ref_id, lvl) as (
select id, name, ref_id, 1 from category where id = 2 -- starting node
union all
select c.id, c.name, c.ref_id, n.lvl + 1
from n
join category_relation r on r.parent_ref_id = n.ref_id
join category c on c.ref_id = r.child_ref_id
)
select distinct * from n where lvl <= 3
See running example at DB Fiddle.