PapersWithCode API - retrieving all areas-tasks-subtasks taxonomy

I am looking for the complete PapersWithCode taxonomy: areas, tasks, and subtasks. PapersWithCode website: PapersWithCode API:

I already tried the PapersWithCode API. Here is a Python example of my request, which I hoped would let me build the area-task-subtask mapping:

import requests
area_id = 'computer-vision'
q = f'{area_id}/tasks/?page=1&items_per_page=500'
res = requests.get(q).json()


[{'id': 'aesthetics-quality-assessment',
  'name': 'Aesthetics Quality Assessment',
  'description': 'Automatic assessment of aesthetic-related subjective ratings.'},
 {'id': 'user-constrained-thumbnail-generation',
  'name': 'User Constrained Thumbnail Generation',
  'description': 'Thumbnail generation is the task of generating image thumbnails from an input image.\r\n\r\n<span style="color:grey; opacity: 0.6">( Image credit: [User Constrained Thumbnail Generation using Adaptive Convolutions]( )</span>'},
 {'id': 'sensor-fusion',
  'name': 'Sensor Fusion',
  'description': '**Sensor Fusion** is the broad category of combining various on-board sensors to produce better measurement estimates. These sensors are combined to compliment each other and overcome individual shortcomings.\r\n\r\n\r\n<span class="description-source">Source: [Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements ]( )</span>'},
 {'id': 'lip-sync-1',
  'name': 'Constrained Lip-synchronization',
  'description': 'This task deals with lip-syncing a video (or) an image to the desired target speech. Approaches in this task work only for a specific (limited set) of identities, languages, speech/voice. See also: Unconstrained lip-synchronization -'},
 {'id': 'online-multi-object-tracking',
  'name': 'Online Multi-Object Tracking',
  'description': 'The goal of **Online Multi-Object Tracking** is to estimate the spatio-temporal trajectories of multiple objects in an online video stream (i.e., the video is provided frame-by-frame), which is a fundamental problem for numerous real-time applications, such as video surveillance, autonomous driving, and robot navigation.\r\n\r\n\r\n<span class="description-source">Source: [A Hybrid Data Association Framework for Robust Online Multi-Object Tracking ]( )</span>'},
 {'id': 'cross-domain-few-shot',
  'name': 'Cross-Domain Few-Shot',
  'description': ''}, ...]

I checked the entire response and there is no information on whether each task has any parent or child task.


  • As pointed out by @JYL the main resource to use can be found at

    From there it is possible to retrieve Task-Subtask information from the "Evaluation Tables".

    I was able to reconstruct the tree with the following Python code:

    1. Loading the data

    ### retrieving all tasks hierarchy
    import pandas as pd
    import json
    import gzip
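    # the gzipped dump is read as bytes, decoded as UTF-8, then parsed as JSON
    with gzip.open('data/evaluation-tables.json.gz', 'r') as fin:
        eval_tables = json.loads(fin.read().decode('utf-8'))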

    2. Exploring the JSON

    def expand_tasks_tree(subtasks, parent, root, level):
        # recursively flatten the nested tasks into one row per parent-child pair
        global index_dict
        r_tmp = []
        for subtask in subtasks:
            task = subtask['task']
            # give a task seen for the first time the next free integer id
            if task not in index_dict:
                index_dict[task] = max(index_dict.values()) + 1
            r_tmp.append({'level': level, 'root': root, 'parent': parent, 'task': task,
                          'id': index_dict[task], 'parent_id': index_dict[parent]})
            try:
                r_tmp += expand_tasks_tree(subtask['subtasks'], task, root, level + 1)
            except KeyError:
                # a node is missing an expected key: print it for inspection
                print(subtask)
        return r_tmp

    index_dict = {'root': 0}
    # wrap each top-level task under its first category (the area), or 'uncategorized'
    eval_all = [{'task': item['categories'][0] if item['categories'] else 'uncategorized',
                 'subtasks': [item]} for item in eval_tables]
    res = expand_tasks_tree(eval_all, 'root', '', 0)

    3. Then parsing it into a DataFrame following a parent-child schema:


    This results in the following dataframe:

    Results example and sample structure
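
The code for the DataFrame step is not shown above. The following is a minimal sketch of what it could look like, run on made-up data that uses the same `task`, `categories` and `subtasks` keys as the snippets above. The helper name `flatten_tasks`, the toy entries, and the choice to store the area name in the `root` column are assumptions for illustration, not the answer's original code.

```python
import pandas as pd

# Hypothetical stand-in for the parsed evaluation tables (not real data).
eval_tables = [
    {"task": "Object Detection", "categories": ["Computer Vision"],
     "subtasks": [{"task": "Face Detection", "categories": [], "subtasks": []}]},
    {"task": "Image Classification", "categories": ["Computer Vision"], "subtasks": []},
    {"task": "Machine Translation", "categories": [], "subtasks": []},
]


def flatten_tasks(nodes, parent, root, level, ids, rows):
    """Depth-first walk that appends one row per parent-child pair."""
    for node in nodes:
        task = node["task"]
        ids.setdefault(task, len(ids))  # next free integer id for a new task
        # assumption: level-0 nodes are areas and act as the root of their subtree
        node_root = task if level == 0 else root
        rows.append({"level": level, "root": node_root, "parent": parent,
                     "task": task, "id": ids[task], "parent_id": ids[parent]})
        flatten_tasks(node.get("subtasks", []), task, node_root, level + 1, ids, rows)
    return rows


ids = {"root": 0}
# Wrap each top-level task under its first category, or 'uncategorized'.
wrapped = [{"task": item["categories"][0] if item["categories"] else "uncategorized",
            "subtasks": [item]} for item in eval_tables]
rows = flatten_tasks(wrapped, "root", "", 0, ids, [])

# An area is wrapped once per task, so its level-0 row repeats; drop the repeats.
df = pd.DataFrame(rows).drop_duplicates().reset_index(drop=True)

# Direct children of one node, looked up through the parent column.
cv_children = df.loc[df["parent"] == "Computer Vision", "task"].tolist()
print(df)
print(cv_children)
```

Each row carries both `id` and `parent_id`, so the tree can be rebuilt or queried with ordinary filters and joins on those two columns.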