Forward slash in Data retrieval using Web API and curl


I have a list of microRNA family names, some of which have the form "miR-10-5p". Others, however, contain a forward slash, e.g., "miR-1-3p/206". I want to use the Web API of the ENCORI (or starBase) database to obtain the ceRNA data for these miRNA families.

I used the following curl command in the shell to retrieve the data for a microRNA family without a slash in its name:

curl 'https://rnasysu.com/encori/api/ceRNA/?assembly=hg38&geneType=mRNA&ceRNA=all&miRNAnum=2&family=miR-10-5p&pval=0.01&fdr=0.01&pancancerNum=0' >ENCORI_hg38_ceRNA-network_all.txt

This command works properly.

However, for microRNA families with slashes in their names, the equivalent command does not work:

curl 'https://rnasysu.com/encori/api/ceRNA/?assembly=hg38&geneType=mRNA&ceRNA=all&miRNAnum=2&family=miR-1-3p/206&pval=0.01&fdr=0.01&pancancerNum=0' >ENCORI_hg38_ceRNA-network_all.txt

I tried replacing '/' with '%2F', but it still does not work:

curl 'https://rnasysu.com/encori/api/ceRNA/?assembly=hg38&geneType=mRNA&ceRNA=all&miRNAnum=2&family=miR-1-3p%2F206&pval=0.01&fdr=0.01&pancancerNum=0' >ENCORI_hg38_ceRNA-network_all.txt

How should I specify microRNA families with slashes in their names in the curl command?

Edit: While searching for a solution, I found the following question on Stack Overflow.

cURL wrap url containing token having forward slashes

Is it the same problem as mine? But then, how is URL encoding helpful here?


Solution

  • According to the table in hg38_all_fimaly.txt from the reference data, the miRfamilyID of miR-1-3p/206 is 2:

    miRfamilyID  miRfamily       miRNAnum  miRNAcat
    1            let-7-5p/98-5p  11        hsa-let-7a-5p,hsa-let-7b-5p,hsa-let-7c-5p,hsa-let-7d-5p,hsa-let-7e-5p,hsa-let-7f-5p,hsa-let-7g-5p,hsa-let-7i-5p,hsa-miR-4458,hsa-miR-4500,hsa-miR-98-5p
    2            miR-1-3p/206    3         hsa-miR-1-3p,hsa-miR-206,hsa-miR-613
    3            miR-10-5p       2         hsa-miR-10a-5p,hsa-miR-10b-5p
    4            miR-101-3p.1    1         hsa-miR-101-3p
    ...
    

    You can use that ID in your query instead of the family string:

    curl 'https://rnasysu.com/encori/api/ceRNA/?assembly=hg38&geneType=mRNA&ceRNA=all&miRNAnum=2&family=2&pval=0.01&fdr=0.01&pancancerNum=0' > ENCORI_hg38_ceRNA-network_all.txt
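
    If you have many families to query, the lookup can be scripted instead of reading the ID off the table by hand. The following is a minimal sketch, not part of the ENCORI documentation: the function names family_id and cerna_url are hypothetical, and it assumes the table is whitespace- or tab-delimited with a single header line, miRfamilyID in the first column, and miRfamily in the second, as in the excerpt above.

```shell
# Print the miRfamilyID for a family name.
# $1 = family name (e.g. 'miR-1-3p/206'), $2 = path to the family table.
# Assumes column 1 is miRfamilyID, column 2 is miRfamily, and line 1 is a header.
family_id() {
  awk -v fam="$1" 'NR > 1 && $2 == fam { print $1; exit }' "$2"
}

# Print the ceRNA query URL from the answer with the given family ID filled in.
# $1 = numeric family ID.
cerna_url() {
  printf 'https://rnasysu.com/encori/api/ceRNA/?assembly=hg38&geneType=mRNA&ceRNA=all&miRNAnum=2&family=%s&pval=0.01&fdr=0.01&pancancerNum=0\n' "$1"
}

# Example usage (commented out because it performs a network request):
# id=$(family_id 'miR-1-3p/206' hg38_all_fimaly.txt)
# curl "$(cerna_url "$id")" > ENCORI_hg38_ceRNA-network_all.txt
```

    Because the family name is only compared as a string inside awk and never placed in the URL, the slash no longer needs any escaping. If the name is not found, family_id prints nothing, so check that the ID is non-empty before calling curl.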