
Display all variants


I have a 2 GB VCF file of DNA data and I am trying to use vcf_to_zarr() to print out all the variants with all fixed fields, but I am getting the error KeyError: 'variants/*'.

allel.vcf_to_zarr

import allel
import numcodecs
import zarr

def readVcf():

    allel.vcf_to_zarr('actual.vcf', 'example.zarr', fields='*', overwrite=True)
    callset = zarr.open_group('example.zarr', mode='r')
    # The next line raises KeyError: 'variants/*'
    allfield = callset['variants/*']

    for a in allfield:
        print(a)


Solution

  • To iterate over all the fields in the variants group, do:

    for a in callset['variants']:
        print(a)
    

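    Zarr does not expand wildcards ('*') in hierarchy paths, so callset['variants/*'] is looked up as a member literally named '*' inside 'variants'. No such member exists, hence the KeyError.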
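
Since the path lookup itself offers no wildcard matching, one option is to match the member names yourself. The sketch below is only an illustration, not part of allel or Zarr: select_fields is a hypothetical helper, and it assumes the group exposes keys() and name-based indexing the way a Zarr group does. A plain dict with made-up values stands in for callset['variants'] so the snippet runs without a VCF file.

```python
from fnmatch import fnmatchcase


def select_fields(group, pattern="*"):
    """Return {name: values} for members of `group` whose name matches `pattern`.

    Hypothetical helper: `group` only needs keys() and indexing by name.
    """
    selected = {}
    for name in sorted(group.keys()):
        if fnmatchcase(name, pattern):
            # [:] reads the whole member; on a real Zarr array this loads
            # the full array into memory, which may be large for a 2 GB VCF.
            selected[name] = group[name][:]
    return selected


# Stand-in for callset['variants'] (assumption: made-up illustrative values).
variants = {
    "CHROM": ["1", "1", "2"],
    "POS": [101, 250, 77],
    "REF": ["A", "G", "T"],
}

# Emulates the intent of 'variants/*': every field, with its values.
for name, values in select_fields(variants).items():
    print(name, values)
```

With a real callset, the same call would be select_fields(callset['variants']), or select_fields(callset['variants'], 'P*') to restrict the output to fields whose names start with P.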