tensorflow hexagon-dsp

Shell hangs when the standalone graph_app in Hexagon nnlib is run


The shell stopped responding when I issued this command:

/data/local/graph_app --flag 299 299 3 1 0 0 1 NULL 0 1 0 inputfile /data/local/tmp/img_299x299.bmp

got description info from helper optargs

Usage: testapp [--flag flagopt] [inputfile [inputfile...]]
           flag name      type   default  function
              height       int         0  Height of the input data. 0 == autodetect-square
               width       int         0  Width of the input data. 0 == autodetect-square
               depth       int         3  Depth of the input data
               iters       int         1  Number of times to run each input
            perfdump       int         0  Generate performance dump
                 pmu       int         0  Get Performance Monitor Unit information
         elementsize       int         1  Element Size (uint8==1,float==4)
       layer_reorder    string      NULL  Reorder depth layers. ("210" changes RGB to BGR)
       pprint_floats       int         0  Pretty-Print output as floats
     pprint_imagenet       int         1  Pretty-print output, getting top 5 values and use imagenet categories
               debug       int         0  Debug verbosity level. Higher numbers get more verbosity

Did I miss anything? Please let me know. I used graphinit_med.c just to check that it works, but there is no description of what this model does.

Thanks,


Solution

  • There is no documentation on using the standalone graph_app. After going through the code, I got it working with:

    data/hvx_tf/graph_app --height 299 --width 299 --depth 3 --iters 1 --perfdump 0 --pmu 0 --elementsize 1 --pprint_floats 0 --pprint_imagenet 1 --debug 0 /data/local/tmp/keyboard_299x299.dat
    
    >> Generate *.dat from *.jpg using `./scripts/imagedump.py`
    

    There is still a caveat, as you can see in the logs below:

    return value from dspCV_initQ6() : 0 
    const node 1000b success
    const node 1000c success
    const node 1000d success
    const node 1000e success
    const node 1000f success
    const node 10010 success
    const node 10011 success
    const node 10012 success
    const node 10250 success
    nn @ fc72cf80: id=0x0 debug_level=0
    node @ fc733970: id=0x1000b type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733a20: id=0x1000c type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733a70: id=0x1000d type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733b20: id=0x1000e type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733c20: id=0x1000f type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733c70: id=0x10010 type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733d30: id=0x10011 type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733e20: id=0x10012 type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733e70: id=0x10250 type=0x3(Const) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733ec0: id=0x1024a type=0x0(INPUT) n_inputs=0 n_outputs=1 padding=0(WHATEVER)
    node @ fc733f60: id=0x1024b type=0xe(Flatten) n_inputs=2 n_outputs=1 padding=0(WHATEVER)
    node @ fc734040: id=0x1024c type=0x29(Min_f) n_inputs=2 n_outputs=1 padding=0(WHATEVER)
    node @ fc734120: id=0x1024d type=0x2b(Max_f) n_inputs=2 n_outputs=1 padding=0(WHATEVER)
    node @ fc734200: id=0x1024e type=0x2d(Quantize) n_inputs=3 n_outputs=3 padding=0(WHATEVER)
    node @ fc734350: id=0x1024f type=0xf(QuantizedConv2d_8x8to32) n_inputs=7 n_outputs=3 padding=2(VALID)
    node @ fc7344d0: id=0x10251 type=0x13(QuantizeDownAndShrinkRange_32to8) n_inputs=3 n_outputs=3 padding=0(WHATEVER)
    node @ fc734620: id=0x10252 type=0x23(QuantizedBiasAdd_8p8to32) n_inputs=6 n_outputs=3 padding=0(WHATEVER)
    node @ fc734790: id=0x10253 type=0x13(QuantizeDownAndShrinkRange_32to8) n_inputs=3 n_outputs=3 padding=0(WHATEVER)
    node @ fc7348e0: id=0x10254 type=0x15(QuantizedRelu_8) n_inputs=3 n_outputs=3 padding=0(WHATEVER)
    node @ fc734a30: id=0x10442 type=0x2f(Dequantize) n_inputs=3 n_outputs=1 padding=0(WHATEVER)
    node @ fc734b20: id=0x1044d type=0x1(OUTPUT) n_inputs=1 n_outputs=0 padding=0(WHATEVER)
    21 nodes total.
    Init graph done.Prepare fc72cf80 success!
    Using </data/local/tmp/keyboard_299x299.dat>
    filesize=268203 elementsize=1 height=299 width=299 depth=3
    Run!
    sum=37845659
    Executing!
    **execute got err: -1**
    hexagon/ops/src/op_output.c:58:output 0 too small
    output size=4096
    Rank,Softmax,index,string
    0,303036629674309094288042513882152960.000000,575,pick
    1,303036292954618408664607741320364032.000000,461,terrapin
    2,303036292954618408664607741320364032.000000,445,electric ray
    3,79327539388858010780491752432205824.000000,833,bulletproof vest
    4,78902425827607254052570293577187328.000000,936,volleyball
    AppReported: 4294967296
    

    I will update the answer once I get the standalone app to predict the sample image with the highest accuracy.
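
The working invocation passes every option as a named `--name value` pair, and the log reports filesize=268203 for a 299x299x3 input with elementsize=1. Both points can be sketched in Python. This is only a minimal sketch: the helper names `expected_dat_size` and `build_graph_app_args` are hypothetical, and it assumes the .dat file holds raw, headerless tensor data, which the file size in the log suggests but which I have not confirmed.

```python
def expected_dat_size(height, width, depth, elementsize=1):
    """Bytes needed for a raw input of this shape.

    Assumption: the .dat file is headerless, so its size is exactly
    height * width * depth * elementsize.
    """
    return height * width * depth * elementsize


def build_graph_app_args(binary, flags, inputfile):
    """Build an argument list of the form
    [binary, "--name", "value", ..., inputfile],
    following the "[--flag flagopt] [inputfile ...]" usage text.
    """
    args = [binary]
    for name, value in flags.items():
        args += ["--" + name, str(value)]
    args.append(inputfile)
    return args


if __name__ == "__main__":
    # Values taken from the command and log in the answer above.
    flags = {"height": 299, "width": 299, "depth": 3, "elementsize": 1}
    print(expected_dat_size(299, 299, 3, 1))
    print(" ".join(build_graph_app_args(
        "data/hvx_tf/graph_app", flags,
        "/data/local/tmp/keyboard_299x299.dat")))
```

Comparing `expected_dat_size(...)` with the size of the generated file before pushing it to the device is a cheap way to confirm that the height, width, depth and elementsize flags match the input data.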