android speech-recognition voice

Android: Google Voice Recognition Server


I am currently playing around with Google's voice recognition API in the Android SDK. What I want to know is the URL of the server that performs the recognition. The feature requires a data connection, so the audio is evidently being processed server-side. Does anyone know this URL?


Solution

  • Okay, here's what I found, building off of this article.

    Here is a full TCP dump of the various HTTP and TCP operations that go into a voice search:

    POST /m/voice-search HTTP/1.1
    
    Content-Length: 117
    
    Content-Type: application/octet-stream
    
    Host: www.google.com
    
    Connection: Keep-Alive
    
    User-Agent: Mozilla/5.0 (Linux; U; Android 2.3.5; en-us; YP-G1) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2
    
    
    
    
    .....intent-speech-apiR\
    !VS 2.1.3 os=[Android 2.3.5 YP-G1]........>.. .".-16508905153536551372.en-US:..... .(.EZL..HTTP/1.1 200 OK
    
    Content-Type: application/x-protobuffer
    
    Date: Tue, 26 Jun 2012 13:07:17 GMT
    
    Expires: Tue, 26 Jun 2012 13:07:17 GMT
    
    Cache-Control: private, max-age=0
    
    X-Content-Type-Options: nosniff
    
    X-Frame-Options: SAMEORIGIN
    
    X-XSS-Protection: 1; mode=block
    
    Content-Length: 133
    
    Server: GSE
    
    
    
    
    7
     6d72f95e7d9d47135f894ddb26d6303d....intent-speech-api..RH
    .
    .74.125.142.126.....0ZQjyPfTdtfxMBro66d72f95e7d9d47135f894ddb26d6303dPOST /m/voice-search?multiproto=true HTTP/1.1
    
    Content-Length: 3271
    
    Content-Type: application/octet-stream
    
    Host: www.google.com
    
    Connection: Keep-Alive
    
    User-Agent: Mozilla/5.0 (Linux; U; Android 2.3.5; en-us; YP-G1) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2
    
    
    
    .s
    .....intent-speech-apiRZ
    !VS 2.1.3 os=[Android 2.3.5 YP-G1]........>.. .".-16508905153536551372.en-US:..... .EZL...d
    .....intent-speech-apibK
    .j.
    .free_form.............en-US(.0.:.com.rockwellcollins.voiceH.PVh.x......
    .....intent-speech-api....
    ..<4p(.X.....*.EaU..74-l.`..X.M...<..a>.y..ioZ~Gr...F2JDNH.8..@...<(r'..~
    a..(B....4..,XxE..T.+...<.>cN
    .p.....O.j.V.3]d5q....J...<${c;.j.A..K.e..Gv.$@...C.....f@<..Z..B..i..^T]....*.c.Q.Ak.k...<,}_..b.....f"..CI.....D..f>2...< n].W......W.^).6........l...A@<.[m...... . ..t.E.n.SC
    .D./`...< sx.tH.a...d#......U......[...@<.Co.........Ce.0,..
    ....}....:`<&p#......k...........?...!.... < U.......3:.H.L....<j....:<..n.<"n.>....i....c]..,.2.<O{.W.., .....
    .....intent-speech-api....
    ..<.{%....a..:..
    ..)......../.Mq.@<.[.6.A.a..z9|p.......$...D""...<.r]..2.a...\.J...:.*.y..-.j0..P<$[c..\....O......j$..U}........<Dd_......Z.....x....^.!.....{..<$.cD.....r.rc...xh....4:.......<..}.]).a..[.5..K
    ....d..=.=W..P<&..~.h...O..Qo..p~..'f!
    .n\.LN.<.._.
    .....
    ....3..D5.~..Ws~.L..<*Cx..j.ai.Z.`R..\......h.c.....<.d....H ...{zb..jo...L.o...$T.p<y.6.D@.Q.J.UY\.....]-.F.~.y....<C...5zQ.w.K
    ...x3k...-.]....o{.<U/S..........QY.a.........4.\......
    .....intent-speech-api....
    ..<,.c.......*.,......8....H.N."..<k.l.*{...k.....I$..?.B.....iQU.<%n$.m..t...2..Z......vT......{.<)...o..p.d<.../....`.H.v..=..~.<?%....R....Y. ..o..K.f.B.i|K...<Y.......T./h..z.g.@.......s\~. <a%..0R....?...........p..7..=Xp</...P^...
    .Y}......z.........B.<>..........s2.....*.6 ..%.!Q...<.._....!.._..........Y.....}...<.edG.&..s..j....j.......-......<..X.AQ...=.4.9...k..r...Bx..f2P<F['.u........J............/... <.3S..r..........,.V.%...]....O.....
    .....intent-speech-api....
    ..<(}q..C/P.9.......zm.6s...)..]4P<..(..P......N
    .6...c......k..".<..x...q..>.....QI$[~...>..Q.L0.<..c......<..*..I.`........Jf..@<.v#F.......0R..BZ.I'..-...q..IP<..dN.tP!_Z...0.+a.j.29Yc..g...`<C.{D......Z2......7.l.2.F0....@<D._....a.*........ +..5...N.yFP<D.d......}:.].-.%e......|-.p...<1
    ..^.1..... =....E.U.z...g...@<..g..l..._/z...SH.A)]q....../..<&.i.\..a.{..{...../....w...B3..<..g........O*..<c...y...YE)... <$....p.A........)..OP....Y;.s.P....
    .....intent-speech-api....
    ..<B}h....!.4.j(L.e.......F,..3#. <....
    N
    A...V........\V[.da....0<Deg......J.]Z..t...>@.$./...o
    .< ..N.Z%..J.....x(wJ.l.5Y.=.oTSP<&ra..`....J.[....?.i$..d......p< s.......d.-..d...=.....
    L..j.`<D~f.....y.J......B..!...c..D.c.<..d?V......1.%.........u..}..K.<L.x..2...i:.=q6.].`.9.{..A..".p<.zWGVt.A..O!R....L.w.O.(8....Y.<DgcGT.
    ..n.c.s...
    ...}.C..*.rf < {t>.......8b.[..\."....<@..B?.< ^$......,:Xl...%r..F..
    ..)O...< .#6.i)A...X.~Qc...TF.....[+.+.....
    .....intent-speech-api....
    ..<.nh.....h....h".......|wd..n.l.<(mk,.P.A....,...*.57..EG....p.0<....V............[ ..?]....lWQ@<0[q.VW.a.....n.....m.t..6..`...<.~y..r...d........}.....:...v..<.zb..c...-.2S@F.....oa.H.m..dD.<Rd_....A...%kr..d.g.8.m.. .... <.egF.......[.S...N..W.........`< |#..........'..4.....(........<(_i&C:C.........MK..kd@}Jh..~..<.z...(.!y.
    5:m^..r..........AW.<.OkF..4..i:.._0.....A._........<E.y..{. .*......M....=..!Q.A...<.Lg.^H.@...M7...d8.6.A3....q'.....~
    .....intent-speech-api..d
    `<0z]>. ......`]...7Vp.%...K..V_@<.Mg>........D.t...N........8/..<(......A.9j..r....7.....-9.4.......
    .....intent-speech-api...
    ...HTTP/1.1 200 OK
    
    Content-Type: application/x-protobuffer
    
    Date: Tue, 26 Jun 2012 13:07:22 GMT
    
    Expires: Tue, 26 Jun 2012 13:07:22 GMT
    
    Cache-Control: private, max-age=0
    
    X-Content-Type-Options: nosniff
    
    X-Frame-Options: SAMEORIGIN
    
    X-XSS-Protection: 1; mode=block
    
    Content-Length: 281
    
    Server: GSE
    
    
    
    .=
    7
     0571f20e9244116a243b05ee852a7831....intent-speech-api..R...
    .....intent-speech-api..b..
    ......
    .testing.....D..?..
    .test...........
    .testting...........
    .texting...........
    
    test email.........."0571f20e9244116a243b05ee852a7831-1.,
    &
     0571f20e9244116a243b05ee852a7831......Z.POST /m/voice-search?multiproto=true HTTP/1.1
    
    Content-Length: 257
    
    Content-Type: application/octet-stream
    
    Host: www.google.com
    
    Connection: Keep-Alive
    
    User-Agent: Mozilla/5.0 (Linux; U; Android 2.3.5; en-us; YP-G1) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2
    
    
    
    .n
    .....voice-searchRZ
    !VS 2.1.3 os=[Android 2.3.5 YP-G1]........>.. .".-16508905153536551372.en-US:..... .EZL...B
    5
     6d72f95e7d9d47135f894ddb26d6303d..intent-speech-api.......Z. ..K
    7
     0571f20e9244116a243b05ee852a7831....intent-speech-api....
    ..0..R...Z. .HTTP/1.1 200 OK
    
    Content-Type: application/x-protobuffer
    
    Date: Tue, 26 Jun 2012 13:07:22 GMT
    
    Expires: Tue, 26 Jun 2012 13:07:22 GMT
    
    Cache-Control: private, max-age=0
    
    X-Content-Type-Options: nosniff
    
    X-Frame-Options: SAMEORIGIN
    
    X-XSS-Protection: 1; mode=block
    
    Content-Length: 230
    
    Server: GSE
    
    
    
    .8
    2
     7a4a9198ae0fd36445660bf963609842....voice-search..R..<
    5
     6d72f95e7d9d47135f894ddb26d6303d..intent-speech-api......>
    7
     0571f20e9244116a243b05ee852a7831....intent-speech-api......,
    &
     7a4a9198ae0fd36445660bf963609842......Z. 
    

    It's a lot to read, but basically what happens is that the Android device sends voice data over TCP to 74.125.142.126 on port 19294 (in my case, anyway). After that, a POST request is made to 74.125.225.176. Shortly thereafter, the Google server (at the same IP) responds with a list of candidate transcriptions, seen here as:

    .testing.....D..?..
    .test...........
    .testting...........
    .texting...........
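
Candidate strings like these can be pulled out of a binary response body without decoding the protobuf schema, in the same way the `strings` utility does. The following is only a minimal sketch: `printable_runs` is a hypothetical helper, and the `sample` bytes are made up to resemble the response above, not the real wire format.

```python
import re


def printable_runs(payload: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # [ -~] covers the printable ASCII range (0x20 through 0x7e).
    pattern = re.compile(b"[ -~]{" + str(min_len).encode("ascii") + b",}")
    return [run.decode("ascii") for run in pattern.findall(payload)]


# Made-up payload for illustration only: text fragments separated by
# non-printable bytes, loosely modelled on the captured response.
sample = (
    b"\x12\x07testing\x1d\x00\x00\x80\x3f"
    b"\x12\x04test"
    b"\x12\x08testting"
    b"\x12\x07texting"
)

for candidate in printable_runs(sample):
    print(candidate)
```

This approach only surfaces readable text. Confidence scores and other numeric fields in the response would need the actual message definition, which is not known here.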
    

    The two URLs that seem to be involved are www.google.com/m/appreq/vs and www.google.com/m/voice-search.
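
To list the endpoints in a capture like the one above without reading it by hand, the HTTP request lines and their `Host` headers can be matched with a regular expression. This is a rough sketch that assumes the capture is available as raw bytes; `request_targets` and the `capture` sample are hypothetical, and the sample reproduces only the request lines and headers seen in the dump.

```python
import re

# A request line such as "POST /m/voice-search HTTP/1.1".
REQUEST_LINE = re.compile(rb"(GET|POST) (\S+) HTTP/1\.[01]\r?\n")
HOST_HEADER = re.compile(rb"Host: (\S+)")


def request_targets(capture: bytes) -> list[tuple[str, str, str]]:
    """Return (method, host, path) for each HTTP request found in the capture."""
    targets = []
    for match in REQUEST_LINE.finditer(capture):
        # Take the first Host header that follows this request line.
        host = HOST_HEADER.search(capture, match.end())
        host_name = host.group(1).decode("ascii") if host else "<unknown host>"
        method = match.group(1).decode("ascii")
        path = match.group(2).decode("ascii")
        targets.append((method, host_name, path))
    return targets


# Abbreviated stand-in for the capture; the bodies are placeholder bytes.
capture = (
    b"POST /m/voice-search HTTP/1.1\r\n"
    b"Content-Length: 117\r\n"
    b"Host: www.google.com\r\n\r\n"
    b"\x00\x01placeholder"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/x-protobuffer\r\n\r\n"
    b"\x00\x02placeholder"
    b"POST /m/voice-search?multiproto=true HTTP/1.1\r\n"
    b"Host: www.google.com\r\n\r\n"
)

for method, host, path in request_targets(capture):
    print(method, host + path)
```

Responses are skipped because a status line such as `HTTP/1.1 200 OK` has no method in front of it.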

    That seems to be the gist of it: basic HTTP (GET/POST) requests. I'm not sure how port 19294 fits into things, but I think it carries the original transmission of voice data, which is then processed and returned through HTTP. I hope this helps anyone who stumbles across the same problem.