google-custom-search

Searching an entire attribute in PageMap structured data with Google CSE for filtering


I am having trouble searching the data in a PageMap I have set up. The PageMap is returned correctly when the containing page is a result, but I can only search the first ten words of an attribute such as this one:

<Attribute name="description">The smash is the most
    explosive and aggressive stroke in Badminton. Elite athletes can
    generate shuttlecock velocities of up to 370 km/h. To perform the
    stroke, one must understand the biomechanics involved, from the body
    positioning to the wrist flexion. </Attribute>

Searching for "smash" (more:pagemap:document-description:smash) matches and returns the page, but searching for "badminton" does not. The Structured Data Testing Tool shows that each space creates a new data point and that the number of data points is limited to ten:

more:pagemap:document-description
more:pagemap:document-description:aggressive
more:pagemap:document-description:and
more:pagemap:document-description:explosive
more:pagemap:document-description:in
more:pagemap:document-description:is
more:pagemap:document-description:most
more:pagemap:document-description:smash
more:pagemap:document-description:stroke
more:pagemap:document-description:the
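To make the cutoff concrete, here is a minimal Python sketch that approximates the behaviour shown in the list above. It is only a model based on what the testing tool displays, not Google's actual tokenizer: the function name `pagemap_tokens`, the regular expression, and the "first ten words" rule are all my assumptions.

```python
import re

def pagemap_tokens(value, limit=10):
    """Approximate the observed tokenization (assumed, not Google's real code):
    lowercase the value, split it into words, keep only the first `limit`
    words, then drop duplicates."""
    words = re.findall(r"[a-z0-9]+", value.lower())
    return sorted(set(words[:limit]))

description = (
    "The smash is the most explosive and aggressive stroke in Badminton. "
    "Elite athletes can generate shuttlecock velocities of up to 370 km/h."
)

tokens = pagemap_tokens(description)

# Each token corresponds to one filter of the form shown in the list above.
for token in tokens:
    print(f"more:pagemap:document-description:{token}")

print("smash" in tokens)      # the 2nd word, inside the limit
print("badminton" in tokens)  # the 11th word, outside the limit
```

Under this model "the" appears twice in the first ten words, which is why only nine distinct word filters show up next to the bare `more:pagemap:document-description` entry.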

I need to be able to filter through more than ten words in each attribute. Is there a way to get around this limit or am I going about filtering the wrong way?


Solution

  • According to Google, you only get ten tokens per attribute for filtering. I have not been able to get around it.

    To be more specific about my original problem: I was storing multiple doctor-page paths in a clinic-page, and they were being tokenized on every /. I solved it by searching for the clinic-page in a doctor-page label, rather than reading the doctor-pages from my clinic-page search result. I used repeated attributes to make the values searchable, like this:

    <!-- inside clinic-page -->
    <Attribute name="doctor">path/to/doc1</Attribute>
    <Attribute name="doctor">path/to/doc2</Attribute>
    ...
    

    However, that does not let you read every doctor-page from a clinic-page result; it just happened to work in my case. Google definitely restricts its tokens to ten per attribute.
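As a rough illustration of why repeated attributes helped with the paths, the Python sketch below compares one combined attribute against one attribute per path. It assumes the ten-token limit applies to each attribute separately and that every / splits a token, as described in the answer. The helper `tokens` is hypothetical, and the `doc3` and `doc4` paths are made up for the example.

```python
import re

TOKEN_LIMIT = 10  # per-attribute limit reported in the answer above

def tokens(value, limit=TOKEN_LIMIT):
    """Assumed model: split on anything that is not a letter or digit
    (so '/' separates tokens) and keep only the first `limit` pieces."""
    return re.findall(r"[a-z0-9]+", value.lower())[:limit]

# doc1 and doc2 come from the answer; doc3 and doc4 are hypothetical.
paths = ["path/to/doc1", "path/to/doc2", "path/to/doc3", "path/to/doc4"]

# One attribute holding every path: 12 pieces, so the last ones are cut off.
combined = tokens(" ".join(paths))

# One attribute per path: each value has only 3 pieces, well under the limit.
separate = [tokens(p) for p in paths]

print("doc4" in combined)
print(any("doc4" in t for t in separate))
```

In this model each path costs three tokens, so a single attribute runs out of room partway through the fourth path, while one attribute per path never gets close to the limit.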