The Google Cloud Natural Language API can be used to analyse text and return a syntactic parse tree with each word labeled with parts-of-speech tags.
Is there a way to deturmine if a noun is plural or not?
If Google Cloud NL is able to work out the lemma then perhaps the information is there but not returned through the API?
Update
With the NL API's GA launch, the annotateText
endpoint now returns a number
key for each token indicating whether word is singular, plural, or dual. For the sentence "There are some cats here," the API returns the following token data for 'cats' (notice that number
is PLURAL
):
{
"text": {
"content": "cats",
"beginOffset": -1
},
"partOfSpeech": {
"tag": "NOUN",
"aspect": "ASPECT_UNKNOWN",
"case": "CASE_UNKNOWN",
"form": "FORM_UNKNOWN",
"gender": "GENDER_UNKNOWN",
"mood": "MOOD_UNKNOWN",
"number": "PLURAL",
"person": "PERSON_UNKNOWN",
"proper": "PROPER_UNKNOWN",
"reciprocity": "RECIPROCITY_UNKNOWN",
"tense": "TENSE_UNKNOWN",
"voice": "VOICE_UNKNOWN"
},
"dependencyEdge": {
"headTokenIndex": 1,
"label": "DOBJ"
},
"lemma": "cat"
}
See the full documentation here.