Is there a way to do gender detection from a list of European names in R? Thanks in advance. As an example, I have this list of name and surname pairs:
namesurname <- c("Hassan Al-Khayr", "Flores Juberías Carlos", "Géza Lévai",
                 "Miklós Lipták", "László Péter", "László Váradi", "Sándor Molnár",
                 "Csaba Attila Nemes", "Zoltán Károly", "István Bajza")
The {genderizeR} package wraps calls to the genderize.io API.
The package picks candidate given names (not surnames) out of each text string, and genderize.io matches them against gender values collected from a large volume of social media profile data, so it is fairly robust for current naming conventions.
library(genderizeR)

namesurname <- c("Hassan Al-Khayr", "Flores Juberías Carlos", "Géza Lévai", "Miklós Lipták", "László Péter", "László Váradi", "Sándor Molnár", "Csaba Attila Nemes", "Zoltán Károly", "István Bajza")

# Query the genderize.io API for every term in the strings (needs internet access)
df_gender <- findGivenNames(x = namesurname, textPrepare = TRUE)

# Assign a gender to each original string using the terms found above
genderize(x = namesurname, genderDB = df_gender)
                      text givenName gender genderIndicators
 1:        Hassan Al-Khayr    hassan   male                3
 2: Flores Juberías Carlos    carlos   male                2
 3:             Géza Lévai      <NA>   <NA>                0
 4:          Miklós Lipták    miklós   male                1
 5:           László Péter    lászló   male                2
 6:          László Váradi    lászló   male                1
 7:          Sándor Molnár    molnár   male                2
 8:     Csaba Attila Nemes    attila   male                3
 9:          Zoltán Károly    zoltán   male                2
10:           István Bajza    istván   male                2
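Note that row 3 came back with no match. As a rough sketch of how you might pull out such unresolved names for manual review, the snippet below works on a small hand-copied subset of the output above (the data frame `res` and the variable `unresolved` are illustrative names, not part of genderizeR, and no API call is made):

```r
# Hand-copied subset of the output shown above; `res` is a hypothetical
# stand-in for the table returned by genderize().
res <- data.frame(
  text      = c("Hassan Al-Khayr", "Géza Lévai", "Sándor Molnár"),
  givenName = c("hassan", NA, "molnár"),
  gender    = c("male", NA, "male"),
  stringsAsFactors = FALSE
)

# Names for which no gender could be assigned
unresolved <- res[is.na(res$gender), "text"]
print(unresolved)
```

It is also worth eyeballing the givenName column: in row 7 the matched term is "molnár" rather than "sándor", so the gender there was derived from a different word than you might expect.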