I'm trying to plot the most frequent words, but because the language is Arabic the labels do not render in the correct form:
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(12, 10))
sns.barplot(x="word", y="freq", data=word_counter_df, palette="PuBuGn_d", ax=ax)
plt.show()
I have tried ast with decoding, but the words still do not display correctly in the plot:
import ast
fig, ax = plt.subplots(figsize=(12, 10))
sns.barplot(x="word", y="freq", data=word_counter_df.apply(ast.literal_eval).str.decode("utf-8"), palette="PuBuGn_d", ax=ax)
plt.show();
word_counter_df looks like this:
<class 'pandas.core.frame.DataFrame'>
word freq
0 الله 6829
1 علي 5636
2 ان 3732
3 اللهم 2575
4 انا 2436
5 صباح 2115
6 اللي 1792
7 الي 1709
8 والله 1645
9 الهلال 1520
10 الا 1394
11 الخير 1276
12 انت 1209
13 يارب 1089
14 يوم 1082
15 رتويت 1019
16 كان 1004
17 اذا 994
18 لله 982
19 اي 939
It returns an empty graph with this error:
ValueError: ('malformed node or string: 0 الله \n1 علي \n2 ان \n3 اللهم \n4 انا \n5 صباح \n6 اللي \n7
الي \n8 والله \n9 الهلال\n10 الا \n11 الخير \n12
انت \n13 يارب \n14 يوم \n15 رتويت \n16 كان \n17
اذا \n18 لله \n19 اي \nName: word, dtype: object', 'occurred at index word')
You can use pandas' built-in plot.bar function:
word_counter_df.plot.bar(x="word", y="freq")
plt.show()
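# If the Arabic letters show up disconnected or in reversed order,
# reshape them and reorder them for right-to-left display before plotting
import arabic_reshaper
from bidi.algorithm import get_display
word_counter_df['disp'] = word_counter_df.word.apply(arabic_reshaper.reshape).apply(get_display)
word_counter_df.plot.bar(x="disp", y="freq")
plt.show()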
The same approach works with seaborn (version 0.9.0).