Let's say I have a df like this.
Reproducible:
import pandas as pd
import io
TESTDATA="""All_services
All_services
Rehosting applications to AWS
Replacing flexible functionalities
Unaltered replatforming of underlying code structure, functionalities, features
Rebuilding broken applications/software segments
Optimize existing use of cloud(Cost saving)
Expand use of containers
Move on prem servers to Sass
Expanding public clouds
Implemenation CI/CD to clouds
Migration Evaluator
AWS Migration Hub
AWS Application Discovery Services
AWS Landing Zone
AWS Control Tower
AWS Management and Governance
AWS Database Migration Services
AWS Server Migration Service
AWS Database Migration Service
AWS Application Discovery Service
AWS Direct Connect
DB Migrations
open-source databases to AWS.
Oracle to Oracle
Oracle or Microsoft SQL Server to Amazon Aurora.
Migrating fileservers to Amazon S3
migrating commercial RDBMS or MySQL.
Optimize existing use of cloud(Cost saving)
Expand use of containers
Move on prem servers to Sass
Expanding public clouds
Implemenation CI/CD to clouds
Migration Evaluator
AWS Migration Hub
AWS Application Discovery Services
AWS Landing Zone
DB Migrations
Cloud Migration Planning
Replatforming Applications for Cloud
Cloud Application Development Services
From Monolith to Microservices
Cloud Infrastructure Automation
Implemenation CI/CD to clouds
DB Migrations
Optimize existing use of cloud(Cost saving)
Implemenation CI/CD to clouds
Migration Evaluator
AWS Migration Hub
AWS Application Discovery Services
AWS Direct Connect
DB Migrations
open-source databases to AWS.
Oracle to Oracle
Oracle or Microsoft SQL Server to Amazon Aurora.
Migrating fileservers to Amazon S3
Optimize existing use of cloud(Cost saving)
Amazon S3 Transfer Acceleration
AWS Snowball
AWS Direct Connect
EC2
AWS Server Migration Service
AWS Database Migration Service
VMWare Cloud on AWS
Optimize existing use of cloud(Cost saving)
Cloud Application Development Services
From Monolith to Microservices
Cloud Infrastructure Automation
Implemenation CI/CD to clouds
DB Migrations
Optimize existing use of cloud(Cost saving)
Implemenation CI/CD to clouds
Migration Evaluator
Optimize existing use of cloud(Cost saving)
AWS Application Discovery Services
AWS Direct Connect
DB Migrations
Rebuilding broken applications/software segments
Optimize existing use of cloud(Cost saving)
Expand use of containers
Move on prem servers to Sass
Expanding public clouds
Implemenation CI/CD to clouds
AWS Management and Governance
AWS Database Migration Services
AWS Server Migration Service
AWS Database Migration Service
AWS Application Discovery Service
AWS Direct Connect
DB Migrations
"""
df = pd.read_csv(io.StringIO(TESTDATA), sep=";")
df = df.replace(r"^ +| +$", r"", regex=True)
df.All_services.value_counts().sort_values().plot(kind='barh', figsize=(25, 15), linewidth=4)
I get a graph like this.
How can I add the percentage of each repeated string to the bar plot using pandas?
There are similar answers, but they use matplotlib together with pandas. I'm looking for a pandas-only solution, even if it needs some hard coding up front. If that's not achievable, I will go with matplotlib.
Similar threads with matplotlib
pandas matplotlib labels bars as percentage
It's not possible to do what you want with pandas alone. You have to use matplotlib:
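import matplotlib.pyplot as plt

# Count each service and express each count as a percentage of all rows
stats = (df['All_services'].value_counts(ascending=True).to_frame('count')
           .assign(pct=lambda x: x['count'].div(x['count'].sum()).mul(100)))
# Plot the counts, then label each bar with its percentage (bar_label needs matplotlib >= 3.4)
ax = stats['count'].plot(kind='barh', figsize=(25, 15), linewidth=4)
ax.bar_label(ax.containers[0], labels=stats['pct'].round(2).astype(str) + '%')
plt.tight_layout()
plt.show()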
Output:
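Axes.bar_label was added in matplotlib 3.4, so on an older install a fallback may be needed. Below is a minimal sketch of one way to do that, assuming you are fine with annotating each bar patch by hand with ax.annotate. The four-row sample frame and the Agg backend are stand-ins I picked so the snippet runs on its own; they are not part of the question's data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's column, kept tiny on purpose
df = pd.DataFrame(
    {"All_services": ["DB Migrations", "DB Migrations", "AWS Direct Connect", "EC2"]}
)

# Counts sorted ascending, and the same counts as a percentage of all rows
counts = df["All_services"].value_counts(ascending=True)
pct = counts / counts.sum() * 100
labels = [f"{p:.2f}%" for p in pct]

ax = counts.plot(kind="barh")

# One rectangle patch per bar, in the same order as `counts`
for patch, label in zip(ax.patches, labels):
    ax.annotate(
        label,
        (patch.get_width(), patch.get_y() + patch.get_height() / 2),  # right end of the bar
        xytext=(3, 0),               # nudge the text 3 points to the right
        textcoords="offset points",
        va="center",
    )

plt.tight_layout()
```

With the real data, swap the sample frame for the df built from TESTDATA and call plt.show() (or plt.savefig) at the end. The labels are still computed with pandas; only the drawing step touches matplotlib.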