I have around 45 CSV files, each containing two columns with 13,000 entries. I want to plot all of these CSV files together in a single scatter plot, and I don't want the points of the scatter plot to overlap each other; I want every point to be clearly visible.
I am attaching the code below, where I combine all these CSV files and plot them in a single plot. The output is a scatter plot that is very cluttered and hard to read. The reason I want all the points to be clearly visible without overlapping is that I need to study them for my research.
import os
import matplotlib.pyplot as plt
import pandas as pd
csv_directory = "Allplots/graphs"
csv_files = [file for file in os.listdir(csv_directory) if file.endswith(".csv")]
plt.figure(figsize=(12, 8))
all_indices = []
all_accuracies = []
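for csv_file in csv_files:
    file_path = os.path.join(csv_directory, csv_file)
    df = pd.read_csv(file_path)
    # Column names must match the CSV header exactly (note the leading space in " Index")
    bit_index = df[" Index"]
    accuracy = df["Accuracy"]
    # Append this file's values to the lists defined above
    all_indices.extend(bit_index)
    all_accuracies.extend(accuracy)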
plt.scatter(all_indices, all_accuracies, s=10)
plt.xlabel("Index (Millions)", fontsize=12)
plt.ylabel("Accuracy", fontsize=12)
plt.title("Scatter Plot of Accuracy vs. Bit Index", fontsize=14)
# Save the plot as a PNG file
output_path = os.path.join(csv_directory, "scatter_plot.png")
plt.savefig(output_path)
plt.show()
I also want to plot these CSV files serially: the first portion of the graph should contain the points from the csv1 file, then the csv2 file, and so on. I don't want the points from different CSV files to mix.
You can play with the various marker parameters in pyplot.scatter. I recommend lowering s for a smaller marker size, changing marker = '.' (default is 'o') for a smaller marker shape, and/or setting edgecolors = 'none' or edgecolors = 'face' so the marker doesn't have an obvious outline.
For example: plt.scatter(all_indices, all_accuracies, s=1, marker = '.', edgecolors = 'face')
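To keep the files from mixing, one possible approach is to give each file its own stretch of the x axis and call scatter once per file with the same small-marker settings. The following is only a rough sketch under some assumptions: the helper names serial_positions and plot_serial are made up for illustration, the x axis is the running point number rather than the values of the Index column, and random numbers stand in for the real CSV data.

```python
import matplotlib.pyplot as plt
import numpy as np


def serial_positions(lengths):
    """Return one array of x positions per file, laid end to end.

    Hypothetical helper: file 1 gets 0..n1-1, file 2 gets n1..n1+n2-1, etc.
    """
    positions = []
    start = 0
    for n in lengths:
        positions.append(np.arange(start, start + n))
        start += n
    return positions


def plot_serial(accuracy_per_file, ax):
    """Scatter each file's accuracies in its own x range with small markers."""
    lengths = [len(acc) for acc in accuracy_per_file]
    for x, acc in zip(serial_positions(lengths), accuracy_per_file):
        # One scatter call per file, so each file gets the next colour
        # in the colour cycle and stays in its own segment of the x axis.
        ax.scatter(x, acc, s=1, marker=".", edgecolors="none")
    # Thin vertical lines at the boundaries between consecutive files.
    for boundary in np.cumsum(lengths)[:-1]:
        ax.axvline(boundary - 0.5, color="lightgray", linewidth=0.5)
    ax.set_xlabel("Point number (files laid end to end)")
    ax.set_ylabel("Accuracy")


# Synthetic stand-in for three CSV files (assumption: not the real data).
rng = np.random.default_rng(0)
fake_files = [rng.uniform(0.0, 1.0, size=1000) for _ in range(3)]

fig, ax = plt.subplots(figsize=(12, 8))
plot_serial(fake_files, ax)
# Call plt.show() or fig.savefig(...) here to view the result.
```

With the real data, the loop in the question would collect df["Accuracy"].to_numpy() for each file into a list and pass that list to plot_serial. Note that os.listdir does not guarantee any particular order, so sorting csv_files first is probably needed if the order of the segments matters.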