Thank you for making time to answer this question.
I was recently working with Spark and read that it treats one HDFS block as one Spark partition. By that logic, there are many cases where HDFS is not the source at all. So if we read data from a CSV file (or any other file-based format), how is that data partitioned, given that there is no explicit partitioning?
When you read a CSV file with Spark, the partitioning is controlled by this config:

    spark.sql.files.maxPartitionBytes

which, according to [the Spark documentation][1], defaults to 134217728 bytes (128 MB).

So, for example, if you set `"spark.sql.files.maxPartitionBytes"` to `"1024"` and read a 1 MB CSV file, you will get roughly 1024 partitions (1,048,576 bytes / 1,024 bytes per partition). The exact count can differ slightly, since Spark also takes `spark.sql.files.openCostInBytes` and the available parallelism into account when computing split sizes.
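As a rough sketch of the arithmetic (a simplification: real Spark also factors in `spark.sql.files.openCostInBytes` and the default parallelism when it computes split sizes, and the function name here is just illustrative):

```python
import math

def estimate_partitions(file_size_bytes: int,
                        max_partition_bytes: int = 128 * 1024 * 1024) -> int:
    """Rough estimate of how many partitions Spark creates for one file.

    Simplified model: partitions ~= ceil(file size / maxPartitionBytes).
    """
    return math.ceil(file_size_bytes / max_partition_bytes)

# A 1 MB file read with maxPartitionBytes=1024 -> ~1024 partitions
print(estimate_partitions(1 * 1024 * 1024, 1024))  # 1024

# The same file under the 128 MB default fits in a single partition
print(estimate_partitions(1 * 1024 * 1024))  # 1
```

In PySpark you would set the config with `spark.conf.set("spark.sql.files.maxPartitionBytes", "1024")` before reading the file, and you can check the result with `df.rdd.getNumPartitions()`.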
[1]: https://spark.apache.org/docs/latest/sql-performance-tuning.html#other-configuration-options