r statistics python-2.7 data-analysis

Binning the data & plotting the histogram


I have a list of values, both positive and negative. As an example, say I have 35,000 numbers.

What I want to do is bin them by value: 0 to 200 (and likewise -200 to 0), 201 to 400 (-400 to -201), and so on, up to 50,000 (and down to -50,000).

Once I have the binned counts, plotting a histogram or any other representation is easier. I can take the result to Excel or plot it in Python, Perl, or R.

But the first stage itself is a bit tricky.

As an example, consider the following data:

 -9030
   -75
  8005
  -251
 65994
-12111
-11643
 19749
-23324
 10012
   -77  

Thank you


Solution

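  • set.seed(12345)  # make the simulated data reproducible
    n <- 35000
    # Simulated stand-in for the real values; replace with your own data
    dataset <- data.frame(Number = runif(n, min = -200, max = 50000))
    library(ggplot2)
    # geom_histogram() does the binning itself; binwidth sets 200-wide bins
    ggplot(dataset, aes(x = Number)) + geom_histogram(binwidth = 200)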
    

    giving a histogram of Number with 200-unit-wide bins (image not included).
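
If you want the bin counts themselves (for Excel, say) rather than a finished plot, here is a minimal sketch in Python using only the standard library. The function name `bin_counts` and the half-open bin convention `[lower, lower + 200)` are my own assumptions, not something from the question; the sample values are the ones posted above.

```python
from collections import Counter

BIN_WIDTH = 200  # bin width taken from the question


def bin_counts(values, width=BIN_WIDTH):
    """Return sorted (lower_edge, count) pairs for bins [lower, lower + width)."""
    # Floor division rounds toward negative infinity, so -75 falls in the
    # bin starting at -200 and 8005 falls in the bin starting at 8000.
    counts = Counter((v // width) * width for v in values)
    return sorted(counts.items())


# Sample values from the question
sample = [-9030, -75, 8005, -251, 65994, -12111, -11643,
          19749, -23324, 10012, -77]

for lower, count in bin_counts(sample):
    print("[%d, %d): %d" % (lower, lower + BIN_WIDTH, count))
```

Each bin includes its lower edge and excludes its upper edge, which differs slightly from the "0-200, 201-400" wording in the question, so adjust the edges if that matters. To bin by magnitude instead, with -75 and 75 counted together, pass `[abs(v) for v in sample]`.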