Search code examples
rgraphlinear-regressionecdfpower-law

How to plot CCDF graph on a logarithmic scale?


I want to plot a CCDF graph for some of my simulated power-law tail data on a log-log axis, below is my R code of plotting a CCDF graph on a normal axis, I used the code on the link: (How to plot a CCDF gragh?)

> load("fakedata500.Rda")
> x<-fakedata500
> f<-ecdf(x)
> f
Empirical CDF 
Call: ecdf(x)
 x[1:500] = 0.50174, 0.50307, 0.50383,  ..., 81.674, 140.63
> plot(f)

Below is the ECDF graph:

enter image description here

> plot(sort(x), 1-f(sort(x)), type="s", lwd=1)

and this command gives me the CCDF graph:

enter image description here

However, I would like to plot my CCDF graph on a log-log axis to produce a graph like the picture below: (graph from "Minimizing errors in identifying Lévy flight behaviour of organisms.")

enter image description here

Is there a way to do it in R?

If so, how to do a linear regression on the CCDF graph as well? I have tried using command below but that just not working for me.

a<-plot(sort(x), 1-f(sort(x)), type="s", lwd=1)
> a
NULL
> res=lm(a)
Error in terms.formula(formula, data = data) : 
  argument is not a valid model

Much appreciate.


UPDATE:

I used the code given by @BondedDust and successfully generated a CCDF graph:

(plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy"))

enter image description here

Below is the code of how I generated my dataset:

u<-runif(500)
fakedata500<-((2*(1-u))^(-1))

Solution

  • Here's another way using quantiles.

    library(VGAM)  # for rpareto(...)
    set.seed(1)    # for reproducible example
    X <- rpareto(1000,location=1,shape=1)
    p <- ppoints(100)
    par(mfrow=c(1,3))
    plot(quantile(X,p=p),p,type="l",ylab="P(X < x)",xlab="x",main="CDF")
    plot(quantile(X,p=p),1-p,type="l",ylab="P(X > x)",xlab="x",main="CCDF")
    plot(log(quantile(X,p=p)),log(1-p),
         ylab="log[P(X > x)]",xlab="log(x)",main="CCDF: log-log")
    

    And here's the regression.

    df  <- data.frame(x=log(1-p),y=log(quantile(X,p=p)))
    fit <- lm(y~x,df)
    summary(fit)
    # ...
    # Coefficients:
    #              Estimate Std. Error  t value Pr(>|t|)    
    # (Intercept)  0.039559   0.007584    5.216 1.02e-06 ***
    # x           -0.944380   0.005427 -174.028  < 2e-16 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # Residual standard error: 0.05317 on 98 degrees of freedom
    # Multiple R-squared:  0.9968,  Adjusted R-squared:  0.9967 
    # F-statistic: 3.029e+04 on 1 and 98 DF,  p-value: < 2.2e-16