How to get the second derivative/dip from the graph or generate the best eps value

Dataset is below

 id,revenue ,profit

Code is below

import pandas as pd
from sklearn.cluster import DBSCAN
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import StandardScaler
import seaborn as sns
from sklearn.neighbors import NearestNeighbors
df = pd.read_csv('1.csv',index_col=None)
df1 = StandardScaler().fit_transform(df)
dbsc = DBSCAN(eps = 2.5, min_samples = 20).fit(df1)
labels = dbsc.labels_

My df has 1999 rows.

I got the eps value at the dip from the method below; from the graph it is clear that eps=2.5.

Below is the method to find the best eps value

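ns = 5
# Fit on the scaled data (assumed here to be df1 from the snippet above)
nbrs = NearestNeighbors(n_neighbors=ns).fit(df1)
distances, indices = nbrs.kneighbors(df1)
# Take the ns-th column of distances (the query point itself counts as
# the first neighbour) and sort it in descending order
distanceDec = sorted(distances[:, ns-1], reverse=True)
# Plot the sorted distances against their rank
plt.plot(range(len(distanceDec)), distanceDec)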
  • How can the dip in the graph be found automatically? The system should report the best eps itself, without anyone looking at the graph.


  • If I understand correctly, you are looking for the precise y value of the inflection point appearing in your ε(x) plot (it should be around 2.0), right?

    If so, with ε(x) denoting your curve, the problem reduces to:

    1. Compute the second derivative of your curve: ε''(x).
    2. Find the zero (or zeroes) of such second derivative: x0.
    3. Recover the optimized ε value, just by plugging the zero into your curve: ε(x0).

    Here is my answer, based on these two other Stack Overflow answers: (Compute derivative of an array) (Find zero in array)

    import numpy as np
    import matplotlib.pyplot as plt
    # Generating x data range from -1 to 4 with a step of 0.01
    x = np.arange(-1, 4, 0.01)
    # Simulating y data with an inflection point as y(x) = x³ - 5x² + 2x
    y = x**3 - 5*x**2 + 2*x
    # Plotting your curve
    plt.plot(x, y, label="y(x)")
    # Computing the 1st derivative of your curve with a step of 0.01 and plotting it
    y_1prime = np.gradient(y, 0.01)
    plt.plot(x, y_1prime, label="y'(x)")
    # Computing the 2nd derivative of your curve with a step of 0.01 and plotting it
    y_2prime = np.gradient(y_1prime, 0.01)
    plt.plot(x, y_2prime, label="y''(x)")
    # Finding the index of the sign change (zero crossing) of the 2nd derivative
    x_zero_index = np.where(np.diff(np.sign(y_2prime)))[0]
    # Finding the x value at the first zero crossing
    x_zero_value = x[x_zero_index][0]
    # Finding the y value of the curve at that x value
    y_zero_value = y[x_zero_index][0]
    # Reporting
    print(f'The y value at the inflection point of your curve is {y_zero_value:.3f}.')
    # Showing the legend and the plot
    plt.legend()
    plt.show()
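As a cross-check of the numerical result, the inflection point of the simulated cubic can also be computed in closed form. This is only a sketch for this particular test polynomial; the names `p`, `p_2prime`, `x0` and `y0` are introduced here for illustration. The grid-based code reports the curve value at the grid point just before the sign change (x = 1.66), so it differs slightly from the exact value at x = 5/3.

```python
import numpy as np

# y(x) = x^3 - 5x^2 + 2x, coefficients in descending powers
p = np.poly1d([1, -5, 2, 0])

# Second derivative: y''(x) = 6x - 10
p_2prime = p.deriv(2)

# Its only root is the inflection point x0 = 5/3
x0 = p_2prime.roots[0]

# Curve value at the inflection point: y(5/3) = -160/27
y0 = p(x0)

print(f"x0 = {x0:.4f}, y(x0) = {y0:.4f}")
```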

    In any case, keep in mind that the inflection point (around 2.0) does not coincide with the "dip" point, which appears around 2.5.
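
To apply the three steps above to an arbitrary sampled curve, such as the sorted distances in `distanceDec` from the question, they could be wrapped in a small helper. This is a minimal sketch, not a definitive method: the function name `inflection_candidates` and its `window` parameter (an optional moving-average width to damp noise) are assumptions introduced here, and samples are assumed to be evenly spaced.

```python
import numpy as np

def inflection_candidates(curve, window=1):
    """Return (indices, values) where the 2nd derivative of `curve` changes sign.

    `window` is a hypothetical moving-average width; 1 disables smoothing.
    With window > 1 the indices refer to the smoothed (shorter) curve.
    """
    y = np.asarray(curve, dtype=float)
    if window > 1:
        # Simple moving average; "valid" mode shortens y by window - 1 points
        kernel = np.ones(window) / window
        y = np.convolve(y, kernel, mode="valid")
    # Second derivative with unit spacing (only its sign matters here)
    y_2prime = np.gradient(np.gradient(y))
    # Positions where the sign of the second derivative changes
    idx = np.where(np.diff(np.sign(y_2prime)))[0]
    return idx, y[idx]

# Demo on the simulated cubic used in the answer
x = np.arange(-1, 4, 0.01)
demo_curve = x**3 - 5*x**2 + 2*x
idx, values = inflection_candidates(demo_curve)
print(idx, values)
```

With the question's data, the call would look like `inflection_candidates(distanceDec, window=25)`, where 25 is an arbitrary choice. A noisy k-distance curve will usually produce several sign changes even after smoothing, so picking one of the returned candidates as eps remains a judgment call.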