plot time for scatter chart in log scale - plotly

I've got a wide range of values in seconds that I want to display on a log scale. It works fine if I plot the raw data in seconds, but I'm trying to show the y-axis as time values and keep the log scale.

df = pd.DataFrame({
    'Position':(1,2,3,4,5,6,7,8,9),
    'Value':(1,20,821,2300,4500,30000,405600,1023764,11256400),
})

# works
fig = px.scatter(x=df['Position'], y=df['Value'], log_y=True)

When I change the y-axis to datetime, the values aren't correct. When I also add the log scale, the values don't appear at all.

fig = px.scatter(x=df['Position'], y=pd.to_datetime(df['Value'], unit = 's'))
fig = px.scatter(x=df['Position'], y=pd.to_datetime(df['Value'], unit = 's'), log_y=True)

The output y-axis should range from 0 to 130 days, 6 hours, 46 mins, 40 secs. I'm not fussed about being specific here, the broad range is fine.

Solution

The issue is that pd.to_datetime(..., unit='s') turns each value into a calendar date counted from 1970-01-01 rather than a duration, and Plotly can't apply a logarithmic scale to a date axis.
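A quick way to see why the datetime values look wrong is to print what the conversion produces. This is only an illustrative sketch; the short series below reuses a few values from the question, and pd.to_timedelta is shown just for comparison, not as something Plotly can put on a log axis.

```python
import pandas as pd

# A few of the values from the question, in seconds
seconds = pd.Series([1, 20, 821, 11256400])

# unit='s' means "seconds since 1970-01-01", so each value becomes a calendar date
as_dates = pd.to_datetime(seconds, unit='s')
print(as_dates.iloc[0])    # 1970-01-01 00:00:01
print(as_dates.iloc[-1])   # 1970-05-11 06:46:40

# A duration is a Timedelta, which matches the range expected in the question
as_durations = pd.to_timedelta(seconds, unit='s')
print(as_durations.iloc[-1])  # 130 days 06:46:40
```

The largest value is the "130 days, 6 hours, 46 mins, 40 secs" from the question, but as a datetime it is plotted as a date in May 1970.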

The workaround is to leave the data as plain numbers (seconds), so the log scale keeps working, and to replace only the tick labels with time-formatted text.

import pandas as pd
import plotly.express as px
import numpy as np
from datetime import timedelta

# Your sample data
df = pd.DataFrame({
    'Position': (1, 2, 3, 4, 5, 6, 7, 8, 9),
    'Value': (1, 20, 821, 2300, 4500, 30000, 405600, 1023764, 11256400),
})

# Create a figure with numeric values on a log scale
fig = px.scatter(x=df['Position'], y=df['Value'], log_y=True)

# Create custom time-formatted tick labels
# Plotly Express does not set tickvals itself, so this is normally None
y_tick_vals = fig.layout.yaxis.tickvals

# If tickvals are not set, define one tick per power of ten covering the data
if y_tick_vals is None:
    min_power = int(np.floor(np.log10(df['Value'].min())))
    max_power = int(np.ceil(np.log10(df['Value'].max())))
    # Start at the smallest power in the data; a tick below 1 second
    # would be truncated to "0s" by the formatting below
    y_tick_vals = [10**p for p in range(min_power, max_power + 1)]

# Create time-formatted labels for each tick value
y_tick_text = []
for val in y_tick_vals:
    if val > 0:  # Avoid negative or zero values
        # Convert seconds to a readable time format
        delta = timedelta(seconds=val)
        days = delta.days
        hours, remainder = divmod(delta.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        
        if days > 0:
            time_str = f"{days}d {hours}h {minutes}m"
        elif hours > 0:
            time_str = f"{hours}h {minutes}m {seconds}s"
        elif minutes > 0:
            time_str = f"{minutes}m {seconds}s"
        else:
            time_str = f"{seconds}s"
            
        y_tick_text.append(time_str)
    else:
        y_tick_text.append("0s")

# Update the y-axis with custom formatted time labels
fig.update_layout(
    yaxis=dict(
        tickmode='array',
        tickvals=y_tick_vals,
        ticktext=y_tick_text,
        title="Time (log scale)"
    ),
    xaxis_title="Position"
)

fig.show()

The graph