I would like to overlay a line plot (source: pandas DataFrame) on an hvplot plot (source: xarray / NetCDF).
The xarray dataset looks like this:
import xarray as xr
dataDIR = 'ceilodata.nc'
DS = xr.open_dataset(dataDIR)
DS = DS.transpose()
print(DS)
<xarray.Dataset>
Dimensions: (range_hr: 32, range: 1024, layer: 3, time: 5760)
Coordinates:
* range_hr (range_hr) float32 0.001 4.995 9.99 ... 144.9 149.9 154.8
* range (range) float32 14.98 29.97 44.96 ... 1.533e+04 1.534e+04
* layer (layer) int32 1 2 3
* time (time) datetime64[ns] 2022-03-18 ... 2022-03-18T23:59:46
Data variables: (12/41)
zenith float32 ...
wavelength float32 ...
scaling float32 ...
range_gate_hr float32 ...
range_gate float32 ...
longitude float32 ...
... ...
cbe (layer, time) int16 ...
beta_raw_hr (range_hr, time) float32 ...
beta_raw (range, time) float32 ...
bcc (time) int8 ...
base (time) float32 ...
average_time (time) int32 ...
Attributes: (12/13)
comment:
software_version: 15.06.1 2.13 1.040 1
title: CHM15k Nimbus
wmo_id: 10865
month: 3
source: CHM160138
... ...
serlom: TUB160038
location: muenchen
year: 2022
device_name: CHM160138
institution: DWD
day: 18
The pandas dataframe source looks like this:
import pandas as pd
df = pd.read_csv('PTU.csv')
print(df)
Unnamed: 0 PTU
0 2022-03-18 07:38:56 451.839
1 2022-03-18 07:38:57 468.826
2 2022-03-18 07:38:58 469.093
3 2022-03-18 07:38:59 469.356
4 2022-03-18 07:39:00 469.623
... ... ...
6140 2022-03-18 09:21:16 31690.600
6141 2022-03-18 09:21:17 31694.700
6142 2022-03-18 09:21:18 31692.900
6143 2022-03-18 09:21:19 31712.000
6144 2022-03-18 09:21:20 31711.500
[6145 rows x 2 columns]
Both are time-dependent datasets, but they have different time stamps and sampling frequencies. Time is the index in each dataset.
I tried to plot them together, additionally importing holoviews. Each plot works on its own, but combining them does not work the way I tried it:
import hvplot.pandas  # adds .hvplot to pandas objects
import hvplot.xarray  # adds .hvplot to xarray objects
import holoviews as hv
# cmap of the xarray:
ceilo = DS.b_r.hvplot(cmap="viridis_r", width=850, height=600, title='title', clim=(5, 80))
# line plot of the data frame
p = df.hvplot.line()
# add pressure line plot to pcolormeshplot using * which overlays the line on the plot
ceilo * p
but this ended in an error message with the following complete traceback:
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-10-2b1c6baca339> in <module>
24 p = df.hvplot.line()
25 # add pressure line plot to pcolormeshplot using * which overlays the line on the plot
---> 26 ceilo * df
c:\python38\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
68 other = item_from_zerodim(other)
69
---> 70 return method(self, other)
71
72 return new_method
c:\python38\lib\site-packages\pandas\core\arraylike.py in __rmul__(self, other)
118 @unpack_zerodim_and_defer("__rmul__")
119 def __rmul__(self, other):
--> 120 return self._arith_method(other, roperator.rmul)
121
122 @unpack_zerodim_and_defer("__truediv__")
c:\python38\lib\site-packages\pandas\core\frame.py in _arith_method(self, other, op)
6936 other = ops.maybe_prepare_scalar_for_op(other, (self.shape[axis],))
6937
-> 6938 self, other = ops.align_method_FRAME(self, other, axis, flex=True, level=None)
6939
6940 new_data = self._dispatch_frame_op(other, op, axis=axis)
c:\python38\lib\site-packages\pandas\core\ops\__init__.py in align_method_FRAME(left, right, axis, flex, level)
275 elif is_list_like(right) and not isinstance(right, (ABCSeries, ABCDataFrame)):
276 # GH 36702. Raise when attempting arithmetic with list of array-like.
--> 277 if any(is_array_like(el) for el in right):
278 raise ValueError(
279 f"Unable to coerce list of {type(right[0])} to Series/DataFrame"
c:\python38\lib\site-packages\holoviews\core\element.py in __iter__(self)
94 def __iter__(self):
95 "Disable iterator interface."
---> 96 raise NotImplementedError('Iteration on Elements is not supported.')
97
98
NotImplementedError: Iteration on Elements is not supported.
Is the different time frequency the problem here? The line should be positioned along the x- and y-axes according to the time stamps and altitudes of the underlying cmap (matplotlib) plot.
To illustrate what I am aiming for, here is a picture of my goal:
Thanks for reading / helping.
I found a solution for this case:
The time columns of both datasets must have the same dtype. In my case that is datetime64[ns] (to match the xarray dataset read from NetCDF), so I converted the DataFrame's time column to datetime64[ns]:
df.Datetime = df.Datetime.astype('datetime64[ns]')
I also found that the PTU column had dtype "object", so I converted it to float:
df.PTU = df.PTU.astype(float) # convert to correct data type
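The two conversions above can be sketched on a small, made-up frame. The sample values mirror the printout in the question. The rename from `Unnamed: 0` to `Datetime` is an assumption (the printout shows the former name, the code above uses the latter), and `pd.to_datetime` / `pd.to_numeric` are used here instead of `astype` purely as a matter of preference:

```python
import pandas as pd

# Hypothetical stand-in for PTU.csv in which every value arrives as text.
df = pd.DataFrame({
    "Unnamed: 0": ["2022-03-18 07:38:56", "2022-03-18 07:38:57", "2022-03-18 07:38:58"],
    "PTU": ["451.839", "468.826", "469.093"],
})

# Assumption: the timestamp column is renamed to "Datetime".
df = df.rename(columns={"Unnamed: 0": "Datetime"})

# Parse the timestamps, then pin the unit to nanoseconds so the dtype
# matches the time coordinate of the xarray dataset.
df["Datetime"] = pd.to_datetime(df["Datetime"]).astype("datetime64[ns]")

# Convert the measurement column from text to float.
df["PTU"] = pd.to_numeric(df["PTU"])

print(df.dtypes)
```

By default `pd.to_numeric` raises on text that is not a number, which is one way to find out why a column was read as "object" in the first place.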
The last step was to use hvplot's quadmesh plot type, since hvplot helps with plotting xarray data:
import hvplot.xarray  # provides the .hvplot.quadmesh accessor on xarray objects
And here is my final solution:
title = 'Ceilo data\ndate: ' + str(DS.year) + '-' + str(DS.month) + '-' + str(DS.day)
ceilo = DS.br.hvplot.quadmesh(
    cmap="viridis_r", width=850, height=600, title=title,
    clim=(1000, 10000),       # set colorbar limits
    cnorm='log',              # choose log scale
    clabel='colorbar title',
    rot=0,                    # degree rotation of ticks
)
# from: https://justinbois.github.io/bootcamp/2020/lessons/l27_holoviews.html
# Caution: rendering may take 2 to 3 minutes.
import holoviews as hv
p = hv.Points(
    data=df,
    kdims=['Datetime', 'PTU'],
).opts(
    # alpha=0.7,
    color='red',
    size=1,
    ylim=(0, 5000),
)
# add the PTU points to the quadmesh plot; * overlays them on it
ceilo * p
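A note on the traceback in the question: the failing line there is `ceilo * df`, not `ceilo * p`, so the plot was multiplied by the raw DataFrame rather than by a plot object. The frames show pandas' `__rmul__` running and then trying to iterate over the HoloViews element, which raises. Presumably the element's own `*` did not accept a DataFrame and Python fell back to the right-hand operand. That mechanism can be reproduced with a toy sketch; `Element` and `Table` are hypothetical stand-ins, not the real HoloViews or pandas classes:

```python
class Element:
    """Toy stand-in for a plot element (hypothetical, not the HoloViews class)."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        # Overlay only with other elements; for anything else, return
        # NotImplemented so Python tries the right operand's __rmul__.
        if isinstance(other, Element):
            return f"Overlay({self.name}, {other.name})"
        return NotImplemented

    def __iter__(self):
        raise NotImplementedError("Iteration on Elements is not supported.")


class Table:
    """Toy stand-in for a DataFrame-like object (hypothetical)."""

    def __rmul__(self, other):
        # Treats the left operand as list-like data and iterates over it.
        return [value for value in other]


ceilo, p, df = Element("ceilo"), Element("p"), Table()

print(ceilo * p)  # Overlay(ceilo, p)

try:
    ceilo * df
except NotImplementedError as err:
    print(err)  # Iteration on Elements is not supported.
```

So this particular error is unrelated to the sampling frequency of the two datasets; it goes away once the right-hand operand is a plot object such as `p`.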