python, pandas, numpy, data-manipulation, data-cleaning

How to extract certain rows under a specific condition in pandas? (Sentiment analysis)


[Image: screenshot of the dataframe]

The picture shows what my dataframe looks like. It has user_name, movie_name and time columns. I want to extract only the rows that fall on the first day of each movie. For example, if movie a's first date in the time column is 2018-06-27, I want all of movie a's rows from that date, and if movie b's first date in the time column is 2018-06-12, I want only movie b's rows from that date. How would I do that with pandas?


Solution

  • I assume that the time column is of datetime type. If not, convert it by calling pd.to_datetime.

    Then run:

    df.groupby('movie_name').apply(lambda grp:
        grp[grp.time.dt.date == grp.time.min().date()])
    

    groupby splits the source DataFrame into groups, one per film.

    Then grp.time.min().date() computes the minimal (first) date in the current group.

    Finally, the lambda function returns only the rows of the current group that fall on that date.

    The same is done for every other group of rows (film).
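
A self-contained sketch of the same idea, in case it helps to try it out. The sample data below is made up for illustration (the real dataframe is only visible in the screenshot), and the names day, first_day and first_day_rows are my own. Instead of apply it uses transform("min"), which is an alternative to the code above rather than the same method; it keeps the original index and columns unchanged.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "user_name": ["u1", "u2", "u3", "u4", "u5"],
    "movie_name": ["a", "a", "a", "b", "b"],
    "time": [
        "2018-06-27 10:00",
        "2018-06-27 18:30",
        "2018-06-28 09:00",
        "2018-06-12 12:00",
        "2018-06-13 08:00",
    ],
})

# Convert the strings to datetime, as assumed in the answer.
df["time"] = pd.to_datetime(df["time"])

# Drop the time-of-day part so rows can be compared by calendar day.
day = df["time"].dt.normalize()

# For every row, the earliest day seen for that row's movie.
first_day = day.groupby(df["movie_name"]).transform("min")

# Keep only the rows that fall on their movie's first day.
first_day_rows = df[day == first_day]
print(first_day_rows)
```

With this sample data, the two 2018-06-27 rows of movie a and the single 2018-06-12 row of movie b are kept.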