Search code examples
pythonpandaslambda

I am trying to create a new column in python dataframe using apply, lambda and if elif else functions. I am getting the error of invalid syntex


This is the code I used.

B2B.FY= B2B.Date.apply(lambda x: '2017-18' if x>='2017-07-01' and x=<'2018-03-31' else ('2018-19' if x>='2018-04-01' and x=<'2019-03-31' else ('2019-20' if x>='2019-04-01' and x=<'2020-03-31' else ('2020-21' if x>='2020-07-01' and x=<'2021-03-31' else ('2021-22' if x>='2021-04-01' and x=<'2022-03-31' else '2022-23')))))



this is the sample of date column.
0       2017-07-07
1       2017-07-14
2       2017-07-14
3       2017-07-14
4       2017-07-25
5       2017-07-25
6       2017-07-31
7       2017-07-14
8       2017-07-25
9       2017-07-21
10      2017-07-12
11      2017-07-25
12      2017-07-31
13      2017-07-15
14      2017-07-15
15      2017-07-18
16      2017-07-18
17      2017-07-18
18      2017-07-18
19      2017-07-18
20      2017-07-18
21      2017-07-18

and this is the error I am getting.

File "C:\Users\Optimus\AppData\Local\Temp/ipykernel_9712/3702499445.py", line 1 B2B.FY= B2B.Date.apply(lambda x: '2017-18' if x>='2017-07-01' and x=<'2018-03-31' else ('2018-19' if x>='2018-04-01' and x=<'2019-03-31' else ('2019-20' if x>='2019-04-01' and x=<'2020-03-31' else ('2020-21' if x>='2020-07-01' and x=<'2021-03-31' else ('2021-22' if x>='2021-04-01' and x=<'2022-03-31' else '2022-23')))))

^ SyntaxError: invalid syntax


Solution

  • There are two things to improve:

    =< is not a valid operator.

    So one will have to use <=.

    • Apart from that, in order to create a new column named FY, as Pandas doesn't allow columns to be created via a new attribute name, instead of B2B.FY use B2B['FY'].

    With all the changes in place, the following will should do the work

    B2B['FY'] = B2B['Date'].apply(lambda x: '2017-18' if x>='2017-07-01' and x<='2018-03-31' else ('2018-19' if x>='2018-04-01' and x<='2019-03-31' else ('2019-20' if x>='2019-04-01' and x<='2020-03-31' else ('2020-21' if x>='2020-07-01' and x<='2021-03-31' else ('2021-22' if x>='2021-04-01' and x<='2022-03-31' else '2022-23')))))
    
    [Out]:
              Date       FY
    0   2017-07-07  2017-18
    1   2017-07-14  2017-18
    2   2017-07-14  2017-18
    3   2017-07-14  2017-18
    4   2017-07-25  2017-18
    5   2017-07-25  2017-18
    6   2017-07-31  2017-18
    7   2017-07-14  2017-18
    8   2017-07-25  2017-18
    9   2017-07-21  2017-18
    10  2017-07-12  2017-18
    11  2017-07-25  2017-18
    12  2017-07-31  2017-18
    13  2017-07-15  2017-18
    14  2017-07-15  2017-18
    15  2017-07-18  2017-18
    16  2017-07-18  2017-18
    17  2017-07-18  2017-18
    18  2017-07-18  2017-18
    19  2017-07-18  2017-18
    20  2017-07-18  2017-18
    21  2017-07-18  2017-18
    

    Notes:

    • Make sure that the column Date is not of datetime, else one will get a

    TypeError: '>=' not supported between instances of 'Timestamp' and 'str'