Search code examples
pythonexcelexcel-formulaxlsxwriter

Python XlsxWriter - Existing custom data validation formula is inactively not working


Background:

I'm attempting to capture valid dates that are in a custom Persian (Non-Gregorian) date format (yyyy/mm/dd)

Formula:

Luckily I got help from the below thread with a decent formula that does the requirement. I can XlsxWriter has parsed it, but it does not work in the generated Excel worksheet. The formula and link are discussed in the below SO thread:

Ensure Excel (Persian) date is valid date with specific format

Root cause:

I'm not sure about the reason, but I think it is because of an error prompt message that pops out. If I clicked ok, the formula works. Would this be the reason? Would this block XlsxWriter?

Issue reproduction

Creating a dummy data frame:

df = pd.DataFrame(data = {'date' : ['1369/05/18',
                                    '1386/05/1',
                                    '1369/5/18',
                                    '1369/05/',
                                    '1369//15',
                                    '/05/08',
                                    '1369/051/18',
                                    '1369/0/518',
                                    '136/5/18',
                                    '1369/13/18',
                                    '1369/05/31',
                                    '1369-05-31',
                                    '1369//05//18',
                                    None
                                   ],
                          'case' : ['valid format - correct',
                                    '1 digit day - correct',
                                    '1 digit month - correct',
                                    'No Day ',
                                    'No Month',
                                    'No Year',
                                    '3 digit month',
                                    '3 digit day',
                                    '3 digit year',
                                    'Invalid month',
                                    'Invalid day',
                                    '31st may - correct',
                                    'Invalid format',
                                    'Invalid format'
                                   ],
                          'gender' : '',
                          'date_input':''
                         }
                 )
     
  • Writing that dataframe to an Excel file.
  • Adding data validation for that file.
    • I'll add an irrelevant male/female list to show that validation works.
    • I'll add my formula to the excel.
writer = pd.ExcelWriter('dates_validation.xlsx', engine='xlsxwriter') # xlsx writer
workbook = writer.book # Workbook object
df.to_excel(writer, sheet_name = 'sheet_1' ,index=False) # Writing to file
worksheet = writer.sheets['sheet_1'] # Worksheet object
worksheet.data_validation(
    'C2:C1048576', {'validate': 'list',
                    'source': ['Male','Female'],
                    'dropdown': True
                   })
worksheet.data_validation(
    'D2:D1048576', {'validate': 'custom',
                    'value': '=LET(s,TEXTSPLIT(D2,"/"),y,AND(--INDEX(s,1)>=1278,--INDEX(s,1)<=9378),m,AND(--INDEX(s,2)>=1, --INDEX(s,2)<=12),d, AND(--INDEX(s,3)>=1, --INDEX(s,3)<=(30+(--INDEX(s,2)<=6))), AND(y,m,d))',
                    'ignore_blank': True
                   }
)
writer.close()

You can see that the formula is in-place but it doesn't work in cell D2. If you clicked ok in the next cell, it will work.

enter image description here


Solution

  • The LET() and TEXTSPLIT are new Future Functions in Excel and have some additional prefixes like _xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/") ....

    The actual formula you need to use is:

    _xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/"),_xlpm.y,AND(--INDEX(_xlpm.s,1)>=1278,--INDEX(_xlpm.s,1)<=9378),_xlpm.m,AND(--INDEX(_xlpm.s,2)>=1, --INDEX(_xlpm.s,2)<=12),_xlpm.d, AND(--INDEX(_xlpm.s,3)>=1, --INDEX(_xlpm.s,3)<=(30+(--INDEX(_xlpm.s,2)<=6))), AND(_xlpm.y,_xlpm.m,_xlpm.d))
    

    Here is a working example based on your code:

    import pandas as pd
    
    
    df = pd.DataFrame(data = {'date': ['1369/05/18',
                                        '1386/05/1',
                                        '1369/5/18',
                                        '1369/05/',
                                        '1369//15',
                                        '/05/08',
                                        '1369/051/18',
                                        '1369/0/518',
                                        '136/5/18',
                                        '1369/13/18',
                                        '1369/05/31',
                                        '1369-05-31',
                                        '1369//05//18',
                                        None
                                       ],
                              'case' : ['valid format - correct',
                                        '1 digit day - correct',
                                        '1 digit month - correct',
                                        'No Day ',
                                        'No Month',
                                        'No Year',
                                        '3 digit month',
                                        '3 digit day',
                                        '3 digit year',
                                        'Invalid month',
                                        'Invalid day',
                                        '31st may - correct',
                                        'Invalid format',
                                        'Invalid format'
                                       ],
                              'gender' : '',
                              'date_input':''
                             }
                     )
    
    
    writer = pd.ExcelWriter('dates_validation.xlsx', engine='xlsxwriter') # xlsx writer
    workbook = writer.book # Workbook object
    df.to_excel(writer, sheet_name = 'sheet_1' ,index=False) # Writing to file
    worksheet = writer.sheets['sheet_1'] # Worksheet object
    worksheet.data_validation(
        'C2:C1048576', {'validate': 'list',
                        'source': ['Male','Female'],
                        'dropdown': True
                       })
    worksheet.data_validation(
        'D2:D1048576', {'validate': 'custom',
                        'value': '_xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/"),_xlpm.y,AND(--INDEX(_xlpm.s,1)>=1278,--INDEX(_xlpm.s,1)<=9378),_xlpm.m,AND(--INDEX(_xlpm.s,2)>=1, --INDEX(_xlpm.s,2)<=12),_xlpm.d, AND(--INDEX(_xlpm.s,3)>=1, --INDEX(_xlpm.s,3)<=(30+(--INDEX(_xlpm.s,2)<=6))), AND(_xlpm.y,_xlpm.m,_xlpm.d))',
                        'ignore_blank': True
                       }
    )
    
    worksheet.autofit()
    
    writer.close()
    
    

    Output:

    enter image description here