
Feature Engineering in Ecommerce Web Analytics


I am very new to this forum and am asking a question for the first time. I am working on an e-commerce dataset for a project. It includes these two variables: page visited (0/1) and exits (a numeric variable with values -1, 0, 2, 3, ...), which indicates the number of times a unique ID has exited the particular page. There are 6 such pages with this information.

The -1 values in page exits are the ones with no page visits. However, I am using page exits to calculate other metrics such as exit rate, and I am not sure how to remove or replace the -1 without losing information, or how to express it in another manner. I cannot make it 0, because that would mean there was no page exit, i.e. the visitor stayed on the page. Even if I remove it and create a categorical variable (no visit, stayed, exited), I still would not know what to replace the -1 with.

How do I go about this? Do I need to do any feature engineering here?


Solution

  • Create a binary 1/0 feature that represents whether the user has never visited the page (if exits is -1 then 1, else 0), and keep a separate column from 0 to n for the number of exits. I'd set exits from -1 to 0 after creating the additional column.

    However, I think you need to consider the implications of the -1 more carefully (or provide more information). Are people still on the page when your intended algorithm runs? Does your data cover multiple pages, with -1 meaning they never visited that page?
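
The suggestion above can be sketched in pandas roughly as follows. This is only a minimal illustration: the column names (`user_id`, `page1_exits`, `page2_exits`), the sample values, and the helper `add_never_visited_flag` are all made up for the example, and it assumes that -1 always means "never visited" and never anything else.

```python
import pandas as pd


def add_never_visited_flag(df, exit_cols):
    """Hypothetical helper: for each exits column, add a 1/0 flag marking
    the -1 sentinel, then set the sentinel to 0 in the exits column."""
    out = df.copy()  # leave the caller's frame untouched
    for col in exit_cols:
        # 1 where the page was never visited (exits == -1), else 0
        out[f"{col}_never_visited"] = (out[col] == -1).astype(int)
        # Only after the flag exists, turn -1 into 0 exits
        out[col] = out[col].replace(-1, 0)
    return out


# Made-up sample data, two of the six pages shown
sessions = pd.DataFrame({
    "user_id": [101, 102, 103, 104],
    "page1_exits": [-1, 0, 2, 3],
    "page2_exits": [0, -1, -1, 1],
})

features = add_never_visited_flag(sessions, ["page1_exits", "page2_exits"])
print(features)
```

Creating the flag before overwriting the -1 matters: once the sentinel is set to 0, a row with no visit and a row with a visit but no exit look the same in the exits column, and only the flag column tells them apart.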