Search code examples
azure-blob-storagekqlkusto-explorerplayfab

How to remove unwanted nested columns?


I've been tasked to alter the company's Event Export from our PlayFab Environment into Azure. Initially, we've set it up to Export all events but after looking at the data we do have some data exported that we don't want for legal reasons. I was exploring the Use custom query method and was trying to build the query to get all data except the columns I want to exclude. The problem is that these columns are nested. I've tried using the project-away query to exclude one column for now but when I run the below query

['events.all'] 
| project-away EventData.ColumnToExclude
| limit 100

I get this error

enter image description here

I'm assuming it's because it is not supporting nested columns. Is there an easy way to exclude the column without having to flatten the data or list all my columns (our developers might create new events without notice so that won't work)?

UPDATE 1:

I've found that project-away is the syntax to remove a column from a table but what I needed is a way to remove a key from a json/dynamic object so found that using bag_remove_keys() is the correct approach

['events.all']
| project EventData=bag_remove_keys(EventData, dynamic(['Key1', 'Key2', '$.Key3.SubKey1'])

But now I am facing another issue. When I use the '$.' notation for subkeys I get the below error

Query execution has resulted in error (0x8000FFFF): Partial query failure: Catastrophic failure (message: 'PropertyBagEncoder::Add: non-contiguous entries: ', details: '').
[0]Kusto.Data.Exceptions.KustoDataStreamException: Query execution has resulted in error (0x8000FFFF): Partial query failure: Catastrophic failure (message: 'PropertyBagEncoder::Add: non-contiguous entries: ', details: '').
Timestamp=2022-01-31T13:54:56.5237026Z

If I don't list any subkeys I don't get this issue and I can't understand why

UPDATE 2:

I found that bag_remove_keys has a bug. On the below query I get the described error in UPDATE 1

datatable(d:dynamic)
[
dynamic(
{
  "test1": "val",
  "test2": {},
  "test3": "val"
}
)
]
| extend d1=bag_remove_keys(d, dynamic(['$.SomeKey.Sub1', '$.SomeKey.Sub2']))

However, if I move the "test2" key at the end I don't get an error but d1 will not show the "test2" key in the output. enter image description here

Also, if I have a key in bag_remove_keys() that matches a key from the input like | extend d1=bag_remove_keys(d, dynamic(['$.SomeKey.Sub1', '$.SomeKey.Sub2', 'test1'])) then, again it will not error but will remove "test2" from the output enter image description here


Solution

  • Thanks for reporting it Andrei, it is a bug and we are working on a fix.

    Update - fix had been checked in and will be deployed within two weeks, please open a support ticket if you need it earlier.