google-cloud-platform google-cloud-dlp

DLP data scan of a BigQuery table shows start byte as null


I scanned a BigQuery table from the Google DLP console, and the scan results were saved to another BigQuery table. DLP identified the sensitive information, but the start byte is shown as null. Can anyone help me understand why?

The source data looks as follows:

2,[email protected]  ,858-333-0333,333-33-3333,8
3,[email protected],858-222-0222,222-22-2222,8
4,[email protected]  ,858-444-0444,444-44-4444,1 

------------------------------

If I put the same data in a Cloud Storage bucket and scan it with DLP, I do get the start and end bytes for the sensitive data.
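To make concrete what a start/end byte pair means for a flat file, here is a minimal sketch that computes the byte range of a phone-number match within one CSV line. The row uses a placeholder email address, and the regular expression is a simplified stand-in, not DLP's actual detector; the offsets are relative to the single line, not to a whole file.

```python
import re

# Hypothetical row in the same shape as the source data (placeholder email).
line = "2,user@example.com  ,858-333-0333,333-33-3333,8"

# Simplified stand-in for a phone-number detector; an assumption for
# illustration only, not the pattern DLP uses.
PHONE = re.compile(rb"\d{3}-\d{3}-\d{4}")


def byte_ranges(text, pattern):
    """Return (start_byte, end_byte, quote) for each match in the UTF-8 bytes."""
    data = text.encode("utf-8")
    return [
        (m.start(), m.end(), m.group().decode("utf-8"))
        for m in pattern.finditer(data)
    ]


ranges = byte_ranges(line, PHONE)
for start, end, quote in ranges:
    print(f"start={start} end={end} quote={quote}")
```

A flat file has a natural byte offset for every character, which is what makes a pair like this meaningful; a table cell is addressed by row and column instead.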


Solution

  • Unfortunately, this looks like a bug.

    I was able to reproduce your issue completely by following these steps:

    • Created a source CSV file:
    1,[email protected],858-333-0333,333-33-3333,8
    2,[email protected],858-333-0334,333-33-3334,3
    3,[email protected],858-333-0335,333-33-3335,5
    4,[email protected],858-333-0336,333-33-3336,1
    5,[email protected],858-333-0337,333-33-3337,4
    
    • Imported it into a BigQuery table.

    • Ran a DLP scan on it and got the same result, with the start byte shown as null.

    In my opinion this is a bug (it certainly looks like one), so my recommendation is to report it on Google's Issue Tracker, with as much detail as possible, and wait for an answer.
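A scripted reproduction may be useful to attach to such a report. The following is a minimal sketch, assuming the `google-cloud-dlp` Python client is installed. The project, dataset, and table names are placeholders, and the info types are an assumption based on the sample data, since the original scan settings are not shown.

```python
def build_inspect_job(project_id, dataset_id, table_id, findings_table_id):
    """Build an inspect job config that scans one BigQuery table and
    saves the findings to another table in the same dataset."""
    return {
        "storage_config": {
            "big_query_options": {
                "table_reference": {
                    "project_id": project_id,
                    "dataset_id": dataset_id,
                    "table_id": table_id,
                }
            }
        },
        "inspect_config": {
            # Assumed info types, chosen to match the sample columns.
            "info_types": [
                {"name": "EMAIL_ADDRESS"},
                {"name": "PHONE_NUMBER"},
                {"name": "US_SOCIAL_SECURITY_NUMBER"},
            ],
            "include_quote": True,
        },
        "actions": [
            {
                "save_findings": {
                    "output_config": {
                        "table": {
                            "project_id": project_id,
                            "dataset_id": dataset_id,
                            "table_id": findings_table_id,
                        }
                    }
                }
            }
        ],
    }


def submit_inspect_job(project_id, inspect_job):
    """Submit the job; needs google-cloud-dlp and valid credentials."""
    from google.cloud import dlp_v2  # imported here so the builder runs without it

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    return client.create_dlp_job(
        request={"parent": parent, "inspect_job": inspect_job}
    )


# Placeholder identifiers, not taken from the original post.
job = build_inspect_job("my-project", "my_dataset", "source_table", "dlp_findings")
```

Calling `submit_inspect_job("my-project", job)` would start the scan; the resulting findings table can then be checked for the null start byte.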