
Best method to keep lookup file value fresh


Say, I have to monitor users' activities from 3 specific departments: Science, History, and Math.

The goal is to send an alert if any user in those departments downloads a file from site XYZ.

Currently, I have a lookup file for all the users from those three departments.

users
----------------------
user1@organization.edu
user2@organization.edu
user3@organization.edu
user4@organization.edu
user5@organization.edu

One problem: users can join, leave, or transfer to another department anytime.

Fortunately, those activities (join and leave) are tracked and they are Splunk-able.

index=directory status=*
-----------------------------------------------
{
"username":"user1@organization.edu",
"department":"Science",
"status":"added"
}
{
"username":"user1@organization.edu",
"department":"Science",
"status":"removed"
}
{
"username":"user2@organization.edu",
"department":"History",
"status":"added"
}
{
"username":"user3@organization.edu",
"department":"Math",
"status":"added"
}
{
"username":"MRROBOT@organization.edu",
"department":"Math",
"status":"added"
}

In this example, assuming I forgot to update the lookup file, I won't get an alert when MRROBOT@organization.edu downloads a file, and at the same time, I will still get an alert when user1@organization.edu downloads a file.

One solution I can think of is to update the lookup manually with inputlookup and outputlookup, like this:

| inputlookup users.csv | search users!="user1@organization.edu" | outputlookup users.csv

But I don't think this is an efficient method, especially since it's highly likely that I'll miss a user or two.

Is there a better way to keep the lookup file up to date? I googled around, and one suggestion is to use a cron job that calls curl to update the list. But I was wondering if there's a simpler or better alternative.


Solution

  • Here's a search that should automate the maintenance of the lookup file using the activity events in Splunk.

    `comment("Read in the lookup file.  Force them to have old timestamps")`
    | inputlookup users.csv | rename users AS username | eval _time=1, status="added"
    `comment("Add in activity events")`
    | append [ search index=directory status=* ]
    `comment("Keep only the most recent record for each user")`
    | stats latest(_time) as _time, latest(status) as status by username
    `comment("Throw out users with status of 'removed'")`
    | where NOT status="removed"
    `comment("Save the new lookup, keeping the original 'users' column name")`
    | rename username AS users
    | table users
    | outputlookup users.csv
    

    After the append command, you should have a list that looks like this:

    user1@organization.edu added
    user2@organization.edu added
    user3@organization.edu added
    user4@organization.edu added
    user5@organization.edu added
    user1@organization.edu added
    user1@organization.edu removed
    user2@organization.edu added
    user3@organization.edu added
    MRROBOT@organization.edu added
    

    The stats command will reduce it to:

    user4@organization.edu added
    user5@organization.edu added
    user1@organization.edu removed
    user2@organization.edu added
    user3@organization.edu added
    MRROBOT@organization.edu added
    

    with the where command further reducing it to:

    user4@organization.edu added
    user5@organization.edu added
    user2@organization.edu added
    user3@organization.edu added
    MRROBOT@organization.edu added
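
The append, stats, and where steps above can also be checked outside Splunk. The following is a minimal plain-Python sketch of that logic, not Splunk code. The function name `refresh_lookup` is hypothetical, and it assumes the events are already ordered oldest to newest, which stands in for Splunk's `_time` ordering.

```python
def refresh_lookup(lookup_users, events):
    """Return the users that remain after applying directory events.

    lookup_users: usernames currently in the lookup file.
    events: dicts with "username" and "status" keys, assumed to be
            ordered oldest to newest.
    """
    latest_status = {}
    # Lookup rows act as the oldest records (the `eval _time=1` trick).
    for user in lookup_users:
        latest_status[user] = "added"
    # Later events overwrite earlier ones, like latest(status) by username.
    for event in events:
        latest_status[event["username"]] = event["status"]
    # Drop removed users, like `where NOT status="removed"`.
    return [user for user, status in latest_status.items() if status != "removed"]


# Sample data copied from the question.
lookup_users = [f"user{n}@organization.edu" for n in range(1, 6)]
events = [
    {"username": "user1@organization.edu", "department": "Science", "status": "added"},
    {"username": "user1@organization.edu", "department": "Science", "status": "removed"},
    {"username": "user2@organization.edu", "department": "History", "status": "added"},
    {"username": "user3@organization.edu", "department": "Math", "status": "added"},
    {"username": "MRROBOT@organization.edu", "department": "Math", "status": "added"},
]

remaining = refresh_lookup(lookup_users, events)
print(sorted(remaining))
```

With the sample data, user1 is dropped because its latest status is "removed", MRROBOT is picked up from the events, and user4 and user5 survive because the lookup rows are treated as old "added" records.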