elasticsearch, logstash, kibana, filebeat

How do I import a multiline text log file into a local Elasticsearch installation?


I have a text log file from our Cypress automation tests, named cypress_tests.txt. A sample of the log file's format is shown below:

2021-05-08T14:04:24.6661291Z     Spec                       Pass Fail Pending Skipped 
2021-05-08T14:04:24.6662575Z   ┌────────────────────────────────────────────────────┐
2021-05-08T14:04:24.6762664Z   │ √  chart-color      00:10  1     -    -      -     │
2021-05-08T14:04:24.6763156Z   │    s.spec.ts                                       │
2021-05-08T14:04:24.6763941Z   ├────────────────────────────────────────────────────┤
2021-05-08T14:04:24.6764594Z   │ √  charts.spec      00:10  1     -    -      -     │
2021-05-08T14:04:24.6765313Z   │    .ts                                             │
2021-05-08T14:04:24.6765791Z   ├────────────────────────────────────────────────────┤
2021-05-08T14:04:24.6766445Z   │ √  enterprise-      00:05  -     1    -      -     │
2021-05-08T14:04:24.6767007Z   │    model.spec.ts                                   │
2021-05-08T14:04:24.6914477Z   └────────────────────────────────────────────────────┘
2021-05-08T14:04:24.6915160Z     √  All specs passed! 00:15 2     1   -       -

I would like to import this text file into my local Elasticsearch installation and index it, so that I can search the number of passed and failed tests, including the test case name. I could then build a bar chart using a Kibana visualization.

When I import the text file into Elasticsearch using the Add Data / import CSV option, create the index, and give it an index name, the resulting index field pattern appears as in the screenshot below.
Elasticsearch index pattern from imported text log file

In Elasticsearch Discover, after selecting the log file from the drop-down, the available fields are shown on the left-hand side: id, index, timestamp, and a field called message. The message field contains all the data from the log file: "All specs passed", the number of passed and failed tests, e.g. the total 82 for the Pass column and 75 for the Fail column. Everything is in the message field. Screenshot below:

Elasticsearch Discover fields from text log file

How do I parse the log file to map the fields, so that 82 goes into the Pass column (in the sample log screenshot above the value would be 2), chart-colors.spec.ts into the Spec column, and 75 (from the screenshot below) into the Fail column?

If this cannot be done, would I need to ask the owner of our automation framework to change the way the test results are written to the log file?

Any help, guidance much appreciated. Thank you.


Solution

  • You need to tell the owner of your automation framework to produce something more machine-readable. Cypress can output JSON... use it.

    Here's a partial example of the grok nightmare required to parse the tabular, human-readable format:

    ^%{TIMESTAMP_ISO8601:start_time} +Spec +Pass +Fail +Pending +Skipped *\n(?<testblock>%{TIMESTAMP_ISO8601} +[┌├][^│]+(\n%{TIMESTAMP_ISO8601} +│ +[^ ][^│]+│){2}\n){3}(?<footer>%{TIMESTAMP_ISO8601} +└─*┘\n%{TIMESTAMP_ISO8601:end_time}.*specs.*)$
    

    This gets you an array of fields containing one test each, plus the header and footer. You would still have to use more split and grok filters (or a ruby filter if you prefer) to separate out the individual tests and parse their awesome multi-line wrapped tabulation style. Just don't... this format is not for machines.
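    If you do go down this road anyway, a minimal Logstash pipeline sketch might look like the following. The file path and index name are placeholders, and the multiline codec settings assume each table starts at the `Spec  Pass  Fail ...` header line, as in the sample above:

    ```
    input {
      file {
        path => "/path/to/cypress_tests.txt"   # placeholder path
        start_position => "beginning"
        codec => multiline {
          # A line matching the header starts a new event; every
          # non-matching line is glued onto the previous event.
          pattern => "Spec +Pass +Fail +Pending +Skipped"
          negate => true
          what => "previous"
        }
      }
    }

    filter {
      grok {
        match => { "message" => "..." }   # a pattern like the one above
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "cypress-logs"           # placeholder index name
      }
    }
    ```

    The multiline codec at least gets the whole table into a single event, but you are still left with the wrapped-row parsing problem described above.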

    Better to use something like `cypress run --reporter json`, as documented here.
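    As a sketch of that route: Cypress's `--reporter json` produces a Mocha-style JSON report, and a small script can turn it into NDJSON for Elasticsearch's `_bulk` API. The field names below (`passes`, `failures`, `fullTitle`, `duration`) follow Mocha's json reporter; the index name is a placeholder, so adjust both to your setup:

    ```python
    import json

    def report_to_bulk(report: dict, index: str = "cypress-tests") -> str:
        """Convert a Mocha-style JSON report into _bulk-ready NDJSON.

        Emits one action line plus one document line per test, so the
        result can be POSTed directly to Elasticsearch's _bulk endpoint.
        """
        lines = []
        for status in ("passes", "failures"):
            for test in report.get(status, []):
                doc = {
                    "spec": test.get("fullTitle", ""),
                    "status": "pass" if status == "passes" else "fail",
                    "duration_ms": test.get("duration", 0),
                }
                lines.append(json.dumps({"index": {"_index": index}}))
                lines.append(json.dumps(doc))
        return "\n".join(lines) + "\n"
    ```

    With per-test documents indexed this way, the pass/fail bar chart in Kibana becomes a simple terms aggregation on the `status` field.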