TransWikia.com

CloudWatch InSights: how to extract/query all JSON array elements at once as a list

Server Fault Asked by Michael Oryl on November 7, 2021

My company has started using JSON logging in order to better support CloudWatch InSights queries on AWS. The queries are quite easy to work with, except when we are dealing with array data.

For example, if we have a log entries like the following:

{
  "id": 123,
  "method": "getRelatedPolicies",
  "policiesRetrieved": [
    "333g",
    "444q"
  ]
}
{
  "id": 222,
  "method": "getRelatedPolicies",
  "policiesRetrieved": [
    "123q",
    "234w",
    "345e",
    "456r"
  ]
}
{
  "id": 432,
  "method": "getRelatedPolicies",
  "policiesRetrieved": [
    "345e"
  ]
}

They flatten out in CloudWatch Insights like the following:

  id                    123,
  method                getRelatedPolicies
  policiesRetrieved.0   333g
  policiesRetrieved.1   444q


  id                    222,
  method                getRelatedPolicies
  policiesRetrieved.0   123q
  policiesRetrieved.1   234w
  policiesRetrieved.2   345e
  policiesRetrieved.3   456r


  id                    432,
  method                getRelatedPolicies
  policiesRetrieved.0   345e

But what can I do to search for any log entry where the policiesRetrieved array contains the value 345e? There could be any number of entries in the array, so I can’t just start adding filter lines like or policiesRetrieved.0 = "345e" or policiesRetrieved.1 = "345e"....

If I could collapse all of the values into a delimited string, then I could search for a match in the string PLUS I could also easily make use of that list if the user exported the data into CSV or some other non-AWS format for further analysis.

Can I somehow parse the array values into a string? I’ve looked through all of the available helper functions available in the queries, and nothing struck me as viable.

Any solutions would be appreciated.

One Answer

So the solution for my particular case was simple enough since the array in question contained only strings. I just parsed the content of the array inside the [ and ] as a single string. That works for an array of strings or numbers or booleans. It would not be so pretty if I wanted to extract the IDs of an array of objects.

In any case, here is a sample query parsing out the strings in the array:

fields @timestamp, id, method # you don't need to put the 'policyNumbers' up here - it is added automatically
| parse @message '"policyNumbers":[*]' as policyNumbers
#| filter policyNumbers like '234w' # Uncomment to show only entries that mention a specific policy

That will parse the following line:

{"timestamp":"2020-07-21T12:03:46.970Z","id":222,"method": "getRelatedPolicies","dataAccess":{"policyNumbers":["123q", "234w", "345e", "456r"]}}}

With id being 222, method being getRelatedPolicies, and policyNumbers having a value of "123q", "234w", "345e", "456r"

Answered by Michael Oryl on November 7, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP