
Anomaly detection in a distributed system using generated log files

Artificial Intelligence, asked on December 30, 2021

I am developing an AI tool for anomaly detection in a distributed system. The system provides an interface that combines several individual logs into a single log file, generating approximately 7000 entries per minute. The log entries are partly system-generated (D-Bus, IPC, ...) and partly human-written statements ("Status not received", "initialized successfully", ...). The developers use the generated log for debugging. The entries have been configured to follow a similar format that depends on the generating system (timestamp, IDs, component, context, verbosity level, description, ...).
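
A minimal parsing sketch is shown below; the entry layout, field names and regular expression are assumptions for illustration, since the exact format is not given. The idea is simply to turn each raw line into a structured record before any pattern extraction:

    import re
    from dataclasses import dataclass
    from typing import Optional

    # Hypothetical entry layout: "<timestamp> <component> <context> <level> <description>";
    # the real fields and separators may differ, so this regex is only a placeholder.
    ENTRY_RE = re.compile(
        r"(?P<timestamp>\S+)\s+"
        r"(?P<component>\S+)\s+"
        r"(?P<context>\S+)\s+"
        r"(?P<level>\S+)\s+"
        r"(?P<description>.*)"
    )

    @dataclass
    class LogEntry:
        timestamp: str
        component: str
        context: str
        level: str
        description: str

    def parse_line(line: str) -> Optional[LogEntry]:
        """Turn one raw log line into a structured record, or return None if it does not match."""
        match = ENTRY_RE.match(line.strip())
        if match is None:
            return None
        return LogEntry(**match.groupdict())

    # Made-up example line:
    print(parse_line("2021-12-30T10:00:00 ipc-bridge ctx42 INFO Status not received"))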

Background:
1. The history of previously identified anomalies is minimal and not archived.
2. The log files contain few similar event templates.
3. The software execution rules are not clearly documented.
4. The log events are correlated.

Which algorithms (statistical, NLP, ML, neural networks) are recommended to efficiently extract patterns from the entries and identify both existing and new anomalous behavior?

One Answer

The paper "Unsupervised real-time anomaly detection for streaming data" (Subutai Ahmad, Alexander Lavin, Scott Purdy and Zuha Agha, 2017) describes an anomaly detection algorithm that is particularly suited to cases where a stream of data is continuously provided. The algorithm is based on Numenta's Hierarchical Temporal Memory (HTM) model.

I've actually never used it, but I know that Numenta's work is particularly suited to anomaly detection. You can take a look at it and see if it fits your needs. Also have a look at the Numenta Anomaly Benchmark (NAB).
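
As a rough illustration of the streaming setup such an algorithm expects, the sketch below flags per-minute entry counts with a plain rolling z-score baseline. This is not Numenta's HTM, and the window size and threshold are arbitrary assumptions; it is only a starting point before trying HTM or NAB:

    from collections import deque
    import math

    class RollingZScoreDetector:
        """Simple streaming baseline: flag a value whose z-score against a
        rolling window of recent values exceeds a threshold.
        Window size and threshold are illustrative, not tuned."""

        def __init__(self, window: int = 60, threshold: float = 3.0):
            self.values = deque(maxlen=window)
            self.threshold = threshold

        def update(self, value: float) -> bool:
            """Return True if `value` looks anomalous relative to the recent window."""
            anomalous = False
            if len(self.values) >= 10:  # wait for a minimal history before scoring
                mean = sum(self.values) / len(self.values)
                var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
                std = math.sqrt(var)
                if std > 0 and abs(value - mean) / std > self.threshold:
                    anomalous = True
            self.values.append(value)
            return anomalous

    # Feed per-minute log entry counts (around 7000/min in this system) as they arrive:
    detector = RollingZScoreDetector(window=60, threshold=3.0)
    for count in [7000, 7100, 6950, 7020, 7080, 6990, 7010, 7030, 6970, 7050, 15000]:
        if detector.update(count):
            print("Anomalous minute:", count)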

Answered by nbro on December 30, 2021
