Add filter operator #10

@MasWag


Overview

I propose a syntax for filtering logs in SyMon. The tentative syntax is as follows.

filter ( event1: field1; event2: field2; ... eventn: fieldn ) {
    expr
}

Intuitively, for each value of the specified fields, this expression ignores the listed events whose field differs from that value and matches expr against the remaining log. Consider the following expression.

filter ( request: id; response: id ) {
    request( id );
    response( id )
}

This expression matches the prefixes of length three and four of the following log, for id = bar and id = foo, respectively.

request	foo	0.5
request	bar	1.0
response	bar	2.0
response	foo	2.5

For id = foo, SyMon ignores the request and response events with id != foo, so the monitored log is treated as the following "filtered" log, which request( id ); response( id ) matches.

request	foo	0.5
response	foo	2.5
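
The filtering step in this example can be sketched in C++ as follows. This is only an illustration, not SyMon's code: Entry, FilterSpec, project, and exampleLog are hypothetical names, fields are assumed to be strings addressed by position, and events that are not listed in the filter are assumed to be kept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical log entry: event name, string-valued fields, timestamp.
struct Entry {
    std::string name;
    std::vector<std::string> fields;
    double timestamp;
};

// Hypothetical filter specification: event name -> index of the field to compare.
using FilterSpec = std::unordered_map<std::string, std::size_t>;

// Keep an entry if its event is not listed in `spec`,
// or if the listed field of the entry equals `value`.
std::vector<Entry> project(const std::vector<Entry>& log,
                           const FilterSpec& spec,
                           const std::string& value) {
    std::vector<Entry> filtered;
    for (const Entry& e : log) {
        auto it = spec.find(e.name);
        if (it == spec.end()) {
            filtered.push_back(e);  // unlisted events are not filtered
            continue;
        }
        assert(it->second < e.fields.size());  // the listed field must exist
        if (e.fields[it->second] == value) {
            filtered.push_back(e);
        }
    }
    return filtered;
}

// The example log from this issue.
std::vector<Entry> exampleLog() {
    return {
        {"request", {"foo"}, 0.5},
        {"request", {"bar"}, 1.0},
        {"response", {"bar"}, 2.0},
        {"response", {"foo"}, 2.5},
    };
}
```

With the specification { request: field 0, response: field 0 }, calling project(exampleLog(), spec, "foo") keeps only the entries at timestamps 0.5 and 2.5.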

Potential assumptions

In the filtering mechanism above, it is natural to assume that all the specified fields have the same type. Perhaps we can assume that they have string values. To make the implementation simple, it is reasonable to allow filter only at the beginning of the main expression.

Idea of the implementation

In the current implementation, the set of current configurations is an unordered_set. One reasonable way to implement the filtering mechanism above is to keep the current configurations per filtered value, as an unordered_map<string, unordered_set>. This idea should work for the three monitoring algorithms. However, the current matching algorithms take an automaton, not the expression, so this interface probably needs some updates, such as passing the filter fields when a filter is specified.
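
A minimal sketch of the per-value table might look like the following. The names FilteredConfigurations, at, and trackedValues are hypothetical, and Configuration is reduced to an int placeholder; the actual configuration type in SyMon is richer and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Placeholder for an automaton configuration (an assumption for this sketch).
using Configuration = int;

// Hypothetical container: one configuration set per value of the filtered field.
class FilteredConfigurations {
public:
    explicit FilteredConfigurations(std::unordered_set<Configuration> initial)
        : initial_(std::move(initial)) {
        assert(!initial_.empty());  // monitoring needs at least one initial configuration
    }

    // Return the configuration set for `value`, seeding it with the initial
    // configurations the first time this value is seen.
    std::unordered_set<Configuration>& at(const std::string& value) {
        auto it = table_.find(value);
        if (it == table_.end()) {
            it = table_.emplace(value, initial_).first;
        }
        return it->second;
    }

    // Number of distinct filtered values seen so far.
    std::size_t trackedValues() const { return table_.size(); }

private:
    std::unordered_set<Configuration> initial_;
    std::unordered_map<std::string, std::unordered_set<Configuration>> table_;
};
```

An event whose filtered field is foo would then only touch at("foo"), leaving the sets for other values unchanged. Events that are not listed in the filter would presumably have to be applied to every set; that part is left out of the sketch.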

When this would be useful

It is common to monitor a log in which each entry is labeled with an identifier, such as a transaction ID. Introducing filter simplifies the specification and also makes monitoring more efficient.

Semantics

Mathematically, the semantics of filter would be as follows:

A log $a_1 a_2 a_3 \cdots a_n$ is matched by

filter ( event1: field1; event2: field2; ... eventn: fieldn ) {
    expr
}

if and only if, for some value $v$, the (filtered) log $b_1 b_2 b_3 \cdots b_m$ satisfying the following conditions is matched by expr.

  • Each $b_i$ is such that $b_i = a_{i'}$ for some $i'$, and for $b_i = a_{i'}$ and $b_j = a_{j'}$, $i < j$ implies $i' < j'$.
    • Since each entry in the log is associated with a strictly increasing timestamp, such a correspondence is unique.
  • For any $a_{i'}$, such a $b_i$ exists if and only if one of the following conditions holds.
    • The name of $a_{i'}$ is none of event1, event2, …, eventn, i.e., we do not filter events that are not in the given list.
    • For some $k$, the name of $a_{i'}$ is eventk and the value of fieldk in $a_{i'}$ is $v$, i.e., if the event is specified in the list, the value of the specified field must be $v$.
  • The name of the last event $b_m$ must be one of event1, event2, …, eventn. This removes suffixes that consist only of unfocused events from the results.

Labels

enhancement (New feature or request)