Analysis Configuration

Cristina Alonso edited this page Apr 18, 2018 · 9 revisions

Introduction

RAGE Analytics provides a way to define the type of analysis performed on the received data and to configure the resulting analysis. The rage-analytics frontend provides a user interface for this process, described below.

Select the Analysis

This step allows the developer to define a customized analysis package that will be used to process the analysis data.

1_select_analysis Figure 1.: developers can select either a personalized analysis or the default analysis.

The Analysis package

An analysis takes data from any source (for example, interaction data from games) and stores it for later visualization. The analysis package is a .zip package composed of the following files:

  1. realtime.jar (for analysis). A jar file with the analysis topology in a valid Storm and Flux format.
  2. flux.yml (for configuration). A configuration file for the analysis.
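
As a minimal sketch of the package layout described above, the following Python snippet builds a .zip archive with the two files and checks that both are present. The function names are hypothetical, and placing both files at the root of the archive is an assumption; this page does not state the expected internal layout.

```python
import io
import zipfile

# The two entries that this page says make up an analysis package.
REQUIRED_FILES = ("realtime.jar", "flux.yml")


def build_analysis_package(jar_bytes: bytes, flux_yaml: str) -> bytes:
    """Return the bytes of a .zip holding realtime.jar and flux.yml.

    Assumption: both files sit at the root of the archive.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("realtime.jar", jar_bytes)
        package.writestr("flux.yml", flux_yaml)
    return buffer.getvalue()


def missing_files(package_bytes: bytes) -> list:
    """List the required entries that are absent from the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        names = set(package.namelist())
    return [name for name in REQUIRED_FILES if name not in names]
```

Running `missing_files` on a package before uploading it is a cheap way to catch an archive that lacks one of the two files.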

realtime.jar

The realtime.jar contains the classes (and dependencies) referenced within the 'flux.yml' configuration file. It may also contain the complete topology used in the analysis and launched by Flux; more information here.

The rage-analytics back-end transforms the received xAPI Statements into data that will be sent to the realtime analysis. The received data from the tracker is transformed to the following JSON format:

{
    // Useful information for aggregation analysis, if needed
    "versionId": "123213...",
    "gameplayId": "1122233...",

    // Actor.name
    "name": "wild-grey-fox",

    // Verb.id
    "event": "preferred",
    // Object.id
    "target": "Menu_Start",
    // Object.definition.type
    "type": "Alternative",
    // Result.response
    "response": "Tutorial Mode",

    // Result.extensions parsing
    "performanceAnalysis": 0.7,
    "hasInventory": false,
    "previousResponse": "Job completed!",

    // Timestamp field used by the visualizations
    "timestamp": "2016-12-31T13:05:12Z"
}

This transformation process is performed here. Some tests demonstrating this process can be found here.
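
To illustrate the field mapping in the comments of the JSON sample above, here is a simplified Python sketch. It is not the back-end's actual implementation. The function names, the example.org URIs, and the rule of keeping only the last path segment of each URI are assumptions inferred from the sample; versionId and gameplayId are passed in as parameters because this page does not say where they come from.

```python
def last_segment(uri: str) -> str:
    """Return the text after the final '/' of a URI (assumed naming rule)."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def flatten_statement(statement: dict, version_id: str, gameplay_id: str) -> dict:
    """Flatten an xAPI statement into the trace format shown above."""
    result = statement.get("result", {})
    trace = {
        "versionId": version_id,
        "gameplayId": gameplay_id,
        "name": statement["actor"]["name"],                               # Actor.name
        "event": last_segment(statement["verb"]["id"]),                   # Verb.id
        "target": last_segment(statement["object"]["id"]),                # Object.id
        "type": last_segment(statement["object"]["definition"]["type"]),  # Object.definition.type
        "timestamp": statement["timestamp"],
    }
    if "response" in result:
        trace["response"] = result["response"]                            # Result.response
    # Each Result.extensions entry becomes a top-level field.
    for key, value in result.get("extensions", {}).items():
        trace[last_segment(key)] = value
    return trace


# Hypothetical statement; the example.org URIs are placeholders.
statement = {
    "actor": {"name": "wild-grey-fox"},
    "verb": {"id": "http://example.org/verbs/preferred"},
    "object": {
        "id": "http://example.org/activities/Menu_Start",
        "definition": {"type": "http://example.org/activity-types/Alternative"},
    },
    "result": {
        "response": "Tutorial Mode",
        "extensions": {"http://example.org/extensions/hasInventory": False},
    },
    "timestamp": "2016-12-31T13:05:12Z",
}
trace = flatten_statement(statement, "123213", "1122233")
```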

flux.yml

The flux.yml file can contain some variables that might be useful for the analysis:

  • zookeeperURL: used to connect to the Kafka queue to consume traces.
  • sessionId: a unique identifier that must be used as the index name when writing the analysis documents to Elasticsearch.
  • elasticsearchUrl: the Elasticsearch host, without the port (the default Elasticsearch port is 9200), used to connect to Elasticsearch and persist the results.

More information about the Storm-Flux specification can be found here.

The basic information contained inside the flux.yml file is:

# required, otherwise it won't work
name: "{{sessionId}}"                          # Topology name, must be uniquely identified using the "sessionId" value (e.g. "57d95801c5b55edc3259b450")

# values substituted when the topology is launched
config:
  topology.workers: 1
  topology.max.spout.pending: 1000
  zookeeperUrl: "{{zookeeperURL}}"             # Connects to Kafka (e.g. "localhost:2181")
  sessionId: "{{sessionId}}"                   # Value used to name the Elasticsearch indices (e.g. "57d95801c5b55edc3259b450")
  elasticsearchUrl: "{{elasticsearchUrl}}"     # Connects to the Elasticsearch host (without the port, e.g. "localhost"; default port 9200)

# spout, bolts and streams definitions. Example: https://github.com/apache/storm/blob/master/external/flux/README.md#basic-word-count-example.
# or topology Source. Example: https://github.com/apache/storm/blob/master/external/flux/README.md#existing-topologies
topologySource:
  className: "es.eucm.rage.realtime.Analysis"  # Name of the Storm Trident main class inside the "realtime.jar"
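
The `{{...}}` placeholders in the file above are replaced with concrete values when the topology is launched. The sketch below shows one way such a substitution could work. It is an illustration only: the function name and the regular expression are hypothetical, and the back-end's actual templating mechanism is not described on this page.

```python
import re

# Matches placeholders such as {{sessionId}} or {{ zookeeperURL }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_flux_template(template: str, values: dict) -> str:
    """Replace every {{name}} placeholder with values[name].

    Raises KeyError when a placeholder has no value, so a half-rendered
    configuration is never produced.
    """
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return str(values[name])

    return PLACEHOLDER.sub(replace, template)


template = 'name: "{{sessionId}}"\nzookeeperUrl: "{{zookeeperURL}}"'
rendered = render_flux_template(
    template,
    {"sessionId": "57d95801c5b55edc3259b450", "zookeeperURL": "localhost:2181"},
)
```

Note that the placeholder is spelled `zookeeperURL` while the configuration key is `zookeeperUrl`; the sketch treats placeholder names as case-sensitive, so the two must not be confused.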

The Default Package

A default analysis package is provided.

  • An example of the default flux.yml file can be found here: https://github.com/e-ucm/rage-analytics-backend/blob/master/default-flux.yml
  • The default realtime.jar file is generated from the rage-analytics real-time project, and the topology used can be found here.

The default topology uses the Storm Trident API and is launched using the Flux Trident API support. It performs the following operations:

  1. Connect to the Kafka queue using the zookeeperURL provided by the flux.yml file.
  2. Start consuming data from Kafka, then filter and transform it depending on the received format (Completables, Reachables, Variables and Alternatives). More information about the tracking model.
  3. Persist the analyzed data to Elasticsearch so that we can start configuring visualizations.
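
The three steps above can be sketched in plain Python, with a list standing in for the Kafka queue and a dictionary standing in for Elasticsearch. This is not the Storm Trident topology itself. The names are hypothetical, the `type` values are assumed to be the singular category names used in the sample trace earlier on this page, and the per-type transformation is left out because it is not specified here.

```python
from collections import defaultdict

# Assumption: the "type" field holds the singular category name,
# as in the sample trace ("Alternative").
TRACKED_TYPES = {"Completable", "Reachable", "Variable", "Alternative"}


def analyze(traces, session_id, store):
    """Keep traces of a tracked type and store them under the session index."""
    for trace in traces:                        # step 2: consume traces
        if trace.get("type") in TRACKED_TYPES:  # step 2: filter by type
            # step 3: persist, using sessionId as the index name
            store[session_id].append(dict(trace))
    return store


store = defaultdict(list)  # stand-in for Elasticsearch: index name -> documents
queue = [                  # stand-in for the Kafka queue (step 1)
    {"type": "Alternative", "target": "Menu_Start"},
    {"type": "Unknown", "target": "ignored"},
]
analyze(queue, "57d95801c5b55edc3259b450", store)
```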

Select the Index

Kibana uses this index to locate the data it displays. We can create a file with our own index and upload it, or write it directly in the JSON text area, which contains the default index as an example.

2_select_index Figure 2.: developers can select an index template or use the default index.

Kibana and Elasticsearch use "Index Patterns" to describe the structure of the data that they display or query. Your analysis should bundle an Index Pattern.

The Default Index

A default index is registered when the system starts. The default index is configured to connect Kibana default visualizations with the results from the Storm Flux realtime analysis. An example of the default index can be found here.

Select Visualizations

We have to upload the visualizations that we want to use to show the results of the analysis. To do so, we can upload a file with the visualization or write it in the text area. Each visualization is stored on the server as a template so that it can be used in several games. All added visualizations are displayed as a list, from which we can choose the ones to use in the current game.

3_select_visualizations Figure 3.: developers can select a visualization template or some of the default visualizations.

Visualization templates describe families of graphics and plots; a Visualization Template, when combined with fields from an Index Pattern, fully describes a visualization, which can then be populated with data.

How to Create Your Own Visualizations

We use Kibana as our dashboard engine. Kibana is a powerful dashboard configuration tool, and we can use it to define our own customized visualizations. Kibana has a user guide on configuring visualizations. Once we have configured our dashboard, we have to export the information related to the attached index and the visualization as JSON objects and import them into RAGE.

It is also possible to use the set of default visualizations for the teacher or developer.

Configure Fields

Once we have selected the index and the visualizations that we want to display to the teacher, we have to choose which fields each visualization will use. A table lists the fields of each visualization, and for each one we can select a field from our analysis (index). We can upload a single visualization for different games and configure the fields to be used in each game.

4_configure_fields Figure 4.: fields configuration.

Test the Configuration

Having followed the previous steps, we have configured the dashboards available to a user with the teacher role. To test the visualizations and the available analysis, we can add a statements file with xAPI traces (JSON-formatted). The format of the JSON file is defined here. An example file with data to test the visualizations can be obtained here. A dashboard with the selected visualizations will then be displayed:

all_session Figure 5.: configuration test result.

Note that if the visualizations display zero results, the cause may be a lack of data to fill the given visualization, in which case we will need to use a larger test set:

no_results

Figure 6.: no results message.

Another possible cause is that the test data falls outside the time frame established by the dashboard. To set a custom time frame, we have to define the time range used by the dashboard and, possibly, the auto-refresh rate:

time_frame

Figure 7.: set a time-frame in Kibana.
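
To decide which time range to set, it can help to check the timestamps in the test data first. The following sketch uses hypothetical helper names and assumes every trace carries a UTC `timestamp` in the format of the sample trace, such as "2016-12-31T13:05:12Z". It computes the range that covers a set of traces and shows why a short, recent window returns nothing for older data.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as "2016-12-31T13:05:12Z"."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def covering_range(traces):
    """Return the earliest and latest timestamps found in the traces."""
    stamps = [parse_timestamp(trace["timestamp"]) for trace in traces]
    return min(stamps), max(stamps)


def traces_in_range(traces, start, end):
    """Return the traces whose timestamp lies within [start, end]."""
    return [t for t in traces if start <= parse_timestamp(t["timestamp"]) <= end]


traces = [
    {"timestamp": "2016-12-31T13:05:12Z"},
    {"timestamp": "2017-01-02T08:00:00Z"},
]
start, end = covering_range(traces)

# A "last 15 minutes" window evaluated on a later date matches no trace.
later = datetime(2018, 4, 18, tzinfo=timezone.utc)
recent = traces_in_range(traces, later - timedelta(minutes=15), later)
```

Setting the dashboard time range to span at least `start` to `end` ensures that all the test traces fall inside it.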

Once a game has been correctly configured, the defined visualizations (and dashboards) will be available to the teacher when a class (session) is created.
