kubemon

A tool for distributed container monitoring over Kubernetes.

Citation

@inproceedings{Horchulhack2023,
  series = {SBSeg Estendido 2023},
  title = {Kubemon: extrator de métricas de desempenho de sistema operacional e aplica\c{c}ões conteinerizadas em ambientes de nuvem no domínio do provedor},
  url = {http://dx.doi.org/10.5753/sbseg_estendido.2023.233247},
  DOI = {10.5753/sbseg_estendido.2023.233247},
  booktitle = {Anais Estendidos do XXIII Simpósio Brasileiro de Seguran\c{c}a da Informa\c{c}ão e de Sistemas Computacionais (SBSeg Estendido 2023)},
  publisher = {Sociedade Brasileira de Computa\c{c}ão - SBC},
  author = {Horchulhack,  Pedro and Viegas,  Eduardo K. and Santin,  Altair O. and Ramos,  Felipe V.},
  year = {2023},
  month = sep,
  collection = {SBSeg Estendido 2023}
}

Environment requirements

  • Ubuntu 18.04
  • Kubernetes v1.19
  • Docker v19.03.13
  • Python 3.8
  • GNU Make 4.2.1

Main functionalities

  • Collect data within the provider domain
  • The data are collected within Kubernetes Pods
  • Can be configured through Kubernetes environment variables
  • Collects metrics from operating system, Docker containers and processes created by the container
  • Send the collected metrics to the collector module, which saves the data in a CSV file
  • Can be controlled remotely by either a basic CLI or Python API

Collected metrics

For more information about the collected metrics, please refer to:

Operating System

| Type | Unit | Metric |
|------|------|--------|
| CPU | Quantity | Context Switches |
| CPU | Quantity | Interrupts |
| CPU | Quantity | Soft Interrupts |
| CPU | Quantity | Syscalls |
| CPU | Clock Ticks | Times User |
| CPU | Clock Ticks | Times System |
| CPU | Clock Ticks | Times Nice |
| CPU | Clock Ticks | Times Softirq |
| CPU | Clock Ticks | Times IRQ |
| CPU | Clock Ticks | Times IOWait |
| CPU | Clock Ticks | Times Guest |
| CPU | Clock Ticks | Times Guest Nice |
| CPU | Clock Ticks | Times Idle |
| Memory | Quantity | Active (Anon) |
| Memory | Quantity | Inactive (Anon) |
| Memory | Quantity | Inactive (file) |
| Memory | Quantity | Active (file) |
| Memory | Quantity | Mapped Pages |
| Memory | KB | KB Paged In Since Boot (pgpgin) |
| Memory | KB | KB Paged Out Since Boot (pgpgout) |
| Memory | Quantity | Pages Free (pgfree) |
| Memory | Quantity | Page Faults (pgfault) |
| Memory | Quantity | Major Page Faults (pgmajfault) |
| Memory | Quantity | Pages Reused (pgreuse) |
| Disk | Requests | Read I/O |
| Disk | Requests | Read I/O Merged with In-Queue I/O |
| Disk | Sectors | Read Sectors |
| Disk | Milliseconds | Total Wait Time for Read Requests |
| Disk | Requests | Write I/O |
| Disk | Requests | Write I/O Merged with In-Queue I/O |
| Disk | Sectors | Write Sectors |
| Disk | Milliseconds | Total Wait Time for Write Requests |
| Disk | Requests | I/O in Flight |
| Disk | Milliseconds | Total Time This Block Device Has Been Active |
| Disk | Milliseconds | Total Wait Time for All Requests |
| Disk | Requests | Discard I/O Processed |
| Disk | Requests | Discard I/O Processed with In-Queue I/O |
| Disk | Sectors | Discard Sectors |
| Disk | Milliseconds | Total Wait Time for Discard Requests |
| Disk | Requests | Flush I/O Processed |
| Disk | Milliseconds | Total Wait Time for Flush Requests |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |
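
The CPU times above are reported in clock ticks, which are easier to interpret once converted to seconds or to a share of total CPU time. The following is a minimal sketch, not part of Kubemon: the helper names `ticks_to_seconds` and `cpu_share` and the dictionary keys are hypothetical, and it assumes the values were sampled on a Unix host where `os.sysconf("SC_CLK_TCK")` reports the tick rate.

```python
import os


def ticks_to_seconds(ticks, ticks_per_second=None):
    """Convert a clock-tick counter to seconds."""
    if ticks_per_second is None:
        # SC_CLK_TCK is the number of clock ticks per second on this host.
        ticks_per_second = os.sysconf("SC_CLK_TCK")
    return ticks / ticks_per_second


def cpu_share(times):
    """Return each CPU time as a fraction of the summed CPU time."""
    total = sum(times.values())
    if total == 0:
        return {name: 0.0 for name in times}
    return {name: value / total for name, value in times.items()}


if __name__ == "__main__":
    # Hypothetical sample; the key names are illustrative only.
    sample = {"user": 30, "system": 10, "idle": 60}
    print(ticks_to_seconds(sample["user"], ticks_per_second=100))
    print(cpu_share(sample))
```

Passing `ticks_per_second` explicitly is useful when the data is analysed on a different machine than the one it was collected on.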

Docker Processes

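| Type | Unit | Metric |
|------|------|--------|
| CPU | Clock Ticks | User Time |
| CPU | Clock Ticks | System Time |
| CPU | Clock Ticks | Children User |
| CPU | Clock Ticks | Children System |
| CPU | Clock Ticks | IOWait |
| Memory | Pages | Total Program Size (size) |
| Memory | Pages | Resident Set Size (resident) |
| Memory | Pages | Resident Shared Pages (shared) |
| Memory | Pages | Text (text) |
| Memory | Pages | Library (lib) |
| Memory | Pages | Data + Stack (data) |
| Memory | Pages | Dirty Pages (dt) |
| Disk | Requests | Read |
| Disk | Requests | Write |
| Disk | Bytes | Read |
| Disk | Bytes | Write |
| Disk | Chars | Read |
| Disk | Chars | Write |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |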
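
The per-process memory metrics are expressed in pages, and their names (size, resident, shared, text, lib, data, dt) appear to correspond to the fields of Linux's `/proc/<pid>/statm`. As a hedged illustration of how such values can be turned into bytes, the sketch below parses one statm-style line. The function `parse_statm` is hypothetical and is not Kubemon's implementation.

```python
import os

# Field order of /proc/<pid>/statm, using the metric names listed above.
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(line, page_size=None):
    """Parse one statm-style line into a dict of byte counts."""
    if page_size is None:
        # Page size of the current host, in bytes.
        page_size = os.sysconf("SC_PAGE_SIZE")
    values = [int(v) for v in line.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError(
            f"expected {len(STATM_FIELDS)} fields, got {len(values)}"
        )
    return {name: pages * page_size
            for name, pages in zip(STATM_FIELDS, values)}


if __name__ == "__main__":
    # Illustrative input, not real Kubemon output.
    print(parse_statm("100 50 10 5 0 30 0", page_size=4096))
```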

Docker

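| Type | Unit | Metric |
|------|------|--------|
| CPU | Clock Ticks | User |
| CPU | Clock Ticks | System |
| CPU | Quantity | Periods |
| CPU | Quantity | Throttled |
| CPU | Clock Ticks | Throttled Time |
| Memory | Pages | Resident Set Size (rss) |
| Memory | Pages | Cached |
| Memory | Pages | Mapped (mapped_file) |
| Memory | Pages | Paged In (pgpgin) |
| Memory | Pages | Paged Out (pgpgout) |
| Memory | Pages | Page Faults (pgfault) |
| Memory | Pages | Major Page Faults (pgmajfault) |
| Memory | Pages | Active (active_anon) |
| Memory | Pages | Inactive (inactive_anon) |
| Memory | Pages | Active File (active_file) |
| Memory | Pages | Inactive File (inactive_file) |
| Memory | Pages | Unevictable |
| Disk | Bytes | Read |
| Disk | Bytes | Write |
| Disk | Bytes | Sync |
| Disk | Bytes | Async |
| Disk | Bytes | Discard |
| Disk | Bytes | Total |
| Network | Bytes | Sent |
| Network | Bytes | Received |
| Network | Packets | Sent |
| Network | Packets | Received |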
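
One way to use the container CPU metrics is to estimate how often a container was throttled between two samples, by dividing the increase in Throttled by the increase in Periods. The sketch below assumes both counters are cumulative. The function name `throttle_ratio` and the dictionary keys `periods` and `throttled` are hypothetical and do not reflect the column names Kubemon writes.

```python
def throttle_ratio(previous, current):
    """Fraction of CPU periods in which the container was throttled.

    Both arguments are samples holding cumulative 'periods' and
    'throttled' counters (hypothetical key names).
    """
    periods = current["periods"] - previous["periods"]
    throttled = current["throttled"] - previous["throttled"]
    if periods < 0 or throttled < 0:
        raise ValueError("counters decreased between samples")
    if periods == 0:
        # No scheduling periods elapsed, so nothing was throttled.
        return 0.0
    return throttled / periods


if __name__ == "__main__":
    before = {"periods": 1000, "throttled": 100}
    after = {"periods": 1200, "throttled": 150}
    print(throttle_ratio(before, after))
```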

Installation

Before installing Kubemon, make sure Kubernetes and Docker are properly installed in the system.

  1. Download the latest version here: kubemon

  2. Extract the zip file and change into the extracted directory.

  3. Update the nodeName field in kubernetes/04_collector.yaml to the name of your Kubernetes control-plane node.

  4. Apply the Kubernetes objects within kubernetes/:

    $ kubectl apply -f kubernetes/
    namespace/kubemon created
    configmap/kubemon-env created
    persistentvolume/kubemon-volume created
    persistentvolumeclaim/kubemon-volume-claim created
    service/collector created
    service/monitor created
    pod/collector created
    daemonset.apps/kubemon-monitor created

The following sections describe how to configure and run the data collection process.

Configuration

Kubemon exposes a few user-defined variables. For instance, NUM_DAEMONS is a required field that must be set before running the tool; it denotes the number of client instances expected to connect to the collector component. The Kubemon components are configured through environment variables inside the Kubernetes pods.

The configuration file is kubernetes/01_configmap.yaml. In the current version of Kubemon, this ConfigMap lists all the configurable variables; update it according to your needs.
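
Because the ConfigMap values reach the components as environment variables, a process inside a pod can read them with the standard library. The following is only an illustrative sketch of that pattern, not Kubemon's actual code: the function `read_num_daemons`, its default value, and its validation are assumptions. Only the variable name NUM_DAEMONS comes from the text above.

```python
import os


def read_num_daemons(environ=None, default=1):
    """Read NUM_DAEMONS from the environment as a positive integer."""
    env = os.environ if environ is None else environ
    raw = env.get("NUM_DAEMONS")
    if raw is None:
        # Hypothetical fallback; Kubemon may treat a missing value differently.
        return default
    value = int(raw)  # raises ValueError if the value is not an integer
    if value < 1:
        raise ValueError("NUM_DAEMONS must be at least 1")
    return value


if __name__ == "__main__":
    print(read_num_daemons({"NUM_DAEMONS": "2"}))
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.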

The collected metrics will be saved in the Kubernetes control-plane node by default, in /mnt/kubemon-data. This setting can be changed in ./kubernetes/02_volumes.yaml by updating the hostPath field.

Example:

# Before
...
hostPath:
    path: "/mnt/kubemon-data"
    
# After
...
hostPath:
    path: "/home/user/data"
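
Once data has been collected under the configured host path, a quick sanity check is to list the CSV files and count their rows. The sketch below assumes that the output directory contains CSV files whose first row is a header. The exact directory layout and column names are not documented here, so the function `summarize_csv_files` is a hypothetical helper that only reports file names, column counts, and row counts.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_dir):
    """Map each CSV file under data_dir to (column count, data row count)."""
    summary = {}
    for path in sorted(Path(data_dir).rglob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # first row is assumed to be a header
            rows = sum(1 for _ in reader)
        summary[str(path.relative_to(data_dir))] = (len(header), rows)
    return summary


if __name__ == "__main__":
    # Default location mentioned above; adjust if hostPath was changed.
    for name, (columns, rows) in summarize_csv_files("/mnt/kubemon-data").items():
        print(f"{name}: {columns} columns, {rows} rows")
```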

Running

Starting

To start the collection process, you can either use the CLI or run commands from Python.

Example with the CLI:

$ make cli host=10.0.1.2
Waiting for collector to be alive
Collector is alive!
>>> start test000
Starting 2 daemons and saving data at 10.0.1.2:/home/kubemon/output/data/test000

Example using the Python API:

>>> from kubemon.collector import CollectorClient
>>> from kubemon.settings import CLI_PORT
>>> 
>>> cc = CollectorClient('10.0.1.2', CLI_PORT)
>>> cc.start('test000')
Starting 2 daemons and saving data at 10.0.1.2:/home/kubemon/output/data/test000

Stopping

Within the CLI:

>>> stop
Stopped collector

Using the API:

...
>>> cc.stop()
Stopped collector
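
When driving a collection from a script, it can be convenient to wrap the start and stop calls so that the collector is stopped even if the script is interrupted. The sketch below relies only on the `start(name)` and `stop()` methods shown above. The helper `collect_for` and its `sleep` parameter are hypothetical, and the commented usage assumes the same host and imports as the earlier example.

```python
import time


def collect_for(client, name, seconds, sleep=time.sleep):
    """Start a collection, wait for the given duration, then always stop."""
    client.start(name)
    try:
        sleep(seconds)
    finally:
        # Runs even if the wait is interrupted (e.g. by Ctrl+C).
        client.stop()


# Usage with Kubemon installed (host and directory name as in the examples above):
#
#   from kubemon.collector import CollectorClient
#   from kubemon.settings import CLI_PORT
#
#   collect_for(CollectorClient('10.0.1.2', CLI_PORT), 'test000', 60)
```

The `sleep` parameter exists only so that the wait can be replaced in tests; in normal use the default is sufficient.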

All the CLI commands:

You can list all the implemented commands either by typing help at the CLI prompt or by calling the .help() method from the API.

All the commands:

'start': Start collecting metrics from all connected daemons in the collector.

    Args:
        - Directory name to be saving the data collected. Ex.: start test000
    
'instances': Lists all the connected monitor instances.
    
'daemons': Lists all the daemons (hosts) connected.
    
'stop': Stop all monitors if they're running.
    
'help': Lists all the available commands.
    
'alive': Tells if the collector is alive.
