This temperature monitoring system has been tested on Ubuntu 18.04 LTS. Other distributions are not supported by this tool. Use it at your own risk.
The following packages need to be installed:
- lm-sensors
- mailutils
In addition, Python >= 3.6 should be available on the system. If possible, use
a virtual environment (venv or conda). Make sure pip is available as well.
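The prerequisites above can be checked before installing. The following is a minimal sketch, not part of the repository. It assumes that lm-sensors provides a `sensors` binary and mailutils a `mail` binary on the PATH; the function names are hypothetical.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this README


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def pip_ok():
    """Return True if the pip module is importable by this interpreter."""
    return importlib.util.find_spec("pip") is not None


def missing_binaries(names=("sensors", "mail"), which=shutil.which):
    """Return the binaries from `names` that are not found on the PATH.

    Assumption: `sensors` comes from lm-sensors and `mail` from mailutils.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    print("Python >= 3.6:", python_ok())
    print("pip available:", pip_ok())
    print("Missing binaries:", missing_binaries() or "none")
```

Run it inside the virtual environment you plan to use, so that the pip check applies to the right interpreter.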
Clone this repository:
git clone https://github.com/pdessauw/temp-ms.git
The script install.sh installs the monitor in the desired location. By default,
this location is set to:
INSTALL_DIR="/usr/local/bin/monitor"
Once the location is correctly configured, run:
sudo ./install.sh
pip install -r requirements.txt
The file settings.py contains the configuration keys needed by the app. If you do not
have an NVIDIA graphics card, change the COMMANDS key as follows:
COMMANDS = [
    SensorsCommand,
    # NvidiaSmiCommand
]
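If you are unsure whether the NVIDIA command applies to your machine, you can check whether `nvidia-smi` is on the PATH. The sketch below is only an illustration: it works with the class names as strings, and the helper `available_command_names` is hypothetical and not part of settings.py.

```python
import shutil


def available_command_names(which=shutil.which):
    """Suggest which entries to keep in COMMANDS on this machine.

    Assumption: NvidiaSmiCommand relies on the `nvidia-smi` binary.
    """
    names = ["SensorsCommand"]
    if which("nvidia-smi") is not None:
        names.append("NvidiaSmiCommand")
    return names


if __name__ == "__main__":
    print(available_command_names())
```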
Set the recipient email address with the ALERTS['recipient'] key. The formatting of the email
can also be changed by editing the files in the mail/ folder.
There are 3 main scripts in this repository:
- report.py, which logs the values gathered by lm-sensors,
- alerting.py, which notifies the admin by mail if values are over their baseline,
- baseline.py, which configures the baseline for every metric.
To log an entry, run:
python ./report.py
This appends to the existing log files or creates new ones if needed. Set up a crontab
entry to log the temperature reports automatically.
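A crontab entry for periodic logging could be generated as sketched below. The interpreter path, the 5-minute interval, and the location of report.py under the default INSTALL_DIR are assumptions; adapt them to your installation. The helper `cron_line` is hypothetical.

```python
def cron_line(interval_minutes, python_bin, script_path):
    """Build a crontab line running `script_path` every `interval_minutes` minutes."""
    if not 1 <= interval_minutes <= 59:
        raise ValueError("interval_minutes must be between 1 and 59")
    # Five schedule fields: minute, hour, day of month, month, day of week.
    return f"*/{interval_minutes} * * * * {python_bin} {script_path}"


if __name__ == "__main__":
    # Assumed paths: system Python 3 and report.py under the default INSTALL_DIR.
    print(cron_line(5, "/usr/bin/python3", "/usr/local/bin/monitor/report.py"))
```

Paste the printed line into the editor opened by `crontab -e`. If you use a virtual environment, point `python_bin` at that environment's interpreter.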
Alerting takes two optional parameters:
- --time, the number of minutes to look back for reporting; it defaults to the ALERT_TIMEFRAME_MINUTES setting,
- --all, which defines whether the mail should also include values in the normal range.
Here are some examples to launch the alerting script:
# Send a mail summarizing the last 60 minutes with all metrics
python ./alerting.py --time 60 --all
# Send a mail summarizing the last 15 minutes with only the metrics over their limit
python ./alerting.py --time 15
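For readers who want to see how two such options behave, here is a minimal argparse sketch. It is not the code of alerting.py; the value 30 for ALERT_TIMEFRAME_MINUTES is a placeholder, since the real default is defined in settings.py.

```python
import argparse

# Placeholder value: the actual default is defined in settings.py.
ALERT_TIMEFRAME_MINUTES = 30


def build_parser(default_minutes=ALERT_TIMEFRAME_MINUTES):
    """Build a parser accepting the two optional alerting parameters."""
    parser = argparse.ArgumentParser(description="Temperature alerting (sketch)")
    parser.add_argument(
        "--time",
        type=int,
        default=default_minutes,
        help="number of minutes to look back for reporting",
    )
    parser.add_argument(
        "--all",
        action="store_true",
        help="also include values in the normal range",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"timeframe={args.time} minutes, include_all={args.all}")
```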
The baseline script defines baselines for the metrics logged on the system. To
initialize the baselines from the log files, run:
python ./baseline.py --init
These baseline metrics are used by the alerting system to test whether metrics are too high.
All the collected baselines can be listed with:
python ./baseline.py --list
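The baseline test can be pictured as a simple comparison. The sketch below assumes that readings and baselines are plain dictionaries mapping a metric name to a temperature; the repository may store them differently, and the metric names and values are made up for illustration.

```python
def over_baseline(readings, baseline):
    """Return the readings that exceed their baseline value.

    Metrics without a baseline entry are ignored.
    """
    return {
        name: value
        for name, value in readings.items()
        if name in baseline and value > baseline[name]
    }


if __name__ == "__main__":
    # Made-up sample data, in degrees Celsius.
    baseline = {"core_0": 70.0, "core_1": 70.0}
    readings = {"core_0": 82.5, "core_1": 55.0, "fan_1": 1200.0}
    print(over_baseline(readings, baseline))
```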
Some baseline metrics might not be well defined. They can be tweaked using:
python ./baseline.py --set baseline_1.name_1=value_1 ... --set baseline_n.name_n=value_n
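Each --set argument follows the pattern baseline.name=value. As an illustration of that format only, the sketch below splits such an argument into its three parts. It assumes numeric values and a key split at the first dot; `parse_assignment` and `apply_assignments` are hypothetical helpers, not functions of baseline.py.

```python
def parse_assignment(text):
    """Split 'baseline.name=value' into (baseline, name, float value)."""
    key, sep, raw_value = text.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in {text!r}")
    group, dot, name = key.partition(".")
    if not dot or not group or not name:
        raise ValueError(f"expected 'baseline.name' before '=' in {text!r}")
    # Assumption: baseline values are numeric.
    return group, name, float(raw_value)


def apply_assignments(baselines, assignments):
    """Return a copy of the nested dict `baselines` with the assignments applied."""
    updated = {group: dict(values) for group, values in baselines.items()}
    for text in assignments:
        group, name, value = parse_assignment(text)
        updated.setdefault(group, {})[name] = value
    return updated


if __name__ == "__main__":
    # Made-up baseline names and values.
    current = {"coretemp": {"core_0": 70.0}}
    print(apply_assignments(current, ["coretemp.core_0=65.5", "gpu.temp=80"]))
```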