Fix memory leak in chan::tick; cf. BurntSushi/chan#11 and branch mem_leak_chan_tick.
This is known behavior of chan::tick. The leak amounted to 236 KB of allocated memory over 64 hours. That does not hurt much, especially because we restart the process every 24 hours for log rotation. Skipping this for now.
Transform all wsrep values to metrics
Add metadata for all wsrep values
Check additional state for metrics
Handle tags
Automate packaging for Ubuntu
- Ansible Role
- Update Readme: Link to package and Ansible role
Support TLS and basic auth for Bosun
Add auth and transport encryption to Galera collector
Add auth and transport encryption to Mongo collector
lib version check -- reduce multiple versions of dependent crates
Release 0.1
Move to serde; cf. Galera collector
Redo collectors as real state machine
[+] Failure Modes
- Reinitialize collector if collection fails.
- Reconnect Logic for Galera Collector
- Remove collector if too many collection failures.
- Remove collector if collection thread does not respond anymore.
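The failure modes above could be driven by a small per-collector policy. The following is only a sketch under assumptions: `Action`, `FailurePolicy`, and the `max_failures` threshold are hypothetical names that do not exist in the code base, and detecting an unresponsive collection thread (for example via a timeout) is not covered.

```rust
/// Hypothetical action a supervisor takes after one collection attempt.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Action {
    /// Collection succeeded; keep the collector as is.
    Continue,
    /// Collection failed; reinitialize (e.g. reconnect) the collector.
    Reinitialize,
    /// Too many consecutive failures; remove the collector.
    Remove,
}

/// Tracks consecutive collection failures for a single collector.
#[derive(Debug)]
pub struct FailurePolicy {
    max_failures: u32, // assumed threshold, would come from the config
    consecutive_failures: u32,
}

impl FailurePolicy {
    pub fn new(max_failures: u32) -> Self {
        FailurePolicy { max_failures, consecutive_failures: 0 }
    }

    /// Record the outcome of one collection run and decide what to do next.
    pub fn record<T, E>(&mut self, outcome: &Result<T, E>) -> Action {
        match outcome {
            Ok(_) => {
                // Any success resets the failure counter.
                self.consecutive_failures = 0;
                Action::Continue
            }
            Err(_) => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.max_failures {
                    Action::Remove
                } else {
                    Action::Reinitialize
                }
            }
        }
    }
}
```

Counting consecutive failures rather than total failures means a collector that recovers after a reconnect is not removed because of old errors.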
Add timestamps to log messages
Clean up
Make it safe
- Clippy-fy
- Fix Todos
- Eliminate unwraps
Rust documentation
Enhance deb package
- Don't overwrite changed config files
Move project to Rheinwerk
Extend bosun_emitter to send multiple data points
Support multiple Galera Collectors -- also change in Ansible role
Make threads resilient against panics (current workaround: abort on panic so that no thread dies unknowingly)
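One possible way to replace the abort-on-panic workaround is to let a supervisor observe the result of `JoinHandle::join`, which returns `Err` when the thread panicked. The sketch below assumes the binary is built with unwinding panics (not `panic = "abort"`); `supervise` and `max_restarts` are hypothetical names, not part of the current code.

```rust
use std::thread;

/// Run `work` on its own thread and restart it if it panics.
/// Returns the number of restarts that were needed, or an error
/// once more than `max_restarts` restarts would be required.
pub fn supervise<F>(work: F, max_restarts: u32) -> Result<u32, String>
where
    F: Fn() + Send + Clone + 'static,
{
    let mut restarts = 0;
    loop {
        let job = work.clone();
        // join() yields Err if the spawned thread panicked; the panic
        // stays contained in that thread and does not kill the supervisor.
        match thread::spawn(move || job()).join() {
            Ok(()) => return Ok(restarts),
            Err(_) if restarts < max_restarts => restarts += 1,
            Err(_) => return Err(format!("gave up after {} restarts", restarts)),
        }
    }
}
```

Because the supervisor blocks in `join`, each supervised collector would need its own supervisor thread; that trade-off is part of the assumption here.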
- Check for IP bound to interface -- keepalived VIP side effect
- [+] Postfix metrics
- Queue len
- Send / Recv stats
- [+] MongoDB
- [+] replication metrics -- cf. replSetGetStatus
- myState (A)
- [+] Oplog replication lag (A)
- Explain lag spikes due to idle times -- cf. Mongo documentation
- Show alert example
- Heartbeat latency = lastHeartbeatRecv - lastHeartbeat (A)
- roundtrip time = pingMs
- uptime = uptime -> Rate
- health = health only from point of view of primary (A)
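The derived replication metrics above can be sketched as plain arithmetic on fields of one `members` entry from replSetGetStatus. This is a minimal sketch, not the collector's implementation: `MemberStatus` and both functions are hypothetical, the dates are assumed to be already converted to milliseconds since the Unix epoch, and computing the oplog lag as the difference of the `optimeDate` values of primary and secondary is an assumption.

```rust
/// Hypothetical subset of one `members` entry from replSetGetStatus.
/// All dates are assumed to be milliseconds since the Unix epoch.
pub struct MemberStatus {
    /// `optimeDate`: time of the last oplog entry applied by this member.
    pub optime_date_ms: i64,
    /// `lastHeartbeat`
    pub last_heartbeat_ms: i64,
    /// `lastHeartbeatRecv`
    pub last_heartbeat_recv_ms: i64,
}

/// Heartbeat latency = lastHeartbeatRecv - lastHeartbeat.
pub fn heartbeat_latency_ms(member: &MemberStatus) -> i64 {
    member.last_heartbeat_recv_ms - member.last_heartbeat_ms
}

/// Oplog replication lag of a secondary relative to the primary.
/// The value is not clamped, so it can be negative or spike after idle times.
pub fn oplog_lag_ms(primary: &MemberStatus, secondary: &MemberStatus) -> i64 {
    primary.optime_date_ms - secondary.optime_date_ms
}
```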
- Balancer Status
- other metrics?
- Internal metrics
- Version -- can also be used to check liveness and as heartbeat
- Number of transmitted samples -- can also be used to check liveness and as heartbeat
- RSS cf. procinfo -- can also be used to check liveness and as heartbeat
- Docker
- Use rust-docker
- ifconfig / network interface frame metrics
- Serial numbers of all authoritative servers
- Ceph metrics
- MySQL performance metrics
- MongoDB performance metrics
- Tomcat management servlet metrics
- LACP / interface bond metrics