This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

Plugin loads successfully, but no metrics available #32

Closed
dgarth opened this issue Nov 17, 2016 · 7 comments

@dgarth
dgarth commented Nov 17, 2016

Environment:

  • Snap version: v0.16.1-beta
  • OS: CentOS 7
  • Kernel: 3.10.0-327.36.3.el7.x86_64
  • Perf: 3.10.0-327.36.3.el7

Output of perf list:

$ perf list

List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  ref-cycles                                         [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

  alignment-faults                                   [Software event]
  context-switches OR cs                             [Software event]
  cpu-clock                                          [Software event]
  cpu-migrations OR migrations                       [Software event]
  dummy                                              [Software event]
  emulation-faults                                   [Software event]
  major-faults                                       [Software event]
  minor-faults                                       [Software event]
  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]

  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-prefetch-misses                          [Hardware cache event]
  L1-dcache-store-misses                             [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-prefetches                                     [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]

  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  instructions OR cpu/instructions/                  [Kernel PMU event]
  mem-loads OR cpu/mem-loads/                        [Kernel PMU event]
  mem-stores OR cpu/mem-stores/                      [Kernel PMU event]
  power/energy-cores/                                [Kernel PMU event]
  power/energy-gpu/                                  [Kernel PMU event]
  power/energy-pkg/                                  [Kernel PMU event]
  ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
  stalled-cycles-backend OR cpu/stalled-cycles-backend/ [Kernel PMU event]
  stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]
  uncore_cbox_0/clockticks/                          [Kernel PMU event]
  uncore_cbox_1/clockticks/                          [Kernel PMU event]
  uncore_cbox_2/clockticks/                          [Kernel PMU event]
  uncore_cbox_3/clockticks/                          [Kernel PMU event]
  uncore_imc/data_reads/                             [Kernel PMU event]
  uncore_imc/data_writes/                            [Kernel PMU event]

  rNNN                                               [Raw hardware event descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
   (see 'man perf-list' on how to encode it)

  mem:<addr>[/len][:access]                          [Hardware breakpoint]

I started snap via:

# export SNAP_PATH=/opt/snap/snap-v0.16.1-beta
# snapd

And then in another session:

$ snapctl plugin load /opt/snap/snap-v0.16.1-beta/plugin/snap-plugin-collector-perfevents
snap-plugin-collector-perfevents
Plugin loaded
Name: perfevents
Version: 8
Type: collector
Signed: false
Loaded Time: Thu, 17 Nov 2016 15:18:12 CET

But after this, snapctl metric list shows no metrics related to this plugin:

$ snapctl metric list | grep perf
$
@andrzej-k
Contributor

Hi @dgarth, this plugin collects only per-cgroup metrics. Do you have any cgroups on the system? If not, then for testing you can start a Docker container, reload the perfevents plugin, and check whether the metrics become available.
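
One way to check whether any cgroups exist is to list the directories under the perf_event hierarchy. This is only a minimal sketch: the default path /sys/fs/cgroup/perf_event is an assumption about a cgroup v1 layout such as the one on CentOS 7, and list_perf_cgroups is a hypothetical helper name, not something provided by Snap or the plugin.

```shell
# list_perf_cgroups is a hypothetical helper, not part of Snap or the plugin.
# It prints every cgroup (subdirectory) below a perf_event hierarchy root.
list_perf_cgroups() {
  # Assumption: cgroup v1 with the perf_event controller mounted here.
  root="${1:-/sys/fs/cgroup/perf_event}"
  [ -d "$root" ] || return 0   # no hierarchy, so nothing to list
  # Print paths relative to the root, one cgroup per line.
  (cd "$root" && find . -mindepth 1 -type d) | sed 's|^\./||' | LC_ALL=C sort
}

# Usage: list_perf_cgroups                    (default root)
#        list_perf_cgroups /some/other/root   (explicit root)
```

If this prints nothing, no cgroups exist below that root, which would be consistent with an empty snapctl metric list.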

@dgarth
Author

dgarth commented Jan 3, 2017

Hi @andrzej-k,
thanks for your answer. I have now started a VM (in my use case the plugin is supposed to monitor OpenStack nodes) so that a cgroup exists before the plugin is loaded. It seems to find metrics now, but it errors on load:

$ snapctl plugin load /opt/snap/snap-v0.16.1-beta/plugin/snap-plugin-collector-perfevents
Error loading plugin:
Metric namespace /intel/linux/perfevents/cgroup/cycles/machine_slice:machine-qemu\x2d1\x2dinstance\x2d00000001_scope contains not allowed characters. Avoid using  brackets [( ) [ ] { }], spaces [ ], punctuations [. , ; ? !], slashes [| \ /], carets [^], quotations [" ` ']

It seems that there is a problem in conjunction with the libvirt naming scheme for cgroups.
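
To narrow down which characters trip the check, the last namespace element can be filtered against the character set quoted in the error message. This is a rough sketch: disallowed_chars is a hypothetical helper, and its character set is copied from the message text, not from Snap's source. The \x2d sequences are systemd's escaping of "-" in unit names, so the backslash is the likely culprit.

```shell
# disallowed_chars is a hypothetical helper. Its character set is copied from
# the error message above, not from Snap's source code.
disallowed_chars() {
  # Print each distinct character of $1 that belongs to the disallowed set.
  printf '%s' "$1" | grep -o '[][(){} .,;?!|\/^"`'"'"']' | LC_ALL=C sort -u
}

# The cgroup element from the failing namespace; prints a single backslash.
disallowed_chars 'machine_slice:machine-qemu\x2d1\x2dinstance\x2d00000001_scope'
```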

@andrzej-k
Contributor

@dgarth I'm assigning @lmroz to investigate and fix this bug. We'll try to keep you informed about the progress.

@IzabellaRaulin
Contributor

IzabellaRaulin commented Jan 16, 2017

Hello @dgarth, I checked what happens, and this error message:

Metric namespace /intel/linux/perfevents/cgroup/cycles/machine_slice:machine-qemu\x2d1\x2dinstance\x2d00000001_scope contains not allowed characters. Avoid using  brackets [( ) [ ] { }], spaces [ ], punctuations [. , ; ? !], slashes [| \ /], carets [^], quotations [" ` ']

that you received comes from Snap's framework (see here; this error message was defined on line 78). However, this check was removed in snap/commit/e182c1cf, and you should no longer receive this error with Snap version 0.17 or higher.

@dgarth, there have been many changes since v0.16.1-beta. I highly recommend using the latest release, Snap 1.0, in which this issue does not occur.

In addition, I prepared pull request #39, which removes a fragment of code that is no longer needed and brings this plugin more in line with the latest Snap. Your case has also been added to the medium tests to make sure that it works as expected. The pull request has already been merged, so you can simply download the latest perfevents collector (version 9).

@dgarth
Author

dgarth commented Jan 16, 2017

Thanks @IzabellaRaulin, I can confirm that this bug does not occur in Snap 1.0.
The first problem persists though: I need to create the cgroup before even loading the plugin to get the metrics.

@IzabellaRaulin
Contributor

@dgarth, yes, this plugin discovers cgroups while it is loading and exposes the appropriate metrics based on what it finds. That is how it currently works, which is why you need to create cgroups before loading the plugin.

I could not agree more that perfevents metrics should be exposed as dynamic metrics, which means there is no cgroup discovery at load time and the metric namespaces contain a wildcard for the varying element. That way, whenever a cgroup appears, its metrics become available and can be collected.

I do not know how familiar you are with the concept of dynamic metrics in Snap, but https://github.com/intelsdi-x/snap-plugin-collector-docker is a good example: the Docker container ID is defined as a dynamic element, which allows monitoring all Docker containers on the system, not only those that existed before the plugin was loaded.
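
As a rough illustration of the idea (not the plugin's or Snap's actual implementation), a wildcard namespace could be expanded at collection time against whichever cgroups exist at that moment. Everything here is an assumption for the sketch: expand_dynamic is a hypothetical helper, the default root assumes a cgroup v1 layout, and the "." to "_" and "/" to ":" mapping is inferred from the namespace shown in the error message earlier in this thread.

```shell
# expand_dynamic is a hypothetical helper, not Snap's API. It replaces the
# trailing "*" of a namespace with each cgroup that exists right now.
expand_dynamic() {
  pattern="$1"                              # e.g. /intel/linux/perfevents/cgroup/cycles/*
  root="${2:-/sys/fs/cgroup/perf_event}"    # assumed cgroup v1 mount point
  [ -d "$root" ] || return 0                # no hierarchy, so no metrics
  (cd "$root" && find . -mindepth 1 -type d) |
    sed 's|^\./||' |                        # make paths relative to the root
    tr './' '_:' |                          # inferred mapping: "." -> "_", "/" -> ":"
    LC_ALL=C sort |
    while IFS= read -r id; do
      printf '%s\n' "${pattern%\*}$id"      # substitute the wildcard element
    done
}

# Usage: expand_dynamic '/intel/linux/perfevents/cgroup/cycles/*'
```

Because the expansion happens on every call, a cgroup created after the plugin was loaded would show up in the next call without a reload.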

This enhancement is tracked in issue #38. I will add your comment there as well to highlight how important dynamic metrics are for this plugin.

@dgarth, are you OK with closing this issue, given that adding dynamic metrics is tracked in #38?

@dgarth
Author

dgarth commented Jan 16, 2017

Thanks.
Yes, I know dynamic metrics from e.g. the libvirt plugin.
I will close this issue and subscribe to #38.

@dgarth closed this as completed on Jan 16, 2017