This repository has been archived by the owner on Dec 15, 2020. It is now read-only.

Very long run issues #79

Open
sphaero opened this issue Jun 14, 2015 · 0 comments

sphaero commented Jun 14, 2015

I have no idea how this relates to any existing issues in Pyre or ZOCP, but at least it shows where we might add some conditionals. This is from running 5 threads (actors/nodes) on an Ubuntu Linux machine:

Peer None isn't ready
Thread4: fps: 0.05272019373681729
Peer None isn't ready
Thread3: fps: 0.6297142004372849
Thread2: fps: 0.04867682388059979
Peer None isn't ready
Peer None isn't ready
Peer None isn't ready
Peer None isn't ready
Peer <pyre.pyre_peer.PyrePeer object at 0x7f660a940940> isn't ready
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/people/arnaud/src/pyre/pyre/zactor.py", line 57, in run
    self.shim_handler(*self.shim_args, **self.shim_kwargs)
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 52, in __init__
    self.run()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 503, in run
    self.recv_peer()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 359, in recv_peer
    zmsg.recv(self.inbox)
  File "/home/people/arnaud/src/pyre/pyre/zre_msg.py", line 74, in recv
    self.address = uuid.UUID(bytes=self.address[1:])
  File "/usr/lib/python3.4/uuid.py", line 148, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string

Exception in thread Thread-8:
Traceback (most recent call last):
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/people/arnaud/src/pyre/pyre/zactor.py", line 57, in run
    self.shim_handler(*self.shim_args, **self.shim_kwargs)
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 52, in __init__
    self.run()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 503, in run
    self.recv_peer()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 359, in recv_peer
    zmsg.recv(self.inbox)
  File "/home/people/arnaud/src/pyre/pyre/zre_msg.py", line 74, in recv
    self.address = uuid.UUID(bytes=self.address[1:])
  File "/usr/lib/python3.4/uuid.py", line 148, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/people/arnaud/src/pyre/pyre/zactor.py", line 57, in run
    self.shim_handler(*self.shim_args, **self.shim_kwargs)
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 52, in __init__
    self.run()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 503, in run
    self.recv_peer()
  File "/home/people/arnaud/src/pyre/pyre/pyre_node.py", line 359, in recv_peer
    zmsg.recv(self.inbox)
  File "/home/people/arnaud/src/pyre/pyre/zre_msg.py", line 74, in recv
    self.address = uuid.UUID(bytes=self.address[1:])
  File "/usr/lib/python3.4/uuid.py", line 148, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string
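
All three tracebacks end at `uuid.UUID(bytes=self.address[1:])` in `zre_msg.py`, which raises as soon as the sliced identity is not exactly 16 bytes long. One possible conditional is sketched below. `parse_peer_identity` and `IDENTITY_FRAME_LEN` are hypothetical names, not part of Pyre, and the 17-byte frame length (one prefix byte plus the 16-byte UUID) is an assumption inferred from the `[1:]` slice in the traceback.

```python
import uuid

# Assumption: the identity frame is one prefix byte followed by a
# 16-byte UUID, as the `self.address[1:]` slice in zre_msg.py suggests.
IDENTITY_FRAME_LEN = 17


def parse_peer_identity(frame):
    """Return the peer UUID from an identity frame, or None if malformed.

    Hypothetical helper: checks the length first so that a short or
    oversized frame does not raise ValueError inside the receiving thread.
    """
    if frame is None or len(frame) != IDENTITY_FRAME_LEN:
        return None
    # Skip the prefix byte; the remaining 16 bytes are the UUID.
    return uuid.UUID(bytes=bytes(frame[1:]))


if __name__ == "__main__":
    good = b"\x01" + uuid.uuid4().bytes
    print(parse_peer_identity(good))      # a UUID object
    print(parse_peer_identity(b"\x01"))   # None, frame too short
```

With a guard like this the caller could log and drop the message when `None` comes back, instead of letting the exception end the actor thread.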

I don't know what happened, but at some point (either before or after the error messages) it starts leaking memory:

[112950.723732] python3 invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[112950.723740] python3 cpuset=session-c2.scope mems_allowed=0
[112950.723750] CPU: 4 PID: 29085 Comm: python3 Tainted: P           OE  3.19.0-18-generic #18-Ubuntu
[112950.723754] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890GM Pro3 R2.0, BIOS P1.50 10/04/2011
[112950.723757]  0000000000000000 ffff8802df957898 ffffffff817c27cd 0000000000000007
[112950.723764]  ffff880408a26bf0 ffff8802df957918 ffffffff817c0543 0000000000000000
[112950.723770]  0000000000000000 0000000000000000 0000000000000000 ffff880409d6bae0
[112950.723776] Call Trace:
[112950.723787]  [<ffffffff817c27cd>] dump_stack+0x45/0x57
[112950.723794]  [<ffffffff817c0543>] dump_header+0x7f/0x1e7
[112950.723803]  [<ffffffff8117d07b>] oom_kill_process+0x22b/0x390
[112950.723811]  [<ffffffff8107ea9e>] ? has_capability_noaudit+0x1e/0x30
[112950.723817]  [<ffffffff8117d5ed>] out_of_memory+0x24d/0x500
[112950.723822]  [<ffffffff8118353a>] __alloc_pages_nodemask+0xaba/0xba0
[112950.723830]  [<ffffffff811c9c61>] alloc_pages_current+0x91/0x110
[112950.723835]  [<ffffffff81179597>] __page_cache_alloc+0xa7/0xd0
[112950.723840]  [<ffffffff8117bcaf>] filemap_fault+0x1af/0x400
[112950.723845]  [<ffffffff811a683d>] __do_fault+0x3d/0xc0
[112950.723850]  [<ffffffff811a90ef>] do_read_fault.isra.55+0x1df/0x2f0
[112950.723856]  [<ffffffff811aafbe>] handle_mm_fault+0x86e/0xff0
[112950.723861]  [<ffffffff810ef518>] ? get_futex_key+0x238/0x2b0
[112950.723867]  [<ffffffff810ef7d1>] ? futex_wake+0x71/0x140
[112950.723872]  [<ffffffff81062bdd>] __do_page_fault+0x1dd/0x5b0
[112950.723877]  [<ffffffff810f2537>] ? do_futex+0x107/0x5d0
[112950.723883]  [<ffffffff81062fe1>] do_page_fault+0x31/0x70
[112950.723888]  [<ffffffff817cba68>] page_fault+0x28/0x30
[112950.723892] Mem-Info:
[112950.723894] Node 0 DMA per-cpu:
[112950.723898] CPU    0: hi:    0, btch:   1 usd:   0
[112950.723901] CPU    1: hi:    0, btch:   1 usd:   0
[112950.723903] CPU    2: hi:    0, btch:   1 usd:   0
[112950.723905] CPU    3: hi:    0, btch:   1 usd:   0
[112950.723907] CPU    4: hi:    0, btch:   1 usd:   0
[112950.723910] CPU    5: hi:    0, btch:   1 usd:   0
[112950.723912] Node 0 DMA32 per-cpu:
[112950.723915] CPU    0: hi:  186, btch:  31 usd:  46
[112950.723918] CPU    1: hi:  186, btch:  31 usd: 171
[112950.723920] CPU    2: hi:  186, btch:  31 usd:  29
[112950.723923] CPU    3: hi:  186, btch:  31 usd:  24
[112950.723925] CPU    4: hi:  186, btch:  31 usd:  30
[112950.723927] CPU    5: hi:  186, btch:  31 usd:  40
[112950.723929] Node 0 Normal per-cpu:
[112950.723932] CPU    0: hi:  186, btch:  31 usd:   0
[112950.723935] CPU    1: hi:  186, btch:  31 usd: 175
[112950.723937] CPU    2: hi:  186, btch:  31 usd:  43
[112950.723939] CPU    3: hi:  186, btch:  31 usd:  48
[112950.723941] CPU    4: hi:  186, btch:  31 usd:  31
[112950.723944] CPU    5: hi:  186, btch:  31 usd:  44
[112950.723950] active_anon:3574586 inactive_anon:432027 isolated_anon:0
 active_file:203 inactive_file:3054 isolated_file:32
 unevictable:8 dirty:0 writeback:73 unstable:0
 free:33269 slab_reclaimable:7542 slab_unreclaimable:8546
 mapped:4716 shmem:164 pagetables:21254 bounce:0
 free_cma:0
[112950.723957] Node 0 DMA free:15900kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[112950.723964] lowmem_reserve[]: 0 3482 16008 16008
[112950.723969] Node 0 DMA32 free:64588kB min:14684kB low:18352kB high:22024kB active_anon:2870400kB inactive_anon:583044kB active_file:44kB inactive_file:2708kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3647556kB managed:3567688kB mlocked:0kB dirty:0kB writeback:0kB mapped:2472kB shmem:0kB slab_reclaimable:5464kB slab_unreclaimable:5612kB kernel_stack:1328kB pagetables:21944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:217172 all_unreclaimable? yes
[112950.723977] lowmem_reserve[]: 0 0 12526 12526
[112950.723982] Node 0 Normal free:52588kB min:52832kB low:66040kB high:79248kB active_anon:11428140kB inactive_anon:1144808kB active_file:768kB inactive_file:9508kB unevictable:32kB isolated(anon):0kB isolated(file):128kB present:13090812kB managed:12827288kB mlocked:32kB dirty:0kB writeback:292kB mapped:16392kB shmem:656kB slab_reclaimable:24704kB slab_unreclaimable:28572kB kernel_stack:5648kB pagetables:63072kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:288820 all_unreclaimable? yes
[112950.723990] lowmem_reserve[]: 0 0 0 0
[112950.723995] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15900kB
[112950.724064] Node 0 DMA32: 160*4kB (UEMR) 124*8kB (UEMR) 145*16kB (UEMR) 123*32kB (UEMR) 64*64kB (EMR) 44*128kB (UEMR) 27*256kB (ER) 14*512kB (UER) 4*1024kB (EM) 6*2048kB (UMR) 4*4096kB (M) = 64464kB
[112950.724091] Node 0 Normal: 381*4kB (E) 296*8kB (UE) 410*16kB (UE) 355*32kB (UEM) 160*64kB (UEM) 71*128kB (UE) 28*256kB (EM) 6*512kB (UEM) 1*1024kB (M) 0*2048kB 0*4096kB = 52404kB
[112950.724118] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[112950.724121] 3752 total pagecache pages
[112950.724124] 93 pages in swap cache
[112950.724127] Swap cache stats: add 4319487, delete 4319394, find 67007/113625
[112950.724130] Free swap  = 0kB
[112950.724132] Total swap = 16752636kB
[112950.724135] 4188588 pages RAM
[112950.724137] 0 pages HighMem/MovableOnly
[112950.724139] 85869 pages reserved
[112950.724141] 0 pages cma reserved
[112950.724143] 0 pages hwpoisoned
[112950.724146] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[112950.724155] [  279]     0   279     8312       15      20       82             0 systemd-journal
[112950.724161] [  291]     0   291     8913        1      17      213         -1000 systemd-udevd
[112950.724167] [  487]     0   487     8259       16      22       76             0 rpcbind
[112950.724173] [  621]     0   621   107886      175      95      474             0 NetworkManager
[112950.724178] [  624]   104   624    80551        0      56      414             0 rsyslogd
[112950.724181] [  632]     0   632     3828        0      11       39             0 cgmanager
[112950.724185] [  649]     0   649    25254       18      52      246             0 systemd-logind
[112950.724189] [  659]     0   659     5650       19      17       43             0 cron
[112950.724192] [  664]   107   664    24663        0      51      268             0 avahi-daemon
[112950.724196] [  666]     0   666    85830       19      68      367             0 accounts-daemon
[112950.724200] [  671]     0   671    84104        0      66      830             0 ModemManager
[112950.724203] [  674]     0   674     4860       24      14       32             0 irqbalance
[112950.724207] [  675]   110   675   109697       40      77      362             0 whoopsie
[112950.724211] [  684]   105   684    27628        1      58      466          -900 dbus-daemon
[112950.724215] [  700]   107   700    24621        0      48      235             0 avahi-daemon
[112950.724218] [  727]     0   727    85256        1      70      702             0 polkitd
[112950.724222] [  728]   117   728    74518        1      48     1100             0 colord
[112950.724226] [  729]     0   729    37446        0      41      756             0 cups-browsed
[112950.724229] [  787]     0   787    13865        1      31      166         -1000 sshd
[112950.724233] [  800]     0   800   102857        5      68      362             0 lightdm
[112950.724237] [  810]     0   810    89560     4073     174    30504             0 Xorg
[112950.724241] [  813]     0   813     5866        0      16     1723             0 dhclient
[112950.724244] [  816] 65534   816     7440       18      18       45             0 dnsmasq
[112950.724248] [ 1031]     0  1031     1099        1       8       43             0 acpid
[112950.724252] [ 1045]     0  1045    36701        1      74      321             0 login
[112950.724256] [ 1132]   114  1132    40656        7      17       45             0 rtkit-daemon
[112950.724259] [ 1162]     0  1162    57006        1      79      380             0 lightdm
[112950.724263] [ 1183]   111  1183     8848       25      21       55             0 kerneloops
[112950.724267] [ 1194] 10002  1194    28788        6      59      330             0 systemd
[112950.724270] [ 1195] 10002  1195    35906        0      68      599             0 (sd-pam)
[112950.724274] [ 1201] 10002  1201    26304       23      55      239             0 i3
[112950.724278] [ 1255] 10002  1255     2687        8       8       74             0 ssh-agent
[112950.724281] [ 1258] 10002  1258    24805        0      48      266             0 dbus-launch
[112950.724285] [ 1259] 10002  1259    27542       93      53      279             0 dbus-daemon
[112950.724288] [ 1276] 10002  1276   106664     3645      72      938             0 ibus-daemon
[112950.724292] [ 1279] 10002  1279    48310        1      30      165             0 gvfsd
[112950.724295] [ 1283] 10002  1283    67422        0      31      184             0 gvfsd-fuse
[112950.724299] [ 1287] 10002  1287    66557        0      32      183             0 ibus-dconf
[112950.724303] [ 1288] 10002  1288   110584        1     103     3075             0 ibus-ui-gtk3
[112950.724306] [ 1291] 10002  1291    68335        1      68      526             0 ibus-x11
[112950.724310] [ 1296] 10002  1296     1117        0       7       22             0 sh
[112950.724314] [ 1299] 10002  1299    98369        1      58      577             0 dunst
[112950.724317] [ 1300] 10002  1300     1117        0       6       23             0 sh
[112950.724321] [ 1301] 10002  1301    66008        1      30      155             0 at-spi-bus-laun
[112950.724325] [ 1303] 10002  1303   119875      167     126     1085             0 nm-applet
[112950.724328] [ 1310] 10002  1310     1118        0       7       24             0 sh
[112950.724332] [ 1312] 10002  1312    20996       25      44      327             0 i3bar
[112950.724335] [ 1316] 10002  1316    27436        1      58      265             0 dbus-daemon
[112950.724339] [ 1317] 10002  1317     1118        0       7       25             0 sh
[112950.724343] [ 1318] 10002  1318     7023       35      20       39             0 i3status
[112950.724346] [ 1327] 10002  1327    24867       24      49      274             0 screen
[112950.724350] [ 1328] 10002  1328     3131       43      11       29             0 syncdir.sh
[112950.724354] [ 1332] 10002  1332    30824        1      29      166             0 at-spi2-registr
[112950.724358] [ 1339] 10002  1339    47624        0      30      230             0 ibus-engine-sim
[112950.724361] [ 1344] 10002  1344    24867       24      47      274             0 screen
[112950.724365] [ 1345] 10002  1345     3132       45      11       29             0 syncdir.sh
[112950.724368] [ 1352] 10002  1352    29334       29      60      284             0 gconfd-2
[112950.724372] [ 1386]   109  1386    25476       28      51      303             0 ntpd
[112950.724376] [ 1395] 10002  1395     3111        0      12       53             0 bash
[112950.724379] [ 1397] 10002  1397   255037     1076     285     4639             0 pidgin
[112950.724383] [ 1501] 10002  1501     3111        0      12       52             0 bash
[112950.724387] [ 1504] 10002  1504   198279     1614     144     2255             0 geany
[112950.724391] [ 1509] 10002  1509    22878        1      48      226             0 gnome-pty-helpe
[112950.724394] [ 1510] 10002  1510    24605        3      51      961             0 bash
[112950.724398] [ 1552] 10002  1552     1118        0       7       24             0 sh
[112950.724402] [ 1553] 10002  1553   140301      645     126     1796             0 x-terminal-emul
[112950.724405] [ 1555] 10002  1555    22878        1      48      226             0 gnome-pty-helpe
[112950.724409] [ 1574] 10002  1574    24864        1      51     1217             0 bash
[112950.724413] [ 1588] 10002  1588    24869        1      52     1221             0 bash
[112950.724416] [ 1611] 10002  1611    24654        1      52     1005             0 bash
[112950.724420] [ 1626] 10002  1626    24872        1      51     1230             0 bash
[112950.724423] [ 1686] 10002  1686     3111        0      12       52             0 bash
[112950.724427] [ 1688] 10002  1688   263974     1213     151     2874             0 geany
[112950.724431] [ 1694] 10002  1694    22878        1      47      227             0 gnome-pty-helpe
[112950.724434] [ 1695] 10002  1695    24605        1      51      962             0 bash
[112950.724438] [ 2703]     0  2703    67483        0      49      320             0 upowerd
[112950.724442] [ 2878] 10002  2878   120212        1      75      796             0 pulseaudio
[112950.724445] [ 6154]     0  6154    20337        0      43      278             0 cupsd
[112950.724449] [ 7130] 10002  7130    28017        0      26      135             0 gvfsd-metadata
[112950.724452] [ 7319] 10002  7319     9393        0      22       89             0 xfconfd
[112950.724456] [ 7327] 10002  7327    87499      204      70      307             0 gvfs-udisks2-vo
[112950.724460] [ 7329]     0  7329    91010        0      44      507             0 udisksd
[112950.724463] [ 7338] 10002  7338    80545        0      46      269             0 gvfs-afc-volume
[112950.724467] [ 7343] 10002  7343    46402        0      28      654             0 gvfs-mtp-volume
[112950.724471] [ 7347] 10002  7347    49445        0      32      192             0 gvfs-gphoto2-vo
[112950.724474] [10959]     0 10959   788127        1      80      451             0 console-kit-dae
[112950.724478] [11032] 10002 11032    24574        4      51      928             0 bash
[112950.724482] [12531] 10002 12531    44623        1      24      127             0 dconf-service
[112950.724486] [20988] 10002 20988     3111        0      11       52             0 bash
[112950.724489] [20990] 10002 20990   374912    52161     582    81262             0 firefox
[112950.724494] [22613] 10002 22613   255430     1115     305    11258             0 plugin-containe
[112950.724498] [28281] 10002 28281    24729        0      51      318             0 ssh
[112950.724502] [29075] 10002 29075  8427463  3939146   15686  4003496             0 python3
[112950.724506] [31021] 10002 31021   132646     1958     154    12669             0 python2.7
[112950.724509] [31488] 10002 31488     1757      149       9        0             0 inotifywait
[112950.724513] [31489] 10002 31489     3131       47      11       25             0 syncdir.sh
[112950.724517] [31493] 10002 31493     1668       43       8        0             0 inotifywait
[112950.724521] [31494] 10002 31494     3132       47      11       27             0 syncdir.sh
[112950.724524] Out of memory: Kill process 29075 (python3) score 959 or sacrifice child
[112950.724529] Killed process 29075 (python3) total-vm:33709852kB, anon-rss:15756584kB, file-rss:0kB

It was using about 32G of memory (RAM plus swap), which is the max on this machine.
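
As a rough cross-check of that figure, the numbers in the OOM report can be converted to GiB. This is only arithmetic on values copied from the log above; the 4 kB page size is an assumption (it matches the `1*4kB` entries in the free lists), and the variable names are illustrative.

```python
# Values copied from the kernel OOM report above.
PAGE_KB = 4                    # assumption: 4 kB pages
KB_PER_GIB = 1024 * 1024

ram_pages = 4188588            # "4188588 pages RAM"
total_swap_kb = 16752636       # "Total swap = 16752636kB"
python3_rss_pages = 3939146    # rss column for pid 29075
python3_swap_pages = 4003496   # swapents column for pid 29075
python3_total_vm_kb = 33709852 # "total-vm:33709852kB"

# Machine capacity: physical RAM plus swap.
ram_kb = ram_pages * PAGE_KB
ram_plus_swap_gib = (ram_kb + total_swap_kb) / KB_PER_GIB

# What the python3 process actually held in RAM and in swap.
rss_kb = python3_rss_pages * PAGE_KB
python3_resident_plus_swapped_gib = (
    (python3_rss_pages + python3_swap_pages) * PAGE_KB / KB_PER_GIB
)

print(f"RAM + swap:               {ram_plus_swap_gib:.1f} GiB")
print(f"python3 rss + swapped:    {python3_resident_plus_swapped_gib:.1f} GiB")
print(f"python3 total-vm:         {python3_total_vm_kb / KB_PER_GIB:.1f} GiB")
```

The rss page count times 4 kB reproduces the `anon-rss:15756584kB` value in the kill line, which supports the page-size assumption.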
