Problems with memory exhaustion

I have a request. I run Fedora Core 5 on my home server. Some time ago a problem started to appear: after a while the server would freeze and had to be hard-reset. I looked in the logs, and each time memory and swap had been exhausted in one way or another, and the server killed whichever service had the largest memory footprint at that moment (sometimes BIND, sometimes DHCPD, sometimes the AV, etc.).
I turned off all the services I consider unnecessary, but the problem still appears to be unresolved.
Now, after about 12 hours of uptime, it looks as if memory will run out again.
See the log.

############################################
top - 09:44:39 up 12:14, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 65 total, 1 running, 63 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.5% us, 0.1% sy, 0.0% ni, 99.0% id, 0.3% wa, 0.0% hi, 0.0% si
Mem: 513372k total, 403572k used, 109800k free, 44968k buffers
Swap: 786424k total, 76k used, 786348k free, 277952k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8732 root 15 0 2132 912 720 R 2 0.2 0:00.01 top
1 root 16 0 1988 664 572 S 0 0.1 0:01.29 init
2 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
3 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
4 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
5 root RT 0 0 0 0 S 0 0.0 0:00.02 migration/1
6 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
7 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
8 root 10 -5 0 0 0 S 0 0.0 0:00.00 events/0
9 root 10 -5 0 0 0 S 0 0.0 0:00.00 events/1
10 root 10 -5 0 0 0 S 0 0.0 0:00.02 khelper
11 root 11 -5 0 0 0 S 0 0.0 0:00.00 kthread
14 root 10 -5 0 0 0 S 0 0.0 0:00.32 kblockd/0
15 root 10 -5 0 0 0 S 0 0.0 0:00.14 kblockd/1
16 root 14 -5 0 0 0 S 0 0.0 0:00.00 kacpid
80 root 10 -5 0 0 0 S 0 0.0 0:00.00 khubd
135 root 15 0 0 0 0 S 0 0.0 0:00.81 pdflush
136 root 15 0 0 0 0 S 0 0.0 0:00.44 pdflush
137 root 15 0 0 0 0 S 0 0.0 0:01.10 kswapd0
138 root 11 -5 0 0 0 S 0 0.0 0:00.00 aio/0
139 root 11 -5 0 0 0 S 0 0.0 0:00.00 aio/1
226 root 10 -5 0 0 0 S 0 0.0 0:00.00 kseriod
302 root 11 -5 0 0 0 S 0 0.0 0:00.00 kpsmoused
327 root 11 -5 0 0 0 S 0 0.0 0:00.00 scsi_eh_0
384 root 13 -5 0 0 0 S 0 0.0 0:00.00 kmirrord
397 root 15 0 0 0 0 S 0 0.0 0:02.17 kjournald
436 root 11 -5 0 0 0 S 0 0.0 0:00.00 kauditd
460 root 13 -4 2200 680 384 S 0 0.1 0:00.38 udevd
1113 root 23 0 0 0 0 S 0 0.0 0:00.00 kjournald
1115 root 16 0 0 0 0 S 0 0.0 0:01.40 kjournald
1117 root 15 0 0 0 0 S 0 0.0 0:00.00 kjournald
1381 root 16 0 1656 560 460 S 0 0.1 0:00.13 syslogd
1384 root 15 0 1600 396 332 S 0 0.1 0:00.04 klogd
1394 root 16 0 1600 300 224 S 0 0.1 0:00.00 irqbalance
1459 rpc 15 0 1732 560 464 S 0 0.1 0:00.00 portmap
1478 rpcuser 17 0 1744 724 620 S 0 0.1 0:00.02 rpc.statd
1508 root 16 0 4728 584 308 S 0 0.1 0:00.00 rpc.idmapd

############################################

UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Sep08 ? 00:00:01 init [5]
root 2 1 0 Sep08 ? 00:00:00 [migration/0]
root 3 1 0 Sep08 ? 00:00:00 [ksoftirqd/0]
root 4 1 0 Sep08 ? 00:00:00 [watchdog/0]
root 5 1 0 Sep08 ? 00:00:00 [migration/1]
root 6 1 0 Sep08 ? 00:00:00 [ksoftirqd/1]
root 7 1 0 Sep08 ? 00:00:00 [watchdog/1]
root 8 1 0 Sep08 ? 00:00:00 [events/0]
root 9 1 0 Sep08 ? 00:00:00 [events/1]
root 10 1 0 Sep08 ? 00:00:00 [khelper]
root 11 1 0 Sep08 ? 00:00:00 [kthread]
root 14 11 0 Sep08 ? 00:00:00 [kblockd/0]
root 15 11 0 Sep08 ? 00:00:00 [kblockd/1]
root 16 11 0 Sep08 ? 00:00:00 [kacpid]
root 80 11 0 Sep08 ? 00:00:00 [khubd]
root 135 11 0 Sep08 ? 00:00:00 [pdflush]
root 136 11 0 Sep08 ? 00:00:00 [pdflush]
root 137 1 0 Sep08 ? 00:00:01 [kswapd0]
root 138 11 0 Sep08 ? 00:00:00 [aio/0]
root 139 11 0 Sep08 ? 00:00:00 [aio/1]
root 226 11 0 Sep08 ? 00:00:00 [kseriod]
root 302 11 0 Sep08 ? 00:00:00 [kpsmoused]
root 327 11 0 Sep08 ? 00:00:00 [scsi_eh_0]
root 384 11 0 Sep08 ? 00:00:00 [kmirrord]
root 397 1 0 Sep08 ? 00:00:02 [kjournald]
root 436 11 0 Sep08 ? 00:00:00 [kauditd]
root 460 1 0 Sep08 ? 00:00:00 /sbin/udevd -d
root 1113 1 0 Sep08 ? 00:00:00 [kjournald]
root 1115 1 0 Sep08 ? 00:00:01 [kjournald]
root 1117 1 0 Sep08 ? 00:00:00 [kjournald]
root 1381 1 0 Sep08 ? 00:00:00 syslogd -m 0
root 1384 1 0 Sep08 ? 00:00:00 klogd -x
root 1394 1 0 Sep08 ? 00:00:00 irqbalance
rpc 1459 1 0 Sep08 ? 00:00:00 portmap
rpcuser 1478 1 0 Sep08 ? 00:00:00 rpc.statd
root 1508 1 0 Sep08 ? 00:00:00 rpc.idmapd
dbus 1522 1 0 Sep08 ? 00:00:00 dbus-daemon --system
root 1628 1 0 Sep08 ? 00:00:00 /usr/sbin/automount --timeout=60 /net program /etc/auto.net
root 1641 1 0 Sep08 ? 00:00:00 cupsd
root 1649 1 0 Sep08 ? 00:00:00 /usr/sbin/sshd
root 1659 1 0 Sep08 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
ntp 1671 1 0 Sep08 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root 1681 1 0 Sep08 ? 00:00:00 /usr/sbin/dhcpd
############################################
I am also attaching the messages from the log… now I am wondering whether Samba might be causing it after all…

Sep 8 06:38:09 linux kernel: DMA32: empty
Sep 8 06:38:09 linux kernel: Normal: 56*4kB 3*8kB 1*16kB 1*32kB 1*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2920kB
Sep 8 06:38:11 linux kernel: HighMem: empty
Sep 8 06:38:12 linux kernel: Swap cache: add 641493, delete 641243, find 52399/106816, race 1+13
Sep 8 06:38:13 linux kernel: Free swap = 0kB
Sep 8 06:38:16 linux kernel: Total swap = 786424kB
Sep 8 06:38:19 linux kernel: Free swap: 0kB
Sep 8 06:38:22 linux kernel: 131070 pages of RAM
Sep 8 06:38:23 linux kernel: 0 pages of HIGHMEM
Sep 8 06:38:30 linux kernel: 2743 reserved pages
Sep 8 06:38:36 linux kernel: 26289 pages shared
Sep 8 06:38:40 linux kernel: 250 pages swap cached
Sep 8 06:38:42 linux kernel: 0 pages dirty
Sep 8 06:38:46 linux kernel: 0 pages writeback
Sep 8 06:38:55 linux kernel: 119131 pages mapped
Sep 8 06:38:55 linux kernel: 3788 pages slab
Sep 8 06:38:56 linux kernel: 2491 pages pagetables
Sep 8 06:38:56 linux kernel: oom-killer: gfp_mask=0x201d2, order=0
Sep 8 06:38:59 linux kernel: [] out_of_memory+0x25/0x15a [] __alloc_pages+0x1fa/0x27a
Sep 8 06:39:02 linux kernel: [] __do_page_cache_readahead+0xc9/0x212 [] __wake_up_common+0x2f/0x53
Sep 8 06:39:03 linux kernel: [] dm_any_congested+0x2f/0x35 [dm_mod] [] filemap_nopage+0x14a/0x312
Sep 8 06:39:04 linux kernel: [] __handle_mm_fault+0x3eb/0xadf [] schedule+0xae5/0xb50
Sep 8 06:39:07 linux kernel: [] do_page_fault+0x17f/0x5e2 [] do_page_fault+0x0/0x5e2
Sep 8 06:39:09 linux kernel: [] error_code+0x4f/0x54 <6>Mem-info:
Sep 8 06:39:12 linux kernel: DMA per-cpu:
Sep 8 06:39:15 linux kernel: cpu 0 hot: high 0, batch 1 used:0
Sep 8 06:39:17 linux kernel: cpu 0 cold: high 0, batch 1 used:0
Sep 8 06:39:20 linux kernel: cpu 1 hot: high 0, batch 1 used:0
Sep 8 06:39:21 linux kernel: cpu 1 cold: high 0, batch 1 used:0
Sep 8 06:39:24 linux kernel: DMA32 per-cpu: empty
Sep 8 06:39:27 linux kernel: Normal per-cpu:
Sep 8 06:39:29 linux kernel: cpu 0 hot: high 186, batch 31 used:4
Sep 8 06:39:30 linux kernel: cpu 0 cold: high 62, batch 15 used:52
Sep 8 06:39:31 linux kernel: cpu 1 hot: high 186, batch 31 used:27
Sep 8 06:39:34 linux kernel: cpu 1 cold: high 62, batch 15 used:57
Sep 8 06:39:48 linux kernel: HighMem per-cpu: empty
Sep 8 06:39:53 linux kernel: Free pages: 4988kB (0kB HighMem)
Sep 8 06:39:58 linux kernel: Active:59340 inactive:59447 dirty:0 writeback:0 unstable:0 free:1247 slab:3788 mapped:119131 pagetables:2491
Sep 8 06:39:59 linux kernel: DMA free:2068kB min:88kB low:108kB high:132kB active:5608kB inactive:4280kB present:16384kB pages_scanned:10244 all_unreclaimable? yes
Sep 8 06:40:00 linux kernel: lowmem_reserve[]: 0 0 495 495
Sep 8 06:40:08 linux kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Sep 8 06:40:19 linux kernel: lowmem_reserve[]: 0 0 495 495
Sep 8 06:40:23 linux kernel: Normal free:2920kB min:2804kB low:3504kB high:4204kB active:231880kB inactive:233380kB present:507896kB pages_scanned:591344 all_unreclaimable? yes
Sep 8 06:40:23 linux kernel: lowmem_reserve[]: 0 0 0 0
Sep 8 06:40:24 linux kernel: HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Sep 8 06:41:08 linux kernel: lowmem_reserve[]: 0 0 0 0
Sep 8 06:45:07 linux kernel: DMA: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2068kB
Sep 8 06:45:08 linux kernel: DMA32: empty
Sep 8 06:45:08 linux kernel: Normal: 56*4kB 3*8kB 1*16kB 1*32kB 1*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2920kB
Sep 8 06:45:08 linux kernel: HighMem: empty
Sep 8 06:45:31 linux kernel: Swap cache: add 641501, delete 641243, find 52399/106817, race 1+13
Sep 8 06:45:59 linux nmbd[1794]: [2006/09/08 06:45:58, 0] nmbd/nmbd_incomingdgrams.c:process_local_master_announce(309)
Sep 8 06:46:17 linux kernel: Free swap = 0kB
Sep 8 06:46:25 linux nmbd[1794]: process_local_master_announce: Server SUSE at IP 192.168.1.2 is announcing itself as a local master browser for workgroup KGB.VOK and we think we are master. Forcing election.
Sep 8 06:46:29 linux kernel: Total swap = 786424kB
Sep 8 06:46:30 linux nmbd[1794]: [2006/09/08 06:46:30, 0] nmbd/nmbd_become_lmb.c:unbecome_local_master_success(149)
Sep 8 06:46:30 linux kernel: Free swap: 0kB
Sep 8 06:46:33 linux nmbd[1794]: *****
Sep 8 06:46:36 linux kernel: 131070 pages of RAM
Sep 8 06:46:40 linux nmbd[1794]:
Sep 8 06:46:41 linux kernel: 0 pages of HIGHMEM
Sep 8 06:46:43 linux nmbd[1794]: Samba name server LINUX has stopped being a local master browser for workgroup XXX.XXX on subnet 192.168.1.1
Sep 8 06:46:44 linux kernel: 2743 reserved pages
Sep 8 06:46:45 linux nmbd[1794]:
Sep 8 06:46:47 linux kernel: 26280 pages shared
Sep 8 06:46:49 linux nmbd[1794]: *****
Sep 8 06:46:49 linux kernel: 258 pages swap cached
Sep 8 06:46:51 linux kernel: 0 pages dirty
Sep 8 06:46:52 linux kernel: 0 pages writeback
Sep 8 06:46:54 linux kernel: 119131 pages mapped
Sep 8 06:46:54 linux kernel: 3788 pages slab
Sep 8 06:46:54 linux kernel: 2491 pages pagetables
Sep 8 06:46:56 linux kernel: Out of Memory: Kill process 1437 (named) score 6031 and children.
Sep 8 06:46:57 linux kernel: Out of memory: Killed process 1437 (named).
Sep 8 06:46:59 linux kernel: Out of Memory: Kill process 1438 (named) score 6031 and children.
Sep 8 06:47:00 linux kernel: Out of memory: Killed process 1438 (named).
Sep 8 06:47:03 linux nmbd[1794]: [2006/09/08 06:47:03, 0] nmbd/nmbd_become_lmb.c:become_local_master_stage2(396)
Sep 8 06:47:04 linux kernel: oom-killer: gfp_mask=0x201d2, order=0

Could someone please try to advise what the problem might be and how to resolve it?
Thanks

Brusell

Post edited (09-09-06 09:46)

You have described the problem essentially correctly. What remains is to find out exactly which application is occupying the memory and then search that application's logs. It sometimes happens that a daemon generates an error, say every second, because it is misconfigured (for example, it cannot find its upstream DNS server) and stores the messages in an in-memory buffer that keeps growing… That is pure speculation, of course. I would probably focus on Samba, though, and possibly BIND…
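As a starting point for going through the logs, here is a minimal sketch that counts which processes the OOM killer has actually terminated. It matches only the "Out of memory: Killed process ..." lines in the format shown in the log excerpt above. The function name count_oom_kills is made up for illustration, and the sketch assumes a Python 3 interpreter, which an installation as old as Fedora Core 5 may not provide.

```python
import re
from collections import Counter

# Matches kernel lines of the form seen in the posted log, e.g.:
#   Sep 8 06:46:57 linux kernel: Out of memory: Killed process 1437 (named).
# The preceding "Out of Memory: Kill process ... score ..." announcement
# is deliberately not matched, so each kill is counted once.
KILLED = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def count_oom_kills(lines):
    """Return a Counter mapping process name -> number of OOM kills."""
    kills = Counter()
    for line in lines:
        match = KILLED.search(line)
        if match:
            kills[match.group(2)] += 1
    return kills
```

It could be fed an open syslog file, for example count_oom_kills(open("/var/log/messages", errors="replace")); that path is an assumption about where syslogd writes kernel messages on this machine. Keep in mind that the process that gets killed is the one with the highest score at that moment, which is not necessarily the one that is leaking.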

Is there some way to find out which application takes the most memory over the long term? I updated Samba, BIND, and DNS to the latest versions some time ago, so it may be a bug in one of them.

Thanks for the reply

If it cannot be read from plain ps or pmap, you can try valgrind (it is included in Fedora).

http://www.tldp.org/HOWTO/Valgrind-HOWTO/index.html
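To watch memory use over the long term without rereading ps output by hand, one option is to sample the resident set size of every process at intervals and print whichever has grown the most. The sketch below is only an illustration: it reads the VmRSS field from /proc/<pid>/status, the names read_rss_kb, growth, and watch are hypothetical, the 10-minute interval is an arbitrary choice, and it assumes a Python 3 interpreter.

```python
import os
import time


def read_rss_kb(proc_root="/proc"):
    """Return {(pid, name): resident set size in kB} for processes that report VmRSS."""
    usage = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as status:
                fields = dict(line.split(":", 1) for line in status if ":" in line)
        except OSError:
            continue  # process exited while we were looking at it
        if "VmRSS" in fields:  # kernel threads have no VmRSS line
            rss_kb = int(fields["VmRSS"].split()[0])
            usage[(int(entry), fields["Name"].strip())] = rss_kb
    return usage


def growth(before, after):
    """Return [(delta_kb, pid, name)] for processes in both samples, largest growth first."""
    deltas = [(after[key] - before[key], key[0], key[1])
              for key in after if key in before]
    return sorted(deltas, reverse=True)


def watch(interval_s=600, top_n=5):
    """Print the processes whose RSS grew most since startup, every interval_s seconds."""
    baseline = read_rss_kb()
    while True:
        time.sleep(interval_s)
        for delta, pid, name in growth(baseline, read_rss_kb())[:top_n]:
            print(f"{delta:+8d} kB  pid {pid}  {name}")
        print("-" * 30)
```

Calling watch() in a terminal and redirecting its output to a file gives a record that survives until the machine freezes. A process whose delta keeps climbing from one sample to the next is the candidate to inspect with pmap or valgrind.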