
# Understanding RAM indicators in Linux

This document aims to help a sysadmin diagnose Linux RAM usage problems, which can be confusing and difficult given the complexity of the indicators.

## Detailed explanation of /proc/meminfo fields

### Active(file)

From [1]:

Pagecache memory that has been used more recently and usually not reclaimed until needed.

### Cached

From [1]:

In-memory cache for files read from the disk (the pagecache). Doesn't include SwapCached.

### Inactive(file)

From [1]:

Pagecache memory that can be reclaimed without huge performance impact.

### MemAvailable

From [1]:

An estimate of how much memory is available for starting new applications, without swapping. Calculated from MemFree, SReclaimable, the size of the file LRU lists, and the low watermarks in each zone. The estimate takes into account that the system needs some page cache to function well, and that not all reclaimable slab will be reclaimable, due to items being in use. The impact of those factors will vary from system to system.
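
To make the definition concrete, here is a minimal sketch (ours, not the kernel's actual algorithm) that re-derives a rough MemAvailable from the fields named above. It ignores the watermark and page-cache-protection adjustments, so expect it to overshoot somewhat:

```python
# Rough re-derivation of MemAvailable from the fields it is built from.
# The kernel's real estimate also subtracts the zone low watermarks and
# protects a minimum amount of page cache; this sketch skips both.

def meminfo():
    """Parse /proc/meminfo into a dict of integer kB values."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            fields[key] = int(rest.split()[0])  # drop the " kB" suffix
    return fields

m = meminfo()
rough = m["MemFree"] + m["Active(file)"] + m["Inactive(file)"] + m["SReclaimable"]
print(f"rough estimate: {rough} kB")
print(f"MemAvailable  : {m['MemAvailable']} kB")  # the kernel's own figure
```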

### Shmem

From [1]:

Total memory used by shared memory (shmem) and tmpfs.

## References

On alambix97, MemTotal has the value 196498248 kB, while the expected value from the physical RAM (12 DIMMs of 16 GiB) is:

```
>>> 12*16*1024*1024
201326592
```

Moreover, alambix98 has the same amount of physical RAM as alambix97, but its MemTotal in /proc/meminfo is not exactly the same:

```
root@alambix98:~# cat /proc/meminfo | grep MemTotal
MemTotal:       196498260 kB
```
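
For a sense of scale, the gap between the raw DIMM capacity and the reported MemTotal works out as follows (interpreting it as the kernel/firmware reservations described in the quote below; the exact amount varies with kernel version and things like a crashkernel reservation):

```
>>> 201326592 - 196498248   # expected minus reported, in kiB: about 4.6 GiB
4828344
>>> 196498260 - 196498248   # alambix98 vs alambix97
12
```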

From [2]:

MemTotal — Total amount of usable RAM, in kibibytes, which is physical RAM minus a number of reserved bits and the kernel binary code.

Rik van Riel's comments when adding MemAvailable to /proc/meminfo:

/proc/meminfo: MemAvailable: provide estimated available memory

Many load balancing and workload placing programs check /proc/meminfo to estimate how much free memory is available. They generally do this by adding up "free" and "cached", which was fine ten years ago, but is pretty much guaranteed to be wrong today.

It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs, and it does not include reclaimable slab memory, which can take up a large fraction of system memory on mostly idle systems with lots of files.

Currently, the amount of memory that is available for a new workload, without pushing the system into swap, can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the "low" watermarks from /proc/zoneinfo.

However, this may change in the future, and user space really should not be expected to know kernel internals to come up with an estimate for the amount of free memory.

It is more convenient to provide such an estimate in /proc/meminfo. If things change in the future, we only have to change it in one place.

- MemAvailable: An estimate of how much memory is available for starting new applications, without swapping. Calculated from MemFree, SReclaimable, the size of the file LRU lists, and the low watermarks in each zone. The estimate takes into account that the system needs some page cache to function well, and that not all reclaimable slab will be reclaimable, due to items being in use. The impact of those factors will vary from system to system. 
- Active(file): Pagecache memory that has been used more recently and usually not reclaimed until needed. 
- Inactive(file): Pagecache memory that can be reclaimed without huge performance impact. 
- KReclaimable: Kernel allocations that the kernel will attempt to reclaim under memory pressure. Includes SReclaimable (below), and other direct allocations with a shrinker. 
- SReclaimable: Part of Slab, that might be reclaimed, such as caches. 
- SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure.
- Shmem: Total memory used by shared memory (shmem) and tmpfs.
- Cached: In-memory cache for files read from the disk (the pagecache). Doesn't include SwapCached. 
- SwapCached: Memory that once was swapped out, is swapped back in but still also is in the swapfile (if memory is needed it doesn't need to be swapped out AGAIN because it is already in the swapfile. This saves I/O). 
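
To see the point of the commit message in practice, here is a small sketch of ours contrasting the legacy "free + cached" heuristic with the kernel's MemAvailable. On a system with significant tmpfs or shmem usage the legacy figure overshoots, because Cached includes Shmem pages that cannot be dropped:

```python
# Contrast the obsolete "free + cached" heuristic with MemAvailable.
m = {}
with open("/proc/meminfo") as f:
    for line in f:
        key, rest = line.split(":", 1)
        m[key] = int(rest.split()[0])      # values are in kB

legacy = m["MemFree"] + m["Cached"]        # what old tools added up
print(f"legacy free+cached: {legacy} kB")
print(f"  of which Shmem  : {m['Shmem']} kB (not actually reclaimable)")
print(f"MemAvailable      : {m['MemAvailable']} kB")
```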

tmpfs is counted under Shmem, but it is also added into the Cached figure. In older Linux (kernel + procps), Cached was used when computing the "free" memory, which was pretty problematic, since most of us treat cached memory as immediately reclaimable; with tmpfs pages inside Cached, that is no longer true.
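
A quick way to observe this double counting is to write to a tmpfs and watch both Shmem and Cached grow by the same amount. A sketch, assuming /dev/shm is a tmpfs you may write to:

```python
# Write 100 MiB to tmpfs; both Shmem and Cached should grow by ~102400 kB.
import os

def read_field(field):
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])  # value in kB

before = (read_field("Shmem"), read_field("Cached"))
path = "/dev/shm/meminfo-demo"               # hypothetical scratch file
with open(path, "wb") as f:
    f.write(bytes(100 * 1024 * 1024))        # 100 MiB of zeroes
after = (read_field("Shmem"), read_field("Cached"))

print("Shmem  delta:", after[0] - before[0], "kB")  # ~102400
print("Cached delta:", after[1] - before[1], "kB")  # ~102400 as well
os.unlink(path)                              # give the memory back
```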

On a recent system (kernel >= 3.14) you will find something new under /proc/meminfo:

```
MemAvailable:    xxxx kB
```

This does take all these elements into account, and if htop and free relied on this value, you would get an accurate representation. Note that on my Debian 8 system, even though the kernel exposes MemAvailable, the tools do not use it:

```
ardi@oab1ardi-mcdev:~/mc/oattest1/workspace/bcm_linux_3_4rt$ cat /proc/meminfo | grep Avail
MemAvailable:    1319148 kB

ardi@oab1ardi-mcdev:~/$ free
             total       used       free     shared    buffers     cached
Mem:       2058360    1676332     382028      33116      40356     933916
-/+ buffers/cache:     702060    1356300
Swap:            0          0          0
```

```
ardi@oab1ardi-mcdev:~/$ sudo dd if=/dev/zero bs=1M count=200 of=/run/delme
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.0628098 s, 3.3 GB/s
```

```
ardi@oab1ardi-mcdev:~/$ free
             total       used       free     shared    buffers     cached
Mem:       2058360    1881060     177300     237916      40372    1138720
-/+ buffers/cache:     701968    1356392
Swap:            0          0          0

ardi@oab1ardi-mcdev:~/mc/oattest1/workspace/bcm_linux_3_4rt$ cat /proc/meminfo | grep Avail
MemAvailable:    1114152 kB
```

Note how free's "-/+ buffers/cache" estimate barely moves (1356300 -> 1356392 kB) even though 200 MB was just written to a tmpfs, while MemAvailable correctly drops from 1319148 kB to 1114152 kB.

A final sidenote:

In fact, tmpfs can be pretty dangerous. Unlike other types of memory usage, tmpfs files cannot be cleaned up by the OOM killer, nor is there any record of which process actually created them. This is why Debian 8, for example, chooses not to use tmpfs for /tmp (which any process can write to).
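
To illustrate with a hypothetical demonstration (path and size are ours): a tmpfs file keeps its memory pinned after the process that wrote it has exited, and nothing in /proc links the file back to its creator:

```python
# A tmpfs file outlives its creator; the memory stays pinned until the
# file is explicitly deleted -- the OOM killer cannot reclaim it.
import os
import subprocess

path = "/dev/shm/orphan-demo"                # assumes /dev/shm is a tmpfs

# A short-lived child process writes 50 MiB to tmpfs and exits.
subprocess.run(
    ["python3", "-c", f"open('{path}', 'wb').write(bytes(50 * 1024 * 1024))"],
    check=True,
)

# The child is gone, but the memory is still accounted under Shmem:
print(os.path.getsize(path), "bytes still held")   # 52428800
os.unlink(path)                              # deleting the file is the only fix
```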

Credits to the following links:

- https://linuxraj.wordpress.com/2015/03/10/memory-utilization-from-procmeminfo-memavailable/
- https://rwmj.wordpress.com/2012/09/12/tmpfs-considered-harmful/

Read about tmpfs in the kernel documentation (Documentation/filesystems/tmpfs.txt). The following is copied from there, explaining the relation between shared memory and tmpfs in particular.

1) There is always a kernel internal mount which you will not see at
   all. This is used for shared anonymous mappings and SYSV shared
   memory. 

   This mount does not depend on CONFIG_TMPFS. If CONFIG_TMPFS is not
   set the user visible part of tmpfs is not build, but the internal
   mechanisms are always present.

2) glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
   POSIX shared memory (shm_open, shm_unlink). Adding the following
   line to /etc/fstab should take care of this:

    tmpfs   /dev/shm    tmpfs   defaults    0 0

   Remember to create the directory that you intend to mount tmpfs on
   if necessary (/dev/shm is automagically created if you use devfs).

   This mount is _not_ needed for SYSV shared memory. The internal
   mount is used for that. (In the 2.3 kernel versions it was
   necessary to mount the predecessor of tmpfs (shm fs) to use SYSV
   shared memory)

So when you actually use POSIX shared memory (which I have used before, too), glibc will create a file under /dev/shm, which is used to share data between the applications. The file descriptor it returns refers to that file, and you can pass it to mmap to map the file into memory, just as you can with any "real" file. The techniques you listed are thus complementary, not competing: tmpfs is just the file system that provides in-memory files as an implementation technique for glibc.

As an example, here is a process on my box (German locale) that currently has such a shared memory object registered:

```
# pwd
/dev/shm
# ls -lh
insgesamt 76K
-r-------- 1 js js 65M 24. Mai 16:37 pulse-shm-1802989683
#
```
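
To reproduce such an entry yourself, here is a minimal sketch using Python's multiprocessing.shared_memory module (Python >= 3.8), which goes through shm_open() under the hood and therefore shows up in /dev/shm:

```python
# Create a POSIX shared memory object and watch it appear in /dev/shm.
import os
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=65536, name="demo-shm")
shm.buf[:5] = b"hello"             # the mmap'ed view is directly writable

print("demo-shm" in os.listdir("/dev/shm"))  # True

shm.close()                        # unmap our view...
shm.unlink()                       # ...and remove the file from /dev/shm
```

While the object exists, it is counted under Shmem in /proc/meminfo, just like the pulse-shm file above.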