Monday, July 26, 2010

Dissect Linux memory cache

The “free” command has buffers and cached columns. What is the difference between them, and how can you dig further to find the sizes of the dentry cache, inode cache and page cache?

Difference between Buffers and cached


$free -m
             total       used       free     shared    buffers     cached
Mem:          3777       3746         31          0        160        954
-/+ buffers/cache:       2631       1145
Swap:          753          1        751

Buffer Pages
Whenever the kernel must individually address a block, it refers to the buffer page that holds the block buffer and checks the corresponding buffer head.
Here are two common cases in which the kernel creates buffer pages:
· When reading or writing pages of a file that are not stored in contiguous disk blocks. This happens either because the filesystem has allocated noncontiguous blocks to the file, or because the file contains "holes"
· When accessing a single disk block (for instance, when reading a superblock or an inode block).


Raw disk operations such as dd use buffers.
Read 10 MB from a raw disk partition:
$dd if=/dev/sda6 of=/dev/zero bs=1024k count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.209051 seconds, 50.2 MB/s

The buffers value increased by 10 MB (from 160 MB to 170 MB):
$free -m
             total       used       free     shared    buffers     cached
Mem:          3777       3754         23          0        170        952
-/+ buffers/cache:       2631       1145
Swap:          753          1        751
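
free takes these numbers from /proc/meminfo, so the two columns can also be read directly. A minimal sketch, assuming a helper named meminfo_kb (a made-up name, not part of any standard tool):

```shell
# meminfo_kb FIELD [FILE]
# Print the value (in kB) of one /proc/meminfo field, e.g. Buffers or Cached.
# The helper name is hypothetical; FILE defaults to /proc/meminfo.
meminfo_kb() {
    awk -v field="$1:" '$1 == field { print $2 }' "${2:-/proc/meminfo}"
}

# Usage (values are in kB; divide by 1024 for the MB shown by free -m):
#   meminfo_kb Buffers
#   meminfo_kb Cached
```

Matching on the exact field name (with the trailing colon) keeps Cached from also matching SwapCached.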
Dentry cache, inode cache and page cache
Dentry cache: directory entry cache, used for pathname (filename) lookups.
Inode cache: cache of inodes, not of the actual data blocks.
Page cache: cache of the actual data blocks.
[From: http://www.mjmwired.net/kernel/Documentation/filesystems/vfs.txt ]
The dentry and inode caches are slab caches, so their combined size cannot be bigger than the whole slab size:
$grep -i slab /proc/meminfo 
Slab: 183896 kB

Examine the details in /proc/slabinfo (columns printed: cache name, active objects, total objects, object size in bytes):
$ awk '/dentry|inode/ { print $1,$2,$3,$4}' /proc/slabinfo 
nfs_inode_cache 122787 123312 984
rpc_inode_cache 24 25 768
ext3_inode_cache 9767 9770 776
mqueue_inode_cache 1 4 896
isofs_inode_cache 0 0 624
minix_inode_cache 0 0 640
hugetlbfs_inode_cache 1 7 576
ext2_inode_cache 0 0 728
shmem_inode_cache 441 455 776
sock_inode_cache 231 235 704
proc_inode_cache 670 756 608
inode_cache 2415 2415 576
dentry_cache 99060 110162 200
#Sum them up (total objects x object size, in bytes)
$awk '/dentry|inode/ { x=x+$3*$4} END {print x }' /proc/slabinfo 
153348952
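
The same data can be broken down per cache. A rough sketch (the function name slab_cache_kb is hypothetical) that multiplies total objects by object size, as above, and prints the result in kB, largest first. It ignores per-slab overhead, so the figures can come out lower than the CACHE SIZE column of slabtop:

```shell
# slab_cache_kb [FILE]
# Approximate size in kB of every dentry/inode slab cache, largest first.
# FILE defaults to /proc/slabinfo, which is often readable by root only.
slab_cache_kb() {
    awk '/dentry|inode/ { printf "%d %s\n", int($3 * $4 / 1024), $1 }' \
        "${1:-/proc/slabinfo}" | sort -rn
}

# Usage:
#   slab_cache_kb | head -5
```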

View live statistics with slabtop, sorted by cache size:
$slabtop -s c
Active / Total Objects (% used) : 332028 / 361849 (91.8%)
Active / Total Slabs (% used) : 45630 / 45631 (100.0%)
Active / Total Caches (% used) : 102 / 139 (73.4%)
Active / Total Size (% used) : 171443.16K / 175338.12K (97.8%)
Minimum / Average / Maximum Object : 0.02K / 0.48K / 128.00K
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
125948 125864  99%    0.96K  31487        4    125948K nfs_inode_cache
113962 105262  92%    0.20K   5998       19     23992K dentry_cache
 14609  14460  98%    0.52K   2087        7      8348K radix_tree_node
  9770   9765  99%    0.76K   1954        5      7816K ext3_inode_cache
 59048  44057  74%    0.09K   1342       44      5368K buffer_head
  2415   2410  99%    0.56K    345        7      1380K inode_cache

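I haven’t found a way to get the page cache size directly from free. The dentry and inode caches are accounted for under Slab in /proc/meminfo rather than under Cached, so the cached column is already a close estimate of the page cache.
Alternatively, observe how the values change when the caches are released: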

#write dirty pages to disk
sync
#To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
#To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
#To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches

Dropping caches is a non-destructive operation, and dirty objects are not freeable, so it is highly recommended to run “sync” first.
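
To observe the change, one option is to save a copy of /proc/meminfo before and after dropping caches and print the difference. A small sketch, assuming a helper named meminfo_delta (hypothetical) and snapshot files that you create yourself:

```shell
# meminfo_delta BEFORE AFTER
# Print how Buffers, Cached and Slab changed (in kB) between two saved
# copies of /proc/meminfo. The helper name is made up for this example.
meminfo_delta() {
    awk '/^(Buffers|Cached|Slab):/ {
             if (FNR == NR) before[$1] = $2                    # first file
             else printf "%s %d kB\n", $1, $2 - before[$1]     # second file
         }' "$1" "$2"
}

# Usage (as root):
#   cp /proc/meminfo /tmp/before
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   cp /proc/meminfo /tmp/after
#   meminfo_delta /tmp/before /tmp/after
```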
#Make the kernel prefer shrinking the page cache over swapping.
sysctl -w vm.swappiness=0

vm.swappiness accepts values from 0 to 100. A lower value makes the kernel tend to shrink the page cache to get free memory; a higher value makes it tend to use swap to get free memory.

Friday, July 16, 2010

Debugging issues with strace in Linux.

strace runs the specified command until it exits, intercepting and recording its system calls. The -T option shows the time spent in each system call. It is particularly useful for troubleshooting slow responses, because you can pinpoint the step that takes the longest. The -r option prints a relative timestamp upon entry to each system call, which is roughly the time spent in the previous system call; it is easier to read than the -T output because it is displayed in the first column.

#The following telnet command took 20 seconds to respond. Was the problem the DNS or the web server? (It is obviously a DNS issue, but I want to demonstrate how strace can pinpoint it.)
$time strace -f -F -i -r -t -T -v -o /tmp/trace.log telnet  www.google.com 80
telnet: could not resolve www.google.com/80: Name or service not known
real    0m20.080s
user    0m0.010s
sys     0m0.050s
#List line numbers and time spent, sorted by time

$awk '{ print "LINE#"NR, $1}'  /tmp/trace.log | sort -nk2 | tail -5
LINE#55 0.010000
LINE#140 5.000075
LINE#154 5.000075
LINE#136 5.000076
LINE#150 5.000076
#Print the lines in question. It is clear that the lookup timed out waiting for a response from DNS server 100.0.2.3. It tried four times (the other three timeouts are not shown here), and each attempt took 5 seconds.
$awk '{  if ( NR > 125  &&  NR <= 136 ) {print "LINE#"NR, $0 } }' /tmp/trace.log
LINE#126      0.000000 [b7e601d1] stat64("/etc/resolv.conf", {st_dev=makedev(117, 0), st_ino=50235, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=83, st_atime=2010/07/16-09:47:25, st_mtime=2010/07/16-09:45:02, st_ctime=2010/07/16-09:45:02}) = 0 <0.000000>
LINE#127      0.000000 [b7e2a0f1] gettimeofday({1279237645, 625155}, NULL) = 0 <0.000000>
LINE#128      0.000000 [b7e72402] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4 <0.000000>
LINE#129      0.000000 [b7e71f0c] connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("100.0.2.3")}, 28) = 0 <0.000000>
LINE#130      0.000000 [b7e61e88] fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR) <0.000000>
LINE#131      0.000000 [b7e61e88] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000000>
LINE#132      0.000000 [b7e2a0f1] gettimeofday({1279237645, 625155}, NULL) = 0 <0.000000>
LINE#133      0.000000 [b7e67296] poll([{fd=4, events=POLLOUT}], 1, 0) = 1 ([{fd=4, revents=POLLOUT}]) <0.000000>
LINE#134      0.000000 [b7e7220c] send(4, "B\262\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1"..., 32, MSG_NOSIGNAL) = 32 <0.000000>
LINE#135      0.000000 [b7e67296] poll([{fd=4, events=POLLIN}], 1, 5000) = 0 (Timeout) <5.000076>
LINE#136      5.000076 [b7e2a0f1] gettimeofday({1279237650, 625231}, NULL) = 0 <0.000000>
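
The sort above relies on the -r column, which shows the delay on the line after the slow call. The -T value in angle brackets at the end of each line points at the slow call itself (line 135 here). A minimal sketch that ranks lines by that value; the function name slowest_calls is hypothetical:

```shell
# slowest_calls LOGFILE [COUNT]
# Rank the lines of an strace log written with -T by the <seconds> value
# at the end of each line and print the COUNT slowest (default 5).
slowest_calls() {
    awk 'match($0, /<[0-9]+\.[0-9]+>$/) {
             print substr($0, RSTART + 1, RLENGTH - 2), "LINE#" NR
         }' "$1" | sort -rn | head -n "${2:-5}"
}

# Usage:
#   slowest_calls /tmp/trace.log
```

Lines without a trailing duration (for example unfinished calls in a multi-process trace) are simply skipped.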

Wednesday, July 7, 2010

Master Bash command line

You may already know the bash shortcut keys Ctrl+A, Ctrl+E, Ctrl+R and so on, but do you know how they are defined, and which other shortcut keys are useful?
The shortcut keys come from the readline library; bash seems to be the only shell that supports readline.
In addition to the shell supporting readline, its editing mode has to be set to emacs instead of vi:

$set -o | egrep '^(emacs|vi) '
emacs           on
vi              off

To check current key bindings:
$bind -p | grep '\\[CM]'
…
"\M-.": yank-last-arg
"\M-_": yank-last-arg
"\M-\C-y": yank-nth-arg
"\M-y": yank-pop
The text C-k is read as `Control-K'; Control is the Ctrl key.
The text M-k is read as `Meta-K'; Meta is the ESC or ALT key.
(The following is not a complete list of shortcut keys; check the readline readme for a complete list.)

Example command:

$ echo one two three
one two three
TASK #1 paste the last argument: the word “three”
m-. : insert the last argument of the previous command (the special variable $_ also refers to the last argument; it works in ksh and bash)


TASK #2 paste the 1st argument
m-c-y : to paste the 1st argument “one”


TASK #3 paste the command or the nth argument
m-2-m-c-y : to paste the 2nd argument “two”;

m-0-m-c-y : to paste the command “echo” itself.

TASK #4 delete word “three” to change it to “echo one two”
Step #1) Retrieve last command:
Use any of these three options: the Up arrow key; c-p; or c-r to search (keep pressing c-r to find the next match)
Step #2) Delete the word.
c-w : delete from the cursor back to the previous whitespace (you don’t need to press backspace five times to delete the word)
m-d is the opposite of c-w: it deletes forward from the cursor to the end of the word. So to delete “echo”, the key sequence is c-p c-a m-d

Less frequently used keys to cut text: c-k and c-u
C-k Kill the text from the current cursor position to the end of the line.
C-u Kill backward from the cursor to the beginning of the current line.
To delete the text “echo one”: c-p m-b m-b c-u (m-b moves back one word)


TASK #5 swap “three” and “two” to change it to “echo one three two”
c-p m-b c-w c-e <space> c-y : the key point is to cut (c-w) the word “two” and paste (c-y) it at the end of the line; c-w also cuts the space after “two”, so type a space before pasting


TASK #6 undo the changes
m-r Undo all changes made to this line
c-_ Incremental undo, separately remembered for each line


###Other shell tips
CDPATH: directory search path. Instead of typing the full path, first add the parent directory to the CDPATH environment variable (save it in .profile to make the change permanent); then “cd dir-name” goes to that directory.
$ export CDPATH=/var/log/
$cd audit
/var/log/audit
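
To try CDPATH without touching real directories, here is a self-contained sketch. The function name cdpath_demo and the logs/audit layout are invented for this example; it builds a throwaway tree, looks up “audit” through CDPATH from an unrelated directory, and prints where cd ended up:

```shell
# cdpath_demo
# Create a temporary tree <tmp>/logs/audit, then cd into "audit" by name
# from / using CDPATH. Everything here is a hypothetical test layout.
cdpath_demo() {
    base=$(mktemp -d) || return 1
    mkdir -p "$base/logs/audit"
    (
        CDPATH="$base/logs"
        cd /                  # start somewhere unrelated
        cd audit >/dev/null   # resolved via CDPATH; cd itself prints the path
        pwd
    )
    rm -rf "$base"
}

# Usage:
#   cdpath_demo    # prints a path ending in /logs/audit
```

The subshell keeps the CDPATH setting and the directory changes from leaking into the calling shell.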
OLDPWD: when “-” is used as the operand of cd, it is equivalent to $OLDPWD, so “cd -” behaves like the command:
cd "$OLDPWD" && pwd ; (reference: man ksh or man bash)


$cd /tmp
$cd /var/tmp
$cd -
/tmp
DIRS
“-” only remembers the last directory. To remember more than two directories, use “pushd .”, which can remember any number of them;
the saved directory names can be displayed with the “dirs” command, and “popd” returns to the most recently saved one.