Thursday, October 13, 2011

Collectl, an all-in-one tool for collecting Linux statistical data

Collectl,collect for Linux, is a single tool which integrates functions of various tools:sar,iostat,mpstat,top,slaptop,netstat,nfstat,ps .. .
- Supported: Linux
- Requirement: Perl
Collectl features:
- run in command line or run as daemon
- Various output formats: raw,gunplot,gexprt(ganglia),sexpr,lexpr,csv(--sep ,)
- Send data to other programs (ganglia) remotely via socket instead of writing to a file
- IPMI monitoring for fans and temperature sensors
- Support module (Perl scripts)  for customized checks
- Monitor process’s disk read/write, find the top processes keeping disk busy
The last one is the most impressive feature, I haven’t found other Linux tools can do it. (DTtrace can in Solaris)
collectl  examples
#help, all options 
$collect –x
#-s?, what to monitor:c – cpu  d – disk “collectl   --showsubsys”
#-c 5 : collect 5 samples and exit
#-oT:  T - preface output with time only ; “collectl   --showoptions”
$collectl   -sc -c5 -i2 --verbose -oT
waiting for 2 second sample...
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time      User  Nice   Sys  Wait   IRQ  Soft Steal  Idle  CPUs  Intr  Ctxsw  Proc  RunQ   Run   Avg1  Avg5 Avg15
12:39:34      0     0     0     0     0     1     0    97     1  1082     23     0    76     1   0.42  0.42  0.44
12:39:36      0     0     0     0     0     1     0    97     1  1088     24     0    76     1   0.42  0.42  0.44

The following demonstrates how collectl identify the process reading/writing most data to disk
#Hammer disk by writing 50mb data with dd
$dd if=/dev/urandom of=test bs=1k count=50000
#collectl identifies the “dd” process
#in top mode, sort by  “iokb   total I/O KB” ; “collectl –showtopopts”
$collectl -i2  --top iokb
TOP PROCESSES sorted by iokb (counters are /sec) 12:50:31
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
6861  root     18  6784    0 R    3M  572K  0  0.91  0.00  45   0:00.91    0 3680    0   97 dd
1  root     15     0    0 S    2M  632K  0  0.00  0.00   0   0:28.21    0    0    0    0 init
2  root     RT     1    0 S     0     0  0  0.00  0.00   0   0:00.00    0    0    0    0 migration/0

2 comments:

  1. glad you're finding collectl useful. I just pushed a new version out to sourceforge so you may want to snatch a new copy.

    I'd also recommend you try out both colplot and colmux in the collectl-utils project at sourceforge. colplot is very slick and easy to use to generate plots of your collectl data via a broswer. colmux allows you to multiplex multiple sessions to multiple nodes and present the output in top-like format for almost anything collectl can report.

    -mark

    ReplyDelete
  2. It seems a good tool, i gonna test it. I was trying to make one by myself but this is exactly what i need. Indeed i achieved some results for monitoring my cluster performance but i had to combine the output of several tools like sar, vmstat and uptime. As soon as i have results i will share my comments :)

    ReplyDelete