Thursday, June 20, 2013

VMware PowerCLI: Map datastore name to LUN devicename.

Mapping a datastore name to its LUN device name is less obvious in native PowerCLI code than you might expect.
The esxcli interface exposed to PowerCLI makes it very easy. (Tested on ESXi 5.0)
PowerCLI>$esxcli=get-esxcli -vmhost esx01
PowerCLI>$esxcli.storage.vmfs.extent.list() | ft devicename,volumename -autosize

DeviceName                           VolumeName
----------                           ----------
naa.600601605bc02e00007fb97cacbee211 datastore01
naa.600601605bc02e00fac8cd88acbee211 datastore02

Script to automatically partition a new disk and create LVM PV

It is a very common task to create a single partition spanning the whole disk and create an LVM PV on it. How can it be automated?

fdisk doesn't support creating partitions in script mode; sfdisk does, but it is not as capable as the powerful parted tool. parted can also optimize partition alignment automatically (parted -a optimal).

#!/bin/bash
#Create a single primary partition with whole disk size and create LVM PV on it
#Assumed here: the disk device is the first argument and the single partition is number 1
disk=$1
partno=1
if [[ -z $disk ]]; then
 echo "Usage: $0 disk device name: e.g $0 /dev/sdb"
 exit 1
fi

if [[ -e ${disk}${partno} ]]; then
 echo "==> ${disk}${partno} already exists"
 exit 1
fi

echo "==> Create MBR label"
parted -s $disk mklabel msdos
ncyl=$(parted $disk unit cyl print | sed -n 's/.*: \([0-9]*\)cyl/\1/p')

#the cylinder count must be a non-empty string of digits
if [[ ! $ncyl =~ ^[0-9]+$ ]]; then
 echo "disk $disk has invalid cylinders number: $ncyl"
 exit 1
fi

echo "==> create primary partition $partno with $ncyl cylinders"
parted -s -a optimal $disk mkpart primary 0cyl ${ncyl}cyl
echo "==> set partition $partno to type: lvm"
parted -s $disk set $partno lvm on
partprobe > /dev/null 2>&1
echo "==> create PV ${disk}${partno}"
pvcreate ${disk}${partno}
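
The cylinder-count extraction used by the script can be tried in isolation, without touching a real disk. The sketch below feeds the same sed expression an illustrative sample of "parted unit cyl print" output; the sample text and the helper name parse_cylinders are assumptions for the demonstration, and real output varies with the disk and the parted version.

```shell
#!/bin/bash
# Extract the cylinder count from "parted <disk> unit cyl print" output on stdin.
# Same sed expression as the script; only the "Disk ...: <n>cyl" line matches.
parse_cylinders() {
  sed -n 's/.*: \([0-9]*\)cyl/\1/p'
}

# Illustrative sample only; not captured from a real system.
sample='Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 1305cyl
Sector size (logical/physical): 512B/512B
BIOS cylinder,head,sector geometry: 1305,255,63.  Each cylinder is 8225kB.
Partition Table: msdos'

ncyl=$(printf '%s\n' "$sample" | parse_cylinders)
if [[ $ncyl =~ ^[0-9]+$ ]]; then
  echo "cylinders: $ncyl"
else
  echo "invalid cylinder count: '$ncyl'"
fi
```

Validating the result with a full-string regular expression rejects an empty value, which is what the sed command returns when no line matches.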

Friday, April 19, 2013

Understanding SysV-style Initscripts in Red Hat Linux

You often need to write your own init start/stop script. The following are the minimum requirements for the script to behave as expected. The discussion is based on the Red Hat Linux family; other distributions such as Debian use LSB (Linux Standard Base Specification) init scripts.

Location of the script

/etc/init.d is the well-known location, but /etc/rc.d/init.d is the real, original location. Since /etc/init.d is a symbolic link to /etc/rc.d/init.d, it makes no difference which one you use.

Header of the script

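It needs at least 3 lines: the shell script interpreter (/bin/sh, /bin/bash, etc.), the chkconfig header and the script description. In the chkconfig header below, 345 are the runlevels in which the service is enabled, 56 is the start priority and 10 is the stop priority.
#!/bin/sh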
#   chkconfig: 345 56 10
#   description: Startup/shutdown script for the Common UNIX 
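
To make the three fields of the chkconfig header concrete, here is a small sketch that splits such a line into runlevels, start priority and stop priority. The function name parse_chkconfig is made up for the example; chkconfig itself does this parsing internally.

```shell
#!/bin/bash
# Split a "# chkconfig: <runlevels> <start> <stop>" header line read from stdin.
# Runlevels may also be "-" (no default runlevels), hence the [0-9-] class.
parse_chkconfig() {
  sed -n 's/^#[[:space:]]*chkconfig:[[:space:]]*\([0-9-]*\)[[:space:]]*\([0-9]*\)[[:space:]]*\([0-9]*\).*/runlevels=\1 start=\2 stop=\3/p'
}

echo '#   chkconfig: 345 56 10' | parse_chkconfig
```

With the header shown above, the service would get S56 links in rc3.d, rc4.d and rc5.d and K10 links in the remaining runlevels.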

Body of the script

Obviously, it needs to accept the parameter “start”, which the /etc/rc3.d/S* links pass on OS startup, and the parameter “stop”, which the /etc/rc0.d/K* links pass on OS shutdown.
The lockfile is often overlooked. The shutdown sequence uses it to check whether the daemon was started; without it, the stop action won’t be called. If a script starts on OS startup but never stops properly on shutdown, you need to create the lockfile. Note: the lockfile is not the pidfile, which contains the PID of the process; the lockfile is usually an empty file.
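lockfile=/var/lock/subsys/$(basename $0)
case $1 in
  start) # start the daemon here, then record success:
    [ $? = 0 ] && touch $lockfile ;;
  stop)  # stop the daemon here, then clear the lockfile:
    [ $? = 0 ] && rm -f $lockfile ;;
esac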
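
One detail worth checking with the lockfile line: when the script is invoked through an rcN.d link, $0 may be the link name (for example S56myapp), so basename $0 would carry the S/K sequence prefix. Assuming the shutdown sequence looks for the lockfile under the plain service name, a sketch like the following strips that prefix. The function name subsys_name and the service name myapp are hypothetical.

```shell
#!/bin/bash
# Derive the service name from the path the script was invoked with,
# dropping a leading S<nn> or K<nn> sequence prefix if present.
subsys_name() {
  local name
  name=$(basename "$1")
  case $name in
    [SK][0-9][0-9]*) name=${name#???} ;;  # remove the 3-character prefix
  esac
  printf '%s\n' "$name"
}

subsys_name /etc/rc3.d/S56myapp
subsys_name /etc/init.d/myapp
```

Both calls print the same name, so the lockfile path stays identical whether the script is run directly or through a runlevel link. Many stock init scripts avoid the question by hardcoding the service name instead.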


It is recommended to source /etc/rc.d/init.d/functions and use ‘daemon’ to start your application and ‘killproc’ to stop it instead of reinventing the wheel.

LSB headers

You may see something like this in an init script.
 # Provides: boot_facility_1 [ boot_facility_2 ...]
 # Required-Start: boot_facility_1 [ boot_facility_2 ...]
 # Required-Stop: boot_facility_1 [ boot_facility_2 ...]
 # Should-Start: boot_facility_1 [ boot_facility_2 ...]
 # Should-Stop: boot_facility_1 [ boot_facility_2 ...]
 # Default-Start: run_level_1 [ run_level_2 ...]
 # Default-Stop: run_level_1 [ run_level_2 ...]
 # Short-Description: short_description
 # Description: multiline_description
These are LSB (Linux Standard Base) headers. They are supported by default in Debian and SUSE Linux.
Red Hat Linux supports them through the additional package “redhat-lsb”, which is not installed by default. Be warned: 50+ dependencies need to be installed as well.
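
In a real script these headers sit between the "### BEGIN INIT INFO" and "### END INIT INFO" marker lines. As a rough illustration of how a tool can read them, the sketch below pulls one field out of that block. The function name lsb_field and the sample service are invented for the example.

```shell
#!/bin/bash
# Print the value of one LSB header field from an init script read on stdin.
# Only lines between the BEGIN/END INIT INFO markers are considered.
lsb_field() {
  sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' |
    sed -n "s/^#[[:space:]]*$1:[[:space:]]*//p"
}

# Hypothetical init script header used as test input.
sample='#!/bin/sh
### BEGIN INIT INFO
# Provides: myapp
# Required-Start: $network
# Default-Start: 3 4 5
# Default-Stop: 0 1 6
# Short-Description: hypothetical example service
### END INIT INFO'

printf '%s\n' "$sample" | lsb_field Default-Start
```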


Sunday, March 24, 2013

Configure multipath on Solaris 11 for IBM V7000 SAN storage

IBM V7000 is not listed in the output of “mpathadm show mpath-support”, but it is still supported: Solaris 11 MPxIO supports any third-party storage device that is T10/T11 standards-compliant.
Procedure to set up multipath
#Set up zoning on the SAN switch
#Log in to the V7000 management UI to map the volume to the Solaris host, select host type ‘TPGS’
#Rescan new SAN disks without rebooting
$cfgadm -o force_update -c configure cX (X is the port id as shown by cfgadm -al)
#Verify SAN disks are detected.
#Create scsi_vhci.conf
#scsi_vhci.conf doesn’t need to be customized; the scsi-vhci-failover-override parameter is optional because IBM V7000 is detected as f_tpgs with the standard probe.
$cp /kernel/drv/scsi_vhci.conf /etc/driver/drv/scsi_vhci.conf

#Obtain the device path of the fc ports of a single HBA
$ls -l /dev/cfg
lrwxrwxrwx   1 root     root          60 Feb 28 10:20 c4 -> ../../devices/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc
lrwxrwxrwx   1 root     root          62 Feb 28 10:20 c5 -> ../../devices/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:fc

The path needed is the string between ../../devices and /fp@, so the paths are:
/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0
/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0,1
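
The extraction of the parent path can also be scripted. The sketch below takes the two /dev/cfg link targets from the listing and prints fp.conf entries in the form used in this post; the helper name fp_parent is an assumption, and port=0 is simply copied from the entries in this post rather than derived.

```shell
#!/bin/bash
# Turn a /dev/cfg symlink target into the controller path used as "parent" in fp.conf:
# keep everything after ".../devices" and before "/fp@".
fp_parent() {
  printf '%s\n' "$1" | sed -n 's|.*/devices\(/.*\)/fp@.*|\1|p'
}

for target in \
  '../../devices/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc' \
  '../../devices/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:fc'
do
  printf 'name="fp" parent="%s" port=0 mpxio-disable="no";\n' "$(fp_parent "$target")"
done
```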

#Create fp.conf
cp /kernel/drv/fp.conf /etc/driver/drv/fp.conf
#edit fp.conf to enable multipath for the two fc ports only
name="fp" parent="/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0" port=0 mpxio-disable="no";
name="fp" parent="/pci@400/pci@2/pci@0/pci@8/SUNW,qlc@0,1" port=0 mpxio-disable="no";
#Run the command to enable multipathing on fc ports only; the server will need to be rebooted.
$stmsboot -u -D fp

#After the reboot, “echo | format” shows only a single disk
$stmsboot -L
non-STMS device name                    STMS device name
/dev/rdsk/c4t50050768024046D8d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c4t50050768022046D9d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c5t50050768022046D8d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c5t50050768024046D9d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
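
A quick way to confirm that all four paths collapse into one multipathed device is to count the non-STMS names per STMS name. This is a minimal sketch that works on saved "stmsboot -L" output; the function name count_paths is made up, and the sample is the listing above.

```shell
#!/bin/bash
# Count non-STMS paths per STMS device from "stmsboot -L" output on stdin.
# The header line is skipped because its fields do not start with /dev/rdsk/.
count_paths() {
  awk '$1 ~ /^\/dev\/rdsk\// && $2 ~ /^\/dev\/rdsk\// { n[$2]++ }
       END { for (d in n) print d, n[d] }'
}

sample='non-STMS device name                    STMS device name
/dev/rdsk/c4t50050768024046D8d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c4t50050768022046D9d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c5t50050768022046D8d0 /dev/rdsk/c0t6005076802830163A000000000000005d0
/dev/rdsk/c5t50050768024046D9d0 /dev/rdsk/c0t6005076802830163A000000000000005d0'

printf '%s\n' "$sample" | count_paths
```
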
$mpathadm list lu
                Total Path Count: 1
                Operational Path Count: 1
                Total Path Count: 1
                Operational Path Count: 1
                Total Path Count: 4
                Operational Path Count: 4
$mpathadm show lu /dev/rdsk/c0t6005076802830163A000000000000005d0s2

#The disk is detected as f_tpgs as shown in the messages log
$grep f_ /var/adm/messages
Mar 21 13:56:46 dnmsovm1 scsi: [ID 583861] ssd4 at scsi_vhci0: unit-address g6005076802830163a000000000000005: f_tpgs

Friday, February 8, 2013

Monitor customized application in Windows by SNMP

The native SNMP service in Windows can provide basic metrics such as CPU, memory and disk, but it lacks the “extend” feature of net-snmp, which allows you to run a script for application monitoring. Net-snmp can’t be used as a replacement for the Windows SNMP service because some SNMP extension agents rely on it, and because of known issues such as the HOST-RESOURCES MIB not working in net-snmp.

The good news is that net-snmp can co-exist with Windows SNMP: you get nice features like extend, while the other functions are passed to the native Windows SNMP service.

As of Net-SNMP 5.4, the Net-SNMP agent is able to load the Windows SNMP service extension DLLs by using the Net-SNMP winExtDLL extension. The extension requires the net-snmp binary to be native (the 32-bit net-snmp extension won’t work on 64-bit Windows).

A 64-bit net-snmp binary is hard to find; it seems only net-snmp-5.5.0-2 has a pre-compiled 64-bit binary, so you might need to compile other versions yourself.

Install net-snmp

Run the net-snmp binary installer, select “with Windows Extension” instead of the standard agent, and unselect “net-snmp trap service” and “Perl SNMP modules”. The default path is c:\usr

Configure net-snmp

Register net-snmp as Windows service

Edit c:\usr\registeragent.bat to disable the modules that conflict with Windows by adding the -I parameter (see the snmpd.exe command line at the end of this post for the module list).
(Note: if system_mib is also disabled, SNMPv2-MIB::sysUpTime won’t report the correct time)
Run c:\usr\registeragent.bat

Edit C:\usr\etc\snmp\snmpd.conf

rocommunity public
#Test extend feature to execute a script, the script path must use Unix style ‘/’
extend userscript c:/temp/test1.bat

Start the Windows service “net-snmp agent” (the native SNMP service must be stopped)


#Test standard SNMP metrics, the HOST-RESOURCES-MIB is provided by native SNMP service, not net-snmp
[root@zabbix]#/usr/bin/snmpwalk -v 2c  -c public   HOST-RESOURCES-MIB::hrSystemUptime
HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (640892116) 74 days, 4:15:21.16

#The extend feature is provided by net-snmp, Execute the script by snmpwalk
[root@zabbix]#/usr/bin/snmpwalk -v 2c -Ov -c public 'NET-SNMP-EXTEND-MIB::nsExtendOutLine."userscript"'
STRING: web-time=80
STRING: web-status=[ok]
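
On the monitoring side, the "key=value" lines returned by the extend script still need to be split before they can be stored as separate items. The following is a minimal sketch of that step, working on saved snmpwalk -Ov output like the lines above; the function name extend_value is an assumption.

```shell
#!/bin/bash
# Read "STRING: key=value" lines (snmpwalk -Ov output) on stdin
# and print the value that belongs to the requested key.
extend_value() {
  sed -n "s/^STRING: $1=//p"
}

sample='STRING: web-time=80
STRING: web-status=[ok]'

printf '%s\n' "$sample" | extend_value web-time
printf '%s\n' "$sample" | extend_value web-status
```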


To check which Windows modules are loaded, start snmpd from the command line with “winExtDLL” debugging enabled
Snmpd.exe -I-udp,udpTable,tcp,tcpTable,icmp,ip,interfaces,snmp_mib  -DwinExtDLL 



Thursday, February 7, 2013

Shell script to check Oracle Tablespace usage

I searched for a shell script to check Oracle tablespace usage. Most of the scripts I found use complex SQL statements and don’t report usage accurately, because auto-extend or multiple data files were not taken into account in the calculation. Actually, starting from Oracle 10g there is a built-in view, “dba_tablespace_usage_metrics”, for this purpose.
The following script checks Oracle database availability and tablespace usage, and measures the response time. The script outputs “key=value” format, which can be easily discovered by LLD in Zabbix. (With LLD, Zabbix can dynamically discover any number of items to monitor without adding the items manually.)

Script sample output

db-time= 71
db-status=[OK]: Name:SYSAUX SizeMB:1024 Used%: 73 ; Name:SYSTEM SizeMB:1024 Used%: 72 ; Name:USERS SizeMB:5 Used%: 20 ; Name:TEMP SizeMB:2048 Used%: 2 ; Name:UNDOTBS1 SizeMB:2048 Used%: 1 ;  8 rows selected.

The Oracle login used in the script should have permission to read the view or have the “select_catalog_role” role granted.

Script detail

function checkdb {

t1="$(date +%s%N)"

#The select list is reconstructed from the sample output: the 3rd column is the used percentage.
#TABLESPACE_SIZE is in blocks; *8/1024 assumes an 8K block size.
rt=$($ORACLE_HOME/bin/sqlplus -S ${OUSER}/${OPASS}@${TNSNAME}<< _END
set heading off
set linesize 200
select
   'Name:'|| tablespace_name,
   'SizeMB:'||round(TABLESPACE_SIZE*8/1024)||' Used%:',
   round(USED_PERCENT), ';'
from dba_tablespace_usage_metrics
order by 3 desc;
_END
)

t2="$(date +%s%N)"
echo "db-time= $((($t2 - $t1)/1000000))"
#remove blank lines,ignore UNDOTBS,get the numeric value by removing tab and spaces
tbpct=$(echo "$rt" | grep -Ev '^$|UNDOTBS' | head -1 | sed 's/.*Used%:\(.*\);/\1/'  |  sed 's/[[:space:]]*//g')
#Critical condition: threshold > 95 or non-numeric value returned
if [[ ! "$tbpct" =~ ^[0-9]+$ ]] || [ "$tbpct" -gt 95 ]; then
 echo "db-status=[CRITICAL]:" $rt
else
 echo "db-status=[OK]:" $rt
fi
}
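
The threshold logic can be exercised without a database. The sketch below applies the same extraction pipeline to one line of the sample output and classifies the result; the function name check_pct and its optional second argument are assumptions for the example. The numeric check runs first so that an empty or non-numeric value never reaches the integer comparison.

```shell
#!/bin/bash
# Classify a tablespace usage percentage:
# CRITICAL if it is not a plain integer or exceeds the limit (default 95).
check_pct() {
  local pct=$1 limit=${2:-95}
  if [[ ! $pct =~ ^[0-9]+$ ]] || [ "$pct" -gt "$limit" ]; then
    echo "CRITICAL"
  else
    echo "OK"
  fi
}

# Same extraction as the script, applied to one line of the sample output.
line='Name:SYSAUX SizeMB:1024 Used%: 73 ;'
pct=$(echo "$line" | sed 's/.*Used%:\(.*\);/\1/' | sed 's/[[:space:]]*//g')
echo "pct=$pct status=$(check_pct "$pct")"
```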