Thursday, March 13, 2014

Setup SAN Boot for RHEL 6.x using native multipath on EMC storage

Requirements:
1) RHEL 6.x (most steps apply to RHEL 5.x too, but RHEL 5.x uses mkinitrd instead of dracut and its /etc/multipath.conf is slightly different; refer to the Red Hat KB in the reference section)
2) EMC storage is set up in Active/Active (ALUA) mode
3) Boot LUN is presented with a single path for the initial install


Procedures:

1. Boot the server after the initial install.
2. Log in to the server as root and enable multipath:
[root@server1]#mpathconf --enable --with_multipathd y
3. Edit /etc/multipath.conf and make sure it contains only the following parameters:

blacklist {
}


defaults {
 user_friendly_names yes
}
devices {
  device {
    vendor "DGC"
    product ".*"
    product_blacklist "LUNZ"
    hardware_handler "1 alua"   
    path_checker directio    
    prio alua                
  }
}
4. Find the logical path the root disk is mapped to:
[root@server1]#multipath -v3
It should be /dev/mapper/mpatha.
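
The -v3 output is verbose, so once a map exists it can be easier to look the disk up in the shorter `multipath -ll` listing. The following is a minimal sketch, not part of multipath-tools: `find_map_for_disk` is a hypothetical helper name, and it assumes user_friendly_names is enabled so that map header lines have the form "mpatha (WWID) dm-0 ...".

```shell
#!/bin/sh
# find_map_for_disk DISK
# Hypothetical helper: reads `multipath -ll` style output on stdin and
# prints the name of the multipath map that contains DISK (e.g. sda).
# Assumes user_friendly_names yes, so headers look like:
#   mpatha (3600...) dm-0 DGC,VRAID
find_map_for_disk() {
    disk="$1"
    awk -v disk="$disk" '
        # Map header: name, WWID in parentheses, dm-N device
        $2 ~ /^\(.*\)$/ && $3 ~ /^dm-[0-9]+$/ { map = $1; next }
        # Path line: an H:C:T:L tuple immediately followed by the sd name
        {
            for (i = 1; i < NF; i++)
                if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ && $(i + 1) == disk) {
                    print map
                    exit
                }
        }
    '
}
```

Usage, assuming the root disk was installed on sda: `multipath -ll | find_map_for_disk sda`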

5. Create the initramfs with the multipath module
[root@server1]#dracut --force --add multipath

6. Make sure multipath.conf is included in initrd image
[root@server1]#lsinitrd /boot/initramfs-*.x86_64.img | grep multipath.conf
-rw-r--r--   1 root     root         2525 Feb 27 13:31 etc/multipath.conf
7. Edit /boot/grub/device.map and change
 (hd0) /dev/sda    to
 (hd0) /dev/mapper/mpatha
This assumes the boot disk is on /dev/mapper/mpatha, as verified in step 4 above.
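
The device.map change can also be scripted. The sketch below is one possible way to do it, not the method used in this post: `update_device_map` is a hypothetical helper name, and it keeps a .bak copy of the file next to the original before rewriting the entry.

```shell
#!/bin/sh
# update_device_map FILE OLD_DEV NEW_DEV
# Hypothetical helper: replaces OLD_DEV with NEW_DEV on the (hdN) line of
# a GRUB legacy device.map, keeping a backup in FILE.bak.
update_device_map() {
    file="$1"; old="$2"; new="$3"
    cp -p "$file" "$file.bak" || return 1
    # "|" is the sed delimiter because device paths contain slashes.
    # The trailing anchor stops /dev/sda from matching /dev/sdaa.
    sed "s|^\((hd[0-9]*)[[:space:]]*\)${old}[[:space:]]*\$|\1${new}|" \
        "$file.bak" > "$file"
}
```

Usage, with the paths from this step: `update_device_map /boot/grub/device.map /dev/sda /dev/mapper/mpatha`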

8. Reboot the server.

9. Verify multipath: check for hwhandler='1 alua' and that sda is a member disk of mpatha
[root@server1]#multipath -ll
mpatha (3600601609973310067eb1e1ed69ae311) dm-0 DGC,VRAID
size=150G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:0 sda 8:0   active ready running
  

10. Ask the storage administrator to enable the other paths for the boot LUN.
11. Reboot the server again after multipath is enabled on the storage side too.
12. Log in to the server and verify all paths: check for hwhandler='1 alua' and prio>0.
If hwhandler='1 emc' or prio=0, the LUN is in PNR mode.

[root@server1]#multipath -ll
mpatha (3600601609973310067eb1e1ed69ae311) dm-0 DGC,VRAID
size=150G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 1:0:1:0 sdd 8:48  active ready running
| `- 2:0:1:0 sdj 8:144 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:0 sda 8:0   active ready running
  `- 2:0:0:0 sdg 8:96  active ready running
mpathb (360060160997331009fd6e124d69ae311) dm-1 DGC,VRAID
size=800G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 1:0:0:1 sdb 8:16  active ready running
| `- 2:0:0:1 sdh 8:112 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:1:1 sde 8:64  active ready running
  `- 2:0:1:1 sdk 8:160 active ready running
13. Partition the other LUNs with fdisk as normal, but use the logical path /dev/mapper/mpathb etc. (the partition will be created as /dev/mapper/mpathbp1 instead of /dev/mapper/mpathb1).
NOTE: Because the boot LUN is on SAN, any change to /etc/multipath.conf requires re-creating the initramfs (dracut --force --add multipath) and a reboot. If the boot LUN is on a local disk, a change to /etc/multipath.conf only requires a multipathd restart.
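
The check in step 12 can be automated when there are many LUNs. This is a rough sketch under the same assumptions as the output shown above (user_friendly_names enabled, all maps expected to be ALUA); `check_alua` is a hypothetical helper name. It reads `multipath -ll` output, prints one line per problem and returns non-zero if any map looks like it is in PNR mode.

```shell
#!/bin/sh
# check_alua
# Hypothetical helper: reads `multipath -ll` style output on stdin.
# Flags any map whose hwhandler is not '1 alua' or that has a path group
# with prio=0. Exit status is 0 when nothing is flagged, 1 otherwise.
check_alua() {
    awk '
        # Map header: name, WWID in parentheses, dm-N device
        $2 ~ /^\(.*\)$/ && $3 ~ /^dm-[0-9]+$/ { map = $1; next }
        # "." stands in for the quote characters around the handler name
        /hwhandler=/ && !/hwhandler=.1 alua./ {
            print map ": hardware handler is not alua"; bad = 1
        }
        # prio=0 exactly, so prio=10 and prio=130 are not matched
        /prio=0( |$)/ { print map ": path group with prio=0"; bad = 1 }
        END { exit bad + 0 }
    '
}
```

Usage: `multipath -ll | check_alua && echo "all maps are ALUA"`. Maps that are not expected to use ALUA, such as a local disk, would also be reported.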

Wednesday, March 12, 2014

Automate Server deployment with Ansible

There are many server automation tools on the market: Puppet, Chef, CFEngine and Salt. Ansible is relatively new, but I think it is better than Puppet for server deployment automation tasks.
1. Dependency packages
Ansible depends on Python, which is installed by default, at least on Red Hat-like distributions.
Puppet depends on Ruby, which is not installed by default.
2. Agent
Ansible is agentless; it relies on SSH.
Puppet needs an agent running on the target server as a daemon.
3. Security
Ansible uses SSH as its transport, so a username and password are required for each connection. (Ansible is smart enough to cache the SSH and sudo passwords, so you are prompted only once, for the first server.)
Puppet: the agent is controlled by the master server, so if the master server is compromised, all hosts can be brought down easily.
4. Setup
Ansible is easy to set up on the targets, as there is no agent. The Ansible server is easy to set up too, since it is just Python scripts. You can even run it without installing it.
Puppet needs packages installed on both the agent hosts and the server, and the agent certificate needs to be signed before the server can talk to the agent.
Ansible uses SSH on TCP port 22, a standard port that is already open on the firewalls of most infrastructures.
Puppet uses its own TCP ports, by default 8140 on the master (8139 is the port the agent listens on).
5. Command line mode
Ansible supports a command line mode for ad-hoc tasks, so you don't need to write task definitions; just pass the command to ansible, for example to return the date from a number of servers.
ansible myservers -k -K -u admin -m raw -a "date"

The following example shows a typical server deployment:
[root@centos1 post]# cat setup.yml 
---        #ansible playbooks use YAML syntax http://en.wikipedia.org/wiki/YAML
- hosts: server1          #a server or server group as defined in /etc/ansible/hosts
  user: admin 
  sudo: yes
  vars_files:
    - vars/settings.yml   #global variables
    - vars/{{ ansible_hostname }}.yml             #server-specific variables; ansible_hostname is a variable, so this is server1.yml for server1
  tasks:

  - name: yum
    action: yum name={{ item }}  state=present      #install yum packages
    with_items:
      - kernel-devel-{{ ansible_kernel }}
      - ed
      - ksh
      - ntp
  - script: ./scripts/sshd.sh        #the script inserts 'UseDNS no'; '- script:' is shorthand for '- name: XX' plus 'action: YY'

  - name: users | Delete users       #delusers is a list of users defined in settings.yml
    action: user name={{ item }} state=absent
    with_items: delusers

  - name: ifcfg-eth0 | Configuration file      #ansible template engine is Jinja2 http://jinja.pocoo.org/docs/
    action: template src=./templates/ifcfg-eth0.j2 dest=/etc/sysconfig/network-scripts/ifcfg-eth0 owner=root group=root

  - name: route-eth0 | Configuration file, /etc/sysconfig/network-scripts/route-eth0
    action: template src=templates/route-eth0.j2 dest=/etc/sysconfig/network-scripts/route-eth0

  - name: resolv.conf | Configuration file, /etc/resolv.conf
    action: template src=templates/resolv.conf.j2 dest=/etc/resolv.conf

  - name: ntpd | Configuration file, /etc/ntp.conf
    action: template src=templates/ntp.conf.j2 dest=/etc/ntp.conf
    notify:
    - restart ntpd

  - name: snmpd | Configuration file, /etc/snmp/snmpd.conf
    action: copy src=files/snmpd.conf dest=/etc/snmp/snmpd.conf owner=root group=root mode=0644
    notify:
    - restart snmpd


  - copy: src=files/clock dest=/etc/sysconfig/clock owner=root group=root mode=0644
  - command: ln -fs /usr/share/zoneinfo/Australia/Sydney /etc/localtime


  handlers:
  - name: restart sshd
    action: service name=sshd enabled=yes state=restarted
  - name: restart ntpd
    action: service name=ntpd enabled=yes state=restarted
  - name: restart snmpd
    action: service name=snmpd enabled=yes state=restarted

####----Global variables 
[root@centos1 post]# cat vars/settings.yml 
#
# ntp.conf
ntpservers: [10.1.1.1, 10.1.1.2]

#users to delete
delusers: [user1, user2]

#resolv.conf
domainname: .example.com
searchdomain: [example.com]
nameservers: [10.1.1.1, 10.1.1.2]

####----Server specific  variable
[root@centos1 post]# cat vars/server1.yml 
eth1: 
   device: eth1
   ipaddr: 172.16.1.2
   netmask: 255.255.255.0
   routes: ['192.168.1.0/24 via 172.16.1.254', '192.168.2.0/24 via 172.16.1.254']
 
####----How the templates reference the variables
[root@centos1 post]# cat templates/resolv.conf.j2 
#
# resolver configuration file...
#
options         timeout:1 attempts:8 rotate
domain          {{domainname}}
search          {{domainname}} {{ searchdomain | join (' ') }}

{% for host in nameservers %}
nameserver {{host}}
{% endfor %}

[root@centos1 post]# cat templates/ifcfg-eth1.j2 
DEVICE={{eth1.device}}
BOOTPROTO=static
ONBOOT=yes
USERCTL=no
IPADDR={{eth1.ipaddr}}
NETMASK={{eth1.netmask}}
{% if eth1.gateway is defined  %} 
GATEWAY={{eth1.gateway}}
{%endif%}



####----a separate playbook to create LVM and file system 
[root@centos1 post]# cat setup-lvm.yml 
---
- hosts: server1
  user: admin
  sudo: yes
  gather_facts: no
  vars:
    mntp:  /opt
    vgname: vg01
    pvname: /dev/sdb1
    lv1: opt
 
  tasks:

  - script: ./scripts/disks.sh {{ pvname }}       #a script to create the LVM partition and the physical volume
  - name: filesystem | Create pv,vg,lv and file systems
    action: lvg  vg={{ vgname }} pvs={{ pvname }}

  #- name: filesystem | create lv
  - lvol: vg={{ vgname }} lv={{ lv1 }} size=51196

 # - name: filesystem | create fs
  - filesystem: fstype=ext4 dev=/dev/{{ vgname }}/{{ lv1 }}

  #- name: filesystem | mount dir
  - mount: name={{ mntp }} src=/dev/{{ vgname }}/{{ lv1 }} dump=1 passno=2 fstype=ext4 state=mounted
How to run the playbook?
[root@centos1 post]# ansible-playbook -k -K setup.yml

  -k, --ask-pass        ask for SSH password
  -K, --ask-sudo-pass   ask for sudo password
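
Before running a playbook against a real server it can help to check it first. The wrapper below is only a sketch of one way to do that; `run_or_echo` and `playbook_preflight` are hypothetical names, and DRY_RUN is an assumed environment variable that makes the wrapper print the commands instead of running them. The ansible-playbook options used are --syntax-check, --list-hosts and --check. Tasks that use the script or command modules are generally skipped in check mode, so --check does not cover everything in setup.yml.

```shell
#!/bin/sh
# run_or_echo CMD [ARGS...]
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_echo() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# playbook_preflight PLAYBOOK
# Hypothetical helper: stops at the first failing check.
playbook_preflight() {
    playbook="$1"
    # Parse the playbook without contacting any host.
    run_or_echo ansible-playbook --syntax-check "$playbook" || return 1
    # Show which inventory hosts the play would target.
    run_or_echo ansible-playbook --list-hosts "$playbook" || return 1
    # Dry run: report what would change without changing it.
    run_or_echo ansible-playbook -k -K --check "$playbook"
}
```

Usage: `playbook_preflight setup.yml`, or `DRY_RUN=1 playbook_preflight setup.yml` to see the three commands without running them.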
Download all the files
https://drive.google.com/file/d/0B-RHmV4ubtk8Y2wyazhZRS1pSVk/edit?usp=sharing