Saturday, October 22, 2022

Build Custom AWS Config rules using Guard DSL

I want to use AWS Config to report S3 buckets that do not have public access blocking enabled, except for some buckets with a prefix like www. There is an AWS managed rule that can almost achieve this, with one limitation: the excluded buckets must be provided as exact names separated by commas; regex matching is not supported.

 

Previously, to create a custom rule you had to write an AWS Lambda function. Since June 2022, you can author AWS Config custom rules using Guard DSL without needing to develop AWS Lambda functions.

 

What is Guard DSL?

Guard DSL is an open-source, policy-as-code domain-specific language (DSL) for writing rules and validating JSON- and YAML-formatted data such as CloudFormation templates, K8s configurations, and Terraform JSON plans/configurations against those rules. It is a language developed by the CloudFormation Guard project.

Don't be misled by the word CloudFormation here: Guard DSL is mainly used for validating CloudFormation templates, but in the AWS Config context it has nothing to do with CloudFormation.

 

Get started with Guard DSL

Learn Guard DSL syntax

Ignore the object reference names in the document above; those names are CloudFormation objects. For the purposes of AWS Config, refer to aws-config-resource-schema instead.

Sample AWS Config Rule with Guard DSL.

Create AWS Config Rule -> Create custom rule using Guard

Scope of changes -> Resource Type (S3 Buckets)

Rule Content: 

rule s3_bucket_is_public when
    resourceType == "AWS::S3::Bucket"
    configuration.name != /^www/ {
        supplementaryConfiguration.PublicAccessBlockConfiguration.blockPublicPolicy exists
        supplementaryConfiguration.PublicAccessBlockConfiguration.blockPublicAcls exists
        supplementaryConfiguration.PublicAccessBlockConfiguration.restrictPublicBuckets exists
        supplementaryConfiguration.PublicAccessBlockConfiguration.ignorePublicAcls exists
    }

rule s3_bucket_not_public when
    s3_bucket_is_public {
    supplementaryConfiguration.PublicAccessBlockConfiguration {
        blockPublicAcls == true
        blockPublicPolicy == true
        restrictPublicBuckets == true
        ignorePublicAcls == true
    }
}
The rules are evaluated in order, and evaluation continues regardless of the result of the previous rule.
The PublicAccessBlockConfiguration block has conditions separated by new lines, which implies AND.
Note that the first rule's name is used as a condition in the second rule.

- When a bucket with the www prefix is evaluated, the result is NOT_APPLICABLE for both rules.
- When a bucket with public access blocking enabled is evaluated, the result is COMPLIANT for both rules.
- When a bucket without public access blocking enabled is evaluated, the first rule's result is NON_COMPLIANT and the second rule's result is NOT_APPLICABLE; the effective result is NON_COMPLIANT.
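
You can test the rules locally before deploying them by feeding the open-source cfn-guard CLI a file shaped like the configuration items AWS Config delivers. A minimal sketch, assuming cfn-guard is installed and the rules are saved as s3.guard; item.json (both file names hypothetical) holds a trimmed-down configuration item with only the fields the rules reference:

{
  "resourceType": "AWS::S3::Bucket",
  "configuration": { "name": "app-bucket-1" },
  "supplementaryConfiguration": {
    "PublicAccessBlockConfiguration": {
      "blockPublicAcls": true,
      "blockPublicPolicy": true,
      "restrictPublicBuckets": true,
      "ignorePublicAcls": true
    }
  }
}

$ cfn-guard validate --rules s3.guard --data item.json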

Friday, February 1, 2019

Send Docker container log to Splunk

Docker engine has a Splunk logging driver to send container logs to Splunk via HEC (HTTP Event Collector). It is easy to set up; however, it can suffer data loss if HEC is down. That is the advantage of file-based data ingestion, which can retry and resume even if the Splunk service is down for a long period of time. The default Docker logging driver is json-file, and there is no problem for the Splunk agent to read the file. But how could you set the index for different applications, given that the container id is random and therefore the file path is random? The trick is to use the journald driver and rsyslog rules to read the journald log and write to different file names based on any Docker label.


Overview:
steps included:
- enable persistent journald storage
- create an rsyslog rule to read journald and write to a json log file; the file name depends on docker_appname

remaining steps to be achieved by other Ansible roles
- the docker container needs to set its log driver to journald and expose the label docker-appname, e.g.
docker run --log-driver=journald \
--log-opt labels=docker-appname \
--label docker-appname=mulesoft \

- Splunk UF reads /var/log/docker-{{docker_appname}}.log and forwards it to Splunk Cloud
- logrotate rule to rotate /var/log/docker-{{docker_appname}}.log (a sketch follows)
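
A minimal logrotate sketch for these files; the schedule and retention are assumptions, and the postrotate HUP mirrors the stock RHEL7 syslog logrotate rule so rsyslog reopens the files after rotation:

/var/log/docker-*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}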

Rsyslog rule in RHEL7
#imjournal module is loaded in main syslogd.conf

$IMJournalStateFile imjournal.state
$imjournalRatelimitInterval 300
$imjournalRatelimitBurst 30000

module(load="mmjsonparse") 

action(type="mmjsonparse")

#output all json fields and remove redundant last msg field
#template(name="jsonformat" type="string" string="%$!all-json:R,ERE,1,FIELD:(.*), (\"msg\":.*)--end% }\n")
template(name="jsonformat" type="string" string="%$!all-json%\n")

if ($!DOCKER_APPNAME == "{{docker_appname}}") then {
    action(type="omfile" file="/var/log/docker-{{docker_appname}}.log" template="jsonformat")
    stop
} 
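
For illustration, a line written to /var/log/docker-mulesoft.log by the jsonformat template would look roughly like this (abridged; the field names come from the journald fields that the Docker journald driver sets, and the exact set varies):

{"CONTAINER_NAME":"mule1","DOCKER_APPNAME":"mulesoft","MESSAGE":"INFO app started","_SOURCE_REALTIME_TIMESTAMP":"1548979200123456"}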


Splunk customized sourcetype, to be set in a heavy forwarder or indexer (a universal forwarder doesn't support sourcetype definition):

props.conf
[json_realtime_timestamp]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 16
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %s%6N
TIME_PREFIX = "_SOURCE_REALTIME_TIMESTAMP":\s*"
pulldown_type = 1
TZ=UTC
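
TIME_FORMAT %s%6N works because journald's _SOURCE_REALTIME_TIMESTAMP is microseconds since the Unix epoch, i.e. epoch seconds (%s) followed by six subsecond digits (%6N). A quick Python sanity check (the sample value is made up):

#!/usr/bin/env python
from datetime import datetime, timedelta

ts = "1548979200123456"   # hypothetical _SOURCE_REALTIME_TIMESTAMP value
# interpret the value as microseconds since the Unix epoch
print datetime(1970, 1, 1) + timedelta(microseconds=int(ts))
# -> 2019-02-01 00:00:00.123456 (UTC, matching TZ=UTC above)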


inputs.conf in universal forwarder in Docker host
[monitor:///var/log/docker-mulesoft.log]
disabled = false
sourcetype=json_realtime_timestamp
index = mulesoft

Wednesday, July 18, 2018

Python script to generate Ansible ini inventory file from csv file

Ansible in-memory inventory created by add_host is often used in AWS EC2 provisioning. Such an inventory is easy to generate, but it has a drawback: because it is in memory, all server post-build tasks have to live in one big playbook, which makes it hard to re-run failed tasks after a failure and prevents reuse of existing post-build playbooks.

I created a Python script to generate a temporary inventory file from the CSV file used in EC2 provisioning. The inventory file can be used by multiple post-build playbooks. The file name is static, but it will not be overwritten as long as you set the concurrent build limit to 1 in the CI/CD server.
Some AWS EC2 instances in my company need static hostnames. The ip field is filled in automatically with the EC2 private IP returned right after provisioning, and there is a playbook to create the host record in Infoblox. The group field holds Ansible groups (group vars); multiple groups are separated by semicolons, and the order is important: vars in the last group take precedence.
The csv file
name,ip,group,zone,env
awselk1,,elasticsearch;elasticsearch-master,2a,prod
awselk2,,elasticsearch;elasticsearch-data,2a,prod

The script
#!/usr/bin/python
# Takes a file CSV file "xxx.csv" and outputs xxx.ini for Ansible host inventory data
import csv
import sys
import os
 
if len(sys.argv) <= 1:
   print "Usage:" +sys.argv[0]+" input-filename"
   sys.exit(1) 
net_dn = {'prod':'prod.example.com', 'preprod':'preprod.example.com',
          'test':'test.example.com', 'dev':'dev.example.com'}
groups = []
envs = set()
hosts_ini = {}

csvname = sys.argv[1]
scriptpath = os.path.dirname(sys.argv[0])

ansible_ini = os.path.join(scriptpath, 'hosts-aws-tmp.ini')

lines = []
hosts_text = ''
with open(csvname) as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        domain = net_dn[row['env'].strip()]
        line = row['name'].strip()+'.'+domain
        #lines.append(line)
        envs.add(row['env'])
        # support multiple groups separated by ;
        for g in row['group'].strip().split(';'):
          g = g.strip()
          if (not g in groups):
            groups.append(g)
          hosts_ini.setdefault(g, []).append(line)

#groups=set(groups)
if ( len(envs) !=1 ):
   print "ERROR: only single enviroment is supported!"
   sys.exit(1)
env = list(envs)[0]
env_text = "["+env+":children]"+"\n"+"\n".join(groups)   
vars_text = "\n\n["+env+":vars]"
vars_text += """
ansible_user=ansible
ansible_ssh_private_key_file=~/.ssh/id_rsa
ansible_become=true
ansible_become_user=root
ansible_become_method=sudo
ansible_gather_facts=no
"""
vars_text+="aws_env=aws-"+env+'\n'
#generate groups in order as input
for g in groups:
   hosts_text+='\n['+g+']\n'
   hosts_text+='\n'.join(hosts_ini[g])
   hosts_text+='\n'
 
all_text = env_text+vars_text+hosts_text
print all_text
with open(ansible_ini,'w') as new_ini_file:
    new_ini_file.write(all_text)   
print "INFO:Generated Ansible host inventory file: " + ansible_ini  


The Ansible inventory file generated
[prod:children]
elasticsearch
elasticsearch-master
elasticsearch-data

[prod:vars]
ansible_user=ansible
ansible_ssh_private_key_file=~/.ssh/id_rsa
ansible_become=true
ansible_become_user=root
ansible_become_method=sudo
ansible_gather_facts=no
aws_env=aws-prod

[elasticsearch]
awselk1.prod.example.com
awselk2.prod.example.com

[elasticsearch-master]
awselk1.prod.example.com

[elasticsearch-data]
awselk2.prod.example.com
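
With the generated file on disk, each post-build playbook can be run, and re-run, independently. A hypothetical invocation (the script and playbook names are assumed):

$ ./gen_inventory.py ec2data.csv
$ ansible-playbook -i hosts-aws-tmp.ini postbuild.yml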

Wednesday, July 11, 2018

Ansible - turn values in a CSV file into a list of dictionaries

Ansible can load YAML values into Ansible variables easily. However, I am still a fan of plain CSV files, because they are easy to edit and avoid YAML's strict format and repetitive variable names.
How do you convert CSV into YAML?
I used to use a Python script to achieve this; actually, Ansible can do it natively.

Given this csv file for AWS EC2 instances

$ cat vars/aws/ec2data.csv
name,ip,zone,group,env
splunk01,10.1.1.1,2a,splunk,prod
splunk02,10.1.1.2,2b,splunk,prod

The playbook

---
- hosts: localhost
  connection: local
  gather_facts: no
  vars:
    ec2data_file: vars/aws/ec2data.csv

  tasks:
  - name:  reading {{ec2data_file}}
    command: /usr/bin/awk -F',' '!/^#/ && !/^$/ && NR!=1 { print $1,$2,$4,$5}' {{ec2data_file}}
    register: csvout
  #turn ec2_host into a list with the default filter and append a list of dictionaries in each loop.
  #split is a Python function to split a string; the default delimiter is whitespace
  - name: turn csv output to list of dict
    set_fact:
      ec2_host: "{{ ec2_host|default([]) + [ { \
                      'name': item.split().0,  \
                      'ip':   item.split().1,  \
                      'group':item.split().2,  \
                      'env':  item.split().3 } ] }}"
    with_items: "{{csvout.stdout_lines}}"

  - debug: msg="{{item.name}},{{item.ip}}" verbosity=1
    with_items: "{{ ec2_host }}"


The result
skipping: [localhost] => (item={'ip': u'10.1.1.1', 'group': u'splunk', 'name': u'splunk01', 'env': u'prod'})
skipping: [localhost] => (item={'ip': u'10.1.1.2', 'group': u'splunk', 'name': u'splunk02', 'env': u'prod'})

Thursday, October 30, 2014

Python script to run remote SSH commands with sudo permission

I created a Python script to run remote SSH commands with sudo permission. The Linux ssh command doesn't support a password as a command option, so you have to use an expect script to connect to multiple servers for automation. (The plink tool on Windows does support a password as a command option.)
The trick to accepting the sudo password is the '-S' option of sudo, which accepts the sudo password piped from stdin. It seems to be safe: I turned on debug and couldn't see the password recorded in the secure/messages logs.
There are two versions of the script: the command-line one and the class/module one.

The command line version.

If the clear-text password is a concern, you can wrap the script with Python's getpass module, which reads the password interactively without echoing it: read the password once and apply it to multiple servers.
[root@~]# ./pyssh.py  -s server1 -u admin -p Passwd123 date
Thu Oct 30 15:36:27 EST 2014

#'service sshd status' command ran successfully with sudo enabled via '-t'
[root@~]# ./pyssh.py  -t -s server1 -u admin -p Passwd123  'service sshd status'
openssh-daemon (pid  15686) is running...

#!/usr/bin/env python
import sys
import paramiko
import argparse
import socket
parser = argparse.ArgumentParser()
parser.add_argument("-s", "--servername", help="hostname or IP", required=True)
parser.add_argument("-P", "--port", help="ssh port default=22", default=22)
parser.add_argument("-t", "--sudo", help="enable sudo,sudo password will use the value of --password",action='store_true')
parser.add_argument("-u","--username",help="username",required=True)
parser.add_argument("-p","--password",help="password",required=True)
parser.add_argument("cmd",help="command to run")
args=parser.parse_args()

host = args.servername
port = args.port
user = args.username 
password = args.password
cmd = args.cmd
if args.sudo:
    fullcmd="echo " + password + " |   sudo -S -p '' " + cmd
else:
    fullcmd=cmd

#if __name__ == "__main__":
client = paramiko.SSHClient()
#Don't use host key auto add policy for production servers
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.load_system_host_keys()
try: 
    client.connect(host,port,user,password)
    transport=client.get_transport()
except (socket.error,paramiko.AuthenticationException) as message:
    print "ERROR: SSH connection to "+host+" failed: " +str(message)
    sys.exit(1)
session=transport.open_session()
session.set_combine_stderr(True)
if args.sudo: 
    session.get_pty()
session.exec_command(fullcmd)
stdout = session.makefile('rb', -1)
print stdout.read()
transport.close()
client.close() 

The  class version

The class version allows multiple commands to run over an existing SSH transport, which is more efficient. To use the class, copy pyssh.py to a folder and create a new script that imports the class ('from pyssh import PySSH'), then follow the code in the MAIN section (without the if statement).
#!/usr/bin/env python
import sys
import socket
import paramiko
#=================================
# Class: PySSH
#=================================
class PySSH(object):
  
  
    def __init__ (self):
        self.ssh = None
        self.transport = None  

    def disconnect (self):
        if self.transport is not None:
           self.transport.close()
        if self.ssh is not None:
           self.ssh.close()

    def connect(self,hostname,username,password,port=22):
        self.hostname = hostname
        self.username = username
        self.password = password

        self.ssh = paramiko.SSHClient()
        #Don't use host key auto add policy for production servers
        self.ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.ssh.load_system_host_keys()
        try:
            self.ssh.connect(hostname,port,username,password)
            self.transport=self.ssh.get_transport()
        except (socket.error,paramiko.AuthenticationException) as message:
            print "ERROR: SSH connection to "+self.hostname+" failed: " +str(message)
            sys.exit(1)
        return  self.transport is not None

    def runcmd(self,cmd,sudoenabled=False):
        if sudoenabled:
            fullcmd="echo " + self.password + " |   sudo -S -p '' " + cmd
        else:
            fullcmd=cmd
        if self.transport is None:
            return "ERROR: connection was not established"
        session=self.transport.open_session()
        session.set_combine_stderr(True)
        #print "fullcmd ==== "+fullcmd
        if sudoenabled:
            session.get_pty()
        session.exec_command(fullcmd)
        stdout = session.makefile('rb', -1)
        #print stdout.read()
        output=stdout.read()
        session.close()
        return output

#===========================================
# MAIN
#===========================================        
if __name__ == '__main__':
    hostname = 'server1'
    username = 'admin'
    password = 'password123'
    ssh = PySSH()
    ssh.connect(hostname,username,password)
    output=ssh.runcmd('date')
    print output
    output=ssh.runcmd('service sshd status',True)
    print output
    ssh.disconnect()
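
As mentioned above, getpass lets you read the password once and reuse it across servers. A minimal sketch using the class version, assuming it is saved as pyssh.py on the Python path (the host list and username are hypothetical):

#!/usr/bin/env python
import getpass
from pyssh import PySSH

# prompt once; getpass does not echo the password to the terminal
password = getpass.getpass('Password: ')
for host in ['server1', 'server2']:
    ssh = PySSH()
    ssh.connect(host, 'admin', password)
    print ssh.runcmd('date')
    print ssh.runcmd('service sshd status', True)  # True runs the command with sudo
    ssh.disconnect()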


Friday, September 5, 2014

Build Puppet module to use Hiera lookup

Puppet can use Hiera to look up data. This helps you disentangle site-specific data from Puppet code, for easier code re-use and easier management of data that needs to differ across your node population.

One typical example is IP addresses and NTP/DNS servers: the IP address is unique to each server while the NTP/DNS servers are global. I built a linux-network test module to demonstrate the usage of Hiera.
Hiera supports YAML and JSON backends by default; however, you can also write a custom backend using the Hiera API.

Define datadir in hiera.yaml
[root@server1 modules]# cat /etc/puppetlabs/puppet/hiera.yaml 
---
:backends:
  - yaml

:hierarchy:
  - defaults
  - "%{clientcert}"
  - "%{environment}"
  - global

:yaml:
# datadir is empty here, so hiera uses its defaults:
# - /var/lib/hiera on *nix
# - %CommonAppData%\PuppetLabs\hiera\var on Windows
# When specifying a datadir, make sure the directory exists.
  :datadir: /etc/puppetlabs/puppet/hieradata

Set all values in YAML files instead of manifest files.
You can also add class names in the YAML file, then assign classes to nodes with hiera_include.
[root@server1 modules]# cat /etc/puppetlabs/puppet/hieradata/global.yaml 
---
#
# ntp.conf
ntpservers: [10.1.1.11, 10.1.1.12]

#resolv.conf
domainname: example.com
searchdomain: [example1.com, example2.com]
nameservers: [10.1.1.13, 10.1.1.14]

[root@server1 modules]# cat /etc/puppetlabs/puppet/hieradata/server1.example.com.yaml 
eth1:
   device: eth1
   ipaddr: 172.16.1.2
   netmask: 255.255.255.0
   routes: ['192.168.1.0/24 via 172.16.1.254', '192.168.2.0/24 via 172.16.1.254']
   gateway: 172.16.1.254
eth3:
   device: eth3
   ipaddr: 172.16.1.3
   netmask: 255.255.255.0
   #routes: ['192.168.1.0/24 via 172.16.1.254', '192.168.2.0/24 via 172.16.1.254']
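
Before wiring the data into manifests, you can sanity-check the lookups with the hiera command-line tool, passing any scope variable the hierarchy needs. A sketch, assuming the hiera CLI is available on the master (output shown is indicative):

[root@server1 modules]# hiera -c /etc/puppetlabs/puppet/hiera.yaml ntpservers
["10.1.1.11", "10.1.1.12"]
[root@server1 modules]# hiera -c /etc/puppetlabs/puppet/hiera.yaml eth1 clientcert=server1.example.com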
Execute the whole class or a single defined type of the class in site.pp; this keeps the code in site.pp generic.
The site.pp manifest file is just generic code:
[root@server1 modules]# cat /etc/puppetlabs/puppet/manifests/site.pp

node "server1" {
include "linux-network"
}

node "server2" {
linux-network::setinterface { 'eth1': }
}
linux-network module manifest files
[root@server1 modules]# cat ./linux-network/manifests/init.pp 
class linux-network {
 linux-network::setinterface { 'eth1': ; 'eth3': }
 linux-network::setroute { 'eth1': ; 'eth3':}

 linux-network::setconf_ntp {'ntp.conf':}
 linux-network::setconf_resolv {'resolv.conf':}
}

[root@server1 modules]# cat ./linux-network/manifests/setconf_ntp.pp 
define linux-network::setconf_ntp  ( ) {

$ntpservers=hiera_array('ntpservers')

file {"/etc/ntp.conf":
 ensure => present,
 owner => root,
 mode => 644,
 content => template("${module_name}/ntp.conf.erb")
 }
}

[root@server1 modules]# cat ./linux-network/manifests/setconf_resolv.pp 
define linux-network::setconf_resolv  ( ) {

$domainname=hiera('domainname')
$searchdomain=hiera_array('searchdomain')
$nameservers=hiera_array('nameservers')

file {"/etc/resolv.conf":
ensure => present,
owner => root,
mode => 644,
content => template("${module_name}/resolv.conf.erb")
 }
}

[root@server1 modules]# cat ./linux-network/manifests/setinterface.pp 
define linux-network::setinterface  ( ) {

$device=$title
$eth=hiera($device)
$ipaddr=$eth['ipaddr']
$netmask=$eth['netmask']
$gateway=$eth['gateway']

file {"/etc/sysconfig/network-scripts/ifcfg-$device":
 ensure => present,
 owner => root,
 mode => 644,
 content => template("${module_name}/ifcfg.erb")
 }

}

[root@server1 modules]# cat ./linux-network/manifests/setroute.pp 
define linux-network::setroute  ( ) {

$device=$title
$eth=hiera($device)
$routes=$eth['routes']

file {"/etc/sysconfig/network-scripts/route-$device":
ensure => present,
owner => root,
mode => 644,
content => template("${module_name}/route.erb")
 }
}
linux-network module template files
[root@server1 modules]# cat ./linux-network/templates/ifcfg.erb 
DEVICE=<%=@device %>
BOOTPROTO=static
ONBOOT=yes
USERCTL=no
IPADDR=<%=@ipaddr%>
NETMASK=<%=@netmask%>
<%- if @gateway =~ /(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/   -%>
GATEWAY=<%=@gateway %>
<%- end -%>

[root@server1 modules]# cat ./linux-network/templates/ntp.conf.erb 
 tinker panic 0
 restrict default kod nomodify notrap nopeer noquery
 restrict -6 default kod nomodify notrap nopeer noquery
 restrict 127.0.0.1 
 restrict -6 ::1
#
 <%- @ntpservers.each do |x| -%>
 server <%= x %>
<%- end  -%>
 driftfile /var/lib/ntp/drift

[root@server1 modules]# cat ./linux-network/templates/resolv.conf.erb
#
# resolver configuration file...
#
options         timeout:1 attempts:8 rotate
domain       <%=@domainname %>
<%-  if !@searchdomain.empty?   -%>
search <%=@domainname  -%> <%=  @searchdomain.join(' ') %>
<%- end -%>
<%-  @nameservers.each do |  x | -%>
nameserver <%= x %>
<%- end -%>

[root@server1 modules]# cat ./linux-network/templates/route.erb 
<%- if defined?(@routes)   -%>
<%- @routes.each do | x | -%>
<%=x %>
<%- end -%>
<%- end -%>
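
For reference, with the values from global.yaml above, resolv.conf.erb should render roughly as:

#
# resolver configuration file...
#
options         timeout:1 attempts:8 rotate
domain       example.com
search example.com example1.com example2.com
nameserver 10.1.1.13
nameserver 10.1.1.14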

Thursday, March 13, 2014

Setup SAN Boot for RHEL 6.x using native multipath on EMC storage

Requirements:
1) RHEL 6.x (most steps apply to RHEL 5.x too; RHEL 5.x uses mkinitrd instead of dracut, and its /etc/multipath.conf is slightly different, refer to the Red Hat KB in the reference section)
2) EMC storage was setup with Active/Active (ALUA)
3) Boot LUN was presented with single path for initial install


Procedures:

1. Server boots up after initial install
2. Login to server as root to enable multipath
[root@server1]#mpathconf --enable --with_multipathd y
3. Edit /etc/multipath.conf and make sure it contains only the following valid parameters

blacklist {
}


defaults {
 user_friendly_names yes
}
devices {
  device {
    vendor "DGC"
    product ".*"
    product_blacklist "LUNZ"
    hardware_handler "1 alua"   
    path_checker directio    
    prio alua                
  }
}
4. Find out which logical path the root disk is mapped to
[root@server1]#multipath -v3
It should be /dev/mapper/mpatha

5. Create initramfs with multipath module
[root@server1]#dracut --force --add multipath

6. Make sure multipath.conf is included in initrd image
[root@server1]#lsinitrd /boot/initramfs-*.x86_64.img | grep multipath.conf
-rw-r--r--   1 root     root         2525 Feb 27 13:31 etc/multipath.conf
7. Modify /boot/grub/device.map and change
 (hd0) /dev/sda    to
 (hd0) /dev/mapper/mpatha
This assumes the boot disk is on /dev/mapper/mpatha, as verified in step 4 above.

8. Reboot the server.

9. Verify multipath; check hwhandler='1 alua' and member disk sda for mpatha
[root@server1]#multipath -ll
mpatha (3600601609973310067eb1e1ed69ae311) dm-0 DGC,VRAID
size=150G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:0 sda 8:0   active ready running
  

10. Ask the storage administrator to enable the other paths for the boot LUN.
11. Reboot the server again after multipath is enabled on the storage side too.
12. Login to the server to verify all paths; check hwhandler='1 alua' and prio>0.
If hwhandler='1 emc' or prio=0, the LUN is running in PNR mode.

[root@server1]#multipath -ll
mpatha (3600601609973310067eb1e1ed69ae311) dm-0 DGC,VRAID
size=150G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 1:0:1:0 sdd 8:48  active ready running
| `- 2:0:1:0 sdj 8:144 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:0 sda 8:0   active ready running
  `- 2:0:0:0 sdg 8:96  active ready running
mpathb (360060160997331009fd6e124d69ae311) dm-1 DGC,VRAID
size=800G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 1:0:0:1 sdb 8:16  active ready running
| `- 2:0:0:1 sdh 8:112 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:1:1 sde 8:64  active ready running
  `- 2:0:1:1 sdk 8:160 active ready running
13. Partition the other LUNs using the fdisk command as normal, but use the logical path /dev/mapper/mpathb etc. (the partition will be created as /dev/mapper/mpathbp1 instead of /dev/mapper/mpathb1).
NOTE: any change to /etc/multipath.conf requires re-creating the initramfs (dracut --force --add multipath) and a reboot, because the boot LUN is on SAN. If the boot LUN were a local disk, a change to /etc/multipath.conf would only require a multipathd restart.