About Me

Software Engineer.
Experienced in DevOps/SRE. Able to dig deep across multiple layers of the stack, from networking and virtualization to configuration management, packaging, and deployment. Comfortable creating, testing, deploying, operating, and debugging software at scale.

Skilled in Python, Shell Scripting, CI/CD, Terraform, AWS, Ansible, Docker, Kubernetes, Jenkins, Vagrant.
Experienced with databases such as PostgreSQL, Greenplum, MSSQL, and IBM Db2.

How to modify permissions for files under /mnt/c in WSL

You need to unmount /mnt/c and mount it again with the options below:

sudo umount /mnt/c

sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111

To have the C:\ drive (/mnt/c) mounted with these options automatically, add the lines below to /etc/wsl.conf in WSL.
If wsl.conf does not exist, create it.

[automount]
enabled = true
options = "metadata,uid=1000,gid=1000,umask=22,fmask=111"

Now you will be able to use chmod/chown to modify the permissions of files under /mnt/c.
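To see how these mask values translate into permissions: drvfs starts from 777 and clears the bits in umask (for all entries) plus fmask (for files only). A minimal sketch of that arithmetic, using the values from the mount command above:

```shell
#!/bin/bash
# drvfs permission arithmetic (sketch): start from 777 and clear the mask bits.
umask_val=$(( 8#022 ))   # umask=22 applies to files and directories
fmask_val=$(( 8#111 ))   # fmask=111 applies to files only

printf 'directories: %o\n' $(( 8#777 & ~umask_val ))                # 755 (rwxr-xr-x)
printf 'files:       %o\n' $(( 8#777 & ~umask_val & ~fmask_val ))   # 644 (rw-r--r--)
```

So with this mount, directories come up as 755 and regular files as 644.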

Problem with VS Code: pylint cannot find a valid, existing property on a SQLAlchemy object.

Error:
Instance of 'SQLAlchemy' has no 'foo' member

Solution:
In Visual Studio Code, open File > Preferences > Settings, choose Edit in settings.json, and paste this:

{
    "python.linting.pylintArgs": [
        "--load-plugins",
        "pylint-flask"
    ]
}

This assumes the pylint-flask plugin is installed (pip install pylint-flask).

Basic Ansible Setup

Ansible is a simple IT automation system. It handles configuration management, application deployment, cloud provisioning, ad-hoc task execution, network automation, and multi-node orchestration.

Commercial versions are available, but the core is free and open source.
To use Ansible, we need to install it only on the control machine. Installation on the target machines is not required (Ansible is agentless).

I have installed Ansible on my Arch Linux machine.

Install Ansible on archlinux
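The commands behind the screenshot above would be roughly the following (Ansible is packaged in the official Arch repositories):

```shell
# Install Ansible from the Arch Linux repositories.
sudo pacman -S ansible

# Confirm the installation; this also prints the config file path.
ansible --version
```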

You can check the Ansible version as below:

Version

The output also shows the config file path: /etc/ansible/ansible.cfg

Ansible uses SSH to connect to the targets, so make sure you can establish an SSH connection to the target machines from the control machine.

I will use an AWS EC2 instance as my target here. I already have this set up and have the SSH key copied there.

Now we set up the hosts file for Ansible:

hosts file

It contains the list of hosts that you want to connect to and how to connect to them.

[RHEL8aws]
ec2-3-23-104-142.us-east-2.compute.amazonaws.com

[RHEL8aws:vars]
ansible_ssh_private_key_file = ~/RHEL8aws.pem

RHEL8aws.pem is the SSH key file generated during the AWS EC2 instance setup.

Now we are ready to use Ansible.
The simplest way to check whether we are set up properly is to ping the hosts.

ansible -m ping RHEL8aws    # only hosts in the RHEL8aws group
or
ansible -m ping all         # all hosts
ping

Ansible can accomplish more complex tasks via playbooks, but I will cover only ad-hoc commands here.

Ansible has various modules to execute commands on the target hosts. We have already seen the ping module; another such module is the shell module, which is used to execute shell commands.

shell

By default, Ansible connects with the same user name as on the control machine.
You can change this with the -u option.

users
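The ad-hoc commands behind the two screenshots above might look like the following (the remote user name ec2-user is an assumption; it is the default on RHEL EC2 images):

```shell
# Run a shell command on every host in the RHEL8aws group.
ansible RHEL8aws -m shell -a "uptime"

# The same ping as before, but connecting as a different remote user.
ansible RHEL8aws -m ping -u ec2-user
```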

List comprehensions and lambda, map, filter in Python

List Comprehension :

List comprehensions are an easy and concise way to create new lists from existing lists.

Create a list of numbers.

nums = [1,2,3,4,5,6,7,8,9,10]

Below is one way to square the numbers in the list and create a new list.

squares = []
for n in nums:
    squares.append(n**2)
print(squares)

Using list comprehension :

# create a list by iterating over nums and squaring each element n
squares = [ n**2 for n in nums ]
print(squares)

Output:

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Lambda Function :

A lambda is an expression that creates an anonymous function, i.e., a function without a name.
Let us see an example:

doubler = lambda x : x*2
print(doubler(5))

Output :

10

We have created a lambda function that doubles the number passed to it, so doubler(5) prints 10.

map :

Now suppose we want to apply some function to each element of a list.
The map function will do it for us:

nums = [1,2,3,4,5,6,7,8,9,10]
squares = list(map(lambda n : n**2 , nums))
print(squares)

Output:

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

filter :

The filter function is used to select elements from a list.
It runs our filter condition on each element of the list; if the condition is satisfied, the element is passed to the output.

Filtering using list comprehension:

# filtering squares greater than 50
filter_squares = [n for n in squares if n>50]
print(filter_squares)

Output :

[64, 81, 100]

Filter using filter function:

# filtering squares greater than 30
filter_squares = list(filter(lambda n:n>30,squares))
print(filter_squares)

Output :

[36, 49, 64, 81, 100]

Record shell sessions

The script command is used to record shell sessions.
It is useful for documentation and for capturing and replicating issues.

Simplest Implementation:

Simple script command example
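For a quick non-interactive demonstration, script (from util-linux) can also record a single command with the -c option; a sketch:

```shell
# Record one command instead of an interactive session.
script -c 'echo hello from script' demo.log

# The output file contains everything that appeared on the terminal.
grep 'hello from script' demo.log
```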

By default, the output file is named ‘typescript’. We can pass any other name for the output file as below.
$ script myscript.log
Now myscript.log will contain the recorded session.
If we examine the contents of this file, we find exactly what we expect.
Our session is recorded 🙂

Contents of ‘typescript’ output file

Record and Playback:

Another useful feature of the script command is the ability to record and play back a session. We have to enable a timing log for this to work.

$ script myscript.log --timing=time.log

time.log would be the timing log which helps in replaying the recorded script.

Recording shell session for replay

To replay the recorded shell session we use the scriptreplay command:

$ scriptreplay -s myscript.log -t time.log
or
$ scriptreplay -s myscript.log --timing=time.log

Shell Script to read a CSV file into an Array

This is a shell script that reads a CSV file into an array and displays its contents.

Using an array is preferred because we can keep track of the number of elements in a row.
For example:
echo ${row[@]} will display all the elements of the array.
echo ${#row[@]} will display the number of elements in the array.
Also, to get the 2nd element of the array, use: echo ${row[1]}

Note: Array indexes in the bash shell start from 0.
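These expansions can be tried directly in the shell before running the script; a quick sketch with a made-up three-element row:

```shell
#!/bin/bash
# A three-element array to demonstrate the expansions used in the script below.
row=(id name city)

echo "${row[@]}"    # all elements: id name city
echo "${#row[@]}"   # number of elements: 3
echo "${row[1]}"    # second element (index starts at 0): name
```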

#!/bin/bash

clear

row_no=1

while IFS=',' read -r -a row            # read each line into an array named row
do
        echo "ROW number: $row_no"
        echo "ROW contents: ${row[@]}"  # display all the elements of array
        echo "ROW Length : ${#row[@]} " # no. of elements in the array
        echo -e "\n----------------------------------\n"
        ((row_no++))
done < file.csv

Below is the csv file contents and output of the script.

csv File
Output

Open and Read a file in Python

Below is example code that reads “file.csv”. Assume it contains data from a table, with the table headers as its first row.

We extract the table header with: contents_header = contents[0]
We then remove the table header from the dataset by slicing: contents = contents[1:]

The function explore_data() prints rows read from the file. It can also show the total number of rows and columns in the data.

from csv import reader

opened_file = open('file.csv')
read_file = reader(opened_file)
contents = list(read_file)   #read the contents of file as a list
contents_header = contents[0] #extract only the header
contents = contents[1:]      #remove the header

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line between rows
        
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

print('Header : \n', contents_header)
print('\n')
explore_data(contents, 0, 3, True)  #print row 0 to 2, Also show no. of rows and columns

Example:

Contents of file.csv
Program Output

Shell Script to compare row counts between two Greenplum PostgreSQL databases

This shell script takes three arguments: the source DB, the target DB, and a file listing tables.

The list file is in the following format:
$ cat listfile.lst
schema.table : where<> ::
schema2.table2 : where<> ::

The ~/gpdbinfo.dat file contains the database name, host, and port details.

#!/bin/ksh
######################################################################
#USAGE : gpdb_row_compare.ksh <source> <target> <list of tables>
# format for list of tables --> schema.table:where clause::
######################################################################

module load gpdb/client/4.3.6.1

touch Report.file
touch greenReport.file
touch redReport.file
echo "" > Report.file
echo "" > greenReport.file
echo "" > redReport.file

SOURCE_GP=$1
TARGET_GP=$2
listfile=$3

header_write()
{
echo "<table border=1>" >> Report.file
echo "<tr><th SIZE=5>QUERY</th><th SIZE=5>$SOURCE_GP Count</th><th SIZE=5>$TARGET_GP Count</th><th SIZE=5>MATCH</th></tr>" >> Report.file
} 

green_match()
{
#echo "<tr bgcolor=\"lime\">" >> greenReport.file
echo "<tr>" >> greenReport.file
echo "<td align=centre><FONT SIZE=2>$1</FONT></td>" >> greenReport.file
echo "<td align=centre><FONT SIZE=2>$2</FONT></td>" >> greenReport.file
echo "<td align=centre><FONT SIZE=2>$3</FONT></td>" >> greenReport.file
echo "<td align=centre><FONT SIZE=2>YES</FONT></td>" >> greenReport.file
echo "</tr>" >> greenReport.file
}

red_match()
{
#echo "<tr bgcolor=\"lightpink\">" >> redReport.file
echo "<tr>" >> redReport.file
echo "<td align=centre><FONT SIZE=2>$1</FONT></td>" >> redReport.file
echo "<td align=centre><FONT SIZE=2>$2</FONT></td>" >> redReport.file
echo "<td align=centre><FONT SIZE=2>$3</FONT></td>" >> redReport.file
echo "<td align=centre><FONT SIZE=2>NO</FONT></td>" >> redReport.file
echo "</tr>" >> redReport.file
}

#######################################

echo -e "\n Connecting to GP --- $SOURCE_GP "
echo -e "\nDETAILS --"
cat ~/gpdbinfo.dat | grep "$SOURCE_GP:" | awk -F':' '{print "-t -d "$3" -h "$2" -p "$5 }' | read GPDBINFO
cat ~/gpdbinfo.dat | grep "$SOURCE_GP:" | awk -F':' '{print $3}' | read GPSHORTNAME
cat ~/gpdbinfo.dat | grep "$SOURCE_GP:" | awk -F':' '{print $2}' | read HOST
cat ~/gpdbinfo.dat | grep "$SOURCE_GP:" | awk -F':' '{print $5}' | read PORT

echo "gp: $GPSHORTNAME"
echo "host: $HOST"
echo "port : $PORT"
USER="$(whoami)"
echo "user : $USER"
echo -e "\n"

set -A prod_count
i=0

while IFS=':' read -r table filter dummy dummy2 || [[ -n "$table" ]];do
    query="select count(*) from $table $filter ;"
    prod_count[$i]=$(psql $GPDBINFO -c "$query")
    echo "$query -->${prod_count[$i]}"
    ((i++))
done < "$listfile"

echo ${prod_count[@]}

######################################

echo -e "\n Connecting to GP --- $TARGET_GP "
echo -e "\nDETAILS --"
cat ~/gpdbinfo.dat | grep "$TARGET_GP:" | awk -F':' '{print "-t -d "$3" -h "$2" -p "$5 }' | read GPDBINFO
cat ~/gpdbinfo.dat | grep "$TARGET_GP:" | awk -F':' '{print $3}' | read GPSHORTNAME
cat ~/gpdbinfo.dat | grep "$TARGET_GP:" | awk -F':' '{print $2}' | read HOST
cat ~/gpdbinfo.dat | grep "$TARGET_GP:" | awk -F':' '{print $5}' | read PORT

echo "gp: $GPSHORTNAME"
echo "host: $HOST"
echo "port : $PORT"
USER="$(whoami)"
echo "user : $USER"
echo -e "\n"

set -A stg_count
i=0

while IFS=':' read -r table filter dummy dummy2 || [[ -n "$table" ]];do
    query="select count(*) from $table $filter ;"
    stg_count[$i]=$(psql $GPDBINFO -c "$query")
    echo "$query -->${stg_count[$i]}"
    ((i++))
done < "$listfile"

echo ${stg_count[@]}

################################

### Prepare Report File

i=0

while IFS=':' read -r table filter dummy dummy2 || [[ -n "$table" ]];
do
    query="select count(*) from $table $filter ;"
    echo "${stg_count[$i]}"
    echo "${prod_count[$i]}"
    
    if [[ "${stg_count[$i]}" == "${prod_count[$i]}" ]]
    then
        echo "green"
        green_match "$query" ${prod_count[$i]} ${stg_count[$i]}
    else
        echo "red"
        red_match "$query" ${prod_count[$i]} ${stg_count[$i]}
    fi

    ((i++))
done < "$listfile"

header_write
cat redReport.file >> Report.file
cat greenReport.file >> Report.file

#####################################

### Send Report File as Email
./sendmail.ksh $@

#####################################

Below is the script that sends the report file generated above as an email:

#!/bin/ksh

{
echo "From: GP COUNT COMPARE REPORT"
echo "To: subhankd@gmail.com"
echo "Cc: "
echo "MIME-Version: 1.0"
echo "Subject: GP Compare: $1 vs $2"
echo "Content-Type: text/html"

cat ./Report.file

} | sendemail -t

echo " Email Sent."

Write a Service Unit on Arch Linux

A service unit describes how to manage a service or application. This includes how to start or stop the service, under which circumstances it should be automatically started, and the dependency and ordering information for related software.

System Unit files are generally loaded from:
/etc/systemd/system/

Below is an example of a simple service unit file:

Using this example you can automate certain tasks on boot. You just need to change the service dependency and the script that you need to execute.

[subhankd@archm system]$ cat /etc/systemd/system/my_sub.service
[Unit]
Description=My Service -- Run before sddm
Before=sddm.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/my_sub.sh
ExecStop=/usr/local/bin/my_sub2.sh

[Install]
WantedBy=sddm.service

[Unit] Section

The first section found in most unit files is the [Unit] section. This is generally used for defining metadata for the unit and configuring the relationship of the unit to other units.

  • Description= : Set this to something short, specific, and informative.
  • Before=sddm.service : Here I have used the sddm service as my trigger. sddm is the display manager that was installed on my machine.

[Service] Section

The [Service] section is used to provide configuration that is only applicable for services.

  • Type=oneshot: This is useful for scripts that do a single job and then exit.
  • RemainAfterExit=yes : So that systemd still considers the service as active after the process has exited.
  • ExecStart=/usr/local/bin/my_sub.sh : Execute the script before starting sddm.
  • ExecStop=/usr/local/bin/my_sub2.sh : Execute the script before stopping sddm.

[Install] Section

This section is optional and is used to define the behavior of a unit when it is enabled or disabled. Enabling a unit marks it to be started automatically at boot. This is accomplished by hooking the unit onto another unit that will be started at boot.

  • WantedBy= : Specifies how a unit should be enabled. This directive allows you to specify a dependency relationship.

Once you have the service unit file ready, create the script files as per your requirement. Make sure the script files are executable by the root user.
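Making the scripts executable can be sketched as follows (the paths use /tmp here so the example is self-contained; the real scripts live in /usr/local/bin):

```shell
#!/bin/bash
# Create a stand-in for /usr/local/bin/my_sub.sh and mark it executable.
mkdir -p /tmp/unit-demo
cat > /tmp/unit-demo/my_sub.sh <<'EOF'
#!/bin/bash
echo "my_sub ran"
EOF

chmod 755 /tmp/unit-demo/my_sub.sh   # rwxr-xr-x: root (and everyone) can execute it
test -x /tmp/unit-demo/my_sub.sh && echo "executable"
```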

Finally, enable the service you just created so that it starts at boot:

[subhankd@archm system]$ systemctl enable my_sub.service
