Saturday, 26 May 2018

numpy basics python

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • tools for integrating C/C++ and Fortran code
  • useful linear algebra, Fourier transform, and random number capabilities


Importing Numpy:

import numpy as np

Creating numpy Array:

 d = np.array([1,2,3,4,5])

Numpy range:

  d = np.arange(1,10)  # creates a NumPy array with the values 1 to 9 (the stop value 10 is excluded)
   

numpy shape:


shape returns a tuple holding the size of the array along each dimension.
d = np.array([1,2,3])
print(d)        # [1 2 3]
print(d.shape)  # (3,)

numpy reshape:



reshape returns an array with the same data arranged in a new shape. It does not modify the original array, so assign the result.

d = np.arange(1,10)    # array([1,2,3,4,5,6,7,8,9])
d.shape                # (9,)
d = d.reshape(3,3)
print(d)    # [[1 2 3]
            #  [4 5 6]
            #  [7 8 9]]
In the above example, the nine elements are rearranged into a 3x3 matrix.
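
A minimal sketch of two reshape details that are easy to miss, assuming NumPy is installed: the original array keeps its shape, and -1 asks NumPy to infer one dimension. The variable names m and auto are only illustrative.

```python
import numpy as np

d = np.arange(1, 10)        # values 1..9, shape (9,)
m = d.reshape(3, 3)         # new 3x3 array; d itself keeps shape (9,)
auto = d.reshape(-1, 3)     # -1 lets NumPy infer the row count (3 here)

print(d.shape)     # (9,)
print(m.shape)     # (3, 3)
print(auto.shape)  # (3, 3)
```

The total number of elements must stay the same, so reshaping nine elements to (2, 4) would raise an error.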


np.zeros()


    np.zeros creates an array filled with zeros. Pass the dimensions as a single tuple (or an int for a 1-D array).
np.zeros((3, 3))     # array([[0., 0., 0.],
                     #        [0., 0., 0.],
                     #        [0., 0., 0.]])
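
As a small illustration (a sketch, not the only way to call it), the shape does not have to be square and the element type can be chosen with the optional dtype parameter:

```python
import numpy as np

z = np.zeros((2, 3))               # shape is passed as a tuple: 2 rows, 3 columns
zi = np.zeros((2, 3), dtype=int)   # dtype is optional; the default is float64

print(z.shape)   # (2, 3)
print(z.dtype)   # float64
print(zi)        # same shape, but filled with integer zeros
```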


np.vstack()


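np.vstack stacks a sequence of arrays vertically (row-wise). When it is given a single 1-D array, each element is treated as its own row, so the result is a column.

c = np.array([1,2,3])  # array([1, 2, 3])
np.vstack(c)    # array([[1],
                #        [2],
                #        [3]])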
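
The more common use of np.vstack is to stack several arrays on top of each other. A short sketch with two illustrative arrays a and b:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked = np.vstack((a, b))   # each 1-D array becomes one row
print(stacked.shape)          # (2, 3)
print(stacked)                # [[1 2 3]
                              #  [4 5 6]]
```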
   

np.eye()


np.eye creates an identity matrix: ones on the main diagonal and zeros elsewhere.

 np.eye(3) # creates a 3x3 identity matrix
           # array([[1., 0., 0.],
           #        [0., 1., 0.],
           #        [0., 0., 1.]])
        

     
np.dot()


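 np.dot computes the dot product of two arrays. For 2-D arrays this is matrix multiplication, so the number of columns of M1 must equal the number of rows of M2.
 np.dot(M1, M2)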
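
A worked sketch with two small example matrices (the values are chosen only for illustration), contrasting np.dot with element-wise multiplication:

```python
import numpy as np

M1 = np.array([[1, 2],
               [3, 4]])
M2 = np.array([[5, 6],
               [7, 8]])

product = np.dot(M1, M2)      # matrix product: rows of M1 times columns of M2
elementwise = M1 * M2         # for contrast: element-by-element multiplication

print(product)       # [[19 22]
                     #  [43 50]]
print(elementwise)   # [[ 5 12]
                     #  [21 32]]
```

For example, the top-left entry of the product is 1*5 + 2*7 = 19.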

np.sum()


 np.sum adds up the elements of an array, either all of them or along one axis.
M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
np.sum(M)           # 45, the sum of all elements
np.sum(M, axis=0)   # array([12, 15, 18])

If axis=0, it sums down each column and returns one total per column.
If axis=1, it sums across each row and returns one total per row.
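
The effect of axis is easier to see on a non-square array. A small sketch with an illustrative 2x3 matrix:

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

col_sums = np.sum(M, axis=0)       # collapses the rows: one total per column
row_sums = np.sum(M, axis=1)       # collapses the columns: one total per row

print(col_sums)   # [5 7 9]
print(row_sums)   # [ 6 15]
```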


np.random.rand()


np.random.rand returns an array of the given shape filled with random floats drawn uniformly from the interval [0, 1).
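
A minimal sketch: the dimensions are passed as separate arguments (not as a tuple). The seed value 0 is an arbitrary choice, used here only to make repeated runs produce the same numbers.

```python
import numpy as np

np.random.seed(0)              # fixed seed (arbitrary) for repeatable output
r = np.random.rand(2, 3)       # 2x3 array of floats in [0, 1)

print(r.shape)                          # (2, 3)
print(r.min() >= 0.0, r.max() < 1.0)    # True True
```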


np.append()


np.append returns a new array with the given values added to the end; the original array is not modified.
 A = np.array([1, 2, 3])
 B = np.append(A, 4) # [1, 2, 3, 4]
 B = np.append(A, [4, 5,6,7]) # [1, 2, 3, 4, 5, 6, 7]
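
For 2-D arrays the optional axis parameter matters. The following sketch (with illustrative values) shows that without axis the inputs are flattened, while axis=0 appends a new row:

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.append(A, [4, 5])                 # returns a new array; A is unchanged

M = np.array([[1, 2], [3, 4]])
flat = np.append(M, [[5, 6]])            # no axis: inputs are flattened first
rows = np.append(M, [[5, 6]], axis=0)    # axis=0: appended as a new row

print(A)           # [1 2 3]
print(B)           # [1 2 3 4 5]
print(flat)        # [1 2 3 4 5 6]
print(rows.shape)  # (3, 2)
```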
        
    






            
    

     

Install apache airflow on ubuntu

What is Airflow:

Airflow is a platform to programmatically author, schedule and monitor workflows. This post covers the following steps for installing Airflow on an Ubuntu/Linux machine:
  1. Installing system dependencies 
  2. Installing airflow with extra packages
  3. Installing airflow meta database
    1. Mysql
    2. Postgres
  4. Installing Rabbitmq (Message broker for CeleryExecutor)
RabbitMQ is used as the message broker when running the CeleryExecutor. With the LocalExecutor there is no need to install a message broker such as RabbitMQ or Redis.

1. Installing Dependency packages:


sudo apt-get update && sudo apt-get upgrade -y

sudo apt-get -yqq install git \
    python-dev \
    libkrb5-dev \
    libsasl2-dev \
    libssl-dev \
    libffi-dev \
    build-essential \
    libblas-dev \
    liblapack-dev \
    libpq-dev \
    python-pip \
    python-requests \
    apt-utils \
    curl \
    netcat \
    locales \
    libmysqlclient-dev \
    supervisor
pip install --upgrade pip 

2. Install Apache airflow


pip install PyYAML==3.12
pip install requests==2.18.4
pip install simplejson==3.12.0

pip install "apache-airflow[crypto,celery,postgres,hive,hdfs,jdbc,gcp_api,rabbitmq,password,s3,mysql]==1.8.1"
pip install celery==3.1.17 

3. Install Meta Database:


i. Install MySQL


#Installing and enable mysql server
sudo debconf-set-selections <<< 'mysql-server mysql-server/root_password password airflowd2p'
sudo debconf-set-selections <<< 'mysql-server mysql-server/root_password_again password airflowd2p'

sudo apt-get -y install mysql-server    libmysqlclient-dev 

ii. Install PostgreSQL


# Install PostgreSQL, enable it at boot and start the server
sudo apt-get -y install postgresql \
    postgresql-contrib
sudo update-rc.d postgresql enable

sudo service postgresql start

4. Install RabbitMQ

sudo apt-get update && sudo apt-get upgrade -y
# Install Erlang, a dependency of RabbitMQ
# (assumes erlang-solutions_1.0_all.deb has already been downloaded to the current directory)
sudo dpkg -i erlang-solutions_1.0_all.deb
sudo apt-get update
# Add the RabbitMQ apt repository
echo "deb https://dl.bintray.com/rabbitmq/debian xenial main" | sudo tee /etc/apt/sources.list.d/bintray.rabbitmq.list
sudo apt-get update

# Install Erlang and the RabbitMQ server
sudo apt-get -yqq install erlang    rabbitmq-server

5. Create RabbitMQ users:


#!/usr/bin/env bash
# Create the airflow user, virtual host and tags
rabbitmq-plugins enable rabbitmq_management
rabbitmqctl add_user airflow_user airflow_user
rabbitmqctl add_vhost airflow
# set_user_tags replaces the user's existing tags, so both tags are set in one call
rabbitmqctl set_user_tags airflow_user airflow_tag administrator

# Grant configure, write and read permissions on the airflow virtual host
rabbitmqctl set_permissions -p airflow airflow_user ".*" ".*" ".*"
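
Celery reaches RabbitMQ through an AMQP broker URL. The sketch below only shows how the user, password and virtual host created above fit into such a URL. The helper name build_broker_url is hypothetical, and the host localhost and port 5672 (RabbitMQ's default AMQP port) are assumptions about your setup. Where this URL goes in the Airflow configuration is not covered here.

```python
from urllib.parse import quote


def build_broker_url(user, password, vhost, host="localhost", port=5672):
    """Build an AMQP URL; user, password and vhost are percent-encoded."""
    return "amqp://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,                      # assumption: broker runs on this machine
        port,                      # assumption: default AMQP port
        quote(vhost, safe=""),
    )


url = build_broker_url("airflow_user", "airflow_user", "airflow")
print(url)   # amqp://airflow_user:airflow_user@localhost:5672/airflow
```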


ansible basics for beginners

What is Ansible


Ansible interacts with machines over SSH, so nothing needs to be installed on the client machines. The only prerequisite is that Ansible is installed on the controller machine, with Python and SSH available.

Inventory:


Inventory file:


The inventory file is a simple text file that lists the machines Ansible will interact with. Machines can be listed individually or collected into named groups. Using the ansible CLI, we can also run a module directly against the inventory from the command line.

Cmd: ansible <group-name> -i <inventory-filename> -m <module-name> -a "<module-params>"
 
Inventory:
server1.mycomp.com
server2.mycomp.com
 
[clients] #group name
server3.mycomp.com
server4.mycomp.com  


Ex: 
ansible clients -i inventory -m ping
ansible clients -i inventory -m apt -a "name=mysql-server state=present"

    The inventory can also be an executable file. For example, if you don’t know in advance how many instances are running in AWS, you can write a script that returns the names of the running instances.
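
A minimal sketch of such an executable inventory, written in Python. Ansible runs the script with --list and expects JSON describing groups and hosts on standard output. The discover_hosts function is a hypothetical stand-in that returns fixed names; a real script would query AWS there.

```python
#!/usr/bin/env python
import json
import sys


def discover_hosts():
    # Hypothetical stand-in: a real script would ask AWS for running instances.
    return ["server3.mycomp.com", "server4.mycomp.com"]


def build_inventory():
    return {
        "clients": {"hosts": discover_hosts()},
        "_meta": {"hostvars": {}},   # no per-host variables in this sketch
    }


if __name__ == "__main__":
    if "--host" in sys.argv:
        print(json.dumps({}))        # per-host variables: none
    else:
        print(json.dumps(build_inventory(), indent=2))
```

Make the file executable and pass it with -i in place of the text inventory.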

Ansible playbooks:


    An Ansible playbook is a simple YAML file that contains the list of tasks to be performed on the client machines listed in the inventory file.

playbook.yaml
---

- hosts: all
  tasks:
    - name: updating package list
      apt: update_cache=yes cache_valid_time=3600
- hosts: clients
  tasks:
    - name: installing mysql server
      apt: name=mysql-server state=present

In the above snippet, we used the apt module to update the package cache and install packages. “hosts: all” means the tasks run on every host listed in the inventory file.

Tasks can also be limited to a specific group of hosts. “hosts: clients” runs the tasks below it only on the clients group that we created in the inventory file. The “name” of each task is a human-readable message that is printed while the task runs, which is very helpful for monitoring the execution.

    Running playbook:

  ansible-playbook -i inventory playbook.yaml

Variables in playbook:


Ansible uses the Jinja2 templating system to handle variables. A variable defined under vars is referenced as {{variable_name}}.

playbook.yaml
---
- hosts: all
  tasks:
    - name: updating package list
      apt: update_cache=yes cache_valid_time=3600
- hosts: clients
  vars:
    init_script: "create_db.sql"
  tasks:
    - name: installing mysql server
      apt: name=mysql-server state=present
    - name: copying init sql files
      copy: src=/tmp/{{init_script}} dest=/tmp/mysql/{{init_script}}


Variable loops in playbook:


playbook.yaml
---
- hosts: all
  tasks:
    - name: updating package list
      apt: update_cache=yes cache_valid_time=3600
- hosts: clients
  vars:
    init_script: "create_db.sql"
  tasks:
    - name: installing packages
      apt: name={{item}} state=present
      with_items:
        - python
        - python-pip
        - vim
    - name: copying init sql files
      copy: src=/tmp/{{init_script}} dest=/tmp/mysql/{{init_script}}

Alternatively, we can define the list as a variable for the host group and loop over it:

playbook.yaml


---

- hosts: all
  tasks:
    - name: updating package list
      apt: update_cache=yes cache_valid_time=3600
- hosts: clients
  vars:
    init_script: "create_db.sql"
    packages:
      - python
      - python-pip
      - vim
  tasks:
    - name: installing packages
      apt: name={{item}} state=present
      with_items: "{{packages}}"
    - name: copying init sql files
      copy: src=/tmp/{{init_script}} dest=/tmp/mysql/{{init_script}}
     

Directory Group variables:


By default, Ansible looks for directories called “group_vars” and “host_vars” in the same location as the playbook. Variables defined in a file under group_vars are automatically applied to the group with the same name as the file.

My folder structure:
    - inventory
    - playbook.yml
    - group_vars
            - all 
            - clients
    - host_vars
            - server.com

In the above folder structure, variables defined in the file called “all” under the group_vars directory are available to every host in the inventory. To define variables for a specific host, create a file named after that host under the “host_vars” directory.
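
When the same variable is defined at several of these levels, the more specific definition wins: host_vars over a group file, and a group file over “all”. The sketch below is a simplified model of that ordering in plain Python, not Ansible's actual implementation (its full precedence rules have more levels). The function resolve_vars and all variable values are hypothetical.

```python
def resolve_vars(all_vars, group_vars, host_vars):
    """Merge variables so that more specific definitions win."""
    merged = dict(all_vars)      # group_vars/all: lowest priority
    merged.update(group_vars)    # group_vars/clients overrides "all"
    merged.update(host_vars)     # host_vars/<hostname> overrides both
    return merged


# Hypothetical variable values, for illustration only.
all_vars = {"ntp_server": "ntp.mycomp.com", "db_port": 3306}
clients_vars = {"db_port": 3307}
host_vars = {"db_port": 3308}

print(resolve_vars(all_vars, clients_vars, {}))
# {'ntp_server': 'ntp.mycomp.com', 'db_port': 3307}
print(resolve_vars(all_vars, clients_vars, host_vars))
# {'ntp_server': 'ntp.mycomp.com', 'db_port': 3308}
```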

Inventory directory:


    Normally the inventory is a simple text file, but it can also be a directory.

     ansible-playbook -i <inventory-directory> playbook.yml

  • ansible-playbook -i uat deploy.yml
  • ansible-playbook -i dev deploy.yml
  • ansible-playbook -i prod deploy.yml

Directory structure of inventory folder:
        
        dev
              - hosts
              - group_vars
              - host_vars
        uat
              - hosts
              - group_vars
              - host_vars
        prod
              - hosts
              - group_vars
              - host_vars
        deploy.yml

If the inventory directory contains text files, Ansible treats them as inventory files.

Roles in ansible:


You can use a single playbook file to manage all the tasks of your infrastructure, but at some stage the file becomes big and hard to manage. For this, Ansible has the “role” feature, which lets you split your playbook YAML file in a more modular way.

You can create a directory called “roles” and create one subdirectory per role inside it.

Roles directory structure:

        dev
              - hosts
              - group_vars
              - host_vars
        roles
              - common
                    - defaults
                        - main.yml   # variable values
                    - tasks
                        - main.yml   # list of tasks need to be execute
                    - files
                        - server.py   # file need to be copy
                    - templates
                        - config.py.j2  # template file used for template module
                    - meta
                        - main.yml  # list the dependency task before perform
              - webserver
                    - defaults
                        - main.yml   # variable values
                    - tasks
                        - main.yml   # list of tasks
              - db
                    - tasks
                        - main.yml   # list of tasks
        deploy.yml

deploy.yml

- hosts: database-server
  roles:
    - common
    - db
- hosts: web-server
  roles:
    - common
    - webserver



Each role folder is broken down into the following subfolders. The full layout is described in the Ansible documentation.
  • defaults folder contains the default values for the role's variables
  • tasks folder contains the tasks to be performed for that role
  • files folder contains the files that need to be transferred
  • templates folder contains the templates used by the template module
  • meta folder contains the list of roles that this role depends on
    
        Ex:
                main.yml
                --- 
                dependencies:
                    - common
                    - db 
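
Roles listed under dependencies run before the role that declares them. The sketch below is a simplified model of that ordering in plain Python, not Ansible's own resolver. The execution_order function and the deps mapping are hypothetical, and the sketch does not guard against circular dependencies.

```python
def execution_order(role, dependencies, seen=None):
    """Return roles in run order: dependencies first, each role once."""
    if seen is None:
        seen = []
    for dep in dependencies.get(role, []):
        execution_order(dep, dependencies, seen)
    if role not in seen:
        seen.append(role)
    return seen


# Hypothetical meta/main.yml contents: webserver depends on common and db,
# and db in turn depends on common.
deps = {"webserver": ["common", "db"], "db": ["common"]}
print(execution_order("webserver", deps))   # ['common', 'db', 'webserver']
```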

Sunday, 11 February 2018

Introduction to Python Argparse


What is Python Command line arguments?


While executing a Python script, we can provide additional arguments on the command line. These arguments are passed into the program, where they can be accessed with the help of Python modules (sys, argparse, etc.). The "sys" module is the traditional and simplest way of handling command line arguments: sys.argv is a list whose first element is the script name, followed by the arguments.

my-script.py
import sys
print(len(sys.argv))
print(sys.argv)
print(sys.argv[0])
print(sys.argv[1])
print(sys.argv[2])


$ python my-script.py arg1 arg2
3
['my-script.py', 'arg1', 'arg2']
my-script.py
arg1
arg2


Python "argparse" module:


There are many Python modules available for handling command line arguments. One of the most popular is argparse, which was added in Python 2.7 as a replacement for optparse. It provides more features than reading sys.argv directly.

Parsing command line arguments:


The parse_args method of the ArgumentParser class is used to parse the command line arguments. By default it takes the arguments from sys.argv[1:], but we can also pass our own list.

We define arguments with the add_argument method. parse_args then returns a Namespace object that holds the parsed values as attributes.

import argparse
parser = argparse.ArgumentParser(description='sample app')
parser.add_argument("name", help="Please enter your name")
args = parser.parse_args()
print(args)
print(args.name)

$ python my-script.py -h
usage: my-script.py [-h] name

sample app

positional arguments:
  name        Please enter your name

optional arguments:
  -h, --help  show this help message and exit
$ python my-script.py Jerry
Namespace(name='Jerry')
Jerry
-h or --help is a default option that ArgumentParser adds to your script. It shows the available positional and optional arguments along with the help messages we provided.
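
Passing an explicit list to parse_args is handy for trying a parser out without a real command line. A minimal sketch, reusing the same "name" argument:

```python
import argparse

parser = argparse.ArgumentParser(description="sample app")
parser.add_argument("name", help="Please enter your name")

args = parser.parse_args(["Jerry"])   # explicit list instead of sys.argv[1:]
print(args.name)                      # Jerry
```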


argument type:


We can explicitly specify the type of value an argument accepts. The "type" field names the conversion: argparse converts the argument value to the specified type while parsing, and if the value cannot be converted it reports an error and exits.

import argparse
parser = argparse.ArgumentParser(description='sample app')
parser.add_argument("square", type=int, help="Please enter an integer value")
args = parser.parse_args()
print(args.square**2)

$ python my-script.py 4
16
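
The type field accepts any callable that takes the raw string and returns the converted value. The sketch below uses a hypothetical helper, positive_int, to show both a custom conversion and what happens when conversion fails: argparse prints a usage message to stderr and exits, which appears in Python as SystemExit.

```python
import argparse


def positive_int(text):
    """Convert text to an int and reject values that are not positive."""
    value = int(text)   # a ValueError here is reported by argparse as an invalid value
    if value <= 0:
        raise argparse.ArgumentTypeError("must be a positive integer")
    return value


parser = argparse.ArgumentParser(description="sample app")
parser.add_argument("square", type=positive_int, help="positive integer to square")

print(parser.parse_args(["4"]).square ** 2)   # 16

try:
    parser.parse_args(["abc"])                # cannot be converted to int
except SystemExit as exc:
    print("parse failed with exit code", exc.code)   # parse failed with exit code 2
```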

Optional arguments:


A positional argument added to the parser must be given a value, otherwise argparse reports an error. Optional arguments are actually optional: there is no error when the program is run without them.

import argparse
parser = argparse.ArgumentParser(description='sample app')
parser.add_argument("--square", dest="square", default=2, type=int, help="Please enter integer value")
args = parser.parse_args()
print(args.square**2)

$ python my-script.py --square 4
16
$ python my-script.py
4

If the optional argument is not provided on the command line, argparse uses the value given in the default field. If no default is specified, the value is None.
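
A short sketch of both cases, parsing an empty argument list. The --name option and its default "guest" are made up for the illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="sample app")
parser.add_argument("--square", type=int)        # no default given
parser.add_argument("--name", default="guest")   # explicit default

args = parser.parse_args([])   # nothing passed on the command line
print(args.square)             # None
print(args.name)               # guest
```

Checking for None is a common way to tell whether an option was supplied at all.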

Short options:


We can define short versions of optional arguments, which are handy when typing commands.

import argparse
parser = argparse.ArgumentParser(description='sample app')
parser.add_argument("-s","--square", dest="square", default=2, type=int)
args = parser.parse_args()
print(args.square**2)

$ python my-script.py -s 4
16

Argument Actions:


The action field of add_argument() specifies what should be done with the argument when it is found on the command line. The default is "store", i.e. store the given value in the destination variable. The following actions are commonly used:

  • store - the default action; stores the given value in the destination variable
  • store_const - stores the value defined by the const field of the argument specification
  • store_true/store_false - stores the boolean value True or False in the variable
  • append - appends the given value to a list
  • append_const - appends the value defined by the const field to a list
  • version - prints version details about the program and exits

examples:


import argparse
parser = argparse.ArgumentParser(description='sample app')
parser.add_argument("-v", "--verbose", action="store_true", default=False)
parser.add_argument("-s","--square", dest="square", default=2, type=int)
parser.add_argument("-a","--add", dest="my_list", default=[], action="append")
args = parser.parse_args()
if args.verbose:
    print("printing verbose output")
print("square value ", args.square**2)
print("my list is ", args.my_list)

$ python my-script.py -v --square 4 -a 2 -a 3
printing verbose output
square value  16
my list is  ['2', '3']
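
The remaining actions from the list can be sketched the same way. The option names (--mode, --a, --b), the constants and the version string below are made up for the illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="sample app")
# store_const: the option takes no value; const is stored when it is present
parser.add_argument("--mode", action="store_const", const="fast", default="slow")
# append_const: several options can collect their constants in one list
parser.add_argument("--a", dest="flags", action="append_const", const="A")
parser.add_argument("--b", dest="flags", action="append_const", const="B")
# version: prints the version string and exits when --version is given
parser.add_argument("--version", action="version", version="%(prog)s 1.0")

args = parser.parse_args(["--mode", "--a", "--b"])
print(args.mode)    # fast
print(args.flags)   # ['A', 'B']

print(parser.parse_args([]).mode)   # slow
```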


Thursday, 8 February 2018

Generate SSH keys using cmd line - Mac OS/Linux

ssh-keygen:

    ssh-keygen is a command line tool used to generate, manage and convert SSH keys. ssh-keygen can create keys for use by SSH protocol versions 1 and 2. It supports many options.

The type of key to be generated is specified with the -t option.  If invoked without any arguments, ssh-keygen will generate an RSA key for use in SSH protocol 2 connections.

Normally each user wishing to use SSH with public key authentication runs this once to create the authentication key in one of:
  • ~/.ssh/identity
  • ~/.ssh/id_dsa
  • ~/.ssh/id_ecdsa
  • ~/.ssh/id_ed25519
  • ~/.ssh/id_rsa
Additionally, the system administrator may use this to generate host keys, as seen in /etc/rc.

Normally this program generates the key and asks for a file in which to store the private key. The public key is stored in a file with the same name but ".pub" appended. The program also asks for a passphrase. The passphrase may be empty to indicate no passphrase (host keys must have an empty passphrase), or it may be a string of arbitrary length.

Open your terminal and run the following command:

ssh-keygen -t rsa -f ~/.ssh/[KEY_FILENAME] -C [USERNAME]

-f - the file name to use for your SSH key files.
-C - a comment stored with the key, here the user the key is for.
-t - the type of key to create (dsa | ecdsa | ed25519 | rsa | rsa1).

ex: 

ssh-keygen -t rsa -f ~/.ssh/my-ssh-keys -C ubuntu
The above command creates the following two files.

my-ssh-keys - private key
my-ssh-keys.pub - public key

This command generates a private SSH key file and a matching public SSH key. The public key has the following structure:
ssh-rsa [KEY_VALUE] [USERNAME]
Then restrict access to your private key so that only you can read it:
chmod 400 ~/.ssh/[KEY_FILENAME]
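
Mode 400 means read permission for the owner and no permissions for anyone else. As a sketch of what the chmod call does, the Python below applies the same mode to a throwaway temporary file (a stand-in for the key, so no real key is touched) and reads the permission bits back. It assumes a Linux or macOS system.

```python
import os
import stat
import tempfile

# A throwaway file stands in for the private key in this sketch.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    key_path = handle.name

os.chmod(key_path, 0o400)                        # same effect as: chmod 400 <file>
mode = stat.S_IMODE(os.stat(key_path).st_mode)   # extract the permission bits
print(oct(mode))                                 # 0o400

os.remove(key_path)                              # clean up the stand-in file
```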
Once you have created the public and private keys, add your public key to the authorized_keys file on the server that you want to access via SSH.
It is normally located at ~/.ssh/authorized_keys

cat ~/.ssh/authorized_keys

Connect to the server using the ssh command line tool. The first time you connect, it asks whether to add the server to your known hosts list. Answer yes.

ssh -i [private_key_file] [username]@[server-name]
ssh -i ~/.ssh/my-ssh-keys ubuntu@10.193.10.23

Monday, 29 January 2018

Creating simple hello world flask app using docker

First make sure Docker is installed on your machine and that you have the necessary permissions to execute the following commands. If you want to know the basics of Docker, please refer to my previous blog - Docker guide for beginners

Following sample files are available in this github repo


Step 1 : Create working directory

mkdir flask-helloworld
cd flask-helloworld

Step 2: create following files inside the working directory


requirements.txt

Flask

app.py

from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(host='0.0.0.0')

Dockerfile

# base image
FROM python:3-onbuild
# specify the port that the container should expose
EXPOSE 5000
# run the application
CMD ["python", "./app.py"]


Step 3 : Build and Run your container


Build docker image:


The following command reads your Dockerfile and builds your custom image. Go to the directory where your Dockerfile is located and execute it.

docker build -t <imagename>:<tag-name> <path of your dockerfile>
docker build -t sample-app:v1 .

The final dot (.) specifies the current directory as the build context. -t specifies the name and tag of the image.



Listing docker images:


docker images
docker images -a




Running Docker container:


The docker run command creates a container from an image and runs it. If the specified image is available on the local machine, Docker uses it; otherwise it downloads the image from Docker Hub and stores it locally.
docker container run <image-name>
docker container run sample-app:v1
This creates a new container from the sample-app image and runs the application inside it.




If you want to run the container in the background, use the --detach (or -d) flag. It detaches the process from the foreground and prints the unique container ID. The -p flag publishes a container port on the host, here mapping host port 5000 to container port 5000.

docker container run -p 5000:5000 -d sample-app:v1
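
If you drive Docker from a Python script, the same command can be assembled as an argument list. The sketch below only builds and prints the command; the helper name docker_run_command is hypothetical. The resulting list could be handed to subprocess.run on a machine where Docker is installed.

```python
import shlex


def docker_run_command(image, host_port=None, container_port=None, detach=False):
    """Assemble the argument list for 'docker container run'."""
    cmd = ["docker", "container", "run"]
    if host_port is not None and container_port is not None:
        cmd += ["-p", "{}:{}".format(host_port, container_port)]   # host:container
    if detach:
        cmd.append("-d")          # run in the background
    cmd.append(image)
    return cmd


cmd = docker_run_command("sample-app:v1", host_port=5000, container_port=5000, detach=True)
print(" ".join(shlex.quote(part) for part in cmd))
# docker container run -p 5000:5000 -d sample-app:v1
```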













Executing commands inside the container:


The following command starts a new container from the image with an interactive shell, which lets us run Linux commands inside it. This is very helpful for debugging the application if something goes wrong.

docker run -it <image-name> sh

docker run -it sample-app:v1 sh




Stop the container:


The following command is used to stop the container.
docker container stop 9a425901d134

Deleting containers:


Every docker run creates a new container, which consumes disk space, so it is best practice to clean up containers once you are done with them.

docker container rm <container id>

docker container rm 9a4