Wednesday, July 1, 2015

Myths of Processing - CPU Scheduling

Whenever I start a performance analysis, I get bogged down by processors, cores and virtualization. In this post, I try to understand their performance impact in general. Although the overall performance of any process also depends on other factors, such as whether it is CPU intensive, disk I/O intensive or network I/O intensive, this post focuses on CPU operations and on how different operating systems and virtualization layers manage CPU resources.

CPU scheduling is one of the most critical parts of system software. Not all operating systems are equal: it is one of the key characteristics that distinguish operating systems in terms of performance, which is largely governed by how CPU operations are scheduled. Also, when an application is written, how it is going to use multiple cores and multiple processors should be part of the discussion.

Due to the availability of different types of multiprocessor systems, varying in the number of processors and the number of cores, the processing power of computers has increased dramatically. But do all programs and operating systems really benefit from more cores and processors? The answer needs a more careful explanation and is largely determined by the design of the operating system.

For example, Windows 2003 is limited to four CPUs. However, the limit counts processors (sockets), not cores, so it can still take advantage of multiple cores: with four quad-core processors it can use up to 16 cores, and with four dual-core processors it can use up to 8 cores.

Virtual CPUs (vCPUs) in VMware virtual machines appear to the operating system as single-core CPUs. So, just like in the example above, if you create a virtual machine with 8 vCPUs (which you can do with vSphere), the operating system sees 8 single-core CPUs. If the operating system is Windows 2003 Standard Edition (limited to 4 CPUs), it only runs on 4 vCPUs (source: VMware Knowledge Base).

But this behavior can be overridden by adding the cpuid.coresPerSocket parameter to the VMware configuration (.vmx) file.

This is a very good article that specifically covers configuring this parameter in VMware vSphere.
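
As a rough illustration of the socket-versus-core arithmetic described above, the following shell sketch computes how many vCPUs a guest could use. The function name usable_vcpus and its arguments are made up for this example. It assumes that the OS limit counts sockets only and that the vCPU count is an exact multiple of the cores per socket.

```shell
#!/bin/sh
# Hypothetical helper for illustration, not part of any VMware tool.
# usable_vcpus VCPUS CORES_PER_SOCKET SOCKET_LIMIT
usable_vcpus() {
    vcpus=$1
    cores_per_socket=$2   # value given to cpuid.coresPerSocket (1 = single-core vCPUs)
    socket_limit=$3       # CPU (socket) limit of the guest OS edition

    # Number of virtual sockets the guest sees.
    sockets=$(( vcpus / cores_per_socket ))

    # Sockets beyond the OS limit are not used.
    if [ "$sockets" -gt "$socket_limit" ]; then
        sockets=$socket_limit
    fi

    echo $(( sockets * cores_per_socket ))
}

usable_vcpus 8 1 4   # 8 single-core vCPUs, 4-CPU limit -> prints 4
usable_vcpus 8 2 4   # same VM with 2 cores per socket  -> prints 8
```

With two cores per socket, the same 8 vCPUs are presented as 4 dual-core sockets, which fits within the 4-CPU limit.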

Friday, August 15, 2014

Linux Process Profiling - Creating Core Dump

This is the first post in a series on process profiling. In this post I will show how to create a core dump of a running process. We can use the following tools to create the core dump:

1. Using Standard Signal  QUIT (3)
2. Using gdb  - GNU Debugger

 We will use the GNU Debugger here.

 # ulimit -c    /* Checking whether core dumps are disabled or enabled
0                  /* Disabled
# ulimit -c unlimited    /* Changing the size limit to unlimited

$ulimit -a
 core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 65536
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

 # ps aux | egrep 'VSZ| 24291'  /* Identifying the Virtual Memory Size of the Process 24291

root     24291  0.8  0.0   4656  1032 pts/30   S    09:47   0:06 /bin/bash ./
root     28349 19.0  0.0   4048   736 pts/14   R+   09:59   0:00 egrep VSZ| 24291

# gdb --pid=24291 /* Attaching to the process by its PID to generate its core. The core is written to the present working directory, so ensure that it has enough free space.

GNU gdb Red Hat Linux (6.6-45.fc8rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
Attaching to process 24291
Reading symbols from /bin/bash...(no debugging symbols found)...done.
Using host libthread_db library "/lib/".
Reading symbols from /lib/ debugging symbols found)...done.
Loaded symbols for /lib/
Reading symbols from /lib/ debugging symbols found)...done.
Loaded symbols for /lib/
Reading symbols from /lib/ debugging symbols found)...done.
Loaded symbols for /lib/
Reading symbols from /lib/ debugging symbols found)...done.
Loaded symbols for /lib/

(no debugging symbols found)
0x00110416 in __kernel_vsyscall ()

(gdb) gcore  /* Creating the Core
Saved corefile core.24291

(gdb) quit  /* Quitting gdb
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program: /bin/bash, process 24291

# ls -latrh /* The core file is generated
-rw-r--r--  1 root     root     4.6M 2014-08-15 10:00 core.24291
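
The interactive session above can also be run non-interactively. A minimal sketch, assuming the gcore wrapper script that ships with gdb is installed; the dump_core function and its optional dry-run argument are made up for this example.

```shell
#!/bin/sh
# dump_core PID [RUNNER]
# Writes core.PID in the present working directory using gdb's gcore script.
# RUNNER is a hypothetical convenience: pass "echo" to print the command
# instead of running it.
dump_core() {
    pid=$1
    runner=${2:-}
    $runner gcore -o core "$pid"
}

# Dry run for the process used in this post.
dump_core 24291 echo   # prints: gcore -o core 24291
```

Calling dump_core 24291 without the second argument runs the command for real, so the same free-space caveat applies.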

Monday, July 14, 2014

Installing Cacti with Spine Poller from Source

Installing Cacti and Spine from Source                                         

  1. Required Libraries / Tools
    1. MySQL
    2. PHP
    3. SNMP
    4. HTTP server (Apache)
    5. Miscellaneous libraries associated with the above packages
  2. Pre-work
Install Apache
# yum install httpd httpd-devel
Install MySQL
# yum install mysql mysql-server
Install PHP
# yum install php php-cli php-common php-devel php-gd php-mbstring php-mysql php-pear
Install PHP-SNMP
# yum install php-snmp
Install NET-SNMP
# yum install net-snmp-utils net-snmp-libs php-pear-Net-SMTP
Install RRDTool
# yum install rrdtool
# Recompiling PHP with socket support (if it is not already enabled)
./configure --with-apxs2=/usr/local/apache/bin/apxs --with-mysql=/usr/include/mysql --prefix=/usr/local/apache/php --with-config-file-path=/usr/local/apache/php --enable-force-cgi-redirect --disable-cgi --with-zlib --with-gettext --with-gdbm --enable-sockets
1. ./configure  /* Creates the Makefile
2. make   /* Builds using the Makefile
3. make install  /* Installs the software
Installing PHP SAPI module:       apache2handler
/usr/local/apache/build/ SH_LIBTOOL='/usr/local/apache/build/libtool' /usr/local/apache/modules
/usr/local/apache/build/libtool --mode=install cp /usr/local/apache/modules/
cp .libs/ /usr/local/apache/modules/
cp .libs/libphp5.lai /usr/local/apache/modules/
libtool: install: warning: remember to run `libtool --finish /admin/scripts/home/tac/php-5.3.1/libs'
chmod 755 /usr/local/apache/modules/
[activating module `php5' in /usr/local/apache/conf/httpd.conf]
Installing PHP CLI binary:        /usr/local/apache/php/bin/
Installing PHP CLI man page:      /usr/local/apache/php/man/man1/
Installing build environment:     /usr/local/apache/php/lib/php/build/
Installing header files:          /usr/local/apache/php/include/php/
Installing helper programs:       /usr/local/apache/php/bin/
  program: phpize
  program: php-config
Installing man pages:             /usr/local/apache/php/man/man1/
  page: phpize.1
  page: php-config.1
Installing PEAR environment:      /usr/local/apache/php/lib/php/
[PEAR] Archive_Tar    - already installed: 1.3.3
[PEAR] Console_Getopt - already installed: 1.2.3
[PEAR] Structures_Graph- already installed: 1.0.2
[PEAR] XML_Util       - already installed: 1.2.1
[PEAR] PEAR           - already installed: 1.9.0
Wrote PEAR system config file at: /usr/local/apache/php/etc/pear.conf
You may want to add: /usr/local/apache/php/lib/php to your php.ini include_path
/admin/scripts/home/tac/php-5.3.1/build/shtool install -c ext/phar/phar.phar /usr/local/apache/php/bin
ln -s -f /usr/local/apache/php/bin/phar.phar /usr/local/apache/php/bin/phar
Installing PDO headers:          /usr/local/apache/php/include/php/ext/pdo/
4. Copy the .ini file to /etc/php.ini
Stop and start httpd.
Socket support (--enable-sockets) should now be enabled. To verify:
            * Write a small PHP program, <?php phpinfo(); ?>, in mysystem.php  /* Shows PHP's system variables
            * Browse to http://hostname/mysystem.php
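
As an alternative to the phpinfo() page, socket support can be checked from the command line. A minimal sketch: php -m lists the compiled-in modules one per line, and the hypothetical helper has_module (a name made up here) looks for an exact match in that list. Make sure to call the php binary of the build you want to check.

```shell
#!/bin/sh
# has_module NAME: reads a module list (one name per line) on stdin and
# succeeds if NAME is present. Hypothetical helper for illustration.
has_module() {
    grep -qxF "$1"
}

# Intended use (not run here):
#   php -m | has_module sockets && echo "sockets enabled"

# Demonstration with a hand-written sample list instead of real php output.
printf '[PHP Modules]\nCore\nsockets\n' | has_module sockets && echo "sockets enabled"
```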
  3. Cacti Installation
0. Create the user cacti in the OS and in MySQL; create the database cacti:
        #mysqladmin -u root -p create cacti
1. wget
2. tar -xvzf cacti-0.8.8b.tar.gz

         mysql -p cacti < /usr/local/apache/htdocs/ips/cacti/cacti.sql
                        mysql> GRANT ALL ON cacti.* TO cacti@localhost IDENTIFIED BY 'password';
                        Query OK, 0 rows affected (0.00 sec)
                        mysql> flush privileges;
                        Query OK, 0 rows affected (0.00 sec)
                        mysql> exit
3. vim include/config.php
        /* make sure these values reflect your actual database/host/user/password */
        $database_type = "mysql";
        $database_default = "cacti";
        $database_hostname = "localhost";
        $database_username = "cacti";
        $database_password = "password";
        $database_port = "3306";
        $database_ssl = false;
        /* Edit this to point to the default URL of your Cacti install.
           ex: if your Cacti install is at http://serverip/cacti/ this
           would be set to /cacti/ */
        $url_path = "/cacti/";
4. Installing Spine Poller
5. Setting up the poller in crontab
        vim /var/spool/cron/cacti
        #min hour dayofmonth monthofyear dayofweek0-sunday commands

        *       *       *       *       *   /usr/bin/php /usr/local/apache/htdocs/ips/cacti/poller.php

6. Use the Spine poller, which is more efficient, to poll every minute; select it through the GUI.
Installing the core Plugin Architecture

        mysql -u cacti -p cacti < cacti-plugin-arch/pa.sql

1.    curl >threshold.tgz (Threshold Management)

        curl >settings.tgz (Mailer API)
        tar -xvzf threshold.tgz
        tar -xvzf settings.tgz

2. Use GUI Plugin Settings to Install

3. Creating Crontab for spine Poller
        vim /var/spool/cron/cacti
        #min hour dayofmonth monthofyear dayofweek0-sunday commands

        *       *       *       *       *   /usr/bin/php /usr/local/apache/htdocs/ips/cacti/poller.php

  1. Use the Spine poller for higher performance; select it through the GUI
  2. chown -R cacti:apache rra log  /* Changing ownership of these directories recursively
drwxr-xr-x  2 cacti    users 4.0K 2012-04-03 20:49 log

drwxr-xr-x  2 cacti    users 4.0K 2014-06-05 16:48 rra

Split Tunneling and DNS

1. What are the different types of tunneling available in a VPN?

1. Full Tunnel - The VPN tunnel is used for all traffic (intranet and Internet); more secure
2. Split Tunnel - Two TCP/IP stacks are available; corporate and Internet traffic are separated, which conserves bandwidth

2. What is Split DNS?
Split Domain Name System (DNS) allows DNS queries for certain domain names to be resolved by internal DNS servers over the VPN tunnel, while all other DNS queries are resolved by the Internet Service Provider's (ISP) DNS servers.

3. How are the internal zones/domains provided?
A list of internal domain names is "pushed" to the VPN Client during initial tunnel negotiation. The VPN Client then determines whether DNS queries should be sent over the encrypted tunnel or sent unencrypted to the ISP.
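
The client-side decision described above can be sketched in shell. This is only an illustration: the function name route_dns_query and the example domains are made up, and matching on a whole-label suffix is an assumption about how a client compares names against the pushed list.

```shell
#!/bin/sh
# route_dns_query NAME DOMAIN...
# Prints "tunnel" if NAME equals, or is a subdomain of, one of the pushed
# internal DOMAINs; otherwise prints "isp". Hypothetical helper.
route_dns_query() {
    name=$1
    shift
    for domain in "$@"; do
        case "$name" in
            "$domain"|*."$domain")
                echo tunnel
                return 0
                ;;
        esac
    done
    echo isp
}

# corp.example.com stands in for a pushed internal domain.
route_dns_query wiki.corp.example.com corp.example.com   # prints: tunnel
route_dns_query www.example.org corp.example.com         # prints: isp
```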

4. Where is Split DNS used?
Split DNS is only used in split-tunneling environments, since traffic is sent both over the encrypted tunnel and unencrypted to the Internet.

5. What is Dynamic DNS (DDNS)?
Dynamic DNS (DDNS) allows automatic registration of VPN Client host names into a DNS server upon successful negotiation of the VPN connection. When a VPN Client initiates a connection, the local host name is sent to the concentrator, which in turn forwards it to the centrally located Dynamic Host Configuration Protocol (DHCP) server for address allocation. If the DHCP server supports DDNS, then the allocated address and host name are entered automatically. DHCP address allocation is required for DDNS to function; DDNS does not work with local address pools.

6. What are the different ways of handling DNS queries in a split-tunneling environment?
    Split-DNS -  DNS queries that match the domain names configured on the Cisco Adaptive Security Appliance (ASA) go through the tunnel, for example, to the DNS servers defined on the ASA, and others do not.

    Tunnel-all-DNS -  only DNS traffic to the DNS servers defined on the ASA is allowed. This setting is configured in the group policy.

    Standard DNS - all DNS queries go through the DNS servers defined by the ASA and, in the case of a negative response, might also go to the DNS servers configured on the physical adapter.

7. How does the OS use split tunneling?

    On MS Windows, DNS settings are per-interface. This means that, if split tunneling is used, DNS queries can fall back to the physical adaptor's DNS servers if the query failed on the VPN tunnel adaptor. If split tunneling without split-DNS is defined, then both internal and external DNS resolution works because it falls back to the external DNS servers.

8. How is DNS used in a VPN?

Depending on how your VPN is configured, you might or might not use the same DNS for your VPN and for the Internet. VPNs are (typically) like an additional IP stack on your system, and can have a separate DNS server address configured.

    If your VPN does not assign a new DNS server for the VPN session, then you will continue to use the DNS server(s) configured in your main Internet IP stack. This can present a problem if the external DNS cannot resolve internal addresses.

    If your VPN does assign a new DNS server - for example by using DHCP option 6 "DNS Server" - then you can have different DNS servers for the VPN and for the Internet. Your OS must support this, as must the VPN service. If you send traffic out of both stacks at once, this would be "Split Mode".
    A final option is that you might operate your VPN in Tunnel Mode, sending all communications (including Internet) through the VPN stack. In this case, when you are on the VPN, all DNS queries would use the VPN's DNS. This is probably the most secure way, since all internal traffic is sure to stay in the VPN, but it can choke your Internet bandwidth.

*Wonderful resources at: Cisco site

Saturday, July 5, 2014

Installing Multi Cluster Apache Hadoop-2.3.0 in CentOS

1. Pre-planning
After getting frustrated trying to find a multi-node Hadoop 2.x.x cluster installation tutorial on the Internet, I decided to write this one. I have tried my best to simplify it as much as I can. The DigitalOcean blog and the Apache web site documentation helped me a lot in understanding the procedures and concepts.
The distributed architecture of Hadoop has a master node, a couple of slave nodes and an optional secondary master node. In this tutorial, I am going to set up one master node and two slave nodes.
Ensure that all slave nodes are identical (in terms of partitions, memory, CPU etc.). This is not a requirement of the Hadoop installation, but it eases management and operations.
Make sure that the nodes can communicate with each other using PKI (SSH public keys).
The nodes will have the following names, which should be specified in /etc/hosts (or have resource records in DNS). Due to the heavy communication / heartbeat traffic among the nodes, it is recommended to use /etc/hosts instead of a DNS server. Alternatively, a caching DNS agent can be configured on all nodes to cache DNS records and avoid DNS queries.
slave02 slave01 master
Generating and copying public keys to all other nodes. How PKI works is beyond the scope of this installation guide; please refer to the basic concepts of cryptography and the RSA/DSA asymmetric algorithms.
On master:
# ssh-keygen -t rsa
On slave01:
# ssh-keygen -t rsa
On slave02:
# ssh-keygen -t rsa
Once each node's public key has been copied to the other nodes, every node can communicate with any other node without a password.
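
The copy step itself is not shown in the commands above. One possible way to do it, assuming ssh-copy-id is available and root logins are used (as the prompts in this post suggest), is sketched below. The function name distribute_key and its dry-run argument are made up for this example.

```shell
#!/bin/sh
# Node names as used in this tutorial.
NODES="master slave01 slave02"

# distribute_key [RUNNER]
# Runs ssh-copy-id once per node so the local public key is appended to
# each node's authorized_keys. Pass "echo" as RUNNER for a dry run that
# only prints the commands (hypothetical convenience).
distribute_key() {
    runner=${1:-}
    for node in $NODES; do
        $runner ssh-copy-id "root@$node"
    done
}

# Dry run: prints one ssh-copy-id command per node.
distribute_key echo
```

Running distribute_key without arguments on each of the three nodes would prompt for each password once; after that, key-based logins should work in every direction.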
Installing Java on all nodes
From the master node:
# yum install java-1.7.0-openjdk-devel
# ssh slave01 "yum install java-1.7.0-openjdk-devel"
# ssh slave02 "yum install java-1.7.0-openjdk-devel"
Check the Java version:
# java -version
2. Hadoop Installation and Configuration
Master Node
#curl -O
# tar -xvzf hadoop-2.3.0.tar.gz
# mv hadoop-2.3.0 /opt/hadoop
--------------------Entry of .bashrc file---------------------------------------------------
# User specific aliases and functions
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk
export HADOOP_INSTALL=/opt/hadoop
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
-----------------End of bashrc file----------------------------------------------------
[root@master ~]# source ~/.bashrc /* Exporting Environmental Variables
/*Initializing Configuration Files
[root@master hadoop]# vim /opt/hadoop/etc/hadoop/
# The java implementation to use.

Editing the .xml configuration files
[root@master hadoop]# vim /opt/hadoop/etc/hadoop/core-site.xml
[root@master hadoop]# vim /opt/hadoop/etc/hadoop/yarn-site.xml
[root@master hadoop]#cp /opt/hadoop/etc/hadoop/mapred-site.xml.template /opt/hadoop/etc/hadoop/mapred-site.xml
[root@master hadoop]# vim /opt/hadoop/etc/hadoop/mapred-site.xml

Configuring /opt/hadoop/etc/hadoop/hdfs-site.xml. This file has to be configured on each host in the cluster. It specifies the directories that will be used by the namenode and the datanode on that host.
[root@master hadoop]# mkdir -p /data/local/hadoop_store/hdfs/namenode /* On the master node
[root@master hadoop]# vim /opt/hadoop/etc/hadoop/hdfs-site.xml
dfs.replication = 2 /* The replication factor for files. Ideally it should be at least 3 for reliability, depending on how many data nodes you have.

For Slave Nodes:
Copy all hadoop files to slaves from Master
[root@master hadoop]#scp -r /opt/ha* slave01:/opt/
[root@master hadoop]#scp -r /opt/ha* slave02:/opt/
Create directories to be used in DataNodes
[root@master hadoop]# ssh slave01 "mkdir -p /data/local/hadoop_store/hdfs/datanode"
[root@master hadoop]# ssh slave02 "mkdir -p /data/local/hadoop_store/hdfs/datanode"
[root@slave01 ~]# vim /opt/hadoop/etc/hadoop/hdfs-site.xml
[root@slave02 ~]# vim /opt/hadoop/etc/hadoop/hdfs-site.xml
hdfs namenode -format /*Formatting the hdfs file system
cat /data/local/hadoop_store/hdfs/namenode/current/VERSION /*Verifying the version
Verifying the Processes
[root@master hadoop]# jps
15217 NameNode
15379 SecondaryNameNode
15800 Jps
15535 ResourceManager
[root@master hadoop]# jps
16321 NameNode
16635 ResourceManager
16881 Jps
16489 SecondaryNameNode
[root@master hadoop]# ssh slave01 "jps"
11887 NodeManager
11799 DataNode
11997 Jps
[root@master hadoop]# ssh slave02 "jps"
9415 Jps
9306 NodeManager
9219 DataNode
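
The manual comparison of the jps listings above can be automated. A minimal sketch, assuming jps prints one "PID Name" pair per line as shown; the function name missing_daemons is made up for this example.

```shell
#!/bin/sh
# missing_daemons EXPECTED...
# Reads jps output on stdin and prints every EXPECTED daemon name that
# does not appear as the second field of some line. Hypothetical helper.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        printf '%s\n' "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$daemon"
    done
}

# Intended use (not run here):
#   ssh slave01 jps | missing_daemons DataNode NodeManager

# Demonstration with a sample listing in which DataNode is not running.
printf '11887 NodeManager\n11997 Jps\n' | missing_daemons DataNode NodeManager   # prints: DataNode
```

Empty output means all expected daemons were found. On the master, the expected names would be NameNode, SecondaryNameNode and ResourceManager.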
Verifying Through GUI

Troubleshooting Nodes:
Check the log files of each process.
Ex: [root@slave02 ~]# tail /opt/hadoop/logs/ /* All log files are located in the logs folder inside the hadoop directory
Make sure that iptables or SELinux is not blocking communication between the nodes. Typical errors you get are:
2014-07-05 01:22:16,719 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/ Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-07-05 01:22:17,721 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/ Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Note: I don't own the public domain name used here; it is only used for my internal configuration.