An Intro into the PERL programming language
[info]gurubaran
A NOTE ABOUT THIS POST: Programming Perl, 3rd Edition, is a very good book for reference. This post is my own version of extracts from the book. I have included most of the concepts that I have learned and understood.

1. An Intro into the PERL programming language

Is PERL an easy language to learn?

The answer to this question is both yes and no. Like anyone else, I have compared PERL with other programming and scripting languages such as Python and Ruby. In the end, what you will find is that each one has its own advantages.

Let me state up front the best uses of PERL:

1. System Administration
2. Data Extraction (RegEx comes in very handy here)
3. File Manipulation tasks (with perl you can do what you find difficult to achieve using shell scripts)
4. Library support: the CPAN provides a comprehensive library of modules for almost any task. Installing modules from the CPAN and calling the appropriate library routines makes the world an easier place to live in.

OK, anything in the world that has advantages also has disadvantages.

1. A lot of ugly programs, such as robots and scrapers, can be written using PERL.
2. Perl code can be difficult to read and understand, since we may not be following any predefined standards. The issue of following conventional programming practice is addressed in PERL 3.0.


2. Variables

Different types of variables available in perl

Type         Character   Example     Is a name for:
Scalar       $           $value      A single value (a string or a number)
Array        @           @books      An array of values
Hash         %           %longday    A group of values keyed by string
Subroutine   &           &leash      A callable chunk of perl code
Typeglob     *           *james      Everything named james


2.1 Singularities

Speaking of variables, we must know the distinction between two often-used words in the perl context: singularities and pluralities. A scalar variable is singular: it holds one value under one name, and $var_name is used to access, assign and modify that value.

A scalar variable can hold integers, floating-point numbers, strings, and even references to other variables or to objects.

$ans = 2; # an integer
$pi = 3.14; # a "real" number
$avocados = 6.02e23; # scientific notation
$pet = "Positon"; # string
$sign = "I love my $pet"; # string with interpolation
$cost = 'It costs $100'; # string without interpolation
$thence = $whence; # another variable's value
$exit = system("vi $file"); # numeric status of a command
$cwd = `pwd`; # string output from a command
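The difference between the double-quoted and single-quoted strings above can be checked with a short sketch. The helper name describe_pet and the pet name are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: builds both kinds of string for a given pet name.
sub describe_pet {
    my ($pet) = @_;
    my $interpolated = "I love my $pet";   # double quotes: $pet is replaced
    my $literal      = 'I love my $pet';   # single quotes: text kept as typed
    return ($interpolated, $literal);
}

my ($with, $without) = describe_pet("Camel");
print "$with\n";      # I love my Camel
print "$without\n";   # I love my $pet
```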

Scalar references to objects

$post = Position->new("King");   # same as: new Position "King"
if (not $post) { die "Object not created! "; }
$post->assign();

Here we create a reference to a Position object and put it into the variable $post. Next, we test $post as a scalar Boolean to see if it is "true", and we throw an exception if it is not, which in this case would mean that the Position constructor failed to make a proper Position object. But on the last line, we treat $post as a reference by asking it to look up the assign() method for the object held in $post, which happens to be a Position, so Perl looks up the assign() method for Position objects. Context is important in Perl because that's how Perl knows what you want without your having to say it explicitly, as many other computer languages force you to do.
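The snippet above does not run on its own, because no Position class is defined. Here is a minimal sketch of what such a class might look like; its fields (name, assigned) and the behaviour of assign() are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical Position class, just enough to make the snippet above run.
package Position;

sub new {
    my ($class, $name) = @_;
    my $self = { name => $name, assigned => 0 };
    return bless $self, $class;    # $self is now a reference to a Position object
}

sub assign {
    my ($self) = @_;
    $self->{assigned} = 1;
    return $self->{name};
}

package main;

my $post = Position->new("King");
die "Object not created!\n" unless $post;   # a reference is always true
print ref($post), "\n";                     # prints the class name: Position
print $post->assign(), "\n";                # method lookup goes through the class
```

The same scalar $post is used first as a Boolean and then as an object reference; only the context differs.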



2.2 Pluralities

Some kinds of variables hold multiple values that are logically tied together. Perl has two types of multivalued variables: arrays and hashes. In many ways, these behave like scalars--they spring into existence with nothing in them when needed, for instance. But they are different from scalars in that, when you assign to them, they supply a list context to the right side of the assignment rather than a scalar context.

2.2.1 Arrays


An array is an ordered list of scalars, accessed by the scalar's position in the list. (It might also contain references to subarrays or subhashes.) To assign a list value to an array, you simply group the values together (with a set of parentheses):

@home = ("couch", "chair", "table", "stove");

As in C, arrays are zero-based, so while you would talk about the first through fourth elements of the array, you would get to them with subscripts 0 through 3:

$home[0] = "couch";
$home[1] = "chair";
$home[2] = "table";
$home[3] = "stove";

Conversely, if you use @home in a list context, such as on the right side of a list assignment, you get back out the same list you put in. So you could set four scalar variables from the array like this:

($potato, $lift, $tennis, $pipe) = @home;

These are called list assignments. They logically happen in parallel, so you can swap two variables by saying:

($alpha, $omega) = ($omega, $alpha);
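To see how the same array behaves in scalar and list context, here is a small sketch. The values given to $alpha and $omega are made up.

```perl
use strict;
use warnings;

my @home = ("couch", "chair", "table", "stove");

my $count = @home;             # scalar context: number of elements (4)
my ($first, $second) = @home;  # list context: first two elements
my $last = $home[-1];          # negative subscripts count from the end
my $top  = $#home;             # index of the last element (3)

# Swap two variables with a list assignment.
my ($alpha, $omega) = ("a", "z");
($alpha, $omega) = ($omega, $alpha);

print "$count items, first=$first, last=$last, top index=$top\n";
print "alpha=$alpha omega=$omega\n";
```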

2.2.2 Hashes

A hash is an unordered set of scalars, accessed by some string value that is associated with each scalar. For this reason hashes are often called associative arrays. But that's too long for lazy typists to type, and we talk about them so often that we decided to name them something short and snappy. The other reason we picked the name "hash" is to emphasize the fact that they're disordered. (They are, coincidentally, implemented internally using a hash-table lookup, which is why hashes are so fast, and stay so fast no matter how many values you put into them.) You can't push or pop a hash though, because it doesn't make sense. A hash has no beginning or end. Nevertheless, hashes are extremely powerful and useful. “Until you start thinking in terms of hashes, you aren't really thinking in Perl”.

This form is correct:

%longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",
"Wed", "Wednesday", "Thu", "Thursday", "Fri",
"Friday", "Sat", "Saturday");

But this form is easier to read:

%longday = (
"Sun" => "Sunday",
"Mon" => "Monday",
"Tue" => "Tuesday",
"Wed" => "Wednesday",
"Thu" => "Thursday",
"Fri" => "Friday",
"Sat" => "Saturday",
);
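Once a hash is built, values are looked up by key rather than by position. The sketch below uses a shortened copy of %longday to show lookup, insertion, an existence test and iteration.

```perl
use strict;
use warnings;

my %longday = (
    "Sun" => "Sunday",
    "Mon" => "Monday",
    "Tue" => "Tuesday",
);

print $longday{"Mon"}, "\n";        # look up one value by its key
$longday{"Wed"} = "Wednesday";      # add a new pair

# exists() tests for a key without creating it
print "No entry for Fri\n" unless exists $longday{"Fri"};

# A hash has no inherent order, so sort the keys for repeatable output
for my $abbr (sort keys %longday) {
    print "$abbr => $longday{$abbr}\n";
}
my $pairs = keys %longday;          # scalar context: number of pairs
```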

 

Web Spiders, Robots, Bots and Scrapers


Table of Contents

1. Intended Audience
2. Pre Requisites
3. What is a Web Spider?
4. Example of a BOT - AWBOT!!!
5. Extract from the author of AWBOT.
6. Output of AWBOT
7. Scrapers
8. Artificial Intelligence in Bots and Scrapers
9. The concept of Fail-over
10. Web Administrators attitude of Thwarting the Bots and Scrapers
11. Data Mining and Self-Analysis
12. EXAMPLE OUTPUT FROM THE STOCK INFO SCRAPER
13. Appendix A - Not included here

1. Intended Audience

Most of this article is self-explanatory. It is intended to be read by any computer enthusiast.

2. Pre Requisites

A familiarity with computers, networks and websites is preferred, as some of the terms are not explained in layman's terms. Students of computer science are most welcome to read this article. It can be used as a base for further exploration or research into aspects of the subject that are beyond its scope. Perl programs are used for explanation, but you need not have any familiarity with Perl; an honest interest in it will do.

3. What is a Web Spider?  

A spider is a program that crawls the Internet in a specific way for a specific purpose. The purpose could be to gather information or to understand the structure and validity of a Web site. Spiders are the basis for modern search engines, such as Google and AltaVista. These spiders automatically retrieve data from the Web and pass it on to other applications that index the contents of the Web site for the best set of search terms.
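The crawling idea can be sketched without touching the network. In the example below the "Web" is just a hash of made-up pages, and the spider walks it breadth-first, following links and never visiting a page twice. The URLs, the %fake_web hash and the function names are all assumptions for illustration; a real spider would download each page instead of reading it from a hash.

```perl
use strict;
use warnings;

# Stand-in for the Web: page URL => HTML body. All URLs here are made up.
my %fake_web = (
    "http://example.test/"  => '<a href="http://example.test/a">A</a> <a href="http://example.test/b">B</a>',
    "http://example.test/a" => '<a href="http://example.test/b">B</a>',
    "http://example.test/b" => '<a href="http://example.test/">home</a>',
);

# Pull href targets out of a page (a regex is enough for a sketch;
# real HTML calls for a proper parser).
sub extract_links {
    my ($html) = @_;
    return $html =~ /<a\s+href="([^"]+)"/gi;
}

# Breadth-first crawl from $start; returns URLs in the order visited.
sub crawl {
    my ($start, $web) = @_;
    my @queue = ($start);
    my (%seen, @visited);
    while (@queue) {
        my $url = shift @queue;
        next if $seen{$url}++;              # never visit a page twice
        next unless exists $web->{$url};    # a real spider would fetch here
        push @visited, $url;
        push @queue, extract_links($web->{$url});
    }
    return @visited;
}

print "$_\n" for crawl("http://example.test/", \%fake_web);
```

The list of visited pages is what a search engine would hand on to its indexer.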

What is indexing and crawling?
How does a web spider do it?
What is the use of a Robot?
Where does a Robot find its advantages on the Web?

Well, I leave the above questions unanswered; I know the curiosity in you will find the answers to them.

4. Example of a BOT - AWBOT!!!  

AWBot is a bot written in perl, used for testing and benchmarking websites under load.
It generates load as defined in its config file and tests the website's reliability and/or response to heavy load.

5. Extract from the author of AWBOT. 

AWBot is an easy to use tool to test a web site.:
AWbot connects to your web site and make URL requests like any other visitors.
AWBot is not a web indexing robot but a web client tool that emulate some visitors browsing on your web site to test its stability after a development change, to test its reliability and/or response to heavy load.
You choose which pages you want to test in a test/config file (parameters in URLs or forms can be easily supplied).
Then you can launch AWbot as often as you want to test your site, get benchmarks information or make some load benchmarking (AWBot can be launched with several simultaneous process). AWbot supports the following feature:
* Easy to use (create one test/config file) and run (an "auto" mode for sites with no forms soon available).
* Support sites requiring Basic HTTP authentication.
* Can make different pre-post tasks before-after a test (external script, SQL commands...).
* Can check each HTML page resulting of HTTP requests to verify if contents contains/does not contains particular keywords or to extract values.
* URL or params to use in test can be dynamically defined (using values catched from previous page).
* A multi-session test launcher to run several simultaneous tests for load benchmarking.
* Report errors, response time for each page and average response time.
* A lot of other feature to match your test needs.
* Absolutely free with sources (GNU General Public License)
* AWBot has a XML Portable Application Description.

6. Output of AWBOT 
[root@guru-co bin]# perl awbot.pl -config=guru.conf
Awbot 1.1 (build 1.14) started for config/test file guru.conf
.
Test finished. Results are available in file ./output/guru.conf.out

[root@guru-co bin]# cat ./output/guru.conf.out
TEST awbot 1.1 (build 1.14)
---------------------------
Perl version: /usr/bin/perl 5.008008
Config file: guru.conf
Server: sites.visolve.com - User: guru - Delay: 0
Botname: AWBot - TimeOut: 120 - MaxSize: 0
Date: 2009-09-18 12:35:05
Process ID: 5365

ACTIONS
---------------------------
2009-09-18 12:35:05:215 URL 1 - http://sites.visolve.com/
---> OK - 4865.728 ms

SUMMARY
---------------------------
Total requests to do: 1
Total requests sent: 1 (1 answered)
Total requests duration: 4865.728 ms
Average request response time: 4865 ms/request
Faster request response time: URL 1 - 4865 ms
Slower request response time: URL 1 - 4865 ms
Total Check (Successfull/Done): CheckYes: 0/0 CheckNo: 0/0
URL 1 - Duration: 4865 ms - Cumul: 4865 ms - CheckYes: 0 / 0 CheckNo: 0 / 0

7. Scrapers

A scraper, in simple terms, is a program similar to a bot. The difference is that a scraper is used to extract a particular kind of information or data from the web. An example of a scraper is a perl program that finds stock info (say, the current stock price). The URLs of the websites from which it has to fetch the data, and the content it has to fetch, can be pre-programmed; alternatively, the scraper can be designed to find for itself the list of URLs from which to fetch the data.

8. Artificial Intelligence in Bots and Scrapers

The amount of Artificial Intelligence with which the scraper works is greater in the latter case described above. The more complex the dynamic content it has to deal with, the more capable the scraper needs to be. Here the artificial intelligence part comes into play. All computer geeks are well aware of the fact that a computer can only do what it is programmed to do. The AI part of bots and scrapers is where we tell the robot how it should learn new facts, how it is to look for information in the ever-changing dynamic content of the Web, and how to deal with the complexities of the Web.
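The extraction step of a scraper can be sketched with a regular expression. The HTML fragment below is made up (only the company name and price are borrowed from the sample output later in this article), and extract_field is a hypothetical helper, not part of any real scraper.

```perl
use strict;
use warnings;

# Made-up page fragment; a real scraper would download this first.
my $html = '<tr><td class="name">TCS LTD</td><td class="last">588.15</td></tr>';

# Return the number inside the cell with the given class, or undef if absent.
sub extract_field {
    my ($page, $class) = @_;
    if ($page =~ m{<td\s+class="\Q$class\E">\s*([0-9]+(?:\.[0-9]+)?)\s*</td>}) {
        return $1;
    }
    return;    # pattern not found: the page layout may have changed
}

my $price = extract_field($html, "last");
print defined $price ? "Last price: $price\n" : "Price not found\n";
```

Returning undef when the pattern does not match lets the caller notice that the page has changed instead of silently using bad data.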

9. The concept of Fail-over

Let us assume that the website from which we have programmed the scraper to find its information is down. For our scraper to be robust in this case, we must specify a list of websites where it should look for the information. The complexity arises because the manner in which the information can be extracted differs significantly from one web source to another. We can design a number of modules to extract info from different sources. When one module fails, the next one can take over and do the job for us. The programmer must keep this aspect in mind in the design to make the scraper robust and fault tolerant.
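A minimal sketch of this fail-over idea follows, assuming each source is wrapped in its own extraction routine that either returns a price or dies. The source names, the simulated failure and the price are all invented for the example.

```perl
use strict;
use warnings;

# Each source is a name plus a code reference that returns a price or dies.
# The source names and the price are invented for this sketch.
my @sources = (
    { name => "primary",   fetch => sub { die "site is down\n" } },
    { name => "secondary", fetch => sub { return 588.15 } },
);

# Try each source in order; return (price, source name) from the first success.
sub fetch_with_failover {
    my (@list) = @_;
    for my $source (@list) {
        my $price = eval { $source->{fetch}->() };   # eval traps the die
        return ($price, $source->{name}) if defined $price;
        warn "Source $source->{name} failed: " . ($@ || "no data\n");
    }
    return;    # every source failed
}

my ($price, $from) = fetch_with_failover(@sources);
print defined $price ? "Got $price from $from\n" : "All sources failed\n";
```

Because every source hides behind the same interface, adding another fallback is a matter of adding one more entry to the list.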

10. Web Administrators attitude of Thwarting the Bots and Scrapers

The next big hurdle in our project of designing a scraper is the common attitude of web administrators. Most, if not all, web administrators and designers believe that bots and scrapers are naughty, malicious programs designed to overload the website with their ability to generate enormous load. Well, this can sometimes be true. There are some crackers whose primary motive is to crash a website, and scrapers can be designed to generate a huge number of HTTP requests per second. When the load exceeds the threshold limit of the web server, it causes problems for the web administrators.

The parts of a website that robots may visit are specified in its robots.txt file. Now, because of the reasoning mentioned above, web administrators DENY requests from bots and scrapers. :(

Believe me, for writing scrapers and bots without malicious intention, a course in ethical hacking is important.

This creates the need for writing polite robots.
A polite robot respects the robots.txt file and identifies itself to the site as a robot, with a short description of its intent. Even when the robots.txt file permits robots, a polite robot is designed so that it doesn't make too many requests too fast.
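The two habits of a polite robot, checking robots.txt and pacing its requests, can be sketched as follows. The parser handles only the simplest form of the file (Disallow lines under "User-agent: *"), and the sample rules and paths are made up. For real work, CPAN modules such as LWP::RobotUA are a better starting point than this hand-rolled version.

```perl
use strict;
use warnings;

# Collect the Disallow prefixes that apply to all robots ("User-agent: *").
# This handles only the simplest form of the file.
sub disallowed_prefixes {
    my ($robots_txt) = @_;
    my ($applies, @prefixes) = (0);
    for my $line (split /\r?\n/, $robots_txt) {
        $line =~ s/#.*//;                       # drop comments
        if ($line =~ /^\s*User-agent:\s*(\S+)/i) {
            $applies = ($1 eq '*');
        }
        elsif ($applies and $line =~ /^\s*Disallow:\s*(\S+)/i) {
            push @prefixes, $1;
        }
    }
    return @prefixes;
}

# True if $path does not start with any disallowed prefix.
sub is_allowed {
    my ($path, @prefixes) = @_;
    for my $prefix (@prefixes) {
        return 0 if index($path, $prefix) == 0;
    }
    return 1;
}

my $robots_txt = "User-agent: *\nDisallow: /private/\nDisallow: /cgi-bin/\n";
my @rules = disallowed_prefixes($robots_txt);

for my $path ("/index.html", "/private/report.html") {
    if (is_allowed($path, @rules)) {
        print "fetch $path\n";
        sleep 1;    # pause between requests so the server is not flooded
    }
    else {
        print "skip  $path\n";
    }
}
```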

==========NOT ALL HACKERS ARE CRACKERS.=========

Ethical hacking is illustrated by the usefulness of the bot and scraper programs I have described.
In the first case, AWBot is used for testing and benchmarking the load and stress a website can handle.
In the latter case, the scraper is used to extract useful stock info for stock analysts.

11. Data Mining and Self-Analysis  

Data in itself is useless. The raw data needs to be converted into information to convey meaning and to form a base on which a decision can be taken. Though this is not the main purpose of bots and scrapers, it is a very desirable extension. Perl has the ability to read and write Excel sheets. We can take the help of the HTML::Extract perl module to do the job for us. Perl also has the ability to store the information in databases; drivers for most databases are available (through the DBI module and its DBD drivers).

Since we deal with perl, most of us use Linux platforms, where OpenOffice is often pre-installed. Perl has the capability to deal with OpenOffice applications. Charts built from the stock information gathered by the scrapers can be used as a base for informed stock decisions.

12. EXAMPLE OUTPUT FROM THE STOCK INFO SCRAPER  

[root@guru-co Perl_Scripts]# perl yahooasia.pl
Enter the company name:TCS

The company name entered is: TCS

TCS.BO->symbol => TCS.BO
TCS.BO->ex_div => 27 Jul
TCS.BO->currency => INR
TCS.BO->success => 1
TCS.BO->isodate => 2009-09-18
TCS.BO->volume => 138700
TCS.BO->close => 579.90
TCS.BO->bid => 588.15
TCS.BO->div => 13.00
TCS.BO->low => 578.00
TCS.BO->net => +8.25
TCS.BO->div_date =>
TCS.BO->p_change => +1.42
TCS.BO->year_range => 355.25 - 805.00
TCS.BO->avg_vol => 682524
TCS.BO->day_range => 578.00 - 589.80
TCS.BO->date => 09/18/2009
TCS.BO->name => TCS LTD
TCS.BO->div_yield => 2.24
TCS.BO->eps => 0.00
TCS.BO->high => 589.80
TCS.BO->open => 579.80
TCS.BO->last => 588.15
TCS.BO->time => 14:05
TCS.BO->price => 588.15
TCS.BO->ask => 588.75


The current price of TCS.BO on the BSE is 588.15
Wish to get the price of other companies [y/n]: n

A Final Note: AWBot is the program I studied to gather information about the behavior of perl bots and how to write one myself. The stock info scraper is the program I am constructing for automatic stock market analysis.

Copyright © Gurubaran.V.S.
No part of this information can be published in any other journal or magazine without the written permission from the author of this article. This is only intended for educational purpose.

Types of Software Testing
 
In the testing phase, software undergoes various types of testing before it is shipped to the customer.
 
About 50 types of testing are available.
 
 Automation Testing
 
 Determines how well a product functions through a series of automated tasks, using a variety of tools to simulate complex test data.
 
 Acceptance Testing
 
 Formal testing conducted to determine whether or not a system satisfies its acceptance criteria - enables a customer to determine whether to accept the system or not.
 
 Alpha Testing
 
 Testing of a software product or system conducted at the developer’s site by the customer.
 
 Automated Testing
 
 That part of software testing that is assisted with software tool(s) that does not require operator input, analysis, or evaluation.
 
 Beta Testing
 
 Testing conducted at one or more customer sites by the end user of a delivered software product or system.
 
 Black-Box Testing
 
 Functional Testing based on the requirements with no knowledge of the internal program structure or data. Also known as closed box testing.
 
 Bottom-up Testing
 
 An integration testing technique that tests the low level components first using test drivers for those components that have not yet been developed to call the low level components for test.
 
 Clear-Box Testing
 
 Another term for White-Box Testing. Structural Testing is sometimes referred to as clear-box testing. This is also known as glass-box or open-box testing. White box testing is usually carried out by developers who are well versed in the source code. Unit test cases, which are designed for the smallest part of the software system, are a typical example of white box testing. In Java, a unit test case developed using JUnit is a good example of white box testing, since it is designed to uncover errors in a particular Java program/code.

 

 
 Dynamic Testing
 
 Verification or validation performed which executes the system code.
 
 Error-based Testing
 
 Testing where information about programming style, error-prone language constructs, and other programming knowledge is applied to select test data capable of detecting faults, either a specified class of faults or all possible faults.
 
 Exhaustive Testing
 
 Executing the program with all possible combinations of values for program variables.
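Exhaustive testing is only practical when the input domain is tiny, which the following sketch makes visible. The function under test (max3) and the choice of domain 0..4 are assumptions for illustration; even this toy case needs 125 runs.

```perl
use strict;
use warnings;

# Function under test (hypothetical): largest of three values.
sub max3 {
    my ($x, $y, $z) = @_;
    my $max = $x;
    $max = $y if $y > $max;
    $max = $z if $z > $max;
    return $max;
}

# Exhaustive test over a deliberately tiny input domain: every
# combination of three values drawn from 0..4 (5 * 5 * 5 = 125 cases).
my ($cases, $failures) = (0, 0);
for my $x (0 .. 4) {
    for my $y (0 .. 4) {
        for my $z (0 .. 4) {
            my ($expected) = sort { $b <=> $a } ($x, $y, $z);   # independent oracle
            $cases++;
            $failures++ if max3($x, $y, $z) != $expected;
        }
    }
}
print "$cases cases, $failures failures\n";
```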
 
 Failure-directed Testing
 
 Testing based on the knowledge of the types of errors made in the past that are likely for the system under test.
 
 Fault based testing
 
 Testing that employs a test data selection strategy designed to generate test data capable of demonstrating the absence of a set of pre-specified faults, typically frequently occurring faults.
 
 Functionality Testing
 
 Determines the extent to which a product meets expected functional requirements through validation of product features. This process can be as simple as a smoke test to ensure primary functional operation, or as detailed as checking a variety of scenarios and validating that all output meets specified expectations.
 
 Functional Localization Testing
 
 Determines how well a product functions across a range of languages. Localized versions are checked to determine whether particular language translations create failures specific to those language versions.
 
 Heuristics Testing
 
 Another term for failure-directed testing.
 
 Hybrid Testing
 
 A combination of top-down testing combined with bottom-up testing of prioritized or available components.
 
 Integration Testing
 
 An orderly progression of testing in which the software components or hardware components, or both are combined and tested until the entire system has been integrated.
 
 Interoperability Testing
 
 Determines, to a deeper extent than compatibility testing, how well a product works with a specific cross section of external components such as hardware, device drivers, second-party software and even specific operating systems and factory delivered computer systems.
 
 Intrusive Testing
 
 Testing that collects timing and processing information during program execution that may change the behavior of the software from its behavior in a real environment.
 
 Install Testing
 
 Determines how well and how easily a product installs on a variety of platform configurations
 
 Load Testing
 
 Determines how well a product functions when it is in competition for system resources. The competition most commonly comes from active processes, CPU utilization, I/O activity, network traffic or memory allocation.
 
 Manual Testing
 
 That part of software testing that requires operator input, analysis, or evaluation.
 
 Mutation Testing
 
 A method to determine test set thoroughness by measuring the extent to which a test set can discriminate the program from slight variants of the program.
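The idea can be sketched in a few lines: take a program, create slight variants of it (mutants), and measure what fraction of the mutants a test set can tell apart from the original. The program, the two mutants and both test sets below are invented for the example.

```perl
use strict;
use warnings;

# Original program and two slight variants ("mutants"); all three are
# invented for this sketch.
my $original = sub { my ($n) = @_; return $n >= 18 ? "adult" : "minor" };
my %mutants  = (
    "'>=' changed to '>'" => sub { my ($n) = @_; return $n >  18 ? "adult" : "minor" },
    "18 changed to 19"    => sub { my ($n) = @_; return $n >= 19 ? "adult" : "minor" },
);

# A mutant is "killed" when some test input makes it disagree with the original.
sub mutation_score {
    my ($inputs, $program, $variants) = @_;
    my $killed = 0;
    for my $name (keys %$variants) {
        $killed++ if grep { $variants->{$name}->($_) ne $program->($_) } @$inputs;
    }
    return $killed / scalar(keys %$variants);
}

printf "weak test set:   %.2f\n", mutation_score([5, 40],     $original, \%mutants);
printf "strong test set: %.2f\n", mutation_score([5, 18, 40], $original, \%mutants);
```

The weak test set never probes the boundary value 18, so both mutants survive; adding that one input kills them both.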
 
 Mundane Testing
 
 A test that includes many simple and repetitive steps; it can also be called manual testing.
 
 Operational Testing
 
 Testing performed by the end user on software in its normal operating environment.
 
 Path coverage Testing
 
 A test method satisfying coverage criteria that require each logical path through the program to be tested. Paths through the program are often grouped into a finite set of classes; one path from each class is tested.
 
 Performance Testing
 
 Determines how quickly a product executes a variety of events. This type of testing sometimes includes reports on response time to a user’s command, system throughput or latency. The word performance has various meanings; here it mainly refers to speed.
 
 Qualification Testing
 
 Formal Testing usually conducted by the developer for the customer, to demonstrate that the software meets its specified requirements.
 
 Random Testing
 
 An essentially black-box testing approach in which a program is tested by randomly choosing a subset of all possible input values. The distribution may be arbitrary or may attempt to accurately reflect the distribution of inputs in the application environment.
 
 Regression Testing
 
 Selective re-testing to detect faults introduced during modification of a system or system component to verify that modifications have not caused unintended adverse effects, or to verify that a modified system or system component still meets its requirements.
 
 Smoke Testing
 
 It is performed only when the build is ready. Every file is compiled, linked, and combined into an executable program every day, and the program is then put through a “smoke test”, a relatively simple check to see whether the product “smokes” when it runs.
 
 Statement Coverage Testing
 
 A test method satisfying coverage criteria that requires each statement be executed at least once.
 
 Static Testing
 
 Verification performed without executing the system’s code. Also called static analysis.
 
 Stress Testing
 
 Determines, to a deeper extent than load testing, how well a product functions when a load is placed on the system resources that exceeds their capacity. Stress testing can also determine the capacity of a system by increasing the load placed on the resources until a failure or other unacceptable product behaviour occurs. Stress testing can also involve placing loads on the system for extended periods.
 
 System Testing
 
 The process of testing an integrated hardware and software system to verify that the system meets its specified requirements.
 
 System Integration Testing
 
 Determines, through isolation, which component of a product is the roadblock in the development process. This testing is beneficial to products that come together through a series of builds where each step in the development process has the potential to introduce a problem. System integration testing is also used in systems composed of hardware and software. In essence, system integration testing is intended to exercise the whole system in real-world scenarios and, again through isolation, determine which component is responsible for a certain defect.
 
 Top-down Testing
 
 An integration testing technique that tests the high-level components first, using stubs for lower-level called components that have not yet been integrated; the stubs simulate the required actions of those components.
 
 Unit Testing
 
 The testing done to show whether a unit (the smallest piece of software that can be independently compiled or assembled, loaded, and tested) satisfies its functional specification or its implemented structure matches the intended design structure.
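In Perl, a unit test is typically written with the core Test::More module. The sketch below tests one small unit against its functional specification; the trim function is a hypothetical unit chosen for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Unit under test (hypothetical): removes leading and trailing whitespace.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

is(trim("  perl  "), "perl", "strips spaces on both sides");
is(trim("perl"),     "perl", "leaves clean input alone");
is(trim("   "),      "",     "all-whitespace input becomes empty");
```

Each is() call compares the actual result with the expected one and reports "ok" or "not ok" along with the test description.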
 
 White box Testing
 
 Testing approaches that examine the program structure and derive test data from the program logic.
 
 Web site Testing - Compatibility Testing
 
Compatibility testing tests your web site across a wide variety of browser/operating system combinations. This testing typically exposes problems with plug-ins, ActiveX controls, Java applets, JavaScript, forms and frames. Currently there are over 100 possible combinations of different Windows operating systems and various versions of the Netscape (NE) and Internet Explorer (IE) browsers. It is important to test across a large number of these to ensure that users with diverse configurations don’t experience problems when using the web site or application.
 
 Web Site Testing - Content Testing
 
Content Testing verifies a web site’s content such as images, clip art and factual text.
 
Web site Testing - Database Testing
 
Most web sites of any complexity store and retrieve information from some type of database. Clients often want us to test the connection between their web site and database in order to verify data and display integrity.
 
 Web site Testing - Functionality Testing
 
Functionality testing ensures that the web site performs as expected. The details of this testing will vary depending on the nature of your web site. Typical examples of this type of testing include link checking, form testing, transaction verification for e-commerce and databases, testing Java applets, file upload testing and SSL verification. For testing that is repetitive in nature, an automated test tool such as Rational’s Visual Test can be used to decrease the overall duration of a test project.
 
 Web site Testing - Performance Testing
 
Performance Testing measures the web site performance during various conditions. When the conditions include different numbers of concurrent users, we can run performance tests at the same time as stress and load tests.  
 
  Eight Second Rule
 
Every page within a web site must load in eight seconds or less, even for users on slow modem connections, or the site risks losing its users to a competitor site that serves pages more quickly.
 
 Web site Testing - Server Side Testing
 
Server side testing tests the server side of the site, rather than the client side. Examples of server side testing include testing the interaction between a web and an application server, checking database integrity on the database server itself, verifying that ASP scripts are being executed correctly on the server and determining how well a web site functions when run on different kinds of web servers.
 
 Web site Testing - Stress and Load Testing
 
Load Testing, a subset of stress testing, verifies that a web site can handle a particular number of concurrent users while maintaining acceptable response times. To perform this type of testing use sophisticated automated testing tools, such as Segue’s SilkPerformer, to generate accurate metrics based on overall system load and server configuration.

 Posted by Gurubaran

Perl code
Hi Friends,

I wanted some perl code to read a file and print the lines that do not begin with #.
That is, I wanted a script that would strip a file of its comments and present the output to us. An additional requirement is that blank lines should not be printed in the output. The result is the simple and effective perl code below, which does the job for us.

PERL CODE

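#!/usr/bin/perl
use strict;
use warnings;

my $file = "squid.conf";
# Open the file for reading and stop with a message if that fails
open(my $conf_file, '<', $file) or die "Cannot open $file: $!";
my @data = <$conf_file>;    # read all lines into an array
close($conf_file);

foreach my $line (@data)
{
    # Print lines that are neither comments nor blank
    if($line !~ m/^(#)/i and $line !~ /^\s*$/)
    {
        print "$line";
    }
}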

PERL CODE EXPLANATION

The perl code reads the file squid.conf. Most of the code is self-explanatory for the initiated. The elusive part is the if check:

if($line !~ m/^(#)/i and $line !~ /^\s*$/)

Here it is composed of two regular expressions.

1. $line !~ m/^(#)/i
2. $line !~ /^\s*$/

The first reg ex checks that the line does not start with a # (that is, it is not a comment line).
The second reg ex checks that the line is not blank (empty or containing only whitespace). When both conditions are satisfied, the line is printed to standard output.
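The same two conditions can also be packaged as a filter that works on any list of lines, which makes them easy to try out without a file. This is an alternative sketch, not the script above; the function name strip_comments and the sample lines are made up.

```perl
use strict;
use warnings;

# Keep only the lines that are neither comments nor blank.
sub strip_comments {
    my (@lines) = @_;
    return grep { $_ !~ /^#/ and $_ !~ /^\s*$/ } @lines;
}

# Sample input invented for the sketch (not a real squid.conf).
my @sample = (
    "# This is a comment\n",
    "\n",
    "http_port 3128\n",
    "   \n",
    "visible_hostname proxy\n",
);

print strip_comments(@sample);
```

Here grep plays the role of the foreach loop and the if check combined: it returns only the elements for which the block is true.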

Cloud computing explained...
Cloud Computing

Cloud computing is a general term for anything that involves delivering hosted services over the Internet. Some analysts and vendors define cloud computing narrowly as an updated version of utility computing: basically virtual servers available over the Internet. Others go very broad, arguing that anything you consume outside the firewall is "in the cloud," including conventional outsourcing.

Cloud computing comes into focus only when you think about what IT always needs: "a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends IT's existing capabilities".

These services can be broadly classified into three major categories:
1. Infrastructure-as-a-Service (IaaS)
2. Platform-as-a-Service (PaaS) and
3. Software-as-a-Service (SaaS).

Cloud computing may be confused with grid computing, utility computing and autonomic computing. Let me throw light on these variants in the next article.

1. Infrastructure-as-a-Service (IaaS)
Infrastructure as a Service (IaaS) is the delivery of computer infrastructure (typically a platform virtualization environment) as a service.

These 'virtual infrastructure stacks' are an example of the everything-as-a-service trend and share many of its common characteristics. Rather than purchasing servers, software, data center space or network equipment, clients instead buy those resources as a fully outsourced service. The service is typically billed on a utility computing basis, and the amount of resources consumed (and therefore the cost) will typically reflect the level of activity. It is an evolution of web hosting and virtual private server offerings.

2. Platform-as-a-Service (PaaS)

Platform as a service (PaaS) is the delivery of a computing platform and solution stack as a service. It facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers, providing all of the facilities required to support the complete life cycle of building and delivering web applications and services entirely available from the Internet, with no software downloads or installation for developers, IT managers or end-users. It's also known as cloudware.

PaaS offerings include workflow facilities for application design, application development, testing, deployment and hosting as well as application services such as team collaboration, web service integration and marshalling, database integration, security, scalability, storage, persistence, state management, application versioning, application instrumentation and developer community facilitation. These services are provisioned as an integrated solution over the web.

A second definition of PaaS is more client-oriented. PaaS can be defined as the delivery of a cost-effective, cloud-based workspace environment (the platform) to the end user. This platform integrates the work/life environment and lets him or her work, communicate, interact and play games anywhere, anytime and on any device, in a safe manner based on the roles assigned to that user. As such, PaaS could also be described as datacenter-centric, client-based utility computing.

3. Software-as-a-Service (SaaS)

Software as a Service (SaaS, typically pronounced 'sass') is a model of software deployment whereby a provider licenses an application to customers for use as a service on demand. SaaS software vendors may host the application on their own web servers or download the application to the consumer device, disabling it after use or after the on-demand contract expires. The on-demand function may be handled internally to share licenses within a firm or by a third-party application service provider (ASP) sharing licenses between firms.

"The next big trend (Cloud Computing) sounds nebulous, but it's not so fuzzy when you view the value proposition from the perspective of IT professionals."

Advantages

* Cloud computing lets you access all your applications and documents from anywhere in the world, freeing you from the confines of the desktop and facilitating wholesale group collaboration.

* Unlimited storage capacity.

* Increased data reliability.

* Universal document access.

* Easier group collaboration.

Posted by Gurubaran

Source: Wikipedia and other internet magazines

Shell Script to Find IP Address in your network
[info]gurubaran
Shell script to find IP Address and Machine Names in your Network

Here is a simple shell script that discovers the machines on your network and lists their IP addresses and machine names.

Only the machines that are up will be listed.

The script uses the ping and traceroute commands available in Linux to do its job.

The shell script is self-explanatory for users with basic Linux knowledge.

However, more elucidation will be given on request.


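#!/bin/bash
# Scan 172.16.1.1 to 172.16.1.254 and record every host that answers ping.
rm -f IP.txt
touch IP.txt
for (( i = 1; i < 255; i++ )); do
    ip="172.16.1.$i"
    # Skip hosts that drop every ping packet.
    if ping -c 2 "$ip" | grep -q "100% packet loss"; then
        continue
    fi
    # The traceroute hop line for this address carries the host name in column 2.
    name=$(traceroute "$ip" | grep -v "traceroute to" | grep -F "($ip)" | awk '{print $2}')
    echo "$ip $name" >> IP.txt
done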

To see the Output

# cat IP.txt

172.16.1.1 guru-co.chip.com
172.16.1.5 has-co.chip.com
172.16.1.13 admin-co.chip.com
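
The two checks the script relies on can be tried without touching the network. The following is only a sketch: the helper names host_is_up and hop_name are made up for illustration, and the sample text merely imitates typical Linux ping and traceroute output rather than coming from a real run.

```shell
#!/bin/bash
# Hypothetical helpers; the sample text below is invented, not real output.

# Succeeds when the ping summary on stdin does NOT report total packet loss.
host_is_up() {
    ! grep -q "100% packet loss"
}

# Prints the host name from the traceroute hop line that ends at address $1.
hop_name() {
    grep -F "($1)" | grep -v "traceroute to" | awk '{print $2}'
}

sample_ping="2 packets transmitted, 2 received, 0% packet loss, time 1001ms"
sample_trace="traceroute to 172.16.1.5 (172.16.1.5), 30 hops max, 60 byte packets
 1  has-co.chip.com (172.16.1.5)  0.215 ms  0.198 ms  0.187 ms"

if printf '%s\n' "$sample_ping" | host_is_up; then
    echo "172.16.1.5 $(printf '%s\n' "$sample_trace" | hop_name 172.16.1.5)"
fi
```

Feeding the helpers canned text like this makes it easy to see what each grep and awk stage contributes before running the real scan.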

Created and Posted by Gurubaran

Trouble Shooting Berkeley DB errors while installing OpenLDAP
[info]gurubaran

configure: error: BerkeleyDB version incompatible with BDB/HDB backends
configure: error: Berkeley DB version mismatch

1) Install a newer version of Berkeley DB. Apply all the patches published for that particular version on the Oracle website.
2) Sometimes it might be necessary to downgrade to a lower version of Berkeley DB.
A concise explanation of which version of Berkeley DB to use with which OpenLDAP version can be found at http://www.openldap.org/faq/data/cache/44.html
3) Set the following environment variables, then run configure.
Set LD_LIBRARY_PATH to the hidden .libs directory where you built Berkeley DB for your machine.
export LD_LIBRARY_PATH="/home/OpenLDAP+Squid/db-4.5.20/build_unix/.libs"
export CPPFLAGS=-I/usr/local/BerkeleyDB.4.7/include
export LDFLAGS=-L/usr/local/BerkeleyDB.4.7/lib
./configure
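
Before running configure it can help to confirm which Berkeley DB version the headers on the include path actually declare. Here is a small sketch. The function name bdb_header_version is made up; it assumes the include directory contains a db.h that defines the DB_VERSION_MAJOR, DB_VERSION_MINOR and DB_VERSION_PATCH macros.

```shell
#!/bin/bash
# Print the Berkeley DB version declared in <include_dir>/db.h.
# Usage: bdb_header_version INCLUDE_DIR   (hypothetical helper)
bdb_header_version() {
    local header="$1/db.h"
    [ -r "$header" ] || { echo "cannot read $header" >&2; return 1; }
    awk '$1 == "#define" && $2 == "DB_VERSION_MAJOR" { major = $3 }
         $1 == "#define" && $2 == "DB_VERSION_MINOR" { minor = $3 }
         $1 == "#define" && $2 == "DB_VERSION_PATCH" { patch = $3 }
         END { print major "." minor "." patch }' "$header"
}

# Example, using the include path from the CPPFLAGS line above:
# bdb_header_version /usr/local/BerkeleyDB.4.7/include
```

The library directory in LD_LIBRARY_PATH and LDFLAGS should belong to the same build as these headers; mixing builds is one way to end up with the version mismatch error.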

Posted by Gurubaran

A PRE-SCHOOL TEST FOR YOU
[info]gurubaran
Which way is the bus below travelling?

To the left or to the right?




Can't make up your mind?

Look carefully at the picture again.

Still don't know?


Primary school children all over the UK were shown this picture and asked the same question.

90% of them gave this answer:

'The bus is travelling to the right.'

When asked, 'Why do you think the bus is travelling to the right?'


Scroll down
.
.
.



















They answered:

'Because you can't see the door to get on the bus.'


How do you feel now???






Posted by Gurubaran

Archives in *nix
[info]gurubaran
Here are some basic commands for working with archives in *nix.

Tarball (tar)
-------------

GNU tar is an archiver that creates and handles file archives in various formats using the 'tar' command. It was originally used as a backup tool to write data to magnetic tape drives. You can use tar to create file archives, to extract files from previously created archives, to store additional files, or to update or list files which were already stored.

-Create an uncompressed tarball

> tar -cvf archive.tar file1

-Create an archive containing 'file1', 'file2' and 'dir1':

> tar -cvf archive.tar file1 file2 dir1

-Show contents of an archive

> tar -tf archive.tar

-Extract a tarball

> tar -xvf archive.tar

-Extract a tarball into /tmp

> tar -xvf archive.tar -C /tmp

-Create a tarball compressed with bzip2

> tar -cvjf archive.tar.bz2 dir1

-Extract a tar archive compressed with bzip2

> tar -xvjf archive.tar.bz2

-Create a tarball compressed with gzip

> tar -cvzf archive.tar.gz dir1

-Extract a tar archive compressed with gzip

> tar -xvzf archive.tar.gz

Meaning of the c, v, f, z and j options:

-'c' tells tar to create an archive,

-'v' displays the files added to the tarball and

-'f' specifies the archive filename, so it must come last in the option group, directly before the filename. After the filename, all other parameters are the files or directories to add to the archive.

Tarballs are commonly compressed with gzip or bzip2 using the -z or -j options.
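
To see these options working together, here is a small round trip in a throwaway directory. It is only a sketch: all file and directory names are made up, and gzip compression is used so that nothing beyond tar and gzip is needed.

```shell
#!/bin/bash
# Create, list and extract a compressed tarball inside a temporary directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1
echo "hello" > dir1/file1

tar -czf archive.tar.gz dir1         # c = create, z = gzip, f = archive name
listing=$(tar -tzf archive.tar.gz)   # t = list the contents without extracting
mkdir out
tar -xzf archive.tar.gz -C out       # x = extract, -C = into the directory 'out'

echo "$listing"
cmp dir1/file1 out/dir1/file1 && echo "round trip OK"
```

Note that 'f' is the last letter of each option group, so the archive name that follows is taken as its argument.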

bzip2
-----

bzip2 is a freely available, patent-free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression. See the bzip2 documentation for details.

-Compress a file called 'file1'

> bzip2 file1
-Decompress a file called 'file1.bz2'

> bunzip2 file1.bz2

Gzip
----

gzip is a software application used for file compression. gzip is short for GNU zip; the program is a free software replacement for the compress program used in early Unix systems, intended for use by the GNU Project.
gzip was created by Jean-Loup Gailly and Mark Adler. Version 0.1 was first publicly released on October 31, 1992. Version 1.0 followed in February 1993.
Gzip reduces the size of the named files using Lempel-Ziv coding (LZ77). Whenever possible, each file is replaced by one with the extension .gz, while keeping the same ownership modes, access and modification times.

-Decompress a file called 'file1.gz'

> gunzip file1.gz
-Compress a file called 'file1'

> gzip file1
-Compress with maximum compression

> gzip -9 file1
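
By default gzip replaces the original file. A variation that keeps it is sketched below, using the -c option to write to standard output; the file names and the sample data are made up for illustration.

```shell
#!/bin/bash
# Compress a copy of a file, decompress it again and compare the results.
work=$(mktemp -d)
cd "$work" || exit 1
# Highly repetitive sample data, so the size difference is easy to see.
yes "the quick brown fox" | head -n 1000 > file1

gzip -9 -c file1 > file1.gz        # -c writes to stdout, so file1 is kept
gunzip -c file1.gz > file1.copy    # decompress into a separate file

orig_size=$(wc -c < file1)
gz_size=$(wc -c < file1.gz)
echo "original: $orig_size bytes, compressed: $gz_size bytes"
cmp file1 file1.copy && echo "contents identical"
```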

Posted by Gurubaran

HOW TO CONTROL NETWORK BANDWIDTH
[info]gurubaran
MASTER SHAPER
HOW TO CONTROL BANDWIDTH

Hi Guys, as described in my earlier post, Master Shaper is an open source tool for traffic shaping. In this document I describe how to configure Master Shaper to control network traffic.
You can download Master Shaper from http://www.mastershaper.org/

INDEX
1.Introduction
2.Define Options
3.Define Service Level
4.Create Filters
5.Create Chains
6.Create Pipes
7.Define Targets, Ports and Protocols
8.How to control bandwidth based on Port and IP Address

1.Introduction

This document presents, in easy steps, how to set up the chains, pipes and filters in Master Shaper to control bandwidth based on IP Address and Port. Note that Master Shaper must already be installed, as described in the Master Shaper documentation, before you proceed with the configuration settings described here.

2.Define Options

Once the installation of Master Shaper is completed, we need to define the Options, where we describe the incoming interface, the outgoing interface, the bandwidth of the interfaces, the Master Shaper QoS options and the other Master Shaper options. An explanation of each option is given in Master Shaper itself. Below we present an example configuration that worked for our setup.

Example Configuration:
Bandwidth
Inbound Bandwidth: 10000 kbit/s [ The maximum bandwidth of the incoming interface]
Outbound Bandwidth: 10000 kbit/s [ The maximum bandwidth of the outgoing interface]
Interfaces
Incoming Interface: Eth0
Outgoing Interface: Eth2
IMQ: No
MasterShaper QoS Options
ACK packets: Ignore
Classifier: HTB
Default Queuing Discipline: SFQ
MasterShaper Options
Traffic filter: tc-filter
Mode: Bridge
Authentication: Yes

NOTE: In case you set Authentication to Yes, make sure you have defined users who can log in to the system.

[ Settings → Users
Create New User ]

3.Define Service Level

The next step is to set up the service levels. A service level is the amount of bandwidth that we intend to allow for a particular protocol, for example HTTP to control web traffic. Hence it is necessary to define a service level for each protocol or port you want to shape. This is the preliminary step in traffic shaping.

Settings → Service Level
Create New Service Level

Example Configuration:
General

Name: Normal 1000
Classifier: HTB
Inbound
In-Bandwidth: 1000 kbit/s
In-Bandwidth ceil: 1000 kbit/s
In-Bandwidth burst: 1000 kbit/s
Outbound
Out-Bandwidth: 1000 kbit/s
Out-Bandwidth ceil: 1000 kbit/s
Out-Bandwidth burst: 1000 kbit/s
Priority: Normal(3)
Queuing Discipline: SFQ

4.Create Filters

Filters define which protocols and ports you wish to control with traffic shaping. To control web traffic, create a filter with the following settings:

Example Configuration:
Name: HTTP
Protocol : TCP
Ports : HTTP and HTTPS
TOS flags: Ignore
Source Destination
any <---> any

Here we can set both the source and the destination to any, and the direction to bidirectional. The source, destination and direction can be selected later while creating the chains, which keeps the filters generic.
We can use this filter when creating pipes. Pipes are in turn used in chains to define the traffic flow.

5.Create Chains

Chains are necessary to match the traffic against targets. If a target definition matches your network traffic, the network flow is redirected into the chain so it can be matched by the pipe definitions that follow. A chain needs a total amount of bandwidth and a fallback service level. Any traffic which comes into the chain and is not matched by any pipe definition falls into the fallback service level.

Hence, to control traffic based on IP Address, it is necessary to define the targets as described in section 7 and also to set the service level of the chain for the packets that fall into this category. Note that the bandwidth of the chain must be greater than the sum of the bandwidths of the pipes assigned to this chain. Let us describe this with an example.

Example Configuration:
Chain Name: CHAIN1
Status: Active
Bandwidth
Service Level: Average (in: 2200 kbit/sec out: 2200 kbit/sec)
Fallback: High (in: 5000 kbit/sec out: 5000 kbit/sec )
Targets
Affecting:
Source Destination
VIVEK CO <---> Any

Here VIVEK CO is a target which is defined by its IP Address. Thus this chain affects any traffic which originates from the target VIVEK CO. If necessary the direction of the Chains can be chosen.
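
The rule that a chain must be larger than the sum of its pipes is easy to check with a few lines of shell before entering the numbers in the web interface. This is only a sketch: the helper name chain_fits is made up, and the pipe sizes are invented for illustration (only the 2200 kbit/s chain figure comes from the example above).

```shell
#!/bin/bash
# Usage: chain_fits CHAIN_KBIT PIPE_KBIT...   (hypothetical helper)
# Succeeds when the chain bandwidth is greater than the sum of its pipes.
chain_fits() {
    local chain=$1 total=0 pipe
    shift
    for pipe in "$@"; do
        total=$(( total + pipe ))
    done
    echo "pipes total ${total} kbit/s, chain ${chain} kbit/s"
    (( chain > total ))
}

chain_fits 2200 1000 1000 && echo "fits"
chain_fits 2200 1000 1000 1000 || echo "pipes exceed chain"
```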

6.Create Pipes

Pipes bring chains, filters and service levels together. In addition you can specify the direction of the pipe (incoming, outgoing). Here you also assign a service level, which regulates the bandwidth usage of this pipe. It is mandatory to choose the chain to which the pipe belongs. Hence it is necessary to create an appropriate number of pipes to define the traffic for each chain that is defined.

Example Configuration:
General

Name: HTTP Pipe
Status: Active
Parameters

Chain: CHAIN1
Direction: Source <---> Destination
Filters: HTTP [Choose the filter that is already defined]
Service Level: Normal

7.Define Targets, Ports and Protocols

New Targets, Ports and Protocols can be created if necessary. Targets can be created for a single host or a group of hosts. A MAC address can also be used to define a target.

Target Example
Settings → Target
Create New Target

Name: VIVEK CO
IP: 172.16.1.26

8.How to control bandwidth based on Port and IP Address

Now let us consolidate the steps described in the sections above for controlling bandwidth based on IP Address and Port. After defining the basic Options:

1.Create Service Levels as necessary.
2.Define new targets, ports or protocols if necessary.
3.Create the filters, chains and pipes, in that order.

Now go to the overview section to get a feel for how the traffic will be shaped.

Click “Load Ruleset” to load the rules defined. If you get the message “Rules are enabled. No error found.” then the rules are loaded successfully.

Document by Gurubaran
Posted by Gurubaran

Linux Traffic Shaping or BandWidth Control
[info]gurubaran

Linux traffic shaping can be done using the iptables and tc (traffic control) commands.

Master Shaper, an open source bandwidth control tool, has a very good GUI implementation for the same.

The MasterShaper is a Web application to control and manage network bandwidth and QoS (Quality of Service) easily with a Web browser. It uses iproute2 and the features of newer 2.4 and 2.6 Linux kernels (such as HTB, HFSC, CBQ, SFQ, ESFQ, NETEM, etc.) to manage inbound and outbound network traffic. MasterShaper also displays graphs about the current bandwidth distribution (created with jpgraph). The filtering mechanisms it supports are tc-filter and various iptables modules, such as TCP-Flags, TOS, l7-filter, and ipp2p.
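
For a feel of what shaping with plain tc looks like, here is a rough sketch that prints (but does not run) a small HTB setup limiting web traffic. The function name print_htb_rules, the interface eth2, the class ids and the rates are all assumptions for illustration; the rules MasterShaper itself generates may well look different.

```shell
#!/bin/bash
# Print tc commands that cap traffic to ports 80 and 443 leaving one interface.
# Usage: print_htb_rules DEV LINK_KBIT WEB_KBIT   (hypothetical helper)
print_htb_rules() {
    local dev=$1 link=$2 web=$3
    cat <<EOF
tc qdisc add dev $dev root handle 1: htb default 20
tc class add dev $dev parent 1: classid 1:1 htb rate ${link}kbit
tc class add dev $dev parent 1:1 classid 1:10 htb rate ${web}kbit ceil ${web}kbit
tc class add dev $dev parent 1:1 classid 1:20 htb rate $(( link - web ))kbit ceil ${link}kbit
tc qdisc add dev $dev parent 1:10 handle 10: sfq perturb 10
tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip dport 80 0xffff flowid 1:10
tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip dport 443 0xffff flowid 1:10
EOF
}

# Review the output first; applying it needs root, e.g. by piping it to sh.
print_htb_rules eth2 10000 1000
```

Class 1:10 holds the capped web traffic with an SFQ queue, while everything unmatched falls into the default class 1:20, which may borrow up to the full link rate.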

In case you need any help with traffic shaping, I would be glad to assist you.

I will reply to your comments.

Created and Posted by Gurubaran


Basic Unix Commands
[info]gurubaran
Hi All,

I am jotting down some of the basic Unix commands that are used frequently, so that they might be of help to Linux and Unix newbies.

Welcome to the world of UNIX.


First let's see some commands that deal with DIRECTORIES


1. cd - change directory


Usage:

Give absolute path name
cd  /home/guru


Suppose now I am in /home/guru and need to go to /home/guru/perl then from /home/guru , I can give

cd perl

This is relative to /home/guru


2. pwd  - print working directory


Self-explanatory: it prints the current working directory.


3. mkdir  directory_name    - make directory


The above command creates a new directory.


mkdir /home/guru/new_dir
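
The three commands can be tried together safely in a scratch directory. In this sketch the temporary directory stands in for /home/guru, and the directory names are only examples.

```shell
#!/bin/bash
base=$(mktemp -d)          # throwaway directory standing in for /home/guru
cd "$base" || exit 1       # absolute path
mkdir perl                 # create a subdirectory
cd perl || exit 1          # relative path, resolved against the current directory
pwd                        # prints the path of the scratch directory plus /perl
cd ..                      # go back up one level
mkdir -p new_dir/deeper    # -p also creates any missing parent directories
```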


Next let us see some commands for manipulating FILES


.... Will continue when I find time...


By the way, your comments will be useful for me.



 

Created and Posted by Gurubaran

MySQL Create User, database and add permissions
[info]gurubaran
Mysql - Create User

The steps below explain how to create a MySQL database and a user guru who has all privileges on that database, and how to set the password for the user.

1. Start the mysql daemon

[root@guru-co htdocs]# /etc/init.d/mysqld status
mysqld is stopped
[root@guru-co htdocs]# /etc/init.d/mysqld start
Starting MySQL:                                            [  OK  ]
[root@guru-co htdocs]# mysql

2. Create a database db_shaper

mysql> create database db_shaper;
Query OK, 1 row affected (0.01 sec)

3. Create the user guru giving him all permissions to the database db_shaper

mysql> grant ALL on db_shaper.* to guru@localhost;
Query OK, 0 rows affected (0.09 sec)

Set the password for the user guru

mysql> set password for guru@localhost = password('genera123');
Query OK, 0 rows affected (0.03 sec)
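
On newer servers the statements above may need adjusting: as far as I know, MySQL 8.0 no longer creates an account implicitly through GRANT and no longer provides the PASSWORD() function. A hedged sketch of the equivalent statements follows, generated by a small shell function. The name make_user_sql is made up, the password is a placeholder, and the arguments are inserted without any escaping, so it only suits simple names.

```shell
#!/bin/bash
# Print SQL that creates a database, a local user and grants full access.
# Usage: make_user_sql DATABASE USER PASSWORD   (hypothetical helper)
make_user_sql() {
    cat <<EOF
CREATE DATABASE IF NOT EXISTS $1;
CREATE USER '$2'@'localhost' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'localhost';
EOF
}

make_user_sql db_shaper guru 'change-me'
# To apply it, the output could be piped into the client: ... | mysql -u root -p
```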


Created and Posted by Gurubaran

Polipo - another web cache proxy
[info]gurubaran
Information for you...

Polipo is yet another caching web proxy, just like Squid. Polipo can be used in a single-user environment. For example, if you are running any Linux distribution on your desktop or PC, then you can probably go with Polipo.

Polipo - What is it?

Polipo is a small and fast caching web proxy (a web cache, an HTTP proxy, a proxy server). Polipo is primarily designed to be used by one person or a small group of people. It can also be used by a large group of people, though this behaviour is not yet tested, I believe. 

Polipo has some features that are unique among currently available proxies:

*) Polipo uses HTTP/1.1 pipelining if it believes that the remote server supports it, whether the incoming requests are pipelined or come in simultaneously on multiple connections (this is more than the simple usage of persistent connections, which is done by e.g. Squid);
*) Polipo caches the initial segment of an instance if the download has been interrupted and, if necessary, completes it later using Range requests;
*) Polipo upgrades client requests to HTTP/1.1 even if they come in as HTTP/1.0, and upgrades or downgrades server replies to the client's capabilities (this may involve conversion to or from the HTTP/1.1 chunked encoding);
*) Polipo has complete support for IPv6 (except for scoped (link-local) addresses);
*) Polipo optionally uses a technique known as Poor Man's Multiplexing to reduce latency even further.

Polipo uses a plethora of techniques to make web browsing faster.

By virtue of being a (mostly) compliant HTTP/1.1 proxy, Polipo has all the uses of traditional web proxies.

It is typically used as a web proxy for a single computer or a small network, although there's no fundamental reason why it shouldn't be used by a larger one. 

Since Polipo is small and easy to install, it has applications beyond those of traditional web proxies. You can  usually copy Polipo to whatever machine you are using and do all your browsing through it (with no on-disk cache).
It is also used to cross firewalls that are misconfigured or overly restrictive.


Source: http://www.pps.jussieu.fr/~jch/software/polipo/

Posted by Gurubaran

Squid - Web Caching Proxy
[info]gurubaran
What is Squid?

Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages. Squid has extensive access controls and makes a great server accelerator. It runs on most available operating systems, including Windows and is licensed under the GNU GPL.

Squid is used by hundreds of Internet Providers world-wide to provide their users with the best possible web access. Squid optimises the data flow between client and server to improve performance and caches frequently-used content to save bandwidth. Squid can also route content requests to servers in a wide variety of ways to build cache server hierarchies which optimise network throughput.

Website Content Acceleration and Distribution

Thousands of web-sites around the Internet use Squid to drastically speed up their content delivery. Squid can reduce your server load and improve delivery speeds to clients. Squid can also be used to deliver content from around the world - copying only the content being used, rather than inefficiently copying everything. Finally, Squid's advanced content routing configuration allows you to build content clusters to route and load balance requests via a variety of web servers.

" [The Squid systems] are currently running at a hit-rate of approximately 75%, effectively quadrupling the capacity of the Apache servers behind them. This is particularly noticeable when a large surge of traffic arrives directed to a particular page via a web link from another site, as the caching efficiency for that page will be nearly 100%. " - Wikimedia Deployment Information.


Taken from www.squid-cache.org - The official site for SQUID

Your Comments are always welcome...

In the next article I will describe Squid configuration...


Posted by Gurubaran

Akamai , What is it?
[info]gurubaran
What is Akamai?

If you use the Internet for anything - to download music or software, check the headlines, book a flight - you've probably used Akamai's services without even knowing it. We play a critical role in getting content from providers to consumers.

Akamai has created a digital operating environment for the Web. Our global platform of thousands of specially-equipped servers helps the Internet withstand the crush of daily requests for rich, dynamic, and interactive content, transactions, and applications. When delivering on these requests, Akamai detects and avoids Internet problem spots and vulnerabilities, to ensure Websites perform optimally, media and software download flawlessly, and applications perform reliably.

Hundreds of enterprises worldwide use our global platform to sell, inform, entertain, market, advertise, deliver software, and conduct their business online. Through our EdgeControl they also gain insight into worldwide Internet conditions and access to tools to manage their online business. With Akamai's managed services there's no infrastructure to build or deploy. We can have you integrated on our platform in just a few days.

Today Akamai handles tens of billions of daily Web interactions for companies like Audi, NBC, and Fujitsu, and organizations like the U.S. Department of Defense and NASDAQ -- powering brand new business models that serve the changing online economy.


An extract from http://www.akamai.com/html/about/index.html

why linux? why open source?
[info]gurubaran
Open Source Benefits

* Cost: License Fees and TCO
* Data integrity/interoperability
* Independence and Flexibility
* Stability and Reliability
* Broader Access to Information
* Community Support
* Engage Students in Collaboration

Open Source for Education in Schools

The use of open technologies in education is now commonplace throughout the world with one notable exception, the United States. School and district technology leaders need to become aware of how these other educational systems are leveraging open technologies to improve student learning, engage parent and community interest in education, provide home access to technologies used in school and use their financial resources in the most effective way possible.

Challenges of Open Source in Schools

* Most software is written for Windows and Mac.

* Staff, admins and other people must be adequately trained to use open source in the first place.

* Some view this as "It's too much of a change at once and could put a bad taste in administrators', teachers', students', and parents' mouth." But why don't we adopt it and see? Why go to Microsoft for all your needs? As democracy goes, "for, by and of the people", so is Linux... Use it, contribute and develop it. Linux is already doing wonders in the server space.


Adapting to change is the key, so why don't we do it early?

Your views and comments, please...

Posted by Gurubaran

Import from Outlook Express 6 to Thunderbird
[info]gurubaran
Is this option not available? Cool.

Follow these easy steps

1. Locate the folder where Outlook Express stores the mail

Can be something like
"/win/Documents and Settings/Guru/Local Settings/Application Data/Identities/{some_long_code}/Microsoft/Outlook Express"

2. Download the DbxConv - DBX to MBOX converter by Ulrich Krebs.
NOTE
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation

3. Take a back up of the *.dbx files from the Outlook Express local mail folder.

4. Copy the *.dbx files to the folder where DbxConv.exe is present after unzipping.

5. If you are on Linux, DbxConv.exe cannot be executed until you install Wine. Hence this is a prerequisite.

6. Issue the below command if in Linux OS
wine DbxConv.exe Inbox.dbx

7. You will now have Inbox.mbx. Close Thunderbird. Move Inbox.mbx to the Thunderbird local mail folder.

8. Open Thunderbird to see your mails. :)

Created and Posted by Gurubaran

Pipelining in Firefox
[info]gurubaran
Pipelining is sending more than one request before waiting for a response from the Web server.
Most web servers compliant with HTTP/1.1 support pipelining.

Well, we see that the browsing speed in Opera is sometimes faster than in Firefox. One of the factors is that pipelining is enabled by default in Opera but not in Firefox.

Pipelining in FF

Just follow these steps to enable pipelining in Firefox

1 Type about:config in the address bar
2 Find network.http.pipelining and set its value to true
3 Find network.http.proxy.pipelining and set its value to true
4 Find network.http.pipelining.maxrequests and set it to 8.

Colors using echo
[info]gurubaran
Colors in HP UX !!!

Numbers Representing Colors in Escape Sequences


Color      Foreground   Background
black      30           40
red        31           41
green      32           42
yellow     33           43
blue       34           44
magenta    35           45
cyan       36           46
white      37           47

Example:

echo "\033[1;45mChoose one of the following persons:\033[0m"


The simplest, and perhaps most useful, ANSI escape sequence is bold text, \033[1m ... \033[0m. The \033 represents an escape, the "[1" turns on the bold attribute, while the "[0" switches it off. The "m" terminates each term of the escape sequence.

$ echo -e "\033[1mThis is bold text.\033[0m"

A similar escape sequence switches on the underline attribute (on an rxvt and an aterm).

$ echo -e "\033[4mThis is underlined text.\033[0m"
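
Whether echo interprets \033 depends on the shell, so a small wrapper around printf can be a more portable way to use the numbers from the table. This is only a sketch; the function name color_text is made up for illustration.

```shell
#!/bin/bash
# Usage: color_text CODE TEXT   (CODE is a number from the table, or e.g. "1;45")
color_text() {
    # printf expands \033 in its format string; \033[0m resets all attributes.
    printf '\033[%sm%s\033[0m\n' "$1" "$2"
}

color_text 31 "red foreground"
color_text "1;45" "bold on a magenta background"
color_text 4 "underlined"
```

Several attributes can be combined by separating the numbers with semicolons, as in the second call.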

Source: http://www.faqs.org/docs/abs/HTML/colorizing.html

Posted by Gurubaran
