Thursday, 13 June 2013

So, after a long time, I am back to my blog.
This time with the latest trend in the IT industry: BIG DATA.

If you think carefully, the industry nowadays is data driven. So how crucial is your data? If you analyze your data over a period of time, you can usually reach some conclusions that help predict the future.
So can we really predict the future with our data? Yes, to some extent. And that is what domain experts do to take their business to the winning side.
Gone are the days when the industry used to destroy its obsolete data. Now it is stored in a warehouse so that it can be used later. That is what a data warehouse is all about.
Today, businesses grow by getting meaning out of their data.

Let's not take this blog further towards the business side; let's stay on the technical side.


Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, and determine real-time roadway traffic conditions." (Thanks to Wikipedia for that definition.)
Believe it or not, this definition did not exist on Wikipedia three years ago; that alone marks the growth of big data technology.

So, in simple words, BIG DATA is a collection of data from different sources in different formats, from which we derive meaning that enhances different aspects of the industry.
To derive meaning out of your data through business logic, you perform any number of operations on your large-scale records. These operations include extraction, cleansing, transformation, loading, reporting and many more.
Whether it is XML data, plain-text data or any other format, it can be stored in a big data environment.

A use case of BIG DATA with Clickstream analysis
A clickstream is the recording of the parts of the screen a computer user clicks on while web browsing or using another software application. As the user clicks anywhere in the webpage or application, the action is logged on a client or inside the web server, as well as possibly the web browser, router, proxy server or ad server. Clickstream analysis is useful for web activity analysis, software testing, market research, and for analyzing employee productivity.
Some examples of Clickstream Analysis
The application of clickstream data is limited only by your imagination. Here are some ideas:

If you find that many people are visiting your website from a certain set of websites, you could consider catalyzing the process by advertising on those sites.

You could analyze the top exit-pages on your website to figure out whether there is something about those pages that is turning people away. Maybe you are asking for some information that visitors do not want to provide.
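
To make the two ideas above a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the record layout (visitor id, timestamp in seconds, page, referrer), the sample data and the function names `top_referrers` and `top_exit_pages` are all hypothetical, and a real setup would read these records from logs rather than from a list in memory.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical clickstream records: (visitor_id, timestamp_seconds, page, referrer)
CLICKS = [
    ("v1", 100, "/home", "https://search.example/?q=big+data"),
    ("v1", 130, "/pricing", ""),
    ("v1", 190, "/signup", ""),
    ("v2", 105, "/home", "https://blog.example/post"),
    ("v2", 160, "/signup", ""),
    ("v3", 200, "/home", "https://blog.example/other"),
]


def top_referrers(clicks):
    """Count referring hosts over all clicks that carry a referrer."""
    hosts = Counter()
    for _visitor, _ts, _page, referrer in clicks:
        if referrer:
            hosts[urlparse(referrer).netloc] += 1
    return hosts.most_common()


def top_exit_pages(clicks):
    """Count, per page, how many visitors had it as their last recorded click."""
    last_click = {}
    for visitor, ts, page, _referrer in clicks:
        # Keep only the latest click seen so far for each visitor.
        if visitor not in last_click or ts >= last_click[visitor][0]:
            last_click[visitor] = (ts, page)
    return Counter(page for _ts, page in last_click.values()).most_common()


print(top_referrers(CLICKS))
print(top_exit_pages(CLICKS))
```

With the sample data, the referrer count points at the sites worth advertising on, and the exit-page count shows where visitors tend to leave.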
So let's get a brief idea of how this analysis is done.
Data that answers questions like the following is recorded in the client database:

If a visitor came in from a search engine, what search term did she use?
What pages did the visitor access on my website? In what sequence?
How much time did the visitor spend on each page?
Where was the visitor before she reached my website?
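
Two of these questions (the page sequence and the time spent on each page) can be answered with very little code once the clicks are grouped per visitor. The sketch below is only an illustration under assumptions: the event layout, the sample data and the helper names are hypothetical, and time on a page is approximated as the gap until the visitor's next click, so the last page of a visit has no duration.

```python
from collections import defaultdict

# Hypothetical events: (visitor_id, timestamp_seconds, page)
EVENTS = [
    ("v1", 100, "/home"),
    ("v2", 105, "/home"),
    ("v1", 130, "/pricing"),
    ("v2", 160, "/signup"),
    ("v1", 190, "/signup"),
]


def sessions_by_visitor(events):
    """Group events per visitor and order each group by timestamp."""
    grouped = defaultdict(list)
    for visitor, ts, page in events:
        grouped[visitor].append((ts, page))
    return {visitor: sorted(clicks) for visitor, clicks in grouped.items()}


def page_sequence(clicks):
    """Pages in the order the visitor accessed them."""
    return [page for _ts, page in clicks]


def time_on_pages(clicks):
    """Seconds between each click and the next one; the last page is omitted."""
    return [
        (page, next_ts - ts)
        for (ts, page), (next_ts, _next_page) in zip(clicks, clicks[1:])
    ]


sessions = sessions_by_visitor(EVENTS)
for visitor, clicks in sessions.items():
    print(visitor, page_sequence(clicks), time_on_pages(clicks))
```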

All your clicks are recorded along with your IDs and your information, and this is the data that is used later for analysis. This data comes in varied formats such as URLs, timestamps, text, etc. It is fed into the analytic engine, which might be a Hadoop framework, a SQL engine or both, and processed to get meaningful data out of it; the reporting layer does the rest of your business intelligence.
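
Before mixed data like this reaches an analytic engine, it usually has to be turned into uniform records. Here is a small Python sketch of that step, not a description of any particular product. The tab-separated line layout, the field names and the idea that the search term arrives in a `q` query parameter of the referrer URL are all assumptions made for the example.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlparse


def parse_click(line):
    """Turn one tab-separated log line into a typed record.

    Assumed (hypothetical) layout:
    visitor_id, ISO-8601 timestamp, page URL, referrer URL.
    """
    visitor, stamp, url, referrer = line.rstrip("\n").split("\t")
    referrer_parts = urlparse(referrer)
    query = parse_qs(referrer_parts.query)
    return {
        "visitor": visitor,
        "time": datetime.fromisoformat(stamp),
        "page": urlparse(url).path,
        "referrer_host": referrer_parts.netloc,
        # Assumption: the search engine passes the term as "q".
        "search_term": query.get("q", [None])[0],
    }


record = parse_click(
    "v1\t2013-06-13T10:15:00\thttps://shop.example/pricing?ref=nav"
    "\thttps://search.example/?q=big+data\n"
)
print(record)
```

Records in this shape can then be loaded into whichever engine does the heavy lifting.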

This was just one of the use cases.

Big data is going to change the future of the IT industry.
For deeper insights into big data technologies, stay tuned...
Next will be the architecture series on the trending big data databases currently used all around the world.

