Big Data Online Training

What's Big Data?

Big Data refers to data sets so large that commonly used software tools cannot capture, curate, manage, and process them within a tolerable time. Big Data is often described using the "4Vs": Volume, Velocity, Variety, and Veracity.

Advancements in big data analysis offer opportunities to improve decision-making, spot business trends, and even figure out the answers to questions that were previously unimaginable.

What's a Big Data solution?

Example: Trend Micro Smart Protection Network

Trend Micro faced the challenge of detecting threats, in the form of fraudulent activity or attacks, within very large data volumes (6TB of data and 15B lines of logs received daily), which is much like looking for a needle in a haystack. In response, Trend Micro built its Smart Protection Network solution on Hadoop to process huge datasets in parallel and identify anomalies and threats through pattern recognition.

Virus Event Raw Logs

Virus Trend Mined From Virus Event Raw Logs
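To make the idea of mining a trend from raw logs more concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps that Hadoop distributes across a cluster. It is not Trend Micro's implementation: the log format, the sample lines, and the function names (map_line, shuffle, reduce_counts, virus_trend) are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical log format (an assumption): "<date> <time> <host> <virus_name>"
SAMPLE_LOGS = [
    "2012-03-01 10:15:02 host42 WORM_A",
    "2012-03-01 10:15:09 host17 TROJ_B",
    "2012-03-01 11:02:44 host42 WORM_A",
    "2012-03-02 09:30:10 host03 WORM_A",
    "malformed line",
]


def map_line(line):
    """Map step: emit ((date, virus), 1) for each well-formed log line."""
    fields = line.split()
    if len(fields) != 4:
        return []  # skip records that do not match the assumed format
    date, _time, _host, virus = fields
    return [((date, virus), 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def virus_trend(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    for (date, virus), count in sorted(virus_trend(SAMPLE_LOGS).items()):
        print(date, virus, count)
```

On a real cluster the map and reduce functions would run on many nodes at once, each over its own slice of the logs, and the framework would handle the shuffle. The per-day counts produced here are the kind of aggregate a trend chart could be drawn from.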

Example: Business Intelligence

Business intelligence (BI) is a set of theories, methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information. BI can handle large amounts of information to help identify and develop new opportunities. Making use of new opportunities and implementing an effective strategy can provide a competitive market advantage and long-term stability.

Wal-Mart Beer and Diapers Story:

Some time ago, Wal-Mart decided to combine the data from its loyalty card system with that from its point-of-sale systems, and found an unexpected result: customers tended to buy beer and diapers together on Friday afternoons. Further investigation showed that young men who came in to buy diapers for their babies also tended to buy beer on Friday afternoons. After seeing the results of the data mining, Wal-Mart moved the beer next to the diapers, and beer sales went up.
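The kind of co-purchase analysis in this story can be sketched with two standard market-basket measures: support (the share of baskets containing both items) and lift (how much more often the pair occurs than it would if the two items were bought independently). The baskets below are invented for illustration and are not Wal-Mart data; the function name pair_stats is likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets (an assumption, for illustration only)
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "diapers"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_stats(baskets):
    """Return {(item_a, item_b): (support, lift)} for every co-purchased pair."""
    n = len(baskets)
    # How many baskets contain each single item
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each pair; sorting gives one canonical key per pair
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    stats = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        stats[(a, b)] = (support, support / expected)
    return stats


if __name__ == "__main__":
    for pair, (support, lift) in sorted(pair_stats(BASKETS).items()):
        print(pair, f"support={support:.2f}", f"lift={lift:.2f}")
```

A lift above 1 suggests that two items are bought together more often than chance alone would predict, which is the sort of signal that could prompt a retailer to shelve them side by side.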

Example: Open Data

Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. The goals of the open data movement are similar to those of other "Open" movements such as open source, open hardware, open content, and open access. The philosophy behind open data has been long established (for example in the Mertonian tradition of science), but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives such as

The Best Open Data Releases of 2012 in North America illustrates the breadth of what we might learn in the still relatively young field of urban open data. For this year's installment, we're going one step further. Sure, raw data is great. But useful tools, maps, and data visualizations built with said data are even better.

Big Data Training

Cloud Computing Era

Hadoop and HDFS