Unit-1 Bigdata


INTRODUCTION TO BIGDATA

UNIT-1

What is Data?

Data is the set of quantities, characters, or symbols on which a computer performs operations, and which
may be stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or
mechanical recording media.

What is Big Data?

Big Data is data of very large size. The term describes a collection of data that is already huge and is
still growing exponentially with time. In short, such data is so large and complex that traditional data
management tools cannot store or process it efficiently.

Examples of Big Data

1. Statistics show that 500+ terabytes of new data are ingested into the databases of the social media
site Facebook every day. This data is mainly generated through photo and video uploads, message
exchanges, comments, etc.

2. A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many
thousands of flights per day, the data generated reaches many petabytes.
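
The jet engine figure can be turned into a rough daily estimate. The sketch below is only a back-of-the-envelope calculation: the rate of 10 TB per 30 minutes comes from the example above, while the number of flights, the flight length, and the number of engines are assumed values chosen for illustration.

```python
# Back-of-the-envelope estimate of jet engine data generated per day.
TB_PER_HALF_HOUR = 10      # from the example: 10+ TB per 30 minutes of flight
FLIGHT_HOURS = 2           # assumption: average flight length in hours
ENGINES_PER_AIRCRAFT = 2   # assumption: twin-engine aircraft
FLIGHTS_PER_DAY = 5_000    # assumption: stands in for "many thousands of flights"


def daily_volume_tb(flights, hours, engines, tb_per_half_hour):
    """Return the total data volume in terabytes for one day of flights."""
    half_hours = hours * 2
    return flights * engines * half_hours * tb_per_half_hour


total_tb = daily_volume_tb(
    FLIGHTS_PER_DAY, FLIGHT_HOURS, ENGINES_PER_AIRCRAFT, TB_PER_HALF_HOUR
)
total_pb = total_tb / 1000  # decimal units: 1 PB = 1000 TB

print(f"{total_tb} TB/day = {total_pb} PB/day")
```

With these assumed inputs the estimate comes to hundreds of petabytes per day, which is consistent with the "many petabytes" stated in the example.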

APPLICATIONS OF BIGDATA:

Big data involves the data produced by different devices and applications. Given below are some of the
fields that come under the umbrella of Big Data.
• Black Box Data: The black box is a component of helicopters, airplanes, and jets. It captures the
voices of the flight crew, recordings from microphones and earphones, and the performance
information of the aircraft.
• Social Media Data: Social media sites such as Facebook and Twitter hold information and
views posted by millions of people across the globe.
• Stock Exchange Data: Stock exchange data holds information about the ‘buy’ and
‘sell’ decisions that customers make on shares of different companies.
• Power Grid Data: Power grid data holds information about the power consumed by a particular
node with respect to a base station.
• Transport Data: Transport data includes the model, capacity, distance, and availability of a
vehicle.
• Search Engine Data: Search engines retrieve large amounts of data from different databases.

Types Of Big Data

Big Data thus involves huge volume, high velocity, and an extensible variety of data. The data is of
three types.
• Structured data: Relational data.
• Semi-structured data: XML data.
• Unstructured data: Word, PDF, text, media logs.

Structured Data + Unstructured Data + Semi-structured Data = Big Data

Structured Data:
Any data that can be stored, accessed, and processed in a fixed format is termed 'structured' data.
Over time, computer science has become very successful at developing techniques for working with
this kind of data (where the format is well known in advance) and at deriving value from it.
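
As a small illustration of a fixed format, the sketch below stores records in a relational table using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and are not taken from these notes.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# The schema fixes the format in advance: every row has exactly these columns.
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL)"
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO employee (name, age) VALUES (?, ?)",
    [("Asha", 35), ("Ravi", 41), ("Meena", 29)],
)

# Because the format is known, the data can be queried directly.
rows = conn.execute(
    "SELECT name, age FROM employee WHERE age > ? ORDER BY age", (30,)
).fetchall()
print(rows)

conn.close()
```

Because every row follows the same schema, filtering and sorting need no parsing step, which is what makes structured data comparatively easy to process.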

Unstructured

Any data whose form or structure is unknown is classified as unstructured data. Besides being huge in
size, unstructured data poses multiple challenges when it is processed to derive value. A typical
example of unstructured data is a heterogeneous data source containing a combination of simple text
files, images, videos, etc. Nowadays organizations have a wealth of data available to them, but
unfortunately they do not know how to derive value from it because the data is in its raw,
unstructured form.

Examples of Unstructured Data

The output returned by 'Google Search'

Semi-structured

Semi-structured data can contain both forms of data. It looks structured in form, but it is not actually
defined by, for example, a table definition in a relational DBMS. An example of semi-structured data is
data represented in an XML file.

Examples of Semi-structured Data

Personal data stored in an XML file:

<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
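
The records above can be read with a standard XML parser even though no table definition exists. The following is a minimal sketch using Python's xml.etree.ElementTree; the wrapping <people> root element is an assumption added here because an XML document needs a single root, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The five records from the example, exactly as listed.
RECORDS = """
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
"""

# Assumption: wrap the records in one root element so they form valid XML.
root = ET.fromstring(f"<people>{RECORDS}</people>")

# The tags carry the structure, so each record becomes a dictionary.
people = [{child.tag: child.text for child in rec} for rec in root.findall("rec")]

# Values arrive as text; convert where a number is needed.
ages = [int(person["age"]) for person in people]
average_age = sum(ages) / len(ages)

print(len(people), "records, average age", average_age)
```

The tags describe each value, but nothing forces every record to carry the same fields, which is why such data counts as semi-structured rather than structured.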

Elements of Big Data (or) The 4 Vs of Big Data (or) Characteristics of Big Data:

Volume – The name Big Data itself relates to enormous size. Volume is the amount of data generated by
organizations or individuals, and the size of the data plays a crucial role in determining the value
that can be derived from it. Many large organizations now handle data measured in petabytes, and some
in exabytes.

For example, 571 new websites are reported to be created every minute, and Google is reported to own
over 900,000 servers, one of the largest server fleets in the world.

Hence, 'Volume' is one characteristic that must be considered when dealing with Big Data.

Variety – The next aspect of Big Data is its variety.

Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. In
earlier days, spreadsheets and databases were the only sources of data considered by most
applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio,
etc. is also considered in analysis applications. This variety of unstructured data poses certain
issues for storing, mining, and analyzing data.

Velocity – The term 'velocity' refers to the speed at which data is generated. How fast the data is
generated and processed to meet demand determines the real potential of the data.

Big Data velocity deals with the speed at which data flows in from sources such as business processes,
application logs, networks, social media sites, sensors, and mobile devices. The flow of data is
massive and continuous.

Veracity – Veracity generally refers to the uncertainty of data, i.e. whether the obtained data is
correct and consistent. Only data that is correct and consistent can be used for further analysis.
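
A veracity check can be sketched as a filter that keeps only the records passing basic correctness and consistency rules. The rules below (a non-empty name and an integer age between 0 and 120) and the sample records are assumptions made for illustration; real checks depend on the data set.

```python
def is_valid(record):
    """Return True if a record passes basic correctness and consistency checks."""
    age = record.get("age")
    # Assumed rule: age must be present, an integer, and in a plausible range.
    if not isinstance(age, int) or not 0 <= age <= 120:
        return False
    # Assumed rule: name must be present and non-empty.
    if not record.get("name"):
        return False
    return True


# Hypothetical raw records, some of them incorrect or incomplete.
raw_records = [
    {"name": "Anita", "age": 35},   # valid
    {"name": "", "age": 41},        # missing name
    {"name": "Kiran", "age": -3},   # impossible age
    {"name": "Dev"},                # age not recorded
]

clean_records = [record for record in raw_records if is_valid(record)]
print(clean_records)
```

Only the first record survives the filter; the rejected ones would need correction at the source before they could be used for analysis.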

Importance of Bigdata:

1. Comprehend Market Conditions –

Through big data, organisations can predict future customer behaviour, such as purchasing
patterns, choices, and product preferences. This gives the company leverage and helps it compete
against rivals.

2. Know your Customer Better –

Through big data analysis, companies come to know the general thought process and feedback of
customers in advance and can make course corrections. Companies can reduce complaints by acting on
an issue before it becomes big. There are big data tools that predict negative emotions, so
organisations can take prompt action to mitigate them.

3. Control Online Reputation – 

Sentiment analysis can be done with big data tools; thus a company can check what is being said
online, and by whom, and manage its online image efficiently and effectively.

4. Cost Saving – 

Firstly, there may be an initial cost to adopting big data tools, but in the long run the benefits
will outweigh the cost. Secondly, with real-time big data tools in place, the IT staff will be
less burdened, so these resources can be used elsewhere. Lastly, big data technology
makes storing data easier and more accurate.

5. Availability of Data – 

Through big data tools, relevant data can be made available in an accurate and structured format, in
real time.

Big data analytics:

Big Data Analytics largely involves collecting data from different sources, munging it so that it
can be consumed by analysts, and finally delivering data products that are useful to the
organization's business. The process of converting large amounts of unstructured raw data, retrieved
from different sources, into a data product useful for organizations forms the core of Big Data Analytics.
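
The three stages described above (collect, munge, deliver) can be sketched as three small functions. This is a toy illustration only: the function names, the two sources, and the sample values are all hypothetical, and real pipelines run on distributed tools rather than in-memory lists.

```python
from collections import Counter


def collect():
    """Stand-in for gathering raw data from several sources (hypothetical samples)."""
    web_logs = ["  Laptop ", "phone", "LAPTOP"]
    store_sales = ["Phone", "", "tablet"]
    return web_logs + store_sales


def munge(raw_items):
    """Clean the raw data: trim whitespace, normalize case, drop empty entries."""
    return [item.strip().lower() for item in raw_items if item.strip()]


def deliver(clean_items):
    """Build a simple data product: product counts, most frequent first."""
    return Counter(clean_items).most_common()


report = deliver(munge(collect()))
print(report)
```

Each stage takes the output of the previous one, so a stage can be replaced (for example, a different data source or a different report) without touching the others.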

Big data Applications:


Big Data can help in transforming major business processes by proper and correct analysis of available
data. Such business processes include:
Sales:
Big data helps increase sales for the business. It also helps optimize the assignment of sales
resources and accounts, the product mix, and other operations.
Store Operations:
Different tools can be used to monitor store operations, which reduces manual work. Big data helps in
adjusting inventory levels on the basis of predicted buying patterns, demographic studies, weather,
key events, and other factors.
Banking:
Big Data has given companies like Citibank a major opportunity to see the big picture while
balancing the sensitive nature of the data: delivering value to clients while prioritizing the
privacy and protection of information. It has been fully adopted by many companies to drive business
growth and enhance the services they provide to customers.
Finance sector:
Financial services firms have widely adopted big data analytics to inform better investment decisions
with consistent returns. Within the last year, big data in financial services has swung from a passing
fad to large deployments.
Telecom:
A recent report, “Global Big Data Analytics Market in Telecom Industry 2014-2018,” found that the use
of data analytics tools in the telecom sector is expected to grow at a compound annual growth rate of
28.28 percent over the next four years. Mobile Telecom harnesses Big Data with a combined Actuate
and Hadoop solution.
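
To see what the quoted compound annual growth rate implies, the short calculation below compounds 28.28 percent over four years. Only the rate and the period come from the report cited above; the starting value of 100 is an arbitrary index used for illustration, not a market size.

```python
CAGR = 0.2828   # from the report: 28.28 percent per year
YEARS = 4       # from the report: "over the next four years"
BASE = 100.0    # assumption: arbitrary starting index, not a real market size


def compound(base, rate, years):
    """Return the value after growing `base` at `rate` per year for `years` years."""
    return base * (1 + rate) ** years


# Print the index year by year to show the compounding effect.
for year in range(YEARS + 1):
    print(year, round(compound(BASE, CAGR, year), 1))

growth_factor = compound(1.0, CAGR, YEARS)
```

Compounded over four years, the rate corresponds to a growth factor of roughly 2.7, so the market would be expected to more than double over the period.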
Retail sector:
Retailers harness Big Data to offer consumers personalized shopping experiences. Analyzing how a
customer came to make a purchase, or the path to purchase, is one way big data technology is making a
mark in retail. 66% of retailers have made financial gains in customer relationship management through
big data.
HealthCare:
Big data is used for analyzing data in the electronic medical record (EMR) system with the goal of
reducing costs and improving patient care. This data includes the unstructured data from physician
notes, pathology reports, etc. Big Data and healthcare analytics have the power to help predict,
prevent, and cure diseases.
Media and Entertainment:
Big data is changing the media and entertainment industry, giving users and viewers a much more
personalized and enriched experience. Big data is used for increasing revenues, understanding real-time
customer sentiment, and improving marketing effectiveness, ratings, and viewership.
Tourism:
Big data is transforming the global tourism industry. People know more about the world than ever
before. People have much more detailed itineraries these days with the help of Big data.
Airlines:
Big Data and Analytics give wings to the Aviation Industry. An airline now knows where a plane is
headed, where a passenger is sitting, and what a passenger is viewing on the IFE or connectivity system.
Social Media:
Big data is a driving factor behind every marketing decision made by social media companies and it is
driving personalization to the extreme.
