AMR Assignment


What is Data?

The quantities, characters, or symbols on which operations are
performed by a computer, which may be stored and transmitted in the
form of electrical signals and recorded on magnetic, optical, or
mechanical recording media.

What is Big Data?


Big Data is data of very large size. The term describes a collection of
data that is huge in volume and still growing exponentially with time.
In short, such data is so large and complex that none of the
traditional data management tools can store or process it efficiently.
Big data is also a field that treats ways to analyze, systematically
extract information from, or otherwise deal with data sets that are too
large or complex to be dealt with by traditional data-processing
application software. Data with many cases (rows) offers greater
statistical power, while data with higher complexity (more attributes
or columns) may lead to a higher false discovery rate.[2] Big data
challenges include capturing data, data storage, data analysis, search,
sharing, transfer, visualization, querying, updating, information
privacy, and data source. Big data was originally associated with three
key concepts: volume, variety, and velocity.
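
The link between more attributes and more spurious findings can be
illustrated with a small calculation. The sketch below is not from the
original text: it assumes each attribute is tested independently at a
significance level of 0.05 and computes the chance of at least one
false positive, a closely related quantity to the false discovery rate
mentioned above. The function name is hypothetical.

```python
def prob_any_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every test is independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# The chance of a spurious "discovery" grows quickly with the number
# of attributes (columns) that are tested.
for num_attributes in (1, 10, 100):
    print(num_attributes, round(prob_any_false_positive(num_attributes), 3))
```

With a single attribute the chance equals the significance level; with
many attributes it approaches certainty, which is why wide data sets
call for multiple-comparison corrections.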

Types Of Big Data


Big Data can be found in three forms:

1. Structured
2. Unstructured
3. Semi-structured

Structured
Any data that can be stored, accessed, and processed in a fixed format
is termed 'structured' data. Over time, computer science has become
very successful at developing techniques for working with this kind of
data (where the format is well known in advance) and at deriving value
from it. However, issues are now emerging as such data grows to a huge
extent, with typical sizes in the range of multiple zettabytes.

Examples Of Structured Data

An 'Employee' table in a database is an example of structured data:

Employee_ID  Employee_Name    Gender  Department  Salary_In_lacs
2365         Rajesh Kulkarni  Male    Finance     650000
3398         Pratibha Joshi   Female  Admin       650000
7465         Shushil Roy      Male    Admin       500000
7500         Shubhojit Das    Male    Finance     500000
7699         Priya Sane       Female  Finance     550000
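
Because the format of structured data is known in advance, it can be
queried declaratively. The following is a minimal sketch, assuming the
Employee table above is loaded into an in-memory SQLite database. The
lowercase table and column names (including the plain `salary` column)
are illustrative choices, not part of the original example.

```python
import sqlite3

# In-memory database; the schema mirrors the Employee table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE employee (
           employee_id   INTEGER PRIMARY KEY,
           employee_name TEXT,
           gender        TEXT,
           department    TEXT,
           salary        INTEGER
       )"""
)
rows = [
    (2365, "Rajesh Kulkarni", "Male", "Finance", 650000),
    (3398, "Pratibha Joshi", "Female", "Admin", 650000),
    (7465, "Shushil Roy", "Male", "Admin", 500000),
    (7500, "Shubhojit Das", "Male", "Finance", 500000),
    (7699, "Priya Sane", "Female", "Finance", 550000),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", rows)

# A fixed schema lets the database aggregate by column directly.
avg_salary_by_dept = dict(
    conn.execute(
        "SELECT department, AVG(salary) FROM employee GROUP BY department"
    ).fetchall()
)
print(avg_salary_by_dept)
```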

Unstructured
Any data with an unknown form or structure is classified as
unstructured data. In addition to its huge size, unstructured data
poses multiple challenges when it is processed to derive value from it.
A typical example of unstructured data is a heterogeneous data source
containing a combination of simple text files, images, videos, etc.
Nowadays, organizations have a wealth of data available to them, but
unfortunately they do not know how to derive value from it because the
data is in its raw form or in an unstructured format.

Examples Of Unstructured Data

The output returned by 'Google Search'

Semi-structured
Semi-structured data can contain both forms of data. It appears
structured in form, but it is not actually defined by, for example, a
table definition in a relational DBMS. An example of semi-structured
data is data represented in an XML file.

Examples Of Semi-structured Data

Personal data stored in an XML file:

<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
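
The tags in the records above carry their own structure even though no
table definition exists. As a minimal sketch, the records can be read
with Python's standard `xml.etree.ElementTree` module. The sample has no
single root element, so the code wraps it in a hypothetical `<people>`
root before parsing; that wrapper is an assumption, not part of the
original file.

```python
import xml.etree.ElementTree as ET

records_xml = """
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
"""

# XML needs exactly one root element, so wrap the records (assumed name).
root = ET.fromstring(f"<people>{records_xml}</people>")

# Turn each <rec> element into a plain dictionary.
people = [
    {
        "name": rec.findtext("name"),
        "sex": rec.findtext("sex"),
        "age": int(rec.findtext("age")),
    }
    for rec in root.findall("rec")
]

oldest = max(people, key=lambda person: person["age"])
print(oldest["name"], oldest["age"])
```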

Characteristics Of Big Data


(i) Volume – The name Big Data itself refers to an enormous size. The
size of data plays a crucial role in determining the value that can be
derived from it. Whether a particular data set can be considered Big
Data at all also depends on its volume. Hence, 'Volume' is one
characteristic that needs to be considered when dealing with Big Data.

(ii) Variety – The next aspect of Big Data is its variety. Variety
refers to heterogeneous sources and the nature of data, both structured
and unstructured. In earlier days, spreadsheets and databases were the
only sources of data considered by most applications. Nowadays, data in
the form of emails, photos, videos, monitoring devices, PDFs, audio,
etc. is also considered in analysis applications. This variety of
unstructured data poses certain issues for storing, mining, and
analyzing data.

(iii) Velocity – The term 'velocity' refers to the speed at which data
is generated. How fast the data is generated and processed to meet
demand determines the real potential of the data.

Big Data velocity deals with the speed at which data flows in from
sources such as business processes, application logs, networks, social
media sites, sensors, mobile devices, etc. The flow of data is massive
and continuous.
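
One simple way to put a number on velocity is to count how many events
arrived within a recent time window. The sketch below is illustrative
only: the `RateMeter` class, its window length, and the sample
timestamps are all hypothetical, and it assumes timestamps arrive in
non-decreasing order.

```python
from collections import deque


class RateMeter:
    """Tracks the event rate over the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def _evict(self, now: float) -> None:
        # Drop events that have fallen out of the window (now - window, now].
        while self._timestamps and self._timestamps[0] <= now - self.window_seconds:
            self._timestamps.popleft()

    def record(self, timestamp: float) -> None:
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def rate(self, now: float) -> float:
        """Events per second within the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps) / self.window_seconds


meter = RateMeter(window_seconds=10.0)
for t in (0.0, 1.0, 2.0, 9.0, 11.0, 12.0):
    meter.record(t)
current_rate = meter.rate(now=12.0)
print(current_rate)
```

Keeping only the timestamps inside the window bounds memory use, which
matters when the incoming flow never stops.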

(iv) Variability – This refers to the inconsistency that the data can
show at times, which hampers the ability to handle and manage the data
effectively.

APPLICATION OF BIG DATA IN THE INDUSTRY

1. Banking and Securities

Industry-Specific big data challenges

A study of 16 projects in 10 top investment and retail banks shows that
the challenges in this industry include: securities fraud early
warning, tick analytics, card fraud detection, archival of audit
trails, enterprise credit risk reporting, trade visibility, customer
data transformation, social analytics for trading, IT operations
analytics, and IT policy compliance analytics, among others.

Applications of big data in the banking and securities industry

The Securities and Exchange Commission (SEC) is using big data to
monitor financial market activity. It currently uses network analytics
and natural language processors to catch illegal trading activity in
the financial markets.

Retail traders, big banks, hedge funds, and other so-called 'big boys'
in the financial markets use big data for trade analytics in
high-frequency trading, pre-trade decision-support analytics, sentiment
measurement, predictive analytics, etc.

This industry also relies heavily on big data for risk analytics,
including anti-money laundering, demand enterprise risk management,
"Know Your Customer", and fraud mitigation.

Big Data providers specific to this industry include: 1010data,
Panopticon Software, Streambase Systems, Nice Actimize and Quartet FS.

2. Communications, Media and Entertainment


Industry-Specific big data challenges

Since consumers expect rich media on demand in different formats and on
a variety of devices, some big data challenges in the communications,
media and entertainment industry include:

• Collecting, analyzing, and utilizing consumer insights
• Leveraging mobile and social media content
• Understanding patterns of real-time media content usage

Applications of big data in the communications, media and
entertainment industry

Organizations in this industry simultaneously analyze customer data
along with behavioral data to create detailed customer profiles that
can be used to:

• Create content for different target audiences
• Recommend content on demand
• Measure content performance

A case in point is the Wimbledon Championships, which leverages big
data to deliver detailed sentiment analysis on the tennis matches to
TV, mobile, and web users in real time.

Spotify, an on-demand music service, uses Hadoop big data analytics to
collect data from its millions of users worldwide and then uses the
analyzed data to give informed music recommendations to individual
users.

Amazon Prime, which is driven to provide a great customer experience by
offering video, music, and Kindle books in a one-stop shop, also
utilizes big data heavily.

Big Data providers in this industry include: Infochimps, Splunk,
Pervasive Software, and Visible Measures.

3. Healthcare Providers

Industry-Specific challenges

The healthcare sector has access to huge amounts of data but has been
plagued by failures to use that data to curb rising healthcare costs
and by inefficient systems that stifle faster and better healthcare
benefits across the board.

This is mainly because electronic data is unavailable, inadequate, or
unusable. Additionally, the healthcare databases that hold
health-related information have made it difficult to link data that can
show patterns useful in the medical field.

Other challenges related to big data include the exclusion of patients
from the decision-making process and the use of data from different
readily available sensors.

Applications of big data in the healthcare sector:

Some hospitals, like Beth Israel, are using data collected from a cell
phone app, from millions of patients, to allow doctors to use
evidence-based medicine as opposed to administering several medical/lab
tests to all patients who go to the hospital. A battery of tests can be
efficient, but it can also be expensive and is usually ineffective.

Free public health data and Google Maps have been used by the
University of Florida to create visual data that allows for faster
identification and efficient analysis of healthcare information, used in
tracking the spread of chronic disease.

Obamacare has also utilized big data in a variety of ways.

Big Data providers in this industry include: Recombinant Data,
Humedica, Explorys and Cerner.

4. Education

Industry-Specific big data challenges

From a technical point of view, a major challenge in the education
industry is to integrate big data from different sources, platforms,
and vendors that were not designed to work with one another, and to
utilize it on platforms that were not designed for the varying data.

From a practical point of view, staff and institutions have to learn
the new data management and analysis tools.

Politically, issues of privacy and personal data protection associated
with big data used for educational purposes are a challenge.

Applications of big data in education:

Big data is used quite significantly in higher education. For example,
the University of Tasmania, an Australian university with over 26000
students, has deployed a Learning and Management System that tracks,
among other things, when a student logs onto the system, how much time
is spent on different pages in the system, and the overall progress of
a student over time.
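
The kind of tracking described above can be sketched as a simple
aggregation over page-view events. This is a minimal illustration, not
a description of the university's actual system: the event layout
`(student_id, page, seconds_spent)` and the sample values are
hypothetical.

```python
from collections import defaultdict

# Hypothetical page-view events: (student_id, page, seconds_spent).
events = [
    ("s1", "lecture-1", 300),
    ("s1", "quiz-1", 120),
    ("s2", "lecture-1", 45),
    ("s1", "lecture-1", 60),
    ("s2", "quiz-1", 200),
]

# Time each student spent on each page.
time_per_student_page = defaultdict(int)
for student_id, page, seconds in events:
    time_per_student_page[(student_id, page)] += seconds

# Total time each student spent in the system.
total_per_student = defaultdict(int)
for (student_id, _page), seconds in time_per_student_page.items():
    total_per_student[student_id] += seconds

print(dict(time_per_student_page))
print(dict(total_per_student))
```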

In a different use case, big data is also used to measure teacher
effectiveness to ensure a good experience for both students and
teachers. Teacher performance can be fine-tuned and measured against
student numbers, subject matter, student demographics, student
aspirations, behavioral classification, and several other variables.

5. Manufacturing and Natural Resources


Industry-Specific challenges

Increasing demand for natural resources, including oil, agricultural
products, minerals, gas, metals, and so on, has led to an increase in
the volume, complexity, and velocity of data that is a challenge to
handle.

Similarly, large volumes of data from the manufacturing industry are
untapped. The underutilization of this information prevents improved
quality of products, energy efficiency, reliability, and better profit
margins.

Applications of big data in manufacturing and natural resources

In the natural resources industry, big data allows for predictive
modeling to support decision making. It has been used to ingest and
integrate large amounts of geospatial data, graphical data, text, and
temporal data. Areas of interest where this has been used include
seismic interpretation and reservoir characterization. Big data has
also been used to solve today's manufacturing challenges and to gain
competitive advantage, among other benefits.

6. Government

Industry-Specific challenges

In governments, the biggest challenges are the integration and
interoperability of big data across different government departments
and affiliated organizations.

Applications of big data in Government

In public services, big data has a very wide range of applications,
including energy exploration, financial market analysis, fraud
detection, health-related research, and environmental protection.

Some more specific examples are as follows:

Big data is being used to analyze the large number of disability claims
made to the Social Security Administration (SSA) that arrive in the
form of unstructured data. The analytics are used to process medical
information rapidly and efficiently for faster decision making and to
detect suspicious or fraudulent claims.

The Food and Drug Administration (FDA) is using big data to detect and
study patterns of food-related illnesses and diseases. This allows for
a faster response, which has led to faster treatment and fewer deaths.

The Department of Homeland Security uses big data for several different
use cases. Big data is analyzed from different government agencies and is
used to protect the country.

7. Insurance

Applications of big data in the insurance industry

Big data has been used in the industry to provide customer insights for
transparent and simpler products, by analyzing and predicting customer
behavior through data derived from social media, GPS-enabled devices,
and CCTV footage. Big data also allows insurance companies to retain
customers better.

When it comes to claims management, predictive analytics from big data
has been used to offer faster service, since massive amounts of data
can be analyzed, especially in the underwriting stage. Fraud detection
has also been enhanced.

Through massive data from digital channels and social media, real-time
monitoring of claims throughout the claims cycle has been used to
provide insights.

8. Retail and Wholesale Trade


Industry-Specific challenges

From traditional brick-and-mortar retailers and wholesalers to
current-day e-commerce traders, the industry has gathered a lot of data
over time. This data, derived from customer loyalty cards, POS
scanners, RFID, etc., is not being used enough to improve customer
experiences on the whole. Any changes and improvements made have been
quite slow.

Applications of big data in the Retail and Wholesale industry

Big data from customer loyalty data, POS, store inventory, and local
demographics data continues to be gathered by retail and wholesale
stores.

At New York's Big Show retail trade conference in 2014, companies like
Microsoft, Cisco and IBM pitched the need for the retail industry to
utilize big data for analytics and for other uses including:

• Optimized staffing through data from shopping patterns, local events,
and so on
• Reduced fraud
• Timely analysis of inventory

Social media also has a lot of potential and continues to be adopted
slowly but surely, especially by brick-and-mortar stores. Social media
is used for customer prospecting, customer retention, promotion of
products, and more.

9. Transportation

Industry-Specific challenges

In recent times, huge amounts of data from location-based social
networks and high-speed data from telecoms have affected travel
behavior. Regrettably, research to understand travel behavior has not
progressed as quickly. In most places, transport demand models are
still based on poorly understood new social media structures.

Applications of big data in the transportation industry:

Some applications of big data by governments, private organizations
and individuals include:

• Government use of big data: traffic control, route planning,
intelligent transport systems, congestion management (by
predicting traffic conditions)
• Private sector use of big data in transport: revenue management,
technological enhancements, logistics and for competitive
advantage (by consolidating shipments and optimizing freight
movement)
• Individual use of big data: route planning to save on fuel
and time, for travel arrangements in tourism etc.
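
Route planning, mentioned in the list above, is commonly framed as a
shortest-path problem. The sketch below uses Dijkstra's algorithm as
one well-known technique; the source does not say which method any
particular planner uses. The road network, place names, and travel
times in minutes are hypothetical.

```python
import heapq


def shortest_travel_time(graph, start, goal):
    """Return the minimum total travel time from start to goal, or None."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # Stale queue entry; a cheaper path was already found.
        for neighbor, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None  # Goal is unreachable from start.


# Hypothetical one-way road network: travel time in minutes per segment.
roads = {
    "home": {"junction": 5, "highway": 12},
    "junction": {"highway": 4, "office": 15},
    "highway": {"office": 6},
}

print(shortest_travel_time(roads, "home", "office"))
```

In practice the edge weights would come from predicted traffic
conditions rather than fixed values, which is where large volumes of
live data become useful.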
