Introduction to Hadoop: What Hadoop is and why it is important in today's data-driven world.

Monika Rajput

Senior Data Engineer | CNCF Ambassador | Organizer @CNCF Gurugram | Public Speaker | Women in Business

Published May 12, 2023

What is Hadoop?

In today's data-driven world, enormous volumes of data are generated every second, and storing and analyzing all of that structured and unstructured data keeps getting harder. Hadoop is an open-source framework that addresses this problem by storing and processing large datasets across clusters of machines.

Hadoop has two main layers:

Storage: HDFS (Hadoop Distributed File System)
Data processing: MapReduce

HDFS

HDFS is the storage layer for big data. Because it spreads data across many machines, it can hold far more data than any single machine could.

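Suppose you have 4 TB of data to store. HDFS splits the data into smaller pieces, for example four pieces of 1 TB each, and stores them on different machines in the cluster. (In practice, HDFS splits files into much smaller blocks, 128 MB by default, and keeps several copies of each block on different machines for fault tolerance.)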

[Image: HDFS]
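To make the 4 TB example more concrete, here is a minimal Python sketch of how a file could be cut into fixed-size blocks and how copies of each block could be spread over machines. It is only a simulation: the function names and the DataNode names (dn1, dn2, ...) are made up for illustration, and the round-robin placement is an assumption, not the placement policy HDFS actually uses. The 128 MB block size and the replication factor of 3 are the usual HDFS defaults.

```python
BLOCK_SIZE = 128 * 1024**2  # 128 MB, the usual HDFS default block size
REPLICATION = 3             # the usual HDFS default replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file of file_size bytes occupies."""
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_blocks(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes.

    Simple round-robin placement, for illustration only (hypothetical policy).
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


four_tb = 4 * 1024**4
blocks = split_into_blocks(four_tb)
print(len(blocks))  # number of 128 MB blocks in 4 TB

placement = place_blocks(4, ["dn1", "dn2", "dn3", "dn4"])
print(placement[0])  # nodes holding the copies of block 0
```

Because every block lives on more than one machine in this sketch, losing a single machine does not lose any data, which is the idea behind replication in HDFS.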

MapReduce

MapReduce is the layer that processes large amounts of data. Instead of running on one machine, a MapReduce job runs on many machines, or nodes, at once. Our 4 TB of data can be processed in chunks on multiple DataNodes at the same time. When the processing on the individual nodes is done, the partial results are aggregated to give the final output.
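The idea of processing chunks separately and then aggregating the results can be sketched with a small word count in plain Python. This is an in-process simulation under simplifying assumptions, not the Hadoop MapReduce API: the function names are hypothetical, and the chunks are handled one after another here, whereas on a real cluster each chunk would be mapped on a different node in parallel.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group the emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: aggregate the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a real cluster, each map_phase call would run on a different node.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


chunks = ["big data big cluster", "data node data"]
counts = word_count(chunks)
print(counts["data"])  # total occurrences of "data" across both chunks
```

Each chunk is mapped without knowing anything about the other chunks, which is what lets the work be spread over many machines; only the shuffle and reduce steps bring the partial results together.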
