Monday, June 30, 2025

MapReduce

 


2004 -> Google MapReduce (GMR) paper -> Hadoop MapReduce (MR)

2003 -> Google File System (GFS) paper -> HDFS


Hadoop
|
|--> MR (processing)
|
|--> HDFS (storage)

What is MR?

MR is a batch processing framework.

Spark is also a processing framework.



Language used in MapReduce: Java by default

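Advantages of MR:

> Framework: like a gated community, where everything else is managed by the admin team. Here the framework takes care of:

> Cluster monitoring

> Resource allocation

> Cluster management

> Scheduling

> Execution

> Speculative execution (a slow task is launched again on another node, and whichever copy finishes first is used)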


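Aim of MR:
Data locality (run the computation on the node that already holds the data, instead of moving the data to the computation)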


Use of MR:

Parallel processing 
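
To make the idea of parallel processing concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes the data is already divided into blocks and that each block can be processed independently. The class name `ParallelBlocksSketch` and its methods are made up for illustration and are not part of any Hadoop API; a thread pool stands in for the nodes of a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: threads stand in for cluster nodes.
public class ParallelBlocksSketch {

    // Work done independently on one block: count its non-empty records.
    static long countRecords(List<String> block) {
        long count = 0;
        for (String record : block) {
            if (!record.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Submit one task per block, then add up the partial counts.
    static long countInParallel(List<List<String>> blocks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, blocks.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (List<String> block : blocks) {
                Callable<Long> task = () -> countRecords(block);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // wait for this block's result
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("block processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> blocks = List.of(
                List.of("r1", "r2"),
                List.of("r3"),
                List.of("r4", "r5", "r6"));
        System.out.println(countInParallel(blocks)); // prints 6
    }
}
```

In a real cluster the framework, not the programmer, decides where each task runs and collects the partial results. The sketch only shows why independent blocks make that possible.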


Abstractions over MR:
Hive, Pig, Sqoop, and Oozie will work only when MR is running


Alternative to MR:
Spark



Daemons in MR:

JobTracker and TaskTracker - V1 (Hadoop 1)

ResourceManager & NodeManager - V2 (Hadoop 2, YARN)


MAP 


REDUCE 

Inputs for Mapper: Blocks (strictly, input splits, which by default correspond to HDFS blocks)

Inpu
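
The MAP and REDUCE steps above can be sketched with the classic word count, written here in plain Java so it runs without a Hadoop installation. This is a simplified single-machine model, not the Hadoop `Mapper`/`Reducer` API: the class `WordCountSketch` and its `map`, `shuffle`, `reduce`, and `run` methods are hypothetical names chosen for illustration, and each input line stands in for a record read from a block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine model of the map -> shuffle -> reduce flow.
public class WordCountSketch {

    // MAP: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // SHUFFLE: group all values that share the same key (sorted by key).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // REDUCE: collapse the list of values for one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Run the three phases over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=2, node=1}
        System.out.println(run(List.of("big data big cluster", "data node")));
    }
}
```

In a real job the map calls would run in parallel on the nodes holding each block, and the framework would perform the shuffle across the network before the reducers start.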




