
Learn By Example: Hadoop, MapReduce for Big Data problems

By: Loonycorn (a 4-person team; ex-Google)

  • Rating: 4 (16)
  • Duration: 13:19:00
  • Language: English
Price: 449 (list price: 4000)

  • 15-day money-back guarantee
  • Unlimited access
  • Android, iPhone and iPad access
  • Certificate of Completion

Course Summary

Taught by a 4-person team including 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience working with Java and with billions of rows of data.

This course is a zoom-in, zoom-out, hands-on workout involving Hadoop, MapReduce and the art of thinking parallel.

Target Audience

What is the target audience?

  • Yep! Analysts who want to leverage the power of HDFS where traditional databases don't cut it anymore
  • Yep! Engineers who want to develop complex distributed computing applications to process lots of data
  • Yep! Data Scientists who want to add MapReduce to their bag of tricks for processing data

Pre-Requisites

What are the requirements?

  • You'll need an IDE where you can write Java code or open the source code that's shared. IntelliJ and Eclipse are both great options.
  • You'll need some background in Object-Oriented Programming, preferably in Java. All the source code is in Java, and we dive right in without going into Objects, Classes, etc.
  • A bit of exposure to Linux/Unix shells would be helpful, but it won't be a blocker

Curriculum

  • Downloads for Sec 1
  • You, this course and Us
    01:52
  • DOWNLOAD SECTION 2- WhyBigData
  • The Big Data Paradigm
    14:20
  • Serial vs Distributed Computing
    08:37
  • What is Hadoop?
    07:25
  • HDFS or the Hadoop Distributed File System
    11:00
  • MapReduce Introduced
    11:39
  • YARN or Yet Another Resource Negotiator
    04:00
  • Hadoop Install Modes
    08:32
  • Set up a Virtual Linux Instance (for Windows users)
  • Hadoop Standalone mode Install
    15:46
  • Hadoop Pseudo-Distributed mode Install
    11:44
  • DOWNLOAD SECTION 4-MR-IntroSimpleWordCount
  • DOWNLOAD SECTION 4- SourceCode
  • The basic philosophy underlying MapReduce
    08:49
  • MapReduce - Visualized And Explained
    09:03
  • MapReduce - Digging a little deeper at every step
    10:21
  • "Hello World" in MapReduce
    10:29
  • The Mapper
    09:48
  • The Reducer
    07:46
  • The Job
    12:27
  • Get comfortable with HDFS
    10:58
  • Run your first MapReduce Job
    14:30
  • DOWNLOAD SECTION 6-MR-CombinerStreamingAPIMultipleReduceShuffleSort
  • Parallelize the reduce phase - use the Combiner
    14:39
  • Not all Reducers are Combiners
    14:31
  • How many mappers and reducers does your MapReduce have?
    08:23
  • Parallelizing reduce using Shuffle And Sort
    14:55
  • MapReduce is not limited to the Java language - Introducing the Streaming API
    05:05
  • Python for MapReduce
    12:19
  • DOWNLOAD SECTION 7-HDFS
  • DOWNLOAD SECTION 7-YARN
  • HDFS - Protecting against data loss using replication
    15:38
  • HDFS - Name nodes and why they're critical
    06:54
  • HDFS - Checkpointing to backup name node information
    11:16
  • YARN - Basic components
    08:39
  • YARN - Submitting a job to YARN
    13:16
  • YARN - Plug in scheduling policies
    14:27
  • YARN - Configure the scheduler
    12:32
  • Manually configuring a Hadoop cluster (Linux VMs)
    13:50
  • Getting started with Amazon Web Services
    06:25
  • Start a Hadoop Cluster with Cloudera Manager on AWS
    13:04
  • DOWNLOAD SECTION 9-Customizing-MR
  • Setting up your MapReduce to accept command line arguments
    13:47
  • The Tool, ToolRunner and GenericOptionsParser
    12:35
  • Configuring properties of the Job object
    10:41
  • Customizing the Partitioner, Sort Comparator, and Group Comparator
    15:16
  • DOWNLOAD SECTION 10-MR-InvertedIndex-WritableInterface-Bigram-MRUnit
  • The heart of search engines - The Inverted Index
    14:47
  • Generating the inverted index using MapReduce
    10:31
  • Custom data types for keys - The Writable Interface
    10:29
  • Represent a Bigram using a WritableComparable (see the sketch after the curriculum)
    13:19
  • MapReduce to count the Bigrams in input text
    08:32
  • Test your MapReduce job using MRUnit
    13:47
  • DOWNLOAD SECTION 11-Formats-And-Sorting
  • Introducing the File Input Format
    12:48
  • Text And Sequence File Formats
    10:21
  • Data partitioning using a custom partitioner
    07:11
  • Make the custom partitioner real in code (see the sketch after the curriculum)
    10:25
  • Total Order Partitioning
    10:10
  • Input Sampling, Distribution, Partitioning and configuring these
    09:04
  • Secondary Sort
    14:34
  • DOWNLOAD SECTION 12-MR-CollaborativeFiltering-Recommendations
  • Introduction to Collaborative Filtering
    07:25
  • Friend recommendations using chained MR jobs
    17:15
  • Get common friends for every pair of users - the first MapReduce
    14:50
  • Top 10 friend recommendation for every user - the second MapReduce
    13:46
  • DOWNLOAD SECTION 13-MR-Databases-Select-Grouping
  • Structured data in Hadoop
    14:08
  • Running an SQL Select with MapReduce
    15:31
  • Running an SQL Group By with MapReduce
    14:02
  • A MapReduce Join - The Map Side
    14:19
  • A MapReduce Join - The Reduce Side
    13:07
  • A MapReduce Join - Sorting and Partitioning
    08:49
  • A MapReduce Join - Putting it all together
    13:46
  • DOWNLOAD SECTION 14-MR-Kmeans-Algo
  • What is K-Means Clustering?
    14:04
  • A MapReduce job for K-Means Clustering
    16:33
  • K-Means Clustering - Measuring the distance between points
    13:52
  • K-Means Clustering - Custom Writables for Input/Output
    08:26
  • K-Means Clustering - Configuring the Job
    10:49
  • K-Means Clustering - The Mapper and Reducer
    11:23
  • K-Means Clustering : The Iterative MapReduce Job
    03:39
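
For a feel for what the "Hello World" lectures build up to (The Mapper, The Reducer, The Job, and later the Combiner), here is a minimal WordCount sketch in Java against Hadoop's standard org.apache.hadoop.mapreduce API. This is an illustrative outline, not the course's source code; the class names and the reuse of the reducer as a combiner are our own choices.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // The Mapper: emits (word, 1) for every token in its input split.
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // The Reducer: sums the counts for each word. Because summing is
        // associative and commutative, the same class can serve as the Combiner.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        // The Job: wires the Mapper, Combiner and Reducer together.
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Packaged into a jar, this runs as "hadoop jar wordcount.jar WordCount <input-dir> <output-dir>", where both paths are HDFS locations and placeholders here.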
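
The Section 10 lectures on the Writable interface culminate in representing a bigram as a MapReduce key. A rough sketch of what such a WritableComparable involves; field names and serialization choices here are ours, not the course's:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    // A pair of adjacent words, usable as a MapReduce key.
    public class Bigram implements WritableComparable<Bigram> {
        private String first = "";
        private String second = "";

        public Bigram() {}  // Hadoop requires a no-arg constructor

        public Bigram(String first, String second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(first);   // serialize both words...
            out.writeUTF(second);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            first = in.readUTF();  // ...and deserialize them in the same order
            second = in.readUTF();
        }

        @Override
        public int compareTo(Bigram other) {  // defines the sort order of keys
            int cmp = first.compareTo(other.first);
            return (cmp != 0) ? cmp : second.compareTo(other.second);
        }

        @Override
        public int hashCode() {  // used by the default HashPartitioner
            return first.hashCode() * 163 + second.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Bigram)) return false;
            Bigram b = (Bigram) o;
            return first.equals(b.first) && second.equals(b.second);
        }

        @Override
        public String toString() {
            return first + " " + second;
        }
    }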
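
Likewise, the custom-partitioner lectures in Section 11 revolve around one small class: a Partitioner decides which reduce task each intermediate key is routed to. A minimal sketch, under the hypothetical rule that words sharing a first letter should land on the same reducer:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Hypothetical example: route keys by their first character, so that
    // all words starting with the same letter reach the same reducer.
    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            String s = key.toString();
            char c = s.isEmpty() ? '\0' : s.charAt(0);
            return (c & Integer.MAX_VALUE) % numPartitions;  // non-negative bucket
        }
    }

It is plugged in on the Job object with job.setPartitionerClass(FirstLetterPartitioner.class), alongside job.setNumReduceTasks(...) to control how many partitions exist.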

About the Author

Loonycorn (a 4-person team; ex-Google)

Loonycorn is us: Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad and the IITs, and have spent years (decades, actually) working in tech in the Bay Area, New York, Singapore and Bangalore.

  • Janani: 7 years at Google (New York, Singapore); studied at Stanford; also worked at Flipkart and Microsoft
  • Vitthal: also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too
  • Swetha: early Flipkart employee; IIM Ahmedabad and IIT Madras alum
  • Navdeep: longtime Flipkart employee too, and IIT Guwahati alum

We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Unanth! We hope you will try our offerings, and think you'll like them :-)

Reviews

Kavitha J
5

Very good course; concepts are explained in detail. In some places the pace is fast, but it's manageable with the attached documents. I recommend this course.

Sunil K
5

I really enjoyed this course; it is very informative and simple to understand. It is up to date on each and every Hadoop concept. I learned a lot from this course.

Vibinson Victoria
5

Covered all the concepts through pictures, and easy to understand. Well done, team. Thanks a lot, guys.

Amit Soni
1

The information is very abstract and can be found in other tutorials. There should be in-depth information on the why and how; otherwise all your effort watching these tutorials is a waste of time and money. I joined this course to understand the internals, which I didn't get, and I wasted my money.
