From the course: Analyzing Big Data with Hive
Joining multiple tables together
- [Narrator] Let's take a look at joining multiple tables together. Hive joins are cumulative, meaning they are executed in sequential order, and each join operates on the result of the one before it. Rows that an earlier join filters out are not available to later joins, so you need to be really cognizant of how you order your joins so you don't get unexpected results. Let's take a look at a simple query here. We have SELECT a.val, b.val, c.val FROM a JOIN b and then LEFT OUTER JOIN c. What's going to happen is it's going to join a to b and discard anything that's unmatched, because we're doing an inner join. Then the results of the join from a to b are going to be joined to c. Because that second join is a left outer join, we keep everything from the first join and only bring in the rows from c that match, with NULL for c.val where there is no match. So remember, these are cumulative: an inner join can shrink the result set as we go down, while a left outer join keeps every row from the join before it. Let's take a look at this in real life. First, we're going…
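The query described in the transcript can be tried out on toy data. The following is a minimal sketch, not the instructor's exercise file: it uses Python's built-in sqlite3 module as a stand-in for Hive, since inner and left outer joins follow the same standard SQL semantics in both. The `id` join column, the `ON` conditions, and the sample rows are assumptions made for illustration; the transcript only names the tables `a`, `b`, `c` and the `val` column.

```python
import sqlite3

# SQLite stands in for Hive here (assumption); the join semantics are standard SQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: the `id` column and all rows are made up.
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, val TEXT);
    CREATE TABLE c (id INTEGER, val TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO b VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO c VALUES (1, 'c1');
""")

# Step 1: the inner join alone. Row id=3 in `a` has no match in `b`, so it is dropped.
ab_count = conn.execute(
    "SELECT COUNT(*) FROM a JOIN b ON a.id = b.id"
).fetchone()[0]

# Step 2: the full query. The left outer join keeps every row from step 1
# and fills c.val with NULL (None in Python) where `c` has no match.
rows = conn.execute("""
    SELECT a.val, b.val, c.val
    FROM a
    JOIN b ON a.id = b.id
    LEFT OUTER JOIN c ON a.id = c.id
    ORDER BY a.id
""").fetchall()

print("rows after a JOIN b:", ab_count)
for row in rows:
    print(row)
```

With this sample data the inner join removes the unmatched row from `a`, and the left outer join then returns the same number of rows as the first join, with a NULL in place of `c.val` for the row that has no match in `c`.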