From the course: Data Engineering Foundations

Unlock the full course today

Join today to access over 24,100 courses taught by industry experts.

Solution: Transforming data

Solution: Transforming data

(upbeat music) - [Tutor] So here's the solution to the challenge. The first function that you are going to use to group all the ratings is group by. Then you would further add the column on which you want to group. So that is movie underscore id. To calculate the average rating, you would use the mean function. The column name is rating. Then in order to merge the two data frames we would have to provide the common column, which is .id, and the movies underscore data fame and movie underscore id, and average rating data frame. Finally, we would have to print the final data frame which is df.show. And that is it. We would have the transformed data frame. And this is the final data frame that you would see.

Contents