Assignment Mod 10
ASSIGNMENT: MODULE 10 ALGORITHM DESIGN
Input format: The input file has 12 columns with the following headings:
1. event_epoch_time
2. user_id
3. device_id
4. user_agent
5. pizza_name
6. isCheeseBurst
7. Size
8. AddedToppings (colon separated string)
9. Price
10. CouponCode
11. Order_Event
12. isVeg
1. Map-only algorithm that filters out all records in which event_epoch_time, user_id, device_id or user_agent is NULL, taking the original dataset as input.

/* Here Row[1], Row[2], Row[3], ... denote the data in the columns event_epoch_time, user_id,
 * device_id and so on, with the indexes as shown in the list above. */
Map(Key, Value)
    Row = split(Value, "\t")    // "\t" denotes a tab
    IF (Row[1] != NULL AND Row[2] != NULL AND Row[3] != NULL AND Row[4] != NULL)
        Write(Row)
    EXIT Map Function
/* This function outputs every valid Row; together the rows form a 2D array of the data from the table.
 * From here onwards, Map(Key, Row) denotes that the output of this function is taken as
 * input for the map function. */
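The filter above can be sketched in Python as a plain generator. This is only an illustration, not a Hadoop job: the function name `filter_map`, the set of NULL spellings and the two sample records are assumptions made for the example, and the column positions are zero-based where the pseudocode counts from 1.

```python
from typing import Iterable, Iterator, List

# Zero-based positions of event_epoch_time, user_id, device_id, user_agent
# (Row[1]..Row[4] in the pseudocode).
REQUIRED_COLUMNS = (0, 1, 2, 3)
# Assumption: these spellings mark a missing value in the raw file.
NULL_MARKERS = {"", "NULL", "null"}


def filter_map(lines: Iterable[str]) -> Iterator[List[str]]:
    """Map-only step: yield rows whose four required fields are all present."""
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) < 12:
            continue  # malformed record with too few columns
        if all(row[i].strip() not in NULL_MARKERS for i in REQUIRED_COLUMNS):
            yield row


# Hypothetical sample records; the second one has a NULL user_id.
sample = [
    "1500000000\tu1\td1\tAndroid:7.0\tMargherita\tN\tM\tOlives:Corn\t350\tNONE\tDelivered\tY",
    "1500000001\tNULL\td2\tiOS:10.3\tPepperoni\tY\tL\t\t650\tNONE\tDelivered\tN",
]
clean_rows = list(filter_map(sample))
print(len(clean_rows))  # prints: 1
```

Because no aggregation is needed, there is no reduce step: each record is kept or dropped independently of the others.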
2. An algorithm to read the user agent and extract the OS version and platform from it.

Map(Key, Row)
    // Takes the output of question 1 as input, one row at a time from the 2D array, so the input is a 1D array.
    OS_P = split(Row[4], ":")
    OS_version = OS_P[2]    // Assuming array indexes start from 1 instead of 0
    Platform = OS_P[1]
    Write(OS_version, 1)
    Write(Platform, 1)
    // This outputs the OS version and platform from user_agent
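A minimal Python sketch of the same extraction follows. It keeps the report's assumption that user_agent is a colon-separated string with the platform first and the OS version second. The helper `make_row` and the sample user-agent strings are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

USER_AGENT = 3  # zero-based position of user_agent (Row[4] in the pseudocode)


def user_agent_map(rows: Iterable[List[str]]) -> Iterator[Tuple[str, int]]:
    """Emit (os_version, 1) and (platform, 1) for every parsable user agent."""
    for row in rows:
        parts = row[USER_AGENT].split(":")
        if len(parts) < 2:
            continue  # user agent does not follow the assumed "platform:version" layout
        platform, os_version = parts[0], parts[1]
        yield os_version, 1
        yield platform, 1


def make_row(user_agent: str) -> List[str]:
    """Build a 12-column row with only the user_agent field filled in."""
    row = [""] * 12
    row[USER_AGENT] = user_agent
    return row


pairs = list(user_agent_map([make_row("Android:7.0"), make_row("iOS:10.3"), make_row("garbled")]))
print(pairs)  # [('7.0', 1), ('Android', 1), ('10.3', 1), ('iOS', 1)]
```

The length check matters in practice: a user agent without a colon would otherwise raise an index error and fail the whole map task.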
3. getCounter("Orders") creates a global variable of the same name if it is not already available.
3.1 To find out the number of veg and non-veg pizzas sold.

Map(Key, Row)    // Taking input from output of question 1
    getCounter("Veg")
    getCounter("Non-Veg")
    IF (Row[12] == "Y")
        getCounter("Veg").incrementBy(1)
    ELSE IF (Row[12] == "N")
        getCounter("Non-Veg").incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function
PRINT Veg
PRINT Non-Veg
/* The PRINT statements display the total number of Veg/Non-Veg pizzas sold, since Veg and
 * Non-Veg are global variables. */
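The counter-based approach can be imitated in Python with `collections.Counter` standing in for the job-wide counters. This is a single-process sketch under that assumption; in a real cluster the framework, not a module-level variable, would aggregate the counters across mappers. `veg_map` and `make_row` are illustrative names.

```python
from collections import Counter
from typing import List

IS_VEG = 11  # zero-based position of isVeg (Row[12] in the pseudocode)

# Stand-in for the global counters that getCounter() would create.
counters: Counter = Counter()


def veg_map(row: List[str]) -> None:
    """Increment the Veg or Non-Veg counter; rows with any other flag are skipped."""
    flag = row[IS_VEG]
    if flag == "Y":
        counters["Veg"] += 1
    elif flag == "N":
        counters["Non-Veg"] += 1


def make_row(is_veg: str) -> List[str]:
    """Build a 12-column row with only the isVeg field filled in."""
    row = [""] * 12
    row[IS_VEG] = is_veg
    return row


for record in (make_row("Y"), make_row("N"), make_row("Y"), make_row("?")):
    veg_map(record)

print(counters["Veg"], counters["Non-Veg"])  # prints: 2 1
```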
3.2 To find out the size-wise distribution of pizzas sold.

Map(Key, Row)    // Taking input from output of question 1
    getCounter("Small")
    getCounter("Medium")
    getCounter("Large")
    getCounter("Total")
    IF (Row[7] == "R")
        getCounter("Small").incrementBy(1)
        getCounter("Total").incrementBy(1)
    ELSE IF (Row[7] == "M")
        getCounter("Medium").incrementBy(1)
        getCounter("Total").incrementBy(1)
    ELSE IF (Row[7] == "L")
        getCounter("Large").incrementBy(1)
        getCounter("Total").incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function
// Total, Small, Medium and Large are global variables
Total = Small + Medium + Large
Distribution_small = (Small / Total) * 100
Distribution_medium = (Medium / Total) * 100
Distribution_large = (Large / Total) * 100
PRINT Distribution_small, Distribution_medium, Distribution_large
// Prints the size-wise distribution as a percentage of the total pizzas sold
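The percentage step can be checked with a short Python sketch. It assumes the size codes used in the pseudocode above ("R" counted as Small, "M" and "L" as Medium and Large) and adds a guard for an empty dataset, which the pseudocode does not handle. The function names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable

# Size codes as used in the pseudocode above ("R" is counted as Small).
SIZE_NAMES = {"R": "Small", "M": "Medium", "L": "Large"}


def count_sizes(size_codes: Iterable[str]) -> Counter:
    """Count pizzas per size, ignoring unknown size codes."""
    counts = Counter({name: 0 for name in SIZE_NAMES.values()})
    for code in size_codes:
        name = SIZE_NAMES.get(code)
        if name is not None:
            counts[name] += 1
    return counts


def percentages(counts: Counter) -> Dict[str, float]:
    """Convert per-size counts to percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}  # avoid division by zero
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = percentages(count_sizes(["R", "M", "M", "L", "XL"]))
print(shares)  # {'Small': 25.0, 'Medium': 50.0, 'Large': 25.0}
```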
3.3 To find out how many cheese burst pizzas were sold.

Map(Key, Row)    // Taking input from output of question 1
    getCounter("Cheese_Burst_Total")
    IF (Row[6] == "Y")
        getCounter("Cheese_Burst_Total").incrementBy(1)
    ELSE
        EXIT Map Function
    EXIT Map Function
PRINT Cheese_Burst_Total
3.4 To find out how many small cheese burst pizzas were sold.

Map(Key, Row)    // Taking input from output of question 1
    getCounter("Cheese_Burst_Small")
    IF (Row[6] == "Y" AND Row[7] == "R")
        getCounter("Cheese_Burst_Small").incrementBy(1)
    EXIT Map Function
PRINT Cheese_Burst_Small
// Ideally Cheese_Burst_Small will be 0, as cheese burst is available only for medium and
// large sizes. If there is a data entry error, it will show up here.
3.5 To find out the number of cheese burst pizzas whose cost is below Rs. 500.

Map(Key, Row)    // Taking input from output of question 1
    getCounter("Cheese_Burst_Cheap")
    IF (Row[6] == "Y" AND Row[9] < 500)
        getCounter("Cheese_Burst_Cheap").incrementBy(1)
    EXIT Map Function
PRINT Cheese_Burst_Cheap    // Prints the number of cheese burst pizzas sold below Rs. 500
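Since questions 3.3 to 3.5 all test the same cheese burst flag, the three counters could be filled in a single pass. The Python sketch below shows one way to do that. It is a swapped-in variant, not the three separate map functions given above, and it assumes that Price arrives as a numeric string. `cheese_burst_counters` and `make_row` are hypothetical names.

```python
from collections import Counter
from typing import Iterable, List

# Zero-based positions of isCheeseBurst, Size and Price (Row[6], Row[7], Row[9]).
IS_CHEESE_BURST, SIZE, PRICE = 5, 6, 8


def cheese_burst_counters(rows: Iterable[List[str]]) -> Counter:
    """Fill the three cheese burst counters in one pass over the rows."""
    counters: Counter = Counter()
    for row in rows:
        if row[IS_CHEESE_BURST] != "Y":
            continue
        counters["Cheese_Burst_Total"] += 1
        if row[SIZE] == "R":
            counters["Cheese_Burst_Small"] += 1
        try:
            price = float(row[PRICE])
        except ValueError:
            continue  # non-numeric price: counted in the total, not in the price check
        if price < 500:
            counters["Cheese_Burst_Cheap"] += 1
    return counters


def make_row(cheese_burst: str, size: str, price: str) -> List[str]:
    """Build a 12-column row with only the three relevant fields filled in."""
    row = [""] * 12
    row[IS_CHEESE_BURST], row[SIZE], row[PRICE] = cheese_burst, size, price
    return row


result = cheese_burst_counters([
    make_row("Y", "M", "450"),
    make_row("Y", "L", "650"),
    make_row("N", "R", "200"),
    make_row("Y", "R", "abc"),
])
print(result["Cheese_Burst_Total"], result["Cheese_Burst_Small"], result["Cheese_Burst_Cheap"])  # prints: 3 1 1
```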
4. Assume the getCounter("Orders") function is not available, and write the algorithms for the functions in question 3.
4.1 To find out the number of veg and non-veg pizzas sold.

Map(Key, Row)    // Taking input from output of question 1
    IF (Row[12] == "Y")
        Pizza_type = "Veg"
    ELSE IF (Row[12] == "N")
        Pizza_type = "Non-Veg"
    ELSE
        EXIT Map Function
    Write(Pizza_type, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking aggregated output of Map function as input
    Pizza_count = 0
    for i = 1 to ValueList.length
        Pizza_count = Pizza_count + ValueList[i]
    Write(Key, Pizza_count)
    EXIT Reduce Function
// Output will be the number of Veg/Non-Veg pizzas sold
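The map, shuffle and reduce phases can be simulated in a few lines of Python to check the logic on a small sample. This is a single-process sketch: `shuffle` imitates the grouping that a MapReduce framework performs between the two phases, and all function names here are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

IS_VEG = 11  # zero-based position of isVeg (Row[12] in the pseudocode)
LABELS = {"Y": "Veg", "N": "Non-Veg"}


def veg_map(row: List[str]) -> Iterator[Tuple[str, int]]:
    """Emit (label, 1) for rows with a recognised isVeg flag."""
    label = LABELS.get(row[IS_VEG])
    if label is not None:
        yield label, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def count_reduce(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum the values for one key."""
    return key, sum(values)


def run_job(rows: Iterable[List[str]]) -> Dict[str, int]:
    pairs = (pair for row in rows for pair in veg_map(row))
    return dict(count_reduce(k, vs) for k, vs in shuffle(pairs).items())


def make_row(is_veg: str) -> List[str]:
    row = [""] * 12
    row[IS_VEG] = is_veg
    return row


totals = run_job([make_row("Y"), make_row("N"), make_row("Y"), make_row("?")])
print(totals)  # {'Veg': 2, 'Non-Veg': 1}
```

Summing the values rather than counting list entries keeps the reducer correct even if a combiner has already pre-aggregated some of the ones.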
4.2 To find out the size-wise distribution of pizzas sold.

Map(Key, Row)    // Taking input from output of question 1
    IF (Row[7] == "R")
        Pizza_size = "Regular"
    ELSE IF (Row[7] == "M")
        Pizza_size = "Medium"
    ELSE IF (Row[7] == "L")
        Pizza_size = "Large"
    ELSE
        EXIT Map Function
    Write(Pizza_size, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking aggregated output of Map function as input
    Size_count = 0
    for i = 1 to ValueList.length
        Size_count = Size_count + ValueList[i]
    Write(Key, Size_count)
    EXIT Reduce Function

Distribution(Key_list, Size_count_list)    // Taking the output of Reduce function as input
    Regular = 0, Medium = 0, Large = 0    // Regular, Medium and Large are integer variables
    for i = 1 to Key_list.length
        IF (Key_list[i] == "Regular")
            Regular = Size_count_list[i]
        IF (Key_list[i] == "Medium")
            Medium = Size_count_list[i]
        IF (Key_list[i] == "Large")
            Large = Size_count_list[i]
    Total = Regular + Medium + Large
    Distribution_small = (Regular / Total) * 100
    Distribution_medium = (Medium / Total) * 100
    Distribution_large = (Large / Total) * 100
    PRINT Distribution_small, Distribution_medium, Distribution_large
    // Prints the size-wise distribution as a percentage of the total pizzas sold
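One detail of the Distribution step deserves a closer look: a size that never occurs produces no reducer output at all, so its count has to default to zero. A hedged Python sketch of that post-processing step follows. It assumes the reducer output is available as a list of (key, count) pairs. The function name `distribution` and the sample pairs are illustrative.

```python
from typing import Dict, Iterable, Sequence, Tuple


def distribution(
    reduce_output: Iterable[Tuple[str, int]],
    sizes: Sequence[str] = ("Regular", "Medium", "Large"),
) -> Dict[str, float]:
    """Turn reducer (size, count) pairs into percentages of the total."""
    counts = dict.fromkeys(sizes, 0)  # sizes missing from the output stay at 0
    for key, count in reduce_output:
        if key in counts:
            counts[key] += count
    total = sum(counts.values())
    if total == 0:
        return {size: 0.0 for size in sizes}  # avoid division by zero
    return {size: 100.0 * counts[size] / total for size in sizes}


# Hypothetical reducer output in which no Large pizza was sold.
shares = distribution([("Regular", 3), ("Medium", 1)])
print(shares)  # {'Regular': 75.0, 'Medium': 25.0, 'Large': 0.0}
```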
4.3 To find out how many cheese burst pizzas were sold.

Map(Key, Row)    // Taking input from output of question 1
    IF (Row[6] == "Y")
        Crust = "Cheese_burst"
    ELSE
        Crust = "other"
    Write(Crust, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking aggregated output of Map function as input
    CB_count = 0
    for i = 1 to ValueList.length
        CB_count = CB_count + ValueList[i]
    IF (Key == "Cheese_burst")
        Write(Key, CB_count)    // Output will be the total number of cheese burst pizzas sold
    ELSE
        Return -1               // No output for any other key
    EXIT Reduce Function
4.4 To find out how many small cheese burst pizzas were sold.

Map(Key, Row)    // Taking input from output of question 1
    IF (Row[6] == "Y" AND Row[7] == "R")
        CB_Pizza_Size = "Cheese burst Small"
    ELSE
        CB_Pizza_Size = "Other"
    Write(CB_Pizza_Size, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking aggregated output of Map function as input
    CB_size_count = 0
    for i = 1 to ValueList.length
        CB_size_count = CB_size_count + ValueList[i]
    IF (Key == "Cheese burst Small")
        Write(Key, CB_size_count)
    ELSE
        Return -1
    EXIT Reduce Function
// Ideally there will be no small cheese burst pizzas, as cheese burst is available only for medium
// and large sizes. In that case the Map function never writes the "Cheese burst Small" key, so the
// Reduce function produces no output. If there is an error in the dataset, it will show up here.
4.5 To find out the number of cheese burst pizzas whose cost is below Rs. 500.

Map(Key, Row)    // Taking input from output of question 1
    IF (Row[6] == "Y" AND Row[9] < 500)
        CB_cheap = "Cheese burst Price < 500"
    ELSE
        CB_cheap = "Other"
    Write(CB_cheap, 1)
    EXIT Map Function

Reduce(Key, ValueList)    // Taking aggregated output of Map function as input
    CB_cheap_count = 0
    for i = 1 to ValueList.length
        CB_cheap_count = CB_cheap_count + ValueList[i]
    IF (Key == "Cheese burst Price < 500")
        Write(Key, CB_cheap_count)    // Output will be ("Cheese burst Price < 500", CB_cheap_count)
    ELSE
        Return -1                     // Else no output
    EXIT Reduce Function
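An alternative to discarding the unwanted key in the reducer is to emit nothing for non-matching rows in the mapper, which also reduces the data sent through the shuffle. The Python sketch below illustrates that variant for question 4.5. It is not the algorithm as written above, and it assumes Price is a numeric string. All names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Zero-based positions of isCheeseBurst and Price (Row[6] and Row[9]).
IS_CHEESE_BURST, PRICE = 5, 8
CHEAP_KEY = "Cheese burst Price < 500"


def cheap_cb_map(row: List[str]) -> Iterator[Tuple[str, int]]:
    """Emit (CHEAP_KEY, 1) only for cheese burst pizzas priced below 500."""
    if row[IS_CHEESE_BURST] != "Y":
        return
    try:
        price = float(row[PRICE])
    except ValueError:
        return  # non-numeric price: skip the record
    if price < 500:
        yield CHEAP_KEY, 1


def sum_reduce(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum the values for one key."""
    return key, sum(values)


def run_job(rows: Iterable[List[str]]) -> Dict[str, int]:
    grouped: Dict[str, List[int]] = defaultdict(list)
    for row in rows:
        for key, value in cheap_cb_map(row):
            grouped[key].append(value)
    return dict(sum_reduce(k, vs) for k, vs in grouped.items())


def make_row(cheese_burst: str, price: str) -> List[str]:
    row = [""] * 12
    row[IS_CHEESE_BURST], row[PRICE] = cheese_burst, price
    return row


cheap = run_job([make_row("Y", "450"), make_row("Y", "499.5"), make_row("Y", "500"), make_row("N", "100")])
print(cheap)  # {'Cheese burst Price < 500': 2}
```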
Submitted By: Animesh Anand STUDENT ID: 2017CBDE037