Retailing Analysis Using Hadoop and Apache Hive

Hiba A. Abu-Alsaad (Mustansiriyah University, Iraq)

Convenience is an important factor in people's daily activities. This holds both for consumers of goods and services and for the retailers who provide them. The retail industry took its first steps in the early 20th century across most of Europe and America. In the second half of the century, however, supermarkets and hypermarkets (superstores that combine a supermarket and a department store) grew rapidly, because they offered customers a convenient one-stop shopping experience. This growth created a huge influx of data in retail stores, which store owners find challenging to interpret with traditional business intelligence tools. There was therefore a need for real-time analytic tools that can handle large datasets, up to terabytes in size. As a result, Hadoop-based query tools such as Hive and Pig became prominent in the analysis of customer data, helping to ensure a continued convenient experience for customers and quality service provision by retailers. This paper reviews how Hive has proved effective for the retail industry.

Journal: International Journal of Simulation: Systems, Science and Technology (IJSSST), Vol. 20

Published: Feb 27, 2019

DOI: 10.5013/IJSSST.a.20.01.08