Hive is an open-source, petabyte-scale data warehouse framework that facilitates reading, writing, and managing large datasets residing in distributed storage such as HDFS (Hadoop Distributed File System) and compatible object stores such as Amazon S3.
Hive originally compiled queries into MapReduce jobs, but most modern deployments run on Tez, a DAG-based execution engine that is architecturally similar to Spark. Hive supports analysis using HiveQL, a SQL-like language, and inherits Hadoop's benefits, such as scalability, redundancy, and the ability to handle large datasets.
Created at Facebook and open-sourced in 2008 to provide an accessible way to query the company's massive volume of user-generated data, Hive is one of the oldest and most mature SQL-on-Hadoop engines available. That maturity makes it a common choice for organizations that prioritize stability in a SQL-on-Hadoop engine.
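
To make the HiveQL and Tez points above more concrete, here is a minimal sketch of what a short Hive session might look like, held as Python strings so it can be handed to whatever client is in use. The table name `page_views`, its columns, and the `s3a://example-bucket/...` path are hypothetical and chosen only for illustration. The URI scheme for S3 also varies by distribution, so treat `s3a://` as an assumption.

```python
# Hypothetical table name and storage location, chosen only for illustration.
TABLE = "page_views"
LOCATION = "s3a://example-bucket/page_views/"

# HiveQL statements, in the order a session might run them.
STATEMENTS = [
    # Ask Hive to plan the queries below on Tez rather than MapReduce.
    "SET hive.execution.engine=tez",
    # An external table maps a schema onto files that already exist in storage.
    f"""CREATE EXTERNAL TABLE IF NOT EXISTS {TABLE} (
    user_id BIGINT,
    url STRING,
    view_time TIMESTAMP
)
STORED AS PARQUET
LOCATION '{LOCATION}'""",
    # A SQL-style aggregation; Hive compiles it into distributed tasks.
    f"""SELECT url, COUNT(*) AS views
FROM {TABLE}
GROUP BY url
ORDER BY views DESC
LIMIT 10""",
]


def as_script(statements):
    """Join statements into one semicolon-terminated HiveQL script."""
    return "\n\n".join(s.strip() + ";" for s in statements)


if __name__ == "__main__":
    print(as_script(STATEMENTS))
```

The printed script could be saved to a file and submitted through a Hive client such as Beeline (for example with its `-f` option), assuming a reachable HiveServer2 instance. Note that the query itself reads like ordinary SQL, while the choice of execution engine is a session setting.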