Hive Create Database
In Hive, a database is simply a logical container. It doesn’t store data directly—it stores tables, views, and functions. For example, you could have databases named sales_db, employee_db, or inventory_db to group related tables. Hive databases also allow you to manage permissions, locations, and metadata easily.
Why Do We Need to Create a Database in Hive?
Creating a database in Hive is not mandatory—you can create tables without it too. But using databases brings a lot of benefits:
- It helps organize your data into logical groups.
- We can apply security policies at the database level.
- Each database can have its own warehouse location in HDFS.
- It keeps your queries cleaner by using database_name.table_name format.
For example, instead of just saying orders, we could say sales_db.orders, which makes it clear that this table belongs to the sales database.
Syntax -
Hive provides a simple command to create a database. Here's the basic syntax:
CREATE DATABASE database_name;
Here’s an enhanced syntax:
CREATE DATABASE IF NOT EXISTS database_name
COMMENT 'This is the sales database'
LOCATION '/user/hive/warehouse/sales_db'
WITH DBPROPERTIES ('created_by'='Pawan Kumar', 'created_on'='2025-07-04');
Examples -
Scenario1: Create a Simple Database
CREATE DATABASE employee_db;
This creates a database called employee_db with default settings.
Scenario2: Create a Database Only If It Doesn’t Exist
CREATE DATABASE IF NOT EXISTS finance_db;
This ensures that if finance_db already exists, Hive won’t throw an error.
Scenario3: Create a Database with a Custom Location
CREATE DATABASE hr_db
LOCATION '/user/hive/warehouse/hr_db';
This stores the database in the specified folder in HDFS.
Scenario4: Create a database for HR with properties:
CREATE DATABASE hr_db
COMMENT 'Human Resources Data'
WITH DBPROPERTIES ('created_by' = 'Admin', 'created_on' = '2025-07-04');
Now the Sales team and HR team can store their tables separately.
How to Check Databases in Hive?
After creating databases, you can view them using:
SHOW DATABASES;
This lists all the databases available in Hive.
To see detailed information about a specific database:
DESCRIBE DATABASE EXTENDED marketing_db;
This shows the database location, owner, and properties.