Hive Data Types

Data types in Hive define the kind of values that can be stored in each column of a table. For example:

  1. A name should be stored as text (STRING).
  2. A price should be stored as a number (DOUBLE).
  3. A date of birth should be stored as a DATE.

Choosing the right data type ensures that your queries work correctly and that Hive can store your data efficiently.

Categories of Data Types -

Hive offers three broad categories of data types:

  • Primitive Data Types
  • Collection Data Types
  • Complex Data Types

Primitive Data Types -

These are the basic building blocks of data in Hive. They store simple values like numbers, text, or dates.

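| Data Type    | Description                            | Length               | Range / Example                                       |
|--------------|----------------------------------------|----------------------|-------------------------------------------------------|
| TINYINT      | Very small integer                     | 1 byte               | -128 to 127                                           |
| SMALLINT     | Small integer                          | 2 bytes              | -32768 to 32767                                       |
| INT          | Regular integer                        | 4 bytes              | -2147483648 to 2147483647                             |
| BIGINT       | Large integer                          | 8 bytes              | -9223372036854775808 to 9223372036854775807           |
| FLOAT        | Single-precision floating-point number | 4 bytes              | 1.40129846432481707e-45 to 3.40282346638528860e+38    |
| DOUBLE       | Double-precision floating-point number | 8 bytes              | 4.94065645841246544e-324 to 1.79769313486231570e+308  |
| DECIMAL(p,s) | Precise decimal values                 | Precision p, scale s | DECIMAL(10,2) stores 12345.67                         |
| STRING       | Text values                            | Max 2 GB             | -                                                     |
| VARCHAR(n)   | Text up to n characters                | n                    | n from 1 to 65535                                     |
| CHAR(n)      | Fixed-length text                      | n (up to 255)        | CHAR(4) stores 'Test'                                 |
| BOOLEAN      | True or false value                    | -                    | TRUE or FALSE                                         |
| DATE         | Calendar date                          | -                    | 1400-01-01 to 9999-12-31                              |
| TIMESTAMP    | Date with time                         | -                    | 1400-01-01 00:00:00 to 9999-12-31 00:00:00            |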
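
As a minimal sketch of how several of these primitive types look in a table definition, the statement below declares a hypothetical order_data table. The table name and every column name are illustrative assumptions, not part of the original material.

```sql
-- Hypothetical table; all names here are illustrative assumptions.
CREATE TABLE order_data (
  order_id     BIGINT,         -- large whole numbers
  quantity     SMALLINT,       -- small whole numbers
  amount       DECIMAL(10,2),  -- exact values such as 12345.67
  country_code CHAR(2),        -- fixed-length text
  customer     VARCHAR(50),    -- text capped at 50 characters
  order_date   DATE,           -- calendar date only
  created_at   TIMESTAMP       -- date with time of day
);
```

DECIMAL is used for the amount here because FLOAT and DOUBLE store approximate values, which can introduce small rounding differences in monetary calculations.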

Here is the conversion table for all primitive data types:

  • "Y" means the source data type can be converted to the destination data type.
  • "N" means the source data type cannot be converted to the destination data type.

[Image: conversion table for primitive data types]
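
Conversions between primitive types can be requested explicitly with CAST, and some widening conversions happen implicitly. The following is a small sketch, assuming a Hive version that allows SELECT without a FROM clause (0.13 or later); the column aliases are made up for illustration.

```sql
-- Explicit conversion with CAST
SELECT
  CAST('123' AS INT)         AS str_to_int,   -- numeric string to integer
  CAST(42 AS STRING)         AS int_to_str,   -- integer to text
  CAST('2024-01-15' AS DATE) AS str_to_date,  -- ISO-formatted string to date
  CAST('abc' AS INT)         AS bad_cast;     -- not a number, so Hive returns NULL

-- Implicit widening: the INT operand is promoted to BIGINT before the addition
SELECT CAST(1 AS INT) + CAST(2 AS BIGINT) AS widened_sum;
```

A failed CAST returns NULL rather than raising an error, so it is worth checking for unexpected NULLs after converting untrusted text.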

Examples -

Scenario: Using different primitive types:

CREATE TABLE product_data (
  product_id INT,
  product_name STRING,
  price DOUBLE,
  available BOOLEAN
);

This stores product information, where each field has the right data type for its value.
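
To see the column types in use, here is a short sketch that loads two rows into product_data and filters on them. The sample values are invented for illustration, and the INSERT ... VALUES form assumes Hive 0.14 or later.

```sql
-- Sample rows (illustrative values only)
INSERT INTO TABLE product_data VALUES
  (1, 'Laptop', 899.99, true),
  (2, 'Mouse', 19.5, false);

-- Numeric and boolean columns can be compared directly because of their types
SELECT product_name, price
FROM product_data
WHERE available = true
  AND price < 1000;
```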

Collection Data Types -

Sometimes, you need to store multiple values inside one field. That’s where collection types come in. They allow you to group multiple pieces of information together.

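| Collection Type | Description                                     | Example                                        |
|-----------------|-------------------------------------------------|------------------------------------------------|
| Array           | Stores a list of values of the same type        | ARRAY<STRING> → ['Apple', 'Banana']            |
| Map             | Stores key-value pairs                          | MAP<STRING, INT> → {'Math': 90, 'English': 85} |
| Struct          | Stores multiple named fields as a single object | STRUCT<name:STRING, age:INT>                   |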

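Here is the conversion table for all collection data types (rows are the source type, columns are the destination type):

  • "Y" means the source data type can be converted to the destination data type.
  • "N" means the source data type cannot be converted to the destination data type.

| Source | Array | Map | Struct |
|--------|-------|-----|--------|
| Array  | Y     | N   | N      |
| Map    | N     | Y   | N      |
| Struct | N     | N   | Y      |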

Examples -

Scenario: Using ARRAY and MAP:

CREATE TABLE student_data (
  student_id INT,
  student_name STRING,
  subjects ARRAY<STRING>,
  marks MAP<STRING, INT>
);

This table can store multiple subjects and their marks for each student in one row.
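
The sketch below shows one way to populate and read these collection columns. Hive does not accept ARRAY or MAP values in an INSERT ... VALUES clause, so the row is built with the array() and map() constructor functions inside INSERT ... SELECT. The sample data is invented, and SELECT without a FROM clause is assumed to be supported by the Hive version in use.

```sql
-- Build the collection values with constructor functions (illustrative data)
INSERT INTO TABLE student_data
SELECT 1, 'Asha', array('Math', 'English'), map('Math', 90, 'English', 85);

-- Access array elements by zero-based index and map values by key
SELECT
  student_name,
  subjects[0]    AS first_subject,
  size(subjects) AS subject_count,
  marks['Math']  AS math_marks
FROM student_data;

-- Turn each array element into its own row
SELECT student_name, subject
FROM student_data
LATERAL VIEW explode(subjects) s AS subject;
```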

Complex Data Types -

Complex types are built by nesting collection types inside one another, for example an ARRAY whose elements are STRUCTs, to model more advanced structures.

Examples -

CREATE TABLE company_data (
  company_id INT,
  company_name STRING,
  employees ARRAY<STRUCT<emp_name:STRING, emp_salary:DOUBLE>>
);

This means each company can have an array of employees, and for each employee, you store both name and salary.
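
A nested column like this can be filled and queried along the following lines. This is a sketch with invented sample data; it uses the built-in named_struct() function to create each employee, and the explicit CAST to DOUBLE is there so the struct field types match the column definition exactly.

```sql
-- Insert one company with two employees (illustrative data)
INSERT INTO TABLE company_data
SELECT
  1,
  'Acme',
  array(
    named_struct('emp_name', 'Ravi', 'emp_salary', CAST(50000 AS DOUBLE)),
    named_struct('emp_name', 'Mina', 'emp_salary', CAST(62000 AS DOUBLE))
  );

-- Read a field of the first struct in the array
SELECT company_name, employees[0].emp_name AS first_employee
FROM company_data;

-- Flatten the array so that each employee becomes one row
SELECT company_name, emp.emp_name, emp.emp_salary
FROM company_data
LATERAL VIEW explode(employees) e AS emp;
```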