Back to Curriculum

Advanced Data Types

📚 Lesson 2 of 15 ⏱️ 60 min

Advanced Data Types

60 min

PostgreSQL extends standard SQL with a rich set of advanced data types that enable sophisticated data modeling and efficient storage. Beyond basic types like INTEGER, VARCHAR, and DATE, PostgreSQL provides JSON/JSONB for semi-structured data, arrays for storing lists, geometric types for spatial data, network types for IP addresses, and custom types for domain-specific needs. These advanced types eliminate the need for complex workarounds and enable more natural data representation.

JSON and JSONB are powerful types for storing semi-structured data. JSON stores data as text, preserving exact formatting and allowing duplicate keys, while JSONB stores data in a binary format that's more efficient for querying and indexing. JSONB automatically removes duplicate keys and whitespace, and preserves key ordering. For most use cases, JSONB is preferred due to its performance advantages and indexing capabilities. Both types support rich querying with operators like ->, ->>, @>, and ?.

Array types enable storing lists of values in a single column. PostgreSQL supports arrays of any built-in or user-defined type. Arrays are useful for storing tags, categories, or any ordered list of values. Array operations include indexing (array[1]), slicing (array[1:3]), concatenation (array1 || array2), and various array functions. Arrays can be multidimensional, enabling matrix-like storage. Understanding array operations enables efficient handling of list data without normalization.

Geometric types (point, line, box, polygon, circle) enable spatial data storage and operations. These types are essential for geographic information systems (GIS), mapping applications, and location-based services. PostgreSQL's PostGIS extension extends geometric capabilities with advanced spatial functions, coordinate system support, and spatial indexing. Geometric types enable queries like finding points within a radius or calculating distances between locations.

Network types (inet, cidr, macaddr) are specialized for network-related data. The inet type stores IPv4 and IPv6 addresses, cidr stores network addresses with subnet masks, and macaddr stores MAC addresses. These types provide network-specific operators and functions, making network-related queries more efficient and accurate than using text types. Network types are essential for network management applications and security systems.

Custom types (domains and composite types) enable you to create specialized types tailored to your application needs. Domains are based on existing types with added constraints, while composite types combine multiple fields into a single type. Custom types improve data integrity, code readability, and enable type-specific functions and operators. Understanding when and how to create custom types helps you model complex domains effectively.

Key Concepts

  • PostgreSQL supports advanced data types beyond standard SQL.
  • JSONB provides efficient storage and querying of semi-structured data.
  • Array types enable storing lists of values in single columns.
  • Geometric types support spatial data and operations.
  • Custom types enable domain-specific data modeling.

Learning Objectives

Master

  • Using JSON and JSONB for semi-structured data
  • Working with array types and operations
  • Understanding geometric and network types
  • Creating and using custom types

Develop

  • Understanding advanced data modeling techniques
  • Choosing appropriate data types for different use cases
  • Leveraging PostgreSQL's type system effectively

Tips

  • Prefer JSONB over JSON for better performance and indexing capabilities.
  • Use array types for ordered lists that don't need separate tables.
  • Create GIN indexes on JSONB columns for efficient JSON queries.
  • Use geometric types with PostGIS extension for advanced spatial operations.

Common Pitfalls

  • Using JSON when JSONB would be more efficient, causing performance issues.
  • Overusing arrays when normalization would be better, complicating queries.
  • Not indexing JSONB columns, causing slow JSON queries.
  • Not understanding array indexing (1-based), causing off-by-one errors.

Summary

  • PostgreSQL provides rich advanced data types beyond standard SQL.
  • JSONB enables efficient storage and querying of semi-structured data.
  • Arrays, geometric, and network types support specialized use cases.
  • Custom types enable domain-specific data modeling.

Exercise

Create tables with advanced PostgreSQL data types including JSON and arrays.

-- Create a table with JSON data
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    attributes JSONB,
    tags TEXT[]
);

-- Insert data with JSON and arrays
INSERT INTO products (name, price, attributes, tags) VALUES 
    ('Laptop', 999.99, 
     '{"brand": "Dell", "screen_size": 15.6, "ram": "16GB"}',
     ARRAY['electronics', 'computer', 'portable']),
    ('Smartphone', 699.99,
     '{"brand": "Apple", "storage": "128GB", "color": "black"}',
     ARRAY['electronics', 'mobile', 'communication']);

-- Query JSON data
SELECT name, attributes->>'brand' as brand, tags[1] as primary_tag
FROM products;

Exercise Tips

  • Query JSONB: profile->>'name' for text, profile->'nested'->>'key' for nested access.
  • Use array operators: ANY(array) = value, array @> ARRAY['tag1'], array && ARRAY['tag1'].
  • Create GIN index on JSONB: CREATE INDEX ON table USING gin(jsonb_column);
  • Try geometric types: CREATE TABLE locations (id SERIAL, position POINT);

Code Editor

Output