Junk Dimension

Garbage (Junk) Dimensions Lesson Notes


Junk Dimension

  • A junk dimension is a collection of miscellaneous transactional codes, flags, and text attributes. It is used to reduce the number of small dimensions built from low-cardinality columns in the dimensional model, and to reduce the number of columns in the fact table.

  • It saves space, because fact tables should not carry low-cardinality or text fields. A fact table should mainly contain measures, foreign keys, and degenerate dimension keys.

Junk Dimension Example

Example Without a Junk Dimension

Example With a Junk Dimension
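
The two cases can be sketched in a few lines of Python. This is only an illustration: the order facts, the column names (payment_type, order_status, is_gift) and the helper build_junk_dimension are hypothetical and do not come from the lesson. The sketch starts from a fact table that stores the low-cardinality attributes inline, moves each distinct combination into a junk dimension row with a surrogate key, and leaves only that key in the fact table. It registers only the combinations that actually occur in the data, which is one possible loading strategy.

```python
# Hypothetical order facts with three low-cardinality attributes stored inline
# (the "without a junk dimension" case).
fact_rows = [
    {"order_id": 1, "amount": 120.0, "payment_type": "Cash",
     "order_status": "Shipped", "is_gift": "Y"},
    {"order_id": 2, "amount": 75.5, "payment_type": "Card",
     "order_status": "Pending", "is_gift": "N"},
    {"order_id": 3, "amount": 40.0, "payment_type": "Cash",
     "order_status": "Shipped", "is_gift": "Y"},
]

# Assumed set of columns to move into the junk dimension.
JUNK_COLUMNS = ("payment_type", "order_status", "is_gift")


def build_junk_dimension(rows, junk_columns):
    """Return (junk_dimension, slim_fact_rows) for the given fact rows."""
    key_by_combo = {}  # combination of attribute values -> surrogate key
    slim_rows = []
    for row in rows:
        combo = tuple(row[col] for col in junk_columns)
        # Reuse the key if the combination was seen before, else assign the next one.
        junk_key = key_by_combo.setdefault(combo, len(key_by_combo) + 1)
        # Keep measures and other keys, replace the junk columns with one foreign key.
        slim = {k: v for k, v in row.items() if k not in junk_columns}
        slim["junk_key"] = junk_key
        slim_rows.append(slim)
    junk_dimension = [
        {"junk_key": key, **dict(zip(junk_columns, combo))}
        for combo, key in key_by_combo.items()
    ]
    return junk_dimension, slim_rows


junk_dimension, slim_fact_rows = build_junk_dimension(fact_rows, JUNK_COLUMNS)

for dim_row in junk_dimension:
    print(dim_row)
for fact_row in slim_fact_rows:
    print(fact_row)
```

In this sample, orders 1 and 3 share the same combination of values, so they point to the same junk dimension row, and each fact row carries one junk_key instead of three text columns.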

Junk Dimension Table Size

  • If the junk dimension grows too large over time, we must split it into several smaller dimensions.

  • The expected number of rows is easy to calculate: it is the number of possible combinations of the low-cardinality attributes, that is, the product of their cardinalities. For example, 3 columns with 3 values each give 3 * 3 * 3 = 27 rows.
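
The row count can be checked with a short Python sketch. The attribute names and values below are hypothetical. The sketch multiplies the cardinalities to get the expected size, then enumerates every combination with itertools.product to pre-populate the junk dimension. With three attributes of three values each, the product is 3 * 3 * 3 = 27.

```python
from itertools import product
from math import prod

# Hypothetical low-cardinality attributes, three values each.
attributes = {
    "payment_type": ["Cash", "Card", "Voucher"],
    "order_status": ["Pending", "Shipped", "Cancelled"],
    "order_channel": ["Web", "Store", "Phone"],
}

# Expected size: the product of the attribute cardinalities.
expected_rows = prod(len(values) for values in attributes.values())

# Pre-populate the junk dimension with one row per possible combination,
# numbering the rows with a surrogate key that starts at 1.
junk_rows = [
    {"junk_key": key, **dict(zip(attributes, combo))}
    for key, combo in enumerate(product(*attributes.values()), start=1)
]

print(expected_rows, len(junk_rows))
```

Because the size is a product, each added attribute multiplies the row count by its cardinality, which is why a junk dimension that keeps absorbing attributes can eventually need splitting.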

Further Reading

  • Section 6.3.8, "Identifying Garbage (Junk) Dimensions", page 282 of Dimensional Modeling: In a Business Intelligence Environment. The book is free to download.
