Data Structures

From David's Wiki

Common data structures you should know

Lists

XOR Linked List

Skip List

Technically a linked-list but looks like a tree if you squint.

Hash

Use this whenever you don't need to iterate through the data in sorted order.
Expected O(1) insertion and lookup.

Linear Probing

One way to handle collisions.
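A minimal sketch of linear probing, assuming a fixed-size array of slots and a load factor capped at 1/2. The class name `LinearProbingMap` and its methods are hypothetical, chosen only for illustration. On a collision, the probe simply walks forward to the next slot.

```python
class LinearProbingMap:
    """Open-addressing hash map: on a collision, try the next slot."""

    def __init__(self, capacity=8):
        # Each slot is either None (empty) or a (key, value) pair.
        self._slots = [None] * capacity
        self._size = 0

    def _probe(self, key):
        # Start at the hashed index and walk forward until we find
        # the key or an empty slot. The load factor cap guarantees
        # that an empty slot always exists, so this loop terminates.
        i = hash(key) % len(self._slots)
        while self._slots[i] is not None and self._slots[i][0] != key:
            i = (i + 1) % len(self._slots)
        return i

    def put(self, key, value):
        # Grow before the table gets more than half full so that
        # probe sequences stay short.
        if 2 * (self._size + 1) > len(self._slots):
            self._resize(2 * len(self._slots))
        i = self._probe(key)
        if self._slots[i] is None:
            self._size += 1
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        entry = self._slots[self._probe(key)]
        return default if entry is None else entry[1]

    def _resize(self, capacity):
        # Rehash every entry into a larger table.
        old = [entry for entry in self._slots if entry is not None]
        self._slots = [None] * capacity
        for k, v in old:
            self._slots[self._probe(k)] = (k, v)
```

Deletion is left out on purpose: with open addressing, removing an entry needs tombstones or re-insertion of the following run, which would make the sketch longer.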

Double hashing

Another way to handle collisions.

Separate chaining

Another way to handle collisions. Each bucket is a pointer to a linked-list of values.
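A minimal sketch of separate chaining. As a simplifying assumption, each bucket is a Python list rather than a hand-written linked list, and the bucket count is fixed. The name `ChainedMap` is hypothetical.

```python
class ChainedMap:
    """Hash map where each bucket holds a chain of (key, value) pairs."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # All keys that hash to the same index share one chain.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # otherwise extend the chain

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False
```

Unlike open addressing, removal is straightforward here because deleting from one chain never affects lookups in another.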

Trees

Heap

AVL

Red-Black

Wikipedia: Red–black tree
Geeks for geeks introduction
Geeks for geeks insertion
Geeks for geeks deletion

A red-black tree follows the following rules

  1. Each node is either red or black.
  2. The root is black. This rule is sometimes omitted. Since the root can always be changed from red to black, but not necessarily vice versa, this rule has little effect on analysis.
  3. All leaves (NIL) are black.
  4. If a node is red, then both its children are black.
  5. Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes.
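The rules above can be checked mechanically. The following is a sketch of a validator, assuming a hypothetical `Node` class in which `None` stands for a black NIL leaf and colors are the strings `"R"` and `"B"`. It checks only the coloring rules, not the BST ordering of the keys.

```python
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # "R" or "B"
        self.left = left    # None stands for a black NIL leaf (rule 3)
        self.right = right


def black_height(node):
    """Return the number of black nodes on every path from node down to
    a NIL leaf (counting the NIL), or -1 if rule 1, 4 or 5 is violated."""
    if node is None:
        return 1  # a NIL leaf is black
    if node.color not in ("R", "B"):
        return -1  # rule 1
    if node.color == "R":
        for child in (node.left, node.right):
            if child is not None and child.color == "R":
                return -1  # rule 4: a red node has only black children
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1  # rule 5: both sides must have the same black count
    return left + (1 if node.color == "B" else 0)


def is_red_black(root):
    if root is not None and root.color != "B":
        return False  # rule 2: the root is black
    return black_height(root) != -1
```

For example, a black root with two red children passes, while a red node with a red child fails rule 4.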

Intuition

Consider any node with a left and right subtree. From rule 4, no two red nodes are adjacent on a path, so on any path from a black root down to a NIL leaf at least half of the nodes are black. Red nodes are those interspersed between black levels. Combined with rule 5, if the left subtree has \(\displaystyle m\) levels, then the right subtree can have at most \(\displaystyle 2m\) levels.

Intuitively, this is more relaxed than AVL balancing, so red-black trees need fewer rotations for insert/delete but are less balanced (i.e. searches can be slower).
Red-black trees are typically used to implement the ordered containers in C++ (std::map, std::set) and Java (TreeMap, TreeSet).

Complexity

Height is at most \(\displaystyle 2\log_2 (n+1)\).
\(\displaystyle O(\log n)\) search, insert, delete
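
A sketch of where the height bound comes from, using the standard black-height argument. The symbols \(\displaystyle h\) (height of the tree) and \(\displaystyle b\) (number of black nodes on any root-to-NIL path, not counting the root) are introduced here for illustration.

By induction on height, a tree whose root has black-height \(\displaystyle b\) contains at least \(\displaystyle 2^{b} - 1\) internal nodes, so \(\displaystyle n \geq 2^{b} - 1\).
By rule 4, at least half of the nodes below the root on any root-to-NIL path are black, so \(\displaystyle b \geq h/2\).
Combining the two: \(\displaystyle n \geq 2^{h/2} - 1 \implies h \leq 2\log_2 (n+1)\).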

B-tree

Wikipedia: B-tree
B-tree visualization
A very popular data structure for filesystems and database indexes since the maximum size of a node in the B-tree can be configured to match the size of a disk block or memory page.

2-3

A B-tree of order 3: every internal node has two or three children.

B+ Tree

B* Tree

Treap

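A tree and a heap: a binary search tree on the keys that is also a heap on randomly assigned priorities. Operations are O(log n) with high probability. The main benefit is that it is very easy to implement.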

Insertion

Insert using ordinary BST insertion, then restore the heap property by rotating the new node up (the same single rotations used in AVL trees).
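
A sketch of treap insertion along these lines, assuming a max-heap on random priorities and that duplicate keys are ignored. The names `TreapNode`, `insert` and `inorder` are hypothetical.

```python
import random


class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # random heap priority
        self.left = None
        self.right = None


def rotate_right(node):
    # Lift the left child above node; BST order is preserved.
    left = node.left
    node.left = left.right
    left.right = node
    return left


def rotate_left(node):
    # Lift the right child above node; BST order is preserved.
    right = node.right
    node.right = right.left
    right.left = node
    return right


def insert(root, key):
    """Insert key and return the new root of this subtree."""
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
        # Restore the max-heap property on priorities.
        if root.left.priority > root.priority:
            root = rotate_right(root)
    elif key > root.key:
        root.right = insert(root.right, key)
        if root.right.priority > root.priority:
            root = rotate_left(root)
    return root  # equal keys are ignored


def inorder(node):
    """Keys in sorted order, useful for checking the BST property."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Because the priorities are random, the resulting shape is that of a BST built from a random insertion order, which is what gives the logarithmic depth regardless of the order in which keys arrive.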

Splay Tree

Segment Trees

Wikipedia: Segment tree
Geeksforgeeks 1 sum of range

  • This is a binary tree for holding segments.
  • Leaves hold something called "elementary intervals" and internal nodes hold the union of the elementary intervals of their children.
  • In addition, each node v (leaf or internal) also stores the input intervals which span its segment, i.e. \(\displaystyle \{X \mid Int(v) \subseteq X\}\), excluding intervals that already span the segment of v's parent.
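
The range-sum variant linked above is the one most often implemented in practice. It is a different use of the same tree shape: each leaf covers one array index and each internal node stores the sum over the union of its children's ranges, rather than a set of intervals. A minimal sketch, assuming an array-backed tree with the root at index 1 (the class name `SumSegmentTree` is hypothetical):

```python
class SumSegmentTree:
    def __init__(self, values):
        self.n = len(values)
        # 4 * n slots are always enough for the implicit binary tree.
        self.tree = [0] * (4 * max(self.n, 1))
        if self.n:
            self._build(values, 1, 0, self.n - 1)

    def _build(self, values, node, lo, hi):
        if lo == hi:
            self.tree[node] = values[lo]  # leaf: a single index
            return
        mid = (lo + hi) // 2
        self._build(values, 2 * node, lo, mid)
        self._build(values, 2 * node + 1, mid + 1, hi)
        # Internal node: sum over the union of its children's ranges.
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def query(self, left, right, node=1, lo=0, hi=None):
        """Sum of values[left..right], both ends inclusive."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return 0                      # no overlap with this node
        if left <= lo and hi <= right:
            return self.tree[node]        # node fully inside the query
        mid = (lo + hi) // 2
        return (self.query(left, right, 2 * node, lo, mid)
                + self.query(left, right, 2 * node + 1, mid + 1, hi))

    def update(self, index, value, node=1, lo=0, hi=None):
        """Set values[index] = value and refresh the sums above it."""
        if hi is None:
            hi = self.n - 1
        if lo == hi:
            self.tree[node] = value
            return
        mid = (lo + hi) // 2
        if index <= mid:
            self.update(index, value, 2 * node, lo, mid)
        else:
            self.update(index, value, 2 * node + 1, mid + 1, hi)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
```

Both `query` and `update` touch O(log n) nodes, since at each level only a constant number of nodes partially overlap the query range.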

Spatial Data Structures

Point Quadtree

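Extension of BST to two dimensions: each node stores a point and has four children, one per quadrant around that point.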
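
A sketch of point quadtree insertion and lookup, written to mirror BST insertion. The names `QuadNode`, `quadrant`, `insert` and `contains` are hypothetical, and the convention that points on a dividing line go to the east/north side is an assumption made for this example.

```python
class QuadNode:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.children = [None] * 4  # indexed by quadrant number


def quadrant(node, x, y):
    # Bit 0 is set when the point lies east of (or on) the node's x,
    # bit 1 when it lies north of (or on) the node's y.
    return (1 if x >= node.x else 0) | (2 if y >= node.y else 0)


def insert(root, x, y):
    """Insert the point (x, y) and return the root of the tree."""
    if root is None:
        return QuadNode(x, y)
    node = root
    while True:
        if (node.x, node.y) == (x, y):
            return root  # point already stored
        q = quadrant(node, x, y)
        if node.children[q] is None:
            node.children[q] = QuadNode(x, y)
            return root
        node = node.children[q]  # descend, as in a BST


def contains(root, x, y):
    node = root
    while node is not None:
        if (node.x, node.y) == (x, y):
            return True
        node = node.children[quadrant(node, x, y)]
    return False
```

As with an unbalanced BST, the depth depends on insertion order, so a sorted sequence of points can degenerate into a long chain.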

PM Quadtree

Polygonal Map Quadtree
Simple to implement. See my implementation here. Requires a known range (region) of values.
O(log k) depth, where k is the side length of the region divided by the distance between the two closest points (i.e. your grid is \(\displaystyle k \times k\)).

PR Quadtree

Point Region Quadtree

MX Quadtree

K-d Tree

Restricted Quadtree

Reference
Extension of the AVL balance condition to quadtrees: the depths of a node's four children differ by at most one.