Data Structures

Arrays

How do arrays work?

  • An array is a contiguous block of memory treated as a group of items of the same type
  • For instance, an array of 10 4-byte integers takes up 40 bytes of contiguous memory
  • Internally, the computer stores the array as a pointer to the first element and uses your index to compute the offset to the value you want

Example

Consider this array:


int[] numbers = { 0, 1, 1, 2, 3, 5, 8 };
                        

When you ask for numbers[3], internally the computer does something like this pseudo-code:


func array_access(index) {
    address = numbers; // numbers is actually an address!
    shifted_address = address + index * size_of_integer;
    return value_at_memory_location(shifted_address);
}
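The same offset arithmetic can be imitated in Python. This is only a sketch: a bytes object stands in for raw memory, address 0 stands in for the pointer to the first element, and the 4-byte element size (SIZE_OF_INTEGER) is an assumption carried over from the bullet above, not something Python lists actually use.

```python
import struct

numbers = [0, 1, 1, 2, 3, 5, 8]
SIZE_OF_INTEGER = 4  # assumed element size, in bytes

# Pack the seven integers back to back into one contiguous block of bytes.
memory = struct.pack("<7i", *numbers)

def array_access(index):
    # Address 0 plays the role of the pointer to the first element.
    base_address = 0
    shifted_address = base_address + index * SIZE_OF_INTEGER
    # Read one 4-byte integer starting at the shifted address.
    (value,) = struct.unpack_from("<i", memory, shifted_address)
    return value

print(len(memory))      # 28 bytes: 7 elements * 4 bytes each
print(array_access(3))  # 2
```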
                        

Lists

What is a list?

  • Like an array, but can grow and shrink to hold as many items as you need
  • Generally has a method like add or push to add new things to the end
  • Also has a method like remove to remove an item at a particular index

Python Example


L = ['a', 'b', 'c']
L.append('d') # add onto the end
L.extend(['e', 'f', 'g']) # add multiple items
L.insert(1, 'sneaky') # insert 'sneaky' at index 1
L.pop(1) # remove the item at index 1
L.pop() # remove the item at the end

                        

Stacks

  • Like a list, but you can only add and remove things from the end
  • Essentially like restricting yourself to pop() and append()
  • Sometimes described as LIFO (Last in, first out)
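A minimal sketch of the LIFO behavior described above, using a plain Python list as the stack (the variable names are just for illustration):

```python
stack = []

# Push three items onto the end.
stack.append('a')
stack.append('b')
stack.append('c')

# The last item pushed is the first one out.
top = stack.pop()
print(top)  # c

# Push one more, then empty the stack.
stack.append('d')
order = [stack.pop(), stack.pop(), stack.pop()]
print(order)  # ['d', 'b', 'a']
```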

Queues

  • Like a list, but you can only remove from the front and add to the end
  • Like restricting yourself to pop(0) and append()
  • Sometimes described as FIFO (First in, first out)
  • Can also be extended with priorities, where each item in the queue has a priority and the highest-priority item always exits next (so no longer FIFO). This is called a Priority Queue
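A sketch of both ideas using Python's standard library. Since pop(0) on a list has to shift every remaining item, collections.deque is a common substitute for FIFO queues. For the priority queue, heapq always pops the smallest item first, so this example assumes that a lower number means a higher priority; the task names are made up.

```python
from collections import deque
import heapq

# FIFO queue: add to the end, remove from the front.
queue = deque()
queue.append('first')
queue.append('second')
queue.append('third')
served = queue.popleft()
print(served)  # first

# Priority queue: each entry is a (priority, item) tuple.
# Assumption: lower number = higher priority.
tasks = []
heapq.heappush(tasks, (2, 'write report'))
heapq.heappush(tasks, (1, 'fix outage'))
heapq.heappush(tasks, (3, 'tidy desk'))
urgent = heapq.heappop(tasks)
print(urgent)  # (1, 'fix outage')
```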

Sets

What is a set?

  • A unique group of items
  • An item can only occur in a set once
  • So adding 1 to the set $$ \{ 1, 2, 3 \} $$ gives back the same set

Set Operations

Given two sets S and T

  • $$S \cup T$$ is the union of the sets, containing all elements from both sets
  • $$S \cap T$$ is the intersection of the sets, a set that only has the common elements from both sets in it
  • $$S - T$$ is the set difference that has all elements from the first set that aren't in the second

Python Example


S = set([1, 2, 3])
T = set([3, 4, 5])
S & T # Intersection, gives {3}
S | T # Union, gives {1, 2, 3, 4, 5}
S - T # Difference, gives {1, 2}
2 in S # Boolean, is 2 in the set?
6 not in S # Boolean, is 6 not in the set?
                        

Example Problem: Letters Not Used

  • Problem Set
  • First, create a set of all lowercase letters: A = set(['a', 'b', ...])
  • Next, add all the letters (in lowercase) from the input string into another set B
  • The letters not used are A - B
  • Sort those and join them into a string: ''.join(sorted(A - B))
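The steps above can be sketched as one small function. The function name letters_not_used is made up for this example, and string.ascii_lowercase is used here as a shortcut for listing the 26 letters by hand.

```python
import string

def letters_not_used(text):
    # A: every lowercase letter
    A = set(string.ascii_lowercase)
    # B: every character of the input, lowercased (non-letters are harmless)
    B = set(text.lower())
    # Letters in A that never appear in B, in alphabetical order
    return ''.join(sorted(A - B))

print(letters_not_used("Hello, World"))  # abcfgijkmnpqstuvxyz
```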

Dictionaries

What is a dictionary?

  • Provides a mapping from keys to values
  • Similar to an array, but instead of numeric keys, the keys can be (mostly) any type
  • Useful for providing a lookup table, or mapping of some sort

Python Example


D = {} # empty dictionary declaration
D = { 'Georgia' : 'GA', 'Florida' : 'FL' } # declaring a dictionary with data
D['New York'] = 'NY' # adding a new entry
D['Georgia'] # Spits out 'GA'
'Alabama' in D # Is 'Alabama' a key?
D.keys() # provides an iterable over the dictionary's keys
D.values() # provides an iterable over the dictionary's values
                        

Trees

What is a tree?

  • Stores hierarchical data
  • Consists of a series of nodes
  • Each node has a reference to a parent, and its children (if any)
  • Sometimes, it's helpful to associate a value with the node (like a binary search tree)

Basic Implementation


class Node:
    def __init__(self, value):
        # initially, this node has no parent, nor any children
        self.value = value
        self.parent = None
        self.children = []

    def add_child(self, child):
        # to add a child, add the passed node to the list of children,
        # but also link it back to "self", which is the parent
        child.parent = self
        self.children.append(child)

def traverse(root, do_something):
    # This is the list of nodes we have yet to visit. We start with just
    # the root
    to_visit = [root]

    # while there are more nodes to visit
    while to_visit:
        # get the next node to visit.
        # Note: to_visit.pop() gives a Depth-First-Search
        # to_visit.pop(0) gives a Breadth-First-Search
        # check out https://en.wikipedia.org/wiki/Tree_traversal
        #next_node = to_visit.pop()
        next_node = to_visit.pop(0)

        # Add all of the children of this node as nodes to visit
        to_visit.extend(next_node.children)

        # do something with the node (we passed the "action" as a parameter)
        do_something(next_node)

# Building a tree here
A = Node('A')

# First level
B = Node('B')
A.add_child(B)
C = Node('C')
A.add_child(C)

# Second level
D = Node('D')
B.add_child(D)
E = Node('E')
B.add_child(E)

F = Node('F')
C.add_child(F)

# Call the traverse function with a starting node, and a function to
# perform at each node
traverse(A, lambda node: print(node.value))
                        

Other Tree Operations

  • Is X in the tree? Depending on how we structure the tree, this can be a fast operation
  • Add a node to a tree
  • Remove a node from a tree
  • Determining how far a node is from the root (the "depth")
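A rough sketch of a few of these operations on the same kind of parent/children node. The Node class is repeated so the example runs on its own. The helpers depth, contains, and remove_node are illustrative names, and contains simply visits every node, so it is the slow, structure-agnostic version of the lookup.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

def depth(node):
    # Count the steps needed to walk up to the root.
    steps = 0
    while node.parent is not None:
        node = node.parent
        steps += 1
    return steps

def contains(root, value):
    # Visit every node until the value is found (depth-first).
    to_visit = [root]
    while to_visit:
        node = to_visit.pop()
        if node.value == value:
            return True
        to_visit.extend(node.children)
    return False

def remove_node(node):
    # Detach the node (and its whole subtree) from its parent.
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None

A, B, D = Node('A'), Node('B'), Node('D')
A.add_child(B)
B.add_child(D)
print(depth(D))          # 2
print(contains(A, 'D'))  # True
remove_node(B)
print(contains(A, 'D'))  # False
```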

Custom Data Structures

  • Knowing these data structures is important
  • Sometimes, though, you have to combine them in new ways

Examples

  • Modeling a game requires a data structure to model the game and how it transitions between states (and generally lots of Breadth-First Search)
  • Binary Search Trees for fast data retrieval
  • Simulations
  • Many, many more. Essentially, don't be afraid to combine stuff to get things done.
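As one illustration of the Binary Search Tree item above, here is a minimal sketch. The names BSTNode, bst_insert, and bst_contains are made up for this example, duplicates are ignored, and no balancing is attempted, so lookups are only fast when the tree happens to stay reasonably balanced.

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None   # subtree of smaller values
        self.right = None  # subtree of larger values

def bst_insert(root, value):
    # Returns the root of the tree with the value inserted.
    if root is None:
        return BSTNode(value)
    if value < root.value:
        root.left = bst_insert(root.left, value)
    elif value > root.value:
        root.right = bst_insert(root.right, value)
    return root  # equal values are ignored

def bst_contains(root, value):
    # Each comparison discards one whole subtree.
    node = root
    while node is not None:
        if value == node.value:
            return True
        node = node.left if value < node.value else node.right
    return False

root = None
for v in [5, 3, 8, 1, 4]:
    root = bst_insert(root, v)

print(bst_contains(root, 4))  # True
print(bst_contains(root, 7))  # False
```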