MapReduce

  • Input: a set of key-value pairs
  • Programmer specifies two methods:
    • Map: (k, v) -> list((k', v'))
      • Takes one key-value pair and outputs a set of key-value pairs
      • E.g., key is the filename, value is a single line in the file
    • Reduce: (k', list(v')) -> list(v')
      • All values v' with the same key k' are reduced together and processed in v' order
      • There is one Reduce function call per unique key k'
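
The Map and Reduce contract above can be sketched as a small single-process simulation. This is only an illustration, not how a distributed implementation works: the names map_reduce, count_map, and count_reduce are hypothetical, and word counting is assumed as the task because it fits the "filename, line" input mentioned above.

```python
from collections import defaultdict

def map_reduce(inputs, map_fn, reduce_fn):
    """Simulate map, group-by-key, and reduce in one process (illustrative only)."""
    groups = defaultdict(list)
    # Map phase: each input pair may emit any number of intermediate pairs.
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            groups[k2].append(v2)
    # Reduce phase: exactly one reduce call per unique intermediate key.
    output = []
    for k2 in sorted(groups):
        for out in reduce_fn(k2, groups[k2]):
            output.append((k2, out))
    return output

def count_map(filename, line):
    # Hypothetical mapper: key is the filename, value is one line of the file.
    for word in line.split():
        yield (word, 1)

def count_reduce(word, counts):
    # Hypothetical reducer: collapse all values for one key into a total.
    yield sum(counts)

lines = [("a.txt", "to be or not to be"), ("b.txt", "to do")]
result = map_reduce(lines, count_map, count_reduce)
```

Here result holds one (word, count) pair per distinct word. Keys are sorted only to make the output deterministic in this sketch.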

Example: Natural Join
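
A minimal sketch of one common way to compute a natural join with Map and Reduce, assuming two relations R(A, B) and S(B, C) joined on the shared attribute B. The relation names, the sample tuples, and the helper names (join_map, join_reduce, run) are all assumptions made for illustration. The mapper keys every tuple by its B value and tags it with its source relation; the reducer pairs every R tuple with every S tuple that shares that B value.

```python
from collections import defaultdict

def join_map(relation, row):
    # Input key is the relation name, input value is one tuple of that relation.
    if relation == "R":
        a, b = row
        yield (b, ("R", a))   # key by the join attribute B, tag with source
    else:
        b, c = row
        yield (b, ("S", c))

def join_reduce(b, tagged):
    # All tagged values for one B value arrive together in a single call.
    left = [x for tag, x in tagged if tag == "R"]
    right = [x for tag, x in tagged if tag == "S"]
    for a in left:
        for c in right:
            yield (a, b, c)

def run(inputs, map_fn, reduce_fn):
    # Single-process stand-in for the map, group-by-key, and reduce steps.
    groups = defaultdict(list)
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            groups[k2].append(v2)
    return [out for k2 in sorted(groups) for out in reduce_fn(k2, groups[k2])]

# Hypothetical sample data.
R = [("a1", "b1"), ("a2", "b1"), ("a3", "b2")]
S = [("b1", "c1"), ("b2", "c2"), ("b3", "c3")]
inputs = [("R", r) for r in R] + [("S", s) for s in S]
joined = run(inputs, join_map, join_reduce)
```

The S tuple with B value b3 has no partner in R, so its reduce call emits nothing and it does not appear in joined.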
