The Surveying Problem Part 2: Building a Grid

If I’m going to solve this problem, the first thing I need is a grid. In my interview, the grid itself didn’t actually exist; it was all hypothetical: “assuming a grid in a format that you can dictate, write an algorithm that would solve this problem”. If I am actually going to solve the problem, I need a grid with which to work!

Building a Grid

I assumed a square grid at all times, needing only one parameter to define the size. I then wrote a pretty basic function that would return a grid in the form of a dict, where the key is a 2-tuple representing the x and y coordinate of the element, and the value is either True or False, depending on whether there is oil there or not.

import random
import itertools

THRESHOLD = 0.85

def build_grid(size, seed=None):
   """ Build a square grid of elements, where each element may or may not contain oil

    size: The number of elements along one edge of the grid
    seed: Random seed to be used to generate the grid
    """
    random.seed(seed)

    grid = dict()
    for x_coord, y_coord in itertools.product(range(0, size), repeat=2):
        grid[(x_coord, y_coord)] = random.random() > THRESHOLD

    return grid

I somewhat arbitrarily picked 0.85 as the threshold for determining whether an element contained oil or not, which seemed to give reasonable results. However, the grids that were generated from this approach often contained a lot of isolated ‘oil’ (or active) elements. I was keen to ensure I had some degree of clustering, which would ensure that my code would only return a single reservoir for adjacent active elements. Here’s an example grid that was generated with the code above:

Clustering the Grid

There are a few active elements that are adjacent to one or more other active elements, but a good number are isolated. I wanted to make sure I had something that was a bit more representative of ‘reality’, or at least my idea of reality (I’m not a geologist!), where the active elements would always be clustered together. To achieve this, I modified the build_grid() code perform an additional round of processing to cluster the grid:

""" Generate a grid of elements with a random layout of elements that contain oil
build_grid: Build the grid with an optional random seed
get_neighbors: Find the number of active neighbor elements for a given element
"""

import random
import itertools

THRESHOLD = 0.85

def get_neighbors(this_element, grid):
    """ Returns a list of neighbor location objects

     this_element: A 2-tuple representing the x and y coordinates of an element
     grid: A dictionary containing all elements and their state
     """
    x_coord, y_coord = this_element
    x_offsets = [-1, 0, 1]
    y_offsets = [-1, 0, 1]

    neighbors = list()
    for x_offset in x_offsets:
        for y_offset in y_offsets:
            if x_offset == 0 and y_offset == 0:
                continue

            coord = (x_coord + x_offset, y_coord + y_offset)
            if grid.get(coord, False):
                neighbors.append(coord)
    return neighbors


def build_grid(size, seed=None):
    """ Build a square grid of elements, where each element may or may not contain oil

     size: The number of elements along one edge of the grid
     seed: Random seed to be used to generate the grid
     """
    random.seed(seed)

    initial_grid = dict()
    for x_coord, y_coord in itertools.product(range(0, size), repeat=2):
        initial_grid[(x_coord, y_coord)] = random.random() > THRESHOLD

    grid = set()
    # Cluster the grid. If an active element is not isolated,
    # or if an inactive element has at least 4 active neighbors
    for location, state in initial_grid.items():
        sites_nearby = get_neighbors(location, initial_grid)
        neighbor_count = len(sites_nearby)
        if (state and neighbor_count != 0) or neighbor_count >= 4:
            grid.add(location)

    return grid

This method implements something inspired by Conway’s Game of Life1https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, by getting the number of adjacent active elements for each element in turn and applying the following rules:

  • If an element is active and has at least 1 active neighbor, it remains active
  • If an element is inactive and has more than 4 active neighbors, it becomes active
  • Any other element is inactive

This tends to produce a more clustered grid, which seems more representative of reality:

Next Steps

And that’s it! I now have some code that generates procedurally generated grids that I can use as inputs to my solution. A couple of notes on this though, before I move on:

  1. I used two different seeds for the two figures in this article, so the active elements don’t line up. Sorry if that causes any confusion!
  2. The code for drawing the output will be covered in a future article

Next step, solve the problem!