More on Functional Images

The code you are implementing is based on a paper by Conal Elliott, titled Functional Images. There is no requirement to read this paper, though it may be of interest to you. A summary of background information and the paper follows.

Consider the problem of programming a computer to draw a particular image. One approach is to carefully lay out a series of instructions that draw pixels to the screen. While this approach works and is commonly used, it is not without its flaws. For one, the code that draws a particular image may not immediately convey what the image is. By analogy with a connect-the-dots drawing, the code focuses on the connections between individual dots rather than on the picture those connections collectively form. This disconnect can make writing image-drawing code difficult, as one must focus on how to draw a particular image rather than on describing what the image is.

Another issue with this approach relates to the typical software engineering concerns of code maintainability and reusability. Say we have code that draws some particular image, and we later decide that the image should be shifted over to the right. From a high-level standpoint, this sounds quite simple: just shift what is already there to the right. However, it is not so simple when the image is drawn by a series of instructions. Every single instruction that writes a pixel needs modification for the shift to work correctly, which could mean a significant amount of code change. In fact, this sort of transformation may be so painful with this approach that we may design the code specifically to abstract away this particular transformation. Such an abstraction would gracefully handle the shift to the right. There is, however, still a problem: what about other sorts of transformations? How do we handle, say, a rotation? Can our abstractions handle all possible kinds of transformations, or are there certain kinds of transformations which break the model?

Fundamentally, the problem with the approach of using a series of instructions for drawing is that the model is too low-level. Ideally, we'd like to work at a higher level of abstraction, one that focuses on images as opposed to the technical details behind drawing images.

Functional image synthesis is a technique that aims to describe images at a level of abstraction as close to the images themselves as possible. Relative to the approach of specifying a series of drawing instructions, almost everything in the functional images approach is abstracted over, allowing for a better reflection of reality. For example, actual images may use Cartesian or polar coordinate systems, 2D or 3D spaces, discrete or continuous intervals, and so on. Indeed, even the idea of what exists at a particular point in a particular space can differ based on the image type, as with images that have transparent portions. The functional images approach can encompass all these real-world concerns in a single uniform library.

The key observation for this level of abstraction is the following. Consider for a moment a 2D canvas of colored pixels addressed by integral Cartesian coordinates. In this type of environment, it is possible to represent any image as a function from coordinate to color (image: (X, Y) => Color). That is, given some coordinate, the function returns the color of the pixel at that coordinate.
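As a concrete sketch of this idea (the Color type and the checkerboard function below are invented purely for illustration; they are not part of the assignment), an infinite checkerboard pattern can be written directly as such a function:

sealed trait Color
case object Black extends Color
case object White extends Color

// An image is just a function from coordinate to color.
val checkerboard: (Int, Int) => Color =
  (x, y) => if ((x + y) % 2 == 0) Black else White

checkerboard(0, 0)  // Black
checkerboard(3, 0)  // White

Note that nothing here actually draws anything; the function merely answers the question "what color is at this coordinate?", which is exactly the higher level of abstraction we were after.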

This idea can be further abstracted to other contexts. For example, let's extend this to work in a 3D coordinate space. In 3D, the function simply takes an additional Z coordinate (image: (X, Y, Z) => Color). In fact, for any kind of coordinate, we have image: Coordinate => Color. This idea even works for animations, treating time as an additional kind of coordinate (effectively 4D).
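In Scala-like notation, this generalization can be captured with a type alias over an arbitrary coordinate type (a sketch; the names Image, Image2D, Image3D, and Animation are chosen here for illustration only):

// An image over any coordinate type, producing anything drawable.
type Image[Coord, Drawable] = Coord => Drawable

// Specializations for 2D, 3D, and animation (time as an extra coordinate).
type Image2D[D] = Image[(Int, Int), D]
type Image3D[D] = Image[(Int, Int, Int), D]
type Animation[D] = Image[(Int, Int, Double), D]  // (x, y, time)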

We can abstract this idea further. Consider specifically drawing black-and-white images. In this context, every color is simply a measure of how close to black (or equivalently white) it is. As such, a color is most simply represented as a number in the interval [low, high], where white and black are represented with low and high, respectively. With this in mind, images are functions from coordinates to intervals (image: Coordinate => Interval). Abstracting over the color viewpoint and the black-and-white viewpoint together, an image is a function from a coordinate to something that can be drawn or otherwise represented (image: Coordinate => Drawable).
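For example (a sketch, choosing Double in the interval [0.0, 1.0] with 0.0 as white and 1.0 as black; the gradient function is invented for illustration), a black-and-white image is again just a function:

// A grayscale image: 0.0 is white, 1.0 is black.
val horizontalGradient: (Double, Double) => Double =
  (x, y) => (((x % 100.0) + 100.0) % 100.0) / 100.0  // darkens from left to right, repeating every 100 units

horizontalGradient(0.0, 42.0)   // 0.0 (white)
horizontalGradient(50.0, 42.0)  // 0.5 (medium gray)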

The observation that images are representable as generic functions affords us a great deal of flexibility, as long as we have higher-order functions at our disposal. For example, consider the problem of shifting an image to the right by some number of pixels in a 2D Cartesian space. This can very naturally be expressed like so (using Scala-like notation):

// Drawable is a type parameter so the same shift works for any image content.
def shiftRight[Drawable](byAmount: Int)(image: (Int, Int) => Drawable): (Int, Int) => Drawable =
  (x, y) => image(x - byAmount, y)

(One may notice that this may appear invalid, as negative coordinates could now be passed to image. However, there is no reason why negative coordinates are fundamentally invalid with this technique; with the provided context, there is not even enough information to determine whether the image is of finite size.) The above code returns a function that, given a coordinate, returns whatever the original function returned at byAmount pixels to the left. For example, if we were shifting to the right by 5 pixels, then coordinate (5, 0) will now return whatever was originally at (0, 0) (i.e., 5 - 5). It should be clear that this is highly modular: shiftRight is given one image and an amount to shift by, and returns a new image based on the old without any knowledge of what the particular image is. Not only is the code short, it also states directly what it means to shift an image to the right.
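To make the modularity concrete, here is a brief usage sketch, reusing the hypothetical checkerboard image and Color type from the earlier example:

val shifted: (Int, Int) => Color = shiftRight(5)(checkerboard)

shifted(5, 0)  // same as checkerboard(0, 0): Black
shifted(6, 0)  // same as checkerboard(1, 0): White

The same shiftRight works unchanged on any image whatsoever, because it only ever manipulates coordinates and never inspects the image it is given.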