Transmissions: Sounds & Image

A book spread showing a portrait taken using the SSTV technique. The image also shows ambient sound of the scene in the form of visual noise.

A special edition of the book using silver cover paper and hand-stitched binding.

A page showing a photograph of an eye sculpture on the street near the photographer’s home.

The book’s colophon printed on silver cover paper.

A detailed image of the visual/audio noise present in an image.

I made a book for the 2023 Tokyo Art Book Fair using a combination of programming, photography, and graphic design. In short, each photograph in the book was converted to sound and then back into an image. The ambient audio from the scene of each photograph is also incorporated into the final picture in the form of visual noise.

The following is the explanatory text from the beginning of the book.


When families gathered around their televisions in July of 1969 to watch humans walk on the moon for the first time, the images they saw were sent to earth via an unexpected medium: sound.

The video transmission of the first moon landing used a technology called slow-scan television (SSTV). The method sends visual images over analog radio signals using frequencies that are audible to human ears. Each value of brightness in an image is encoded as a particular audio frequency, and the resulting stream of sound can later be decoded into a coherent image. Listening to an image represented as an audio stream isn’t particularly pleasant. The resulting sound is a series of high-pitched, electronic chirps that might be mistaken for a dial-up modem from the 1990s.
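As a rough illustration of the brightness-to-frequency idea, here is a minimal Python sketch. It assumes the range commonly used by analog SSTV modes, with 1500 Hz for black and 2300 Hz for white. The function names, the 44.1 kHz sample rate, and the per-pixel duration are illustrative choices, not details taken from the device used for the book.

```python
import math

BLACK_HZ = 1500.0     # frequency commonly used for black in analog SSTV
WHITE_HZ = 2300.0     # frequency commonly used for white in analog SSTV
SAMPLE_RATE = 44100   # assumed audio sample rate, in samples per second


def brightness_to_freq(value: int) -> float:
    """Map an 8-bit brightness value (0-255) linearly onto the tone range."""
    if not 0 <= value <= 255:
        raise ValueError("brightness must be between 0 and 255")
    return BLACK_HZ + (WHITE_HZ - BLACK_HZ) * value / 255


def encode_pixels(pixels, pixel_seconds):
    """Turn a row of brightness values into audio samples in [-1, 1].

    The phase is carried over from one pixel to the next so the tone
    changes pitch without an audible click at each boundary.
    """
    samples = []
    phase = 0.0
    samples_per_pixel = int(round(pixel_seconds * SAMPLE_RATE))
    for value in pixels:
        step = 2 * math.pi * brightness_to_freq(value) / SAMPLE_RATE
        for _ in range(samples_per_pixel):
            samples.append(math.sin(phase))
            phase += step
    return samples
```

Calling `encode_pixels([0, 128, 255], 0.001)` would produce a short chirp that rises from the black tone through mid-gray to the white tone. A real SSTV mode also adds sync pulses and sends each color channel separately, which this sketch leaves out.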

Like the first visuals from the moon, the images in this book use sound-based SSTV technology to reproduce photographic images of scenes. In contrast to the images sent from the moon, which aimed to be as clear as possible, the images in this book welcome visual noise.

The visual noise seen in these images is a direct representation of the auditory noise in the scene in which the original photograph was taken. To achieve this, a custom device was built to transmit and record SSTV signals that incorporate the sounds of the scene into the SSTV signal itself. The method works like this:

  1. A photograph is taken using a standard digital SLR camera.

  2. The digital photograph is transferred to the custom SSTV device over a wire.

  3. The SSTV device converts the photograph to an audio signal using the Martin M2 SSTV format.

  4. The device broadcasts the SSTV signal over an internal speaker. The audio signal for one photograph lasts for 59 seconds.

  5. As the SSTV audio signal plays, the device re-records the audio using an internal microphone. The microphone simultaneously records the ambient audio present in the scene and merges it with the SSTV audio signal.

  6. The new SSTV audio signal, comprising both the photographic data and the ambient sound, is later decoded into a digital image using a Macintosh computer. Additional errors in color representation and visual “ghosting” can occur as the image is decoded.
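Step 5 can be approximated in software as a sample-by-sample sum of two audio streams. The sketch below is only a digital stand-in: in the device described above the mixing happens acoustically, between the speaker and the microphone, so the function name `mix_signals` and the `ambient_gain` parameter are hypothetical and chosen purely for illustration.

```python
def mix_signals(sstv, ambient, ambient_gain=0.3):
    """Add ambient sound on top of an SSTV signal.

    Both inputs are lists of float samples in [-1, 1] at the same
    sample rate. If the ambient recording is shorter than the SSTV
    signal, the remainder is treated as silence. The result is clipped
    to [-1, 1], roughly as an overdriven microphone input would be.
    """
    mixed = []
    for i, sample in enumerate(sstv):
        noise = ambient[i] if i < len(ambient) else 0.0
        combined = sample + ambient_gain * noise
        mixed.append(max(-1.0, min(1.0, combined)))
    return mixed
```

Because an SSTV decoder reads pitch as brightness, anything that disturbs the tone at a given moment shifts the decoded pixel values at that moment. A larger `ambient_gain` would therefore be expected to produce heavier visual noise.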

The images that result from this process contain a photographic representation of the scene as captured by the camera, but they also include visual noise that directly corresponds to the ambient sound in the scene. A train quickly passing by might leave a few lines of visual noise in one part of the image, while the nearby hum of an air conditioner will sprinkle consistent specks of interference throughout the picture.
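Because SSTV sends an image one line at a time, the moment a sound occurs determines which lines it disturbs. The sketch below works through that arithmetic using commonly published Martin M2 timings (256 lines, a 4.862 ms sync pulse, 0.572 ms separators, and 73.216 ms per color scan) and assumes a standard header of about 910 ms is played before the first line. These values and the function name `lines_hit` are assumptions for illustration, not measurements from the device.

```python
SYNC_MS = 4.862     # sync pulse at the start of each line
PORCH_MS = 0.572    # short separator after the sync and after each color scan
SCAN_MS = 73.216    # one color channel of one line
LINES = 256         # lines per image
HEADER_MS = 910.0   # assumed calibration header before the first line

# One line = sync + separator + three color scans, each followed by a separator.
LINE_MS = SYNC_MS + PORCH_MS + 3 * (SCAN_MS + PORCH_MS)

# Whole transmission in seconds: roughly 59 under these assumptions.
TOTAL_S = (HEADER_MS + LINES * LINE_MS) / 1000


def lines_hit(start_s, end_s):
    """Return the range of image lines being sent between two times.

    Times are seconds from the start of the transmission. The range is
    empty if the sound falls entirely outside the image data.
    """
    first = int((start_s * 1000 - HEADER_MS) // LINE_MS)
    last = int((end_s * 1000 - HEADER_MS) // LINE_MS)
    first = max(first, 0)
    last = min(last, LINES - 1)
    return range(first, last + 1)
```

Under these assumptions, a train that is audible from 20 to 23 seconds into the transmission would touch only about 14 of the 256 lines, which `lines_hit(20, 23)` returns. A steady hum that lasts for the whole transmission would reach every line.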

In this book, the right page of each spread features a full image and lists details about the location, date of capture, and the ambient noise present in the scene. A close-up image of the visual representation of ambient noise captured during the transmission of the image’s SSTV signal can be seen on the left page.

Posters made using the SSTV technique.