02 Oct 2017

Find What You Were Looking For: Using Neural Networks to Trawl through Newspaper Advertisements

Advertisements offer us a sneak peek into the ideals and aspirations of past realities. They show the state of technology and the social functions of products, and they provide information on the society in which a product was to be sold. In the Netherlands, mass advertising developed in the late nineteenth century. In 1880, the patent and trademark law allowed manufacturers to claim specific brand names. During the twentieth century, the Dutch advertising landscape transformed. According to Schreurs, one noticeable change was that design became more prominent, in contrast to the more factual, textual advertising of the late nineteenth and early twentieth century (see Schreurs, Geschiedenis van de Reclame in Nederland, 2001). This blog post shows how neural networks can help researchers gain insight into possible visual trends in advertisements.

Delpher contains millions of advertisements in digital form. However, querying for advertisements with a specific visual style, or for ads for a particular product, is more cumbersome than one would expect. First, advertisements regularly contain logos or stylishly formatted text that OCR software did not properly recognize during digitization. This makes it difficult to locate ads for particular brands or products using full-text search. For instance, a query for C&A would not find the ad in Figure 1. A researcher could only find this ad through other keywords in it, such as ‘nylon’ or ‘pullover’.

Figure 1. “C&A Advertisement”, Leeuwarder Courant, December 22, 1965

Second, a researcher does not always know all the relevant keywords related to a product in a particular period. Text mining techniques can help extract significant keywords that relate to historic brand names or products, but the brand name itself does not always appear in the OCR-ed text. How, then, can we find advertisements based on their visual features and detect trends in these features?

SIAMESE: Image Similarity Search

To answer this question, we developed the tool SIAMESE. SIAMESE uses a Convolutional Neural Network (CNN) to identify similar visual trends in advertisements. A neural network consists of a stack of layers of simple computational units that work in a way loosely similar to neurons in our brains. These layers can pick out patterns in visual data, which can aid researchers in identifying visual trends. For a quick introduction to Convolutional Neural Networks click here. Researchers can analyze these visual trends, or use them as a starting point to dive deeper into the textual content of a specific group of advertisements and identify keywords related to products with a strong visual character. This approach is similar to Yale’s Neural Neighbors project*.
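To make the idea of "filtering out patterns" a little more concrete, here is a minimal sketch of what a single convolutional filter does. It is not the network behind SIAMESE: the toy image, the two hand-written filters, and the helper names (`convolve2d`, `relu`, `global_max_pool`) are all assumptions made for illustration. A trained CNN learns many such filters from data and stacks them in layers.

```python
# Illustrative sketch only: two hand-written filters on a toy image,
# not the network used by SIAMESE.

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    As in most CNN libraries, the kernel is not flipped (cross-correlation).
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    response = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        response.append(row)
    return response


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def global_max_pool(feature_map):
    """Summarize a response map by its strongest activation."""
    return max(max(row) for row in feature_map)


# Toy 4x4 grayscale "image": dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filters (an assumption); a real CNN learns its filters.
vertical_edge = [[-1, 1], [-1, 1]]    # responds to dark-to-bright, left to right
horizontal_edge = [[-1, -1], [1, 1]]  # responds to dark-to-bright, top to bottom

# One number per filter: a (very small) feature vector for the image.
features = [
    global_max_pool(relu(convolve2d(image, vertical_edge))),
    global_max_pool(relu(convolve2d(image, horizontal_edge))),
]
print(features)  # [2, 0]
```

The image contains a vertical edge but no horizontal one, so only the first filter responds. Two images that trigger the same filters end up with similar feature vectors, and that is what makes a comparison on visual grounds possible.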

SIAMESE presents users with the ten images most similar to a source image, and a timeline view consisting of the most similar image in every year between 1948 and 1995. The former allows users to detect whether the source image was part of an identifiable visual style, while the latter shows how a visual trend developed over time. Every time a user refreshes the page, SIAMESE presents a different source image, which introduces a level of serendipity. Clicking on one of the similar images turns that image into the new source, which facilitates further exploratory searching. Clicking on the year of an image takes the user to the Delpher interface, where one can further explore the image, its position in the newspaper, and its textual content.
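The two views described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not SIAMESE's actual code: it assumes every advertisement has already been turned into a feature vector by a CNN, it uses cosine similarity as the measure of likeness (the post does not say which measure SIAMESE uses), and the names `ads`, `top_k` and `timeline`, as well as the toy data, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(source_id, ads):
    """Return (score, ad_id) pairs for all other ads, most similar first."""
    source_vec = ads[source_id]["features"]
    scored = [
        (cosine_similarity(source_vec, ad["features"]), ad_id)
        for ad_id, ad in ads.items()
        if ad_id != source_id
    ]
    # Highest score first; ties are broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


def top_k(source_id, ads, k=10):
    """The k ads most similar to the source (the 'ten most similar' view)."""
    return [ad_id for _, ad_id in rank_by_similarity(source_id, ads)[:k]]


def timeline(source_id, ads, first_year=1948, last_year=1995):
    """The single most similar ad for each year in the range (the timeline view)."""
    best_per_year = {}
    for _, ad_id in rank_by_similarity(source_id, ads):
        year = ads[ad_id]["year"]
        # The ranking is sorted, so the first ad seen for a year is the best one.
        if first_year <= year <= last_year and year not in best_per_year:
            best_per_year[year] = ad_id
    return dict(sorted(best_per_year.items()))


# Hypothetical toy data: ids, years and three-dimensional feature vectors.
ads = {
    "ad_a": {"year": 1965, "features": [1.0, 0.0, 0.2]},
    "ad_b": {"year": 1965, "features": [0.9, 0.1, 0.3]},
    "ad_c": {"year": 1970, "features": [0.0, 1.0, 0.0]},
    "ad_d": {"year": 1970, "features": [0.8, 0.0, 0.1]},
    "ad_e": {"year": 1996, "features": [1.0, 0.0, 0.2]},
}

print(top_k("ad_a", ads, k=3))  # ['ad_e', 'ad_d', 'ad_b']
print(timeline("ad_a", ads))    # {1965: 'ad_b', 1970: 'ad_d'}
```

With millions of advertisements, comparing the source against every other vector would be slow. A real system would more likely use an approximate nearest-neighbour index, but the logic of the two views stays the same.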

Figure 2. SIAMESE timeline view for fashion advertisements

First steps with SIAMESE

SIAMESE performs especially well at grouping clearly identifiable objects in advertisements. On the one hand, this is driven by the high number of advertisements for fashion (Fig. 2) and automobiles (Fig. 3). On the other hand, it also reveals that advertisements for these products followed a consistent visual trend.

Figure 3. SIAMESE timeline view for automobile advertisements

Other products that were heavily advertised in Dutch newspapers, such as mortgages or banks, rarely show up as part of a visual style. This suggests that advertisements for these products did not share a uniform visual style throughout the years.

SIAMESE also tends to pick up on trends in layout. Advertisements that frame text in a particular way, or that have an unusual width or height, are grouped together. This does not always mean that the objects represented in the ads are similar.

One issue in working with visual uniformity in advertisements is that they often consist of an assemblage of objects and visual markers. Even though the overarching visual style of two advertisements might differ, their conceptual link might lie in the visual similarity between particular objects or groups of objects in the images, for instance a tube of toothpaste next to shining white teeth. Hence, a future step would be to isolate objects in images. This would enable the study of assemblages of visually similar objects.

SIAMESE offers users an exploratory experience through a subset of the vast archive of advertisements that the KB offers. The KB also provides researchers with the dataset of advertisements that was used for SIAMESE, to facilitate further research. For more on the dataset, see the SIAMESET page.

Author
Melvin Wevers
PhD researcher
Melvin Wevers is a postdoctoral researcher in the digital humanities group of the KNAW Humanities Cluster.

*For more on the underlying technique see: di Lenardo, I., Seguin, B., Kaplan, F. (2016). Visual Patterns Discovery in Large Databases of Paintings. In Digital Humanities 2016: Conference Abstracts. Jagiellonian University & Pedagogical University, Kraków, pp. 169-172.