Expanding Our Collection's Global Reach on the Spanish Wikipedia

Elena Villaespesa
March 14, 2018

Collage of public-domain images in The Met collection

About twenty thousand new articles are added to Wikipedia every month by editors around the world. To increase the access and reach of their content on the platform, museums have been actively adding their collection data and information to Wikipedia. This work is undertaken in a variety of ways, such as staff collaborating with a Wikipedian-in-residence or hosting edit-a-thon events. In the case of The Met, one of the biggest efforts has been uploading to Wikipedia the 375,000-plus images of public-domain artworks in the Museum's collection. So what is the impact of having all of these images available for use on the biggest free encyclopedia? And how are the Wikipedia editors who use these images contributing to an organic increase of the Museum's presence on the site?

A previous blog post by Loic Tallon, chief digital officer at The Met, illustrates with data the overall increase in access to collection objects on Wikipedia in comparison to that of The Met's website. It highlights the potential increase in reaching users by placing the Museum's collection and related scholarly content outside of The Met's own channels. The post also points out how a museum can benefit from the diverse community of users around the world who translate content or create new articles in different languages.

To explore the global reach of the collection on Wikipedia, I took a close look at the data on the performance of The Met's Open Access images on the Spanish Wikipedia, which is the third-highest language edition on the platform in volume of pageviews and the fourth highest in number of articles with images from The Met, accounting for about five percent of the total number of such articles. This post outlines the data analysis of the growth in usage and pageviews of these images, and explores the types of articles to which the images have been added, including some curious articles that are not directly related to art.

Increasing Access

As of December 2017, 403 images from the category "Images from Metropolitan Museum of Art" are in use on pages of the Spanish Wikipedia, a count that includes the images uploaded as part of The Met's Open Access initiative. The number of articles that include an image from this set has grown more than fivefold since February 2017, from 36 to 192. This is mostly due to images added by editors in the Wikipedia community and the results of some online edit-a-thons.

The effect of this activity on the volume of potential impressions of those images is impressive. The graph below shows the growth in total pageviews, which more than doubled from 270,000 to an average of 617,000 in the final three months. In some cases the images have been added to existing pages, while in other cases they appear in newly created articles. For example, an article in Spanish about William the hippo was added in July, and has since received 450 pageviews.

A chart showing data in green blocks and a red line
Number of Wikipedia articles featuring images of Met artworks and the number of pageviews per article, from February 2017 through January 2018
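As a rough check on the growth figures quoted above, here is a minimal Python sketch that turns the before-and-after numbers into a multiple and a percentage increase. The helper name `growth` is hypothetical, and the sketch assumes that the two pageview figures are comparable monthly totals, which the post implies but does not state outright.

```python
def growth(before: float, after: float) -> tuple[float, float]:
    """Return (multiple, percent increase) when going from `before` to `after`."""
    multiple = after / before
    return multiple, (multiple - 1) * 100

# Figures quoted in this post (February 2017 versus the end of the period).
articles = growth(36, 192)
pageviews = growth(270_000, 617_000)

print(f"Articles:  {articles[0]:.1f}x (+{articles[1]:.0f}%)")    # 5.3x (+433%)
print(f"Pageviews: {pageviews[0]:.1f}x (+{pageviews[1]:.0f}%)")  # 2.3x (+129%)
```

The number of articles grew much faster than the pageviews did, which is consistent with many of the newer articles sitting in the low-traffic long tail.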

The volume of pageviews has increased during this period, but which articles are getting the most engagement? Looking at the histogram of pageviews for December 2017, we can see that five articles are getting more than twenty thousand pageviews each, and together they represent over half of the total volume of pageviews. The long tail includes pages with a much lower volume, in some cases because they are new articles recently created on the Spanish Wikipedia, and in others because the content covers a more niche topic. As some of these are more specific or solely dedicated to an artwork, it could be argued that their readers are more likely to see these images.

Chart showing data presented in red blocks
Histogram of the distribution of article pageviews (data as of December 2017)
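The concentration described above (a handful of articles accounting for most of the traffic) can be measured with a few lines of Python. This is only a sketch: the function name `share_above` is hypothetical, and the sample numbers are invented for illustration and are not the data behind the histogram.

```python
def share_above(pageviews: list[int], threshold: int) -> tuple[int, float]:
    """Count the articles above `threshold` views and their share of all views."""
    heavy = [v for v in pageviews if v > threshold]
    total = sum(pageviews)
    return len(heavy), (sum(heavy) / total if total else 0.0)

# Hypothetical monthly pageviews per article (NOT the real data set):
# a few heavily visited pages followed by a long tail of niche ones.
sample = [90_000, 60_000, 45_000, 30_000, 25_000,
          18_000, 15_000, 12_000, 9_000, 6_000, 3_000, 1_200, 450, 300, 50]

count, share = share_above(sample, 20_000)
print(f"{count} articles above 20,000 views hold {share:.0%} of all pageviews")
```

Running the same calculation each month would show whether the long tail is gaining weight as more images are added to niche articles.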

Chart representing two sets of data: one in green blocks, the other in pink
Type of articles and pageviews on the Spanish Wikipedia (data as of December 2017)

The articles on the Spanish Wikipedia include a range of topics. The 183 articles published as of December were coded into different categories with the aim of evaluating the diversity of content and access, with just over half of those representing articles about artists or artworks. Among the artworks we find pages dedicated to a specific object from The Met collection, such as La muerte de Sócrates (The Death of Socrates), or a page dedicated to a series of artworks such as La isla de los muertos (Isle of the Dead), which includes not only the painting from The Met, but also four from other museums, providing a wider context for the reader.

Interestingly, among the articles in this category with a high volume of pageviews, we find artworks painted by Spanish artists, such as Vista de Toledo (View of Toledo) by El Greco or Virgen niña en éxtasis (The Young Virgin) by Francisco de Zurbarán. After artwork articles, the next category with a high volume of pageviews is artist articles. El Greco is at the top in number of pageviews, followed by popular artists like Monet, Delacroix, and Vermeer. A particularly interesting page on this list is Luisa Roldán, the first documented female sculptor in Spain.

The next category in terms of number of articles, and the top category in terms of viewership, is History and Culture. For instance, a large percentage of articles in this category are about Egyptian culture and ancient and archaeological topics, such as Cultura del antiguo Egipto, papiro, Amenmeses, KV54, and Merimdense. This tells an interesting story about how editors have been linking the objects in order to document historical periods and different cultures, and it also shows the potential reach of the Museum's collection across a variety of topics within this category.

Artwork images from The Met collection also illustrate places or monuments from different locations; for example, the Hôtel de Ville in Paris or the Lavapiés neighborhood in Madrid. Across the list of pages, we can also find a range of topics that are not related to art, from household objects to food and animals. With only one or two Wikipedia articles coded under each of these broader themes, we can find some curious ones like chocolate caliente (hot chocolate), which includes four chocolate pots from the collection, or silla (chair), which lists different types of chairs: a baby stroller, an office chair, and also some historic chairs illustrated with a few images from the Museum's collection. Other pages under this "other" category include, as of this date, gallina cochinchina (Cochin chicken), Columna Morris, and natron. As more articles are added to Wikipedia, some of these objects will likely call for new categories in the coding scheme.

Screenshot of a page on the Spanish Wikipedia showing images from The Met collection
Screenshot of a Spanish Wikipedia page featuring Open Access images of four works from The Met collection
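The category coding described in this section boils down to counting articles and summing pageviews per category. Below is a minimal Python sketch of that tally. The article titles appear in this post, but the category assignments shown and all of the pageview numbers are illustrative assumptions, and `summarize` is a hypothetical helper rather than the method actually used for the analysis.

```python
from collections import defaultdict

# Hypothetical rows of (article title, coded category, monthly pageviews).
# Titles come from this post; the numbers are made up for illustration.
rows = [
    ("Vista de Toledo", "Artwork", 1_200),
    ("La muerte de Sócrates", "Artwork", 900),
    ("El Greco", "Artist", 30_000),
    ("Cultura del antiguo Egipto", "History and Culture", 45_000),
    ("Chocolate caliente", "Other", 2_500),
]

def summarize(rows):
    """Return {category: (number of articles, total pageviews)}."""
    totals = defaultdict(lambda: [0, 0])
    for _title, category, views in rows:
        totals[category][0] += 1
        totals[category][1] += views
    return {category: tuple(pair) for category, pair in totals.items()}

summary = summarize(rows)
# List the categories from most to least viewed.
for category, (n, views) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{category}: {n} article(s), {views:,} pageviews")
```

Keeping the article count and the pageview total side by side makes it easy to spot a category that has few articles but a large audience, or the reverse.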

Long-Term Impact

The set of Met images featured on Wikimedia Commons currently includes more than 360,000 files. Despite the increase in reach noted in the data, so far only 0.12% of the available public-domain images are being used on the Spanish Wikipedia. So what are the options for users to discover these images, and how can the Museum take an active role in promoting this data set? In the long term, as more images are organically included in articles, the shape of the histogram shown above may change. The total volume of pageviews in the long tail could slowly build up as more images are added to niche subjects, or images could be added to articles covering broader or frequently visited topics, or both could happen at once.

Having images on a wide range of topics not directly related to art could also have a long-term impact on reaching a more diverse audience. The bubble chart below explores this question in greater detail. Each circle represents one Wikipedia article, the size of the circle shows the volume of traffic that the article gets every month, and the color indicates the type of article. Very likely, more data points will be added every month going forward and, as a consequence, the visibility of The Met collection to Wikipedia readers interested in a variety of themes will increase. Moreover, Wikipedia is very well positioned in search engine results, which is another advantage of having the content on this platform.

The data analyzed in this post only covers the first ten months of Open Access at The Met. Based on how rapidly new content is being added to the platform, these numbers may look very different in the long term due to the organic role of the Wikipedia community in adding these images to articles. I invite you to interact with the chart below to explore which articles with images from The Met have been added to Wikipedia in the past year, and the volume of pageviews those articles are receiving.

Word chart showing different colored bubbles with Spanish text inside the larger ones
View an interactive version of this map on Tableau

Elena Villaespesa is the digital analyst in the Digital Department.