Conducting research on web archives

Would you like to find a page that has disappeared from the live web? Use the French web archives held at the BnF for your research? Mine data from existing collections? Stabilise and preserve a corpus of websites that you are working on? Find out about the services and tools the BnF offers to support your use of web archives.

 

20 years of web archives in France (web archiving) conference - © Emmanuel Nguyen Ngoc / BnF

Web archives: a useful tool for your research

How have the digital mediation practices of French museums changed over the last 20 years? How have writers’ networks been reconfigured by digital technology? How have the Web and social media shaken up the ways in which activism is practised, and profoundly transformed “l’écriture de soi” (writing on the self)? How can we trace the origins of the first digital art websites? How has the design of science fiction blogs evolved? How did Olympic athletes prepare to compete ten years ago? Answering these questions requires exploring the Web of the past and the traces of different eras, which is what the web archives make possible.

The web archives are accessible from all the workstations in the BnF’s research rooms and in a network of partner libraries.

More information on how to access and re-use these collections

Discover the web legal deposit research blog

Support for research projects on web archives

Several support services are available, ranging from ad hoc assistance to the establishment of a genuine partnership.

Ad hoc support and assistance with documentary research

The “Archives de l’internet” guide gives you the basic information you need to get to grips with this source, including an overview of the possible search methods: search by URL across all the collections (“Archives de l’internet” application), full-text search on selected collections (“Archives de l’internet Labs”) and thematic guided tours produced by archivists and partner researchers.

Are you looking for an introduction to the web archives to help you prepare your project? Register with the BnF DataLab to take advantage of tailored sessions that introduce the collections and the research tools. Appointments with experts can also be arranged for ad hoc documentary assistance or more advanced introductions to the service.

For more information on registering for the BnF DataLab

Come and work with your students

We regularly welcome groups of students, from undergraduates to second-year Master’s students. Several formats are possible: an introduction to web legal deposit and the collections, practical sessions to get to grips with the consultation tools, or workshops to explore web archives collectively on a given theme.

Support for all types of research project

The BnF’s hosting schemes offer you support tailored to your profile and your projects. Two schemes, associate researchers and the BnF DataLab residence, are open to applications every year.

Find current calls for applications on the BnF’s research blog

Associate researchers

Every year since 2003, the BnF has published a call for young researchers wishing to work on its collections, whether physical or digital. The status of associate researcher allows you to benefit from methodological and scientific support, as well as privileged access to data collected as part of web legal deposit. This one-year scheme is renewable twice.

For more information on the associate researcher scheme

Call for BnF DataLab projects

This call for projects is run in partnership with the Huma-Num Research Infrastructure. It focuses exclusively on the BnF’s digital collections. The winning projects benefit from the services of the BnF DataLab and from funding. The aim of the call is to encourage the development and sharing of methodologies and tools that facilitate data mining, corpus building and computational analysis.

For more information on BnF DataLab services

Projects under agreement

Research partnerships are possible between the BnF and one or more research laboratories. The objectives of the project are set out in an agreement. As part of a project funded by the ANR, a LabEx or the ERC, the BnF may be involved in one of the phases of the project. In the case of web archives, it provides methodological and engineering expertise for projects involving collection, data mining and educational development.

The BnF DataLab’s web archive services

The BnF DataLab has been designed as a physical and virtual support space. It is based on a common service offering combining training, project engineering, promotion of research results and more resource-specific services.

Three services have been specifically designed to support projects involving web archives.

On-demand crawl 

To design and archive a web corpus related to your subject of study. Corpora must comply with the legal scope of web legal deposit (the French Web, excluding radio and television sites, which fall within the scope of the Institut national de l’audiovisuel (INA – National Audiovisual Institute)).

Assistance with text and data mining

To build up a corpus, create a map or produce data visualisations from web archives. The BnF provides you with tools shared by the scientific community (e.g. Hyphe, developed by the Sciences Po médialab) or developed by the archival community (SolrWayback from the Royal Danish Library), which complement the consultation applications.

The BnF DataLab offers demonstration sessions and hands-on use of these tools according to your needs.

Data extraction and metadata

For analysis purposes, you can request the extraction of archived web content or sets of metadata. Archivists can show you the formats available, as well as their technical characteristics, and provide you with initial test sets or samples. Extraction is tailored to your needs and supported by BnF experts.

An international outlook

The BnF is a member of the International Internet Preservation Consortium (IIPC), which brings together organisations from 35 countries to cooperate and share best practice.

The IIPC supports the development of common tools and standards, organises transnational collections on major topical issues and promotes training in, and research use of, web archives. Several working groups bring together experts from different countries to discuss issues relating to training, research and the development of the collections.

The BnF is fully committed to its partners, institutions and users, and regularly presents the progress of its work at key international events: the Web Archiving Conference (WAC), organised by the IIPC, and the Research Infrastructure for the Study of Archived Web Materials (RESAW) conference.

In 2024, the BnF will host the WAC. Join us and our IIPC partners on 25 and 26 April 2024.

For more information on international cooperation on web archiving

Resources

Guide des Archives de l'internet

FR - PDF - 402.78 KB

Contact