Research Methodology

Open Source Tools and Methods for Open Source Investigations - Digital Evidence Workflow

This page provides an overview of the Digital Evidence Workflow methodology used by the Yemeni Archive in carrying out its work. The methodology section contains the following elements:

  A) Identification, collection and secure preservation
  B) Initial verification
  C) In-depth verification

A. Identification, collection and secure preservation

The Yemeni Archive's identification, collection, and secure preservation methodology comprises the following five steps:

  1. Establish database of credible sources for content
  2. Establish database of credible sources for verification
  3. Establish standardised metadata schema
  4. Record additional metadata
  5. Collect, store, hash, and timestamp visual evidence from verified sources

Step 1 : Establish database of credible sources for content

The Yemeni Archive has established its source database by following credible and verified social media accounts and channels (e.g. YouTube, Facebook) of individual citizen journalists, as well as through partnerships with contributing journalists and documentation groups. It is important to note that many, if not all, of these sources are partisan, and their claims therefore need to be treated with caution.

Step 2 : Establish database of credible sources for verification

The Yemeni Archive has established a trusted team of citizen journalists and human rights defenders based in Yemen who provide additional information used for verification of content originating on social media platforms or sent from sources directly.

Step 3 : Establish standardised metadata schema

The Yemeni Archive organises preserved materials by cataloguing content in a standardised format.
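As an illustration of what cataloguing in a standardised format can involve, the sketch below models a single catalogue record in Python. The class and field names are assumptions chosen to reflect the kinds of information this section mentions (location, date, origin, target and alleged perpetrator). They are not the Yemeni Archive's actual fields, which are listed in the metadata section of the website, and the sample values are invented.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class CatalogueRecord:
    """One preserved item. All field names here are hypothetical."""
    reference: str                              # internal identifier (assumed)
    origin: str                                 # where the material came from
    location: Optional[str] = None              # place of the incident, when known
    recorded_date: Optional[str] = None         # ISO 8601 date, when known
    target: Optional[str] = None                # e.g. "civilian infrastructure"
    alleged_perpetrator: Optional[str] = None   # included only when known

    def missing_fields(self) -> List[str]:
        """Return the names of fields that have not been filled in yet."""
        return [name for name, value in asdict(self).items() if value is None]


# Invented example values, for illustration only.
record = CatalogueRecord(
    reference="example-0001",
    origin="https://example.org/channel",
    location="Sana'a",
    recorded_date="2018-01-01",
)
print(record.missing_fields())  # prints ['target', 'alleged_perpetrator']
```

Keeping unknown values as explicit gaps, rather than guessing them, makes it easy to see which records still need contextual information.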

Step 4 : Record additional metadata

Additional value is added to the material by recording as much metadata and chain of custody information as possible, including location, date and origin. Information regarding the target of the attack (e.g. journalists, civilian infrastructure, cultural property, or humanitarian relief personnel and objects), as well as the alleged perpetrator, is included when known. This contextualises the material by addressing the questions of when, where and what happened in a specific incident, which helps viewers to identify and understand it.

A full list of fields is available in the metadata section of the website.

Step 5 : Collect, store, hash, and timestamp visual evidence from verified sources

To ensure that the original content is not lost due to removal from corporate platforms, visual evidence from credible social media channels is scraped and stored securely on external backend servers before it goes through basic verification. Videos are hashed with SHA-256 and MD5 and timestamped so that any tampering after they have been scraped from social media platforms or received directly from sources can be detected.
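The hashing step can be sketched in Python with the standard hashlib module. This is a minimal illustration, not the Yemeni Archive's actual tooling: the function names and the returned dictionary layout are assumptions, and the timestamp is simply the local clock converted to UTC, whereas a production workflow might rely on a trusted external timestamping service.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA-256 and MD5 digests of a file and record when this was done."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large video files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return {
        "sha256": sha256.hexdigest(),
        "md5": md5.hexdigest(),
        "hashed_at": datetime.now(timezone.utc).isoformat(),
    }


def is_unchanged(path: str, recorded: dict) -> bool:
    """Re-hash the file and compare both digests with the recorded values."""
    current = fingerprint(path)
    return (current["sha256"] == recorded["sha256"]
            and current["md5"] == recorded["md5"])
```

Calling fingerprint() when a file is first stored and is_unchanged() at any later point shows whether the stored copy still matches what was originally collected.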

B. Initial verification

Similar to the archival and collection methodology, the initial verification methodology comprises the following five steps:

  1. Aggregate metadata from visual evidence
  2. Verify the source of the video
  3. Verify the location of the video
  4. Verify the dates on which the video was filmed and uploaded
  5. Publish verified video evidence on the Yemeni Archive's online database

These five steps are discussed in further detail below.

Step 1 : Aggregate metadata from visual evidence

Metadata from visual evidence sent directly or scraped from social media websites is parsed and aggregated using a predefined and standardised metadata schema, as described above. This prepares the visual evidence for initial verification. The metadata includes the upload date and time, the uploader's name, the title and description of the video, the location, and the device used to upload the video.
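A minimal sketch of this aggregation is shown below. It assumes that the raw metadata arrives as a Python dictionary whose key names vary by platform; every key name in STANDARD_FIELDS, and the sample input, is hypothetical. Numeric upload times are assumed to be Unix timestamps and are normalised to UTC.

```python
from datetime import datetime, timezone

# Hypothetical mapping: standard field -> raw keys that may carry its value.
STANDARD_FIELDS = {
    "uploaded_at": ["upload_timestamp", "published_at"],
    "uploader": ["uploader", "channel_name"],
    "title": ["title"],
    "description": ["description"],
    "location": ["location"],
    "device": ["device"],
}


def aggregate(raw: dict) -> dict:
    """Map raw platform metadata onto the standard fields, leaving gaps as None."""
    record = {}
    for field, candidates in STANDARD_FIELDS.items():
        record[field] = next(
            (raw[key] for key in candidates if raw.get(key) not in (None, "")),
            None,
        )
    # Assumption: a numeric upload time is a Unix timestamp in seconds.
    if isinstance(record["uploaded_at"], (int, float)):
        record["uploaded_at"] = datetime.fromtimestamp(
            record["uploaded_at"], tz=timezone.utc
        ).isoformat()
    return record


sample = {"upload_timestamp": 1514764800, "channel_name": "Example Channel",
          "title": "Example title", "description": ""}
print(aggregate(sample)["uploaded_at"])  # prints 2018-01-01T00:00:00+00:00
```

Fields that a platform does not supply remain None, so missing values stay visible during initial verification instead of being silently filled in.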

Step 2 : Verify the source of the video

To verify the source of the video, it needs to be established that the source is on the Yemeni Archive's verified list of credible sources. If the source is not an existing trusted source, the new source's credibility is determined by evaluating:

  • Whether the source is familiar to the Yemeni Archive or to its existing professional network of Yemeni journalists and media activists;
  • Whether the source's content and reportage have been reliable in the past. This is determined by evaluating how long the source has been reporting and how active they are;
  • Where the source is based. This is determined by evaluating whether videos uploaded are consistent and mostly from a specific location where the source is based;
  • Whether the video account uses a logo and whether this logo is consistently used across videos;
  • Whether the uploader aggregates videos from other news organisations and YouTube accounts, or whether they upload mostly user-generated content.
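Most of these criteria call for human judgement, but the location-consistency criterion can be supported with a simple count. The sketch below assumes that a location label has already been recorded for each of a source's uploads. The function name and sample labels are hypothetical, and no threshold is implied by the methodology; the resulting share is only one input to the assessment.

```python
from collections import Counter
from typing import List, Optional, Tuple


def dominant_location(upload_locations: List[Optional[str]]) -> Tuple[Optional[str], float]:
    """Return the most frequent location and its share of uploads with a known location."""
    known = [place for place in upload_locations if place]
    if not known:
        return None, 0.0
    place, count = Counter(known).most_common(1)[0]
    return place, count / len(known)


# Invented example: three of four located uploads come from the same place.
print(dominant_location(["Taiz", "Taiz", None, "Taiz", "Aden"]))  # prints ('Taiz', 0.75)
```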

Step 3 : Verify the location of the video

Each video goes through basic geolocation to verify that it was captured in Yemen. More in-depth geolocation is conducted to verify that videos from a dataset were captured in a specific city. This is done by comparing reference points (e.g. buildings, mountain ranges, trees, minarets) with satellite imagery from Google Earth, Microsoft Bing, and DigitalGlobe, as well as OpenStreetMap maps and geolocated photographs. In addition, the Yemeni Archive compares the Arabic spoken in videos against known regional accents and dialects within Yemen to further verify the location of videos. When possible, the Yemeni Archive contacts the source directly to confirm the location, and cross-references video evidence by consulting existing networks of journalists operating inside and outside Yemen to confirm the locations of specific incidents.
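Matching reference points against imagery is manual work, but once it yields coordinates, they can be compared with the location claimed by the source. The sketch below is an assumption about how such a check might be automated and is not described in the methodology itself. It uses the haversine formula for great-circle distance, and the 5 km tolerance is an arbitrary placeholder.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def consistent_with_claim(geolocated: Tuple[float, float],
                          claimed: Tuple[float, float],
                          tolerance_km: float = 5.0) -> bool:
    """True if the geolocated point lies within tolerance_km of the claimed point."""
    return haversine_km(*geolocated, *claimed) <= tolerance_km
```

A mismatch does not by itself show that a claim is false; it flags the video for closer review.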

Step 4 : Verify the dates on which the video was filmed and uploaded

The Yemeni Archive verifies the date on which a video was captured by cross-referencing the publishing date on social media platforms with dates from reports concerning the same incident. Sources for reports used for cross-referencing include:

  • News reports from international and local media outlets;
  • Human rights reports published by international and local organisations;
  • Reports shared by the Yemeni Archive's network of citizen reporters on social media about the incidents.
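The cross-referencing described above can be sketched as a comparison between the publishing date and the dates given in reports. In the sketch below, the function name, the report labels, and the two-day window are all assumptions; the methodology does not specify how close the dates must be.

```python
from datetime import date, timedelta
from typing import List, Tuple


def corroborating_reports(published: date,
                          reports: List[Tuple[str, date]],
                          window_days: int = 2) -> List[str]:
    """Return labels of reports dated within window_days of the publishing date."""
    window = timedelta(days=window_days)
    return [label for label, reported in reports
            if abs(reported - published) <= window]


# Invented example data, for illustration only.
reports = [
    ("local media outlet", date(2018, 1, 1)),
    ("human rights report", date(2018, 1, 3)),
    ("unrelated report", date(2018, 1, 10)),
]
print(corroborating_reports(date(2018, 1, 2), reports))
# prints ['local media outlet', 'human rights report']
```

An empty result means that none of the collected reports corroborates the date and that further checking is needed.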

Step 5 : Publish verified video evidence on the Yemeni Archive's online database

After videos have gone through the basic verification process, they are uploaded to the Yemeni Archive website where they are made publicly available in a free and open-source format.

C. In-depth verification

In some cases, the Yemeni Archive is able to conduct in-depth online open source investigations. Videos and other open source materials shared online are used to understand an incident and to verify claims made about it. Time and capacity limitations mean that not all incidents can be analysed in depth. However, by developing a replicable workflow, the Yemeni Archive hopes that others can assist in these efforts and investigate other incidents using similar methods.