1. Introduction
This spreadsheet collates data comparing the file formats used in Submission Information Packages (SIPs) by different cultural heritage and research institutions around the world. The comparison is released in yearly cycles.

The comparison IS NOT to be used on its own as a set of file format recommendations. It is solely an indicator of prevalence across heritage institutions. Instead, use an analytical framework to create your own institutional file format policies, drawing on the information in the comparison as one sustainability criterion among others.

For the comparison table, see the sheet "2023 SIP File Formats".
For the supplementary information table, see the sheet "DP Strategies". DP stands for Digital Preservation.

The spreadsheet is maintained by the ICRFF working group of the Open Preservation Foundation (OPF). To get in touch, contact Georgia Moppett (OPF).

Webpage: OPF working group
Read about how to contribute at the link above.
2. File Format Scoring
The different file formats are scored in a uniform way to enable strict comparison between the practices of different institutions, internationally. The file formats may be represented in Submission Information Packages.

Potential values for SIP file formats

Preferred = 2
The value may be given if the file format has been designated by the institution as ideal for submission of data according to the content information type. The value is translated to the numerical value 2.

Accepted = 1
The value may be given if the file format can be submitted to the institution, but another file format is designated as the preferred submission file format according to the content information type of the data. The value is translated to the numerical value 1.

Accepted, but undesired = 0
The value may be given if the institution accepts the file format because the data cannot be migrated to a more suitable file format in the short term, according to the content information type. The purpose of such an imminent transfer is to prevent total loss of the data. The value may entail that the data will never be migrated. The value is translated to the numerical value 0.

Unaccepted = -1
The value may be given if the institution has explicitly designated the file format as unacceptable for submission, or if the institution maintains an exhaustive list of designated file formats and the file format is not on it, from which it follows implicitly that the file format cannot be submitted. The value is translated to the numerical value -1.

Undefined = null (no value)
The value may be given if the file format has not been assessed by the institution, or if the data are not ingested by the institution, either because of policies regarding deletion or because another national, regional or local institution is obligated to preserve the data. The value does not have a numerical translation.
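The label-to-number translation described above can be sketched as a small lookup table. This is a minimal illustration, not part of the spreadsheet: the names `SIP_FORMAT_SCORES`, `score_for` and `sum_defined` are hypothetical, and the summing helper is only an assumption about how a reader might aggregate values. It does not reproduce any formula used in the comparison table.

```python
from typing import Dict, Iterable, Optional

# Label-to-score mapping as listed under "Potential values for SIP file formats".
# "Undefined" has no numerical translation, so it is represented as None.
SIP_FORMAT_SCORES: Dict[str, Optional[int]] = {
    "Preferred": 2,
    "Accepted": 1,
    "Accepted, but undesired": 0,
    "Unaccepted": -1,
    "Undefined": None,
}


def score_for(label: str) -> Optional[int]:
    """Translate a label to its numerical value (None for "Undefined")."""
    key = label.strip()
    if key not in SIP_FORMAT_SCORES:
        raise ValueError(f"Unknown SIP file format value: {label!r}")
    return SIP_FORMAT_SCORES[key]


def sum_defined(labels: Iterable[str]) -> int:
    """Hypothetical aggregate: add up the numeric values, skipping "Undefined".

    Assumption for illustration only; the spreadsheet's own score columns
    may be calculated differently.
    """
    values = (score_for(label) for label in labels)
    return sum(value for value in values if value is not None)
```

Representing "Undefined" as `None` rather than 0 keeps an unassessed file format distinct from one that is "Accepted, but undesired".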
3. Digital Preservation Strategies
The comparison table also collates information on which digital preservation strategies are applied by the different institutions. The strategies may be combined within an institution. The digital preservation strategies provide context for understanding the background for the values given to each file format. Typically, institutions supplement their digital preservation strategy with a method for file format assessments.

For the table, see the sheet "Digital Preservation Strategies".

Potential digital preservation strategies

Early migration
Data are migrated to a designated portfolio of file formats before submission. The responsibility for the migration therefore lies with the data producer. The institution commits to migrating the data again over time, before current hardware and software are no longer capable of rendering the data. This may or may not include preserving a copy of the original data.

Late migration/Format agnostic/Minimal effort ingest
The institution accepts all or most data in original file formats for submission and commits to migrating the data, potentially during ingest, or at the latest before current hardware and software are no longer capable of rendering the data.

Bit Preservation
The institution accepts data in original file formats for submission and does not migrate data or provide any support for rendering data to the archival user. The institution only preserves the original bits and disseminates the original files to archival users, who use them at their own responsibility.

Emulation
Preservation of original data and dissemination based on the preservation of original software setups, which are emulated on newer operating systems and hardware.

Museum
Preservation of original data and dissemination of data based on the simultaneous preservation of the original hardware and software setups.
4. Credits
Role | Name | Organisation
Original creator | Aadi Kaljuvee | National Archives of Estonia
Editor, maintainer | Asbjørn Skødt | Danish National Archives
Former host, coordinator | Becky McGuinness | Open Preservation Foundation
Former host, coordinator | Charlotte Armstrong | Open Preservation Foundation
Host, coordinator | Georgia Moppett | Open Preservation Foundation
5. Releases
Version | Date | Release Notes
1.0 | 2021, September | 1st public release.
1.1 | 2022, February | 1st review phase, 2nd sheet about "Digital Preservation Strategies" created. JPEG-2000 file format added to more content information types. Minor changes made throughout the sheets.
1.2 | 2022, March | 2nd public release. Added institution Riksarkivet, Sweden. Added section for "Regional & municipal archives" and institution "Kommunalförbundet Sydarkivera". "Total score" and "Part score" columns now have grey colours to denote any judgment on the scores. Year of last update of format guidelines and link to format guidelines have been added to the "Digital Preservation Strategies" sheet.
1.3 | 2022, April | 2nd review phase. Updated State Archives of Belgium values. Added CGM, ODI and DWF file formats. Added several file format and codec combinations for video. New column "Remarks from Institution" added to the table "Digital Preservation Strategies".
1.4 | 2022, July | Added CINES institution.
1.5 | 2022, October | 3rd public release. Updated NARA information. Updated versioning of DNG and WARC. Added new Strict conformance to OOXML file formats.
1.6 | 2022, December | 3rd review phase. Updated values for 5 institutions and added institution "Digital Repository of Ireland" and "SLUB Dresden". Added new subcategory of file formats "Forensic Images" and two file formats in it. Added many new file formats.
2022 | 2022, December | Final release of 2022 with all changes from the year.