Making your project findable is the first stage in achieving a fully accessible archive – if data and information cannot be located, even the most sparkling digital archive remains inaccessible. Creating an OASIS index record for each archaeological project is required by the majority of local authorities and national heritage bodies in England and Scotland, and is considered good practice across the heritage profession (see Infosheet #3 - Digital Archives in the UK).
OASIS is a data-capture system through which practitioners can provide information about their investigations to the wider project team, project stakeholders, researchers and the public. The system provides a unique identifier for the project that links up other important data, such as your project DOI, site code, its location, the HER event number, the museum accession ID and digital archive location. Details about the project are provided at the outset and updated regularly, signposting key documents and archive details.
At the beginning of the archaeological project, the creation of the OASIS record kickstarts the process of documentation and findability. The project will be indexed, providing information about it and signposting reports and archive elements. This enables your project to be part of a dynamic and accessible system, making sure it can be located now and in the future.
For projects undertaken in Wales and Northern Ireland, it is important to ensure that information about your site is included in the national database. In Wales, the four regional Welsh Archaeological Trusts maintain HERs, which provide an index of archaeological projects. For Northern Ireland, the Historic Environment Record of Northern Ireland (HERoNI) provides an index of archaeological projects (see Infosheet #3 - Digital Archives in the UK).
You can find lots of helpful resources and guides on the OASIS website.
Metadata are all the information necessary to find, understand, interpret and use a digital dataset. It is important that metadata are clearly written and express the meaning of the data, so that the data can be reused.
Metadata support long-term preservation, providing an account of digital material that determines how the different media contained in the archive can best be preserved and used. Data are structured in a machine-readable way, making datasets easily accessible to multiple resources.
The process of documentation begins early on in your project. For planning purposes, a good place to start is the repository’s deposition requirements – just as guidelines exist for physical archive elements, a digital archive repository has the equivalent for digital archive elements.
In completing the data management plan (DMP), you will need to indicate the intended digital data repository and will also be prompted to contact them during project start-up stages. Relevant guidelines will be accessible online and provide supporting information, such as standards for documentation and metadata.
For example, data archives deposited with the Archaeology Data Service (ADS) will require:
- a collection-level metadata summary documenting the project details and summarising data included in the archive
- a series of data-specific, file-level metadata tables which will commonly include notes on the software used, lists of file names, and contextual information
When planned in advance, the collection of metadata can be a simple process embedded into project workflows, and the data are more likely to be fully and clearly described by the data or system creator.
Metadata compilation follows the same general principle as other aspects of the archaeological archive. We are used to using registers for finds, photos and plans; metadata tables simply provide an index for data files.
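The register analogy can be illustrated with a short script. This is a minimal sketch only, assuming a working archive held in a single folder; the column names, the `software_notes` lookup and the function names are hypothetical and are not taken from any repository's template, so the receiving repository's own guidelines should always take precedence.

```python
import csv
import hashlib
from pathlib import Path

# Hypothetical column set; a real table should follow the repository's template.
FIELDNAMES = ["file_name", "format", "size_bytes", "sha256", "software", "description"]


def build_file_register(archive_dir, software_notes=None):
    """Return one metadata row (a dict) for each file found under archive_dir."""
    software_notes = software_notes or {}
    root = Path(archive_dir)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        suffix = path.suffix.lower()
        rows.append({
            "file_name": path.relative_to(root).as_posix(),
            "format": suffix.lstrip("."),
            "size_bytes": path.stat().st_size,
            # A checksum lets anyone confirm later that the file is unchanged.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "software": software_notes.get(suffix, ""),
            "description": "",  # left blank for the data creator to complete
        })
    return rows


def write_register(rows, out_path):
    """Write the register rows to a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `build_file_register("working_archive")` and passing the result to `write_register` would give a starting table whose contextual columns still need to be completed by the person who created the data.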
GDPR and personal data
An important part of project documentation, in the context of both deposition and making information publicly accessible, is consideration of GDPR, data sharing and copyright.
For GDPR, your organisational policy on data sharing and privacy will be a good place to start, ensuring that the contents of your data archive are consistent with this. There are some areas where you may feel that personal data should be included in an archive, in which case you will need to ensure that permission has been granted to do so.
Examples of personal data that may be embedded within the working project archive are:
- the names and addresses of landowners and other individuals
- personal contact details for specialists
- financial details relating to employees or contractors
- images of contributors or project participants
- names of project participants
Where personal, confidential and sensitive data are considered an important part of the archive for social or contextual information, a process of informed consent and/or anonymisation may need to take place. If the data do not need to be in the deposited archive, they should be removed or redacted.
The DMP guides and records this process. You will need to identify whether sensitive data are likely to be included as part of the working project archive and provide a summary of how those data will be managed and included in the project archive.
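For tabular records, removing personal data before deposition might look something like the sketch below. It is illustrative only: the column names and the `redact_rows` helper are hypothetical, and which fields count as personal, and which have consent for inclusion, would be decided through your organisational policy and recorded in the DMP, not by the script.

```python
# Hypothetical column names; the real list would be agreed in the DMP.
PERSONAL_COLUMNS = frozenset({"landowner_name", "landowner_address", "specialist_email"})


def redact_rows(rows, personal_columns=PERSONAL_COLUMNS, consented=()):
    """Return copies of the rows without personal columns.

    Columns listed in `consented` are kept, on the assumption that informed
    consent for their inclusion has been recorded elsewhere.
    """
    drop = set(personal_columns) - set(consented)
    return [{key: value for key, value in row.items() if key not in drop}
            for row in rows]


working_rows = [
    {"trench": "T1", "landowner_name": "A. Example", "specialist_email": "x@example.org"},
]
deposit_rows = redact_rows(working_rows)
```

The original rows are left untouched, so the working project archive and the version prepared for deposition can be kept apart.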
Data sharing and copyright
Another key area which needs to be fully considered at the outset of the project is copyright for the data embedded within your data archive. Ensuring that ownership, data sharing (including reporting and publication) and long-term preservation have been discussed with clients and stakeholders, and that copyright agreements are in place where necessary, will help reduce any issues when it comes to deposition. This information should be recorded in your DMP.
How your project data can be accessed and reused will be considered as part of the DMP, and you will need to consider how your project data should be made available once the project is completed. This includes consideration of any potential restrictions but also, on a more positive note, of how you can maximise reuse through data formats, rich metadata and findability of the archive.
The creator of the work or the organisation leading the work will generally hold the copyright for data, but you may need to review contracts and funding agreements as these can require that another party also holds copyright to works they have supported. If this is the case, you will need to secure an agreement that data can be included in the digital archive prior to deposition.
When depositing the archive, you will most likely be asked to complete a deposit licence. This provides the legal permissions and warranties needed for the repository to preserve your digital archive and to make it freely accessible. Should copyright or data-sharing agreements preclude open access to data, for example if a site is at risk, embargoes or restrictions can be placed on archived materials. This should be discussed with the repository and noted in the DMP.
Digital material, including both data and metadata, should be retrievable in a variety of formats that are usable by people and machines. Anyone with access to the internet should be able to access at least the metadata associated with a research project, and to understand the conditions under which digital data can be accessed. To meet the CIfA Standards fully, the intended trusted digital repository must also support public access to project data in perpetuity.
A data management plan, or DMP, is a document which describes how you are planning to manage the data gathered through the delivery of a project, and what will happen to those data (e.g. plans for sharing and preservation) once the project is complete.
Can be located using a unique identifier. In this context, this means that the digital archive from your project is uploaded to a public repository, and rich metadata are assigned a unique and persistent identifier. The identifier will be assigned by the repository (like a museum accession code) and can be cited in the same way as a publication.
Metadata are data about a digital resource that are stored in a structured form suitable for machine processing. They serve many purposes in long-term preservation, providing a record of activities that have been performed upon the digital material and a basis on which future decisions on preservation activities can be made, as well as supporting discovery and use.
The form of metadata required for archive deposition may be specified by the receiving repository; for example, see ADS guidelines for depositors.
Data should be richly documented using metadata that meet relevant community standards and provide information about provenance. Data archives should be released with a clear and transparent usage licence, so the data repository can manage reuse appropriately. Data formats should be limited to widely used and open formats, consistent with archive needs. In short, data should be easy to use and easy to cite, so that they can be readily integrated into future research.
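A simple check of file formats before deposition could be sketched as follows. The list of accepted extensions here is an assumption made purely for illustration, as is the function name; the formats a repository actually accepts are set out in its own guidance and should replace this list.

```python
from pathlib import Path

# Illustrative allow-list only (an assumption); use the repository's guidance.
ASSUMED_OPEN_FORMATS = frozenset({".csv", ".txt", ".tif", ".tiff", ".xml", ".json", ".png"})


def files_needing_review(file_names, accepted=ASSUMED_OPEN_FORMATS):
    """Return the file names whose extension is not in the accepted set."""
    return [name for name in file_names
            if Path(name).suffix.lower() not in accepted]


to_review = files_needing_review(["report.docx", "plan.TIF", "finds.csv", "README"])
```

Files flagged in this way are not necessarily unsuitable; they are simply candidates for conversion or for a conversation with the repository.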