Digital object quality control with ElasticSearch queries
Introduction
- The Archives Catalogue is indexed with ElasticSearch. Users can construct "expert" search queries to search individual fields in the ElasticSearch index.
- Editors and contributors can use ElasticSearch queries to support quality control activities. For example, ElasticSearch queries can check for the existence of mandatory data elements, such as name of repository, title, physical description, etc.
- Editors and contributors can use search results as a "checklist" of records with specific errors or omissions to be corrected.
- This page provides information on how to use the ElasticSearch index to search for archival descriptions with digital objects. Most of these ElasticSearch queries can be replicated using the advanced search form.
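The existence checks described above can be sketched as a small helper that builds one "missing field" query per mandatory element. This is a minimal sketch, not the catalogue's documented method: it assumes the expert search accepts the standard Elasticsearch query string operator `_exists_`, and the field names in `MANDATORY_FIELDS` are hypothetical placeholders that would need to be replaced with the actual names in the index.

```python
def missing_field_query(field: str) -> str:
    """Build a query string matching records in which `field` has no value.

    Assumption: the expert search accepts the standard Elasticsearch
    query string operator `_exists_`.
    """
    if not field or any(ch.isspace() for ch in field):
        raise ValueError("field name must be non-empty and contain no whitespace")
    return f"NOT _exists_:{field}"


# Hypothetical field names, for illustration only; check the actual index mapping.
MANDATORY_FIELDS = ["title", "repository", "extentAndMedium"]

# One query per mandatory element; each result list can serve as a checklist.
checklist_queries = {field: missing_field_query(field) for field in MANDATORY_FIELDS}

for field, query in checklist_queries.items():
    print(f"{field}: {query}")
```

Each generated string can be pasted into the search box, and the records it returns form the checklist of descriptions to correct.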
Quality control of digital objects
See Appendix A of the Digital Collections Handbook for detailed guidance on quality control of digital objects (i.e., access copies) uploaded to the Archives Catalogue and other platforms. Staff responsible for uploading digital objects to the Archives Catalogue are normally responsible for performing their own quality control throughout the digital collection workflow.
Search for digital objects
Use the following ElasticSearch queries to perform various searches related to digital objects:
Search query | Purpose | Expected results |
---|---|---|
digitalObject.filename:"filename" | Searches for records that contain a digital object with a specific file name. The file name must be enclosed in quotation marks, whether or not it contains spaces. | N/A |
digitalObjectMediaTypeId | Searches for records by the internal ID number of the digital object's media type. Media Type is also available in the search/browse interface. | |
hasDigitalObject:1 | Finds all archival descriptions with a digital object. Also available as a filter in the advanced search panel. | Variable; should match the total number of archival descriptions with "Digital object available" |
hasDigitalObject:0 | Finds all archival descriptions without a digital object. | Variable |
digitalObject.thumbnailPath | N/A | N/A |
transcript | Searches the indexed text captured from the text or OCR layer of a PDF or other text-based document uploaded as a digital object. Example: transcript:montgomery returns records where the term "montgomery" appears in the indexed text of an uploaded document, even if the term does not appear in the descriptive metadata. | The advanced search panel has an option to limit a search to "Digital object text." |
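
The queries in the table can also be assembled programmatically, which helps when checking a long list of file names. The following is a minimal sketch: the function names are illustrative, and the escaping of backslashes and quotation marks inside the file name is an assumption based on general Elasticsearch query string syntax rather than on documented catalogue behaviour.

```python
def filename_query(filename: str) -> str:
    """Query for records containing a digital object with this file name.

    The file name is always wrapped in quotation marks, as the table requires.
    Escaping embedded backslashes and quotation marks is an assumption.
    """
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'digitalObject.filename:"{escaped}"'


def has_digital_object_query(present: bool) -> str:
    """Query for descriptions with (1) or without (0) a digital object."""
    return f"hasDigitalObject:{1 if present else 0}"


def transcript_query(term: str) -> str:
    """Query the indexed text of uploaded documents for a single term."""
    term = term.strip()
    if not term or any(ch.isspace() for ch in term):
        raise ValueError("term must be a single word")
    return f"transcript:{term}"


# Hypothetical file names, for illustration only.
for name in ["scan 001.pdf", "photo.jpg"]:
    print(filename_query(name))

print(has_digital_object_query(True))
print(has_digital_object_query(False))
print(transcript_query("montgomery"))
```

Each printed line is a query string that can be pasted into the search box as-is.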