SIP Model

Definition


The SIP Model is an abstract model that any PAIS-compliant SIP must follow.

Source: Producer-Archive Interface Specification (PAIS) - A Tutorial. Report Concerning Space Data System Standards, CCSDS 651.2-G-1. Green Book. Issue 1. Washington, D.C.: CCSDS, September 2016. [Equivalent to ISO 20104:2015]

Introduction


PAIS provides an abstract SIP Specification (CCSD0017) known as the SIP Model. The SIP Model puts constraints on all possible SIPs within a given Producer-Archive Project.

Section 5 of PAIS explains:

The abstract SIP, or SIP Model, is an abstraction that puts constraints on all possible SIPs. It conceptually conveys one or more complete Transfer Objects. It also conceptually conveys a number of attributes about the SIP.

The framework for this SIP model is based on the concept of containers. The SIP Model is a container that holds any number of internal containers which themselves may have containers, and so on, thus supporting multiple hierarchies of containers. A container may also hold attributes about itself.
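The container framework described above can be pictured as a simple recursive structure. The following is a minimal sketch only: the `Container` class, its field names, and the sample identifiers are hypothetical and are not defined by PAIS.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    """Hypothetical stand-in for a PAIS container: attributes plus nested containers."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["Container"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of container levels, counting this one."""
        return 1 + max((child.depth() for child in self.children), default=0)


# Illustrative nesting only; the names and values are made up for this sketch.
sip = Container(
    name="SIP",
    children=[
        Container("SIP global information", attributes={"SIP ID": "sip-0001"}),
        Container(
            "Transfer object",
            children=[
                Container("Transfer object group", children=[Container("Data object")])
            ],
        ),
    ],
)
print(sip.depth())
```

Because each container holds a list of further containers, hierarchies of any depth can be represented without special cases.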

At the highest level, the SIP has two mandatory containers: the "SIP global information container" and the "transfer object type container." SIPs can also have an optional container for transfer objects to delete. Figure 3-5 in the PAIS Tutorial provides a helpful illustration of the SIP Model:

Consultative Committee for Space Data Systems. Producer-Archive Interface Specification (PAIS) – A Tutorial. CCSDS 651.2-G-1 (September 2016). https://public.ccsds.org/Pubs/651x2g1.pdf

SIP Model Specification (PAIS 5.2)


PAIS 5.2 provides the SIP Model specification (CCSD0017).

SIP Global Information Container (mandatory)

SIP structure that holds a set of attributes supporting the unique identification of each SIP within the Producer-Archive Project and the ability to optionally track the sequencing of SIPs. Users may incorporate additional attributes as needed to further specialize the SIP Global Information.

  • SIP ID (mandatory) - Identifier of the delivered SIP within the context of the given Producer-Archive Project. If there are multiple Producer Sources submitting SIPs within a single Producer-Archive Project, this SIP ID must be unique across all such Producer Sources. It is inserted during SIP construction. The form shall be agreed between Producer and Archive, but the identifier shall be generated by the Producer. The Archive shall check uniqueness.

  • Producer-Archive Project ID (mandatory) - Identifier of the Producer-Archive Project that distinguishes the project from all other Producer-Archive Projects undertaken by this Archive. This ID shall be provided by the Archive for use in the SIPs (see also first paragraph PAIS section 3).

  • SIP Content Type ID (mandatory) - Identifier of the specification that defines which Transfer Object Types (i.e., Descriptor IDs) are allowed within this SIP, as well as their occurrence within the SIP. It has been defined previously in this document (see PAIS section 4).

  • SIP Sequence Number (optional) - Number indicating the order in which the SIP has been sent. This number is unique within the combined context of the Producer-Archive Project and Producer Source ID. This becomes mandatory for all SIPs sent by a Producer Source if any of the Transfer Objects to be provided by the Producer Source have a Descriptor that does not specify a unique value for the number of Transfer Objects to be delivered. It shall be generated by the Producer.

  • Any (optional) - Mechanism that allows a SIP to have any additional attributes within the structure of the SIP corresponding to this container.
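As an illustration, the attributes listed above could be held in a small record that reports missing mandatory values. This is a sketch under assumptions: the class name, field names, and the `sequence_number_required` flag are hypothetical, and the decision about when the sequence number becomes mandatory is left to the caller, who knows the Descriptors in use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SipGlobalInformation:
    sip_id: str
    producer_archive_project_id: str
    sip_content_type_id: str
    sip_sequence_number: Optional[int] = None
    extra: Dict[str, str] = field(default_factory=dict)  # the "Any" slot

    def problems(self, sequence_number_required: bool = False) -> List[str]:
        """Return a list of human-readable problems; empty means none found."""
        found = []
        for label, value in (
            ("SIP ID", self.sip_id),
            ("Producer-Archive Project ID", self.producer_archive_project_id),
            ("SIP Content Type ID", self.sip_content_type_id),
        ):
            if not value:
                found.append(f"{label} is mandatory")
        # The caller sets this flag when a Descriptor does not fix the number
        # of Transfer Objects to be delivered by the Producer Source.
        if sequence_number_required and self.sip_sequence_number is None:
            found.append("SIP Sequence Number is mandatory for this Producer Source")
        return found


info = SipGlobalInformation("sip-0001", "project-42", "content-type-A")
print(info.problems())
print(info.problems(sequence_number_required=True))
```

Uniqueness of the SIP ID is not checked here, since that check belongs to the Archive and needs knowledge of all previously received SIPs.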

Transfer Objects to Delete Container (optional)

SIP structure composed of one or more attributes giving the identification of the Transfer Objects previously sent to the Archive that must be deleted by the Archive. Users may incorporate additional attributes as needed to further specialize the Transfer Object To Delete container.

  • Transfer object to delete ID - Identifier of the Transfer Object ID of a previously sent Transfer Object that is to be deleted by the Archive.

  • Any (optional) - Mechanism that allows a SIP to have any additional attributes within the structure of the SIP corresponding to this container.
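A rough sketch of how an Archive might act on this container follows. The function name is hypothetical, and reporting unknown identifiers instead of failing is an assumption of this example, not something PAIS prescribes.

```python
from typing import Iterable, List, Set, Tuple


def apply_deletions(held_ids: Set[str], to_delete: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Remove the listed Transfer Object IDs from the set the Archive holds.

    Returns the remaining IDs and the IDs that were not previously received.
    """
    remaining = set(held_ids)  # work on a copy so the caller's set is untouched
    unknown = []
    for transfer_object_id in to_delete:
        if transfer_object_id in remaining:
            remaining.remove(transfer_object_id)
        else:
            unknown.append(transfer_object_id)
    return remaining, unknown


remaining, unknown = apply_deletions({"to-1", "to-2"}, ["to-2", "to-9"])
print(sorted(remaining), unknown)
```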

Transfer Object Container (mandatory)

SIP structure that conceptually holds two types of containers as shown schematically in PAIS figure 5-2. The Transfer Object container consists of: one Transfer Object Identification and Status container; and one or more Transfer Object Group containers.

  • Transfer object identification and status container (mandatory) - SIP structure that holds a set of attributes supporting unique identification and replacement status information about this Transfer Object in the SIP. Users may incorporate additional attributes as needed to further specialize the Transfer Object.

    • Descriptor ID (mandatory) - Identifier of the Transfer Object Type Descriptor that describes this type of Transfer Object. This is obtained from the model of objects for transfer (MOT).

    • Transfer object ID (mandatory) - Identifier for each delivered Transfer Object within the Producer-Archive Project. It is inserted during SIP building. The form shall be agreed between Producer and Archive. For example, it could be constructed by concatenating the SIP ID (or Descriptor ID) and some sequence number for each Transfer Object. It shall be generated by the Producer. Uniqueness shall be checked by the Archive.

    • Last transfer object flag (optional) - Indicator specifying that this Transfer Object is the last Transfer Object of this type (i.e., within the scope of this Descriptor) being delivered by this Producer Source. This attribute is particularly useful when the number of Transfer Objects to be delivered is not known in advance. If used with a single Producer Source for Transfer Objects of this type, this flag eliminates the need for an additional contact between the Archives and the Producer Source to verify that all such Transfer Objects have been received. If there are multiple Producer Sources that may be delivering Transfer Objects of this type, the Archive may or may not need to contact these Producer Sources to determine when all such Transfer Objects have been sent and received.

    • Replacement transfer object ID (optional) - Identifier of the Transfer Object ID of a previously sent Transfer Object that is to be replaced by this Transfer Object.

    • Any (optional) - Mechanism that allows a SIP to have any additional attributes within the structure of the SIP corresponding to this container.

    • Transfer object group identification container (mandatory) - SIP structure that holds a set of attributes identifying the type of group and optionally naming the group instance. Users may incorporate additional attributes as needed to further specialize the Transfer Object Group Identification information.

      • Associated descriptor group type ID (mandatory) - Identifier of the associated group description within the associated Descriptor.

        • If this group is an instance of a Transfer Object Group Type as specified in the Descriptor, then the value of this attribute shall be the Transfer Object Group Type ID of that Transfer Object Group Type.

        • If this group is an instance that is part of a data structure transferred under a Transfer Object Group Type whose Transfer Object Group Structure Name has the value ‘undescribed’, then the value of this Associated Descriptor ID shall be the Transfer Object Group Type ID of that Transfer Object Group Type.

      • Choice of the following two attributes (mandatory, choose one) - transfer object group instance name OR transfer object group preservation name.

        • Transfer object group instance name (mandatory unless preservation name attribute is used) - Name given to the group, such as a directory name, that is associated with the Transfer Object Group instance. It shall be provided by the Producer. If the group has been modeled as a directory (i.e., Structure Name = ‘directory’), it shall be the name of the directory excluding any path information.

        • Transfer object group preservation name (mandatory unless instance name attribute is used) - Name given to the group, such as a directory name, that shall be preserved by the Archive in association with the Transfer Object Group instance. It shall be provided by the Producer. If the group has been modeled as a directory (i.e., Structure Name = ‘directory’), it shall be the name of the directory excluding any path information.

      • Any (optional) - Mechanism that allows a SIP to have any additional attributes within the structure of the SIP corresponding to this container.

    • Transfer object group container (optional) - SIP structure that conceptually holds any number of additional Transfer Object Group containers.

    • Data object container (optional) - SIP structure that conceptually holds two or more containers as shown schematically in PAIS Figure 5-4: one data object identification container and one or more byte stream containers.

      • Data object identification container (mandatory) - SIP structure that holds a set of attributes identifying the type of Data Object and optionally supplying a name that is to be preserved along with the associated byte stream or streams. Users may incorporate additional attributes as needed to further specialize the Data Object Identification information.

        • Associated descriptor data ID (mandatory) - Identifier of the associated data description within the associated Descriptor.

          • If this is an instance of a Data Object Type defined in the Descriptor then this shall be the Data Object Type ID of that Data Object Type.

          • If this is an instance of a Transfer Object Group Type defined in the Descriptor to be encoded and thus it results in a single Data Object, then this shall be the Transfer Object Group Type ID of that Transfer Object Group Type.

          • If this is an instance of a Data Object that is transferred within the context of a Descriptor defined Transfer Object Group Type whose Transfer Object Group Type Structure Name has the value ‘undescribed’, then this shall be the Transfer Object Group Type ID of that Transfer Object Group Type.

        • Data object preservation name (optional) - Name to be preserved in association with the Data Object instance. When the Data Object is composed of a single byte stream, it tells the Archive exactly what name is to be preserved in association with that byte stream. When the Data Object is composed of multiple byte streams, it tells the Archive exactly what name is to be preserved in association with the set of byte streams. It shall be provided by the Producer.

        • Any (optional) - Mechanism that allows this SIP structure to have any additional attributes that may be needed.

      • Byte stream container (mandatory) - SIP structure that holds a set of attributes that provide a byte stream and/or a pointer to a byte stream outside the SIP.

        • Byte stream (optional) - stream of bytes.

        • Byte stream checksum (optional) - checksum covering the stream of bytes.

        • Pointer to byte stream (optional) - A pointer to a byte stream outside the SIP.

        • Any (optional) - Mechanism that allows this SIP structure to have any additional attributes that may be needed.
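The nesting of the Transfer Object container can be sketched with a few record types. All class and field names below are hypothetical. The example also makes three assumptions: "choose one" is read as exactly one of the two group names, a byte stream container must carry a byte stream or a pointer (or both), and the checksum is a SHA-256 hex digest. The "Any" slots are omitted for brevity.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ByteStream:
    data: Optional[bytes] = None      # byte stream carried inside the SIP
    checksum: Optional[str] = None    # assumed here to be a SHA-256 hex digest
    pointer: Optional[str] = None     # reference to a byte stream outside the SIP

    def problems(self) -> List[str]:
        found = []
        if self.data is None and self.pointer is None:
            found.append("byte stream container needs a byte stream and/or a pointer")
        if self.data is not None and self.checksum is not None:
            if hashlib.sha256(self.data).hexdigest() != self.checksum:
                found.append("checksum does not match byte stream")
        return found


@dataclass
class DataObject:
    associated_descriptor_data_id: str
    byte_streams: List[ByteStream]
    preservation_name: Optional[str] = None

    def problems(self) -> List[str]:
        found = [] if self.byte_streams else ["data object needs a byte stream container"]
        for stream in self.byte_streams:
            found.extend(stream.problems())
        return found


@dataclass
class TransferObjectGroup:
    associated_descriptor_group_type_id: str
    instance_name: Optional[str] = None
    preservation_name: Optional[str] = None
    groups: List["TransferObjectGroup"] = field(default_factory=list)
    data_objects: List[DataObject] = field(default_factory=list)

    def problems(self) -> List[str]:
        found = []
        # "Choose one": exactly one of the two names must be present.
        if (self.instance_name is None) == (self.preservation_name is None):
            found.append("group needs exactly one of instance name or preservation name")
        for child in [*self.groups, *self.data_objects]:
            found.extend(child.problems())
        return found


@dataclass
class TransferObject:
    descriptor_id: str
    transfer_object_id: str
    groups: List[TransferObjectGroup]
    last_transfer_object: Optional[bool] = None
    replacement_transfer_object_id: Optional[str] = None

    def problems(self) -> List[str]:
        found = [] if self.groups else ["transfer object needs a group container"]
        for group in self.groups:
            found.extend(group.problems())
        return found


payload = b"temperature,12.5\n"
transfer_object = TransferObject(
    descriptor_id="descriptor-A",
    transfer_object_id="sip-0001-0001",  # e.g. SIP ID plus a sequence number
    groups=[
        TransferObjectGroup(
            associated_descriptor_group_type_id="group-type-1",
            preservation_name="observations",
            data_objects=[
                DataObject(
                    associated_descriptor_data_id="data-type-1",
                    byte_streams=[ByteStream(data=payload,
                                             checksum=hashlib.sha256(payload).hexdigest())],
                )
            ],
        )
    ],
)
print(transfer_object.problems())  # an empty list means no structural problems were found
```

The checks cover structure only. Whether the identifiers match a real Descriptor in the model of objects for transfer (MOT) would need a separate lookup.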

Implementing the SIP Model


As noted in Section 6.2.4 of the PAIS Tutorial, there are two methods by which a Producer-Archive Project can produce SIPs that conform to the PAIS SIP Model:

The PAIS specifies a standard packaging mechanism for the implementation of PAIS SIPs. It is based on use of the XFDU packaging standard. When this is followed, and the semantics of PAIS section 5 are followed, the resulting implementation is said to be ‘XFDU PAIS SIP Conformant’. However, it is acceptable to use other packaging mechanisms. In this case the resulting SIP implementation can be said to be ‘Abstract PAIS SIP Conformant’ provided it also adheres to the semantics of PAIS section 5.

Relationship between SIP Model and Bag-info.txt


Use of the BagIt specification does not necessarily result in a SIP that conforms to the abstract SIP specification defined in PAIS. Further enhancements are required.

The following table illustrates the relationship between metadata elements specified in Section 2.2.2 of the BagIt File Packaging Format and the SIP Model attributes specified in Section 5.2.4 of PAIS. In some cases, the relationship is implicit.

| Bag-info.txt field | Description | PAIS SIP model attribute | Notes |
| --- | --- | --- | --- |
| Source-Organization | Organization transferring the content. | Producer source ID | |
| Organization-Address | Mailing address of the organization. | N/A | |
| Contact-Name | Person at the source organization who is responsible for the content transfer. | N/A | |
| Contact-Phone | International format telephone number of person or position responsible. | N/A | |
| Contact-Email | Fully qualified email address of person or position responsible. | N/A | |
| External-Description | A brief explanation of the contents and provenance. | N/A | This information maps to the "transfer object type description" attribute in a transfer object type descriptor |
| Bagging-Date | Date (YYYY-MM-DD) that the content was prepared for transfer. This metadata element SHOULD NOT be repeated. | | |
| External-Identifier | A sender-supplied identifier for the bag. | | |
| Bag-Size | The size or approximate size of the bag being transferred, followed by an abbreviation such as MB (megabytes), GB (gigabytes), or TB (terabytes): for example, 42600 MB, 42.6 GB, or .043 TB. Compared to Payload-Oxum (described next), Bag-Size is intended for human consumption. This metadata element SHOULD NOT be repeated. | | This information maps to the transfer object type size attribute in a transfer object type descriptor |
| Payload-Oxum | The "octetstream sum" of the payload, which is intended for the purpose of quickly detecting incomplete bags before performing checksum validation. This is strictly an optimization, and implementations MUST perform the standard checksum validation process before proclaiming a bag to be valid. This element MUST NOT be present more than once and, if present, MUST be in the form "_OctetCount_._StreamCount_", where _OctetCount_ is the total number of octets (8-bit bytes) across all payload file content and _StreamCount_ is the total number of payload files. This metadata element MUST NOT be repeated. | | |
| Bag-Group-Identifier | A sender-supplied identifier for the set, if any, of bags to which it logically belongs. This identifier SHOULD be unique across the sender's content, and if it is recognizable as belonging to a globally unique scheme, the receiver SHOULD make an effort to honor the reference to it. This metadata element SHOULD NOT be repeated. | | |
| Bag-Count | Two numbers separated by "of", in particular, "N of T", where T is the total number of bags in a group of bags and N is the ordinal number within the group. If T is not known, specify it as "?" (question mark): for example, 1 of 2, 4 of 4, 3 of ?, 89 of 145. This metadata element SHOULD NOT be repeated. If this metadata element is present, it is RECOMMENDED to also include the Bag-Group-Identifier element. | | |
| Internal-Sender-Identifier | An alternate sender-specific identifier for the content and/or bag. | | |
| Internal-Sender-Description | A sender-local explanation of the contents and provenance. | | |
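To make the relationship concrete, the sketch below reads bag-info.txt content and extracts the one field for which the table names a SIP Model attribute directly (Source-Organization to Producer source ID). The function names and sample values are hypothetical. The parser is deliberately simplified: it handles "Label: value" lines and indented continuation lines, and ignores other details of the BagIt format.

```python
from typing import Dict, List

# The only relationship the table states explicitly in its attribute column.
BAG_INFO_TO_SIP = {"Source-Organization": "Producer source ID"}


def parse_bag_info(text: str) -> Dict[str, List[str]]:
    """Parse 'Label: value' lines; indented lines continue the previous value."""
    fields: Dict[str, List[str]] = {}
    last_label = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and last_label is not None:
            fields[last_label][-1] += " " + line.strip()
            continue
        label, _, value = line.partition(":")
        last_label = label.strip()
        # Labels may repeat, so every label maps to a list of values.
        fields.setdefault(last_label, []).append(value.strip())
    return fields


def sip_attributes(fields: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick out the fields that have an explicit SIP Model counterpart."""
    return {
        BAG_INFO_TO_SIP[label]: values[0]
        for label, values in fields.items()
        if label in BAG_INFO_TO_SIP
    }


sample = (
    "Source-Organization: Example University Library\n"
    "Contact-Name: A. Archivist\n"
    "External-Description: Survey data,\n"
    "  second delivery\n"
    "Bag-Count: 2 of ?\n"
)
fields = parse_bag_info(sample)
print(sip_attributes(fields))
```

Fields such as External-Description and Bag-Size are left out of the mapping because, per the notes column, they relate to attributes of a transfer object type descriptor and not to the SIP itself.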