DLSpec

A Deep Learning Artifact Exchange Specification

Description

    One Specification

    For DL training and inference tasks

    Specification Description

    System

    System requirements for running the model

    Software

    Software stack and environment variables used

    Dataset

    Dataset used for training or validation

    Model

    Model resources and the pre- and post-processing steps
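
    As a rough illustration, the four aspects above might be captured in separate manifest files. The sketch below models them as plain Python dictionaries; every key and value is an illustrative assumption rather than the normative DLSpec schema.

        # Hypothetical manifests for one DL task, expressed as Python dicts.
        # All keys and values are placeholders for illustration only.
        system_manifest = {
            "gpu": "NVIDIA Tesla V100",            # hardware the task expects
            "cpu_memory_gb": 32,
        }
        software_manifest = {
            "framework": "TensorFlow",
            "framework_version": "1.15",
            "env": {"TF_CPP_MIN_LOG_LEVEL": "2"},  # environment variables
        }
        dataset_manifest = {
            "name": "ImageNet-validation",
            "source": "https://example.org/imagenet-val.tar",  # placeholder URL
        }
        model_manifest = {
            "graph": "https://example.org/resnet50.pb",         # placeholder URL
            "preprocessing": "preprocess.py",      # step run before the model
            "postprocessing": "postprocess.py",    # step run after the model
        }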

    DLSpec Enables


    Program- and human-readable: To make it possible to develop a runtime that executes DLSpec, the specification must be readable by a program. To allow a user to understand what a task does and to repurpose it (e.g., run it on a different HW/SW stack), the specification must also be easy to introspect.
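
    To make the program-readable side concrete, a runtime could load a manifest with an off-the-shelf parser and query its fields, while a user reads the same text file directly. The file name, the keys, and the choice of YAML/PyYAML here are assumptions for illustration, not requirements of DLSpec.

        import yaml  # PyYAML; assumes the manifest is stored as YAML text

        # Load a (hypothetical) model manifest file and introspect it.
        with open("model_manifest.yml") as f:
            manifest = yaml.safe_load(f)

        # A program can query individual fields ...
        print(manifest.get("framework_version"))

        # ... and a user can list everything the task declares.
        for key, value in manifest.items():
            print(f"{key}: {value}")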

    Decoupling the DL task description: A DL task is described by four aspects (model, data, software, and hardware), each captured in its own manifest file. This decoupling increases the reuse of manifests and makes DL tasks portable across datasets, software stacks, and hardware stacks. It also makes it easy to compare different DL offerings by varying one of the four aspects while keeping the others fixed.
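
    One way to picture the decoupling: a task is just a binding of four manifest references, so comparing two hardware stacks amounts to swapping the system manifest. The make_task helper and the manifest file names below are hypothetical, not a DLSpec API.

        # Hypothetical: a DL task is a binding of four independent manifests.
        def make_task(model, dataset, software, system):
            return {"model": model, "dataset": dataset,
                    "software": software, "system": system}

        # Reuse the same model, dataset, and software manifests on two
        # hardware stacks to compare offerings by varying a single aspect.
        task_gpu = make_task("resnet50.yml", "imagenet_val.yml",
                             "tensorflow_1_15.yml", "gpu_v100.yml")
        task_cpu = make_task("resnet50.yml", "imagenet_val.yml",
                             "tensorflow_1_15.yml", "cpu_only.yml")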

    Splitting the DL task pipeline into stages: Demarcating a DL task into pre-processing, run, and post-processing stages enables consistent comparison and simplifies accuracy and performance debugging. For example, to debug accuracy one can modify the pre- or post-processing step and observe the resulting accuracy; to debug performance one can surround the run stage with measurement code.
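
    A small driver makes the staging argument concrete: keeping pre-processing, run, and post-processing as separate callables lets timing code wrap only the run stage. The stage functions passed in below are trivial placeholders assumed for this sketch.

        import time

        def execute(preprocess, run, postprocess, raw_input):
            x = preprocess(raw_input)     # change this stage to debug accuracy

            start = time.perf_counter()   # measurement surrounds only the run stage
            y = run(x)
            elapsed = time.perf_counter() - start

            result = postprocess(y)       # or change this stage to debug accuracy
            return result, elapsed

        # Example with trivial placeholder stages:
        output, seconds = execute(str.strip, str.upper, len, "  cat  ")
        print(output, seconds)            # 3, plus the time spent in str.upper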

    Avoiding serialization of intermediate data into files: A naive way to transfer data between the stages of a DL task is through files: each stage reads a file containing its input data and writes its output to another file. This approach is often impractical because it introduces high serialization/deserialization overhead, and it cannot support DL tasks that operate on streaming data.
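
    A minimal sketch of the alternative, assuming stages are chained in memory as Python generators: no intermediate file is written, and the same chain works for finite inputs or for an unbounded stream. The stage bodies are placeholders.

        def preprocess(stream):
            for item in stream:           # works for finite or streaming input
                yield item.lower()        # placeholder transformation

        def run(inputs):
            for x in inputs:
                yield len(x)              # placeholder "model" output

        def postprocess(outputs):
            for y in outputs:
                yield {"prediction": y}

        # Stages hand data to each other in memory; no files, no ser/des overhead.
        for result in postprocess(run(preprocess(["CAT", "HORSE"]))):
            print(result)                 # {'prediction': 3}, {'prediction': 5}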