SYNOPSIS

model-wizard [-h | --help] [-j | --json] json-file [-m | --model] model-file [-o | --output-folder] output-folder

The executable can be found under tools/ on Linux and Windows and under BlaceAI.app/Contents/MacOS/ on macOS.

DESCRIPTION

The model wizard makes existing TorchScript models compatible with the blace.ai framework by tying each model to a standardized description file (.json). This .json file describes the model's inputs and outputs and sets some arguments. For a given .json and TorchScript file, the model wizard outputs a .blacemodel file, a .h file and a .bin model payload. These can be fed to a program with blace::util::registerModel().

Note that the model wizard does not run the provided model to check whether all inputs / outputs are specified correctly. A faulty .json might therefore only be detected when the model is run for inference as part of a C++ executable built with blace.ai.

Arguments

-j, --json
The model meta / descriptor .json file. Must match the schema described at the bottom of the page.

-m, --model
The TorchScript file.

-o, --output-folder
Where to store the final artifacts.

Prerequisites

The TorchScript model must have a fixed number of input and output tensors, and those tensors must have a fixed number of dimensions.

Examples

We provide a sample .json and TorchScript file in the samples/model-wizard folder. Invoke the model wizard with
.\model_wizard --model /samples/model-wizard/depth_anything_v2.pt --json /samples/model-wizard/depth_anything_v2.json --output-folder out

This will create a depth_anything_v2.h, a depth_anything_v2.blacemodel and a .bin file in the folder out. Those files can then be used as a drop-in replacement for the corresponding files of the demo project contained in the SDK (depth estimation on butterfly image).

Model Descriptor

The provided .json makes sure the framework can interpret all of the model's inputs and outputs.

Input Types

Please check the example .json files to see how different inputs can be constructed.

MetaTensor

A tensor always has these properties set:

data_type: The bit depth / data type of the tensor, e.g. INT_32 or FLOAT_32.
normalization: The value range of the contained data, e.g. ZERO_TO_ONE.
order: The channel order, e.g. BCHW (standard image format)
color_format: The color format of the C channel (if present), e.g. RGB.
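
As a rough illustration, the four properties above could be assembled into an input entry as follows. This is only a sketch: the field names follow the MetaTensor, Normalization and Input definitions in the JSON schema at the bottom of this page, while the input name "image" and the chosen enum values are made up for the example. The inputSizes array is left out here.

```python
import json

# Field names follow the schema on this page; the input name "image" and the
# selected enum values are illustrative assumptions, not required values.
image_input = {
    "name": "image",
    "content": {
        "metaTensor": {
            "data_type": "FLOAT_32",
            "normalization": {"norm": "ZERO_TO_ONE", "allow_overflow": False},
            "order": "BCHW",
            "color_format": "RGB",
        }
    },
}

# Print the entry in the form it would take inside the "inputs" array.
print(json.dumps(image_input, indent=2))
```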

Float / Int / Bool

For simple types you always have to provide a default value. This default value will be overridden by whatever input you pass to the model. If you have an entry

{
  "name": "float_input",
  "content": {
    "floatEntry": 15.2
  },
  "inputSizes": []
}

the value passed to the model will always be the one you constructed with blace::ops::FromFloatOp(), not the default of 15.2.

Input Sizes

Additionally, each input holds an array of ModelInputSize objects, describing each dimension of an input.

modulo_by: Dimension size needs to be divisible by this value.
min: Minimum dimension size.
max: Maximum dimension size.

E.g. a model taking a BCHW image with 3 channels, whose width and height dimensions must be divisible by 32, might have this .json entry for the input tensor:

"inputSizes": [
        {
          "modulo_by": 1,
          "min": 1,
          "max": 64
        },
        {
          "modulo_by": 1,
          "min": 3,
          "max": 3
        },
        {
          "modulo_by": 32,
          "min": 32,
          "max": 8196
        },
        {
          "modulo_by": 32,
          "min": 32,
          "max": 8196
        }
      ]
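
To make the meaning of these constraints concrete, the following sketch checks a tensor shape against such an inputSizes array. It is not part of the blace.ai API: the helper name check_shape is hypothetical, and treating min and max as inclusive bounds is an assumption.

```python
def check_shape(shape, input_sizes):
    """Return a list of violations; an empty list means the shape fits.

    Assumes min and max are inclusive bounds (not confirmed by the framework).
    """
    if len(shape) != len(input_sizes):
        return [f"expected {len(input_sizes)} dimensions, got {len(shape)}"]
    problems = []
    for i, (size, rule) in enumerate(zip(shape, input_sizes)):
        if not rule["min"] <= size <= rule["max"]:
            problems.append(f"dim {i}: {size} not in [{rule['min']}, {rule['max']}]")
        if size % rule["modulo_by"] != 0:
            problems.append(f"dim {i}: {size} not divisible by {rule['modulo_by']}")
    return problems


# Same constraints as the .json entry above (B, C, H, W).
input_sizes = [
    {"modulo_by": 1, "min": 1, "max": 64},
    {"modulo_by": 1, "min": 3, "max": 3},
    {"modulo_by": 32, "min": 32, "max": 8196},
    {"modulo_by": 32, "min": 32, "max": 8196},
]

print(check_shape((1, 3, 512, 768), input_sizes))  # fits: no violations
print(check_shape((1, 3, 500, 768), input_sizes))  # H is not a multiple of 32
```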

Output Sizes

Finally, the .json must specify how the output dimensions relate to the input dimensions. Every output contains an array of ModelOutputSize objects, one per output dimension, with these properties:

output_which_input: Which input tensor to check.
output_which_index: Dimension index of the chosen model input.
output_fixed_size: Output is always a fixed size.
output_ignore: This dimension cannot be deduced from the input.
output_multiplier_num / output_multiplier_denom: Specifies the ratio of the output dimension size to the input dimension size.

E.g. a 4x image super-resolution model taking an RGB tensor in BCHW order might have the entry below. The batch dimension (B) is taken from dimension 0 of the first (and only) input tensor, the channel dimension (C) always has a size of 3, and the height (H) and width (W) are 4 times as large as dimensions 2 and 3 of that input:

"output_sizes": [
        {
          "output_which_input": 0,
          "output_which_index": 0
        },
        {
          "output_fixed_size": 3
        },
        {
          "output_which_input": 0,
          "output_which_index": 2,
          "output_multiplier_num": 4,
          "output_multiplier_denom": 1
        },
        {
          "output_which_input": 0,
          "output_which_index": 3,
          "output_multiplier_num": 4,
          "output_multiplier_denom": 1
        }
      ]
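
The relation between input and output dimensions described by these properties can be sketched as follows. This is an illustration rather than the framework's actual implementation: the helper name infer_output_shape is hypothetical, and the order in which properties are checked, the default multiplier of 1/1 and the rounding down of fractional results are all assumptions.

```python
def infer_output_shape(output_sizes, input_shapes):
    """Derive an output shape from the input shapes.

    None marks a dimension that cannot be deduced (output_ignore).
    Assumptions: missing multipliers default to 1, results are rounded down.
    """
    shape = []
    for rule in output_sizes:
        if rule.get("output_ignore"):
            shape.append(None)
        elif "output_fixed_size" in rule:
            shape.append(rule["output_fixed_size"])
        else:
            size = input_shapes[rule["output_which_input"]][rule["output_which_index"]]
            num = rule.get("output_multiplier_num", 1)
            denom = rule.get("output_multiplier_denom", 1)
            shape.append(size * num // denom)
    return shape


# 4x super-resolution on a single BCHW input, as described above.
output_sizes = [
    {"output_which_input": 0, "output_which_index": 0},
    {"output_fixed_size": 3},
    {"output_which_input": 0, "output_which_index": 2,
     "output_multiplier_num": 4, "output_multiplier_denom": 1},
    {"output_which_input": 0, "output_which_index": 3,
     "output_multiplier_num": 4, "output_multiplier_denom": 1},
]

print(infer_output_shape(output_sizes, [(1, 3, 64, 48)]))  # [1, 3, 256, 192]
```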

JSON Schema

The provided .json file must match this schema. You can use it to automate schema validation, but for an easy start we recommend modifying the provided sample .json:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "definitions": {
    "DataType": {
      "type": "string",
      "description": "The data type of a given tensor.",
      "enum": [
        "INT_32",
        "FLOAT_32",
        "BLACE_BYTE",
        "BLACE_BOOL",
        "FLOAT_32_16",
        "FLOAT_16",
        "SHORT",
        "INT_64",
        "FLOAT_64"
      ]
    },
    "NormalizationEnum": {
      "type": "string",
      "description": "Possible value ranges of a tensor.",
      "enum": [
        "ZERO_TO_ONE",
        "MINUS_ONE_TO_ONE",
        "IMAGENET",
        "UNKNOWN_VALUE_RANGE",
        "ZERO_TO_255",
        "MINUS_0_5_TO_0_5"
      ]
    },
    "Normalization": {
      "type": "object",
      "description": "Normalization / value range object.",
      "properties": {
        "norm": {
          "$ref": "#/definitions/NormalizationEnum"
        },
        "allow_overflow": {
          "type": "boolean",
          "description": "Are values allowed to be outside of given range? E.g. can ZERO_TO_ONE tensor hold 1.1?"
        }
      }
    },
    "Order": {
      "type": "string",
      "description": "The ordering of the tensor. Use UNKNOWN_ORDER if order is not known.",
      "enum": [
        "BTCHW",
        "BCHW",
        "CHW",
        "HWC",
        "BHWC",
        "HW",
        "W",
        "WC",
        "C",
        "BC",
        "BWCH",
        "BHW",
        "BCH",
        "CH",
        "TBCHW",
        "BCWH",
        "BWHC",
        "NO_DIMS",
        "UNKNOWN_ORDER",
        "BOUNDING_BOX_WITH_DIMS",
        "THWC",
        "TCHW"
      ]
    },
    "ColorFormat": {
      "type": "string",
      "description": "The color format of the tensor.",
      "enum": [
        "RGB",
        "R",
        "A",
        "ARGB",
        "AB",
        "ARBITRARY_CHANNELS",
        "UV",
        "LAB"
      ]
    },
    "MetaTensor": {
      "type": "object",
      "description": "An object describing layout and characteristics of a tensor.",
      "properties": {
        "data_type": {
          "$ref": "#/definitions/DataType"
        },
        "normalization": {
          "$ref": "#/definitions/Normalization"
        },
        "color_format": {
          "$ref": "#/definitions/ColorFormat"
        },
        "order": {
          "$ref": "#/definitions/Order"
        }
      }
    },
    "Content": {
      "type": "object",
      "properties": {
        "metaTensor": {
          "$ref": "#/definitions/MetaTensor"
        }
      }
    },
    "ModelInputSize": {
      "type": "object",
      "description": "Object describing one dimension of a model input tensor's shape.",
      "properties": {
        "modulo_by": {
          "type": "integer",
          "description": "Value the dimension has to be divisible by."
        },
        "min": {
          "type": "integer",
          "description": "Minimum size of this dimension."
        },
        "max": {
          "type": "integer",
          "description": "Maximum size of this dimension."
        }
      },
      "required": [
        "modulo_by",
        "min",
        "max"
      ]
    },
    "ModelOutputSize": {
      "type": "object",
      "description": "Object describing how a specific output dimension can be constructed from the model's inputs.",
      "properties": {
        "output_which_input": {
          "description": "Nth model input.",
          "type": "integer"
        },
        "output_which_index": {
          "type": "integer",
          "description": "Dimension index of the nth model input."
        },
        "output_fixed_size": {
          "type": "integer",
          "description": "This dimension always has a fixed size (e.g. 3 for C channel in rgb images)."
        },
        "output_ignore": {
          "type": "boolean",
          "description": "The size of this dimension cannot be evaluated lazily and is not known at construction time."
        },
        "output_multiplier_num": {
          "type": "integer",
          "description": "Use this together with output_multiplier_denom to specify a factor of resizing. E.g. a 4x superresolution model would output a tensor with H and W dimension 4 times bigger than the input."
        },
        "output_multiplier_denom": {
          "type": "integer",
          "description": "See description of output_multiplier_num."
        }
      }
    },
    "Input": {
      "type": "object",
      "description": "Describes one (of potentially several) model inputs.",
      "properties": {
        "name": {
          "type": "string",
          "description": "Name of the input."
        },
        "content": {
          "$ref": "#/definitions/Content"
        },
        "inputSizes": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/ModelInputSize"
          }
        }
      }
    },
    "Output": {
      "type": "object",
      "description": "Describes one (of potentially several) model outputs.",
      "properties": {
        "name": {
          "type": "string",
          "description": "Name of the output."
        },
        "content": {
          "$ref": "#/definitions/Content"
        },
        "output_sizes": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/ModelOutputSize"
          }
        }
      }
    }
  },
  "type": "object",
  "properties": {
    "inputs": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/Input"
      }
    },
    "outputs": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/Output"
      }
    },
    "torchscriptModel": {
      "type": "object",
      "description": "Properties for Torchscript model.",
      "properties": {
        "can_use_fp16": {
          "type": "boolean",
          "description": "Can this model run with half precision inference?"
        }
      }
    }
  }
}
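
Because the model wizard does not run the model, it can help to check a descriptor before invoking it. The sketch below uses only the Python standard library and covers just two rules taken from the schema above: inputs and outputs are arrays, and every ModelInputSize carries modulo_by, min and max. The function name preflight and the sample descriptor are hypothetical; a full JSON Schema validator (for example the third-party jsonschema package) would cover the remaining rules.

```python
import json

# The only keys the schema marks as "required" (on ModelInputSize).
REQUIRED_SIZE_KEYS = ("modulo_by", "min", "max")


def preflight(descriptor):
    """Return a list of problems found in a parsed descriptor (empty if none)."""
    problems = []
    for key in ("inputs", "outputs"):
        if key in descriptor and not isinstance(descriptor[key], list):
            problems.append(f"{key} must be an array")
    inputs = descriptor.get("inputs")
    if isinstance(inputs, list):
        for i, entry in enumerate(inputs):
            for j, size in enumerate(entry.get("inputSizes", [])):
                missing = [k for k in REQUIRED_SIZE_KEYS if k not in size]
                if missing:
                    problems.append(f"inputs[{i}].inputSizes[{j}] is missing {missing}")
    return problems


# Deliberately faulty sample descriptor: "max" is absent.
descriptor = json.loads("""
{"inputs": [{"name": "image", "inputSizes": [{"modulo_by": 32, "min": 32}]}],
 "outputs": []}
""")
print(preflight(descriptor))
```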