TFLite Format Analysis, Part 2 (column, RISC-V MCU Chinese Community)

TFLite is mainly used for on-device machine learning applications.

Its workflow is shown in the figure below (see reference 1):

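step1: Develop the model on a server with TensorFlow (or PyTorch, Keras, etc.) and train the weights.
step2: Use the converter to convert the model to the tflite format (a FlatBuffers format).
step3: Load the tflite model on the embedded device (phone, MCU, etc.), run the computation, and obtain the result.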

This article continues the analysis of the tflite file format.

1 The TFLite format

A TFLite model consists mainly of a computation graph and weights, as shown below:

table Model {
    version: uint,
    operator_codes: [
    ...
    ],
    subgraphs: [
        tensors: [],
        inputs: [],
        outputs: [],
        operators: [],
    ],
    description: "MLIR Converted.",    
    buffers: [],
}

The graph representation of this structure is shown in the figure on the right.

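Here:

operator_codes defines the operators used by the model.

subgraphs defines the individual subgraphs.

buffers defines the data storage area. The weights of the computation graph are stored in buffers, and each tensor locates its buffer by index.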

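For the subgraphs structure:

tensors[] describes every tensor in the computation graph (including the input and output tensors).

inputs and outputs give the ids of the input and output tensors (their indices in tensors[]).

operators defines the operators: which tensors go in, which operator is applied, and which tensors come out. From this the dataflow graph can be recovered.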
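The way operators[] yields a dataflow graph can be sketched in a few lines of Python. This is only an illustration: the dictionaries are hand-copied from the mnist_valid_q.json listing in section 2 (keeping just the fields needed for tracing), and the `trace` helper is a hypothetical name, not part of any TFLite API. It assumes that a missing opcode_index means 0, since flatc leaves out fields that hold their schema default.

```python
# Minimal sketch: trace the dataflow graph from operators[].
# Data is transcribed by hand from the mnist_valid_q.json listing.
operator_codes = ["CONV_2D", "MEAN", "FULLY_CONNECTED", "SOFTMAX"]

operators = [
    {"inputs": [0, 2, 3], "outputs": [10]},
    {"inputs": [10, 4, 5], "outputs": [11]},
    {"inputs": [11, 6, 7], "outputs": [12]},
    {"opcode_index": 1, "inputs": [12, 1], "outputs": [13]},
    {"opcode_index": 2, "inputs": [13, 8, 9], "outputs": [14]},
    {"opcode_index": 3, "inputs": [14], "outputs": [15]},
]


def trace(operators, operator_codes):
    """Return one (op_name, input_ids, output_ids) tuple per operator."""
    edges = []
    for op in operators:
        # Assumption: an absent opcode_index is the default value 0.
        code = operator_codes[op.get("opcode_index", 0)]
        edges.append((code, op["inputs"], op["outputs"]))
    return edges


edges = trace(operators, operator_codes)
for code, ins, outs in edges:
    print(f"tensors {ins} --{code}--> tensors {outs}")
```

Reading the printed lines top to bottom gives the chain from the subgraph input (tensor 0) to the subgraph output (tensor 15).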

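Use the flatc tool to convert the model file to JSON, which makes the whole data flow easy to see. This follows method 2 in 《tflite格式解析》 (TFLite Format Analysis). Taking mnist_valid_q.tflite as an example, convert it to a JSON file with the flatc command: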

flatc -t schema.fbs -- mnist_valid_q.tflite
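The JSON that this command writes uses bare (unquoted) field names, so Python's json module will not accept it directly. One option is flatc's `--strict-json` flag, which emits quoted keys. Alternatively, the relaxed text can be normalized before parsing. The sketch below takes the second route; `load_flatc_json` is a hypothetical helper, and it assumes that keys start a line and that no string value contains `/*`, which holds for the listing in this article but is not guaranteed in general.

```python
import json
import re


def load_flatc_json(text):
    """Parse flatc's relaxed JSON (bare keys, optional /* */ comments)."""
    # Drop /* ... */ comments such as the hand-added ones in this article.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    # Quote bare identifiers that start a line and are followed by ':'.
    text = re.sub(r'(?m)^(\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":', text)
    # Remove trailing commas before a closing bracket or brace.
    text = re.sub(r",(\s*[\]}])", r"\1", text)
    return json.loads(text)


sample = """
{
  version: 3,
  /* operator_codes[] */
  operator_codes: [
    {
      deprecated_builtin_code: 3,
      version: 3,
      builtin_code: "CONV_2D"
    }
  ],
  description: "MLIR Converted."
}
"""

model = load_flatc_json(sample)
print(model["operator_codes"][0]["builtin_code"])
```

The abridged listing in section 2 contains `...` placeholders, so it would need the real flatc output rather than the excerpt to parse successfully.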

The rest of this article analyzes the resulting mnist_valid_q.json file.

2 Parsing the JSON file

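The main structure of the JSON is shown below (slightly abridged, with some comments added; it corresponds one-to-one with the table Model structure):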

{
  version: 3,
  /* operator_codes[]: operator index */
  operator_codes: [
    {
      deprecated_builtin_code: 3,
      version: 3,
      builtin_code: "CONV_2D"
    },
    {
      deprecated_builtin_code: 40,
      version: 2,
      builtin_code: "MEAN"
    },
    {
      deprecated_builtin_code: 9,
      version: 4,
      builtin_code: "FULLY_CONNECTED"
    },
    {
      deprecated_builtin_code: 25,
      version: 2,
      builtin_code: "SOFTMAX"
    }
  ],
  /* subgraphs[]: the individual subgraphs */
  subgraphs: [
    {
      /* tensors[] */
      tensors: [
        /* tensor idx = 0 */
        {
          shape: [
            1,
            28,
            28,
            1
          ],
          type: "INT8",
          buffer: 1,
          name: "ftr0_input",
          quantization: {
            scale: [
              0.003922
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            28,
            28,
            1
          ]
        },
        /* tensor idx = 1 */
        {
          shape: [
            2
          ],
          type: "INT32",
          buffer: 2,
          name: "sequential_1/GAP/Mean/reduction_indices",
          quantization: {
          }
        },
        /* tensor idx = 2 */
        {
          shape: [
            4,
            3,
            3,
            1
          ],
          type: "INT8",
          buffer: 3,
          name: "sequential_1/ftr0/Conv2D",
          quantization: {
            scale: [
              0.012358,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 3 */
        {
          shape: [
            4
          ],
          type: "INT32",
          buffer: 4,
          name: "sequential_1/relu0/Relu;sequential_1/bn0/FusedBatchNormV3;sequential_1/ftr0/BiasAdd/ReadVariableOp/resource;sequential_1/ftr0/BiasAdd;sequential_1/ftr0/Conv2D",
          quantization: {
            scale: [
              0.000048,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 4 */
        {
          shape: [
            8,
            3,
            3,
            4
          ],
          type: "INT8",
          buffer: 5,
          name: "sequential_1/ftr1/Conv2D",
          quantization: {
            scale: [
              0.004238,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 5 */
        {
          shape: [
            8
          ],
          type: "INT32",
          buffer: 6,
          name: "sequential_1/relu1/Relu;sequential_1/bn1/FusedBatchNormV3;sequential_1/ftr1/BiasAdd/ReadVariableOp/resource;sequential_1/ftr1/BiasAdd;sequential_1/ftr1/Conv2D",
          quantization: {
            scale: [
              0.000069,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 6 */
        {
          shape: [
            16,
            ...
          ],
          type: "INT8",
          buffer: 7,
          name: "sequential_1/ftr2/Conv2D",
          quantization: {
            scale: [
              0.021189,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 7 */
        {
          shape: [
            16
          ],
          type: "INT32",
          buffer: 8,
          name: "sequential_1/activation/Relu;sequential_1/batch_normalization/FusedBatchNormV3;sequential_1/ftr2/BiasAdd/ReadVariableOp/resource;sequential_1/ftr2/BiasAdd;sequential_1/ftr2/Conv2D",
          quantization: {
            scale: [
              0.000342,
              ...
            ],
            zero_point: [
              0,
              ...
            ]
          }
        },
        /* tensor idx = 8 */
        {
          shape: [
            10,
            16
          ],
          type: "INT8",
          buffer: 9,
          name: "sequential_1/fc1/MatMul",
          quantization: {
            scale: [
              0.021471
            ],
            zero_point: [
              0
            ]
          }
        },
        /* tensor idx = 9 */
        {
          shape: [
            10
          ],
          type: "INT32",
          buffer: 10,
          name: "sequential_1/fc1/BiasAdd/ReadVariableOp/resource",
          quantization: {
            scale: [
              0.00048
            ],
            zero_point: [
              0
            ]
          }
        },
        /* tensor idx = 10 */
        {
          shape: [
            1,
            13,
            13,
            4
          ],
          type: "INT8",
          buffer: 11,
          name: "sequential_1/relu0/Relu;sequential_1/bn0/FusedBatchNormV3;sequential_1/ftr0/BiasAdd/ReadVariableOp/resource;sequential_1/ftr0/BiasAdd;sequential_1/ftr0/Conv2D1",
          quantization: {
            scale: [
              0.016224
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            ...
          ]
        },
        /* tensor idx = 11 */
        {
          shape: [
            1,
            ...
          ],
          type: "INT8",
          buffer: 12,
          name: "sequential_1/relu1/Relu;sequential_1/bn1/FusedBatchNormV3;sequential_1/ftr1/BiasAdd/ReadVariableOp/resource;sequential_1/ftr1/BiasAdd;sequential_1/ftr1/Conv2D1",
          quantization: {
            scale: [
              0.016134
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            ...
          ]
        },
        /* tensor idx = 12 */
        {
          shape: [
            1,
            2,
            2,
            16
          ],
          type: "INT8",
          buffer: 13,
          name: "sequential_1/activation/Relu;sequential_1/batch_normalization/FusedBatchNormV3;sequential_1/ftr2/BiasAdd/ReadVariableOp/resource;sequential_1/ftr2/BiasAdd;sequential_1/ftr2/Conv2D1",
          quantization: {
            scale: [
              0.056649
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            ...
          ]
        },
        /* tensor idx = 13 */
        {
          shape: [
            1,
            16
          ],
          type: "INT8",
          buffer: 14,
          name: "sequential_1/GAP/Mean",
          quantization: {
            scale: [
              0.022351
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            16
          ]
        },
        /* tensor idx = 14 */
        {
          shape: [
            1,
            10
          ],
          type: "INT8",
          buffer: 15,
          name: "sequential_1/fc1/MatMul;sequential_1/fc1/BiasAdd",
          quantization: {
            scale: [
              0.151394
            ],
            zero_point: [
              42
            ]
          },
          shape_signature: [
            -1,
            10
          ]
        },
        /* tensor idx = 15 */
        {
          shape: [
            1,
            10
          ],
          type: "INT8",
          buffer: 16,
          name: "Identity",
          quantization: {
            scale: [
              0.003906
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            10
          ]
        }
      ],
      /* inputs tensor id */
      inputs: [
        0
      ],
      /* outputs tensor id */
      outputs: [
        15
      ],
      
      /* operators[] */
      operators: [
        {
          inputs: [
            0,
            2,
            3
          ],
          outputs: [
            10
          ],
          builtin_options_type: "Conv2DOptions",
          builtin_options: {
            padding: "VALID",
            stride_w: 2,
            stride_h: 2,
            fused_activation_function: "RELU"
          }
        },
        {
          inputs: [
            10,
            4,
            5
          ],
          outputs: [
            11
          ],
          builtin_options_type: "Conv2DOptions",
          builtin_options: {
            padding: "VALID",
            stride_w: 2,
            stride_h: 2,
            fused_activation_function: "RELU"
          }
        },
        {
          inputs: [
            11,
            6,
            7
          ],
          outputs: [
            12
          ],
          builtin_options_type: "Conv2DOptions",
          builtin_options: {
            padding: "VALID",
            stride_w: 2,
            stride_h: 2,
            fused_activation_function: "RELU"
          }
        },
        {
          opcode_index: 1, /* look up in operator_codes[]: MEAN */
          inputs: [
            12,
            1
          ],
          outputs: [
            13
          ],
          builtin_options_type: "ReducerOptions",
          builtin_options: {
          }
        },
        {
          opcode_index: 2, /* look up in operator_codes[]: FULLY_CONNECTED */
          inputs: [
            13,
            8,
            9
          ],
          outputs: [
            14
          ],
          builtin_options_type: "FullyConnectedOptions",
          builtin_options: {
          }
        },
        {
          opcode_index: 3, /* look up in operator_codes[]: SOFTMAX */
          inputs: [
            14
          ],
          outputs: [
            15
          ],
          builtin_options_type: "SoftmaxOptions",
          builtin_options: {
            beta: 1.0
          }
        }
      ],
      name: "main"
    }
  ],
  description: "MLIR Converted.",
  buffers: [
    /* buffer id = 0 */
    {
    },
    /* buffer id = 1 */
    {
    },
    /* buffer id = 2 */
    {
      data: [
        1,
        0,
        0,
        0,
        ...
      ]
    },
    /* buffer id = 3 */
    {
      data: [
        47,
        94,
        91,
        70,
        ...
      ]
    },
    /* buffer id = 4 */
    {
      data: [
        20,
        254,
        255,
        255,
        ...
      ]
    },
    /* buffer id = 5 */
    {
      data: [
        241,
        173,
        220,
        195,
        ...
      ]
    },
    /* buffer id = 6 */
    {
      data: [
        36,
        55,
        0,
        0,
        ...
      ]
    },
    /* buffer id = 7 */
    {
      data: [
        251,
        223,
        14,
        1,
        ...
      ]
    },
    /* buffer id = 8 */
    {
      data: [
        104,
        226,
        255,
        255,
        ...
      ]
    },
    /* buffer id = 9 */
    {
      data: [
        42,
        226,
        41,
        191,
        ...
      ]
    },
    /* buffer id = 10 */
    {
    },
    /* buffer id = 11 */
    {
    },
    /* buffer id = 12 */
    {
    },
    /* buffer id = 13 */
    {
    },
    /* buffer id = 14 */
    {
    },
    /* buffer id = 15 */
    {
    },
    /* buffer id = 16 */
    {
      data: [
        49,
        46,
        49,
        52,
        ...
      ]
    },
    /* buffer id = 17 */
    {
      data: [
        12,
        0,
        0,
        0,
        ...
      ]
    }
  ],
  metadata: [
    {
      name: "min_runtime_version",
      buffer: 17
    },
    {
      name: "CONVERSION_METADATA",
      buffer: 18
    }
  ],
  signature_defs: [

  ]
}
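The data arrays in buffers[] are raw bytes, so their meaning depends on the type of the tensor that points at them. A minimal sketch of decoding them, assuming little-endian layout (FlatBuffers stores data little-endian) and using only the leading bytes shown in the listing for buffers 2, 4, and 5. The helper names are hypothetical.

```python
import struct


def as_int8(raw):
    """Reinterpret a list of byte values (0..255) as signed int8."""
    return list(struct.unpack(f"<{len(raw)}b", bytes(raw)))


def as_int32(raw):
    """Reinterpret groups of 4 bytes as little-endian signed int32."""
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}i", bytes(raw[: count * 4])))


# buffer 2 -> tensor 1 (INT32, reduction indices of MEAN)
axis_head = as_int32([1, 0, 0, 0])
# buffer 4 -> tensor 3 (INT32, bias of the first CONV_2D)
bias_head = as_int32([20, 254, 255, 255])
# buffer 5 -> tensor 4 (INT8, weights of the second CONV_2D)
weight_head = as_int8([241, 173, 220, 195])

print(axis_head)    # [1]
print(bias_head)    # [-492]
print(weight_head)  # [-15, -83, -36, -61]
```

These are still quantized integers; converting them to real values additionally needs the scale and zero_point stored with the owning tensor.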

Starting from operators, the entire computation graph can be recovered. The first operator is described as follows:

       {
          inputs: [
            0,
            2,
            3
          ],
          outputs: [
            10
          ],
          builtin_options_type: "Conv2DOptions",
          builtin_options: {
            padding: "VALID",
            stride_w: 2,
            stride_h: 2,
            fused_activation_function: "RELU"
          }
        },

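This means: the operator takes tensor0, tensor2, and tensor3 as inputs and produces tensor10 as output. The operator is Conv2D with no padding (VALID), stride_w and stride_h are both 2, and the activation function is ReLU. The entry has no opcode_index field because flatc leaves out fields that hold their default value; the default 0 selects operator_codes[0], which is CONV_2D.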
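As a quick check, the shape of tensor10 follows from the shapes of tensor0 and tensor2 together with the stride and padding. The sketch below assumes the usual VALID rule, out = floor((in - kernel) / stride) + 1, and the TFLite filter layout [out_channels, kernel_h, kernel_w, in_channels]. The function name is hypothetical.

```python
def conv2d_valid_shape(input_shape, filter_shape, stride_h, stride_w):
    """Output shape (NHWC) of a Conv2D with VALID padding."""
    n, in_h, in_w, _ = input_shape
    out_c, k_h, k_w, _ = filter_shape
    out_h = (in_h - k_h) // stride_h + 1
    out_w = (in_w - k_w) // stride_w + 1
    return [n, out_h, out_w, out_c]


# tensor0 is [1, 28, 28, 1], tensor2 (weights) is [4, 3, 3, 1], strides are 2.
out_shape = conv2d_valid_shape([1, 28, 28, 1], [4, 3, 3, 1], 2, 2)
print(out_shape)  # [1, 13, 13, 4]
```

The result matches the shape recorded for tensor idx = 10 in the listing.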

Next, look up the descriptions of the input and output tensors in tensors[].

tensor0 is described as:

        {
          shape: [
            1,
            28,
            28,
            1
          ],
          type: "INT8",
          buffer: 1,
          name: "ftr0_input",
          quantization: {
            scale: [
              0.003922
            ],
            zero_point: [
              -128
            ]
          },
          shape_signature: [
            -1,
            28,
            28,
            1
          ]
        },
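The quantization block tells how the INT8 values of tensor0 map to real numbers. A minimal sketch, assuming the standard TFLite affine scheme real = scale * (q - zero_point). The function names are hypothetical, and rounding details in a real runtime may differ.

```python
SCALE = 0.003922      # quantization.scale[0] of tensor0
ZERO_POINT = -128     # quantization.zero_point[0] of tensor0


def dequantize(q, scale=SCALE, zero_point=ZERO_POINT):
    """Map a quantized int8 value to its real value."""
    return scale * (q - zero_point)


def quantize(real, scale=SCALE, zero_point=ZERO_POINT):
    """Map a real value to int8, clamped to [-128, 127]."""
    q = round(real / scale) + zero_point
    return max(-128, min(127, q))


low = dequantize(-128)   # smallest representable input, exactly 0.0
high = dequantize(127)   # largest representable input, about 1.0
print(low, high)
print(quantize(0.0), quantize(1.0))
```

With scale close to 1/255 and zero_point -128, the INT8 range covers roughly [0, 1], which is consistent with an input image normalized to that interval.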