Making sense of an ONNX weight file

5 min read · Jun 6, 2024

This story is a by-product of an ongoing project in which I am trying to run inference on an ML model from its weight file using my own code in C. I chose ONNX because its weight file seemed easier to parse, and that turned out to be true, but only once everything was installed. I have not seen many tutorials on this, so I will try to fill in the gap.

Installing

This part applies only to Windows; I am using the Visual Studio IDE.

As always, the first step is to install the dependencies. If you are using the same environment as me, vcpkg is probably the easiest way to do it. Create a new vcpkg manifest (“vcpkg new --application”) in the root of the Visual Studio project, add protobuf to vcpkg.json, then install it (“vcpkg install”). The content of vcpkg.json:

{
  "dependencies": [
    {
      "name": "protobuf",
      "version>=": "3.21.12"
    }
  ]
}

With vcpkg set up, put the onnx.proto3 file into vcpkg_installed->x64-windows->tools->protobuf and use protoc to generate onnx.proto3.pb.cc and onnx.proto3.pb.h. Then add both generated files to the project and link protobuf. Do not build in Debug mode: in my setup the protobuf that comes with vcpkg was not compiled for it, so be sure to use Release mode. I was stuck here for half a day simply because of this mistake.

Get the graph

Now it's time for real programming. Using protobuf to parse the model is quite simple.

The first step is to load the graph, which goes like this:

std::ifstream input("resnet18-v1-7.onnx", std::ios::in | std::ios::binary); // open the file in binary mode
onnx::ModelProto model;
model.ParseFromIstream(&input); // parse the file; returns false on failure
const ::onnx::GraphProto& graph = model.graph(); // the graph (a reference, so the weights are not copied)

Inputs and outputs

The graph inputs and outputs are ValueInfo entries. Each one describes a named value that flows into or out of a node: its name, element type, and shape. For example, an image classifier may take a float array of shape (N, 3, 224, 224) as input, and this information is provided as a ValueInfo. The weights are often listed as ValueInfo entries too, so they will show up in the printed output, but the actual weight values are not stored there; they are in the initializers.

The ValueInfo can be parsed like this:

void print_io_info(const ::google::protobuf::RepeatedPtrField< ::onnx::ValueInfoProto >& info)
{
 for (auto input_data : info)
 {
  auto shape = input_data.type().tensor_type().shape();

  std::cout << "  " << input_data.name() << ":";
  std::cout << "[";
  if (shape.dim_size() != 0)
  {
   int size = shape.dim_size();
   for (int i = 0; i < size - 1; ++i)
   {
    print_dim(shape.dim(i));
    std::cout << ", ";
   }
   print_dim(shape.dim(size - 1));
  }
  std::cout << "]\n";
 }
}
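
Once the shapes are readable, a natural next step is to work out how many elements a tensor holds, which needs a concrete value for every symbolic dimension such as N. Below is a minimal, self-contained sketch of that arithmetic. The Dim struct and the element_count function are hypothetical names of my own (they are not part of ONNX or protobuf); they only mirror the dim_value / dim_param split that print_dim handles.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for onnx::TensorShapeProto_Dimension:
// a dimension is either a fixed value or a named (symbolic) parameter.
struct Dim {
    bool symbolic;       // true if the dimension is a named parameter like "N"
    std::int64_t value;  // used when symbolic == false
    std::string param;   // used when symbolic == true
};

// Multiply all dimensions together, looking up symbolic ones in `bindings`.
// Returns false (and leaves `count` untouched) if a symbolic dimension has no binding.
bool element_count(const std::vector<Dim>& shape,
                   const std::map<std::string, std::int64_t>& bindings,
                   std::int64_t& count)
{
    std::int64_t total = 1;
    for (const Dim& d : shape) {
        if (d.symbolic) {
            auto it = bindings.find(d.param);
            if (it == bindings.end()) return false;
            total *= it->second;
        } else {
            total *= d.value;
        }
    }
    count = total;
    return true;
}
```

For the (N, 3, 224, 224) input mentioned earlier, binding N to 1 gives 3 × 224 × 224 = 150528 elements.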

What are the nodes?

Finally, check out what the nodes are. This is done by accessing graph.node(). The nodes describe the layers of the model: their type (e.g. Relu, Conv, Add), name, inputs, and outputs. They also carry attributes, such as the kernel size and dilation rate of a convolution. Be sure to check that every input and output exists as a ValueInfo; if one does not, create it. When the nodes are traversed in order, all of a node's inputs should already exist, but some of its outputs may not. If those outputs are not created, the next node traversed will be missing some of its inputs.

void print_attributes(const ::google::protobuf::RepeatedPtrField< ::onnx::AttributeProto >& attr)
{
 for (auto a : attr)
 {
  std::cout << "Name: " << a.name() << ",  Type: " << a.type() << "\n";
 }
 std::cout << "\n";
}

// Read repeated strings
void print_repeat_string(const ::google::protobuf::RepeatedPtrField< std::string >& info)
{
 for (auto input_data : info)
 {
  std::cout << input_data << " ";
 }
 std::cout << "\n";
}

// Print the node
void print_node_info(const ::google::protobuf::RepeatedPtrField< ::onnx::NodeProto >& nd)
{
 for (auto input_data : nd)
 {
  std::cout << "Node name: " << input_data.name() << " Type: " << input_data.op_type() << "\n";
  std::cout << "Node input: \n";
  print_repeat_string(input_data.input());
  std::cout << "Node output: \n";
  print_repeat_string(input_data.output());
  std::cout << "Attributes: \n";
  print_attributes(input_data.attribute());
 }
}
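
The existence check described above can be sketched without protobuf at all. This is only an illustration under my own assumptions: the Node struct is a hypothetical stand-in for the three NodeProto fields used here, and find_missing_inputs is a name I made up. The set of known names is expected to start out holding the graph inputs and the initializer names.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the fields of onnx::NodeProto used here.
struct Node {
    std::string name;
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Walk the nodes in order. Every node output is added to `known` as it is
// produced, so later nodes can find it. Returns the names of inputs that
// were not known at the moment their node was reached.
std::vector<std::string> find_missing_inputs(const std::vector<Node>& nodes,
                                             std::set<std::string>& known)
{
    std::vector<std::string> missing;
    for (const Node& n : nodes) {
        for (const std::string& in : n.inputs) {
            // An empty name marks an omitted optional input in ONNX.
            if (!in.empty() && known.count(in) == 0) missing.push_back(in);
        }
        for (const std::string& out : n.outputs) {
            if (!out.empty()) known.insert(out);
        }
    }
    return missing;
}
```

With the real graph, the Node values would be filled from input_data.input() and input_data.output(); an empty result means the nodes are in a usable order.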

Weight tensors

The weights are in graph.initializer(), and their names link them to their respective ValueInfo.

void print_tensor_info(const ::google::protobuf::RepeatedPtrField< ::onnx::TensorProto >& tensors)
{
    for (const auto& t : tensors)
    {
        // raw_data() holds the weights as binary bytes, so print its size rather than the bytes themselves
        std::cout << "Data type: " << t.data_type() << " ,name: " << t.name() << " ,raw bytes: " << t.raw_data().size() << " ,dim size: " << t.dims_size() << " ,dims: ";

        for (int i = 0; i < t.dims_size(); i++) {
            std::cout << t.dims(i) << " ";
        }
        std::cout << "\n";
    }
}
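
To actually use the weights, the bytes in raw_data have to be turned into numbers. ONNX stores raw_data as fixed-width little-endian values, so on a little-endian machine a float tensor can be copied straight into a float array. The sketch below assumes exactly that (little-endian host, 4-byte float, and a tensor whose data_type is FLOAT); decode_float32 is a hypothetical helper name, not something protobuf or ONNX provides. Some models store floats in the float_data field instead of raw_data, which this sketch does not cover.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Reinterpret a raw byte string as 32-bit floats.
// Assumes a little-endian host, matching the byte order of raw_data.
// Returns false if the byte count is not a multiple of sizeof(float).
bool decode_float32(const std::string& raw, std::vector<float>& out)
{
    if (raw.size() % sizeof(float) != 0) return false;
    out.resize(raw.size() / sizeof(float));
    // memcpy avoids the alignment problems of casting the char pointer directly.
    if (!raw.empty()) std::memcpy(out.data(), raw.data(), raw.size());
    return true;
}
```

In the loop above this would be called as decode_float32(t.raw_data(), values) for tensors where t.data_type() equals onnx::TensorProto::FLOAT.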

With the semantics of the model clear and all the weights and parameters laid out, the journey of writing inference code and optimizing it can begin. Godspeed!

Thank you for reading!


// Expanded from the answer in https://stackoverflow.com/questions/67301475/parse-an-onnx-model-using-c-extract-layers-input-and-output-shape-from-an-on

#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

#include "onnx.proto3.pb.h"

void print_dim(const ::onnx::TensorShapeProto_Dimension& dim)
{
 switch (dim.value_case())
 {
 case onnx::TensorShapeProto_Dimension::ValueCase::kDimParam:
  std::cout << dim.dim_param(); // symbolic dimension, e.g. "N"
  break;
 case onnx::TensorShapeProto_Dimension::ValueCase::kDimValue:
  std::cout << dim.dim_value(); // fixed dimension, e.g. 224
  break;
 default:
  std::cout << "?"; // neither field is set: the dimension is unknown
  break;
 }
}

void print_io_info(const ::google::protobuf::RepeatedPtrField< ::onnx::ValueInfoProto >& info)
{
 for (auto input_data : info)
 {
  auto shape = input_data.type().tensor_type().shape();

  std::cout << "  " << input_data.name() << ":";
  std::cout << "[";
  if (shape.dim_size() != 0)
  {
   int size = shape.dim_size();
   for (int i = 0; i < size - 1; ++i)
   {
    print_dim(shape.dim(i));
    std::cout << ", ";
   }
   print_dim(shape.dim(size - 1));
  }
  std::cout << "]\n";
 }
}

void print_repeat_string(const ::google::protobuf::RepeatedPtrField< std::string >& strings)
{
 for (auto input_data : strings)
 {
  std::cout << input_data << " ";
 }
 std::cout << "\n";
}

void print_tensor_info(const ::google::protobuf::RepeatedPtrField< ::onnx::TensorProto >& tensors)
{
 for (const auto& t : tensors)
 {
  // raw_data() holds the weights as binary bytes, so print its size rather than the bytes themselves
  std::cout << "Data type: " << t.data_type() << " ,name: " << t.name() << " ,raw bytes: " << t.raw_data().size() << " ,dim size: " << t.dims_size() << " ,dims: ";

  for (int i = 0; i < t.dims_size(); i++) {
   std::cout << t.dims(i) << " ";
  }
  std::cout << "\n";
 }
}

void print_attributes(const ::google::protobuf::RepeatedPtrField< ::onnx::AttributeProto >& attrs)
{
 for (auto a : attrs)
 {
  std::cout << "Name: " << a.name() << ",  Type: " << a.type() << "\n";
 }
 std::cout << "\n";
}

void print_node_info(const ::google::protobuf::RepeatedPtrField< ::onnx::NodeProto >& nodes)
{
 for (auto input_data : nodes)
 {
  std::cout << "Node name: " << input_data.name() << " Type: " << input_data.op_type() << "\n";
  std::cout << "Node input: \n";
  print_repeat_string(input_data.input());
  std::cout << "Node output: \n";
  print_repeat_string(input_data.output());
  std::cout << "Attributes: \n";
  print_attributes(input_data.attribute());
 }
}

int main(int argc, char** argv)
{
 // Get some simpler pretrained model to begin with https://github.com/onnx/models
 std::ifstream input("resnet18-v1-7.onnx", std::ios::in | std::ios::binary); // open the file in binary mode
 if (!input)
 {
  std::cerr << "Could not open the model file\n";
  return 1;
 }

 onnx::ModelProto model;
 if (!model.ParseFromIstream(&input)) // returns false if the file is not a valid model
 {
  std::cerr << "Could not parse the model file\n";
  return 1;
 }
 const ::onnx::GraphProto& graph = model.graph(); // a reference, so the weights are not copied

 std::cout << "graph inputs:\n";
 print_io_info(graph.input());

 std::cout << "graph outputs:\n";
 print_io_info(graph.output());

 std::cout << "node info:\n";
 print_node_info(graph.node());

 std::cout << "Initializer: \n";
 print_tensor_info(graph.initializer());
 return 0;
}

If your final goal is to parse the model file and run inference on it, I suggest parsing the graph inputs and outputs first to populate the ValueInfo entries. Then parse the nodes, creating a ValueInfo for any node output that does not exist yet. Finally, parse the initializers to associate the weight values and dimensions with their ValueInfo.
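
As a rough sketch of that order, here is one way the bookkeeping could look, written with plain structs so it stands on its own. Everything named here (ValueSlot, Registry, declare_value, attach_initializer) is hypothetical and of my own choosing; a real implementation would fill these from the protobuf accessors shown earlier and would likely store more than float data.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record kept for every named value in the graph.
struct ValueSlot {
    std::vector<std::int64_t> dims;  // empty until a shape is known
    bool has_data;                   // true once an initializer is attached
    std::vector<float> data;         // weight values, if any
};

typedef std::map<std::string, ValueSlot> Registry;

// Steps 1 and 2: make sure a slot exists for a graph input/output or a node output.
ValueSlot& declare_value(Registry& reg, const std::string& name)
{
    Registry::iterator it = reg.find(name);
    if (it == reg.end()) {
        ValueSlot slot;
        slot.has_data = false;
        it = reg.insert(std::make_pair(name, slot)).first;
    }
    return it->second;
}

// Step 3: attach an initializer's dimensions and values to its slot by name.
void attach_initializer(Registry& reg, const std::string& name,
                        const std::vector<std::int64_t>& dims,
                        const std::vector<float>& values)
{
    ValueSlot& slot = declare_value(reg, name);
    slot.dims = dims;
    slot.data = values;
    slot.has_data = true;
}
```

Calling declare_value for every graph input, graph output, and node output, then attach_initializer for every entry of graph.initializer(), leaves one slot per name, with data only on the weights.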