In search of my own ML model inference library

Scott Jin
3 min read · Sep 23, 2024


Over the last few stories, I have been trying to build a way to run inference on trained machine-learning models on edge devices. I am glad to say that I have reached a milestone: I can run the DTLN model with pure C code and only oneMKL as a dependency. Sadly, the code was written too hastily. The variable names are not consistent, debug code is still scattered around, I haven't finished "#ifdef"-ing the platform-specific parts, and I still need to implement fallback functions for every operation, so my pride as a programmer doesn't let me make it fully public yet. With the semester of my graduate program starting, I don't have much time, but I am eager to share what I have learned. A small working example is available to try out here: https://github.com/JINSCOTT/DTLN-in-C.

Project environment and dependency

Although the goal is to run these computations on an edge device, I am still in the concept-development phase, so I program and deploy on my regular work computer. As stated previously, I am still using oneMKL, which means I am still developing on an Intel CPU and, to be frank, on Windows. The reason for using oneMKL is that it provides a GEMM implementation as well as the discrete Fourier transform (DFT) I need to run the DTLN model. If I have the time, I would like to replace oneMKL with FFTW3 and OpenBLAS on Linux-based platforms.
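For reference, here is a minimal sketch of the two oneMKL features this relies on: a CBLAS GEMM call and a one-dimensional single-precision real-to-complex DFT through the DFTI interface. The matrix sizes and the 512-sample frame length are just illustrative values, not the library's actual code.

```c
#include <mkl.h>   /* oneMKL: CBLAS GEMM + DFTI descriptors */

/* Illustrative use of the two oneMKL features mentioned above. */
void mkl_sketch(void)
{
    /* GEMM: C = 1.0 * A(2x3) * B(3x2) + 0.0 * C, row-major */
    float A[6] = {1, 2, 3, 4, 5, 6};
    float B[6] = {1, 0, 0, 1, 1, 1};
    float C[4] = {0};
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 3, 1.0f, A, 3, B, 2, 0.0f, C, 2);

    /* 1-D real-to-complex forward DFT of a 512-sample frame */
    float in[512] = {0};
    MKL_Complex8 out[257];                 /* N/2 + 1 complex bins */
    DFTI_DESCRIPTOR_HANDLE h;
    DftiCreateDescriptor(&h, DFTI_SINGLE, DFTI_REAL, 1, (MKL_LONG)512);
    DftiSetValue(h, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
    DftiSetValue(h, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
    DftiCommitDescriptor(h);
    DftiComputeForward(h, in, out);
    DftiFreeDescriptor(&h);
}
```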

The Library: from top to bottom

To use the library, put all of its source files, together with the generated header, into your project as one of its components.

From top to bottom, the hierarchy is "model", "nodes", and "ops". Each node holds one or more "tensor"s. There are also other files that contain shared data structures and utility functions.
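To make the hierarchy more concrete, here is a sketch of what these pieces could look like as C structs. The layouts and field names below are my own simplification for this article, not the library's actual definitions.

```c
/* Illustrative sketch of the model/node/tensor hierarchy described above. */
typedef struct tensor {
    float *data;          /* raw buffer */
    int   *shape;         /* dimension sizes */
    int    ndim;          /* number of dimensions */
} tensor;

typedef struct node {
    int           op_type;     /* which op to run (ADD, LSTM, ...) */
    tensor      **inputs;      /* one or more input tensors */
    int           num_inputs;
    tensor       *output;
    void         *attributes;  /* op-specific attributes */
    struct node  *next;        /* next node in the linked list */
} node;

typedef struct model {
    node *head;                /* first node of the linked list */
} model;
```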

The generated header file contains a function that creates a "model": it pushes the "nodes", the types of the layers, and their associated attributes onto a linked list. The "inference_model" function then traverses the linked list and runs the nodes one by one. The nodes are executed through "function_ops" and "ops"; the former take tensors as input, while the latter take raw pointers.
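Putting it together, a caller might look roughly like the sketch below. I am assuming here that the generated header is named "model.h" and that model creation is exposed as a create_model() function; those names, and the exact signature of "inference_model", come from the generator and may differ in the actual repo.

```c
/* Hypothetical usage sketch; names and signatures are assumptions. */
#include "model.h"

int main(void)
{
    model *m = create_model();   /* builds the linked list of nodes */
    /* ... fill the model's input tensor(s) with audio frames ... */
    inference_model(m);          /* traverses the list, runs nodes one by one */
    /* ... read the results from the output tensor(s) ... */
    return 0;
}
```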

The GitHub repo I provided is an example of how it is envisioned to be deployed.

Currently implemented data types

Currently implemented nodes

Here is the list of currently implemented nodes. They are meant to match the ONNX operator definitions; however, many of them are not well tested and are only partially implemented, which I may improve when I have time. A minimal sketch of what one of these ops can look like follows the list.

  1. ADD
  2. SUB
  3. DIV
  4. MUL
  5. SQRT
  6. TANH
  7. SIGMOID
  8. RELU
  9. ABS
  10. ACOS
  11. ACOSH
  12. ATAN
  13. ATANH
  14. ASIN
  15. ASINH
  16. TRANSPOSE
  17. SLICE
  18. SQUEEZE
  19. LSTM
  20. CONCAT
  21. MATMUL
  22. UNSQUEEZE
  23. CONV (one- and two-dimensional only)
  24. REDUCEMEAN
  25. SPLIT
  26. PAD
  27. GEMM
  28. RESHAPE
  29. CONSTANT
  30. CLIP
  31. ARGMAX
  32. ARGMIN
  33. AVERAGE POOL
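As promised above, here is a hypothetical elementwise ADD in the "raw pointer" op style described earlier. The name and signature are illustrative, not the library's actual API, and ONNX Add's broadcasting support is omitted for brevity.

```c
#include <stddef.h>

/* Hypothetical elementwise ADD on raw float buffers (no broadcasting). */
static void op_add_f32(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```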

Final words

From now on, I will try to improve every part of the library and write a better description of each piece. Meanwhile, I am studying HPC and building a better understanding of operating systems, so hopefully I will be much more resourceful at optimizing code in the near future.

If you are interested, please give me advice or tell me what you would like me to clarify. Check out the code if you think it might be useful for your use case!

