A quick tour of Torch internals
来源:互联网 发布:人类学 知乎 编辑:程序博客网 时间:2024/06/03 21:58
Recently, I have been kind of confused. I couldn’t find myself anything to work on and had no ideas for new projects (apparently, I just had to wait for the new academic year to start - I have plenty of ideas now, but no time for them).
Anyway, I often get the impression that many people are using Machine Learning libraries as a kind of black-boxes with only a high-level API. It’s as if they weren’t interested at all in how they work, but solely in the output (this is why I like Torch so much - it’s hackable to the bone). I’ve been using Torch for a few months now and I’ve always been curious how it’s built. This is why I decided to get down to it and browse the code of TH library, which is at the core of Torch.
It’s really a great thing to do. I’ll write more about it in the end of this post, but you should seriously consider doing it with your favourite library or framework too.
Torch’s source is written in plain C, which was very pleasing to me. I don’t really like many C++ features and although I find it very powerful and flexible, it often seems confusing. C’s extremely minimal syntax allows you to read and quickly grasp what exactly happens at any moment. However, if C++ is the way to go for you, there is also a wrapper around TH called thpp.
Where can you get it?
You can find the TH
library in two places:
- In its standalone repository (git subtree of torch7; outdated at the time of writing)
- In torch7 repository in
lib/TH
folder (always up to date)
The folder structure is very simple. There are some cmake tests and definitions in cmake
directory while the code is located both in generic
directory and at the repo root.
Interesting findings
Before going into the details and describing functionality implemented in individual files I’d like to point out some really cool techniques that I’ve found in the implementation.
Code generation
First thing that appeared really strange to me was that many files existed both in the root folder as well as in generic
. If you opened them, you would quickly notice that copies in generic
contain the actual code, while at the root they all look very similar. Here is THStorage.c
for example:
#include "THAtomic.h"#include "THStorage.h"#include "generic/THStorage.c"#include "THGenerateAllTypes.h"#include "generic/THStorageCopy.c"#include "THGenerateAllTypes.h"
Quite unusual for a .c
file, right?
THGenerateAllTypes
sounds interesting so i looked it up and this if what I’ve found:
#ifndef TH_GENERIC_FILE#error "You must define TH_GENERIC_FILE before including THGenerateAllTypes.h"#endif#define real unsigned char#define accreal long#define Real Byte#define TH_REAL_IS_BYTE#line 1 TH_GENERIC_FILE#include TH_GENERIC_FILE#undef real#undef accreal#undef Real#undef TH_REAL_IS_BYTE#define real char#define accreal long#define Real Char#define TH_REAL_IS_CHAR#line 1 TH_GENERIC_FILE#include TH_GENERIC_FILE#undef real#undef accreal#undef Real#undef TH_REAL_IS_CHAR...
Which continued for a few more types. At first I was puzzled, but then I suddenly realized what it does and how brilliant this is! There are no templates in C, but objects like THStorage should be available for different types. It would be a terrible waste to repeat the same implementation with just a few words replaced and this is what this piece achieves! In generic files you can see variables of type real
all over the place. At first it was obvious to me that it’s probably a matter of some compile time optimizations whether it was chosen to be a float or a double, but apparently it’s different - it allows code generation for many other types too!
Clever usage of macros also makes the generic files more readable. Take this example from generic/THStorage.c
:
real* THStorage_(data)(const THStorage *self)
It looks nice, but what about name conflicts for different types? It can’t be THStorage
and THStorage_data
all the time! Worry not, macros take care of that as well:
#define THStorage TH_CONCAT_3(TH,Real,Storage)#define THStorage_(NAME) TH_CONCAT_4(TH,Real,Storage_,NAME)
During preprocessing this function name will be expanded to something like THByteStorage_data
and THStorage
will be replaced with THByteStorage
. Super cool x2!
It’s also smart to use a #line 1 TH_GENERIC_FILE
directive, because if there would be any errors they will appear in the compiler as if they were in the original generic file - not in the middle of the implementation pasted over and over.
I think that these are some awesome ways to make C code more type-agnostic.
OOP & Virtual tables
TH also implements a file API, where you can find good examples of how you could implement some basic OOP patterns in C. There are four files that I’ll be talking about here:
THFilePrivate.h
- defines basic structsTHFile.c
- contains some generic implementationTHDiskFile.c
- code for handling disk filesTHMemoryFile.c
- implementation of in-memory files
Let’s start with the private header file.
struct THFile__{ struct THFileVTable *vtable; int isQuiet; ...};struct THFileVTable{ int (*isOpened)(THFile *self); long (*readByte)(THFile *self, unsigned char *data, long n); long (*readChar)(THFile *self, char *data, long n); long (*readShort)(THFile *self, short *data, long n); long (*readInt)(THFile *self, int *data, long n); ...};
You can see that it defines a virtual method table with pointers to functions that THFile
subclasses will have to implement (THFile
is an abstract class - it has no constructors). Other structs are defined as such:
typedef struct THDiskFile__{ THFile file; FILE *handle; char *name; int isNativeEncoding;} THDiskFile;
What makes this struct interesting is that because it’s first member is of type THFile it’s actually valid to cast struct THDiskFile *
to struct THFile *
and use it normally. What’s more, because THDiskFile
’s constructor fills in the function pointers in file
field’s virtual table, it will behave as THDiskFile
object even when casted to THFile
!
Shared memory
I had little knowledge about UNIX process management and threading until now, when I took up an operating systems course at my university, so it was really interesting to learn about mmap
(maps a file to memory, so you can use it like an array) and to see how memory can be shared between processes with shm_open
. I even wrote a piece of code to try it out. You can find it here.
SIMD
Another cool thing you can find in TH
are vector instructions. There are some cmake tests that check if they are available on your CPU (cmake/FindSSE.cmake
) and several files implementing convolution operations using them (generic/simd/*
). I can’t understand it yet - function that takes 10 lines of code is expanded to unrolled vectorized loop taking more than 120 lines and using APIs with unreadable function names for a SSE beginner. This code spans 134 lines after macro expansion:
void convolve_5x5_1_avx(float* output, float* image, float* weight, long count, long outputStride, long inputStride) { long i = 0; long alignedCount = count & 0xFFFFFFF8; DECLARE_OUTPUT_1() for (; i < alignedCount; i+=8) { CONVOLVE_8COLS_XROWS(1, i) }}
Anyway, it’s definitely a thing worth learning so I will probably write more about it soon!
Allocators
TH
declares it’s own function for memory allocation called THAlloc
. It tries to allocate a properly aligned chunks if you allocate big blocks and handles out-of-memory errors. Before reading Torch’s source I didn’t know about the concept of allocators. They are just small virtual tables providing their own memory management API (alloc, realloc, free). It’s cool that you can pass an Allocator to THStorage
or THTensor
and construct it not only in the regular heap region, but also allocate it in the shared memory.
Random module
It’s natural to have a pseudorandom number generator in all programming languages, but I’ve never read an implementation of one (ok, except the linear congruential generator). In THRandom.c
you can find a full implementation of Mersenne twister, which (according to Wikipedia) is a default implementation for R, Python, Ruby, PHP, CMU Common Lisp, GLib, MATLAB and some more. There are also several methods which convert returned uniform distribution into other shapes.
Quick library overview
In this section I will briefly describe most of the functionalities provided by TH
.
- THAllocator
- creates a default default allocator, which just calls
TH
memory management functions and, if possible, aTHMapAllocator
that can map files or shared memory objects into memory.
- creates a default default allocator, which just calls
- THAtomic
- multiplatform implementation of atomic operations
- THTensor
- defines a general Tensor type
- supports lots of indexing, linear algebra and math operations
- available for all primitive datatypes (
TH<type>Tensor
, e.g.THFloatTensor
)
- THBlas
- wraps BLAS library for use in
THTensor
- provides a general implementation as a fallback
- wraps BLAS library for use in
- THLapack
- wraps LAPACK library for use in
THTensor
- DOESN’T provide fallbacks - throws errors if called
- wraps LAPACK library for use in
- THFile
- abstract file class
- only creates wrappers for calling methods contained in virtual table
- THDiskFile
- concrete file class
- wraps disk file APIs
- THMemoryFile
- concrete file class
- operates on an in-memory buffer and fakes file operations
- THGeneral
- implements general utilities
- contains memory management routines
- can notify external GCs
- THRandom
- implements a random number generator
- can sample from many distributions
- THStorage
- defines a general storage object
- contains mainly bookkeeping code
- available for all primitive datatypes (
TH<type>Storage
, e.g.THFloatStorage
)
How to use it
If you want to install TH
you can either perform a full Torch installation or you can follow these steps:
# clone Torch repositorygit clone https://github.com/torch/torch7mkdir th_buildcd th_build# configure TH buildcmake ../torch7/lib/TH# compile librarymake# install shared library and header filesmake install
Then, you only have to #include <TH/TH.h>
in your program and link the library during the compilation process (-lTH
).
Example program
To wrap up I just wanted to show you an example program using TH
. It will simply load 10 floats from two files into tensors, compute their dot product and add to it a sum of all values in one of them. This is the code:
#include "TH/TH.h"int main(){ THFile *x_file = THDiskFile_new("x", "r", 0); THFile *y_file = THDiskFile_new("y", "r", 0); THFloatTensor *x = THFloatTensor_newWithSize1d(10); THFloatTensor *y = THFloatTensor_newWithSize1d(10); THFile_readFloat(x_file, x->storage); THFile_readFloat(y_file, y->storage); double result = THFloatTensor_dot(x, y) + THFloatTensor_sumall(x); printf("%f\n", result); THFloatTensor_free(x); THFloatTensor_free(y); THFile_free(x_file); THFile_free(y_file); return 0;}
All input parsing and possible errors are handled by Torch. Convenient, isn’t it?
Afterthoughts
I actually enjoy reading other’s source code - especially if it’s well written. If you have some spare time, then seriously, consider picking your favourite library or framework, and try to understand how it works - even the tiniest bits of it. I guarantee that you will find many fascinating things and learn many concepts and ways of structuring your code that you had no idea existed. I haven’t learned that much in such short period of time for a while. I liked it so that I’m thinking about doing this on a more regular basis.
TH
has no documentation at the moment. Since I’ve already studied most of it’s code, I’ll probably try to write at least a bit. I’ve used Torch for so long that it’s time to make some contribution myself.
Thanks for reading! I hope that you liked it!
Source: https://apaszke.github.io/torch-internals.html
- A quick tour of Torch internals
- a quick tour of many tools
- Chapter 1 A Quick Tour
- A Tour of C++
- Gensim and LDA: a quick tour
- A Tour of Golang (二)
- [torch]Getting the Output of a Layer
- Hadoop Internals - Anatomy of a MapReduce Job
- A Quick Overview of MSAA
- Introduction A Guided Tour of the POCO
- A Tour of Go Named results
- A tour of Go的疑问
- A Tour of Machine Learning Algorithms
- Chapter 1 - A Tour of Computer System
- A Tour of Golang (一)
- A Tour of Go: Exercise: Rot13 Reader
- A Tour of Machine Learning Algorithms
- A Tour of Go---Exercise: Fibonacci closure
- eMMC分区详解
- 【HDU
- POJ 2062 Card Game Cheater
- 2017-10-02(一个水杯引发的思考)
- Android Studio安装genymotion插件并运行
- A quick tour of Torch internals
- java中如何按输入的顺序遍历一个map和set
- Android JNI 通过C/C++调用JAVA方法
- [PAT] 1015. 德才论 (25)
- 关于switch case C语言
- 国庆郑州集训day1 下午:基本算法
- Hibernate中的@Entity注解
- ubuntu软件安装问题
- 【物理分析】2017.10.2杂题[电阻]题解