Native CUDA array to TorchLib Tensor
Published:
A small problem I encountered was how to convert a native CUDA array to a TorchLib Tensor in C++. Some posts such as this give a very good example of doing so, but the “tricky part” of the image step is not clearly stated. According to the OpenCV doc, step means “a distance between successive rows in bytes; includes the gap if any”, so it should be the length of one row in bytes. So in the orginal github issue, long long step = image.step / sizeof(float);
basically means the actual length of the row.
In my current project, I need to put a native CUDA array of uchar4 into a tensor. The correct way to do that should be:
auto options = torch::TensorOptions().dtype(torch::kUInt8).device(torch::kCUDA, 0);
long long step = 256 * sizeof(uchar4) / sizeof(unsigned char); //should be 256 * 4 in short. Just to techinically show how to get the step.
std::vector<int64_t> strides = {step, 4, 1};
torch::Tensor test = torch::from_blob(g_dstBuffer, dims,strides, deleter,options);// g_dstBuffer is the uchar4* CUDA native array.