❔ reduce CPU/GPU transfer ComputeSharp
I'm working on a project with ComputeSharp, and my problem is that I need to run a lot of compute shaders one after another. Is there a way to control this process from the GPU, so that my code isn't slowed down significantly by the CPU/GPU round trips? Or how should I go about this?
I can't just move it all into one shader because some of them depend on the results of the previous one and there is obviously no way to synchronize.
I think a ComputeContext from ComputeSharp 2.0 is what I want
@Sergio Maybe something for you
Correct, you should use that to create a single pipeline with all your shaders and execute them in a single batch 🙂
Remember to also insert the appropriate barriers where needed
I wasn't able to find any documentation on those pipelines, how do they work?
Also, what do I do if I need a list of Texture2Ds? Obviously I can't keep the array on the CPU, but there are no arrays on the GPU, and the dimensions of a Texture3D are too limiting
Unless you think I should flatten weights into a 1d buffer?
@Sergio I legit have no idea how I'm meant to store my neural net layers on the GPU...
Have you read the wiki in the repo?
the regular GitHub wiki? it doesn't seem to have much info beyond the getting started section, and none of the 5 pages even mention pipelines or ComputeContext
or is there another wiki I'm missing?
Oh, I thought I had added wikis for that too
Anyway it's relatively easy, you can see an example here: https://github.com/Sergio0694/ComputeSharp/blob/78c5ae0e6ce8acfa6c5760ade8aee997fe7089f8/samples/ComputeSharp.ImageProcessing/Processors/HlslGaussianBlurProcessor%7BTPixel%7D.cs#L107
Just create a compute context in a using, and use methods on it. There's lots of XML docs on all the APIs with more info
You can also find lots of examples in the unit tests
Also, if you have a specific question about something I can probably help; otherwise this is pretty general
right now I have a list of texture2D references on the CPU that I loop through and do stuff on one at a time, and the calculation depends on the previous one
what I want is a way to do this without having to go back to a list on the CPU each time
I don't know how I can save the data on the GPU though since the structure doesn't play nicely, do I just flatten it?
as for the compute context, can I just put a loop there and run the shader in the context that many times? Or how do I set up a pipeline with a dynamic number of steps like that?
@Sergio
HLSL supports a Texture2DArray, is there a way to use that from ComputeSharp? I don't see one...
I don't support texture arrays, no
But why are you using textures at all
If you're working with ND tensors in a neural network, you'll want to just use structured buffers
Textures are for when you need to do sampling, which is not the case here
Use a buffer and manually index things
I have a sample for this too
Eg. here's a fully connected neural network layer (with no activation): https://github.com/Sergio0694/ComputeSharp/blob/78c5ae0e6ce8acfa6c5760ade8aee997fe7089f8/samples/ComputeSharp.Benchmark/Blas/BlasHelpers.cs#L84
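To make the "use a buffer and manually index things" idea concrete, here is a minimal CPU-side sketch in Python, not ComputeSharp code. It assumes each layer is a row-major weight matrix and packs all of them into one flat list with per-layer offsets, which is roughly what a single structured buffer would hold. The names `pack_layers`, `dense_forward` and `run_network` are made up for illustration, and on the GPU the inner computation would live in a shader.

```python
# Sketch only: plain Python standing in for a flat GPU buffer.
# All function names are hypothetical, not part of ComputeSharp.

def pack_layers(layers):
    """Flatten row-major matrices into one list; return (flat, offsets, shapes)."""
    flat, offsets, shapes = [], [], []
    for matrix in layers:
        rows, cols = len(matrix), len(matrix[0])
        offsets.append(len(flat))      # where this layer starts in the flat buffer
        shapes.append((rows, cols))
        for row in matrix:
            flat.extend(row)
    return flat, offsets, shapes


def dense_forward(flat, offset, rows, cols, x):
    """Fully connected layer, no activation: y[j] = sum_i x[i] * W[i, j].

    W[i, j] is read manually from flat[offset + i * cols + j].
    """
    return [
        sum(x[i] * flat[offset + i * cols + j] for i in range(rows))
        for j in range(cols)
    ]


def run_network(layers, x):
    """Run every layer in order; each step consumes the previous step's output."""
    flat, offsets, shapes = pack_layers(layers)
    for offset, (rows, cols) in zip(offsets, shapes):
        x = dense_forward(flat, offset, rows, cols, x)
    return x
```

With this layout the layers can have different sizes, since only the offsets and shapes need to be known per step, so no texture array or Texture3D is needed.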
Maybe I could build a full neural network sample with ComputeSharp at some point, even just something simple like a network to classify images from the MNIST dataset 🙂
what about convolutional layers? lol
textures would be pretty useful there...
is there a particular reason texture arrays aren't supported? or did you just never find the need for them?
they seem pretty useful in some contexts
Not really, no
You'd still not need to sample, so you still don't need a texture
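As a rough illustration of why a convolutional layer does not need sampling either, here is a small Python sketch (again CPU-side, not ComputeSharp code). It assumes a single-channel image and a square kernel, both stored as flat row-major lists, and computes a "valid" cross-correlation, which is what most neural network libraries call convolution. The name `conv2d_valid` is hypothetical.

```python
# Sketch only: every read is an exact integer index into a flat buffer,
# so no texture sampling or filtering is involved.

def conv2d_valid(image, width, height, kernel, ksize):
    """Return (output, out_w, out_h) for a valid cross-correlation."""
    out_w = width - ksize + 1
    out_h = height - ksize + 1
    out = [0.0] * (out_w * out_h)
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for ky in range(ksize):
                for kx in range(ksize):
                    pixel = image[(y + ky) * width + (x + kx)]  # manual 2D -> 1D index
                    weight = kernel[ky * ksize + kx]
                    acc += pixel * weight
            out[y * out_w + x] = acc
    return out, out_w, out_h
```

On the GPU, the two outer loops would typically become the dispatch dimensions, with one thread per output element.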