OpenCL: Difference between revisions

===Linux===
https://gist.github.com/Brainiarc7/dc80b023af5b4e0d02b33923de7ba1ed
<pre>
sudo apt install ocl-icd-opencl-dev opencl-headers
sudo apt install opencl-c-headers opencl-clhpp-headers
</pre>


==Getting Started==
See https://www.eriksmistad.no/getting-started-with-opencl-and-gpu-computing/


{{hidden | C example |
vector_add_kernel.cl
<syntaxhighlight lang="c">
__kernel void vector_add(__global const int *A, __global const int *B, __global int *C) {
    // ... (lines omitted)
}
</syntaxhighlight>
<syntaxhighlight lang="c">
#include <stdio.h>
#include <stdlib.h>

#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#ifdef __APPLE__
#include <OpenCL/opencl.h>
// ... (lines omitted)
}
</syntaxhighlight>
 
}}


===C++===
[https://github.khronos.org/OpenCL-CLHPP/index.html#intro C++ Bindings]<br>
While you can use the C bindings in a C++ application, Khronos also provides C++ bindings in <code>CL/cl.hpp</code> (or <code>CL/cl2.hpp</code>), which are much easier to use alongside standard containers such as <code>std::vector</code>.
The C++ wrapper objects are reference-counted, so you do not need to release buffers manually.
{{hidden | C++ example |
<syntaxhighlight lang="cpp">

#include <CL/cl.hpp>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main(void) {
  int ret = 0;
  // Create the two input vectors
  const int LIST_SIZE = 1024;
  std::vector<int> A(LIST_SIZE);
  std::vector<int> B(LIST_SIZE);
  for (int i = 0; i < LIST_SIZE; i++) {
    A[i] = i;
    B[i] = LIST_SIZE - i;
  }
 
  // Load the kernel source code into the string source_str
  std::string source_str;
  {
    std::ifstream file("vector_add_kernel.cl");
    if (!file) {
      std::cerr << "Error opening vector_add_kernel.cl" << std::endl;
      exit(EXIT_FAILURE);
    }
    file.seekg(0, std::ios::end);
    source_str.resize(file.tellg());
    file.seekg(0, std::ios::beg);
    file.read(&source_str[0], source_str.size());
  }
 
  // Get platform and device information
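  std::vector<cl::Platform> platforms;
  ret = cl::Platform::get(&platforms);
  if (ret != CL_SUCCESS || platforms.empty()) {
    std::cerr << "No OpenCL platform found" << std::endl;
    exit(EXIT_FAILURE);
  }
 
  // Use the first platform; it must expose at least one device
  std::vector<cl::Device> devices;
  ret = platforms[0].getDevices(CL_DEVICE_TYPE_ALL, &devices);
  if (ret != CL_SUCCESS || devices.empty()) {
    std::cerr << "No OpenCL device found" << std::endl;
    exit(EXIT_FAILURE);
  }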
 
  // Create an OpenCL context
  cl::Context context(devices[0], NULL, NULL, NULL, &ret);
 
  // Create a command queue
  cl::CommandQueue command_queue(context, devices[0], 0UL, &ret);
 
  // Create memory buffers on the device for each vector
  cl::Buffer a_mem_obj(context, CL_MEM_READ_ONLY, LIST_SIZE * sizeof(int));
  cl::Buffer b_mem_obj(context, CL_MEM_READ_ONLY, LIST_SIZE * sizeof(int));
  cl::Buffer c_mem_obj(context, CL_MEM_READ_WRITE, LIST_SIZE * sizeof(int));
 
  // Copy the lists A and B to their respective memory buffers
  ret = cl::copy(command_queue, A.begin(), A.end(), a_mem_obj);
  ret = cl::copy(command_queue, B.begin(), B.end(), b_mem_obj);
 
  // Create a program from the kernel source
  cl::Program program(context, source_str);
 
  // Build the program
  ret = program.build(std::vector<cl::Device>{devices[0]});
  if (ret != CL_SUCCESS) {
    std::cerr << "Error building program" << std::endl;
    exit(EXIT_FAILURE);
  }
 
  // Create the OpenCL kernel
  cl::Kernel kernel(program, "vector_add", &ret);
  if (ret != CL_SUCCESS) {
    std::cerr << "Error creating kernel" << std::endl;
    exit(EXIT_FAILURE);
  }
 
  // Set the arguments of the kernel
  ret = kernel.setArg(0, a_mem_obj);
  ret = kernel.setArg(1, b_mem_obj);
  ret = kernel.setArg(2, c_mem_obj);
 
  // Execute the OpenCL kernel on the list
  cl::NDRange global_item_size(LIST_SIZE); // Process the entire lists
  cl::NDRange local_item_size(64); // Divide work items into groups of 64
  ret = command_queue.enqueueNDRangeKernel(kernel, cl::NullRange,
                                           global_item_size, local_item_size);
  if (ret != CL_SUCCESS) {
    std::cerr << "Error starting kernel" << std::endl;
    exit(EXIT_FAILURE);
  }
 
  // Read the memory buffer C on the device to the local variable C
  std::vector<int> C(LIST_SIZE);
  ret = cl::copy(command_queue, c_mem_obj, C.begin(), C.end());
  if (ret != CL_SUCCESS) {
    std::cerr << "Error copying C from gpu to memory " << ret << std::endl;
    exit(EXIT_FAILURE);
  }
 
  // Display the result to the screen
  for (int i = 0; i < LIST_SIZE; i++)
    printf("%d + %d = %d\n", A[i], B[i], C[i]);


  return 0;
}
</syntaxhighlight>
}}
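The example above works because the list size (1024) is a multiple of the work-group size (64). In OpenCL 1.x the global size must be evenly divisible by the local size, so for arbitrary list lengths a common approach is to round the global size up and have the kernel skip the extra work items. The following is a minimal sketch of the rounding step only; <code>round_up_global_size</code> is a hypothetical helper, not part of the OpenCL bindings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the OpenCL API): rounds global_size up to
// the next multiple of local_size so the NDRange divides evenly.
std::size_t round_up_global_size(std::size_t global_size, std::size_t local_size) {
  assert(local_size > 0);
  const std::size_t remainder = global_size % local_size;
  return remainder == 0 ? global_size : global_size + (local_size - remainder);
}
```

With this approach the kernel would also need the real list length as an extra argument and a bounds check such as <code>if (i < n)</code>, since the padded work items have no data to process. That extra argument is an assumption of this sketch and is not present in the kernel above.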
 
===Python===
See [https://documen.tician.de/pyopencl/index.html pyopencl].


===Julia===


==Usage==
===Scalar Types===
[https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/scalarDataTypes.html OpenCL 1.2 Scalar Data Types]<br>
All OpenCL devices support single-precision <code>float</code>, but support for double-precision <code>double</code> is optional and varies by device.<br>
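One way to check for double support at runtime is to look for the <code>cl_khr_fp64</code> extension in the device's extension list, which OpenCL reports as a space-separated string (<code>CL_DEVICE_EXTENSIONS</code>). The sketch below only covers the string matching and assumes the extension string has already been queried from the device; <code>has_extension</code> is a hypothetical helper name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in a
// space-separated extension list. Whole-token matching avoids false positives
// from extensions that merely share a prefix.
bool has_extension(const std::string& extensions, const std::string& name) {
  assert(!name.empty());
  std::istringstream tokens(extensions);
  std::string token;
  while (tokens >> token) {
    if (token == name) return true;
  }
  return false;
}
```

With the C++ bindings, the input string would typically come from <code>device.getInfo<CL_DEVICE_EXTENSIONS>()</code>.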
===Vector Types===
[https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/dataTypes.html OpenCL Data Types]<br>
[https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/vectorDataTypes.html OpenCL 1.2 Vector Data Types]<br>
Like GLSL, OpenCL supports vector types such as <code>float3</code>:
<syntaxhighlight lang="c">float3 my_vec = (float3)(1.0f);</syntaxhighlight>
Its components are accessed as <code>my_vec.x</code>, <code>my_vec.y</code>, and <code>my_vec.z</code>.<br>
To convert between vector types, use <code>convert_T()</code>, where <code>T</code> is the destination type (e.g. <code>convert_float4()</code>).<br>
;Notes
* 3-component data types have the size and alignment of the corresponding 4-component type. I.e. an array of 4 <code>uchar3</code> elements occupies the same memory as an array of 4 <code>uchar4</code> elements.
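Because of this padding, tightly packed 3-byte host data (such as RGB pixels) cannot be copied directly into a buffer that a kernel reads as <code>uchar3</code>. A minimal host-side sketch of the repacking, assuming the input is a flat byte array with three bytes per element; <code>pad_uchar3</code> is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands tightly packed 3-byte elements (x, y, z) into
// 4-byte slots, matching the layout a kernel sees for a uchar3 array.
std::vector<std::uint8_t> pad_uchar3(const std::vector<std::uint8_t>& packed) {
  assert(packed.size() % 3 == 0);
  const std::size_t count = packed.size() / 3;
  std::vector<std::uint8_t> padded(count * 4, 0);  // fourth byte is padding
  for (std::size_t i = 0; i < count; ++i) {
    padded[4 * i + 0] = packed[3 * i + 0];
    padded[4 * i + 1] = packed[3 * i + 1];
    padded[4 * i + 2] = packed[3 * i + 2];
  }
  return padded;
}
```

The padded vector can then be uploaded as usual, with a buffer size of 4 bytes per element rather than 3.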


==OpenGL Interop==
Setting up OpenCL/OpenGL interop is fairly complicated and very hard to debug. 
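You will also need to synchronize OpenGL and OpenCL so they do not access the same memory at the same time, e.g. by acquiring and releasing shared objects with <code>clEnqueueAcquireGLObjects</code> and <code>clEnqueueReleaseGLObjects</code>. 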
If you can, just use OpenGL compute shaders rather than OpenCL to simplify your life.


===Textures===
See [https://software.intel.com/content/www/us/en/develop/articles/opencl-and-opengl-interoperability-tutorial.html OpenCL™ and OpenGL* Interoperability Tutorial].
In C++, you can use [https://github.khronos.org/OpenCL-CLHPP/classcl_1_1_image_g_l.html <code>cl::ImageGL</code>] to access textures in OpenGL. 
Note that <code>cl::Image</code> and <code>cl::Buffer</code> are not the same thing. Interchanging them will result in <code>CL_INVALID_MEM_OBJECT</code> errors or similar.
I recommend writing to a separate buffer and copying to images.
See [https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/clCreateFromGLTexture.html clCreateFromGLTexture] to get a list of compatible pixel formats. 
If in doubt, use <code>GL_RGBA8</code> which is the most likely format to be supported.
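Since interop failures usually surface only as numeric error codes such as <code>CL_INVALID_MEM_OBJECT</code>, it can help to print the symbolic name when debugging. The following is a minimal sketch covering a handful of codes; <code>cl_error_name</code> is a hypothetical helper, and the numeric values are copied from the Khronos <code>CL/cl.h</code> header so that the sketch compiles without the OpenCL headers.

```cpp
#include <string>

// Hypothetical helper: maps a few OpenCL error codes to their symbolic names.
// The numeric values mirror the definitions in the Khronos CL/cl.h header.
std::string cl_error_name(int code) {
  switch (code) {
    case 0:   return "CL_SUCCESS";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -38: return "CL_INVALID_MEM_OBJECT";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    case -60: return "CL_INVALID_GL_OBJECT";
    default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
  }
}
```

In a real program you would include the OpenCL header and switch on the named constants instead of the literal values.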
===Buffers===
* [https://web.engr.oregonstate.edu/~mjb/cs575/Handouts/opencl.opengl.vbo.1pp.pdf Oregon State VBO Interop]
* [https://github.khronos.org/OpenCL-CLHPP/classcl_1_1_buffer_g_l.html cl::BufferGL]


==Advanced Topics==
====Local Memory v. Global Memory====
[[Category:GPU Programming languages]]