Altera_Forum
11 years ago

OpenCL kernel is freezing

Hello.

I'm using CentOS 6 and a Pico Computing m506 board.

What is the maximum total size of a kernel's buffer arguments? Must the combined size of all cl_mem buffers be less than CL_DEVICE_GLOBAL_MEM_SIZE, or is some of that space reserved?

I'm seeing some strange behavior:

size_t input_size = 4288000000;      /* bytes; size_t (64-bit host assumed) so large sizes are not truncated */
size_t output_size = 1340000000;     /* bytes */
size_t global_work_size = 67000000;  /* number of work-items */

/* Host-side buffers */
unsigned int* buffer = (unsigned int*) alignedMalloc(input_size);
unsigned int* digest = (unsigned int*) alignedMalloc(output_size);

/* Device-side buffers */
cl_mem inputBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY, input_size, NULL, &status);
cl_mem outputBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY, output_size, NULL, &status);

/* Non-blocking copy of the input data to the device */
status = clEnqueueWriteBuffer(queue, inputBuffer, CL_FALSE, 0, input_size, (void *) buffer, 0, NULL, &write_event[0]);

status = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *) &inputBuffer);
status = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *) &outputBuffer);

/* Launch the kernel once the write has completed */
size_t gws[1] = { global_work_size };
status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, gws, NULL, 1, write_event, &kernel_event);

/* Non-blocking read of the result into the host buffer digest */
status = clEnqueueReadBuffer(queue, outputBuffer, CL_FALSE, 0, output_size, digest, 1, &kernel_event, &finish_event);

clReleaseEvent(write_event[0]);
clWaitForEvents(1, &finish_event);

This pseudocode works.

But the same code with different values for the input and output sizes freezes.

Values:

input_size = 4352000000;

output_size = 1360000000;

global_work_size = 68000000;

With these values, clWaitForEvents(1, &finish_event); never returns.

4 Replies

  • Altera_Forum

    The m506 has only 4GB. I'm not sure how the first one even passed, since input_size + output_size > 4GB. OpenCL also reserves some memory for its own use, so no, you don't get the full 4GB. You can query CL_DEVICE_MAX_MEM_ALLOC_SIZE to get the actual available size.

    Make sure you check for CL_SUCCESS after the clCreateBuffer calls.
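
To make the arithmetic in this reply concrete, here is a minimal sketch in plain C (no OpenCL calls) that checks whether two buffers fit into a given capacity. The helper name buffers_fit and the reading of "4GB" as 4 GiB (2^32 bytes) are assumptions made for illustration, not anything taken from the SDK.

```c
#include <stdint.h>

/* Assumption: "4GB" is read as 4 GiB = 2^32 bytes. */
#define FOUR_GIB 4294967296ULL

/* Hypothetical helper: returns 1 if both buffers together fit into
   `capacity` bytes, 0 otherwise. 64-bit arithmetic is used so that
   sizes near or above 2^32 do not wrap around. */
int buffers_fit(uint64_t input_size, uint64_t output_size, uint64_t capacity)
{
    return input_size + output_size <= capacity;
}
```

With the values from the question, buffers_fit(4288000000, 1340000000, FOUR_GIB) and buffers_fit(4352000000, 1360000000, FOUR_GIB) both return 0, which matches the remark that even the working configuration should not fit on a 4GB board.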
  • Altera_Forum

    Our m506 has 8GB.

    Verified that the kernel mode driver is installed on the host machine.

    Using platform: Altera SDK for OpenCL

    Using Device with name: m506 : m506

    Using Device from vendor: Pico Computing Inc.

    clGetDeviceInfo CL_DEVICE_GLOBAL_MEM_SIZE = 8589934592

    clGetDeviceInfo CL_DEVICE_MAX_MEM_ALLOC_SIZE = 8588886016

    Memory consumed for internal use = 1048576

    Actual maximum buffer size = 8588886016 bytes

    Writing 8191 MB to global memory ...

    Write speed: 2179.79 MB/s [2178.55 -> 2180.43]

    Reading and verifying 8191 MB from global memory ...

    Read speed: 3121.04 MB/s [3120.68 -> 3121.29]

    Successfully wrote and readback 8191 MB buffer

    Transferring 8192 KBs in 16 512 KB blocks ... 2625.61 MB/s

    Transferring 8192 KBs in 8 1024 KB blocks ... 2680.01 MB/s

    Transferring 8192 KBs in 4 2048 KB blocks ... 2875.58 MB/s

    Transferring 8192 KBs in 2 4096 KB blocks ... 2995.90 MB/s

    Transferring 8192 KBs in 1 8192 KB blocks ... 3048.22 MB/s

    PCIe Gen2.0 peak speed: 500MB/s/lane

    Writing 8192 KBs with block size (in bytes) below:

    Block_Size      Avg      Max      Min  End-End (MB/s)
        524288  1791.03  1890.40  1718.31  1784.67
       1048576  1885.19  1918.46  1837.53  1880.12
       2097152  2013.79  2042.29  1960.92  2011.01
       4194304  2100.40  2107.05  2093.78  2099.05
       8388608  2115.81  2115.81  2115.81  2115.81

    Reading 8192 KBs with block size (in bytes) below:

    Block_Size      Avg      Max      Min  End-End (MB/s)
        524288  2528.00  2625.61  2357.35  2514.59
       1048576  2630.47  2680.01  2604.23  2619.32
       2097152  2848.14  2875.58  2826.59  2839.07
       4194304  2980.27  2995.90  2964.80  2977.00
       8388608  3048.22  3048.22  3048.22  3048.22

    Write top speed = 2115.81 MB/s

    Read top speed = 3048.22 MB/s

    Throughput = 2582.02 MB/s

    DIAGNOSTIC_PASSED
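
The numbers in this diagnostic output can be cross-checked with a few lines of plain C. This is only a sketch: the function names are hypothetical, and treating the reported maximum buffer size as a rough upper bound for both buffers combined is an assumption made here for illustration.

```c
#include <stdint.h>

#define GLOBAL_MEM_SIZE    8589934592ULL  /* reported CL_DEVICE_GLOBAL_MEM_SIZE */
#define MAX_MEM_ALLOC_SIZE 8588886016ULL  /* reported CL_DEVICE_MAX_MEM_ALLOC_SIZE */

/* Bytes kept back for internal use, derived from the two reported values. */
uint64_t reserved_bytes(void)
{
    return GLOBAL_MEM_SIZE - MAX_MEM_ALLOC_SIZE;
}

/* Bytes left over after both buffers are placed; a negative result
   would mean the buffers do not fit. */
int64_t headroom(uint64_t input_size, uint64_t output_size)
{
    return (int64_t)MAX_MEM_ALLOC_SIZE - (int64_t)(input_size + output_size);
}
```

reserved_bytes() gives 1048576, the same figure the diagnostic prints. For both the working sizes and the freezing sizes, headroom() stays well above zero, so on this 8GB board raw capacity alone does not seem to explain the freeze.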
  • Altera_Forum

    What version of the SDK are you using? 13.x only works with <= 4GB. Make sure you're using 14.0 or 14.1.
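
One thing worth noting next to the 4GB remark: the working input_size (4288000000) is just below 2^32 = 4294967296, while the freezing one (4352000000) is just above it. The sketch below, with a hypothetical helper name, only demonstrates that boundary in plain C. It is an observation consistent with a 4GB limit, not a confirmed diagnosis.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if `size` cannot be represented in a
   32-bit unsigned integer (i.e. it is 2^32 or larger), 0 otherwise. */
int exceeds_32bit(uint64_t size)
{
    return size > UINT32_MAX;
}
```

exceeds_32bit(4288000000) returns 0 and exceeds_32bit(4352000000) returns 1, so any part of the toolchain that stores a buffer size or offset in 32 bits would handle the first case and not the second.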