Altera_Forum
Honored Contributor
9 years ago
Emulation of "asian_option" example appears to hang
Hi,
I want to learn how to write an application composed of multiple kernels that communicate through channels. For that purpose, I am using the "asian_option" example downloaded from: https://www.altera.com/support/support-resources/design-examples/design-software/opencl/black-scholes.html This is the compiler version:
>>aocl version
aocl 16.1.0.196 (Intel(R) FPGA SDK for OpenCL(TM), Version 16.1.0 Build 196, Copyright (C) 2016 Intel Corporation)
This is how I compiled the original, unmodified code (both host and device):
>>make
>>aoc -march=emulator -v --board a10gx device/asian_option.cl -o bin/asian_option.aocx
Emulation:
>>CL_CONTEXT_EMULATOR_DEVICE_ALTERA=1 ./host
Querying platform for info:
==========================
CL_PLATFORM_NAME = Intel(R) FPGA SDK for OpenCL(TM)
CL_PLATFORM_VENDOR = Altera Corporation
CL_PLATFORM_VERSION = OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 16.1
Programming Device(s)
Using AOCX: asian_option.aocx
Starting Computations
This is where NO OUTPUT appears. Checking htop during the emulation, I saw three processes each using almost 100% of CPU, and one process using almost 300% of CPU. Using gdb, I found that the program hangs in a specific API call at main.cpp:337, inside get_result():
double get_result(int device_id)
{
#if USE_SVM_API == 0
// Read back the single result from the kernel
status = clEnqueueReadBuffer(accumulate_queue, kernel_result, CL_TRUE, 0, sizeof(cl_double), &X, 0, NULL, NULL);
...
Commenting out that call lets the emulation complete, but obviously the "Resulting Price" value is then wrong:
>>CL_CONTEXT_EMULATOR_DEVICE_ALTERA=1 ./host
Querying platform for info:
==========================
CL_PLATFORM_NAME = Intel(R) FPGA SDK for OpenCL(TM)
CL_PLATFORM_VENDOR = Altera Corporation
CL_PLATFORM_VERSION = OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 16.1
Programming Device(s)
Using AOCX: asian_option.aocx
Starting Computations
DEVICE 0: r=0.08 sigma=0.30 T=1.0 S0=30.0 K=29.0 : Resulting Price is 0.000000
1 Devices ran a total of 2.09715e+11 Simulations
Throughput = 275295.56 Billion Simulations / second
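As a sanity check on the log above, the reported throughput can be inverted to get the elapsed time the host must have measured. This is only a minimal sketch using the two figures printed in the log; the function name implied_elapsed_seconds is made up for illustration and is not part of the example's host code.

```cpp
// Hypothetical helper, not part of the asian_option host code.
// Inverts the "Throughput" line of the log to recover the elapsed
// time the host must have measured.
double implied_elapsed_seconds(double total_simulations,
                               double billions_per_second)
{
    // The log reports throughput in billions of simulations per second.
    const double simulations_per_second = billions_per_second * 1e9;
    return total_simulations / simulations_per_second;
}
```

With the figures from the log, implied_elapsed_seconds(2.09715e11, 275295.56) comes out at roughly 0.00076 s. That seems far too short for about 2.1e11 simulations on a CPU emulator, which suggests that without the blocking read the host stops its timer without waiting for the kernels, so the throughput figure is probably as unreliable as the price.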
I do not understand the reason for this. I ran the "hello_world" example before without problems, but I always run into hangs when using designs composed of multiple kernels and channels. Could anyone please give some tips to solve this?
Regards,
Leo