Forum Discussion

Altera_Forum
14 years ago

Crash

Hi,

Some months ago I wrote a C++ program running on MicroC/OS-II on a NIOS processor implemented in the FPGA of an Altera board.

The software consisted of three threads that did nothing useful: each thread contained only a cout. Before each cout a semaphore was taken and, after the cout, a semaphore was released. The three threads were synchronized by semaphores.

If the threads are named A, B, and C:

1) A ran; B and C waited

2) A released a semaphore, B took the semaphore and ran; A and C waited

3) B released a semaphore, C took the semaphore and ran; A and B waited

4) C released a semaphore, A took the semaphore and ran; B and C waited

and so on...
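
For readers following along, the hand-off above can be sketched in portable C++. This is only an illustration, not the code from the original program: the Semaphore class, run_ring, and the use of std::thread are assumptions standing in for MicroC/OS-II tasks and its OSSemPend/OSSemPost calls, and appending a letter to a string stands in for the cout.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Minimal counting semaphore (hypothetical helper, not the RTOS one).
class Semaphore {
public:
    explicit Semaphore(unsigned initial) : count_(initial) {}

    void pend() {  // plays the role of OSSemPend(): block until count > 0
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void post() {  // plays the role of OSSemPost(): add one and wake a waiter
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    unsigned count_;
};

// Runs threads A, B and C for `rounds` turns each and returns the order
// in which they got their turn.
std::string run_ring(int rounds) {
    assert(rounds >= 0);
    Semaphore a(1), b(0), c(0);          // only A may start
    Semaphore* sems[3] = {&a, &b, &c};
    std::string order;                   // written only by the thread holding the turn

    auto worker = [&](int id) {
        for (int i = 0; i < rounds; ++i) {
            sems[id]->pend();                      // wait for my turn
            order += static_cast<char>('A' + id);  // stands in for the cout
            sems[(id + 1) % 3]->post();            // hand over to the next thread
        }
    };

    std::thread ta(worker, 0), tb(worker, 1), tc(worker, 2);
    ta.join();
    tb.join();
    tc.join();
    return order;
}
```

Calling run_ring(3) gives ABCABCABC, because each thread can only pass its pend after the previous thread has posted.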

After two hours the program crashed. This happened every time I launched the program, and I didn't understand why.

Does anyone have an idea?

Thank you very much

18 Replies

  • Altera_Forum

    I suspect the code spends most of its time blocked waiting for the JTAG UART inside 'cout'. This probably isn't the intention of the test!

  • Altera_Forum

    --- Quote Start ---

    Although I don't know the exact purpose of the fflush function, it seems your tasks have no idle/sleep state; I mean they continuously switch from one to the other at the maximum speed the scheduler allows.

    IMHO this can generate two issues:

    1. JTAG allows rather low throughput; the huge amount of output traffic generated by this relay race among tasks can easily choke the interface and affect the operation of the whole system

    2. As I said in the previous post, the absence of sleep instructions or significant execution times will also make higher-priority tasks fall into the pending state immediately, even before the lower-priority one. In such a situation you could get an inversion of the expected flow and possibly overlapping messages.

    If tasks are actually not required to switch at that incredible rate, a possible solution would be to insert a TK_SLEEP(ticks) instruction between the cout print and the next OSSemPost.

    A delay of a few ticks will ensure the tasks are sequenced in the correct order.

    A better solution would be to implement a 3-state machine in a single task, if your complete project allows it; as it stands, I can't see the point in running 3 tasks if only one runs at a time.

    --- Quote End ---
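
The single-task state machine suggested in the quote could look roughly like this. It is a sketch under assumptions: the names Step, next_step and run_state_machine are made up for illustration, and the trace string stands in for the cout output. On the real system a delay call (for example OSTimeDly from MicroC/OS-II, or the TK_SLEEP mentioned in the quote) would presumably go at the end of each loop pass so that other tasks get a chance to run.

```cpp
#include <cassert>
#include <string>

// The three former threads become three states of one task.
enum class Step { A, B, C };

// Returns the state that follows `s` in the fixed A -> B -> C -> A cycle.
Step next_step(Step s) {
    switch (s) {
        case Step::A: return Step::B;
        case Step::B: return Step::C;
        case Step::C: return Step::A;
    }
    return Step::A;  // not reached; keeps compilers from warning
}

// Runs `iterations` steps in a single loop and records which step ran.
std::string run_state_machine(int iterations) {
    assert(iterations >= 0);
    std::string trace;
    Step s = Step::A;
    for (int i = 0; i < iterations; ++i) {
        switch (s) {
            case Step::A: trace += 'A'; break;  // work formerly done by thread A
            case Step::B: trace += 'B'; break;  // work formerly done by thread B
            case Step::C: trace += 'C'; break;  // work formerly done by thread C
        }
        s = next_step(s);
        // On the target, a short delay here would yield the CPU to other tasks.
    }
    return trace;
}
```

No semaphores are needed in this form, since the order is fixed by the loop itself rather than by the scheduler.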

    Thank you for this explanation. The problem also happened with many instructions in each thread; I then tried without them to see whether I had made an error in those instructions. With those instructions the threads aren't so fast, because there was formatting of packets, calculations, searching and storing of data, ... before each Post(). With the instructions the problem appears after half an hour; with the posted code, after 2-3 hours.
  • Altera_Forum

    Possibly you were having problems with stack overflow, which might require an interrupt to happen at the maximum stack depth, so it wouldn't be that common.

    Adding extra code might have increased the stack depth - making the overflow more likely.

    I don't know how much stack things like 'cout' end up using - but it could be considerable.
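
One common way to find out how much stack a task really uses is to fill the stack with a known pattern before the task starts and later count how many words are still untouched. The sketch below only simulates that idea on a plain buffer; kFill, paint_stack, simulate_use and free_words are hypothetical names, and the stack is assumed to grow from the highest index toward index 0. As far as I know, MicroC/OS-II offers a service on a similar principle (OSTaskStkChk, for tasks created with OSTaskCreateExt and the stack-check options), but the code here does not call it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fill pattern; any value unlikely to appear in real data works.
constexpr std::uint32_t kFill = 0xA5A5A5A5u;

// Fills the whole stack buffer with the pattern before the task starts.
void paint_stack(std::vector<std::uint32_t>& stack) {
    for (auto& word : stack) {
        word = kFill;
    }
}

// Simulates a task (plus any interrupt frame) using `depth` words of a
// descending stack: usage starts at the highest index and grows toward 0.
void simulate_use(std::vector<std::uint32_t>& stack, std::size_t depth) {
    assert(depth <= stack.size());
    const std::size_t n = stack.size();
    for (std::size_t i = 0; i < depth; ++i) {
        stack[n - 1 - i] = 0;  // any value other than kFill marks the word as used
    }
}

// Counts the words at the far end (index 0 upward) that still hold the
// pattern, i.e. the part of the stack that was never reached.
std::size_t free_words(const std::vector<std::uint32_t>& stack) {
    std::size_t n = 0;
    while (n < stack.size() && stack[n] == kFill) {
        ++n;
    }
    return n;
}
```

If free_words ever comes back as 0 (or close to it) after a long run, the stack was probably too small. Checking it periodically over a run of an hour or two would show whether usage creeps toward the limit.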
  • Altera_Forum

    --- Quote Start ---

    With those instructions the threads aren't so fast, because there was formatting of packets, calculations, searching and storing of data, ... before each Post(). With the instructions the problem appears after half an hour; with the posted code, after 2-3 hours.

    --- Quote End ---

    The problem I conjectured is independent of how many instructions you have between the OSSemPend and OSSemPost; it's caused instead by the fact that you have a higher-priority task which runs without ever releasing control to the lower-priority ones.

    Since OS-II is a preemptive OS, whenever the high-priority task is scheduled just after the semaphore was signaled by the other task, it takes control completely and runs undisturbed until it reaches the next OSSemPend.

    So inserting more instructions simply delays the moment when the event occurs, as in your case (in other words, you need the same number of cycles but a longer time, because each cycle takes longer).

    About cout, I agree with dsl: if a lot of data is queued for transmission out of the JTAG UART, your system may run short of stack or heap space.
  • Altera_Forum

    --- Quote Start ---

    The problem I conjectured is independent of how many instructions you have between the OSSemPend and OSSemPost; it's caused instead by the fact that you have a higher-priority task which runs without ever releasing control to the lower-priority ones.

    Since OS-II is a preemptive OS, whenever the high-priority task is scheduled just after the semaphore was signaled by the other task, it takes control completely and runs undisturbed until it reaches the next OSSemPend.

    So inserting more instructions simply delays the moment when the event occurs, as in your case (in other words, you need the same number of cycles but a longer time, because each cycle takes longer).

    About cout, I agree with dsl: if a lot of data is queued for transmission out of the JTAG UART, your system may run short of stack or heap space.

    --- Quote End ---

    Thank you, I understood the first part. However, I didn't understand the cout problem, because with fflush() I free the buffer. Is that not relevant?
  • Altera_Forum

    --- Quote Start ---

    Thank you, I understood the first part. However, I didn't understand the cout problem, because with fflush() I free the buffer. Is that not relevant?

    --- Quote End ---

    As I said, I didn't know what the purpose of fflush() was. If it does wait for the cout buffer to empty, it could avoid all the above problems, since this wait would keep the tasks synchronized.

    However, I don't know how cout and its send buffer are managed at the lower levels, so I'd suggest you run your code without the cout/fflush calls (add TK_SLEEP(10) instead) in order to test the task sequencing itself. This way you can determine what really causes the crash.
  • Altera_Forum

    --- Quote Start ---

    Because of task priorities it is possible that both high-priority tasks repost the semaphore between the OSSemPost and the next OSSemPend of the low-priority one. This will happen whenever the scheduler is activated exactly between the two instructions.

    I'm not sure, but I think such a situation could generate anomalies in the normal flow.

    --- Quote End ---

    I'm rereading this old post, and I realize that I didn't understand why there can be anomalies if both high-priority tasks repost the semaphore between the OSSemPost and the next OSSemPend of the low-priority one.

    Thanks
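
One way to picture the situation in the quote, assuming the tasks share a counting semaphore (the thread does not show the actual code, so this is an assumption): a post that arrives while nobody is waiting is not lost, it is added to the count. The small single-threaded model below, with the made-up names CountingSem, post and try_pend, only shows that bookkeeping; it is not the MicroC/OS-II implementation.

```cpp
#include <cassert>

// Single-threaded model of a counting semaphore's bookkeeping
// (hypothetical class, no blocking and no scheduler involved).
class CountingSem {
public:
    explicit CountingSem(unsigned initial) : count_(initial) {}

    // A post with no waiter simply increments the count.
    void post() {
        ++count_;
        assert(count_ != 0);  // a real RTOS would report an overflow error instead
    }

    // Non-blocking pend: succeeds only if a post is "stored" in the count.
    bool try_pend() {
        if (count_ == 0) {
            return false;  // a real pend would block the task here
        }
        --count_;
        return true;
    }

    unsigned count() const { return count_; }

private:
    unsigned count_;
};
```

After two posts in a row the count is 2, so the next two pends succeed without blocking. In that case a task could pass its pend twice before the others have run, which might be the kind of anomaly in the A, B, C order that the quote refers to.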
  • Altera_Forum

    --- Quote Start ---

    I'm rereading this old post, and I realize that I didn't understand why there can be anomalies if both high-priority tasks repost the semaphore between the OSSemPost and the next OSSemPend of the low-priority one.

    Thanks

    --- Quote End ---

    Could it be a scheduler bug?