Hello @all,
I found a fix for the problem with the select() call in a multitasking environment.
How I found the bug:
First I modified the tcp_wakeup() function, which is located in the file "tk_crnos.c". To get proper debug output, I inserted a printf just before the following code line:
//my modification:
dprintf("w %u %d\n", OSTCBCur->OSTCBPrio ,i);
//END my modification
/* we found the TCB with our cookie */
error = OSSemPost(WEP->wake_sem);
This printf reports the priority of the task from which tcp_wakeup() was called, followed by the priority of the task that is about to be woken up.
In my case the following tasks are running in my little test implementation:
"inet_main" Prio 2 --> NicheStack network task
"clock_tick" Prio 3 --> NicheStack network task
"net_task_1" Prio 4 --> My first user task
"net_task_2" Prio 5 --> My second user task
On startup each user task creates its own UDP socket and performs a blocking select() call on readability, like this:
result = select(fd + 1, &readfds, NULL, NULL, NULL);
When select() returns, the data is read from the socket and printed on the nios2-terminal.
However, when I ran my implementation, I saw the following printout:
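For anyone who wants to try the wait-then-read pattern outside the target, here is a minimal host-side sketch. It is an assumption-laden stand-in, not my NicheStack task code: it uses plain POSIX calls, and a local datagram socket pair takes the place of the UDP socket so it runs without any network setup. The names wait_and_read() and demo_select_roundtrip() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Block until fd is readable, then read one datagram into buf.
 * Returns the number of bytes read, or -1 on error. */
long wait_and_read(int fd, char *buf, size_t len)
{
    fd_set readfds;
    int result;

    assert(buf != NULL);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* NULL timeout: sleep until the descriptor becomes readable. */
    result = select(fd + 1, &readfds, NULL, NULL, NULL);
    if (result <= 0 || !FD_ISSET(fd, &readfds))
        return -1;

    return (long)recv(fd, buf, len, 0);
}

/* Hypothetical self-check: send 4 bytes on one end of a datagram
 * socket pair and wait for them on the other end. */
long demo_select_roundtrip(void)
{
    int sv[2];
    char buf[16];
    long n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    if (send(sv[0], "ping", 4, 0) != 4) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = wait_and_read(sv[1], buf, sizeof buf);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Because the datagram is already queued when select() is called, the call returns immediately here. On the target the task would really sleep inside select() until the stack wakes it.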
w 3 4
w 3 5
w 3 4
w 3 5
and so on.
This means that the task with priority 3 (clock_tick) wakes up all tasks which are currently sleeping in a select(). This wakeup is needed to monitor the select timeout, if the user passed one to the select() call.
And here is the debug output when sending data to the socket of the higher-priority user task (the task where everything is fine):
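To make the "w <caller> <woken>" lines easier to read, a small host-side helper can translate them into task names. This is only a convenience sketch: the priority-to-name mapping is copied from my task list above, and the function names task_name() and decode_wake_line() are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map a priority from the task list above to its task name. */
const char *task_name(unsigned prio)
{
    switch (prio) {
    case 2:  return "inet_main";
    case 3:  return "clock_tick";
    case 4:  return "net_task_1";
    case 5:  return "net_task_2";
    default: return "unknown";
    }
}

/* Decode one "w <caller prio> <woken prio>" trace line into out.
 * Returns 0 on success, -1 if the line is not a wakeup trace. */
int decode_wake_line(const char *line, char *out, size_t outlen)
{
    unsigned caller;
    int woken;

    assert(line != NULL && out != NULL);

    /* Cheap prefix check before the real parse. */
    if (strncmp(line, "w ", 2) != 0)
        return -1;
    if (sscanf(line, "w %u %d", &caller, &woken) != 2 || woken < 0)
        return -1;

    snprintf(out, outlen, "%s wakes %s",
             task_name(caller), task_name((unsigned)woken));
    return 0;
}
```

Feeding it the line "w 3 4" produces "clock_tick wakes net_task_1"; lines that are not wakeup traces, such as the "[net_task_1] Received data" message, are rejected.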
w 3 4
w 3 5
w 2 4
[net_task_1] Received data from the socket!
w 3 4
w 3 5
As you can see, the "inet_main" task of the network stack calls tcp_wakeup() when there is data on the socket. The select() then returns to user space, the received data can be handled, and everything is fine!
Let's have a look at the other socket (the one owned by net_task_2):
w 3 4
w 3 5
w 2 4
w 3 4
w 3 4
w 3 4
This simply means that the "inet_main" task wakes up the wrong task, prio 4 instead of prio 5. And as you can see, after that "net_task_2" (prio 5) is never woken up again!
So I decided to take a closer look at the tcp_wakeup() function:
void
tcp_wakeup(void * event)
{
   int i; /* task table index */
   INT8U error;
   /*
    * gain control of the global wakeup mutex
    */
   OSMutexPend(global_wakeup_Mutex, 0, &error);
   if (error != OS_NO_ERR)
   {
      dprintf("*** tcp_wakeup, OSMutexPend = %d\n", error);
      dtrap();
   }
#ifdef TK_CRON_DIAGS
   dprintf("+++ tcp_wakeup = %lx\n", event);
#endif
   /*
    * we are now in mutex
    * -----------------------------------
    */
   /*
    * Loop through task tables, try to find the cookie.
    */
   for (i = 0; i < OS_LOWEST_PRIO; i++)
   {
      struct wake_event *WEP;
      OS_TCB *tcb;
      if ((tcb = (OS_TCB *)OSTCBPrioTbl[i]) == (OS_TCB *)NULL)
         continue; /* unassigned priority */
      /* use extension */
      WEP = tcb->OSTCBExtPtr;
      if (WEP->soc_event == event)
      {
#ifdef TK_CRON_DIAGS
         dprintf("+++ tcp_wakeup OSSemPost = %lx\n", event);
#endif
         //TBD
         dprintf("w %u %d\n", OSTCBCur->OSTCBPrio ,i);
         //END TBD
         /* we found the TCB with our cookie */
         error = OSSemPost(WEP->wake_sem);
         if (error != OS_NO_ERR)
         {
            dprintf("*** tcp_wakeup, OSSemPost = %d, %p\n", error, WEP->wake_sem);
            dtrap();
         }
         /* clear the cookie */
         WEP->soc_event = NULL;
         /*
          * give up mutex
          */
         error = OSMutexPost(global_wakeup_Mutex);
         if (error != OS_NO_ERR)
         {
            dprintf("*** tcp_wakeup, OSMutexPost = %d\n", error);
            dtrap();
         }
         return; /* we woke it up ! */
      }
   } /* for() */
   /*
    * we didn't find the cookie in the wake set.
    * Q it up.
    */
   insertWakeSetEntry(event);
   /*
    * give up mutex
    */
   error = OSMutexPost(global_wakeup_Mutex);
   if (error != OS_NO_ERR)
   {
      dprintf("*** tcp_sleep, OSMutexPost = %d\n", error);
      dtrap();
   }
   /*
    * we are now out of the mutex
    * -----------------------------------
    */
   return;
}
I noticed that tcp_wakeup() simply loops through all tasks (from high priority to low priority) and searches for a cookie which indicates that the task is currently waiting in a select(). As soon as this cookie is found, the corresponding task is woken up and tcp_wakeup() returns.
So when two tasks are pending on a select(), only the one with the higher priority is woken (even if the data received on the socket isn't for this task).
I have modified tcp_wakeup() so that the search loop doesn't break at the first task. In other words: now every task pending on a select() is woken up when there is data on a socket! Note that this wakeup doesn't cause a wrong return from select() into user space, it's just an "inner-select wakeup".
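The effect can be reproduced on a host with a tiny simulation. This is only a sketch under assumptions: the task table is a plain array, the semaphore post is replaced by a counter, and both tasks are assumed to sleep on the same event cookie, which is what the trace output suggests. struct sim_task, wake_first_match() and demo_first_match() are hypothetical names, not NicheStack or uC/OS-II code.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIOS 8   /* hypothetical table size for the simulation */

struct sim_task {
    int   in_use;     /* nonzero if a task exists at this priority */
    void *soc_event;  /* cookie the task sleeps on, NULL if awake */
    int   wakeups;    /* how often the simulated semaphore was posted */
};

/* First-match strategy, mirroring the loop in the original function:
 * wake the first sleeper with a matching cookie, then stop.
 * Returns the priority that was woken, or -1 if nobody matched. */
int wake_first_match(struct sim_task *tbl, int n, void *event)
{
    int i;

    assert(tbl != NULL);

    for (i = 0; i < n; i++) {
        if (!tbl[i].in_use || tbl[i].soc_event != event)
            continue;
        tbl[i].wakeups++;           /* stands in for OSSemPost() */
        tbl[i].soc_event = NULL;    /* clear the cookie */
        return i;                   /* stop after the first hit */
    }
    return -1;
}

/* Tasks at prio 4 and 5 sleep on the same cookie; one wakeup arrives
 * that is really meant for prio 5. Returns the priority woken. */
int demo_first_match(struct sim_task *tbl)
{
    static int select_cookie;       /* stands in for the shared event */
    int i;

    for (i = 0; i < NUM_PRIOS; i++) {
        tbl[i].in_use = 0;
        tbl[i].soc_event = NULL;
        tbl[i].wakeups = 0;
    }
    tbl[4].in_use = tbl[5].in_use = 1;
    tbl[4].soc_event = tbl[5].soc_event = &select_cookie;

    return wake_first_match(tbl, NUM_PRIOS, &select_cookie);
}
```

After the call, the prio 4 entry has been woken once while the prio 5 entry still holds its cookie and has never been posted, which matches the "w 2 4" line in my trace.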
Now everything works fine....
Here is the modified code:
/*
* tcp_wakeup(void * event) - wakeup TCB with this event,
* else put in wake set.
*/
void
tcp_wakeup(void * event)
{
   //Declaration:
   int i; //task table index
   INT8U error; //Error-flag (needed for semaphore access)
   int cnt = 0; //Counter for woken-up tasks
   //Gain control of the global wakeup mutex
   OSMutexPend(global_wakeup_Mutex, 0, &error);
   if (error != OS_NO_ERR)
   {
      dprintf("*** tcp_wakeup, OSMutexPend = %d\n", error);
      dtrap();
   }
   //We are now in mutex
#ifdef TK_CRON_DIAGS
   dprintf("+++ tcp_wakeup = %lx\n", event);
#endif
   //Loop through task tables, try to find the cookie.
   for (i = 0; i < OS_LOWEST_PRIO; i++){
      struct wake_event *WEP;
      OS_TCB *tcb;
      if ((tcb = (OS_TCB *)OSTCBPrioTbl[i]) == (OS_TCB *)NULL)
         continue; //unassigned priority
      //use extension
      WEP = tcb->OSTCBExtPtr;
      if (WEP->soc_event == event)
      {
#ifdef TK_CRON_DIAGS
         dprintf("+++ tcp_wakeup OSSemPost = %lx\n", event);
#endif
         //We found the TCB with our cookie
         error = OSSemPost(WEP->wake_sem);
         if (error != OS_NO_ERR)
         {
            dprintf("*** tcp_wakeup, OSSemPost = %d, %p\n", error, WEP->wake_sem);
            dtrap();
         }
         //Clear the cookie:
         WEP->soc_event = NULL;
         //Count this wakeup:
         cnt++;
      }
   }
   //Check for woken tasks:
   if (cnt != 0) {
      //Tasks have been woken, so give up mutex...
      error = OSMutexPost(global_wakeup_Mutex);
      if (error != OS_NO_ERR)
      {
         dprintf("*** tcp_wakeup, OSMutexPost = %d\n", error);
         dtrap();
      }
      //...and get out of here:
      return;
   }
   //We didn't find the cookie in the wake set.
   insertWakeSetEntry(event);
   //Give up mutex
   error = OSMutexPost(global_wakeup_Mutex);
   if (error != OS_NO_ERR)
   {
      dprintf("*** tcp_sleep, OSMutexPost = %d\n", error);
      dtrap();
   }
   //We are now out of the mutex, so leave:
   return;
}
As you can see, the search loop no longer exits at the first task found pending on a select(). The variable "cnt" is only used to check whether any task was woken up during the search loop. If not, the event is queued.
I don't know whether this queuing is important, but the original code does the same.
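The control flow of the modified function can probably be condensed to a single exit path: wake every matching sleeper, queue the event only if cnt stayed at zero, and release the mutex once at the end. Below is a host-side sketch of that idea. It is not tested on the target; the task table is a plain array, the semaphore and the wake set are replaced by a counter and a flag, and struct sim_task and wake_all_matches() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIOS 8   /* hypothetical table size for the simulation */

struct sim_task {
    int   in_use;     /* nonzero if a task exists at this priority */
    void *soc_event;  /* cookie the task sleeps on, NULL if awake */
    int   wakeups;    /* how often the simulated semaphore was posted */
};

/* Wake-all strategy: post every sleeper whose cookie matches.
 * *queued is set to 1 if nobody matched and the event would have to be
 * queued (stand-in for insertWakeSetEntry()), otherwise to 0.
 * Returns the number of tasks woken. */
int wake_all_matches(struct sim_task *tbl, int n, void *event, int *queued)
{
    int i;
    int cnt = 0;

    assert(tbl != NULL && queued != NULL);

    /* (the wakeup mutex would be taken here) */
    for (i = 0; i < n; i++) {
        if (!tbl[i].in_use || tbl[i].soc_event != event)
            continue;
        tbl[i].wakeups++;           /* stands in for OSSemPost() */
        tbl[i].soc_event = NULL;    /* clear the cookie */
        cnt++;                      /* keep going, do not return */
    }

    /* Queue the event only when no task was sleeping on it. */
    *queued = (cnt == 0);

    /* (the wakeup mutex would be released here, exactly once) */
    return cnt;
}
```

With two sleepers on the same cookie, the first call wakes both and queues nothing; a second call with the same cookie finds no sleeper and reports that the event would be queued.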
ANY FEEDBACK OUT THERE ?