mirror of
https://git.rtems.org/rtems-docs/
synced 2025-07-02 07:20:16 +08:00
c-user: Add SMP application issues section
This commit is contained in:
parent
7b1c63cf91
commit
b033e3960b
@ -25,7 +25,7 @@ Glossary
|
|||||||
manager are used to service signals.
|
manager are used to service signals.
|
||||||
|
|
||||||
:dfn:`atomic operations`
|
:dfn:`atomic operations`
|
||||||
Atomic operations are defined in terms of *ISO/IEC 9899:2011*.
|
Atomic operations are defined in terms of :ref:`C11 <C11>`.
|
||||||
|
|
||||||
:dfn:`awakened`
|
:dfn:`awakened`
|
||||||
A term used to describe a task that has been unblocked and may be scheduled
|
A term used to describe a task that has been unblocked and may be scheduled
|
||||||
@ -61,6 +61,16 @@ Glossary
|
|||||||
:dfn:`buffer`
|
:dfn:`buffer`
|
||||||
A fixed length block of memory allocated from a partition.
|
A fixed length block of memory allocated from a partition.
|
||||||
|
|
||||||
|
.. _C11:
|
||||||
|
|
||||||
|
:dfn:`C11`
|
||||||
|
The standard ISO/IEC 9899:2011.
|
||||||
|
|
||||||
|
.. _C++11:
|
||||||
|
|
||||||
|
:dfn:`C++11`
|
||||||
|
The standard ISO/IEC 14882:2011.
|
||||||
|
|
||||||
:dfn:`calling convention`
|
:dfn:`calling convention`
|
||||||
The processor and compiler dependent rules which define the mechanism used
|
The processor and compiler dependent rules which define the mechanism used
|
||||||
to invoke subroutines in a high-level language. These rules define the
|
to invoke subroutines in a high-level language. These rules define the
|
||||||
@ -701,6 +711,13 @@ Glossary
|
|||||||
:dfn:`timeslice`
|
:dfn:`timeslice`
|
||||||
The application defined unit of time in which the processor is allocated.
|
The application defined unit of time in which the processor is allocated.
|
||||||
|
|
||||||
|
.. _TLS:
|
||||||
|
|
||||||
|
:dfn:`TLS`
|
||||||
|
An acronym for Thread-Local Storage :cite:`Drepper:2013:TLS`. TLS is
|
||||||
|
available in :ref:`C11 <C11>` and :ref:`C++11 <C++11>`. The support for
|
||||||
|
TLS depends on the CPU port :cite:`RTEMS:CPU`.
|
||||||
|
|
||||||
:dfn:`TMCB`
|
:dfn:`TMCB`
|
||||||
An acronym for Timer Control Block.
|
An acronym for Timer Control Block.
|
||||||
|
|
||||||
|
@ -271,156 +271,6 @@ tree of the task needing help and other resource trees in case tasks in need
|
|||||||
for help are produced during this operation. Thus the worst-case latency in
|
for help are produced during this operation. Thus the worst-case latency in
|
||||||
the system depends on the maximum resource tree size of the application.
|
the system depends on the maximum resource tree size of the application.
|
||||||
|
|
||||||
Critical Section Techniques and SMP
|
|
||||||
-----------------------------------
|
|
||||||
|
|
||||||
As discussed earlier, SMP systems have opportunities for true parallelism which
|
|
||||||
was not possible on uniprocessor systems. Consequently, multiple techniques
|
|
||||||
that provided adequate critical sections on uniprocessor systems are unsafe on
|
|
||||||
SMP systems. In this section, some of these unsafe techniques will be
|
|
||||||
discussed.
|
|
||||||
|
|
||||||
In general, applications must use proper operating system provided mutual
|
|
||||||
exclusion mechanisms to ensure correct behavior. This primarily means the use
|
|
||||||
of binary semaphores or mutexes to implement critical sections.
|
|
||||||
|
|
||||||
Disable Interrupts and Interrupt Locks
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
A low overhead means to ensure mutual exclusion in uni-processor configurations
|
|
||||||
is to disable interrupts around a critical section. This is commonly used in
|
|
||||||
device driver code and throughout the operating system core. In SMP
|
|
||||||
configurations, however, disabling the interrupts on one processor has no
|
|
||||||
effect on other processors. So, this is insufficient to ensure system wide
|
|
||||||
mutual exclusion. The macros
|
|
||||||
|
|
||||||
- ``rtems_interrupt_disable()``,
|
|
||||||
|
|
||||||
- ``rtems_interrupt_enable()``, and
|
|
||||||
|
|
||||||
- ``rtems_interrupt_flush()``
|
|
||||||
|
|
||||||
are disabled in SMP configurations and its use will lead to compiler warnings
|
|
||||||
and linker errors. In the unlikely case that interrupts must be disabled on
|
|
||||||
the current processor, then the
|
|
||||||
|
|
||||||
- ``rtems_interrupt_local_disable()``, and
|
|
||||||
|
|
||||||
- ``rtems_interrupt_local_enable()``
|
|
||||||
|
|
||||||
macros are now available in all configurations.
|
|
||||||
|
|
||||||
Since disabling of interrupts is not enough to ensure system wide mutual
|
|
||||||
exclusion on SMP, a new low-level synchronization primitive was added - the
|
|
||||||
interrupt locks. They are a simple API layer on top of the SMP locks used for
|
|
||||||
low-level synchronization in the operating system core. Currently they are
|
|
||||||
implemented as a ticket lock. On uni-processor configurations they degenerate
|
|
||||||
to simple interrupt disable/enable sequences. It is disallowed to acquire a
|
|
||||||
single interrupt lock in a nested way. This will result in an infinite loop
|
|
||||||
with interrupts disabled. While converting legacy code to interrupt locks care
|
|
||||||
must be taken to avoid this situation.
|
|
||||||
|
|
||||||
.. code-block:: c
|
|
||||||
:linenos:
|
|
||||||
|
|
||||||
void legacy_code_with_interrupt_disable_enable( void )
|
|
||||||
{
|
|
||||||
rtems_interrupt_level level;
|
|
||||||
rtems_interrupt_disable( level );
|
|
||||||
/* Some critical stuff */
|
|
||||||
rtems_interrupt_enable( level );
|
|
||||||
}
|
|
||||||
|
|
||||||
RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" );
|
|
||||||
|
|
||||||
void smp_ready_code_with_interrupt_lock( void )
|
|
||||||
{
|
|
||||||
rtems_interrupt_lock_context lock_context;
|
|
||||||
rtems_interrupt_lock_acquire( &lock, &lock_context );
|
|
||||||
/* Some critical stuff */
|
|
||||||
rtems_interrupt_lock_release( &lock, &lock_context );
|
|
||||||
}
|
|
||||||
|
|
||||||
The ``rtems_interrupt_lock`` structure is empty on uni-processor
|
|
||||||
configurations. Empty structures have a different size in C
|
|
||||||
(implementation-defined, zero in case of GCC) and C++ (implementation-defined
|
|
||||||
non-zero value, one in case of GCC). Thus the
|
|
||||||
``RTEMS_INTERRUPT_LOCK_DECLARE()``, ``RTEMS_INTERRUPT_LOCK_DEFINE()``,
|
|
||||||
``RTEMS_INTERRUPT_LOCK_MEMBER()``, and ``RTEMS_INTERRUPT_LOCK_REFERENCE()``
|
|
||||||
macros are provided to ensure ABI compatibility.
|
|
||||||
|
|
||||||
Highest Priority Task Assumption
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
On a uniprocessor system, it is safe to assume that when the highest priority
|
|
||||||
task in an application executes, it will execute without being preempted until
|
|
||||||
it voluntarily blocks. Interrupts may occur while it is executing, but there
|
|
||||||
will be no context switch to another task unless the highest priority task
|
|
||||||
voluntarily initiates it.
|
|
||||||
|
|
||||||
Given the assumption that no other tasks will have their execution interleaved
|
|
||||||
with the highest priority task, it is possible for this task to be constructed
|
|
||||||
such that it does not need to acquire a binary semaphore or mutex for protected
|
|
||||||
access to shared data.
|
|
||||||
|
|
||||||
In an SMP system, it cannot be assumed there will never be a single task
|
|
||||||
executing. It should be assumed that every processor is executing another
|
|
||||||
application task. Further, those tasks will be ones which would not have been
|
|
||||||
executed in a uniprocessor configuration and should be assumed to have data
|
|
||||||
synchronization conflicts with what was formerly the highest priority task
|
|
||||||
which executed without conflict.
|
|
||||||
|
|
||||||
Disable Preemption
|
|
||||||
~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
On a uniprocessor system, disabling preemption in a task is very similar to
|
|
||||||
making the highest priority task assumption. While preemption is disabled, no
|
|
||||||
task context switches will occur unless the task initiates them
|
|
||||||
voluntarily. And, just as with the highest priority task assumption, there are
|
|
||||||
N-1 processors also running tasks. Thus the assumption that no other tasks will
|
|
||||||
run while the task has preemption disabled is violated.
|
|
||||||
|
|
||||||
Task Unique Data and SMP
|
|
||||||
------------------------
|
|
||||||
|
|
||||||
Per task variables are a service commonly provided by real-time operating
|
|
||||||
systems for application use. They work by allowing the application to specify a
|
|
||||||
location in memory (typically a ``void *``) which is logically added to the
|
|
||||||
context of a task. On each task switch, the location in memory is stored and
|
|
||||||
each task can have a unique value in the same memory location. This memory
|
|
||||||
location is directly accessed as a variable in a program.
|
|
||||||
|
|
||||||
This works well in a uniprocessor environment because there is one task
|
|
||||||
executing and one memory location containing a task-specific value. But it is
|
|
||||||
fundamentally broken on an SMP system because there are always N tasks
|
|
||||||
executing. With only one location in memory, N-1 tasks will not have the
|
|
||||||
correct value.
|
|
||||||
|
|
||||||
This paradigm for providing task unique data values is fundamentally broken on
|
|
||||||
SMP systems.
|
|
||||||
|
|
||||||
Classic API Per Task Variables
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
The Classic API provides three directives to support per task variables. These are:
|
|
||||||
|
|
||||||
- ``rtems_task_variable_add`` - Associate per task variable
|
|
||||||
|
|
||||||
- ``rtems_task_variable_get`` - Obtain value of a a per task variable
|
|
||||||
|
|
||||||
- ``rtems_task_variable_delete`` - Remove per task variable
|
|
||||||
|
|
||||||
As task variables are unsafe for use on SMP systems, the use of these services
|
|
||||||
must be eliminated in all software that is to be used in an SMP environment.
|
|
||||||
The task variables API is disabled on SMP. Its use will lead to compile-time
|
|
||||||
and link-time errors. It is recommended that the application developer consider
|
|
||||||
the use of POSIX Keys or Thread Local Storage (TLS). POSIX Keys are available
|
|
||||||
in all RTEMS configurations. For the availablity of TLS on a particular
|
|
||||||
architecture please consult the *RTEMS CPU Architecture Supplement*.
|
|
||||||
|
|
||||||
The only remaining user of task variables in the RTEMS code base is the Ada
|
|
||||||
support. So basically Ada is not available on RTEMS SMP.
|
|
||||||
|
|
||||||
OpenMP
|
OpenMP
|
||||||
------
|
------
|
||||||
|
|
||||||
@ -521,6 +371,197 @@ the heir thread must be used during interrupt processing. For this purpose a
|
|||||||
temporary per-processor stack is set up which may be used by the interrupt
|
temporary per-processor stack is set up which may be used by the interrupt
|
||||||
prologue before the stack is switched to the interrupt stack.
|
prologue before the stack is switched to the interrupt stack.
|
||||||
|
|
||||||
|
Application Issues
|
||||||
|
==================
|
||||||
|
|
||||||
|
Most operating system services provided by the uni-processor RTEMS are
|
||||||
|
available in SMP configurations as well. However, applications designed for an
|
||||||
|
uni-processor environment may need some changes to correctly run in an SMP
|
||||||
|
configuration.
|
||||||
|
|
||||||
|
As discussed earlier, SMP systems have opportunities for true parallelism which
|
||||||
|
was not possible on uni-processor systems. Consequently, multiple techniques
|
||||||
|
that provided adequate critical sections on uni-processor systems are unsafe on
|
||||||
|
SMP systems. In this section, some of these unsafe techniques will be
|
||||||
|
discussed.
|
||||||
|
|
||||||
|
In general, applications must use proper operating system provided mutual
|
||||||
|
exclusion mechanisms to ensure correct behavior.
|
||||||
|
|
||||||
|
Task variables
|
||||||
|
--------------
|
||||||
|
|
||||||
|
Task variables are ordinary global variables with a dedicated value for each
|
||||||
|
thread. During a context switch from the executing thread to the heir thread,
|
||||||
|
the value of each task variable is saved to the thread control block of the
|
||||||
|
executing thread and restored from the thread control block of the heir thread.
|
||||||
|
This is inherently broken if more than one executing thread exists.
|
||||||
|
Alternatives to task variables are POSIX keys and :ref:`TLS <TLS>`. All use
|
||||||
|
cases of task variables in the RTEMS code base were replaced with alternatives.
|
||||||
|
The task variable API has been removed in RTEMS 4.12.
|
||||||
|
|
||||||
|
Highest Priority Thread Never Walks Alone
|
||||||
|
-----------------------------------------
|
||||||
|
|
||||||
|
On a uni-processor system, it is safe to assume that when the highest priority
|
||||||
|
task in an application executes, it will execute without being preempted until
|
||||||
|
it voluntarily blocks. Interrupts may occur while it is executing, but there
|
||||||
|
will be no context switch to another task unless the highest priority task
|
||||||
|
voluntarily initiates it.
|
||||||
|
|
||||||
|
Given the assumption that no other tasks will have their execution interleaved
|
||||||
|
with the highest priority task, it is possible for this task to be constructed
|
||||||
|
such that it does not need to acquire a mutex for protected access to shared
|
||||||
|
data.
|
||||||
|
|
||||||
|
In an SMP system, it cannot be assumed there will never be a single task
|
||||||
|
executing. It should be assumed that every processor is executing another
|
||||||
|
application task. Further, those tasks will be ones which would not have been
|
||||||
|
executed in a uni-processor configuration and should be assumed to have data
|
||||||
|
synchronization conflicts with what was formerly the highest priority task
|
||||||
|
which executed without conflict.
|
||||||
|
|
||||||
|
Disabling of Thread Pre-Emption
|
||||||
|
-------------------------------
|
||||||
|
|
||||||
|
A thread which disables pre-emption prevents that a higher priority thread gets
|
||||||
|
hold of its processor involuntarily. In uni-processor configurations, this can
|
||||||
|
be used to ensure mutual exclusion at thread level. In SMP configurations,
|
||||||
|
however, more than one executing thread may exist. Thus, it is impossible to
|
||||||
|
ensure mutual exclusion using this mechanism. In order to prevent that
|
||||||
|
applications using pre-emption for this purpose, would show inappropriate
|
||||||
|
behaviour, this feature is disabled in SMP configurations and its use would
|
||||||
|
case run-time errors.
|
||||||
|
|
||||||
|
Disabling of Interrupts
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
A low overhead means that ensures mutual exclusion in uni-processor
|
||||||
|
configurations is the disabling of interrupts around a critical section. This
|
||||||
|
is commonly used in device driver code. In SMP configurations, however,
|
||||||
|
disabling the interrupts on one processor has no effect on other processors.
|
||||||
|
So, this is insufficient to ensure system-wide mutual exclusion. The macros
|
||||||
|
|
||||||
|
* :ref:`rtems_interrupt_disable() <rtems_interrupt_disable>`,
|
||||||
|
|
||||||
|
* :ref:`rtems_interrupt_enable() <rtems_interrupt_enable>`, and
|
||||||
|
|
||||||
|
* :ref:`rtems_interrupt_flash() <rtems_interrupt_flash>`.
|
||||||
|
|
||||||
|
are disabled in SMP configurations and its use will cause compile-time warnings
|
||||||
|
and link-time errors. In the unlikely case that interrupts must be disabled on
|
||||||
|
the current processor, the
|
||||||
|
|
||||||
|
* :ref:`rtems_interrupt_local_disable() <rtems_interrupt_local_disable>`, and
|
||||||
|
|
||||||
|
* :ref:`rtems_interrupt_local_enable() <rtems_interrupt_local_enable>`.
|
||||||
|
|
||||||
|
macros are now available in all configurations.
|
||||||
|
|
||||||
|
Since disabling of interrupts is insufficient to ensure system-wide mutual
|
||||||
|
exclusion on SMP a new low-level synchronization primitive was added --
|
||||||
|
interrupt locks. The interrupt locks are a simple API layer on top of the SMP
|
||||||
|
locks used for low-level synchronization in the operating system core.
|
||||||
|
Currently, they are implemented as a ticket lock. In uni-processor
|
||||||
|
configurations, they degenerate to simple interrupt disable/enable sequences by
|
||||||
|
means of the C pre-processor. It is disallowed to acquire a single interrupt
|
||||||
|
lock in a nested way. This will result in an infinite loop with interrupts
|
||||||
|
disabled. While converting legacy code to interrupt locks, care must be taken
|
||||||
|
to avoid this situation to happen.
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
:linenos:
|
||||||
|
|
||||||
|
#include <rtems.h>
|
||||||
|
|
||||||
|
void legacy_code_with_interrupt_disable_enable( void )
|
||||||
|
{
|
||||||
|
rtems_interrupt_level level;
|
||||||
|
|
||||||
|
rtems_interrupt_disable( level );
|
||||||
|
/* Critical section */
|
||||||
|
rtems_interrupt_enable( level );
|
||||||
|
}
|
||||||
|
|
||||||
|
RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )
|
||||||
|
|
||||||
|
void smp_ready_code_with_interrupt_lock( void )
|
||||||
|
{
|
||||||
|
rtems_interrupt_lock_context lock_context;
|
||||||
|
|
||||||
|
rtems_interrupt_lock_acquire( &lock, &lock_context );
|
||||||
|
/* Critical section */
|
||||||
|
rtems_interrupt_lock_release( &lock, &lock_context );
|
||||||
|
}
|
||||||
|
|
||||||
|
An alternative to the RTEMS-specific interrupt locks are POSIX spinlocks. The
|
||||||
|
:c:type:`pthread_spinlock_t` is defined as a self-contained object, e.g. the
|
||||||
|
user must provide the storage for this synchronization object.
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
:linenos:
|
||||||
|
|
||||||
|
#include <assert.h>
|
||||||
|
#include <pthread.h>
|
||||||
|
|
||||||
|
pthread_spinlock_t lock;
|
||||||
|
|
||||||
|
void smp_ready_code_with_posix_spinlock( void )
|
||||||
|
{
|
||||||
|
int error;
|
||||||
|
|
||||||
|
error = pthread_spin_lock( &lock );
|
||||||
|
assert( error == 0 );
|
||||||
|
/* Critical section */
|
||||||
|
error = pthread_spin_unlock( &lock );
|
||||||
|
assert( error == 0 );
|
||||||
|
}
|
||||||
|
|
||||||
|
In contrast to POSIX spinlock implementation on Linux or FreeBSD, it is not
|
||||||
|
allowed to call blocking operating system services inside the critical section.
|
||||||
|
A recursive lock attempt is a severe usage error resulting in an infinite loop
|
||||||
|
with interrupts disabled. Nesting of different locks is allowed. The user
|
||||||
|
must ensure that no deadlock can occur. As a non-portable feature the locks
|
||||||
|
are zero-initialized, e.g. statically initialized global locks reside in the
|
||||||
|
``.bss`` section and there is no need to call :c:func:`pthread_spin_init`.
|
||||||
|
|
||||||
|
Interrupt Service Routines Execute in Parallel With Threads
|
||||||
|
-----------------------------------------------------------
|
||||||
|
|
||||||
|
On a machine with more than one processor, interrupt service routines (this
|
||||||
|
includes timer service routines installed via :ref:`rtems_timer_fire_after()
|
||||||
|
<rtems_timer_fire_after>`) and threads can execute in parallel. Interrupt
|
||||||
|
service routines must take this into account and use proper locking mechanisms
|
||||||
|
to protect critical sections from interference by threads (interrupt locks or
|
||||||
|
POSIX spinlocks). This likely requires code modifications in legacy device
|
||||||
|
drivers.
|
||||||
|
|
||||||
|
Timers Do Not Stop Immediately
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
Timer service routines run in the context of the clock interrupt. On
|
||||||
|
uni-processor configurations, it is sufficient to disable interrupts and remove
|
||||||
|
a timer from the set of active timers to stop it. In SMP configurations,
|
||||||
|
however, the timer service routine may already run and wait on an SMP lock
|
||||||
|
owned by the thread which is about to stop the timer. This opens the door to
|
||||||
|
subtle synchronization issues. During destruction of objects, special care
|
||||||
|
must be taken to ensure that timer service routines cannot access (partly or
|
||||||
|
fully) destroyed objects.
|
||||||
|
|
||||||
|
False Sharing of Cache Lines Due to Objects Table
|
||||||
|
-------------------------------------------------
|
||||||
|
|
||||||
|
The Classic API and most POSIX API objects are indirectly accessed via an
|
||||||
|
object identifier. The user-level functions validate the object identifier and
|
||||||
|
map it to the actual object structure which resides in a global objects table
|
||||||
|
for each object class. So, unrelated objects are packed together in a table.
|
||||||
|
This may result in false sharing of cache lines. The effect of false sharing
|
||||||
|
of cache lines can be observed with the `TMFINE 1
|
||||||
|
<https://git.rtems.org/rtems/tree/testsuites/tmtests/tmfine01>`_ test program
|
||||||
|
on a suitable platform, e.g. QorIQ T4240. High-performance SMP applications
|
||||||
|
need full control of the object storage :cite:`Drepper:2007:Memory`.
|
||||||
|
Therefore, self-contained synchronization objects are now available for RTEMS.
|
||||||
|
|
||||||
Directives
|
Directives
|
||||||
==========
|
==========
|
||||||
|
|
||||||
|
@ -284,3 +284,7 @@
|
|||||||
year = {2016},
|
year = {2016},
|
||||||
url = {https://hal.archives-ouvertes.fr/hal-01295194/document},
|
url = {https://hal.archives-ouvertes.fr/hal-01295194/document},
|
||||||
}
|
}
|
||||||
|
@misc{RTEMS:CPU,
|
||||||
|
title = {{RTEMS CPU Architecture Supplement}},
|
||||||
|
url = {https://docs.rtems.org/branches/master/cpu-supplement.pdf},
|
||||||
|
}
|
||||||
|
Loading…
x
Reference in New Issue
Block a user