This commit is contained in:
Chris Johns 2016-02-04 10:19:13 +13:00 committed by Amar Takhar
parent 8ef6ea80ef
commit c9aaf3145f
6 changed files with 1726 additions and 1683 deletions

File diff suppressed because it is too large


@ -15,9 +15,9 @@ user-supplied device driver. In a multiprocessor configuration, this manager
also initializes the interprocessor communications layer. The directives
provided by the Initialization Manager are:

- rtems_initialize_executive_ - Initialize RTEMS

- rtems_shutdown_executive_ - Shutdown RTEMS

Background
==========
@ -104,7 +104,7 @@ The ``rtems_fatal_error_occurred`` directive will be invoked from
successfully.
A discussion of RTEMS actions when a fatal error occurs may be found in
:ref:`Announcing a Fatal Error`.

Operations
==========
@ -129,7 +129,7 @@ consists of
The ``rtems_initialize_executive`` directive uses a system initialization
linker set to initialize only those parts of the overall RTEMS feature set that
are necessary for a particular application. See :ref:`Linker Sets`. Each RTEMS
feature used by the application may optionally register an initialization
handler. The system initialization API is available via
``#include <rtems/sysinit.h>``.
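
Where a feature module needs to run code during this sequence, it can register
a handler with the system initialization linker set. The following is a
minimal sketch; the ``RTEMS_SYSINIT_ITEM()`` macro comes from
``<rtems/sysinit.h>``, while the particular step and order constants used here
are assumptions that should be checked against that header.

.. code:: c

    #include <rtems/sysinit.h>

    static void my_early_init( void )
    {
      /* Runs once during rtems_initialize_executive(), before multitasking. */
    }

    /* Register the handler; the step and order constants are assumptions. */
    RTEMS_SYSINIT_ITEM(
      my_early_init,
      RTEMS_SYSINIT_DEVICE_DRIVERS,
      RTEMS_SYSINIT_ORDER_MIDDLE
    );
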
@ -184,7 +184,7 @@ initialization stack may be re-used for interrupt processing.
Many of RTEMS actions during initialization are based upon the contents of the
Configuration Table. For more information regarding the format and contents of
this table, please refer to the chapter :ref:`Configuring a System`.

The final action in the initialization sequence is the initiation of
multitasking. When the scheduler and dispatcher are enabled, the highest
@ -205,6 +205,8 @@ This section details the Initialization Manager's directives. A subsection is
dedicated to each of this manager's directives and describes the calling
sequence, related constants, usage, and status codes.
.. _rtems_initialize_executive:
INITIALIZE_EXECUTIVE - Initialize RTEMS
---------------------------------------
.. index:: initialize RTEMS
@ -234,6 +236,8 @@ This directive should be called by ``boot_card`` only.
This directive *does not return* to the caller. Errors in the initialization
sequence are usually fatal and lead to a system termination.
.. _rtems_shutdown_executive:
SHUTDOWN_EXECUTIVE - Shutdown RTEMS
-----------------------------------
.. index:: shutdown RTEMS


@ -243,7 +243,9 @@ The development of responsive real-time applications requires an understanding
of how RTEMS maintains and supports time-related operations. The basic unit of
time in RTEMS is known as a tick. The frequency of clock ticks is completely
application dependent and determines the granularity and accuracy of all
interval and calendar time operations.

.. index:: rtems_interval

By tracking time in units of ticks, RTEMS is capable of supporting interval
timing functions such as task delays, timeouts, timeslicing, the delayed


@ -1,3 +1,7 @@
.. COMMENT: COPYRIGHT (c) 1988-2008.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.
Multiprocessing Manager
#######################
@ -6,255 +10,233 @@ Multiprocessing Manager
Introduction
============
In multiprocessor real-time systems, new requirements, such as sharing data and
global resources between processors, are introduced. This requires an
efficient and reliable communications vehicle which allows all processors to
communicate with each other as necessary. In addition, the ramifications of
multiple processors affect each and every characteristic of a real-time system,
almost always making them more complicated.

RTEMS addresses these issues by providing simple and flexible real-time
multiprocessing capabilities. The executive easily lends itself to both
tightly-coupled and loosely-coupled configurations of the target system
hardware. In addition, RTEMS supports systems composed of both homogeneous and
heterogeneous mixtures of processors and target boards.

A major design goal of the RTEMS executive was to transcend the physical
boundaries of the target hardware configuration. This goal is achieved by
presenting the application software with a logical view of the target system
where the boundaries between processor nodes are transparent. As a result, the
application developer may designate objects such as tasks, queues, events,
signals, semaphores, and memory blocks as global objects. These global objects
may then be accessed by any task regardless of the physical location of the
object and the accessing task. RTEMS automatically determines that the object
being accessed resides on another processor and performs the actions required
to access the desired object. Simply stated, RTEMS allows the entire system,
both hardware and software, to be viewed logically as a single system.

The directives provided by the Multiprocessing Manager are:

- rtems_multiprocessing_announce_ - A multiprocessing communications packet has
  arrived

Background
==========
.. index:: multiprocessing topologies
RTEMS makes no assumptions regarding the connection media or topology of a
multiprocessor system. The tasks which compose a particular application can be
spread among as many processors as needed to satisfy the application's timing
requirements. The application tasks can interact using a subset of the RTEMS
directives as if they were on the same processor. These directives allow
application tasks to exchange data, communicate, and synchronize regardless of
which processor they reside upon.

The RTEMS multiprocessor execution model is multiple instruction streams with
multiple data streams (MIMD). This execution model has each of the processors
executing code independent of the other processors. Because of this
parallelism, the application designer can more easily guarantee deterministic
behavior.

By supporting heterogeneous environments, RTEMS allows the systems designer to
select the most efficient processor for each subsystem of the application.
Configuring RTEMS for a heterogeneous environment is no more difficult than for
a homogeneous one. In keeping with the RTEMS philosophy of providing
transparent physical node boundaries, the minimal heterogeneous processing
required is isolated in the MPCI layer.

Nodes
-----
.. index:: nodes, definition
A processor in a RTEMS system is referred to as a node. Each node is assigned
a unique non-zero node number by the application designer. RTEMS assumes that
node numbers are assigned consecutively from one to the ``maximum_nodes``
configuration parameter. The node number, ``node``, and the maximum number of
nodes, ``maximum_nodes``, in a system are found in the Multiprocessor
Configuration Table. The ``maximum_nodes`` field and the number of global
objects, ``maximum_global_objects``, are required to be the same on all nodes
in a system.

The node number is used by RTEMS to identify each node when performing remote
operations. Thus, the Multiprocessor Communications Interface Layer (MPCI)
must be able to route messages based on the node number.

Global Objects
--------------
.. index:: global objects, definition
All RTEMS objects which are created with the GLOBAL attribute will be known on
all other nodes. Global objects can be referenced from any node in the system,
although certain directive specific restrictions (e.g. one cannot delete a
remote object) may apply. A task does not have to be global to perform
operations involving remote objects. The maximum number of global objects in
the system is user configurable and can be found in the
``maximum_global_objects`` field in the Multiprocessor Configuration Table.
The distribution of tasks to processors is performed during the application
design phase. Dynamic task
relocation is not supported by RTEMS.
Global Object Table
-------------------
.. index:: global objects table
RTEMS maintains two tables containing object information on every node in a
multiprocessor system: a local object table and a global object table. The
local object table on each node is unique and contains information for all
objects created on this node whether those objects are local or global. The
global object table contains information regarding all global objects in the
system and, consequently, is the same on every node.

Since each node must maintain an identical copy of the global object table, the
maximum number of entries in each copy of the table must be the same. The
maximum number of entries in each copy is determined by the
``maximum_global_objects`` parameter in the Multiprocessor Configuration Table.
This parameter, as well as the ``maximum_nodes`` parameter, is required to be
the same on all nodes. To maintain consistency among the table copies, every
node in the system must be informed of the creation or deletion of a global
object.

Remote Operations
-----------------
.. index:: MPCI and remote operations
When an application performs an operation on a remote global object, RTEMS must
generate a Remote Request (RQ) message and send it to the appropriate node.
After completing the requested operation, the remote node will build a Remote
Response (RR) message and send it to the originating node. Messages generated
as a side-effect of a directive (such as deleting a global task) are known as
Remote Processes (RP) and do not require the receiving node to respond.

Other than taking slightly longer to execute directives on remote objects, the
application is unaware of the location of the objects it acts upon. The exact
amount of overhead required for a remote operation is dependent on the media
connecting the nodes and, to a lesser degree, on the efficiency of the
user-provided MPCI routines.

The following shows the typical transaction sequence during a remote
application:

#. The application issues a directive accessing a remote global object.

#. RTEMS determines the node on which the object resides.

#. RTEMS calls the user-provided MPCI routine ``GET_PACKET`` to obtain a packet
   in which to build a RQ message.

#. After building a message packet, RTEMS calls the user-provided MPCI routine
   ``SEND_PACKET`` to transmit the packet to the node on which the object
   resides (referred to as the destination node).

#. The calling task is blocked until the RR message arrives, and control of the
   processor is transferred to another task.

#. The MPCI layer on the destination node senses the arrival of a packet
   (commonly in an ISR), and calls the ``rtems_multiprocessing_announce``
   directive. This directive readies the Multiprocessing Server.

#. The Multiprocessing Server calls the user-provided MPCI routine
   ``RECEIVE_PACKET``, performs the requested operation, builds an RR message,
   and returns it to the originating node.

#. The MPCI layer on the originating node senses the arrival of a packet
   (typically via an interrupt), and calls the RTEMS
   ``rtems_multiprocessing_announce`` directive. This directive readies the
   Multiprocessing Server.

#. The Multiprocessing Server calls the user-provided MPCI routine
   ``RECEIVE_PACKET``, readies the original requesting task, and blocks until
   another packet arrives. Control is transferred to the original task which
   then completes processing of the directive.

If an uncorrectable error occurs in the user-provided MPCI layer, the fatal
error handler should be invoked. RTEMS assumes the reliable transmission and
reception of messages by the MPCI and makes no attempt to detect or correct
errors.

Proxies
-------
.. index:: proxy, definition
A proxy is an RTEMS data structure which resides on a remote node and is used
to represent a task which must block as part of a remote operation. This action
can occur as part of the ``rtems_semaphore_obtain`` and
``rtems_message_queue_receive`` directives. If the object were local, the
task's control block would be available for modification to indicate it was
blocking on a message queue or semaphore. However, the task's control block
resides only on the same node as the task. As a result, the remote node must
allocate a proxy to represent the task until it can be readied.

The maximum number of proxies is defined in the Multiprocessor Configuration
Table. Each node in a multiprocessor system may require a different number of
proxies to be configured. The distribution of proxy control blocks is
application dependent and is different from the distribution of tasks.

Multiprocessor Configuration Table
----------------------------------

The Multiprocessor Configuration Table contains information needed by RTEMS
when used in a multiprocessor system. This table is discussed in detail in the
section Multiprocessor Configuration Table of the Configuring a System chapter.

Multiprocessor Communications Interface Layer
=============================================

The Multiprocessor Communications Interface Layer (MPCI) is a set of
user-provided procedures which enable the nodes in a multiprocessor system to
communicate with one another. These routines are invoked by RTEMS at various
times in the preparation and processing of remote requests. Interrupts are
enabled when an MPCI procedure is invoked. It is assumed that if the execution
mode and/or interrupt level are altered by the MPCI layer, that they will be
restored prior to returning to RTEMS.

.. index:: MPCI, definition

The MPCI layer is responsible for managing a pool of buffers called packets and
for sending these packets between system nodes. Packet buffers contain the
messages sent between the nodes. Typically, the MPCI layer will encapsulate
the packet within an envelope which contains the information needed by the MPCI
layer. The number of packets available is dependent on the MPCI layer
implementation.

.. index:: MPCI entry points

The entry points to the routines in the user's MPCI layer should be placed in
the Multiprocessor Communications Interface Table. The user must provide entry
points for each of the following table entries in a multiprocessor system:

.. list-table::
 :class: rtems-table

 * - initialization
   - initialize the MPCI
 * - get_packet
   - obtain a packet buffer
 * - return_packet
   - return a packet buffer
 * - send_packet
   - send a packet to another node
 * - receive_packet
   - called to get an arrived packet

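These entries correspond one-to-one to the fields of the Multiprocessor
Communications Interface Table. As a rough sketch, a user-provided MPCI layer
might be tied together as shown below; the ``rtems_mpci_table`` field order
follows the classic API, but the timeout and maximum packet size values are
illustrative assumptions and should be checked against the RTEMS headers and
the Configuring a System chapter.

.. code:: c

    #include <rtems.h>

    /* Prototypes of the user-provided MPCI routines described below. */
    rtems_mpci_entry user_mpci_initialization(
      rtems_configuration_table *configuration
    );
    rtems_mpci_entry user_mpci_get_packet( rtems_packet_prefix **packet );
    rtems_mpci_entry user_mpci_return_packet( rtems_packet_prefix *packet );
    rtems_mpci_entry user_mpci_send_packet(
      uint32_t              node,
      rtems_packet_prefix **packet
    );
    rtems_mpci_entry user_mpci_receive_packet( rtems_packet_prefix **packet );

    rtems_mpci_table user_mpci_table = {
      RTEMS_MILLISECONDS_TO_TICKS( 100 ), /* default timeout (assumption)   */
      sizeof( rtems_packet_prefix ) + 64, /* maximum packet size (assumption) */
      user_mpci_initialization,
      user_mpci_get_packet,
      user_mpci_return_packet,
      user_mpci_send_packet,
      user_mpci_receive_packet
    };

The table is typically made known to RTEMS through the Multiprocessor
Configuration Table described in the Configuring a System chapter.
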
A packet is sent by RTEMS in each of the following situations:
@ -270,153 +252,144 @@ A packet is sent by RTEMS in each of the following situations:
- during system initialization to check for system consistency.
If the target hardware supports it, the arrival of a packet at a node may
generate an interrupt. Otherwise, the real-time clock ISR can check for the
arrival of a packet. In any case, the ``rtems_multiprocessing_announce``
directive must be called to announce the arrival of a packet. After exiting
the ISR, control will be passed to the Multiprocessing Server to process the
packet. The Multiprocessing Server will call the ``get_packet`` entry to
obtain a packet buffer and the ``receive_packet`` entry to copy the message
into the buffer obtained.

INITIALIZATION
--------------
The INITIALIZATION component of the user-provided MPCI layer is called as part
of the ``rtems_initialize_executive`` directive to initialize the MPCI layer
and associated hardware. It is invoked immediately after all of the device
drivers have been initialized. This component should adhere to the following
prototype:

.. index:: rtems_mpci_entry

.. code:: c

    rtems_mpci_entry user_mpci_initialization(
      rtems_configuration_table *configuration
    );

where ``configuration`` is the address of the user's Configuration Table.
Operations on global objects cannot be performed until this component is
invoked. The INITIALIZATION component is invoked only once in the life of any
system. If the MPCI layer cannot be successfully initialized, the fatal error
manager should be invoked by this routine.

One of the primary functions of the MPCI layer is to provide the executive with
packet buffers. The INITIALIZATION routine must create and initialize a pool
of packet buffers. There must be enough packet buffers so RTEMS can obtain one
whenever needed.
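
As a rough sketch of this responsibility (the pool layout, packet count, and
payload size are assumptions, not part of the RTEMS API), an INITIALIZATION
routine might pre-allocate a static pool of packet buffers and chain them onto
a free list used later by the GET_PACKET and RETURN_PACKET entries:

.. code:: c

    #include <rtems.h>

    #define USER_MPCI_PACKET_COUNT 16  /* assumption: sized for the system */

    typedef struct user_packet {
      struct user_packet  *next;          /* free list link                 */
      rtems_packet_prefix  prefix;        /* RTEMS portion of the packet    */
      uint32_t             payload[ 16 ]; /* assumption: application data   */
    } user_packet;

    static user_packet  user_packet_pool[ USER_MPCI_PACKET_COUNT ];
    static user_packet *user_packet_free_list;

    rtems_mpci_entry user_mpci_initialization(
      rtems_configuration_table *configuration
    )
    {
      uint32_t i;

      (void) configuration;

      /* Chain every packet buffer onto the free list. */
      for ( i = 0; i < USER_MPCI_PACKET_COUNT; ++i ) {
        user_packet_pool[ i ].next = user_packet_free_list;
        user_packet_free_list = &user_packet_pool[ i ];
      }

      /* Initialize the communications hardware here (BSP specific). */
    }
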
GET_PACKET
----------
The GET_PACKET component of the user-provided MPCI layer is called when RTEMS
must obtain a packet buffer to send or broadcast a message. This component
should adhere to the following prototype:

.. code:: c

    rtems_mpci_entry user_mpci_get_packet(
      rtems_packet_prefix **packet
    );

where ``packet`` is the address of a pointer to a packet. This routine always
succeeds and, upon return, ``packet`` will contain the address of a packet.
If, for any reason, a packet cannot be successfully obtained, then the fatal
error manager should be invoked.

RTEMS has been optimized to avoid the need for obtaining a packet each time a
message is sent or broadcast. For example, RTEMS sends response messages (RR)
back to the originator in the same packet in which the request message (RQ)
arrived.
RETURN_PACKET
-------------

The RETURN_PACKET component of the user-provided MPCI layer is called when
RTEMS needs to release a packet to the free packet buffer pool. This component
should adhere to the following prototype:

.. code:: c

    rtems_mpci_entry user_mpci_return_packet(
      rtems_packet_prefix *packet
    );

where ``packet`` is the address of a packet. If the packet cannot be
successfully returned, the fatal error manager should be invoked.

RECEIVE_PACKET
--------------

The RECEIVE_PACKET component of the user-provided MPCI layer is called when
RTEMS needs to obtain a packet which has previously arrived. This component
should adhere to the following prototype:

.. code:: c

    rtems_mpci_entry user_mpci_receive_packet(
      rtems_packet_prefix **packet
    );

where ``packet`` is a pointer to the address of a packet in which to place the
message from another node. If a message is available, then ``packet`` will
contain the address of the message from another node. If no messages are
available, then upon return ``packet`` should contain NULL.

SEND_PACKET
-----------

The SEND_PACKET component of the user-provided MPCI layer is called when RTEMS
needs to send a packet containing a message to another node. This component
should adhere to the following prototype:

.. code:: c

    rtems_mpci_entry user_mpci_send_packet(
      uint32_t             node,
      rtems_packet_prefix **packet
    );

where ``node`` is the node number of the destination and ``packet`` is the
address of a packet which contains a message. If the packet cannot be
successfully sent, the fatal error manager should be invoked.

If ``node`` is set to zero, the packet is to be broadcast to all other nodes in
the system. Although some MPCI layers will be built upon hardware which
supports a broadcast mechanism, others may be required to generate a copy of
the packet for each node in the system.
.. COMMENT: XXX packet_prefix structure needs to be defined in this document

Many MPCI layers use the ``packet_length`` field of the ``rtems_packet_prefix``
portion of the packet to avoid sending unnecessary data. This is especially
useful if the media connecting the nodes is relatively slow.

The ``to_convert`` field of the ``rtems_packet_prefix`` portion of the packet
indicates how much of the packet in 32-bit units may require conversion in a
heterogeneous system.

Supporting Heterogeneous Environments
-------------------------------------
.. index:: heterogeneous multiprocessing
Developing an MPCI layer for a heterogeneous system requires a thorough
understanding of the differences between the processors which comprise the
system. One difficult problem is the varying data representation schemes used
by different processor types. The most pervasive data representation problem
is the order of the bytes which compose a data entity. Processors which place
the least significant byte at the smallest address are classified as little
endian processors. Little endian byte-ordering is shown below:

.. code:: c
@ -426,9 +399,10 @@ Little endian byte-ordering is shown below:
| | | | |
+---------------+----------------+---------------+----------------+

Conversely, processors which place the most significant byte at the smallest
address are classified as big endian processors. Big endian byte-ordering is
shown below:

.. code:: c
+---------------+----------------+---------------+----------------+
@ -437,47 +411,45 @@ endian processors. Big endian byte-ordering is shown below:
| | | | |
+---------------+----------------+---------------+----------------+

Unfortunately, sharing a data structure between big endian and little endian
processors requires translation into a common endian format. An application
designer typically chooses the common endian format to minimize conversion
overhead.

Another issue in the design of shared data structures is the alignment of data
structure elements. Alignment is both processor and compiler implementation
dependent. For example, some processors allow data elements to begin on any
address boundary, while others impose restrictions. Common restrictions are
that data elements must begin on either an even address or on a long word
boundary. Violation of these restrictions may cause an exception or impose a
performance penalty.

Other issues which commonly impact the design of shared data structures include
the representation of floating point numbers, bit fields, decimal data, and
character strings. In addition, the representation method for negative
integers could be one's or two's complement. These factors combine to increase
the complexity of designing and manipulating data structures shared between
processors.

RTEMS addressed these issues in the design of the packets used to communicate
between nodes. The RTEMS packet format is designed to allow the MPCI layer to
perform all necessary conversion without burdening the developer with the
details of the RTEMS packet format. As a result, the MPCI layer must be aware
of the following:

- All packets must begin on a four byte boundary.

- Packets are composed of both RTEMS and application data. All RTEMS data is
  treated as 32-bit unsigned quantities and is in the first ``to_convert``
  32-bit quantities of the packet. The ``to_convert`` field is part of the
  ``rtems_packet_prefix`` portion of the packet.

- The RTEMS data component of the packet must be in native endian format.
  Endian conversion may be performed by either the sending or receiving MPCI
  layer, as sketched after this list.

- RTEMS makes no assumptions regarding the application data component of the
  packet.
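
A minimal sketch of the conversion a receiving MPCI layer might perform on a
heterogeneous system when the sender uses the opposite byte order; the helper
name and the use of the GCC ``__builtin_bswap32()`` intrinsic are assumptions,
and how the word count is obtained (for example from the MPCI envelope) is
left to the MPCI layer:

.. code:: c

    #include <stdint.h>

    /*
     * Byte swap the first to_convert 32-bit words of a received packet so
     * the RTEMS data component ends up in this node's native endian format.
     * The application data that follows is not touched.
     */
    static void user_mpci_convert_packet(
      uint32_t *words,
      uint32_t  to_convert
    )
    {
      uint32_t i;

      for ( i = 0; i < to_convert; ++i ) {
        words[ i ] = __builtin_bswap32( words[ i ] );
      }
    }
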
Operations
==========
@ -485,19 +457,19 @@ Operations
Announcing a Packet
-------------------

The ``rtems_multiprocessing_announce`` directive is called by the MPCI layer to
inform RTEMS that a packet has arrived from another node. This directive can
be called from an interrupt service routine or from within a polling routine.
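
A rough sketch of an MPCI interrupt handler announcing an arrived packet; the
generic ``void (*)(void *)`` handler signature and the hardware acknowledgement
helper are assumptions for illustration:

.. code:: c

    #include <rtems.h>

    /* Assumption: BSP-specific helper that acknowledges the interrupt. */
    static void user_mpci_acknowledge_interrupt( void );

    /* Invoked when the transport hardware signals that a packet arrived. */
    static void user_mpci_isr( void *arg )
    {
      (void) arg;

      user_mpci_acknowledge_interrupt();

      /* Tell RTEMS a packet is waiting; the Multiprocessing Server will
         later call the RECEIVE_PACKET entry to fetch and process it. */
      rtems_multiprocessing_announce();
    }
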
Directives
==========

This section details the additional directives required to support RTEMS in a
multiprocessor configuration. A subsection is dedicated to each of this
manager's directives and describes the calling sequence, related constants,
usage, and status codes.

.. _rtems_multiprocessing_announce:
MULTIPROCESSING_ANNOUNCE - Announce the arrival of a packet
-----------------------------------------------------------
@ -517,23 +489,14 @@ NONE
**DESCRIPTION:**

This directive informs RTEMS that a multiprocessing communications packet has
arrived from another node. This directive is called by the user-provided MPCI,
and is only used in multiprocessor configurations.

**NOTES:**
This directive is typically called from an ISR.

This directive will almost certainly cause the calling task to be preempted.

This directive does not generate activity on remote nodes.


@ -1,3 +1,7 @@
.. COMMENT: COPYRIGHT (c) 1988-2008.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.
PCI Library
###########
@ -6,19 +10,19 @@ PCI Library
Introduction
============
The Peripheral Component Interconnect (PCI) bus is a very common computer bus
architecture that is found in almost every PC today. The PCI bus is normally
located at the motherboard where some PCI devices are soldered directly onto
the PCB and expansion slots allow the user to add custom devices easily. There
is a wide range of PCI hardware available implementing all sorts of interfaces
and functions.

This section describes the PCI Library available in RTEMS used to access the
PCI bus in a portable way across computer architectures supported by RTEMS.
The PCI Library aims to be compatible with PCI 2.3 with a couple of
limitations, for example there is no support for hot-plugging, 64-bit memory
space and cardbus bridges.

In order to support different architectures and with small foot-print embedded
systems in mind the PCI Library offers four different configuration options
@ -26,7 +30,7 @@ listed below. It is selected during compile time by defining the appropriate
macros in confdefs.h. It is also possible to enable PCI_LIB_NONE (No
Configuration) which can be used for debugging PCI access functions.

- Auto Configuration (Plug & Play)

- Read Configuration (read BIOS or boot loader configuration)
@ -37,24 +41,24 @@ Configuration) which can be used for debuging PCI access functions.
Background
==========
The PCI bus is constructed in a way where on-board devices and devices in
expansion slots can be automatically found (probed) and configured using Plug &
Play completely implemented in software. The bus is set up once during boot
up. The Plug & Play information can be read and written from PCI configuration
space. A PCI device is identified in configuration space by a unique bus, slot
and function number. Each PCI slot can have up to 8 functions and interface to
another PCI sub-bus by implementing a PCI-to-PCI bridge according to the PCI
Bridge Architecture specification.

Using the unique \[bus:slot:func] any device can be configured regardless of
how PCI is currently set up as long as all PCI buses are enumerated correctly.
The enumeration is done during probing; all bridges are given a bus number in
order for the bridges to respond to accesses from both directions. The PCI
library can assign address ranges to which a PCI device should respond using
the Plug & Play technique or a static user defined configuration. After the
configuration has been performed the PCI device drivers can find devices by the
read-only PCI Class type, Vendor ID and Device ID information found in
configuration space for each device.

In some systems there is a boot loader or BIOS which have already configured
all PCI devices, but on embedded targets it is quite common that there is no
@ -65,14 +69,13 @@ translate the \[bus:slot:func] into a valid PCI configuration space access.
If the target is not a host, but a peripheral, configuration space can not be
accessed, the peripheral is set up by the host during start up. In complex
embedded PCI systems the peripheral may need to access other PCI boards than
the host. In such systems a custom (static) configuration of both the host and
peripheral may be a convenient solution.

The PCI bus defines four interrupt signals INTA#..INTD#. The interrupt signals
must be mapped into a system interrupt/vector; it is up to the BSP or host
driver to know the mapping, however the BIOS or boot loader may use the 8-bit
read/write "Interrupt Line" register to pass the knowledge along to the OS.

The PCI standard defines and recommends that the backplane route the interrupt
lines in a systematic way, however in the standard there is no such
requirement.

@ -105,8 +108,8 @@ PCI Configuration
During start up the PCI bus must be configured in order for host and
peripherals to access one another using Memory or I/O accesses and that
interrupts are properly handled. Three different spaces are defined and mapped
separately:

# I/O space (IO)
@ -114,14 +117,14 @@ mapped separately:
# prefetchable Memory space (MEM)
Regions of the same type (I/O or Memory) may not overlap which is guaranteed by
the software. MEM regions may be mapped into MEMIO regions, but MEMIO regions
can not be mapped into MEM, for that could lead to prefetching of registers.
The interrupt pin which a board is driving can be read out from PCI
configuration space, however it is up to software to know how interrupt signals
are routed between PCI-to-PCI bridges and how PCI INT[A..D]# pins are mapped to
system IRQ. In systems where previous software (boot loader or BIOS) has
already set this up, the configuration is overwritten or simply read out.

In order to support different configuration methods the following configuration
libraries are selectable by the user:
@ -138,7 +141,8 @@ libraries are selectable by the user:
A host driver can be made to support all three configuration methods, or any
combination. It may be defined by the BSP which approach is used.

The configuration software is called from the PCI driver
(``pci_config_init()``).

Regardless of configuration method a PCI device tree is created in RAM during
initialization, the tree can be accessed to find devices and resources without
@ -148,14 +152,14 @@ device tree at compile time when using the static/peripheral method.
RTEMS Configuration selection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The active configuration method can be selected at compile time in the same way
as other project parameters by including ``rtems/confdefs.h`` and setting

- ``CONFIGURE_INIT``

- ``RTEMS_PCI_CONFIG_LIB``

- ``CONFIGURE_PCI_LIB`` = PCI_LIB_(AUTO,STATIC,READ,PERIPHERAL)

See the RTEMS configuration section for how to set up the PCI library.
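
As a rough sketch (other application configuration options are omitted, and the
exact spelling of the PCI options should be checked against the configuration
documentation), an application selecting the auto configuration library could
add the following to its configuration file:

.. code:: c

    #include <rtems.h>

    /* Enable the PCI library and pick the auto configuration (Plug & Play)
       method; the other choices are PCI_LIB_STATIC, PCI_LIB_READ and
       PCI_LIB_PERIPHERAL. */
    #define RTEMS_PCI_CONFIG_LIB
    #define CONFIGURE_PCI_LIB PCI_LIB_AUTO

    #define CONFIGURE_INIT
    #include <rtems/confdefs.h>
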
@ -163,17 +167,17 @@ Auto Configuration
~~~~~~~~~~~~~~~~~~
The auto configuration software enumerates PCI buses and initializes all PCI
devices found using Plug & Play. The auto configuration software requires that
a configuration setup has been registered by the driver or BSP in order to
setup the I/O and Memory regions at the correct address ranges. PCI interrupt
pins can optionally be routed over PCI-to-PCI bridges and mapped to a system
interrupt number. BAR resources are sorted by size and required alignment;
unused "dead" space may be created when PCI bridges are present because the PCI
bridge window size does not equal the alignment. To cope with that, resources
are reordered to fit smaller BARs into the dead space to minimize the PCI space
required. If a BAR or ROM register can not be allocated a PCI address region
(due to too few resources available) the register will be given the value of
``pci_invalid_address`` which defaults to 0.

The auto configuration routines support:
@ -185,8 +189,7 @@ The auto configuration routines support:
- memory space (MEMIO)
- prefetchable memory space (MEM), if not present MEM will be mapped into MEMIO

- multiple PCI buses - PCI-to-PCI bridges
@ -224,8 +227,8 @@ Static Configuration
~~~~~~~~~~~~~~~~~~~~
To support custom configurations and small-footprint PCI systems, the user may
provide the PCI device tree which contains the current configuration. The PCI
buses are enumerated and all resources are written to PCI devices during
initialization. When this approach is selected PCI boards must be located at
the same slots every time and devices can not be removed or added, Plug & Play
is not performed. Boards of the same type may of course be exchanged.
@ -240,13 +243,13 @@ Peripheral Configuration
On systems where a peripheral PCI device needs to access other PCI devices than
the host, the peripheral configuration approach may be handy. Most PCI devices
answer the PCI host's requests and start DMA accesses into the host's memory,
however in some complex systems PCI devices may want to access other devices on
the same bus or at another PCI bus.

A PCI peripheral is not allowed to do PCI configuration cycles, which means
that it must either rely on the host to give it the addresses it needs, or that
the addresses are predefined.

This configuration approach is very similar to the static option, however the
configuration is never written to PCI bus, instead it is only used for drivers
@ -256,8 +259,8 @@ PCI Access
----------
The PCI access routines are low-level routines provided for drivers,
configuration software, etc. in order to access different regions in a way not
dependent upon the host driver, BSP or platform.

- PCI configuration space
@ -275,26 +278,28 @@ configuration space access.
Some non-standard hardware may also define the PCI bus big-endian, for example
the LEON2 AT697 PCI host bridge and some LEON3 systems may be configured that
way. It is up to the BSP to set the appropriate PCI endianness on compile time
(``BSP_PCI_BIG_ENDIAN``) in order for inline macros to be correctly defined.
Another possibility is to use the function pointers defined by the access layer
to implement drivers that support "run-time endianness detection".

Configuration space
~~~~~~~~~~~~~~~~~~~

Configuration space is accessed using the routines listed below. The
``pci_dev_t`` type is used to specify a specific PCI bus, device and function.
It is up to the host driver or BSP to create a valid access to the requested
PCI slot. Requests made to slots that are not supported by hardware should
result in ``PCISTS_MSTABRT`` and/or data must be ignored (writes) or
``0xFFFFFFFF`` is always returned (reads).

.. code:: c
/* Configuration Space Access Read Routines \*/
extern int pci_cfg_r8(pci_dev_t dev, int ofs, uint8_t \*data);
extern int pci_cfg_r16(pci_dev_t dev, int ofs, uint16_t \*data);
extern int pci_cfg_r32(pci_dev_t dev, int ofs, uint32_t \*data);
/* Configuration Space Access Write Routines \*/
/* Configuration Space Access Read Routines */
extern int pci_cfg_r8(pci_dev_t dev, int ofs, uint8_t *data);
extern int pci_cfg_r16(pci_dev_t dev, int ofs, uint16_t *data);
extern int pci_cfg_r32(pci_dev_t dev, int ofs, uint32_t *data);
/* Configuration Space Access Write Routines */
extern int pci_cfg_w8(pci_dev_t dev, int ofs, uint8_t data);
extern int pci_cfg_w16(pci_dev_t dev, int ofs, uint16_t data);
extern int pci_cfg_w32(pci_dev_t dev, int ofs, uint32_t data);
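As an illustration, the vendor and device identifiers of a device might be read
as sketched below. The offsets ``0x00`` and ``0x02`` follow the standard PCI
configuration header layout; the ``PCI_DEV()`` macro for composing a
``pci_dev_t``, the ``<pci.h>`` header name and the zero return value for a
successful access are assumptions about the PCI Library.

.. code:: c

    #include <pci.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Read the Vendor ID and Device ID of bus 0, slot 1, function 0. */
    void show_ids(void)
    {
      pci_dev_t dev = PCI_DEV(0, 1, 0); /* assumed helper macro */
      uint16_t vendor;
      uint16_t device;

      if (pci_cfg_r16(dev, 0x00, &vendor) == 0 &&
          pci_cfg_r16(dev, 0x02, &device) == 0 &&
          vendor != 0xFFFF) {
        printf("found device %04x:%04x\n", vendor, device);
      }
    }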
@ -305,20 +310,20 @@ I/O space
The BSP or driver provides special routines in order to access I/O space. Some
architectures have a special instruction for accessing I/O space; others have
it mapped into a "PCI I/O window" in the standard address space accessed by the
CPU. The window size may vary and must be taken into consideration by the
host driver. The below routines must be used to access I/O space. The address
given to the functions is not the PCI I/O addresses, the caller must have
translated PCI I/O addresses (available in the PCI BARs) into a BSP or host
driver custom address, see `Access functions`_ for how
addresses are translated.
CPU. The window size may vary and must be taken into consideration by the host
driver. The below routines must be used to access I/O space. The address given
to the functions is not the PCI I/O address; the caller must have translated
the PCI I/O addresses (available in the PCI BARs) into a BSP or host driver
custom address, see `Access functions`_ for how addresses are translated.
.. code:: c
/* Read a register over PCI I/O Space \*/
/* Read a register over PCI I/O Space */
extern uint8_t pci_io_r8(uint32_t adr);
extern uint16_t pci_io_r16(uint32_t adr);
extern uint32_t pci_io_r32(uint32_t adr);
/* Write a register over PCI I/O Space \*/
/* Write a register over PCI I/O Space */
extern void pci_io_w8(uint32_t adr, uint8_t data);
extern void pci_io_w16(uint32_t adr, uint16_t data);
extern void pci_io_w32(uint32_t adr, uint32_t data);
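As a sketch, a 32-bit status register in a device's I/O BAR could be read as
follows, assuming ``io_base`` has already been translated from the PCI I/O BAR
value into the BSP or host driver custom address; the register offset belongs
to a hypothetical device.

.. code:: c

    #include <pci.h>
    #include <stdint.h>

    #define EXAMPLE_STATUS_REG 0x10 /* hypothetical register offset */

    uint32_t example_read_status(uint32_t io_base)
    {
      /* io_base is the translated address of the device's I/O BAR */
      return pci_io_r32(io_base + EXAMPLE_STATUS_REG);
    }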
@ -334,52 +339,53 @@ memory space. This leads to register content being swapped, which must be
swapped back. The below routines make it possible to access registers over PCI
memory space in a portable way on different architectures; the BSP or
architecture must provide the necessary functions in order to implement this.
.. code:: c
static inline uint16_t pci_ld_le16(volatile uint16_t \*addr);
static inline void pci_st_le16(volatile uint16_t \*addr, uint16_t val);
static inline uint32_t pci_ld_le32(volatile uint32_t \*addr);
static inline void pci_st_le32(volatile uint32_t \*addr, uint32_t val);
static inline uint16_t pci_ld_be16(volatile uint16_t \*addr);
static inline void pci_st_be16(volatile uint16_t \*addr, uint16_t val);
static inline uint32_t pci_ld_be32(volatile uint32_t \*addr);
static inline void pci_st_be32(volatile uint32_t \*addr, uint32_t val);
static inline uint16_t pci_ld_le16(volatile uint16_t *addr);
static inline void pci_st_le16(volatile uint16_t *addr, uint16_t val);
static inline uint32_t pci_ld_le32(volatile uint32_t *addr);
static inline void pci_st_le32(volatile uint32_t *addr, uint32_t val);
static inline uint16_t pci_ld_be16(volatile uint16_t *addr);
static inline void pci_st_be16(volatile uint16_t *addr, uint16_t val);
static inline uint32_t pci_ld_be32(volatile uint32_t *addr);
static inline void pci_st_be32(volatile uint32_t *addr, uint32_t val);
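For example, a little-endian control register behind a memory BAR might be
accessed as sketched below; ``regs`` is the CPU-accessible base address
obtained by translating the BAR (see `PCI address translation`_) and the
register offset and enable bit belong to a hypothetical device.

.. code:: c

    #include <pci.h>
    #include <stdint.h>

    #define EXAMPLE_CTRL_REG 0x04 /* hypothetical register offset */

    void example_enable_device(volatile uint8_t *regs)
    {
      volatile uint32_t *ctrl = (volatile uint32_t *) (regs + EXAMPLE_CTRL_REG);

      /* read, set the hypothetical enable bit, and write back */
      pci_st_le32(ctrl, pci_ld_le32(ctrl) | 0x1);
    }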
In order to support non-standard big-endian PCI bus the above pci_* functions
is required, pci_ld_le16 != ld_le16 on big endian PCI buses.
In order to support a non-standard big-endian PCI bus the above ``pci_*``
functions are required; ``pci_ld_le16 != ld_le16`` on big-endian PCI buses.
Access functions
~~~~~~~~~~~~~~~~
The PCI Access Library can provide device drivers with function pointers
executing the above Configuration, I/O and Memory space accesses. The
functions have the same arguments and return values as the above
functions.
executing the above Configuration, I/O and Memory space accesses. The functions
have the same arguments and return values as the above functions.
The ``pci_access_func()`` function defined below can be used to get a function
pointer of a specific access type.
.. code:: c
/* Get Read/Write function for accessing a register over PCI Memory Space
* (non-inline functions).
*
* Arguments
* wr 0(Read), 1(Write)
* size 1(Byte), 2(Word), 4(Double Word)
* func Where function pointer will be stored
* endian PCI_LITTLE_ENDIAN or PCI_BIG_ENDIAN
* type 1(I/O), 3(REG over MEM), 4(CFG)
*
* Return
* 0 Found function
* others No such function defined by host driver or BSP
\*/
int pci_access_func(int wr, int size, void \**func, int endian, int type);
* (non-inline functions).
*
* Arguments
* wr 0(Read), 1(Write)
* size 1(Byte), 2(Word), 4(Double Word)
* func Where function pointer will be stored
* endian PCI_LITTLE_ENDIAN or PCI_BIG_ENDIAN
* type 1(I/O), 3(REG over MEM), 4(CFG)
*
* Return
* 0 Found function
* others No such function defined by host driver or BSP
*/
int pci_access_func(int wr, int size, void **func, int endian, int type);
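A sketch of obtaining and using such a function pointer is shown below; the
local typedef reflects the statement above that the returned function has the
same arguments and return value as ``pci_io_r32()``.

.. code:: c

    #include <pci.h>
    #include <stdint.h>

    /* Local typedef for this example; same signature as pci_io_r32(). */
    typedef uint32_t (*example_io_r32)(uint32_t adr);

    uint32_t example_read_io_reg(uint32_t adr)
    {
      void *func = NULL;
      example_io_r32 io_r32;

      /* Read (0), 4 bytes, I/O space (type 1), little-endian. */
      if (pci_access_func(0, 4, &func, PCI_LITTLE_ENDIAN, 1) != 0)
        return 0xFFFFFFFF; /* no such access function provided */

      io_r32 = (example_io_r32) func;
      return io_r32(adr);
    }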
PCI device drivers may be written to support run-time detection of endianness;
this is mostly for debugging or for development systems. When the product is
finally deployed macros switch to using the inline functions instead which
have been configured for the correct endianness.
finally deployed, macros switch to using the inline functions instead, which
have been configured for the correct endianness.
PCI address translation
~~~~~~~~~~~~~~~~~~~~~~~
@ -390,23 +396,25 @@ using configuration space routines or in the device tree, the addresses given
are PCI addresses. The below functions can be used to translate PCI addresses
into CPU accessible addresses or vice versa; translation may be different for
different PCI spaces/regions.
.. code:: c
/* Translate PCI address into CPU accessible address \*/
static inline int pci_pci2cpu(uint32_t \*address, int type);
/* Translate CPU accessible address into PCI address (for DMA) \*/
static inline int pci_cpu2pci(uint32_t \*address, int type);
/* Translate PCI address into CPU accessible address */
static inline int pci_pci2cpu(uint32_t *address, int type);
/* Translate CPU accessible address into PCI address (for DMA) */
static inline int pci_cpu2pci(uint32_t *address, int type);
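As a sketch, the memory address found in a device's BAR0 could be translated
into a CPU-accessible address as follows; the configuration space offset
``0x10`` is BAR0 in the standard PCI header, while the ``space_type`` argument
is left to the caller because its valid values are defined by the PCI Library
for the respective space/region.

.. code:: c

    #include <pci.h>
    #include <stdint.h>

    int example_bar0_to_cpu(pci_dev_t dev, int space_type, uint32_t *cpu_addr)
    {
      uint32_t bar;

      if (pci_cfg_r32(dev, 0x10, &bar) != 0)
        return -1; /* configuration access failed */

      *cpu_addr = bar & ~0xFU; /* mask off the memory BAR flag bits */

      return pci_pci2cpu(cpu_addr, space_type);
    }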
PCI Interrupt
-------------
The PCI specification defines four different interrupt lines INTA#..INTD#,
the interrupts are low level sensitive which make it possible to support
multiple interrupt sources on the same interrupt line. Since the lines are
level sensitive the interrupt sources must be acknowledged before clearing the
The PCI specification defines four different interrupt lines INTA#..INTD#; the
interrupts are low level sensitive, which makes it possible to support multiple
interrupt sources on the same interrupt line. Since the lines are level
sensitive, the interrupt sources must be acknowledged before clearing the
interrupt controller, or the interrupt controller must be masked. The BSP must
provide a routine for clearing/acknowledging the interrupt controller, it is
up to the interrupt service routine to acknowledge the interrupt source.
provide a routine for clearing/acknowledging the interrupt controller; it is up
to the interrupt service routine to acknowledge the interrupt source.
The PCI Library relies on the BSP for implementing shared interrupt handling
through the ``BSP_PCI_shared_interrupt_*`` functions/macros; they must be defined
@ -423,10 +431,3 @@ PCI Shell command
The RTEMS shell has a PCI command 'pci' which makes it possible to read/write
configuration space, print the current PCI configuration and print out a
configuration C-file for the static or peripheral library.
.. COMMENT: COPYRIGHT (c) 1988-2007.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.

View File

@ -1,11 +1,15 @@
.. COMMENT: COPYRIGHT (c) 2011,2015
.. COMMENT: Aeroflex Gaisler AB
.. COMMENT: All rights reserved.
Symmetric Multiprocessing Services
##################################
Introduction
============
The Symmetric Multiprocessing (SMP) support of the RTEMS 4.10.99.0 is
available on
The Symmetric Multiprocessing (SMP) support of RTEMS 4.11.0 and later is
available on
- ARM,
@ -13,34 +17,38 @@ available on
- SPARC.
It must be explicitly enabled via the ``--enable-smp`` configure command
line option. To enable SMP in the application configuration see `Enable SMP Support for Applications`_. The default
scheduler for SMP applications supports up to 32 processors and is a global
fixed priority scheduler, see also `Configuring Clustered Schedulers`_. For example applications see:file:`testsuites/smptests`.
It must be explicitly enabled via the ``--enable-smp`` configure command line
option. To enable SMP in the application configuration see `Enable SMP Support
for Applications`_. The default scheduler for SMP applications supports up to
32 processors and is a global fixed priority scheduler; see also
:ref:`Configuring Clustered Schedulers`. For example applications
see :file:`testsuites/smptests`.
*WARNING: The SMP support in RTEMS is work in progress. Before you
start using this RTEMS version for SMP ask on the RTEMS mailing list.*
.. warning::
The SMP support in RTEMS is work in progress. Before you start using this
RTEMS version for SMP ask on the RTEMS mailing list.
This chapter describes the services related to Symmetric Multiprocessing
provided by RTEMS.
The application level services currently provided are:
- ``rtems_get_processor_count`` - Get processor count
- rtems_get_processor_count_ - Get processor count
- ``rtems_get_current_processor`` - Get current processor index
- rtems_get_current_processor_ - Get current processor index
- ``rtems_scheduler_ident`` - Get ID of a scheduler
- rtems_scheduler_ident_ - Get ID of a scheduler
- ``rtems_scheduler_get_processor_set`` - Get processor set of a scheduler
- rtems_scheduler_get_processor_set_ - Get processor set of a scheduler
- ``rtems_task_get_scheduler`` - Get scheduler of a task
- rtems_task_get_scheduler_ - Get scheduler of a task
- ``rtems_task_set_scheduler`` - Set scheduler of a task
- rtems_task_set_scheduler_ - Set scheduler of a task
- ``rtems_task_get_affinity`` - Get task processor affinity
- rtems_task_get_affinity_ - Get task processor affinity
- ``rtems_task_set_affinity`` - Set task processor affinity
- rtems_task_set_affinity_ - Set task processor affinity
Background
==========
@ -56,65 +64,62 @@ taken for granted:
- hardware events result in interrupts
There is no true parallelism. Even when interrupts appear to occur
at the same time, they are processed in largely a serial fashion.
This is true even when the interupt service routines are allowed to
nest. From a tasking viewpoint, it is the responsibility of the real-time
operatimg system to simulate parallelism by switching between tasks.
These task switches occur in response to hardware interrupt events and explicit
application events such as blocking for a resource or delaying.
There is no true parallelism. Even when interrupts appear to occur at the same
time, they are processed in largely a serial fashion. This is true even when
the interrupt service routines are allowed to nest. From a tasking viewpoint,
it is the responsibility of the real-time operating system to simulate
parallelism by switching between tasks. These task switches occur in response
to hardware interrupt events and explicit application events such as blocking
for a resource or delaying.
With symmetric multiprocessing, the presence of multiple processors
allows for true concurrency and provides for cost-effective performance
improvements. Uniprocessors tend to increase performance by increasing
clock speed and complexity. This tends to lead to hot, power hungry
microprocessors which are poorly suited for many embedded applications.
With symmetric multiprocessing, the presence of multiple processors allows for
true concurrency and provides for cost-effective performance
improvements. Uniprocessors tend to increase performance by increasing clock
speed and complexity. This tends to lead to hot, power hungry microprocessors
which are poorly suited for many embedded applications.
The true concurrency is in sharp contrast to the single task and
interrupt model of uniprocessor systems. This results in a fundamental
change to uniprocessor system characteristics listed above. Developers
are faced with a different set of characteristics which, in turn, break
some existing assumptions and result in new challenges. In an SMP system
with N processors, these are the new execution characteristics.
The true concurrency is in sharp contrast to the single task and interrupt
model of uniprocessor systems. This results in a fundamental change to
uniprocessor system characteristics listed above. Developers are faced with a
different set of characteristics which, in turn, break some existing
assumptions and result in new challenges. In an SMP system with N processors,
these are the new execution characteristics.
- N tasks execute in parallel
- hardware events result in interrupts
There is true parallelism with a task executing on each processor and
the possibility of interrupts occurring on each processor. Thus in contrast
to their being one task and one interrupt to consider on a uniprocessor,
there are N tasks and potentially N simultaneous interrupts to consider
on an SMP system.
There is true parallelism with a task executing on each processor and the
possibility of interrupts occurring on each processor. Thus in contrast to
there being one task and one interrupt to consider on a uniprocessor, there are
N tasks and potentially N simultaneous interrupts to consider on an SMP system.
This increase in hardware complexity and presence of true parallelism
results in the application developer needing to be even more cautious
about mutual exclusion and shared data access than in a uniprocessor
embedded system. Race conditions that never or rarely happened when an
application executed on a uniprocessor system, become much more likely
due to multiple threads executing in parallel. On a uniprocessor system,
these race conditions would only happen when a task switch occurred at
just the wrong moment. Now there are N-1 tasks executing in parallel
all the time and this results in many more opportunities for small
windows in critical sections to be hit.
This increase in hardware complexity and presence of true parallelism results
in the application developer needing to be even more cautious about mutual
exclusion and shared data access than in a uniprocessor embedded system. Race
conditions that never or rarely happened when an application executed on a
uniprocessor system become much more likely due to multiple threads executing
in parallel. On a uniprocessor system, these race conditions would only happen
when a task switch occurred at just the wrong moment. Now there are N-1 tasks
executing in parallel all the time and this results in many more opportunities
for small windows in critical sections to be hit.
Task Affinity
-------------
.. index:: task affinity
.. index:: thread affinity
RTEMS provides services to manipulate the affinity of a task. Affinity
is used to specify the subset of processors in an SMP system on which
a particular task can execute.
RTEMS provides services to manipulate the affinity of a task. Affinity is used
to specify the subset of processors in an SMP system on which a particular task
can execute.
By default, tasks have an affinity which allows them to execute on any
available processor.
Task affinity is a possible feature to be supported by SMP-aware
schedulers. However, only a subset of the available schedulers support
affinity. Although the behavior is scheduler specific, if the scheduler
does not support affinity, it is likely to ignore all attempts to set
affinity.
affinity. Although the behavior is scheduler specific, if the scheduler does
not support affinity, it is likely to ignore all attempts to set affinity.
The scheduler with support for arbitrary processor affinities uses a
proof-of-concept implementation. See https://devel.rtems.org/ticket/2510.
@ -130,12 +135,13 @@ to another. There are three reasons why tasks migrate in RTEMS.
- The scheduler changes explicitly via ``rtems_task_set_scheduler()`` or
similar directives.
- The task resumes execution after a blocking operation. On a priority
based scheduler it will evict the lowest priority task currently assigned to a
- The task resumes execution after a blocking operation. On a priority based
scheduler it will evict the lowest priority task currently assigned to a
processor in the processor set managed by the scheduler instance.
- The task moves temporarily to another scheduler instance due to locking
protocols like *Migratory Priority Inheritance* or the*Multiprocessor Resource Sharing Protocol*.
protocols like *Migratory Priority Inheritance* or the *Multiprocessor
Resource Sharing Protocol*.
Task migration should be avoided so that the working set of a task can stay on
the most local cache level.
@ -173,8 +179,9 @@ clusters. Clusters with a cardinality of one are partitions. Each cluster is
owned by exactly one scheduler instance.
Clustered scheduling helps to control the worst-case latencies in
multi-processor systems, see *Brandenburg, Bjorn B.: Scheduling and
Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, 2011.http://www.cs.unc.edu/~bbb/diss/brandenburg-diss.pdf*. The goal is to
multi-processor systems, see *Brandenburg, Bjorn B.: Scheduling and Locking in
Multiprocessor Real-Time Operating Systems. PhD thesis, 2011.
http://www.cs.unc.edu/~bbb/diss/brandenburg-diss.pdf*. The goal is to
reduce the amount of shared state in the system and thus prevent lock
contention. Modern multi-processor systems tend to have several layers of data
and instruction caches. With clustered scheduling it is possible to honour the
@ -188,8 +195,8 @@ available
- message queues,
- semaphores using the `Priority Inheritance`_
protocol (priority boosting), and
- semaphores using the `Priority Inheritance`_ protocol (priority boosting),
and
- semaphores using the `Multiprocessor Resource Sharing Protocol`_ (MrsP).
@ -198,9 +205,10 @@ real-time requirements and functions that profit from fairness and high
throughput provided the scheduler instances are fully decoupled and adequate
inter-cluster synchronization primitives are used. This is work in progress.
For the configuration of clustered schedulers see `Configuring Clustered Schedulers`_.
For the configuration of clustered schedulers see `Configuring Clustered
Schedulers`_.
To set the scheduler of a task see `SCHEDULER_IDENT - Get ID of a scheduler`_
To set the scheduler of a task see `SCHEDULER_IDENT - Get ID of a scheduler`_
and `TASK_SET_SCHEDULER - Set scheduler of a task`_.
Task Priority Queues
@ -220,9 +228,11 @@ appended to the FIFO. To dequeue a task the highest priority task of the first
priority queue in the FIFO is selected. Then the first priority queue is
removed from the FIFO. In case the previously first priority queue is not
empty, then it is appended to the FIFO. So there is FIFO fairness with respect
to the highest priority task of each scheduler instances. See also *Brandenburg, Bjorn B.: A fully preemptive multiprocessor semaphore protocol for
latency-sensitive real-time applications. In Proceedings of the 25th Euromicro
Conference on Real-Time Systems (ECRTS 2013), pages 292-302, 2013.http://www.mpi-sws.org/~bbb/papers/pdf/ecrts13b.pdf*.
to the highest priority task of each scheduler instance. See also
*Brandenburg, Bjorn B.: A fully preemptive multiprocessor semaphore protocol
for latency-sensitive real-time applications. In Proceedings of the 25th
Euromicro Conference on Real-Time Systems (ECRTS 2013), pages 292-302, 2013.
http://www.mpi-sws.org/~bbb/papers/pdf/ecrts13b.pdf*.
Such a two level queue may need a considerable amount of memory if fast enqueue
and dequeue operations are desired (depends on the scheduler instance count).
@ -242,11 +252,11 @@ for the task itself. In case a task needs to block, then there are two options
In case the task is dequeued, then there are two options
- the task is the last task on the queue, then it removes this queue from
the object and reclaims it for its own purpose, or
- the task is the last task on the queue, then it removes this queue from the
object and reclaims it for its own purpose, or
- otherwise, then the task removes one queue from the free list of the
object and reclaims it for its own purpose.
- otherwise, then the task removes one queue from the free list of the object
and reclaims it for its own purpose.
Since there are usually more objects than tasks, this actually reduces the
memory demands. In addition the objects contain only a pointer to the task
@ -257,39 +267,40 @@ and OpenMP run-time support).
Scheduler Helping Protocol
--------------------------
The scheduler provides a helping protocol to support locking protocols like*Migratory Priority Inheritance* or the *Multiprocessor Resource
Sharing Protocol*. Each ready task can use at least one scheduler node at a
time to gain access to a processor. Each scheduler node has an owner, a user
and an optional idle task. The owner of a scheduler node is determined a task
The scheduler provides a helping protocol to support locking protocols like
*Migratory Priority Inheritance* or the *Multiprocessor Resource Sharing
Protocol*. Each ready task can use at least one scheduler node at a time to
gain access to a processor. Each scheduler node has an owner, a user and an
optional idle task. The owner of a scheduler node is determined at task
creation and never changes during the lifetime of a scheduler node. The user
of a scheduler node may change due to the scheduler helping protocol. A
scheduler node is in one of the four scheduler help states:
:dfn:`help yourself`
This scheduler node is solely used by the owner task. This task owns no
resources using a helping protocol and thus does not take part in the scheduler
helping protocol. No help will be provided for other tasks.
resources using a helping protocol and thus does not take part in the
scheduler helping protocol. No help will be provided for other tasks.
:dfn:`help active owner`
This scheduler node is owned by a task actively owning a resource and can be
used to help out tasks.
In case this scheduler node changes its state from ready to scheduled and the
task executes using another node, then an idle task will be provided as a user
of this node to temporarily execute on behalf of the owner task. Thus lower
priority tasks are denied access to the processors of this scheduler instance.
In case a task actively owning a resource performs a blocking operation, then
an idle task will be used also in case this node is in the scheduled state.
This scheduler node is owned by a task actively owning a resource and can
be used to help out tasks. In case this scheduler node changes its state
from ready to scheduled and the task executes using another node, then an
idle task will be provided as a user of this node to temporarily execute on
behalf of the owner task. Thus lower priority tasks are denied access to
the processors of this scheduler instance. In case a task actively owning
a resource performs a blocking operation, then an idle task will be used
also in case this node is in the scheduled state.
:dfn:`help active rival`
This scheduler node is owned by a task actively obtaining a resource currently
owned by another task and can be used to help out tasks.
The task owning this node is ready and will give away its processor in case the
This scheduler node is owned by a task actively obtaining a resource
currently owned by another task and can be used to help out tasks. The
task owning this node is ready and will give away its processor in case the
task owning the resource asks for help.
:dfn:`help passive`
This scheduler node is owned by a task obtaining a resource currently owned by
another task and can be used to help out tasks.
The task owning this node is blocked.
This scheduler node is owned by a task obtaining a resource currently owned
by another task and can be used to help out tasks. The task owning this
node is blocked.
The following scheduler operations return a task in need of help
@ -324,15 +335,15 @@ the system depends on the maximum resource tree size of the application.
Critical Section Techniques and SMP
-----------------------------------
As discussed earlier, SMP systems have opportunities for true parallelism
which was not possible on uniprocessor systems. Consequently, multiple
techniques that provided adequate critical sections on uniprocessor
systems are unsafe on SMP systems. In this section, some of these
unsafe techniques will be discussed.
As discussed earlier, SMP systems have opportunities for true parallelism which
was not possible on uniprocessor systems. Consequently, multiple techniques
that provided adequate critical sections on uniprocessor systems are unsafe on
SMP systems. In this section, some of these unsafe techniques will be
discussed.
In general, applications must use proper operating system provided mutual
exclusion mechanisms to ensure correct behavior. This primarily means
the use of binary semaphores or mutexes to implement critical sections.
exclusion mechanisms to ensure correct behavior. This primarily means the use
of binary semaphores or mutexes to implement critical sections.
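A minimal sketch of such a critical section, using a Classic API binary
semaphore with priority inheritance (the object name and the protected data are
illustrative):

.. code-block:: c

    #include <rtems.h>
    #include <assert.h>

    static rtems_id data_lock;
    static int shared_counter;

    void example_create_lock(void)
    {
      rtems_status_code sc;

      sc = rtems_semaphore_create(
        rtems_build_name('D', 'L', 'C', 'K'),
        1,
        RTEMS_BINARY_SEMAPHORE | RTEMS_PRIORITY | RTEMS_INHERIT_PRIORITY,
        0,
        &data_lock
      );
      assert(sc == RTEMS_SUCCESSFUL);
    }

    void example_increment_counter(void)
    {
      rtems_status_code sc;

      sc = rtems_semaphore_obtain(data_lock, RTEMS_WAIT, RTEMS_NO_TIMEOUT);
      assert(sc == RTEMS_SUCCESSFUL);

      ++shared_counter; /* critical section protected by the mutex */

      sc = rtems_semaphore_release(data_lock);
      assert(sc == RTEMS_SUCCESSFUL);
    }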
Disable Interrupts and Interrupt Locks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -369,80 +380,85 @@ to simple interrupt disable/enable sequences. It is disallowed to acquire a
single interrupt lock in a nested way. This will result in an infinite loop
with interrupts disabled. While converting legacy code to interrupt locks, care
must be taken to avoid this situation.
.. code:: c
.. code-block:: c
:linenos:
void legacy_code_with_interrupt_disable_enable( void )
{
rtems_interrupt_level level;
rtems_interrupt_disable( level );
/* Some critical stuff \*/
rtems_interrupt_enable( level );
rtems_interrupt_level level;
rtems_interrupt_disable( level );
/* Some critical stuff */
rtems_interrupt_enable( level );
}
RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )
RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" );
void smp_ready_code_with_interrupt_lock( void )
{
rtems_interrupt_lock_context lock_context;
rtems_interrupt_lock_acquire( &lock, &lock_context );
/* Some critical stuff \*/
rtems_interrupt_lock_release( &lock, &lock_context );
rtems_interrupt_lock_context lock_context;
rtems_interrupt_lock_acquire( &lock, &lock_context );
/* Some critical stuff */
rtems_interrupt_lock_release( &lock, &lock_context );
}
The ``rtems_interrupt_lock`` structure is empty on uni-processor
configurations. Empty structures have a different size in C
(implementation-defined, zero in case of GCC) and C++ (implementation-defined
non-zero value, one in case of GCC). Thus the``RTEMS_INTERRUPT_LOCK_DECLARE()``, ``RTEMS_INTERRUPT_LOCK_DEFINE()``,``RTEMS_INTERRUPT_LOCK_MEMBER()``, and``RTEMS_INTERRUPT_LOCK_REFERENCE()`` macros are provided to ensure ABI
compatibility.
non-zero value, one in case of GCC). Thus the
``RTEMS_INTERRUPT_LOCK_DECLARE()``, ``RTEMS_INTERRUPT_LOCK_DEFINE()``,
``RTEMS_INTERRUPT_LOCK_MEMBER()``, and ``RTEMS_INTERRUPT_LOCK_REFERENCE()``
macros are provided to ensure ABI compatibility.
Highest Priority Task Assumption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
On a uniprocessor system, it is safe to assume that when the highest
priority task in an application executes, it will execute without being
preempted until it voluntarily blocks. Interrupts may occur while it is
executing, but there will be no context switch to another task unless
the highest priority task voluntarily initiates it.
On a uniprocessor system, it is safe to assume that when the highest priority
task in an application executes, it will execute without being preempted until
it voluntarily blocks. Interrupts may occur while it is executing, but there
will be no context switch to another task unless the highest priority task
voluntarily initiates it.
Given the assumption that no other tasks will have their execution
interleaved with the highest priority task, it is possible for this
task to be constructed such that it does not need to acquire a binary
semaphore or mutex for protected access to shared data.
Given the assumption that no other tasks will have their execution interleaved
with the highest priority task, it is possible for this task to be constructed
such that it does not need to acquire a binary semaphore or mutex for protected
access to shared data.
In an SMP system, it cannot be assumed that only a single task will be
executing. It should be assumed that every processor is executing another
application task. Further, those tasks will be ones which would not have
been executed in a uniprocessor configuration and should be assumed to
have data synchronization conflicts with what was formerly the highest
priority task which executed without conflict.
application task. Further, those tasks will be ones which would not have been
executed in a uniprocessor configuration and should be assumed to have data
synchronization conflicts with what was formerly the highest priority task
which executed without conflict.
Disable Preemption
~~~~~~~~~~~~~~~~~~
On a uniprocessor system, disabling preemption in a task is very similar
to making the highest priority task assumption. While preemption is
disabled, no task context switches will occur unless the task initiates
them voluntarily. And, just as with the highest priority task assumption,
there are N-1 processors also running tasks. Thus the assumption that no
other tasks will run while the task has preemption disabled is violated.
On a uniprocessor system, disabling preemption in a task is very similar to
making the highest priority task assumption. While preemption is disabled, no
task context switches will occur unless the task initiates them voluntarily.
And, just as with the highest priority task assumption, on an SMP system there
are N-1 other processors also running tasks. Thus the assumption that no other
tasks will run while the task has preemption disabled is violated.
Task Unique Data and SMP
------------------------
Per task variables are a service commonly provided by real-time operating
systems for application use. They work by allowing the application
to specify a location in memory (typically a ``void *``) which is
logically added to the context of a task. On each task switch, the
location in memory is stored and each task can have a unique value in
the same memory location. This memory location is directly accessed as a
variable in a program.
systems for application use. They work by allowing the application to specify a
location in memory (typically a ``void *``) which is logically added to the
context of a task. On each task switch, the location in memory is stored and
each task can have a unique value in the same memory location. This memory
location is directly accessed as a variable in a program.
This works well in a uniprocessor environment because there is one task
executing and one memory location containing a task-specific value. But
it is fundamentally broken on an SMP system because there are always N
tasks executing. With only one location in memory, N-1 tasks will not
have the correct value.
executing and one memory location containing a task-specific value. But it is
fundamentally broken on an SMP system because there are always N tasks
executing. With only one location in memory, N-1 tasks will not have the
correct value.
This paradigm for providing task unique data values is fundamentally
broken on SMP systems.
This paradigm for providing task unique data values is fundamentally broken on
SMP systems.
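A sketch of an SMP-safe alternative is compiler-provided thread-local storage,
assuming the tool chain and target architecture support TLS; each task then has
its own instance of the variable regardless of the processor it executes on.

.. code-block:: c

    /* One instance of this variable exists per task (C11 thread-local
     * storage), so tasks running in parallel on different processors do
     * not interfere with each other.
     */
    static _Thread_local void *task_unique_data;

    void example_set_task_data(void *value)
    {
      task_unique_data = value; /* affects only the calling task */
    }

    void *example_get_task_data(void)
    {
      return task_unique_data;
    }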
Classic API Per Task Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -479,50 +495,54 @@ configuration of libgomp. In addition application configurable thread pools
for each scheduler instance are available in GCC 6.1 or later.
The run-time configuration of libgomp is done via environment variables
documented in the `libgomp
manual <https://gcc.gnu.org/onlinedocs/libgomp/>`_. The environment variables are evaluated in a constructor function
which executes in the context of the first initialization task before the
actual initialization task function is called (just like a global C++
constructor). To set application specific values, a higher priority
constructor function must be used to set up the environment variables.
documented in the `libgomp manual <https://gcc.gnu.org/onlinedocs/libgomp/>`_.
The environment variables are evaluated in a constructor function which
executes in the context of the first initialization task before the actual
initialization task function is called (just like a global C++ constructor).
To set application specific values, a higher priority constructor function must
be used to set up the environment variables.
.. code:: c
#include <stdlib.h>
void __attribute__((constructor(1000))) config_libgomp( void )
{
setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 );
setenv( "GOMP_SPINCOUNT", "30000", 1 );
setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@SCHD", 1 );
setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 );
setenv( "GOMP_SPINCOUNT", "30000", 1 );
setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@SCHD", 1 );
}
The environment variable ``GOMP_RTEMS_THREAD_POOLS`` is RTEMS-specific. It
determines the thread pools for each scheduler instance. The format for``GOMP_RTEMS_THREAD_POOLS`` is a list of optional``<thread-pool-count>[$<priority>]@<scheduler-name>`` configurations
separated by ``:`` where:
determines the thread pools for each scheduler instance. The format for
``GOMP_RTEMS_THREAD_POOLS`` is a list of optional
``<thread-pool-count>[$<priority>]@<scheduler-name>`` configurations separated
by ``:`` where:
- ``<thread-pool-count>`` is the thread pool count for this scheduler
instance.
- ``<thread-pool-count>`` is the thread pool count for this scheduler instance.
- ``$<priority>`` is an optional priority for the worker threads of a
thread pool according to ``pthread_setschedparam``. In case a priority
value is omitted, then a worker thread will inherit the priority of the OpenMP
master thread that created it. The priority of the worker thread is not
changed by libgomp after creation, even if a new OpenMP master thread using the
worker has a different priority.
- ``$<priority>`` is an optional priority for the worker threads of a thread
pool according to ``pthread_setschedparam``. In case a priority value is
omitted, then a worker thread will inherit the priority of the OpenMP master
thread that created it. The priority of the worker thread is not changed by
libgomp after creation, even if a new OpenMP master thread using the worker
has a different priority.
- ``@<scheduler-name>`` is the scheduler instance name according to the
RTEMS application configuration.
- ``@<scheduler-name>`` is the scheduler instance name according to the RTEMS
application configuration.
In case no thread pool configuration is specified for a scheduler instance,
then each OpenMP master thread of this scheduler instance will use its own
dynamically allocated thread pool. To limit the worker thread count of the
thread pools, each OpenMP master thread must call ``omp_set_num_threads``.
Lets suppose we have three scheduler instances ``IO``, ``WRK0``, and``WRK1`` with ``GOMP_RTEMS_THREAD_POOLS`` set to``"1@WRK0:3$4@WRK1"``. Then there are no thread pool restrictions for
scheduler instance ``IO``. In the scheduler instance ``WRK0`` there is
one thread pool available. Since no priority is specified for this scheduler
instance, the worker thread inherits the priority of the OpenMP master thread
that created it. In the scheduler instance ``WRK1`` there are three thread
pools available and their worker threads run at priority four.
Let's suppose we have three scheduler instances ``IO``, ``WRK0``, and ``WRK1``
with ``GOMP_RTEMS_THREAD_POOLS`` set to ``"1@WRK0:3$4@WRK1"``. Then there are
no thread pool restrictions for scheduler instance ``IO``. In the scheduler
instance ``WRK0`` there is one thread pool available. Since no priority is
specified for this scheduler instance, the worker thread inherits the priority
of the OpenMP master thread that created it. In the scheduler instance
``WRK1`` there are three thread pools available and their worker threads run at
priority four.
Thread Dispatch Details
-----------------------
@ -548,10 +568,10 @@ variables,
Updates of the heir thread and the thread dispatch necessary indicator are
synchronized via explicit memory barriers without the use of locks. A thread
can be an heir thread on at most one processor in the system. The thread context
is protected by a TTAS lock embedded in the context to ensure that it is used
on at most one processor at a time. The thread post-switch actions use a
per-processor lock. This implementation turned out to be quite efficient and
can be an heir thread on at most one processor in the system. The thread
context is protected by a TTAS lock embedded in the context to ensure that it
is used on at most one processor at a time. The thread post-switch actions use
a per-processor lock. This implementation turned out to be quite efficient and
no lock contention was observed in the test suite.
The current implementation of thread dispatching has some implications with
@ -607,31 +627,34 @@ lock individual tasks to specific processors. In this way, one can designate a
processor for I/O tasks, another for computation, etc. The following
illustrates the code sequence necessary to assign a task an affinity for the
processor with index ``processor_index``.
.. code:: c
#include <rtems.h>
#include <assert.h>
void pin_to_processor(rtems_id task_id, int processor_index)
{
rtems_status_code sc;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(processor_index, &cpuset);
sc = rtems_task_set_affinity(task_id, sizeof(cpuset), &cpuset);
assert(sc == RTEMS_SUCCESSFUL);
rtems_status_code sc;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(processor_index, &cpuset);
sc = rtems_task_set_affinity(task_id, sizeof(cpuset), &cpuset);
assert(sc == RTEMS_SUCCESSFUL);
}
It is important to note that the ``cpuset`` is not validated until the``rtems_task_set_affinity`` call is made. At that point,
it is validated against the current system configuration.
It is important to note that the ``cpuset`` is not validated until the
``rtems_task_set_affinity`` call is made. At that point, it is validated
against the current system configuration.
Directives
==========
This section details the symmetric multiprocessing services. A subsection
is dedicated to each of these services and describes the calling sequence,
related constants, usage, and status codes.
This section details the symmetric multiprocessing services. A subsection is
dedicated to each of these services and describes the calling sequence, related
constants, usage, and status codes.
.. COMMENT: rtems_get_processor_count
.. _rtems_get_processor_count:
GET_PROCESSOR_COUNT - Get processor count
-----------------------------------------
@ -660,7 +683,7 @@ maximum count of application configured processors.
None.
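**EXAMPLE:**

A small sketch combining this directive with ``rtems_get_current_processor()``
to report where the calling task currently executes:

.. code-block:: c

    #include <rtems.h>
    #include <stdio.h>

    void example_report_processor(void)
    {
      uint32_t count = rtems_get_processor_count();
      uint32_t current = rtems_get_current_processor();

      printf(
        "executing on processor %lu of %lu\n",
        (unsigned long) current,
        (unsigned long) count
      );
    }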
.. COMMENT: rtems_get_current_processor
.. _rtems_get_current_processor:
GET_CURRENT_PROCESSOR - Get current processor index
---------------------------------------------------
@ -692,8 +715,7 @@ thread dispatching disabled.
None.
.. COMMENT: rtems_scheduler_ident
.. _rtems_scheduler_ident:
SCHEDULER_IDENT - Get ID of a scheduler
---------------------------------------
@ -703,17 +725,24 @@ SCHEDULER_IDENT - Get ID of a scheduler
.. code:: c
rtems_status_code rtems_scheduler_ident(
rtems_name name,
rtems_id \*id
rtems_name name,
rtems_id *id
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ADDRESS`` - ``id`` is NULL
``RTEMS_INVALID_NAME`` - invalid scheduler name
``RTEMS_UNSATISFIED`` - - a scheduler with this name exists, but
the processor set of this scheduler is empty
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ADDRESS``
- ``id`` is NULL
* - ``RTEMS_INVALID_NAME``
- invalid scheduler name
* - ``RTEMS_UNSATISFIED``
- a scheduler with this name exists, but the processor set of this scheduler
is empty
**DESCRIPTION:**
@ -724,7 +753,7 @@ scheduler configuration. See `Configuring a System`_.
None.
.. COMMENT: rtems_scheduler_get_processor_set
.. _rtems_scheduler_get_processor_set:
SCHEDULER_GET_PROCESSOR_SET - Get processor set of a scheduler
--------------------------------------------------------------
@ -734,30 +763,37 @@ SCHEDULER_GET_PROCESSOR_SET - Get processor set of a scheduler
.. code:: c
rtems_status_code rtems_scheduler_get_processor_set(
rtems_id scheduler_id,
size_t cpusetsize,
cpu_set_t \*cpuset
rtems_id scheduler_id,
size_t cpusetsize,
cpu_set_t *cpuset
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS_INVALID_ID`` - invalid scheduler id
``RTEMS_INVALID_NUMBER`` - the affinity set buffer is too small for
set of processors owned by the scheduler
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ADDRESS``
- ``cpuset`` is NULL
* - ``RTEMS_INVALID_ID``
- invalid scheduler id
* - ``RTEMS_INVALID_NUMBER``
- the affinity set buffer is too small for set of processors owned by the
scheduler
**DESCRIPTION:**
Returns the processor set owned by the scheduler in ``cpuset``. A set bit
in the processor set means that this processor is owned by the scheduler and a
Returns the processor set owned by the scheduler in ``cpuset``. A set bit in
the processor set means that this processor is owned by the scheduler and a
cleared bit means the opposite.
**NOTES:**
None.
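**EXAMPLE:**

A sketch that looks up a scheduler by its configured name and prints the
processors it owns; error handling is reduced to early returns:

.. code-block:: c

    #include <rtems.h>
    #include <stdio.h>

    void example_show_scheduler_processors(rtems_name name)
    {
      rtems_status_code sc;
      rtems_id scheduler_id;
      cpu_set_t cpuset;
      uint32_t cpu;

      sc = rtems_scheduler_ident(name, &scheduler_id);
      if (sc != RTEMS_SUCCESSFUL)
        return;

      sc = rtems_scheduler_get_processor_set(
        scheduler_id,
        sizeof(cpuset),
        &cpuset
      );
      if (sc != RTEMS_SUCCESSFUL)
        return;

      for (cpu = 0; cpu < rtems_get_processor_count(); ++cpu) {
        if (CPU_ISSET((int) cpu, &cpuset))
          printf("scheduler owns processor %lu\n", (unsigned long) cpu);
      }
    }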
.. COMMENT: rtems_task_get_scheduler
.. _rtems_task_get_scheduler:
TASK_GET_SCHEDULER - Get scheduler of a task
--------------------------------------------
@ -767,26 +803,32 @@ TASK_GET_SCHEDULER - Get scheduler of a task
.. code:: c
rtems_status_code rtems_task_get_scheduler(
rtems_id task_id,
rtems_id \*scheduler_id
rtems_id task_id,
rtems_id *scheduler_id
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ADDRESS`` - ``scheduler_id`` is NULL
``RTEMS_INVALID_ID`` - invalid task id
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ADDRESS``
- ``scheduler_id`` is NULL
* - ``RTEMS_INVALID_ID``
- invalid task id
**DESCRIPTION:**
Returns the scheduler identifier of a task identified by ``task_id`` in``scheduler_id``.
Returns the scheduler identifier of a task identified by ``task_id`` in
``scheduler_id``.
**NOTES:**
None.
.. COMMENT: rtems_task_set_scheduler
.. _rtems_task_set_scheduler:
TASK_SET_SCHEDULER - Set scheduler of a task
--------------------------------------------
@ -796,22 +838,27 @@ TASK_SET_SCHEDULER - Set scheduler of a task
.. code:: c
rtems_status_code rtems_task_set_scheduler(
rtems_id task_id,
rtems_id scheduler_id
rtems_id task_id,
rtems_id scheduler_id
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ID`` - invalid task or scheduler id
``RTEMS_INCORRECT_STATE`` - the task is in the wrong state to
perform a scheduler change
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ID``
- invalid task or scheduler id
* - ``RTEMS_INCORRECT_STATE``
- the task is in the wrong state to perform a scheduler change
**DESCRIPTION:**
Sets the scheduler of a task identified by ``task_id`` to the scheduler
identified by ``scheduler_id``. The scheduler of a task is initialized to
the scheduler of the task that created it.
identified by ``scheduler_id``. The scheduler of a task is initialized to the
scheduler of the task that created it.
**NOTES:**
@ -819,36 +866,44 @@ None.
**EXAMPLE:**
.. code:: c
.. code-block:: c
:linenos:
#include <rtems.h>
#include <assert.h>
void task(rtems_task_argument arg);
void example(void)
{
rtems_status_code sc;
rtems_id task_id;
rtems_id scheduler_id;
rtems_name scheduler_name;
scheduler_name = rtems_build_name('W', 'O', 'R', 'K');
sc = rtems_scheduler_ident(scheduler_name, &scheduler_id);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_create(
rtems_build_name('T', 'A', 'S', 'K'),
1,
RTEMS_MINIMUM_STACK_SIZE,
RTEMS_DEFAULT_MODES,
RTEMS_DEFAULT_ATTRIBUTES,
&task_id
);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_set_scheduler(task_id, scheduler_id);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_start(task_id, task, 0);
assert(sc == RTEMS_SUCCESSFUL);
rtems_status_code sc;
rtems_id task_id;
rtems_id scheduler_id;
rtems_name scheduler_name;
scheduler_name = rtems_build_name('W', 'O', 'R', 'K');
sc = rtems_scheduler_ident(scheduler_name, &scheduler_id);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_create(
rtems_build_name('T', 'A', 'S', 'K'),
1,
RTEMS_MINIMUM_STACK_SIZE,
RTEMS_DEFAULT_MODES,
RTEMS_DEFAULT_ATTRIBUTES,
&task_id
);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_set_scheduler(task_id, scheduler_id);
assert(sc == RTEMS_SUCCESSFUL);
sc = rtems_task_start(task_id, task, 0);
assert(sc == RTEMS_SUCCESSFUL);
}
.. COMMENT: rtems_task_get_affinity
.. _rtems_task_get_affinity:
TASK_GET_AFFINITY - Get task processor affinity
-----------------------------------------------
@ -858,18 +913,25 @@ TASK_GET_AFFINITY - Get task processor affinity
.. code:: c
rtems_status_code rtems_task_get_affinity(
rtems_id id,
size_t cpusetsize,
cpu_set_t \*cpuset
rtems_id id,
size_t cpusetsize,
cpu_set_t *cpuset
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS_INVALID_ID`` - invalid task id
``RTEMS_INVALID_NUMBER`` - the affinity set buffer is too small for
the current processor affinity set of the task
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ADDRESS``
- ``cpuset`` is NULL
* - ``RTEMS_INVALID_ID``
- invalid task id
* - ``RTEMS_INVALID_NUMBER``
- the affinity set buffer is too small for the current processor affinity
set of the task
**DESCRIPTION:**
@ -881,7 +943,7 @@ cleared bit means the opposite.
None.
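**EXAMPLE:**

A sketch that reads back the affinity set of a task and lists the processors on
which it may execute, mirroring the ``pin_to_processor()`` example above:

.. code-block:: c

    #include <rtems.h>
    #include <assert.h>
    #include <stdio.h>

    void example_show_affinity(rtems_id task_id)
    {
      rtems_status_code sc;
      cpu_set_t cpuset;
      uint32_t cpu;

      sc = rtems_task_get_affinity(task_id, sizeof(cpuset), &cpuset);
      assert(sc == RTEMS_SUCCESSFUL);

      for (cpu = 0; cpu < rtems_get_processor_count(); ++cpu) {
        if (CPU_ISSET((int) cpu, &cpuset))
          printf("task may execute on processor %lu\n", (unsigned long) cpu);
      }
    }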
.. COMMENT: rtems_task_set_affinity
.. _rtems_task_set_affinity:
TASK_SET_AFFINITY - Set task processor affinity
-----------------------------------------------
@ -891,17 +953,24 @@ TASK_SET_AFFINITY - Set task processor affinity
.. code:: c
rtems_status_code rtems_task_set_affinity(
rtems_id id,
size_t cpusetsize,
const cpu_set_t \*cpuset
rtems_id id,
size_t cpusetsize,
const cpu_set_t *cpuset
);
**DIRECTIVE STATUS CODES:**
``RTEMS_SUCCESSFUL`` - successful operation
``RTEMS_INVALID_ADDRESS`` - ``cpuset`` is NULL
``RTEMS_INVALID_ID`` - invalid task id
``RTEMS_INVALID_NUMBER`` - invalid processor affinity set
.. list-table::
:class: rtems-table
* - ``RTEMS_SUCCESSFUL``
- successful operation
* - ``RTEMS_INVALID_ADDRESS``
- ``cpuset`` is NULL
* - ``RTEMS_INVALID_ID``
- invalid task id
* - ``RTEMS_INVALID_NUMBER``
- invalid processor affinity set
**DESCRIPTION:**
@ -921,9 +990,3 @@ locking protocols may temporarily use processors that are not included in the
processor affinity set of the task. It is also not an error if the processor
affinity set contains processors that are not part of the system.
.. COMMENT: COPYRIGHT (c) 2011,2015
.. COMMENT: Aeroflex Gaisler AB
.. COMMENT: All rights reserved.