REAL-TIME, FAULT-TOLERANT, DISTRIBUTED SYSTEMS ARCHITECTURE

Subject presentation

Constraints on embedded digital systems, such as processing power, modularity, maintainability and autonomy, require hardware and software architectures that meet distribution, real-time and fault-tolerance requirements.

A distributed architecture meets the constraints of critical embedded systems insofar as it offers:

The needed processing power

The aim is to distribute processing tasks (while keeping access to shared resources) and to exploit potential parallelism between processing units.

A way to implement complex applications

Distribution leads to a programming approach that exploits both the locality of communications and the decomposition of a global problem into independent sub-problems, which can be implemented more easily.

Some versatility

A distributed architecture is modular, which allows it to evolve with processing needs. Such versatility can be achieved only if the workload distribution can accommodate a variable number of processing units.

Real-time constraints handling

Real-time processing relies on the ability of the system to take into account (and to process) certain events within given, proven time bounds. This assumes that real-time scheduling strategies can be adapted to a distributed environment.

Safety and dependability improvement

Redundancy, together with hardware and software reconfiguration capabilities, allows either the complete masking of some faults or recovery (if necessary in a degraded mode) after fault detection and handling.
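
As an illustration of fault masking through redundancy, here is a minimal sketch of a majority voter over replicated computations. It is not taken from the work described here; the function name `majority_vote` and the replica values are assumptions chosen for the example.

```python
from collections import Counter


def majority_vote(replica_outputs):
    """Return the value produced by a strict majority of replicas, or None.

    Hypothetical helper for illustration: each entry of replica_outputs is
    the result computed by one redundant processing unit.
    """
    if not replica_outputs:
        return None
    value, count = Counter(replica_outputs).most_common(1)[0]
    # A strict majority is required so that a minority of faulty
    # replicas cannot impose its value.
    return value if count > len(replica_outputs) // 2 else None


# Three replicas of the same computation, one of them faulty:
# the fault is masked and the correct value is delivered.
masked = majority_vote([42, 42, 17])

# No majority: the fault is detected but cannot be masked, so the
# system has to fall back on recovery.
undecided = majority_vote([1, 2, 3])
```

With three replicas, a single faulty result is masked; when no strict majority exists, the voter returns `None`, which corresponds to the detection-then-recovery case.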

This research subject therefore deals with the techniques needed to implement a distributed architecture and to ensure optimal exploitation of the parallelism among processing tasks in a real-time environment. It also requires tolerating faults in some elements of the system.

Research topics

A first study dealt with the specification and validation of the mechanisms needed to manage a reconfigurable distributed architecture. These mechanisms support:

- functional reconfiguration, which aims to determine an optimal interconnection topology for a given step of the application;

- and fault-tolerant reconfiguration, whose objective is to confine a faulty processor and (if possible) replace it with a spare one so that the application can continue.
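
Fault-tolerant reconfiguration as described above can be sketched as follows. This is a simplified illustration, not the mechanism specified in the study: the function `reconfigure`, the processor names and the list-based model of the architecture are all assumptions.

```python
def reconfigure(active, spares, faulty):
    """Confine `faulty` and, if a spare exists, put it in the same slot.

    active: names of the processors currently running the application
    spares: names of idle spare processors
    Returns (new_active, new_spares, degraded); the inputs are not modified.
    """
    if faulty not in active:
        raise ValueError(f"{faulty} is not an active processor")
    new_active = list(active)
    new_spares = list(spares)
    slot = new_active.index(faulty)
    if new_spares:
        # A spare takes over the slot of the confined processor.
        new_active[slot] = new_spares.pop(0)
        return new_active, new_spares, False
    # No spare left: the application continues in degraded mode
    # on fewer processors.
    del new_active[slot]
    return new_active, new_spares, True


# First fault: the spare "s0" replaces "p1".
active, spares, degraded = reconfigure(["p0", "p1", "p2"], ["s0"], "p1")

# Second fault with no spare left: "p2" is confined, degraded mode.
active2, spares2, degraded2 = reconfigure(active, spares, "p2")
```

After the first call the active set is `["p0", "s0", "p2"]` with no spare remaining; after the second it is `["p0", "s0"]` and the degraded flag is set.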

A second research topic began with a survey of the state of the art in hardware and software architectures that address distribution, real-time processing and fault-tolerance constraints for a spaceborne target system.

The next step will be to select concepts, techniques and solutions that can be studied in more detail and integrated into a given application context. A more detailed study of some of these mechanisms (distributed real-time scheduling) will be undertaken and may lead to a prototyping and evaluation phase.
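
To give an idea of what distributed real-time scheduling can involve, the sketch below assigns periodic tasks to processors with a first-fit rule, using the classical Liu and Layland utilization bound for rate-monotonic scheduling as a sufficient test on each processor. This is a textbook technique chosen for illustration, not the approach selected in the study; the task parameters and the function names `rm_bound` and `first_fit_partition` are assumptions.

```python
def rm_bound(n):
    """Liu and Layland utilization bound for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def first_fit_partition(tasks, num_processors):
    """Assign (wcet, period) tasks to processors, first fit.

    A task is placed on the first processor whose task set still passes
    the sufficient rate-monotonic utilization test. Returns one task list
    per processor, or None if some task cannot be placed.
    """
    bins = [[] for _ in range(num_processors)]
    for task in tasks:
        for assigned in bins:
            candidate = assigned + [task]
            utilization = sum(c / t for c, t in candidate)
            if utilization <= rm_bound(len(candidate)):
                assigned.append(task)
                break
        else:
            # No processor can accept this task under the test.
            return None
    return bins


# Hypothetical task set: (worst-case execution time, period).
tasks = [(1, 4), (2, 5), (3, 10), (2, 8)]
two_units = first_fit_partition(tasks, 2)
one_unit = first_fit_partition(tasks, 1)
```

On two processors the tasks split into `[(1, 4), (2, 5)]` and `[(3, 10), (2, 8)]`; on a single processor the test fails and the function returns `None`. Because the utilization bound is only sufficient, a rejected task set is not necessarily unschedulable.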

A third topic will deal with avionics systems based on the integrated modular avionics (IMA) concept. In such systems, the various functions are allocated to a set of processing units belonging to a distributed architecture. The more critical functions are implemented using redundancy and reconfiguration techniques.

The proposed study aims to specify system simulation tools based on a model of the whole modular avionics system.

Activities related to the subject

Participation in the CNRS/MRT Programme de Recherches Coordonnées dealing with new computer architectures.

Teaching at ENSAE: lectures on computer networks and distributed systems.

Publications

Ch. FRABOUL, L. MAILLET, P. SIRON
"Architecture distribuée temps réel tolérante aux pannes"
Rapport final 1/3451/DERI convention DRET no 89.002.00.124 Juillet 1993

Ch. FRABOUL, P. SIRON
"Architecture parallèle reconfigurable tolérante aux pannes"
Rapport final 2/3454/DERI convention CNRS/MRE no 91 S 0288 Décembre 1993

G. CHOPARD
"Analyse, maquettage et validation d'une chaine de développement de programmes parallèles reconfigurables"
Rapport de stage IEE-CNAM septembre 1993

Ch. FRABOUL, P. SIRON
"Un environnement de programmation d'applications distribuées et tolérantes aux pannes sur une architecture parallèle reconfigurable"
AGARD Avionics Panel Meeting on Aerospace Software Engineering for Advanced System Architecture, Paris, 10-13 mai 1993

Ch. FRABOUL, P. SIRON
"Architecture parallèle reconfigurable tolérante aux pannes"
Journées PRC-ANM, Rennes, décembre 1993

Contacts

Christian FRABOUL fraboul@tls-cs.cert.fr
Pierre SIRON siron@CUTtls-cs.cert.fr