transmission and receipt of data. Communication watchdog timers
should also be employed by every module on a bus to detect a loss of
bus activity. Safety PLCs will automatically set their outputs to a
predetermined safe state (OFF) when an I/O module has lost
communication with its control module. Redundant communications
paths, standard in safety PLCs, should be considered for general PLCs
for higher availability.
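The watchdog behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular safety PLC implements it: the class name `CommWatchdog`, its methods, and the choice to latch the trip until an explicit restart are all assumptions made for the example.

```python
import time


class CommWatchdog:
    """Hypothetical I/O-module watchdog: forces outputs OFF on loss of bus activity."""

    def __init__(self, timeout_s, n_outputs, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock                   # injectable so the timer can be tested
        self.outputs = [False] * n_outputs   # False = de-energised (OFF)
        self.last_rx = clock()
        self.tripped = False                 # assumption: the trip is latched

    def message_received(self):
        """Call whenever a valid message arrives from the control module."""
        self.last_rx = self.clock()

    def poll(self):
        """Check for loss of bus activity; force the safe state on timeout."""
        if self.clock() - self.last_rx > self.timeout_s:
            self.tripped = True
            self.outputs = [False] * len(self.outputs)
        return self.tripped

    def write_outputs(self, values):
        """Apply commanded outputs unless the watchdog has tripped."""
        self.poll()
        if not self.tripped:
            self.outputs = list(values)
        return self.outputs
```

In this sketch the module keeps its outputs in the safe state after a timeout even if a later write is attempted, which mirrors the idea of a predetermined safe state rather than any specific product behaviour.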
To ensure that input data originates from the correct module and is
delivered to the correct module, the processor should incorporate some form of
address verification. Safety PLCs use redundant serial data links to
communicate between the processor and the I/O modules. Serial
communications allow for source and destination addressing to be
embedded into messages and compared with the hardware address
established by the backplane. Parallel backplane designs typically
found in general-purpose PLCs do not usually incorporate any address
verification.
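As a rough illustration of address verification on a serial link, the following Python sketch checks the source and destination addresses embedded in a message against the addresses the receiver expects. The `Message` layout and the function `accept_message` are hypothetical and are not taken from any real PLC protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical serial message with embedded addressing."""
    source: int        # address of the sending module
    destination: int   # address of the intended receiver
    payload: bytes


def accept_message(msg, my_hw_address, expected_source):
    """Accept the message only if both embedded addresses match.

    my_hw_address stands in for the hardware address established by the
    backplane slot; expected_source is the module the data should come from.
    """
    return msg.destination == my_hw_address and msg.source == expected_source
```

A message that arrives at the wrong slot, or that claims to come from an unexpected module, is simply rejected in this sketch; what a real system does next (retry, alarm, trip) is outside its scope.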
Memory Corruption and Losses
All programmable control system memory (RAM, ROM, and EEPROM)
should be fully tested upon power-up and continuously tested on-line
with background diagnostics. Volatile memory (RAM) should be battery
backed, and a low-battery diagnostic should indicate to the operator
when a battery needs to be replaced.
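Two common styles of memory test can be sketched as follows, assuming a CRC comparison for non-volatile program memory and a read-back pattern test for RAM. The function names and the 0x55/0xAA patterns are choices made for this example; the paper does not specify which tests a given safety PLC runs.

```python
import zlib


def rom_ok(image: bytes, expected_crc: int) -> bool:
    """Compare the CRC-32 of a program image with a stored reference value."""
    return zlib.crc32(image) == expected_crc


def ram_ok(ram: bytearray) -> bool:
    """Pattern test: write 0x55 and 0xAA to each cell, read back, then restore.

    A bytearray stands in for physical RAM here, so this only shows the
    shape of the test; on real hardware the read-back can actually fail.
    """
    for i in range(len(ram)):
        saved = ram[i]
        for pattern in (0x55, 0xAA):
            ram[i] = pattern
            if ram[i] != pattern:
                return False
        ram[i] = saved   # restore so the test can run on-line in the background
    return True
```

Running `ram_ok` over a small region at a time is one way such a test could be spread across scan cycles as a background diagnostic.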
A “common cause” failure is defined as the failure of two or more
similar components due to a single stress event (a single cause). The
key word here is “stress.” Stressor events include electrical events like
power spikes, lightning, and high current levels. Mechanical stress
includes shock and vibration. Chemical stress includes corrosive
atmospheres, salt air, and humidity. Physical stress includes
temperature. Even heavy usage, including high data rates, is a stress,
especially to system software. If the stress level is high enough, two
or more similar components can fail at the same time.
Software may be the most significant contributor of all to the common
cause failure rate. A “stress” to a software system is the combination
of inputs, timing, and stored data seen by the CPU. Imagine a fault
tolerant system with two or three processors where all the CPUs are
running the exact same program in lock-step synchronous operation.
The CPUs will all see the exact same inputs and the same stored data, with
the same timing. The chance of simultaneous failure due to a common
software bug is high.
A Safety PLC can achieve “common cause strength” through a number
of measures:
· Physical separation of redundant units. The worst implementation has
redundant circuits on the same circuit board. The best implementation
allows redundant circuits to be located in different cabinets.
· Asynchronous operation of redundant units to reduce software
common cause. The worst implementation has identical software
running the same functionality in perfect synchronisation. The best
implementation runs asynchronously with different operating modes
between redundant units.
· Diversity. The worst implementation has identical software and hardware
in redundant units. The best implementation uses diverse components
that respond differently to a common stress.
· High strength hardware and software. Other important parameters
include the overall ruggedness of the safety PLC and the use of a
systematic audited software development process.
BMS Safety PLC System Architectures
A specially designed safety PLC typically provides high reliability and
high safety via special electronics, special software and pre-engineered
redundancy. The safety PLC has I/O circuits that are designed to be
fail-safe with built-in diagnostics. The CPU of a safety PLC has built-in
diagnostics for memory, CPU operation, watchdog timer and all
communications systems. I/O module addressing is done via serial
communications messages that have full automatic error checking.
Figure 9 shows the architecture of a non-redundant safety PLC. The
1oo1D (one out of one with diagnostics) architecture uses the special
diagnostic circuits to convert dangerous failures into safe failures by
de-energising the output. This is the most cost effective safety PLC
solution and meets IEC 61508 SIL 2 requirements.
Figure 9. The 1oo1D architecture uses special diagnostic circuits to convert dangerous
failures into safe failures.
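The 1oo1D principle can be reduced to a single Boolean expression: the output is energised only when the logic solver requests it and the diagnostics report no fault. The function name below is hypothetical, and the sketch ignores timing and diagnostic coverage.

```python
def output_1oo1d(logic_demands_on: bool, diagnostics_ok: bool) -> bool:
    """Energise the output only if the logic asks for it AND diagnostics pass.

    A detected fault therefore de-energises the output, which is how a
    dangerous failure is converted into a safe one in this simplified model.
    """
    return logic_demands_on and diagnostics_ok
```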
When high availability is important in addition to safety, a redundant
architecture can be used. Two primary architectures are used, 2oo3
and 1oo2D. Figure 10 shows the 2oo3 (two out of three) architecture
that was designed to provide high safety and high availability. It is
typically implemented with three physical sets of electronics. Each set
of electronics includes the input circuitry, a logic solver, and output
circuitry. A 2oo3 system can tolerate a one-unit failure but is more
susceptible to common cause failures than the 1oo2D. Also, because the 2oo3
architecture requires more hardware, it can be complex and
expensive to implement.
Figure 10. The 2oo3 architecture is designed to provide safety and
availability.
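Two-out-of-three voting is a majority function over the three channels. The sketch below assumes each channel delivers a single Boolean result; the name `vote_2oo3` is chosen for this example only.

```python
def vote_2oo3(a: bool, b: bool, c: bool) -> bool:
    """Return the majority result of three channels (two out of three)."""
    return (a and b) or (a and c) or (b and c)
```

With this voter, one channel failing in either direction is outvoted by the other two, which is why a 2oo3 system can tolerate a one-unit failure.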
Figure 11 shows the 1oo2D (one out of two with diagnostics)
architecture. It was designed to provide high safety, high availability
and high common cause strength at a lower cost than a 2oo3 system.
It is simple to implement with typically two physical sets of
electronics. Each set of electronics includes the input circuitry, a logic
solver, and output circuitry. Each circuit has special diagnostic
circuitry that combines to form another logical channel. When two
sets of electronics are combined, a four-channel architecture
results.
Conceptually, each of the two units reads inputs, calculates, and
stores outputs. The diagnostic circuits monitor proper operation and
will de-energise a second series output switch if a failure is detected.
Any potentially dangerous failure is converted into a safe failure if
detected by the diagnostics. If the diagnostics work perfectly, the
system is fail safe. High availability is achieved through the parallel
combination of the two sets of electronics. If one side fails safely, the
other side maintains the load and the protection function.
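The paragraph above can be modelled as two 1oo1D units whose outputs are wired in parallel. This is a simplified Boolean sketch under that assumption; the function names are hypothetical and real diagnostics are never as clean as a single flag.

```python
def unit_output(logic_on: bool, diagnostics_ok: bool) -> bool:
    """One unit: its series diagnostic switch opens when a fault is detected."""
    return logic_on and diagnostics_ok


def load_1oo2d(a_logic_on: bool, a_diag_ok: bool,
               b_logic_on: bool, b_diag_ok: bool) -> bool:
    """Two units in parallel: the load stays energised if either unit drives it."""
    return unit_output(a_logic_on, a_diag_ok) or unit_output(b_logic_on, b_diag_ok)
```

If unit A fails and its diagnostics detect the fault, A drops out and unit B alone decides whether the load is energised, so both the load and the protection function are maintained.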
The 1oo2D architecture requires good self-diagnostics. Diagnostic
techniques have improved considerably; however, it is debatable whether
perfect self-diagnostics can be achieved. Therefore, in order to assure
high safety integrity, actual implementations of the 1oo2D provide
interprocessor communication between the logic solvers. A
comparison of input data and calculation results between the two units
provides complete protection in addition to the self-diagnostics. When
the comparison in either unit detects a mismatch, the system is
de-energised.
Figure 11. The 1oo2D architecture provides safety, via diagnostics
and extra series output switches, availability and common
cause strength.
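The interprocessor comparison can be sketched as a check that both units saw the same inputs and computed the same result. The `UnitState` type and `compared_output` function are hypothetical, the sketch assumes a mismatch leads to the de-energised state, and it ignores the tolerance a real system would need for units that sample their inputs at slightly different times.

```python
from typing import NamedTuple, Tuple


class UnitState(NamedTuple):
    """Hypothetical snapshot exchanged between the two logic solvers."""
    inputs: Tuple[bool, ...]   # input data as read by this unit
    output_on: bool            # calculation result of this unit


def compared_output(a: UnitState, b: UnitState) -> bool:
    """Return the output command; any mismatch forces the de-energised state."""
    if a.inputs != b.inputs or a.output_on != b.output_on:
        return False           # mismatch detected: de-energise
    return a.output_on
```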
There are many aspects of a Burner Management System that
contribute to its operating safety and meeting IEC 61508 and
regulatory agency requirements. For example, although not covered by this
paper, much can be done with flame detectors, field sensors and
actuators, such as voting redundant sensors, using analog transmitters
in place of switch interlocks, and installing limit switches on valves.
More certified field sensors designed to meet the standards are also
becoming available. However, the device that controls
all of the system I/O plays a major role in the operating safety of the
system. Selection of the control system is as critical as, if not more
critical than, the selection of the associated field hardware.
Depending on the mix of analog and digital I/O, the cost of a modern
safety PLC will not be much higher than that of a conventional PLC. In
addition, one significant advantage of the safety PLC is eliminating the
special engineering and application level programming required in the
conventional PLC. None of the special circuits shown in Figures 1, 3, 5
& 7 are needed when using a safety PLC. The installed cost of a safety
PLC can be significantly lower than that of a conventional PLC when
engineering and installation expenses are considered for burner