Burner Management System
The function of a burner management system (BMS) is to assure safe
operation of the combustion associated with boilers, ovens, kilns,
process heaters and furnaces. The BMS provides a safe start-up
procedure and stops fuel flow if conditions are detected that affect the
safety of the unit.
With the advancement of microprocessor technology, programmable
systems have become the preferred solution for burner management
design. When issues like documentation, configuration management,
diagnostics, capabilities for operator graphics and communications to
other plantwide control systems are considered, the advantages of
programmable technology over relay/solid-state technology become
very significant. Since the failure modes of microprocessor technology
are not readily predictable, the Australian Gas Association (AGA) and a
number of other international standards and regulatory agencies
(NFPA, TUV, FM, IRI) have established recommended practices and
guidelines for applying this technology in burner management
The needs of a Burner Management System.
There are strong economic reasons to ensure combustion equipment
operates safely. These reasons include possible equipment losses,
personnel injury and loss, and production downtime as a result of an
accident. When risk analysis is combined with life cycle costing
techniques, many companies realise that the financial impact of safety
risk is higher than imagined.
Gas & Fuel Authorities are bringing out newer, tougher requirements
including requirements for approvals from independent testing
agencies like TUV. The IEC 61508 standard for the functional safety of
electrical/electronic/programmable electronic (E/E/PE) safety-related
systems has been released and the Australian version, AS 61508, will be
fully published soon. Designing combustion equipment that operates
safely is not becoming easier.
The latest Australian Standard AS3814/AG 501 – 2000 for industrial
and commercial gas-fired appliances states that, for a Programmable
Electronic System (PES) to gain acceptance on Type B appliances, the
following applies, as set out in clause 2.26.3:
“If it is desired to use a PES controller to perform safety-related
functions, then it shall be a redundant safety-related PES and possess
a TUV safety certificate to the appropriate safety class of DIN V 19250
or some equivalent certificate. Only TUV approved “firmware” (or
equivalent) is to be used in the controller.”
“Like computer programs, the only true way of assessing a PES user-
program to ensure that it functions the way it was designed, is to test
run the program. It is not possible to inspect a PES program in its
entirety by visual examination and conclude that the program does
what it is required to do under all possible operating situations.
Therefore in order to ensure the integrity of the PES user software,
the person/company who designed the system shall have QA
accreditation, and shall have adhered to the principles outlined in AS
61508. It is the designer’s responsibility for the development of the
program, and for test-running the program by simulating the inputs,
and proving that the outputs occur at the right time and duration. A
signed written statement to that effect shall be submitted to the
The NFPA 8502 standard for the prevention of furnace
explosions/implosions in multiple burner boilers, 1999 edition, clause
4-3.2.1, lists the following minimum failures that must be evaluated and
(a) Interruptions, excursions, dips, recoveries, transients, and
partial losses of power
(b) Memory corruption and losses
(c) Information transfer corruption and losses
(d) Inputs and outputs (fail-on, fail-off)
(e) Signals that are unreadable or not being read
(f) Failure to address errors
(g) Processor faults
(h) Relay coil failure
(i) Relay contact failure (fail-on, fail-off)
(j) Timer failure
The new FM 7605 standard, first released in January 2000, for
PLC-based burner management systems also requires compliance with the IEC 61508
“The system shall conform at a specified Safety Integrity Level (SIL) to
IEC 61508, Part 1, General requirements. The hardware architecture
shall include self-checking firmware, external and internal watchdog
systems, redundant processors, and dual I/O cards as required to
achieve the specified SIL. Software architecture shall include
communications drivers, fault handling, executive software,
input/output functions, and derived functions as required to achieve
the specified SIL. Redundant components shall be separated so as to
reduce common cause failures.”
This need to meet regulations and properly implement safety
protection equipment adds another dimension to the trade-offs that
must be made by design engineers.
Regardless of these requirements, many control engineers are
selecting programmable electronic systems for burner management
applications. Advantages include ease of installation, lower false trip
rate, math capability and more sophisticated logic capability. In newer
generation PLCs, other benefits include IEC 61131 standard language
capability, self-documenting graphical configuration and management
of change functions, among a growing list of other user-friendly tools.
With all these advantages, why not? The big problem is that solid-
state components can fail in several ways, many of which may create
dangerous undetectable failures.
The BMS maintains safe operation of the boiler during start-up,
operation, and shutdown. Both PLCs and DCSs can accommodate
safety and process control in a single processor, but the National Fire
Protection Association, Factory Mutual Research Corporation, and good
engineering practice call for independence between burner
management systems and all other control systems.
Early automated BMS were either proprietary hardware or relay based.
Since the 1980s, PLCs have been preferred for their reliability,
flexibility, configurability, and lower life cycle cost.
With any automated electronic control-based system, the designer
must pay close attention to failure modes. Safety features that can be
designed into a BMS include input checking, critical output monitoring,
external watchdog circuit, coil monitoring, fuse monitoring, circuit
breaker monitoring, and related alarming and diagnostics.
Many other processes in a power house can be controlled with PLCs to
cut installed system cost, reduce spare parts requirements, speed
maintenance and operator training, and ease installation and
Output monitoring (or readback) is a technique that uses an input
channel to measure an output channel’s value and compares it to the
value demanded by the system logic. This diagnostic can determine if
the output has failed ON or failed OFF. Figure 1 shows how output
monitoring is typically implemented in a PLC. Ladder logic must be
written to ensure that each output is compared with its corresponding
diagnostic input channel and appropriate diagnostics are generated.
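The comparison that the ladder logic performs can be sketched in Python as follows. This is a minimal illustration only, not PLC code: the names OutputFault and check_output are hypothetical, and a real implementation would also have to allow a settling time between changing the output and sampling the readback input.

```python
from enum import Enum


class OutputFault(Enum):
    NONE = "none"
    FAILED_ON = "failed_on"    # output energised although logic demands OFF
    FAILED_OFF = "failed_off"  # output de-energised although logic demands ON


def check_output(demanded: bool, readback: bool) -> OutputFault:
    """Compare the demanded output state with its diagnostic input (readback)."""
    if demanded == readback:
        return OutputFault.NONE
    # The states disagree: the readback tells us which way the output failed.
    return OutputFault.FAILED_ON if readback else OutputFault.FAILED_OFF
```

In this sketch a FAILED_ON result is the potentially dangerous case for a de-energise-to-trip output, since the load stays powered when the logic has demanded that it be switched off.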
Safety PLCs incorporate output monitoring into their I/O module
hardware using special circuitry and an onboard microprocessor to
generate the diagnostics, as illustrated in Figure 2. This eliminates the
wiring and programming required by general purpose PLCs.
Furthermore, this relieves the application controller from the burden of
generating these diagnostics.
Output monitoring provides valuable diagnostic information. On its
own, however, it can do nothing more than annunciate the problem. To
convert the potentially dangerous failure into a safe failure, an
additional technique must be applied alongside the output monitoring.
Series-wired trip relays can be incorporated to “protect” the
monitored outputs. Figure 3 illustrates the typical addition of a trip
relay to the general purpose PLC output monitoring in Figure 1. The
output to the trip relay is programmed to de-energise if any of the
outputs it is protecting reports a dangerous fault. This provides a
secondary means of de-energising an output if, for some reason, the
output fails to turn off when commanded. Additionally, a contact of
the trip relay should be monitored to ensure that it is functioning
properly. The trip relay must be manually reset before it can be re-
energised. This can be accomplished by wiring a reset pushbutton to
an input circuit or via an engineer’s console.
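The latching behaviour described above can be sketched as follows. This is an illustrative model under stated assumptions, not a certified implementation: the class TripRelay and its method names are hypothetical, the relay is assumed to start in the tripped (safe) state, and a fault is assumed to take priority over the reset pushbutton.

```python
class TripRelay:
    """Latching trip relay: drops out on any dangerous fault, manual reset only."""

    def __init__(self) -> None:
        self.energised = False  # assumption: start in the safe (tripped) state

    def update(self, dangerous_fault: bool, reset_pressed: bool) -> bool:
        """Evaluate one scan and return the commanded relay state."""
        if dangerous_fault:
            self.energised = False  # a fault always wins over the reset
        elif reset_pressed:
            self.energised = True   # manual reset, accepted only with no fault
        return self.energised       # otherwise the previous state is held

    def contact_fault(self, contact_closed: bool) -> bool:
        """True if the monitored relay contact disagrees with the command."""
        return contact_closed != self.energised
```

Once tripped, the relay in this model stays de-energised on later scans even if the fault clears, until the reset input is seen.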
Most safety PLCs incorporate protected or guarded outputs. Figure 4
shows the incorporation of a diagnostic cut-off relay to the typical
safety PLC block diagram, which provides guarded outputs. Note that
the relay is also monitored for proper function. Here, the diagnostic
generated by the faulted output or relay must be manually cleared
before the relay can be re-energised.
Watchdog timer circuits are employed to ensure that outputs fail-safe
upon detection of a processor failure. The typical implementation with
a general purpose PLC is to configure one or two outputs to continually
generate square wave output(s). The watchdog timer will trip if the
output(s) fail to change state within the timer’s specified preset. This
will cause the trip relay to de-energise. Figure 5 shows the addition of
a watchdog timer to the general purpose PLC application in Figure 3.
There should be at least one watchdog timer monitoring every CPU in
the system. Two watchdog timers are required to detect watchdog
timer failure.
Safety PLCs also employ watchdog timers; however, the timers are
integral to the modules and are usually implemented redundantly.
That is, every CPU circuit is monitored by two watchdog timers, and
the timers also monitor each other to detect watchdog timer failure. If
either watchdog trips, the diagnostic cut-off relay is de-energised.
Figure 6 depicts the addition of watchdog timers to the typical safety
PLC block diagram. As shown, the watchdog timer has direct
control of the relay, de-energising it upon a watchdog time-out.
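The watchdog principle (trip if the square wave stops changing state within the preset) can be sketched as below. This is a simulation under assumptions, not hardware: the class WatchdogTimer is hypothetical, time is passed in explicitly in seconds, and the trip is assumed to latch.

```python
class WatchdogTimer:
    """Trips if the monitored signal does not change state within preset_s."""

    def __init__(self, preset_s: float, now: float = 0.0) -> None:
        self.preset_s = preset_s
        self.last_state = False
        self.last_change = now
        self.tripped = False

    def sample(self, state: bool, now: float) -> bool:
        """Feed the current square-wave level; return True while healthy."""
        if state != self.last_state:
            # The CPU is still toggling its output: restart the timing window.
            self.last_state = state
            self.last_change = now
        elif now - self.last_change > self.preset_s:
            # No change of state within the preset: latch the trip.
            self.tripped = True
        return not self.tripped
```

In this sketch the return value would drive the trip relay, so a False result de-energises it. A stalled CPU is caught whether its output freezes high or low, because only changes of state restart the window.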
The quality of output signals is only as good as the power used to drive them.
To ensure that outputs are not turned on when the power supply is out of
tolerance, a power monitor diagnostic can be added to the general purpose
PLC. Figure 7 shows the addition of a signal conditioner (trip alarm), which
detects if the power supply is under range or over range. To protect the outputs
from damage, possible dropout, or oscillation during brownout conditions, the
PLC must be programmed to de-energise the trip relay output if the power supply
goes out of range.
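As a rough sketch of the programmed check, the trip relay command can be gated by a supply range test. The 24 V nominal value and the 10 % tolerance band are assumptions chosen for illustration only, and the function names are hypothetical.

```python
def supply_in_range(volts: float, nominal: float = 24.0,
                    tolerance: float = 0.10) -> bool:
    """True if the supply is within +/- tolerance of nominal (assumed values)."""
    low = nominal * (1.0 - tolerance)
    high = nominal * (1.0 + tolerance)
    return low <= volts <= high


def trip_relay_command(logic_permits: bool, volts: float) -> bool:
    """Energise the trip relay only if the logic permits AND the supply is healthy."""
    return logic_permits and supply_in_range(volts)
```

With these assumed limits, a brownout to 19 V or an overvoltage of 27 V both drop the trip relay regardless of what the application logic requests.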
Figure 8 shows the complete safety PLC output module block diagram
with the addition of the power monitor circuit. Like the trip alarm, the
power monitor circuit detects if the power supply goes over or under
range and can automatically trip the diagnostic cut-off relay to protect
the outputs. This circuit can also detect if the main fuse is blown.
Input Circuit Protection
Input circuits can fail ON or OFF, which, if left undetected, can leave a
safety system unprotected. Two techniques are used to detect failed ON
or failed OFF inputs: pulse testing (automatic input testing) and
redundant input circuit comparison.
During a pulse test, inputs are briefly de-energised by turning off an
output that supplies power to the inputs. Programmed logic must then
prove that all of the inputs successfully detected the change in state.
Additional logic must also ensure that the application logic holds the
pre-test input values during the test, so that the test itself does not
cause a trip. Some safety PLCs incorporate automatic input testing in
their input modules, or redundant input detection circuits for each
input channel.
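The test sequence can be sketched against a simulated input card. Everything here is hypothetical and for illustration: SimulatedInputCard stands in for real hardware, and the sketch only detects channels that stay ON when input power is removed.

```python
from typing import List, Set, Tuple


class SimulatedInputCard:
    """Hypothetical stand-in for an input module fed from a switchable supply."""

    def __init__(self, field: List[bool], stuck_on: Set[int]) -> None:
        self.field = field        # true field contact states
        self.stuck_on = stuck_on  # channels whose input circuit has failed ON
        self.powered = True       # state of the output that feeds the inputs

    def read(self) -> List[bool]:
        return [(closed and self.powered) or (ch in self.stuck_on)
                for ch, closed in enumerate(self.field)]


def pulse_test(card: SimulatedInputCard) -> Tuple[List[bool], List[int]]:
    """Return (values held for the application logic, channels that failed)."""
    held = card.read()        # application logic keeps using these during the test
    card.powered = False      # briefly de-energise the inputs
    try:
        during = card.read()  # a healthy channel must now read OFF
    finally:
        card.powered = True   # always restore input power
    failed = [ch for ch, state in enumerate(during) if state]
    return held, failed
```

The held list is what the application logic would continue to see while the pulse is applied, so a healthy system does not trip on its own test.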
Inter-module communications require diagnostics that can detect
corrupted messages or a loss of communication. Cyclical redundancy
checking (CRC) is a very reliable technique for confirming correct
transmission and receipt of data. Communication watchdog timers
should also be employed by every module on a bus to detect a loss of
bus activity. Safety PLCs will automatically set their outputs to a
predetermined safe state (OFF) when an I/O module has lost
communication with its control module. Redundant communications
paths, standard in safety PLCs, should be considered for general PLCs
for higher availability.
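A CRC check on a message can be sketched with the CRC-32 routine in Python's standard zlib module. The framing used here (payload followed by a 4-byte big-endian CRC) and the function names frame and unframe are assumptions for illustration; the paper does not state which polynomial or frame layout any particular safety PLC uses.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte big-endian trailer."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def unframe(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the CRC does not match."""
    if len(message) < 4:
        raise ValueError("message too short to contain a CRC")
    payload = message[:-4]
    received = struct.unpack(">I", message[-4:])[0]
    if zlib.crc32(payload) != received:
        raise ValueError("CRC mismatch: message corrupted")
    return payload
```

A receiver built this way would treat a ValueError like a lost message, which is where the communication watchdog timer takes over.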
To ensure input data is originating from the correct module and going
to the correct module, the processor should incorporate some form of
address verification. Safety PLCs use redundant serial data links to
communicate between the processor and the I/O modules. Serial
communications allow for source and destination addressing to be
embedded into messages and compared with the hardware address
established by the backplane. Parallel backplane designs typically
found in general purpose PLCs do not usually incorporate any address
verification.
Memory Corruption and Losses
All programmable control system memory (RAM, ROM, and EEPROM)
should be fully tested upon power-up and continuously tested on-line
with background diagnostics. Volatile memory (RAM) should be battery
backed and a low battery diagnostic should indicate to the operator
when a battery needs to be replaced.
A “common cause” failure is defined as the failure of two or more
similar components due to a single stress event (a single cause). The
key word here is “stress.” Stressor events include electrical events like
power spikes, lightning, and high current levels. Mechanical stress
includes shock and vibration. Chemical stress includes corrosive
atmospheres, salt air, and humidity. Physical stress includes
temperature. Even heavy usage, including high data rates, is a stress,
especially to system software. If the stress level is high enough, two
or more similar components can fail at the same time.
Software may be the most significant contributor of all to the common
cause failure rate. A “stress” to a software system is the combination
of inputs, timing, and stored data seen by the CPU. Imagine a fault
tolerant system with two or three processors where all the CPUs are
running the exact same program in lock-step synchronous operation.
The CPUs will all see the exact same inputs and the same stored data,
with the same timing. The chance of simultaneous failure due to a common
software bug is high.
A Safety PLC can achieve “common cause strength” through a number
· Physical separation of redundant units. The worst implementation has
redundant circuits on the same circuit board. The best implementation
allows redundant circuits to be located in different cabinets.
· Asynchronous operation of redundant units to reduce software
common cause. The worst implementation has identical software
running the same functionality in perfect synchronisation. The best
implementation runs asynchronously with different operating modes
between redundant units.
· Diversity. The worst implementation has identical software and hardware
in redundant units. The best implementation uses diverse components
that respond differently to a common stress.
· High strength hardware and software. Other important parameters
include the overall ruggedness of the safety PLC and the use of a
systematic audited software development process.
BMS Safety PLC System Architectures
A specially designed safety PLC typically provides high reliability and
high safety via special electronics, special software and pre-engineered
redundancy. The safety PLC has I/O circuits that are designed to be
fail-safe with built-in diagnostics. The CPU of a safety PLC has built-in
diagnostics for memory, CPU operation, watchdog timer and all
communications systems. I/O module addressing is done via serial
communications messages that have full automatic error checking.
Figure 9 shows the architecture of a non-redundant safety PLC. The
1oo1D (one out of one with diagnostics) architecture uses the special
diagnostic circuits to convert dangerous failures into safe failures by
de-energising the output. This is the most cost-effective safety PLC
solution and meets IEC 61508 SIL 2 requirements.
Figure 9. The 1oo1D architecture uses special diagnostic circuits to convert dangerous
failures into safe failures.
When high availability is important in addition to safety, a redundant
architecture can be used. Two primary architectures are used: 2oo3
and 1oo2D. Figure 10 shows the 2oo3 (two out of three) architecture
that was designed to provide high safety and high availability. It is
typically implemented with three physical sets of electronics. Each set
of electronics includes the input circuitry, a logic solver, and output
circuitry. A 2oo3 system can tolerate a one-unit failure but is more
susceptible to common cause failure than the 1oo2D. Also, because the
2oo3 architecture requires more hardware, it can be complex and
expensive to implement.
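The voting rule itself is simple and can be sketched as follows, assuming each channel reports True when it demands a trip. The function name vote_2oo3 is hypothetical.

```python
def vote_2oo3(a: bool, b: bool, c: bool) -> bool:
    """Return True (trip) when at least two of the three channels demand a trip."""
    return (a and b) or (a and c) or (b and c)
```

A single channel that fails in either direction is outvoted by the other two, which is how the architecture tolerates a one-unit failure. A common cause failure that affects two channels at once defeats the vote.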
Figure 10. The 2oo3 architecture is designed to provide safety and
Figure 11 shows the 1oo2D (one out of two with diagnostics)
architecture. It was designed to provide high safety, high availability
and high common cause strength at a lower cost than a 2oo3 system.
It is simple to implement with typically two physical sets of
electronics. Each set of electronics includes the input circuitry, a logic
solver, and output circuitry. Each circuit has special diagnostic
circuitry that combines to form another logical channel. When two
sets of electronics are combined together a four-channel architecture
Conceptually, each of the two units reads inputs, calculates, and
stores outputs. The diagnostic circuits monitor proper operation and
will de-energise a second series output switch if a failure is detected.
Any potentially dangerous failure is converted into a safe failure if
detected by the diagnostics. If the diagnostics work perfectly, the
system is fail safe. High availability is achieved through the parallel
combination of the two sets of electronics. If one side fails safely, the
other side maintains the load and the protection function.
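The output behaviour can be modelled in a few lines. This is a simplified sketch under stated assumptions, not a vendor design: Unit and the function names are hypothetical, each unit's output switch is modelled in series with its diagnostic switch, and the two legs are modelled in parallel. The sketch also includes the inter-unit comparison that the paper describes for actual implementations, applied here only when both units report healthy diagnostics (an assumption).

```python
from dataclasses import dataclass


@dataclass
class Unit:
    command: bool         # logic solver wants the load energised
    diagnostics_ok: bool  # self-diagnostics report no detected fault


def leg_conducts(unit: Unit) -> bool:
    """Output switch in series with the diagnostic switch."""
    return unit.command and unit.diagnostics_ok


def load_energised_1oo2d(a: Unit, b: Unit) -> bool:
    both_healthy = a.diagnostics_ok and b.diagnostics_ok
    if both_healthy and a.command != b.command:
        return False  # healthy units disagree: de-energise (assumed rule)
    # Parallel legs: either healthy unit can hold the load in.
    return leg_conducts(a) or leg_conducts(b)
```

If one unit has a detected fault, its leg opens and the other unit alone decides the output, so the load stays energised only while the healthy unit commands it.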
The 1oo2D architecture requires good self-diagnostics. Diagnostic
techniques have improved considerably; however, it is arguable whether
perfect self-diagnostics can be achieved. Therefore, in order to assure
high safety integrity, actual implementations of the 1oo2D provide
interprocessor communication between the logic solvers. A
comparison of input data and calculation results between the two units
provides complete protection in addition to the self-diagnostics. When
the comparison in either unit detects a mismatch, the system is
de-energised.
Figure 11. The 1oo2D architecture provides safety, via diagnostic
and extra series output switches, availability and common
There are many aspects of a Burner Management System that
contribute to its operating safety and meeting IEC 61508 and
regulatory agency requirements. For example, although not covered by
this paper, much can be done with flame detectors, field sensors and
actuators, such as voting redundant sensors, using analog transmitters
in place of switch interlocks, and installing limit switches on valves.
There are also now more certified field sensors becoming available that
are designed to meet the standards. However, the device that controls
all of the system I/O plays a major role in the operating safety of the
system. Selection of the control system is just as critical as, if not
more critical than, the selection of the associated field hardware.
Depending on the mix of analog and digital I/O, the cost of a modern
safety PLC will not be much higher than that of a conventional PLC. In
addition, one significant advantage of the safety PLC is eliminating the
special engineering and application level programming required in the
conventional PLC. None of the special circuits shown in Figures 1, 3, 5
& 7 are needed when using a safety PLC. The installed cost of a safety
PLC can be significantly lower than a conventional PLC when
engineering and installation expenses are considered for burner