Neural Networks in Healthcare: Potential And Challenges
Healthcare costs around the globe are on the rise, creating a strong need for new ways of assisting the requirements of the healthcare system. Besides applications in other areas, neural networks have naturally found many promising applications in the health and medicine areas. Neural Networks in Healthcare: Potential and Challenges presents interesting and innovative developments from leading experts and scientists working in health, biomedicine, biomedical engineering, and computing areas. This book covers many important and state-of-the-art applications in the areas of medicine and healthcare, including: cardiology, electromyography, electroencephalography, gait and human movement, therapeutic drug monitoring for patient care, sleep apnea, and computational fluid dynamics areas. Neural Networks in Healthcare: Potential and Challenges is a useful source of information for researchers, professionals, lecturers, and students from a wide range of disciplines. Readers of this book will be able to use the ideas for further research efforts in this very important and highly multidisciplinary area.
Year: 2006
Edition: illustrated edition
Publisher: Idea Group Publishing
Language: English
Pages: 349
ISBN 10: 1591408482
ISBN 13: 9781591408482
File: PDF, 5.92 MB
Neural Networks
in Healthcare:
Potential and Challenges

Rezaul Begg
Victoria University, Australia
Joarder Kamruzzaman
Monash University, Australia
Ruhul Sarker
University of New South Wales, Australia

IDEA GROUP PUBLISHING
Hershey • London • Melbourne • Singapore

Acquisitions Editor: Michelle Potter
Development Editor: Kristin Roth
Senior Managing Editor: Amanda Appicello
Managing Editor: Jennifer Neidig
Copy Editor: April Schmidt
Typesetter: Diane Huskinson
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.

Published in the United States of America by
Idea Group Publishing (an imprint of Idea Group Inc.)
701 E. Chocolate Avenue
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@idea-group.com
Web site: http://www.idea-group.com
and in the United Kingdom by
Idea Group Publishing (an imprint of Idea Group Inc.)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 0609
Web site: http://www.eurospanonline.com
Copyright © 2006 by Idea Group Inc. All rights reserved. No part of this book may be reproduced,
stored or distributed in any form or by any means, electronic or mechanical, including photocopying,
without written permission from the publisher.
Product or company names used in this book are for identification purposes only. Inclusion of the
names of the products or companies does not indicate a claim of ownership by IGI of the trademark
or registered trademark.
Library of Congress Cataloging-in-Publication Data
Neural networks in healthcare : potential and challenges / Rezaul Begg,
Joarder Kamruzzaman, and Ruhul Sarker, editors.
p. ; cm.
Includes bibliographical references.
Summary: "This book covers state-of-the-art applications in many areas
of medicine and healthcare"--Provided by publisher.
ISBN 1-59140-848-2 (hardcover) -- ISBN 1-59140-849-0 (softcover)
1. Neural networks (Computer science) 2. Medicine--Research--Data
processing. 3. Medical informatics.
I. Begg, Rezaul. II. Kamruzzaman,
Joarder. III. Sarker, Ruhul.
[DNLM: 1. Neural Networks (Computer) 2. Medical Informatics Applications.
W 26.55.A7 N494 2006]
R853.D37N48 2006
610'.285--dc22
2005027413
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously-unpublished material. The views expressed in this
book are those of the authors, but not necessarily of the publisher.

Neural Networks
in Healthcare:
Potential and Challenges

Table of Contents

Preface ............................................................................................................ vii
Section I: Introduction and Applications in Healthcare
Chapter I
Overview of Artificial Neural Networks and their Applications in
Healthcare ....................................................................................................... 1
Joarder Kamruzzaman, Monash University, Australia
Rezaul Begg, Victoria University, Australia
Ruhul Sarker, University of New South Wales, Australia
Chapter II
A Survey on Various Applications of Artificial Neural Networks in
Selected Fields of Healthcare ..................................................................... 20
Wolfgang I. Schöllhorn, Westfälische Wilhelms-Universität
Münster, Germany
Jörg M. Jäger, Westfälische Wilhelms-Universität Münster,
Germany

Section II: Electrocardiography
Chapter III
The Role of Neural Networks in Computerized Classification of the
Electrocardiogram ........................................................................................ 60
Chris D. Nugent, University of Ulster at Jordanstown,
Northern Ireland
Dewar D. Finlay, University of Ulster at Jordanstown,
Northern Ireland
Mark P. Donnelly, University of Ulster at Jordanstown,
Northern Ireland
Norman D. Black, University of Ulster at Jordanstown,
Northern Ireland
Chapter IV
Neural Networks in ECG Classification: What is Next for Adaptive
Systems? ........................................................................................................ 81
G. Camps-Valls, Universitat de València, Spain
J. F. Guerrero-Martínez, Universitat de València, Spain
Chapter V
A Concept Learning-Based Patient-Adaptable Abnormal ECG Beat
Detector for Long-Term Monitoring of Heart Patients ...................... 105
Peng Li, Nanyang Technological University, Singapore
Kap Luk Chan, Nanyang Technological University, Singapore
Sheng Fu, Nanyang Technological University, Singapore
Shankar M. Krishnan, Nanyang Technological University,
Singapore
Section III: Electromyography
Chapter VI
A Recurrent Probabilistic Neural Network for EMG Pattern
Recognition ................................................................................................. 130
Toshio Tsuji, Hiroshima University, Japan
Nan Bu, Hiroshima University, Japan
Osamu Fukuda, National Institute of Advanced Industrial
Science and Technology, Japan

Chapter VII
Myoelectric Teleoperation of a Dual-Arm Manipulator Using
Neural Networks ........................................................................................ 154
Toshio Tsuji, Hiroshima University, Japan
Kouji Tsujimura, OMRON Corporation, Japan
Yoshiyuki Tanaka, Hiroshima University, Japan
Section IV: Electroencephalography and Evoked Potentials
Chapter VIII
Artificial Neural Networks in EEG Analysis ......................................... 177
Markad V. Kamath, McMaster University, Canada
Adrian R. Upton, McMaster University, Canada
Jie Wu, McMaster University, Canada
Harjeet S. Bajaj, McMaster University, Canada
Skip Poehlman, McMaster University, Canada
Robert Spaziani, McMaster University, Canada
Chapter IX
The Use of Artificial Neural Networks for Objective
Determination of Hearing Threshold Using the Auditory
Brainstem Response ................................................................................. 195
Robert T. Davey, City University, London, UK
Paul J. McCullagh, University of Ulster, Northern Ireland
H. Gerry McAllister, University of Ulster, Northern Ireland
H. Glen Houston, Royal Victoria Hospital, Northern Ireland
Section V: Applications in Selected Areas
Chapter X
Movement Pattern Recognition Using Neural Networks ................... 217
Rezaul Begg, Victoria University, Australia
Joarder Kamruzzaman, Monash University, Australia
Ruhul Sarker, University of New South Wales, Australia
Chapter XI
Neural and Kernel Methods for Therapeutic Drug Monitoring ........ 238
G. Camps-Valls, Universitat de València, Spain
J. D. Martín-Guerrero, Universitat de València, Spain

Chapter XII
Computational Fluid Dynamics and Neural Network for Modeling
and Simulations of Medical Devices ....................................................... 262
Yos S. Morsi, Swinburne University, Australia
Subrat Das, Swinburne University, Australia
Chapter XIII
Analysis of Temporal Patterns of Physiological Parameters .............. 284
Balázs Benyó, Széchenyi István University, Hungary and Budapest
University of Technology and Economics, Hungary
About the Authors ..................................................................................... 317
Index ............................................................................................................ 327

Preface

Artificial neural networks are learning machines inspired by the operation of
the human brain; they consist of many artificial neurons connected in parallel. These networks perform non-linear mappings between the inputs and outputs of a model representing the operation of a real system. Although introduced over 40 years ago, neural networks have seen many remarkable new developments as recently as the last decade or so. This has led to numerous recent applications in many fields, especially where the input-output relations are too complex and difficult to express using explicit formulations.
Healthcare costs around the globe are on the rise, and therefore there is strong
need for new ways of assisting the requirements of the healthcare system.
Besides applications in many other areas, neural networks have naturally found
many promising applications in the health and medicine areas. This book is
aimed at presenting some of these interesting and innovative developments from
leading experts and scientists working in health, biomedicine, biomedical engineering, and computing areas. The book covers many important and state-of-the-art applications in the areas of medicine and healthcare, including cardiology, electromyography, electroencephalography, gait and human movement,
therapeutic drug monitoring for patient care, sleep apnea, and computational
fluid dynamics areas.
The book presents thirteen chapters in five sections as follows:
• Section I: Introduction and Applications in Healthcare
• Section II: Electrocardiography
• Section III: Electromyography
• Section IV: Electroencephalography and Evoked Potentials
• Section V: Applications in Selected Areas

The first section consists of two chapters. The first chapter, by Kamruzzaman,
Begg, and Sarker, provides an overview of the fundamental concepts of neural
network approaches, basic operation of the neural networks, their architectures, and the commonly used algorithms that are available to assist the neural
networks during learning from examples. Toward the end of this chapter, an
outline of some of the common applications in healthcare (e.g., cardiology, electromyography, electroencephalography, and gait data analysis) is provided. The
second chapter, by Schöllhorn and Jäger, continues on from the first chapter
with an extensive overview of the artificial neural networks as tools for processing miscellaneous biomedical signals. A variety of applications are illustrated in several areas of healthcare using many examples to demonstrate how
neural nets can support the diagnosis and prediction of diseases. This review is
particularly aimed at providing a thoughtful insight into the strengths as well as
weaknesses of artificial neural networks as tools for processing biomedical
signals.
Electrical potentials generated by the heart during its pumping action are transmitted to the skin through the body’s tissues, and these signals can be recorded
on the body’s surface and are represented as an electrocardiogram (ECG).
The ECG can be used to detect many cardiac abnormalities. Section II, with
three chapters, deals with some of the recent techniques and advances in the
ECG application areas.
The third chapter, by Nugent, Finlay, Donnelly, and Black, presents an overview of the application of neural networks in the field of ECG classification.
Neural networks have emerged as a strong candidate in this area as the highly
non-linear and chaotic nature of the ECG represents a well-suited application
for this technique. The authors highlight issues that relate to the acceptance of
this technique and, in addition, identify challenges faced for the future.
In the fourth chapter, Camps-Valls and Guerrero-Martínez continue with further applications of neural networks in cardiac pathology discrimination based
on ECG signals. They discuss advantages and drawbacks of neural and adaptive systems in cardiovascular medicine and some of the forthcoming developments in machine learning models for use in the real clinical environment. They
discuss some of the problems that can arise during the learning tasks of beat
detection, feature selection/extraction and classification, and subsequently provide proposals and suggestions to alleviate the problems.
Chapter V, by Li, Chan, Fu, and Krishnan, presents a new concept learning-based approach for abnormal ECG beat detection to facilitate long-term monitoring of heart patients. The uniqueness in this approach is the use of their
complementary concept, “normal”, for the learning task. The authors trained a
ν-Support Vector Classifier (ν-SVC) with only normal ECG beats from a specific patient to relieve the doctors from annotating the training data beat by
beat. The trained model was then used to detect abnormal beats in the long-term ECG recording of the same patient. They then compared the concept-learning model with other classifiers, including multilayer feedforward neural
networks and binary support vector machines.
Two chapters in Section III focus on applications of neural networks in the area
of electromyography (EMG) pattern recognition. Tsuji et al., in Chapter VI,
discuss the use of probabilistic neural networks (PNNs) for pattern recognition
of EMG signals. In this chapter, a recurrent PNN, called Recurrent Log-Linearized Gaussian Mixture Network (R-LLGMN), is introduced for EMG pattern recognition with the emphasis on utilizing temporal characteristics. The
structure of R-LLGMN is based on the algorithm of a hidden Markov model
(HMM), and, hence, R-LLGMN inherits advantages from both HMM and neural computation. The authors present experimental results to demonstrate the
suitability of R-LLGMN in EMG pattern recognition.
In Chapter VII, Tsuji, Tsujimura, and Tanaka describe an advanced intelligent
dual-arm manipulator system teleoperated by EMG signals and hand positions.
This myoelectric teleoperation system also employs a probabilistic neural network, LLGMN, to gauge the operator’s intended hand motion from EMG patterns measured during tasks. In this chapter, the authors also introduce an event-driven task model using a Petri net and a non-contact impedance control method
to allow a human operator to maneuver robotic manipulators.
Section IV presents two interesting chapters. Kamath et al., in Chapter VIII,
describe applications of neural networks in the analysis of bioelectric potentials
representing brain activity, often recorded as an
electroencephalogram (EEG). Neural networks have a major role to
play in the EEG signal processing because of their effectiveness as pattern
classifiers. In this chapter, the authors study several specific applications, for
example: (1) identification of abnormal EEG activity in patients with neurological diseases such as epilepsy, Huntington’s disease, and Alzheimer’s disease;
(2) interpretation of physiological signals, including EEG recorded during sleep
and surgery under anaesthesia; and (3) controlling external devices using signals embedded within the EEG waveform, known as a brain-computer interface (BCI),
which has many applications in rehabilitation, such as helping disabled individuals to operate appliances independently.
The recording of an evoked response is a standard non-invasive procedure,
which is routine in many audiology and neurology clinics. The auditory brainstem
response (ABR) provides an objective method of assessing the integrity of the
auditory pathway and hence assessing an individual’s hearing level. Davey,
McCullagh, McAllister, and Houston, in Chapter IX, analyze ABR data using
ANN and decision tree classifiers.
The final section presents four chapters with applications drawn from selected
healthcare areas. Chapter X, by Begg, Kamruzzaman, and Sarker, provides an
overview of artificial neural network applications for detection and classification of various gait types from their characteristics. Gait analysis is routinely
used for detecting abnormality in the lower limbs and also for evaluating the
progress of various treatments. Neural networks have been shown to perform
better than statistical techniques in some gait classification tasks. Various studies undertaken in this area are discussed, with a particular focus on the
potential of neural networks as gait diagnostic tools. Examples are presented to demonstrate the suitability of neural networks for automated recognition of gait changes
due to ageing from their respective gait-pattern characteristics, and their potential for recognition of at-risk or faulty gait.
Camps-Valls and Martín-Guerrero, in Chapter XI, discuss important advances
in the area of dosage formulations, therapeutic drug monitoring (TDM), and the
role of combined therapies in the improvement of the quality of life of patients.
In this chapter, the authors review the various applications of neural and kernel
models for TDM and present illustrative examples in real clinical problems to
demonstrate improved performance by neural and kernel methods in the area.
Chapter XII, by Morsi and Das, describes the utilization of Computational Fluid
Dynamics (CFD) with neural networks for analysis of medical equipment. They
present the concept of mathematical modeling in solving engineering problems,
CFD techniques and the associated numerical techniques. A case study on the
design and optimization of a heart-valve scaffold for tissue engineering applications using CFD and neural networks is presented. They conclude with an interesting discussion of the advantages and disadvantages of neural network techniques for the CFD modeling of medical devices and their future prospects.
The final chapter, by Benyó, discusses neural network applications in the analysis
of two important physiological parameters: cerebral blood flow (CBF) and respiration. Investigation of the temporal blood flow pattern before, during, and after the
development of CBF oscillations has many important applications, for example, in
the early identification of cerebrovascular dysfunction such as brain trauma or
stroke. The author later introduces the online method to recognize the most common breathing disorder, the sleep apnea syndrome, based on the nasal airflow.
We hope the book will be of enormous help to a broad audience of readership,
including researchers, professionals, lecturers, and graduate students from a
wide range of disciplines. We also trust that the ideas presented in this book will
trigger further research efforts and development works in this very important
and highly multidisciplinary area involving many fields (e.g., computing, biomedical engineering, biomedicine, human health, etc.).
Rezaul Begg, Victoria University, Australia
Joarder Kamruzzaman, Monash University, Australia
Ruhul Sarker, University of New South Wales, Australia
Editors

Acknowledgments

The editors would like to thank all the authors for their excellent contributions
to this book and also everybody involved in the review process of the book
without whose support the project could not have been satisfactorily completed.
Each book chapter has undergone a peer-review process by at least two reviewers. Most of the authors of chapters also served as referees for chapters
written by other authors. Thanks to all those who provided critical, constructive, and comprehensive reviews that helped to improve the scientific and technical quality of the chapters. In particular, special thanks go to (in alphabetical order): Mahfuz Aziz of the University of South Australia; Harjeet Bajaj of
McMaster University; Balázs Benyó of Budapest University of Technology
and Economics; Gustavo Camps-Valls of Universitat de València; Paul
McCullagh of the University of Ulster at Jordanstown; Chris Nugent of the
University of Ulster at Jordanstown; Toshio Tsuji of Hiroshima University; Markad
Kamath of McMaster University; Tony Sparrow of Deakin University; Tharshan
Vaithianathan of the University of South Australia; and Wolfgang I. Schöllhorn
of Westfälische Wilhelms-Universität Münster for their prompt, detailed, and
constructive feedback on the submitted chapters.
We would like to thank our university authorities (Victoria University, Monash
University, and the University of New South Wales @ADFA) for providing
logistic support throughout this project.
The editors would also like to thank the publishing team at Idea Group Inc., who
provided continuous help, encouragement, and professional support from the
initial proposal stage to the final publication, with special thanks to Mehdi
Khosrow-Pour, Jan Travers, Renée Davies, Amanda Phillips, and Kristin Roth.

Finally, we thank our families, especially our wives and children for their love
and support throughout the project.
Rezaul Begg, Victoria University, Australia
Joarder Kamruzzaman, Monash University, Australia
Ruhul Sarker, University of New South Wales, Australia
Editors

Section I
Introduction and
Applications in Healthcare

Chapter I

Overview of Artificial
Neural Networks
and their Applications
in Healthcare
Joarder Kamruzzaman, Monash University, Australia
Rezaul Begg, Victoria University, Australia
Ruhul Sarker, University of New South Wales, Australia

Abstract
Artificial neural networks (ANNs) are one of the main constituents of
artificial intelligence techniques. As in many other areas, ANNs have made
a significant mark in the domain of healthcare applications. In this chapter,
we provide an overview of the basics of neural networks, their operation,
major architectures that are widely employed for modeling input-to-output relations, and the commonly used learning algorithms for training
neural network models. Subsequently, we briefly outline some of the
major application areas of neural networks for the improvement of human
health and well-being.

Copyright © 2006, Idea Group Inc. Copying or distributing in print or electronic forms without written
permission of Idea Group Inc. is prohibited.

Introduction
Following the landmark work undertaken by Rumelhart and his colleagues during
the 1980s (Rumelhart et al., 1986), artificial neural networks (ANNs) have
drawn tremendous interest due to their demonstrated successful applications in
many pattern recognition and modeling works, including image processing
(Duranton, 1996), engineering tasks (Rafiq et al., 2001), financial modeling
(Coakley & Brown, 2000; Fadlalla & Lin, 2001), manufacturing (Hans et al.,
2000; Wu, 1992), biomedicine (Nazeran & Behbehani, 2000), and so forth. In
recent years, there has been a wide acceptance by the research community in
the use of ANN as a tool for solving many biomedical and healthcare problems.
Within the healthcare area, significant applications of neural networks include
biomedical signal processing, diagnosis of diseases, and also aiding medical
decision support systems.
Though developed as a model for mimicking human intelligence in machines,
neural networks have an excellent capability of learning the input-output
mapping from a given dataset without any prior knowledge or
assumptions about the statistical distribution of the data. This capability of
learning from data without any a priori knowledge makes the neural network
quite suitable for classification and regression tasks in practical situations. In
many biomedical applications, classification and regression tasks constitute a
major and integral part. Furthermore, neural networks are inherently nonlinear
which makes them more practicable for accurate modeling of complex data
patterns, as opposed to many traditional methods based on linear techniques.
ANNs have been shown in many real world problems, including biomedical
areas, to outperform statistical classifiers and multiple regression techniques for
the analysis of data. Because of their ability to generalize unseen data well, they
are also suitable for dealing with outliers in the data as well as tackling missing
and/or noisy data. Neural networks have also been used in combination with
other techniques to tie together the strengths and advantages of both techniques.
Since the book aims to demonstrate innovative and successful applications of
neural networks in healthcare areas, this introductory chapter presents a broad
overview of neural networks, various architectures and learning algorithms, and
concludes with some of the common applications in healthcare and biomedical areas.

Artificial Neural Networks
Artificial neural networks are highly structured information processing units
operating in parallel and attempting to mimic the huge computational ability of the
human brain and nervous system. Even though the basic computational elements
of the human brain are extremely slow devices compared to serial processors,
the human brain can easily perform certain types of tasks that would take
conventional computers astronomical amounts of time and, in most cases, may
be beyond them entirely. By attempting to emulate the human brain, neural
networks learn from experience, generalize from previous examples, abstract
essential characteristics from inputs containing irrelevant data, and deal with
fuzzy situations. ANNs consist of many neurons and synaptic strengths called
weights. These neurons and weights are used to mimic the nervous system in the
way weighted signals travel through the network. Although artificial neural
networks have some functional similarity to biological neurons, they are much
more simplified, and therefore, the resemblance between artificial and biological
neurons is only superficial.

Individual Neural Computation
A neuron (also called unit or node) is the basic computational unit of a neural
network. This concept was initially developed by McCulloch and Pitts (1943).
Figure 1 shows an artificial neuron, which performs the following tasks:
a. Receives signals from other neurons.
b. Multiplies each signal by the corresponding connection strength, that is, weight.
c. Sums up the weighted signals and passes them through an activation function.
d. Feeds output to other neurons.

Denoting the input signal by a vector x = (x_1, x_2, …, x_n) and the corresponding
weights to unit j by w_j = (w_{j1}, w_{j2}, …, w_{jn}), the net input to unit j is given by

net_j = Σ_n w_{jn} x_n + w_{j0} = w_j · x + b        (1)

The weight w_{j0} (= b) is a special weight called the bias, whose input signal is always +1.
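Equation (1), followed by a sigmoidal activation, can be sketched in a few lines of Python; the input values, weights, and bias below are illustrative only, not taken from the text:

```python
import math

def neuron_output(x, w, b):
    """Net input net_j = sum_n w_jn * x_n + b, passed through a sigmoidal activation."""
    net = sum(w_n * x_n for w_n, x_n in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-net))

# Two input signals with illustrative weights and bias
y = neuron_output(x=[1.0, 0.5], w=[0.4, -0.2], b=0.1)
```

Returning `net` directly instead of the final expression would give the linear-activation case discussed below.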
There are many types of activation functions used in the literature. Activation
functions are mostly nonlinear; however, linear activation functions are also
used. When neurons are arranged in multiple layers, a network with the linear
activation function can be represented as a network of a single layer. Such a
network has limited capability since it can only solve linearly separable problems.
Figure 1. Individual neural computation: inputs x_1, …, x_n (plus a bias input x_0 = 1) are multiplied by weights w_{j1}, …, w_{jn} (and w_{j0}), summed at unit j, and passed through the activation function to give output = f(net_j)

Usually, the final layer neurons can have linear activation functions while
intermediate layer neurons implement nonlinear functions. Since most real world
problems are nonlinearly separable, nonlinearity in intermediate layers is essential for modeling complex problems. There are many different activation
functions proposed in the literature, and they are often chosen to be a monotonically increasing function. The following are the most commonly used activation
functions:
Linear: f(x) = x
Hyperbolic tangent: f(x) = tanh(x)
Sigmoidal: f(x) = 1 / (1 + e^{-x})
Gaussian: f(x) = exp(-x^2 / (2σ^2))
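The four functions above translate directly into Python (σ in the Gaussian is a width parameter chosen by the modeler):

```python
import math

def linear(x):
    return x                                  # f(x) = x

def hyperbolic_tangent(x):
    return math.tanh(x)                       # f(x) = tanh(x)

def sigmoidal(x):
    return 1.0 / (1.0 + math.exp(-x))         # f(x) = 1 / (1 + e^-x)

def gaussian(x, sigma=1.0):
    return math.exp(-x**2 / (2 * sigma**2))   # f(x) = exp(-x^2 / 2σ^2)
```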

Neural Network Models
There have been many neural network models proposed in the literature that vary
in terms of topology and operational mode. Each model can be specified by the
following seven major concepts (Hush & Horne, 1993; Lippman, 1987):

1. A set of processing units.
2. An activation function of each neuron.
3. Pattern of connectivity among neurons, that is, network topology.
4. Propagation method of activities of neurons through the network.
5. Rules to update the activities of each node.
6. External environment that feeds information to the network.
7. Learning method to modify the pattern of connectivity.

The most common way is to arrange the neurons in a series of layers. The first
layer is known as the input layer, the final one as the output layer, and any
intermediate layer(s) are called hidden layer(s). In a multilayer feedforward
network, the information signal always propagates along the forward direction.
The number of input units at the input layer is dictated by the number of feature
values or independent variables, and the number of units at the output corresponds to the number of classes or values to be predicted. There are no widely
accepted rules for determining the optimal number of hidden units. A network
with fewer than the required number of hidden units will be unable to learn the
input-output mapping well, whereas one with too many hidden units will generalize poorly
on any unseen data. Several researchers in the past attempted to determine the
appropriate size of the hidden units. Kung and Hwang (1988) suggested that the
number of hidden units should be equal to the number of distinct training patterns,
while Masahiko (1989) concluded that N input patterns would require N-1 hidden
units in a single layer. However, as remarked by Lee (1997), it is rather difficult
to determine the optimum network size in advance. Other studies have suggested
that ANNs would generalize better when succeeding layers are smaller than the
preceding ones (Kruschke, 1989; Looney, 1996). Although a two-layer network
(i.e., two layers of weights) is commonly used in most problem solving approaches, the determination of an appropriate network configuration usually
involves many trial and error methods. Another way to select network size is to
use constructive approaches. In constructive approaches, the network starts
with a minimal size and grows gradually during the training (Fahlman & Lebiere,
1990; Lehtokangas, 2000). In a feedback network topology, neurons are also interconnected with neurons in the same layer or with neurons from a preceding layer.
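The layered feedforward arrangement described above can be sketched in a few lines of code. This is a minimal pure-Python illustration: the 2-3-1 layer sizes, the particular weight and bias values, and the use of a sigmoid activation at every layer are all assumptions made for the example, not values taken from the text.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, weights, biases):
    """Propagate an input vector forward through successive layers.

    weights[l][j][i] connects unit i of layer l to unit j of layer l + 1;
    every layer applies a sigmoid activation to its weighted sum.
    """
    activation = x
    for W, b in zip(weights, biases):
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)) + bj)
                      for row, bj in zip(W, b)]
    return activation

# A 2-3-1 network: 2 input units, 3 hidden units, 1 output unit.
weights = [[[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]],  # input -> hidden (3 x 2)
           [[0.7, -0.5, 0.2]]]                      # hidden -> output (1 x 3)
biases = [[0.0, 0.1, -0.1], [0.05]]
y = forward([1.0, 0.5], weights, biases)            # a single value in (0, 1)
```

The number of rows in the first weight matrix fixes the hidden-layer size, illustrating how the input and output dimensions are dictated by the problem while the hidden size remains a free design choice.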
Learning rules that are used to train a network architecture can be divided into
two main types: supervised and unsupervised. In supervised training, the network
is presented with a set of input-output pairs; that is, for each input, an associated
target output is known. The network adjusts its weights using a known set of
input-output pairs, and once training is completed, it is expected to produce a
correct output in response to an unknown input. In unsupervised training, the
network adjusts its weights in response to input patterns without having any

Copyright © 2006, Idea Group Inc. Copying or distributing in print or electronic forms without written
permission of Idea Group Inc. is prohibited.

6 Kamruzzaman, Begg, & Sarker

known associated outputs. Through unsupervised training, the network learns to classify input patterns into similarity categories. This is useful in situations where no known output corresponding to an input exists; the network is expected to discover regularities in the data that determine an appropriate response to each input.

Learning Algorithms
During learning, a neural network gradually modifies its weights and settles down
to a set of weights capable of realizing the input-output mapping with either no
error or a minimum error set by the user. The most common type of supervised
learning is backpropagation learning. Other supervised learning algorithms include the radial basis function (RBF) network, the probabilistic neural network (PNN), the generalized regression neural network (GRNN), and cascade-correlation. Unsupervised learning algorithms, such as the self-organizing map (SOM) and adaptive resonance theory (ART), are used when training sets with known outputs are not available. In the following, we describe some of the widely used neural network learning algorithms.

Backpropagation Algorithm
A recent study (Wong et al., 1997) has shown that approximately 95% of the
reported neural network applications utilize multilayer feedforward neural
networks with the backpropagation learning algorithm. Backpropagation (Rumelhart et al., 1986) operates on a multilayer feedforward network, as shown in Figure 2. In a fully

Figure 2. A three-layer backpropagation network with N input units (xpi), L hidden units (hpj), and M output units (ypk); ωji and ωkj denote the input-to-hidden and hidden-to-output weights. Not all of the interconnections are shown.


Artificial Neural Networks and their Applications in Healthcare 7

connected network, each hidden unit is connected to every unit in the layers immediately below and above it. Units in the same layer are not connected to one another.
A backpropagation network must have at least two layers of weights. Cybenko
(1989) showed that any continuous function could be approximated to an
arbitrary accuracy by a two-layer feedforward network with a sufficient number
of hidden units. Backpropagation applies a gradient descent technique iteratively
to change the connection weights. Each iteration consists of two phases: a
propagation phase and an error backpropagation phase. During the propagation
phase, input signals are multiplied by the corresponding weights, propagate
through the hidden layers, and produce output(s) at the output layer. The outputs
are then compared with the corresponding desired (target) outputs. If the two
match, no changes in weights are made. If the outputs produced by the network
are different from the desired outputs, error signals are calculated at the output
layer. These error signals are propagated backward to the input layer, and the
weights are adjusted accordingly.
Consider a set of input vectors (x1,x2,…,xp) and a set of corresponding output
vectors (y1,y2,…,yp) to be trained by the backpropagation learning algorithm. All
the weights between layers are initialized to small random values at the
beginning. All the weighted inputs to each unit of the upper layer are summed up
and produce an output governed by the following equations:
ypk = f(∑j ωkj hpj + θk),        (2)

hpj = f(∑i ωji xpi + θj),        (3)

where hpj and ypk are the outputs of hidden unit j and output unit k, respectively,
for pattern p. ω stands for connecting weights between units, θ stands for the
threshold of the units, and f(.) is the sigmoid activation function.
The cost function to be minimized in standard backpropagation is the sum of the
squared error measured at the output layer and defined as:
E = ½ ∑p ∑k (tpk − ypk)²        (4)

where tpk is the target output of neuron k for pattern p.
Backpropagation uses the steepest descent technique for changing weights in
order to minimize the cost function of Equation (4). The weight update at the t-th iteration is governed by the following equation:

∆ωji(t) = −η ∂E/∂ωji(t) + α ∆ωji(t−1)        (5)

where η and α are the learning rate and momentum factor, respectively. For the
sigmoid activation function, the weight update rule can be further simplified as:
∆ωkj(t) = η δpk hpj + α ∆ωkj(t−1), for the output layer weights        (6)

∆ωji(t) = η δpj xpi + α ∆ωji(t−1), for the hidden layer weights        (7)

δpk = (tpk − ypk) ypk (1 − ypk)        (8)

δpj = hpj (1 − hpj) ∑k δpk ωkj        (9)

δpk and δpj are the error signals measured at the output and hidden layers, respectively.
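A single training step following these update rules can be sketched as follows. This is a minimal pure-Python illustration for one pattern: the thresholds θ, the momentum term α, and the loop over patterns are omitted for brevity, and the small network and the weight values are illustrative assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def backprop_step(x, t, w_hid, w_out, eta=0.5):
    """One backpropagation update for a single pattern.

    w_hid[j][i] are input-to-hidden weights, w_out[k][j] are hidden-to-output
    weights. Returns the network outputs computed before the update.
    """
    # Forward pass (Equations 2 and 3, thresholds omitted)
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in w_out]
    # Error signals (Equations 8 and 9)
    d_out = [(tk - yk) * yk * (1.0 - yk) for tk, yk in zip(t, y)]
    d_hid = [h[j] * (1.0 - h[j]) * sum(d_out[k] * w_out[k][j]
                                       for k in range(len(w_out)))
             for j in range(len(h))]
    # Weight updates (Equations 6 and 7, momentum omitted)
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += eta * d_out[k] * h[j]
    for j, row in enumerate(w_hid):
        for i in range(len(row)):
            row[i] += eta * d_hid[j] * x[i]
    return y

# Repeated updates on one input-target pair drive the squared error down.
w_hid = [[0.2, -0.1], [0.4, 0.3]]
w_out = [[0.5, -0.4]]
for _ in range(200):
    backprop_step([1.0, 0.5], [1.0], w_hid, w_out)
```

Each call performs one propagation phase and one error-backpropagation phase, mirroring the two-phase iteration described in the text.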

A neural network with multiple layers of adaptive weights may have an error surface with many local minima and flat regions in weight space, where the gradient of the error E is very close to zero. Since the weight update in backpropagation is proportional to this gradient, training in such regions requires many iterations, and the network can remain trapped in a local minimum for an extensive period of time. During the training session, the error usually decreases with iterations; trapping in a local minimum may lead to a situation where the error does not decrease at all. When a local minimum is encountered, the network may be able to escape it by changing the learning parameters or the number of hidden units. Several variations of backpropagation learning that have been reported to give faster convergence and improved generalization on unseen data include scaled conjugate gradient backpropagation (Hagan et al., 1996) and Bayesian regularization techniques (Mackay, 1992).

Radial Basis Function Network
A radial basis function (RBF) network, as shown in Figure 3, has a hidden layer
of radial units and an output layer of linear units. RBF networks are local networks, in contrast to multilayer feedforward networks, which perform a global mapping. Each radial


unit is most receptive to a local region of the input space. Unlike the hidden-layer units in the preceding algorithm, whose activation levels are determined by a weighted sum, a radial unit (i.e., a local receptive field) is defined by its center point and a radius. Similar input vectors are clustered and assigned to the various radial units.
If an input vector lies near the centroid of a particular cluster, that radial unit will
be activated. The activation level of the i-th radial unit is expressed as:

hi = exp(−‖x − ui‖² / (2σi²))        (10)

where x is the input vector, ui is a vector with the same dimension as x denoting the center, and σi is the width of the function. The activation level hi of the i-th radial unit is maximum when x is at the center ui of that unit. The final output of the RBF network can be computed as the weighted sum of the outputs of the radial units:
y = ∑i ωi hi(x)

where ωi is the connection weight between radial unit i and the output unit. With the basis functions fixed, the weights can be obtained directly as ω = R†y, where R is the matrix whose elements are the outputs of the radial units for the training patterns, R† is its pseudo-inverse, and y is the target vector. The adjustable parameters of the network, that is, the centers and shapes of the radial basis units (ui, σi) and the weights ωi, can also be trained by a supervised training algorithm. Centers
should be assigned to reflect the natural clustering of the data. Lowe (1995)
proposed a method to determine the centers based on standard deviations of
training data. Moody and Darken (1989) selected the centers by means of data
clustering techniques such as k-means clustering; each σi is then estimated by taking the average distance from ui to several of its nearest neighbors. Nowlan and
Hinton (1992) proposed soft competition among radial units based on maximum
likelihood estimation of the centers.
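The radial activation of Equation (10) and the weighted-sum output can be sketched as follows. The centers, widths, and output weights below are illustrative placeholders, not values produced by the center-selection procedures just cited.

```python
import math

def rbf_activations(x, centers, sigmas):
    """Equation (10): Gaussian activation of each radial unit."""
    return [math.exp(-sum((xd - cd) ** 2 for xd, cd in zip(x, c)) / (2.0 * s * s))
            for c, s in zip(centers, sigmas)]

def rbf_output(x, centers, sigmas, weights):
    """Network output: the weighted sum of the radial-unit outputs."""
    return sum(w * h for w, h in zip(weights, rbf_activations(x, centers, sigmas)))

centers = [[0.0, 0.0], [1.0, 1.0]]   # illustrative centers u_i
sigmas = [0.5, 0.5]                  # illustrative widths sigma_i
weights = [1.0, -1.0]                # illustrative output weights omega_i

h = rbf_activations([0.0, 0.0], centers, sigmas)
# The first unit dominates because the input sits at its center.
```

Note how an input at a unit's center yields the maximum activation of 1, while units centered elsewhere contribute little, which is the local behavior described above.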

Probabilistic Neural Network
In the case of a classification problem, neural network learning can be thought
of as estimating the probability density function (pdf) from the data. An
alternative approach to pdf estimation is the kernel-based approximation, and this
motivated the development of the probabilistic neural network (PNN) by Specht


Figure 3. A radial basis function network with basis functions h1, …, hn between input x and output y. Not all of the interconnections are shown. Each basis function acts like a hidden unit.

(1990) for classification tasks. It is a supervised neural network that is widely used
in the area of pattern recognition, nonlinear mapping, and estimation of the
probability of class membership and likelihood ratios (Specht & Romsdahl,
1994). It is also closely related to Bayes’ classification rule and Parzen
nonparametric probability density function estimation theory (Parzen, 1962;
Specht, 1990). The fact that PNNs offer a way to interpret the network’s structure in terms of probability density functions is an important merit of this type of network. A PNN also trains faster than backpropagation-type feedforward neural networks.
The structure of a PNN is similar to that of feedforward neural networks,
although the architecture of a PNN is limited to four layers: the input layer, the
pattern layer, the summation layer, and the output layer, as illustrated in Figure
4. An input vector x is applied to the n input neurons and is passed to the pattern
layer. The neurons of the pattern layer are divided into K groups, one for each
class. The i-th pattern neuron in the k-th group computes its output using a
Gaussian kernel of the form:

Fk,i(x) = 1 / (2πσ²)^(n/2) · exp(−‖x − xk,i‖² / (2σ²))        (11)


Figure 4. A probabilistic neural network: an input layer (x1, …, xn), a pattern layer of kernel units Fk,i, a summation layer of class units Gk, and an output layer.

where xk,i is the center of the kernel, and σ, called the spread (smoothing)
parameter, determines the size of the receptive field of the kernel. The
summation layer contains one neuron for each class and computes an approximation of the conditional class-probability functions by combining the previously computed densities as follows:
Gk(x) = ∑i=1..Mk ωki Fk,i(x),   k ∈ {1, …, K},        (12)

where Mk is the number of pattern neurons of class k, and the ωki are positive coefficients satisfying ∑i=1..Mk ωki = 1. Pattern vector x is classified as belonging to the
class that corresponds to the summation unit with the maximum output. The parameter that needs to be tuned for an optimal PNN is the smoothing parameter σ. One way to do this is to select a candidate value of σ, train the network, and test it on a validation set; the procedure is repeated to find the value of σ that produces the fewest misclassifications. An alternative way to search for the optimal smoothing parameter was proposed by Masters (1995). The main


disadvantage of the PNN algorithm is that the network can grow very large and become slow to execute with a large training set, making it impractical for large classification problems.

Self-Organizing Feature Map
In contrast to the previous learning algorithms, Kohonen’s (1988) self-organizing
feature map (SOFM) is an unsupervised learning algorithm that discovers the
natural association found in the data. SOFM combines an input layer with a
competitive layer where the units compete with one another for the opportunity
to respond to the input data. The winner unit represents the category for the input
pattern. Similarities among the data are mapped into closeness of relationship on
the competitive layer.
Figure 5 shows a basic structure for a SOFM. Each unit in the competitive layer
is connected to all the input units, and the competitive-layer units compete to respond to each input. Initially, the weights are assigned small random values. When an input x is presented, the network calculates the distance dj between x and wj (the weight vector of unit j) for each unit j in the competitive layer as:
dj = ‖x − wj‖

The winner unit c is the one with the smallest distance, such that

dc = minj ‖x − wj‖, over all units j in the competitive layer.
Figure 5. Architecture of a self-organizing map: input units x1, …, xn are fully connected to each unit j in the competitive layer, which computes the distance dj.


After identifying the winner unit, its neighborhood is identified. The neighborhood is usually a square region centered on the winning unit c. Denoting the neighborhood by Nc, the weights between input i and competitive-layer unit j are updated as

∆wji = η (xi − wji) for units j in the neighborhood Nc, and ∆wji = 0 otherwise,

where η is the learning rate parameter. After the update, the weights of the winning unit and its neighborhood are more similar to the input pattern, so when the same or a similar input is presented subsequently, the same winner is likely to win the competition again. The algorithm starts with large values of the learning rate η and the neighborhood size Nc, both of which are gradually decreased as learning progresses.
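The competitive training loop can be sketched as follows. The one-dimensional unit arrangement, the linear decay schedules for η and the neighborhood radius, and the toy data are illustrative assumptions made for this sketch, not details taken from Kohonen's formulation.

```python
import random

def winner(x, w):
    """Index c of the unit with the smallest distance dj = ||x - wj||."""
    return min(range(len(w)),
               key=lambda j: sum((xi - wi) ** 2 for xi, wi in zip(x, w[j])))

def som_train(data, n_units, epochs=50, eta0=0.5, seed=0):
    """Train a one-dimensional self-organizing map.

    The winner and the units within the current neighborhood radius are
    pulled toward the input; both the learning rate and the radius shrink
    as training progresses.
    """
    rnd = random.Random(seed)
    dim = len(data[0])
    w = [[rnd.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        eta = eta0 * (1.0 - epoch / epochs)                 # decaying learning rate
        radius = int(n_units / 2 * (1.0 - epoch / epochs))  # shrinking neighborhood Nc
        for x in data:
            c = winner(x, w)
            for j in range(max(0, c - radius), min(n_units, c + radius + 1)):
                w[j] = [wj + eta * (xi - wj) for xi, wj in zip(x, w[j])]
    return w

# Two clusters of inputs; training pulls the unit weights toward the data.
data = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.0]]
weights = som_train(data, n_units=2)
```

Each update moves a weight vector a fraction η of the way toward the input, so similar inputs come to be represented by nearby units, as described above.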
The other notable unsupervised learning algorithm is adaptive resonance theory (ART) by Carpenter and Grossberg (1988); it is less commonly used in biomedical applications and is therefore omitted from this chapter. Interested readers may consult the relevant literature.

Overview of Neural Network
Applications in Healthcare
One of the major goals of the modern healthcare system is to offer quality
healthcare services to individuals in need. To achieve that objective, a key requirement is the early diagnosis of diseases so that appropriate intervention programs can be applied to achieve better outcomes. There have been many
significant advances in recent times in the development of medical technology
aimed at helping the healthcare needs of our community. Artificial intelligence
techniques and intelligent systems have found many valuable applications to
assist in that cause (cf. Ifeachor et al., 1998; Teodorrescu et al., 1998).
Specifically, neural networks have proven very useful in many biomedical areas: helping to diagnose diseases, study pathological conditions, and monitor the progress of various treatments. Neural network tools have been especially effective in assisting with the processing and analysis of biomedical signals. Some such
common application areas include the analysis of electrocardiography (ECG), electromyography (EMG), electroencephalography (EEG), and gait and movement biomechanics data. Furthermore, the potential of neural networks has been demonstrated in many other healthcare areas, for example, medical image analysis, speech/auditory signal recognition and processing, and sleep apnea detection.
The ECG signal is a representation of the bioelectrical activity of the heart’s
pumping action. This signal is recorded via electrodes placed on the patient’s
chest. The physician routinely uses ECG time-history plots and the associated
characteristic features of P, QRS, and T waveforms to study and diagnose the
heart’s overall function. Deviations in these waveforms have been linked to
many forms of heart disease, and neural networks have played a significant role
in helping the ECG diagnosis process. For example, neural networks have been
used to detect signs of acute myocardial infarction (AMI), cardiac arrhythmias,
and other forms of cardiac abnormalities (Baxt, 1991; Nazeran & Behbehani,
2001). Neural networks have performed exceptionally well when applied to
differentiate patients with and without a particular abnormality, for example, in
the diagnosis of patients with AMI (97.2% sensitivity and 96.2% specificity;
Baxt, 1991).
Electromyography (EMG) records the electrical activity of contracting muscles.
EMG signals can be used to monitor the activity of the muscles during a task or
movement and can potentially lead to the diagnosis of muscular disorders. Both
amplitude and timing of the EMG data are used to investigate muscle function.
Neural networks have been shown to help in modeling the relationship between mechanical muscle force generation and the corresponding recorded EMG signals (Wang &
Buchanan, 2002). Neuromuscular diseases can affect the activity of the muscles
(e.g., motor neuron disease), and neural networks have been proven useful in
identifying individuals with neuromuscular diseases from features extracted
from the motor unit action potentials of their muscles (e.g., Pattichis et al., 1995).
The EEG signal represents electrical activity of the neurons of the brain and is
recorded using electrodes placed on the human scalp. The EEG signals and their
characteristic plots are often used as a guide to diagnose neurological disorders,
such as epilepsy, dementia, stroke, and brain injury or damage. The presence of
these neurological disorders is reflected in the EEG waveforms. Like many other
pattern recognition techniques, neural networks have been used to detect
changes in the EEG waveforms as a result of various neurological and other
forms of abnormalities that can affect the neuronal activity of the brain. A well-known application of neural networks in EEG signal analysis is the detection of
epileptic seizures, which often result in a sudden and transient disturbance of the
body movement due to excessive discharge of the brain cells. This seizure event
results in spikes in the EEG waveforms, and neural networks and other artificial
intelligence tools, such as fuzzy logic and support vector machines, have been
employed for automated detection of these spikes in the EEG waveform. Neural
networks-aided EEG analysis has also been undertaken for the diagnosis of

many other related pathologies, including Huntington’s and Alzheimer’s diseases
(Jervis et al., 1992; Yagneswaran et al., 2002). Another important emerging
application of neural networks is in the area of brain computer interface (BCI),
in which neural networks use EEG activity to extract embedded features linked
to mental status or cognitive tasks to interact with the external environment
(Culpepper & Keller, 2003). Such capability has many important applications in
the area of rehabilitation, helping physically disabled people communicate with the external environment (Garrett et al., 2003; Geva & Kerem, 1998). Other
related applications of neural networks include analysis of evoked potentials and
evoked responses of the brain in response to various external stimuli that are
reflected in the EEG waveforms (Hoppe et al., 2001).
Gait analysis is the systematic study of human walking. Various instruments are available to study different aspects of gait. Among its many applications, gait
analysis is increasingly used to diagnose abnormality in the lower limb functions
and to assess the progress of improvement as a result of interventions. Neural
networks have found widespread applications for gait pattern recognition and
clustering of gait types, for example, to classify simulated gait patterns (Barton
& Lees, 1997) or to identify normal and pathological gait patterns (Holzreiter &
Kohle, 1993; Wu et al., 1998). Gait also changes significantly with aging, bringing potential risks of loss of balance and falls, and neural networks have been useful in the automated recognition of aging individuals with balance disorders from gait measures (Begg et al., 2005). Further applications in gait and clinical biomechanics may be found in Chau (2001) and Schöllhorn (2004).
There are numerous other areas within the biomedical fields where neural
networks have contributed significantly. Some of these applications include
medical image diagnosis (Egmont-Petersen et al., 2001), low back pain diagnosis
(Gioftsos & Grieve, 1996), breast cancer diagnosis (Abbass, 2002), glaucoma
diagnosis (Chan et al., 2002), medical decision support systems (Silva & Silva,
1998), and so forth. The reader is referred to Chapter II for a comprehensive
overview of neural network applications for biomedical signal analysis in some
of these selected healthcare areas.

References
Abbass, H. A. (2002). An evolutionary artificial neural networks approach for breast cancer diagnosis. Artificial Intelligence in Medicine, 25(3), 265-281.
Barton, J. G., & Lees, A. (1997). An application of neural networks for
distinguishing gait patterns on the basis of hip-knee joint angle diagrams.
Gait and Posture, 5, 28-33.

Baxt, W. G. (1991). Use of an artificial neural network for the diagnosis of
myocardial infarction. Annals of Internal Medicine, 115, 843-848.
Begg, R. K., Hasan, R., Taylor, S., & Palaniswami, M. (2005, January).
Artificial neural network models in the diagnosis of balance impairments.
Proceedings of the Second International Conference on Intelligent
Sensing and Information Processing, Chennai, India.
Carpenter, G.A., & Grossberg, S. (1988). The ART of adaptive pattern
recognition by a self-organizing neural network. Computer, 21(3), 77-88.
Chan, K., Lee, T. W., Sample, P. A., Goldbaum, M. H., Weinreb, R. N., &
Sejnowski, T. J. (2002). Comparison of machine learning and traditional
classifiers in glaucoma diagnosis. IEEE Transactions on Biomedical
Engineering, 49(9), 963-974.
Chau, T. (2001). A review of analytical techniques for gait data. Part 2: Neural
network and wavelet methods. Gait Posture, 13, 102-120.
Coakley, J., & Brown, C. (2000). Artificial neural networks in accounting and
finance: Modeling issues. International Journal of Intelligent Systems in
Accounting, Finance & Management, 9, 119-144.
Culpepper, B. J., & Keller, R. M. (2003). Enabling computer decisions based on
EEG input. IEEE Transactions on Neural Systems and Rehabilitation
Engineering, 11(4), 354-360.
Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function.
Mathematical Control Signal Systems, 2, 303-314.
Duranton, M. (1996). Image processing by neural networks. IEEE Micro,
16(5), 12-19.
Egmont-Petersen, M., de Ridder, D., & Handels, H. (2001). Image processing
with neural networks: A review. Pattern Recognition, 35, 2279-2301.
Fadlalla, A., & Lin, C. H. (2001). An analysis of the applications of neural
networks in finance. Interfaces, 31(4), 112-122.
Fahlman, S. E., & Lebiere, C. (1990). The cascade-correlation learning architecture. Advances in Neural Information Processing Systems, 2, 524-532.
Garrett, D., Peterson, D. A., Anderson, C. W., & Thaur, M. H. (2003).
Comparison of linear, non-linear and feature selection methods for EEG
signal classification. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11, 141-147.
Geva, A.B., & Kerem, D. H. (1998). Brain state identification and forecasting
of acute pathology using unsupervised fuzzy clustering of EEG temporal
patterns. In T. Teodorrescu, A. Kandel, & L. C. Jain (Eds.), Fuzzy and
neuro-fuzzy systems in medicine (pp. 57-93). Boca Raton, FL: CRC
Press.


Gioftsos, G., & Grieve, D. W. (1996). The use of artificial neural networks to identify patients with chronic low-back pain conditions from patterns of sit-to-stand manoeuvres. Clinical Biomechanics, 11(5), 275-280.
Hagan, M. T., Demuth, H. B., & Beale, M. H. (1996). Neural network design.
Boston: PWS Publishing.
Hans, R. K., Sharma, R. S., Srivastava, S., & Patvardham, C. (2000). Modeling
of manufacturing processes with ANNs for intelligent manufacturing.
International Journal of Machine Tools & Manufacture, 40, 851-868.
Holzreiter, S. H., & Kohle, M. E. (1993). Assessment of gait pattern using neural
networks. Journal of Biomechanics, 26, 645-651.
Hoppe, U., Weiss, S., Stewart, R. W., & Eysholdt, U. (2001). An automatic
sequential recognition method for cortical auditory evoked potentials. IEEE
Transactions on Biomedical Engineering, 48(2), 154-164.
Hush, P. R., & Horne, B. G. (1993). Progress in supervised neural networks.
IEEE Signal Processing, 1, 8-39.
Ifeachor, E. C., Sperduti, A., & Starita, A. (1998). Neural networks and
expert systems in medicine and health care. Singapore: World Scientific Publishing.
Jervis, B. W., Saatchi, M. R., Lacey, A., Papadourakis, G. M., Vourkas, M.,
Roberts, T., Allen, E. M., Hudson, N. R., & Oke, S. (1992). The application
of unsupervised artificial neural networks to the sub-classification of
subjects at-risk of Huntington’s Disease. IEEE Colloquium on Intelligent
Decision Support Systems and Medicine, 5, 1-9.
Kohonen, T. (1988). Self-organisation and associative memory. New York:
Springer-Verlag.
Kruschke, J. K. (1989). Improving generalization in backpropagation networks
with distributed bottlenecks. Proceedings of the IEEE/INNS International Joint Conference on Neural Networks, 1, 443-447.
Kung, S. Y., & Hwang, J. N. (1988). An algebraic projection analysis for optimal
hidden units size and learning rate in backpropagation learning. Proceedings of the IEEE/INNS International Joint Conference on Neural
Networks, 1, 363-370.
Lee, C. W. (1997). Training feedforward neural networks: An algorithm for
improving generalization. Neural Networks, 10, 61-68.
Lehtokangas, M. (2000). Modified cascade-correlation learning for classification. IEEE Transactions on Neural Networks, 11, 795-798.
Lippmann, R. P. (1987). An introduction to computing with neural nets. IEEE ASSP Magazine, 4(2), 4-22.


Looney, C. G. (1996). Advances in feedforward neural networks: Demystifying
knowledge acquiring black boxes. IEEE Transactions on Knowledge &
Data Engineering, 8, 211-226.
Lowe, D. (1995). Radial basis function networks. In M.A. Arbib (Ed.), The
handbook of brain theory and neural networks. Cambridge, MA: MIT
Press.
Mackay, D. J. C. (1992). Bayesian interpolation. Neural Computation, 4, 415-447.
Masahiko, A. (1989). Mapping abilities of three layer neural networks. Proceedings of the IEEE/INNS International Joint Conference on Neural
Networks, 1, 419-423.
Masters, T. (1995). Advanced algorithms for neural networks. New York:
John Wiley & Sons.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
Moody, J., & Darken, C. J. (1989). Fast learning in networks of locally-tuned
processing units. Neural Computation, 1(2), 281-294.
Nazeran, H., & Behbehani, K. (2001). Neural networks in processing and
analysis of biomedical signals. In M. Akay (Ed.), Nonlinear biomedical
signal processing: Fuzzy logic, neural networks and new algorithms
(pp. 69-97). IEEE Press.
Nowlan, S.J., & Hinton, G.E. (1992). Simplifying neural networks by soft
weight-sharing. Neural Computation, 4(4), 473-493.
Parzen, E. (1962). On the estimation of a probability density function and mode.
Annals of Mathematical Statistics, 3, 1065-1076.
Pattichis, C. S., Schizas, C. N., & Middleton, L. T. (1995). Neural network
models in EMG diagnosis. IEEE Transactions on Biomedical Engineering, 42(5), 486-496.
Rafiq, M. Y., Bugmann, G., & Easterbrook, D.J. (2001). Neural network design
for engineering applications. Computers & Structures, 79, 1541-1552.
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group (1986). Parallel distributed processing: Explorations in the microstructure of cognition, Vol. 1. Cambridge, MA: MIT Press.
Schöllhorn, W. I. (2004). Applications of artificial neural nets in clinical biomechanics. Clinical Biomechanics, 19, 876-898.
Silva, R., & Silva, A. C. R. (1998). Medical diagnosis as a neural networks
pattern classification problem. In E. C. Ifeachor, A. Sperduti, & A. Starita
(Eds.), Neural networks and expert systems in medicine and health
care (pp. 25-33). Singapore: World Scientific Publishing.


Specht, D. F. (1990). Probabilistic neural networks. Neural Networks, 3(1), 109-118.
Specht, D. F., & Romsdahl, H. (1994). Experience with adaptive probabilistic
neural network and adaptive general regression neural network. Proceedings of the IEEE/INNS International Joint Conference on Neural
Networks, 2, 1203-1208.
Teodorrescu, T., Kandel, A., & Jain, L. C. (1998). Fuzzy and neuro-fuzzy
systems in medicine. Boca Raton, FL: CRC Press.
Wang, L., & Buchanan, T. S. (2002). Prediction of joint moments using a neural
network model of muscle activation from EMG signals. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 10, 30-37.
Wong, B.K., Bodnovich, T.A., & Selvi, Y. (1997). Neural network applications
in business: A review and analysis of the literature (1988-1995). Decision
Support Systems, 19, 301-320.
Wu, B. (1992). An introduction to neural networks and their applications in
manufacturing. Journal of Intelligent Manufacturing, 3, 391-403.
Wu, W. L., Su, F. C., & Chou, C. K. (1998). Potential of the back propagation
neural networks in the assessment of gait patterns in ankle arthrodesis. In
E.C. Ifeachor, A. Sperduti, & A. Starita (Eds.), Neural networks and
expert systems in medicine and health care (pp. 92-100). Singapore:
World Scientific Publishing.
Yagneswaran, S., Baker, M., & Petrosian, A. (2002). Power frequency and
wavelet characteristics in differentiating between normal and Alzheimer
EEG. Proceedings of the Second Joint 24th Annual Conference and the
Annual Fall Meeting of the Biomedical Engineering Society (EMBS/
BMES), 1, 46-47.


20 Schöllhorn & Jäger

Chapter II

A Survey on Various
Applications of
Artificial Neural
Networks in Selected
Fields of Healthcare
Wolfgang I. Schöllhorn,
Westfälische Wilhelms-Universität Münster, Germany
Jörg M. Jäger,
Westfälische Wilhelms-Universität Münster, Germany

Abstract
This chapter gives an overview of artificial neural networks as instruments
for processing miscellaneous biomedical signals. A variety of applications
are illustrated in several areas of healthcare. The chapter is structured by
medical fields, such as cardiology, gynecology, and neuromuscular control,
rather than by types of neural networks. Many examples demonstrate how
neural networks can support the diagnosis and prediction of diseases.
However, the survey does not claim completeness, owing to the enormous
and exponentially increasing number of publications in


A Survey on Various Applications of Artificial Neural Networks 21

this field. Besides the potential benefits for healthcare, some remarks on
underlying assumptions are also included, as well as problems that may
occur when applying artificial neural networks. It is hoped that this review
gives profound insight into the strengths as well as the weaknesses of
artificial neural networks as tools for processing biomedical signals.

Introduction
There has long been a tremendous amount of interest in and excitement
about artificial neural networks (ANNs), also known as parallel distributed
processing models, connectionist models, and neuromorphic systems. During the
last two decades, ANNs have matured considerably, from the early “first
generation” methods (Akay, 2000) to the “second generation” of classification
and regression tools (Lisboa et al., 1999) to the continuing development of “new
generation” automatic feature detection and rule extraction instruments. Although ANNs have already been widely exploited in the area
of biomedical signal analysis, a number of interesting and, for healthcare,
valuable applications of all generations of ANNs are introduced in the following
chapter. The selection of applications is arbitrary and does not claim completeness. The chapter is structured by medical domains rather than by type of neural
network or type of signal, because very often different types of neural networks were
compared on the basis of the same set of data, and different types of signals were
chosen as input variables. Given the interdisciplinary character of
modern medicine, this structure cannot be totally disjoint and will display some
overlap. However, due to the enormous and exponentially increasing number of publications, only a coarse, stroboscopic insight into a still-growing
field of research can be provided.

Cardiology
The versatility of ANNs with respect to their input variables is evident in
applications related to heart disease. Instead of using
electrocardiographic (ECG) data (for a more comprehensive overview of ANNs
and the ECG, see Chapters III-V), laboratory parameters such as blood values (Baxt, 1991;
Baxt & Skora, 1996; Kennedy et al., 1997), angiography (Mobley et al., 2000),
stress redistribution scintigrams, and myocardial scintigraphy (Kukar et al.,
1999), as well as personal variables including past history (Baxt, 1991; Baxt &


Table 1. Applications of ANNs in cardiology

Azuaje et al. (1999)
  Objective: Coronary disease risk
  Input: Poincare plots, binary coded and analog coded
  Output: Two, three, and five levels of risk
  Type of ANN: MLP (144,576,1024/70,200,500/30,100,200/2,3,5)
  Results/remarks: Even binary coded Poincare plots produce useful results

Baxt (1991); Baxt et al. (1996)
  Objective: Acute myocardial infarction (AMI)
  Input: 20 items including symptoms, past history, and physical and laboratory findings, e.g., palpitations, angina, rales, or T-wave inversion
  Output: AMI yes/no
  Type of ANN: MLP (20/10/10/1)
  Results/remarks: MLP had higher sensitivity and specificity in diagnosing a myocardial infarction than physicians

Dorffner and Porenta (1994)
  Objective: Coronary artery disease
  Input: 45 parameters of the stress redistribution scintigram (5 segments x 3 views x 3 phases: stress, rest, washout)
  Output: Normal/pathological
  Type of ANN: 1) MLP, 2) RBFN, 3) conic section function networks
  Results/remarks: MLP can be among the poorest methods; importance of initialization in MLP and CSFN

Kennedy et al. (1997)
  Objective: AMI
  Input: 39 items, e.g., blood and ECG parameters, derived to 53 binary inputs
  Output: AMI yes/no
  Type of ANN: MLP (53/18/1)
  Results/remarks: Trained ANN at least as accurate as medical doctors

Kukar et al. (1997)
  Objective: Ischaemic heart disease (IHD)
  Input: 77 parameters; IHD data for different diagnostic levels (e.g., signs, symptoms, exercise ECG, myocardial scintigraphy, angiography)
  Output: Classification
  Type of ANN: 1) (semi-)naive Bayesian classifier, 2) backpropagation learning of neural networks, 3) two algorithms for induction of decision trees (Assistant-I and -R), 4) k-nearest neighbors
  Results/remarks: Improvements in the predictive power of the diagnostic process; analyzing and improving the diagnosis of ischaemic heart disease with machine learning

Mobley et al. (2004)
  Objective: Coronary stenosis
  Input: 11 personal parameters
  Output: Stenosis yes/no
  Type of ANN: MLP (11/36/1)
  Results/remarks: Specificity 26%, sensitivity 100%

Mobley et al. (2000)
  Objective: Coronary artery stenosis
  Input: 14 angiographic variables
  Output: Stenosis yes/no
  Type of ANN: MLP (14/26/1)
  Results/remarks: ANN could identify patients who did not need coronary angiography
Skora, 1996; Mobley et al., 2004), signs and symptoms (Kukar et al., 1999), or
binary and analog coded Poincare plots (Azuaje et al., 1999) were chosen as
input. With one exception, all of the selected papers present multilayer perceptrons
(MLPs) for diagnosing acute myocardial infarction or coronary stenosis (Table
1). Most intriguingly, the model of coronary disease risk (Azuaje et al., 1999)
based on binary-coded Poincare data yields useful results. The comparison
of the performance of MLPs with radial basis function networks (RBFNs) and
conic section function networks (CSFNs) provides evidence for the importance
of the initialization of MLPs and CSFNs: with non-optimal initialization, MLPs
can be among the poorest methods. Kukar et al. (1999) compare four
classification approaches (Bayesian, MLP, decision tree, k-nearest neighbor)


for classes of ischaemic heart disease. The experiments with various learning
algorithms achieved a performance level comparable to that of clinicians. A
further interesting result of this study was that only ten attributes were sufficient
to reach maximum accuracy. A closer look at the structure of this subset of
attributes suggests that most of the original 77 attributes were redundant in the
diagnostic process.
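The attribute-redundancy finding lends itself to a quick numerical illustration. The sketch below uses synthetic data and a simple nearest-centroid classifier, not the Kukar et al. data or models: of 77 attributes only the first 10 carry class information, and a classifier restricted to those 10 matches one that sees all 77.

```python
import random

random.seed(0)

# Synthetic illustration (not the cited IHD data): 77 attributes,
# of which only the first 10 are informative; the rest are noise.
def make_patient(label):
    informative = [label + random.gauss(0, 0.3) for _ in range(10)]
    noise = [random.gauss(0, 1.0) for _ in range(67)]
    return informative + noise, label

data = [make_patient(random.choice([0, 1])) for _ in range(200)]

def centroid(rows):
    n = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(n)]

def nearest_centroid_accuracy(data, n_attrs):
    train, test = data[:150], data[150:]
    c0 = centroid([x[:n_attrs] for x, y in train if y == 0])
    c1 = centroid([x[:n_attrs] for x, y in train if y == 1])
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    hits = sum(
        (dist(x[:n_attrs], c1) < dist(x[:n_attrs], c0)) == (y == 1)
        for x, y in test
    )
    return hits / len(test)

acc_all = nearest_centroid_accuracy(data, 77)
acc_ten = nearest_centroid_accuracy(data, 10)
print(f"all 77 attrs: {acc_all:.2f}  first 10 attrs: {acc_ten:.2f}")
```

On data like this, the ten informative attributes suffice; the remaining 67 add only noise to the distance computation.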

Gynecology
Breast Cancer
Breast cancer ranks first among the causes of cancer death in women in
developed and developing countries (Parkin et al., 2001; Pisani et al., 1999). The
best way to reduce death due to breast cancer is to treat the disease at
an earlier stage (Chen et al., 1999). Earlier treatment requires early diagnosis,
and early diagnosis requires an accurate and reliable diagnostic procedure that
allows physicians to differentiate benign breast tumors from malignant ones.
Current procedures for detecting and diagnosing breast cancer illustrate the
difficulty of maximizing sensitivity and specificity simultaneously (Chen et al., 1999).
Versatile diagnostic tools are applied for data acquisition. Correspondingly, most
publications on breast cancer diagnosis apply different forms of supervised neural
networks (Table 2), in most cases a three-layer MLP with one output node modeling
the benign or malignant diagnosis.
Abbass (2002) trained a special kind of evolutionary artificial neural network,
the memetic Pareto ANN, on the basis of nine laboratory attributes. Setiono (1996,
2000) took the Wisconsin Breast Cancer Diagnosis data as input for a
three-layer pruned MLP. Buller et al. (1996) and Chen et al. (1999) rely on
ultrasonic images for the training of three- and four-layer MLPs. Input data
from radiographic image features are the basis for Fogel et al. (1998), Markey
et al. (2003), Ronco (1999), Papadopulos et al. (2004), and Papadopulos et al.
(2002), whereby, in all cases, the bulk of the image data was reduced either
by principal component analysis (PCA) or by selecting characteristic features.
Markey et al. (2003) applied a self-organizing map (SOM) for classifying the
image data, and Papadopulos et al. (2002) combined a four-layer MLP with an
expert system.
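The PCA reduction step these studies share can be sketched in a few lines. The toy below uses hypothetical two-feature data, not the radiographic features of the cited studies: it computes the leading principal component of correlated measurements and projects each sample onto it, so a downstream MLP would see one input instead of two.

```python
import math
import random

random.seed(0)

# Hypothetical stand-in for correlated image features: two measurements
# that mostly vary along one direction.
xs = [random.gauss(0, 1) for _ in range(500)]
data = [(x, 0.9 * x + random.gauss(0, 0.2)) for x in xs]

n = len(data)
mx = sum(p[0] for p in data) / n
my = sum(p[1] for p in data) / n
cxx = sum((p[0] - mx) ** 2 for p in data) / n
cyy = sum((p[1] - my) ** 2 for p in data) / n
cxy = sum((p[0] - mx) * (p[1] - my) for p in data) / n

# Leading eigenvalue/eigenvector of the 2x2 covariance matrix.
tr, det = cxx + cyy, cxx * cyy - cxy ** 2
lam = tr / 2 + math.sqrt(tr * tr / 4 - det)
v = (cxy, lam - cxx)
norm = math.hypot(v[0], v[1])
v = (v[0] / norm, v[1] / norm)

explained = lam / (cxx + cyy)          # variance carried by component 1
scores = [(p[0] - mx) * v[0] + (p[1] - my) * v[1] for p in data]
print(f"first principal component explains {explained:.0%} of the variance")
```

With real image data the same idea is applied to hundreds or thousands of features at once, keeping only the first few components as network inputs.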
Lisboa et al. (2003) present a partial logistic artificial neural network (PLANN)
for prognosis after surgery on the basis of 18 categorical input data, which
include, among others, number of nodes, pathological size, and oestrogen level.


Contrasting the PLANN model with the clinically accepted proportional
hazards model showed that the two are consistent, but that the neural
network may be more specific in the allocation of patients into prognostic groups
using a default procedure.
The probability of relapse after surgery is the objective of the MLP model of
Gomez-Ruiz et al. (2004). The model is based on only six attributes, including tumor size,
patient age, and menarche age, and is able to make an appropriate prediction of
the relapse probability at different times of follow-up. A similar model is provided
by Jerez-Aragones et al. (2003). A comparison of a three-layer MLP with
decision trees and logistic regression for breast cancer survivability is presented
by Delen et al. (2004). Cross et al. (1999) describe the development and testing
of several decision support strategies for the cytodiagnosis of breast cancer that
are based on image features selected by specialist clinicians.
However, the reliability study of Kovalerchuk et al. (2000) shows that the
development of reliable computer-aided diagnostic (CAD) methods for breast
cancer diagnosis requires more attention to the selection of training and
testing data and of processing methods. Strictly speaking, all CAD
methods are still very unreliable in spite of the apparent, and possibly fortuitous,
high accuracy of cancer diagnosis reported in the literature, and they therefore require
further research. Several criteria for reliability studies are discussed in that work.
Knowledge-based neurocomputation for the classification of cancer tissues
using microarray data was applied by Futschik et al. (2003). Knowledge-based
neural networks (KBNNs) address the problem of knowledge representation and
extraction. As a particular version of a KBNN, Futschik et al. (2003) applied
evolving fuzzy neural networks (EFuNNs), which are implementations of
evolving connectionist systems (ECOS). ECOS are multilevel, multimodular
structures in which many modules have inter- and intra-connections. ECOS are not
restricted to a clear multilayered structure and have a modular, open structure.
In contrast to MLPs, the learned knowledge in EFuNNs is locally embedded
rather than distributed over the whole neural network. Rule extraction was used to
identify groups of genes that form profiles and are highly indicative of particular
cancer types.
A whole group of publications in the medical literature, including Setiono (1996,
2000), Downs et al. (1996), Delen et al. (2004), and Papadopulos et al. (2002), copes
with long-assumed disadvantages of MLPs, building on the fundamental theoretical
works of Benitez et al. (1997) and Castro and Miranda (2002). ANNs have
been shown to be universal approximators; however, many researchers refuse
to use them due to shortcomings that have earned them the title of “black
boxes”. Determining why an ANN makes a particular decision is a difficult task
(Benitez et al., 1997). The lack of capacity to infer a direct, “human comprehensible” explanation from the network is seen as a clear impediment to a more


Table 2. Applications of ANNs in breast cancer research (? = not specified
in reference)

Abbass (2002)
  Objective: Breast cancer
  Input: 9 attributes: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, mitoses
  Output: Benign/malignant
  Type of ANN: Memetic Pareto artificial neural network (MPANN), i.e., a special kind of evolutionary ANN
  Results/remarks: MPANNs have better generalization and lower computational cost than other evolutionary ANNs or backpropagation networks

Buller et al. (1996)
  Objective: Breast cancer
  Input: Ultrasonic image parameters
  Output: Benign/malignant
  Type of ANN: 2 MLPs, one for malignant and one for benign (27/4/1), (27/14/2/1), compared with experts
  Results/remarks: Weak when evidence on the image is considered weak by the expert

Chen et al. (1999)
  Objective: Breast nodules
  Input: Ultrasonic images; 24-dimensional image feature vector
  Output: Benign/malignant
  Type of ANN: MLP (25/10/1)
  Results/remarks: Accuracy for detecting malignant tumors 95%, sensitivity 98%, specificity 93%

Delen et al. (2004)
  Objective: Comparison of methods; breast cancer survivability
  Input: 17 predictor variables (race, marital status, histology, tumor size, etc.)
  Output: Survived / not survived
  Type of ANN: 1) decision tree, 2) MLP (17/15/2), 3) logistic regression
  Results/remarks: DT 93.2%, ANN 91.2%, logistic regression 89.2%; recommended k-fold cross validation

Downs et al. (1996)
  Objective: 1) Coronary care, 2) breast cancer, 3) myocardial infarction
  Input: 1) 43 items of clinical or electrocardiographic data, 2) 10 binary-valued features, e.g., presence/absence of necrotic epithelial cells, 3) 35 items of clinical or ECG data coded as 37 binary inputs
  Output: ?
  Type of ANN: Fuzzy ARTMAP
  Results/remarks: Symbolic rule extraction

Fogel et al. (1998)
  Objective: Screening features for mammogram
  Input: 12 radiographic features
  Output: Malignant/non-malignant
  Type of ANN: MLP (12/1,2/1)
  Results/remarks: ANNs with only two hidden nodes performed as well as more complex ANNs and better than ANNs with only one hidden node

widespread acceptance of ANNs (Tickle et al., 1998). Knowledge in conventional ANNs like MLPs is stored in the connection weights and is
distributed over the whole network, complicating its interpretation (Futschik et
al., 2003). Setiono (1996, 2000) found that a more concise set of rules can thus be
expected from a network with fewer connections and fewer clusters of hidden
unit activations. Under some minor restrictions, the functional behavior of radial
basis function networks is equivalent to fuzzy inference systems (Jang & Sun,
1993). Analogously, the results of Paetz (2003) are a major extension of


Table 2. Applications of ANNs in breast cancer research (? = not specified
in reference) (cont.)

Gomez-Ruiz et al. (2004)
  Objective: Prognosis of early breast cancer
  Input: 6 attributes after cancer surgery (tumor size, patient age, menarche age, etc.)
  Output: Breast cancer relapse
  Type of ANN: MLP (6/18,16,15,13,12/1)
  Results/remarks: Predictions about the relapse probability at different times of follow-up

Jerez-Aragones et al. (2003)
  Objective: Breast cancer relapse
  Input: 15 personal and laboratory attributes
  Output: Breast cancer relapse
  Type of ANN: 1) MLP, 2) decision tree
  Results/remarks: Control of induction by sample division method

Lisboa et al. (2003)
  Objective: Prognostic risk groups and management of breast cancer patients (censored data)
  Input: 18 categorical variables; n=1616 cases with variable numbers of missing values
  Output: Assignment of a prognostic group index or risk score to each individual record
  Type of ANN: Partial logistic artificial neural network (PLANN)
  Results/remarks: Pathological size and histology are key factors differentiating the highest-surviving group from the rest; clinical stage and nodes specify the lowest-surviving group

Markey et al. (2003)
  Objective: Breast cancer
  Input: Mammographic findings, patient age
  Output: 1) Classification, 2) malignancy
  Type of ANN: 1) 4x4 SOM, CSNN profile of each cluster, 2) MLP (7/14/1)
  Results/remarks: 25% specificity, 98% sensitivity

Papadopulos et al. (2002)
  Objective: Detection of microcalcification (mammogram)
  Input: Image data
  Output: Microcalcification or not
  Type of ANN: Hybrid network classifier, expert system + MLP (9/20/10/1)
  Results/remarks: Performance 0.79-0.83

Papadopulos et al. (2004)
  Objective: Microcalcification in mammogram
  Input: 2048x2048 pixels; data reduction by PCA
  Output: Malignant or benign
  Type of ANN: 1) rule-based classifier, 2) MLP (7/15/1), 3) SVM
  Results/remarks: 94.04% positive predictive value and 97.6% negative predictive value

Ronco (1999)
  Objective: Breast cancer screening
  Input: 42 variables: data from familial history of cancer and sociodemographic, gyneco-obstetric, and dietary variables
  Output: Cancer yes/no
  Type of ANN: MLP (42/16-256/1)
  Results/remarks: Improvement of cost/benefit ratio

Setiono (1996)
  Objective: Breast cancer diagnosis
  Input: Wisconsin Breast Cancer Diagnosis data (WBCD)
  Output: Benign/malignant
  Type of ANN: MLP (pruned NN)
  Results/remarks: 95% accuracy rate; rule extraction

Setiono (2000)
  Objective: Breast cancer diagnosis
  Input: WBCD
  Output: Benign/malignant
  Type of ANN: MLP (/3,5/)
  Results/remarks: Ditto, improved by preprocessing, e.g., relevant parameters, extracting missing attributes

preliminary work, now providing understandable knowledge for classification and
generating better-performing rules for preventing septic shock. Rule extraction
from neural networks had been in a state of stagnation for the last few years
(Tickle et al., 1998); however, the very recent works of Duch et al. (2001),
Markowska-Kaczmar and Trelak (2003), and Hammer et al. (2002, 2003) have
brought new issues back into discussion.
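One pedagogical flavor of rule extraction, distinct from the specific methods cited above, approximates an opaque model's decisions with an interpretable surrogate. The sketch below uses a hypothetical stand-in scoring function rather than a trained ANN: it searches for the single threshold rule that best reproduces the black box's outputs and reports the rule's fidelity.

```python
# Post-hoc rule extraction as surrogate fitting: find the one-feature
# threshold rule (decision stump) most faithful to an opaque classifier.
# The "black box" is a toy stand-in, not a real trained network.

def black_box(x):
    return 1 if 0.7 * x[0] + 0.3 * x[1] > 0.5 else 0

# Probe the black box on a grid over the unit square.
samples = [(i / 20.0, j / 20.0) for i in range(21) for j in range(21)]
labels = [black_box(x) for x in samples]

def stump_fidelity(feature, threshold):
    # Fraction of probes where the stump agrees with the black box.
    preds = [1 if x[feature] > threshold else 0 for x in samples]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

best = max(
    ((f, t / 20.0) for f in (0, 1) for t in range(21)),
    key=lambda ft: stump_fidelity(*ft),
)
print("IF x[%d] > %.2f THEN class 1  (fidelity %.2f)"
      % (best[0], best[1], stump_fidelity(*best)))
```

A single stump cannot match a tilted decision boundary exactly, which is the usual trade-off: the extracted rule is readable but only approximately faithful.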


Pregnancy
The applications of ANNs related to pregnancy cover nearly
all states of early development, from predicting ovulation time (Gürgen et al., 1995)
to predicting fertilization success (Cunningham et al., 2000) and intrauterine growth retardation (Gurgen, 1999; Gurgen et al., 1997) up to cardiotocography
(CTG) (Ulbricht et al., 1998). All investigations listed in Table 3 are based on
MLPs with input variables of different numeric scales. The input variables span
the spectrum from symbolic variables (Cunningham et al., 2000) to numeric
ultrasound (Gurgen, 1999; Gurgen et al., 1997, 2001), heart rate (Ulbricht et al.,
1998), and hormone (Cunningham et al., 2000; Gürgen et al., 1995) parameters.
Gurgen et al. (2001), Alonso-Betanzos et al. (1999), and Ulbricht et al. (1998)
compare MLPs with Bayes discriminant analysis (Gurgen, 1999), neurofuzzy
MLPs with neurofuzzy radial basis functions (Gurgen et al., 2001), and MLPs
with a Bayes model, discriminant analysis, the Shortliffe/Buchanan algorithm
(Shortliffe & Buchanan, 1975), as well as with human experts (Alonso-Betanzos et al.,
1999). Gurgen (1999) achieved a better success rate for predicting intrauterine
growth retardation with MLPs than with Bayesian-based statistical approaches.
On incomplete input data, the results of Alonso-Betanzos et al. (1999) show
small advantages for the Shortliffe/Buchanan (SB) algorithm with respect to
false positives. With respect to validation, the three experts, the SB algorithm,
and the ANN presented a higher level of true positives.
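Sensitivity, specificity, and the true/false positive rates used throughout these comparisons come directly from a confusion matrix. A minimal example with hypothetical counts:

```python
# Figures of merit from a confusion matrix (hypothetical counts,
# not taken from any study cited in this chapter).
tp, fn = 45, 5      # diseased cases: correctly flagged / missed
tn, fp = 80, 20     # healthy cases: correctly cleared / false alarms

sensitivity = tp / (tp + fn)   # true-positive rate: diseased cases caught
specificity = tn / (tn + fp)   # true-negative rate: healthy cases cleared
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
# prints: sensitivity=0.90 specificity=0.80
```

The tension the breast cancer and cardiology sections describe is visible here: lowering a decision threshold moves cases from fn to tp (raising sensitivity) but also from tn to fp (lowering specificity).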

Image Analysis
General
Automated image analysis is a growing and important area for the application
of ANN pattern recognition techniques. In healthcare in particular, pattern
recognition is used to identify and extract important features in ECTs, MRIs,
radiographs, smears, and so forth. Especially with the move toward filmless
radiography and photography departments, the role of image analysis is
likely to grow. An important step in the analysis of biomedical images is the
precise segmentation of the objects in an image. The construction of a
3D model is generally carried out by overlaying contours obtained from a 2D
segmentation of each image slice.


Table 3. Applications of ANNs in research related to pregnancy (? = not
specified in reference)

Alonso-Betanzos et al. (1999)
  Objective: Pregnancy monitoring with incomplete data
  Input: NST parameters, risk factors
  Output: Indicators for good, slightly bad, moderately bad, or quite bad fetal outcome
  Type of ANN: 1) MLP (8/16/4), 2) Bayes discriminant, 3) ANN, 4) Shortliffe/Buchanan (SB), 5) experts
  Results/remarks: Best results SB; ANN better than experts

Cunningham et al. (2000)
  Objective: Fertilization
  Input: 28 parameters: 18 numeric, 10 symbolic
  Output: Successful or not
  Type of ANN: MLP (28/?/1)
  Results/remarks: Stability problems; results are highly sensitive to training data

Gürgen et al. (1995)
  Objective: Predicting ovulation time
  Input: Luteinizing hormone (LH)
  Output: Ovulation time
  Type of ANN: 5 MLPs (1/4-6/1)
  Results/remarks: Peak error 0.07-0.12

Gurgen et al. (1997)
  Objective: Intrauterine growth retardation
  Input: Weekly ultrasonic examination parameters (WI, HC, AC, HC/AC, + Gaussian noise in training data)
  Output: Normal, symmetric IUGR, asymmetric IUGR
  Type of ANN: MLP (4,8,12,16/2,3,4,5/3)
  Results/remarks: Success rate increased from the 1st gestational week until the 4th from 61% to 95%

Gurgen (1999)
  Objective: 1) Predicting intrauterine growth retardation (IUGR) and 2) ovulation time on the basis of luteinizing hormone (LH)
  Input: 1) week index (WI), head circumference (HC), abdominal circumference (AC), HC/AC between the 16th and 36th week, 2) LH
  Output: 1) Symmetric IUGR, normal, asymmetric IUGR, 2) ovulation time
  Type of ANN: 1) MLPs (=1582), 2) Bayes
  Results/remarks: 1) Success rate of 95%, 2) peak error 0.07-0.12; the NN approach is particularly more appropriate than Bayesian-based statistical approaches

Gurgen et al. (2001)
  Objective: Antenatal fetal risk assessment
  Input: 4 Doppler ultrasound blood flow velocity waveforms
  Output: Grade values: 0.7-0.95 alarming risk, 0.4-0.7 suspicious situation, 0-0.4 good fetal conditions
  Type of ANN: 1) MLP, 2) neurofuzzy MLP, 3) NF-RBF decision
  Results/remarks: Sensitivity 88-100%, specificity 95-100%, PPT 90-100%, PNT 93-100%

Ulbricht et al. (1998)
  Objective: Cardiotocogram (CTG)
  Input: The baseline of the fetal heart rate and values indicating the microfluctuation
  Output: Deceleration / no deceleration
  Type of ANN: MLP (15/3/2)
  Results/remarks: Correct positive 72-90%, false negative 20-40%

Comprehensive reviews of image processing with neural networks
are provided by Egmont-Petersen et al. (2001), Miller et al. (1992), and Nattkemper
(2004). Examples of filtering, segmentation, and edge detection techniques using
cellular neural networks (CNNs) to improve resolution in brain tomographies and
to improve global frequency correction for the detection of microcalcifications
in mammograms can be seen in the work of Aizenberg et al. (2001). An ANN


was successfully applied to enhance low-level segmentation of eye images for
the diagnosis of Graves’ ophthalmopathy (Ossen et al., 1994). The ANN segmentation
system was integrated into an existing medical imaging system, which provides
a user interface for interactive selection of images, ANN architectures, training
algorithms, and data.
Dellepiane (1999) concentrates on the importance of feature selection in image
analysis. The proposed approach, based on the definition of a training set,
represents a possible solution to the problem of selecting the best feature set that
is characterized by robustness and reproducibility.
A patented commercial product for cytodiagnosis, especially for cancer detection and diagnosis, is presented by Koss (1999). PAPNET was
constructed to select a limited number of cells from cytologic
preparations for display on a high-resolution television monitor. Digitized images
representing abnormal cells from cervical smears are fed into an ANN circuit.
The machine showed a sensitivity of 97% for abnormal smears. The ANN-based
devices are interactive and do not offer an automated verdict, leaving the
diagnosis to trained humans.
Further applications of ANN in medical image analysis are listed in Table 4.
A proposal for the segmentation of images based on active contours is presented
by Vilarino et al. (1998). Their application of cellular neural networks (CNNs)
(Chua & Yang, 1988) operates by means of active contours. CNNs are
defined as an n-dimensional array of dynamic processing elements, called cells,
with local interactions between them. The CNN allows a continuous treatment
of the contour and leads to greater precision in the adaptation implemented in the
CNN.
An alternative approach to image analysis, in the context of recognizing
lymphocytes in tissue sections, is suggested by Nattkemper et al. (2001). After
data reduction of the digitized images by means of principal component analysis,
local linear maps (LLMs) were applied for the automatic quantization of fluorescent
lymphocytes in tissue sections. The LLM approach was originally motivated by
the Kohonen self-organizing map (SOM) (Kohonen, 1982, 2001), with the
intention of obtaining a better map resolution even with a small number of units.
The LLM combines supervised and unsupervised learning, which distinguishes it from
MLPs trained with back propagation. Once trained, the neural cell detection system
(NCDS) detected a minimum of 95% of the cells and delivered the number, positions,
and phenotypes of the fluorescent cells within 2 minutes per image. In
comparison to human experts and support vector machines (SVMs) (Nattkemper
et al., 2003), this LLM-based high-throughput screening approach is much faster
and achieves the best recognition rate. A further advantage of the LLM is its
robustness: for micrographs of varying quality, it is more resistant
to variations in the image domain.
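The unsupervised half of SOM-motivated approaches like the LLM can be illustrated with a minimal one-dimensional Kohonen map. The toy below uses hypothetical scalar data, not the micrograph features above: codebook units spread out to cover the data distribution through winner-take-most updates with a decaying learning rate.

```python
import random

random.seed(1)

# Minimal 1-D Kohonen self-organizing map on scalar data: a sketch of the
# SOM idea only, not the LLM or NCDS systems described in the text.
units = [random.random() for _ in range(5)]          # codebook vectors
data = [random.random() for _ in range(500)]         # training samples

for t, x in enumerate(data):
    lr = 0.5 * (1 - t / len(data))                   # decaying learning rate
    winner = min(range(len(units)), key=lambda i: abs(units[i] - x))
    for i in range(len(units)):
        # Neighborhood function: full update for the winner,
        # half-strength for its immediate neighbors on the map.
        h = 1.0 if i == winner else (0.5 if abs(i - winner) == 1 else 0.0)
        units[i] += lr * h * (x - units[i])

print(sorted(round(u, 2) for u in units))  # units cover the data range
```

The LLM extends this picture by attaching a local linear output mapping to each unit, which is what adds the supervised component mentioned above.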


Table 4. Applications of ANNs in medical image analysis (? = not specified
in reference)

Nattkemper et al. (2001)
  Objective: Image analysis; lymphocytes in tissue sections
  Input: 6-dimensional feature vector extracted by PCA
  Output: ?
  Type of ANN: Local linear map (LLM); combined supervised and unsupervised learning
  Results/remarks: Detected 95% of the cells; experts detected only 80%; the trained system evaluates an image in 2 min

Nattkemper et al. (2003)
  Objective: Microscopic analysis of lymphocytes
  Input: 658x517 pixel micrographs; PCA, six eigenvectors
  Output: dto.
  Type of ANN: 1) SVM, 2) LLM
  Results/remarks: ANNs classified microscopy data with satisfactory results

Pizzi et al. (1995)
  Objective: Alzheimer’s-diseased tissue
  Input: 1) infrared spectra (IS), 2) PCA of infrared spectra
  Output: Classification into 2 groups / 5 groups
  Type of ANN: 1) LDA, 2) MLP
  Results/remarks: PCA of IS, 2 groups: 98%; PCA of IS, 2 groups: 100%; IS alone always provided poorer results

Tjoa and Krishnan (2003)
  Objective: Colon status from endoscopic images
  Input: Texture features, texture spectra, colon features, PCA
  Output: Normal/abnormal
  Type of ANN: MLP (?/525/2)
  Results/remarks: Texture and color: 97.7%; texture: 96.6%; color: 90.5%

Vilarino et al. (1998)
  Objective: Images, e.g., computational tomography (CT)
  Input: Images (with energy information, i.e., different gray scales)
  Output: Contours in images
  Type of ANN: Discrete-time cellular neural network (DTCNN)
  Results/remarks: (Automatic) detection of contours in images for segmentation

Zhou et al. (2002)
  Objective: Identification of lung cancer cells
  Input: Images of the specimens of needle biopsies
  Output: 1) Normal cell, 2) cancer cell: a) adenocarcinoma, b) squamous cell carcinoma, c) small cell carcinoma, d) large cell carcinoma, e) normal
  Type of ANN: ANN ensembles
  Results/remarks: An automatic pathological diagnosis procedure named Neural Ensemble-based Detection (NED) is proposed

The importance of data preprocessing in a similar context is demonstrated in the
investigation of Pizzi et al. (1995). ANN classification methods were applied to
the infrared spectra of histopathologically confirmed Alzheimer’s-diseased and
control brain tissue. When the data were preprocessed with PCA, the ANNs
outperformed their linear discriminant counterparts: 100% vs. 98% correct
classifications. Only one of the three selected ANN architectures produced
comparable results when the original spectra were used.
Assistance for the physician in detecting the colon status from
colonoscopic images was the aim of the ANN model developed by Tjoa
and Krishnan (2003). Schemes were developed to extract texture features from
the texture spectra in the chromatic and achromatic domains, and color features
for a selected region of interest from each color component histogram of the
colonoscopic images. As in the previous investigations, the features were
first reduced in size by means of PCA before they were evaluated with a
backpropagation ANN. Using texture and color features with PCA led to an average
classification accuracy of 97.7%, outperforming classification by means of texture
(96.9%) or color (90.5%) features alone.

Figure 1. A classifier ensemble of neural networks
An ensemble of artificial neural networks was applied by Zhou et al. (2002) for
the identification of lung cancer cells in images of needle biopsy specimens.
Neural network ensembles (Figure 1) were proposed by Hansen and
Salamon (1990) to improve the performance and training of neural networks
for classification. Cross validation is used as a tool for optimizing network
parameters and architectures, and the remaining residual generalization error
can be reduced by invoking ensembles of similar networks (Islam et al., 2003;
Krogh & Vedelsby, 1995). The comparison of single neural networks and
ensembles of neural networks showed a clear decrease in the error rate, from
45.5% down to 13.6% on average.
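Hansen and Salamon's argument for ensembles can be made concrete with a toy simulation: if ensemble members err independently, majority voting is right far more often than any single member. The sketch below assumes idealized, independent members with 70% accuracy, which real networks trained on the same data only approximate.

```python
import random

random.seed(0)

# Majority voting over 9 independent classifiers, each correct with
# probability 0.70 (an idealized assumption, not the Zhou et al. setup).
def member_correct():
    return random.random() < 0.70

trials = 10_000
ensemble_hits = 0
for _ in range(trials):
    votes = sum(member_correct() for _ in range(9))  # correct votes cast
    ensemble_hits += votes > 4                       # majority is correct
rate = ensemble_hits / trials
print(f"ensemble accuracy ~ {rate:.3f} vs. single member 0.70")
```

The binomial calculation predicts about 0.90 here; in practice the gain is smaller because members trained on the same data make correlated errors, which is why ensemble methods emphasize diversity among the networks.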

MRI
Within the area of image analysis, a substantial number of publications focus
on magnetic resonance imaging (MRI) and functional magnetic
resonance imaging (fMRI) (Table 5). The imaging methods can lead to
automated techniques for identifying and quantifying cortical regions of the brain. Hall
et al. (1992) compare segmentation techniques for MRI on the basis of pixel and
feature data. Segmentation methods for MRI data are developed to enhance
tumor/edema boundary detection, for surgery simulation, and for radiation
treatment planning. Similar results are provided by supervised (forward cascade


Table 5. Applications of ANNs in medical image analysis: MRI images

Dimitriadou et al. (2004)
  Objective: fMRI; comparison of clustering methods
  Input: fMRI image data
  Output: Tissue classification
  Type of ANN: Hierarchical, crisp (neural gas, self-organizing maps, hard competitive learning, k-means, maximum and minimum distance, CLARA) and fuzzy (c-means, fuzzy competitive learning) clustering methods
  Results/remarks: The neural gas method seems to be the best choice for fMRI cluster analysis; neural gas and the k-means algorithm perform significantly better than all others

Gelenbe et al. (1996)
  Objective: Volumetric magnetic resonance imaging
  Input: 5x5 blocks of pixels of the MRI with their grey level
  Output: A finite set of M regions R1, ..., RM of the MRI which are to be identified
  Type of ANN: Recurrent pulsed random network (RPRN)
  Results/remarks: Results are similar to a human expert carrying out manual volumetric analysis of brain MR images

Hall et al. (1992)
  Objective: Magnetic resonance images
  Input: MRI data (256x256 pixels with three features: T1, T2, ρ), leading to a 65536x3 matrix for each image
  Output: Up to seven classes of (MR) tissue, e.g., white or grey matter or edema
  Type of ANN: 1) fuzzy c-means algorithm (unsupervised), 2) approximate fuzzy c-means (unsupervised), 3) FF cascade correlation NN (supervised)
  Results/remarks: Supervised and unsupervised segmentation techniques provide similar results

Lai and Fang (2000)
  Objective: MR images
  Input: 1) wavelet histogram features, 2) spatial statistical features
  Output: Classifying into multiple clusters
  Type of ANN: One hierarchical NN (RBF) for clustering and a number of estimating networks
  Results/remarks: Automatic adjustment of display window width and center for a wide range of magnetic resonance (MR) images

Reddick et al. (1997)
  Objective: Multispectral magnetic resonance images
  Input: 1) input vector of the non-normalized T1, T2, and PD signal intensities for a single pixel within the region of interest in MRI, 2) three components of the weight vector associated with each neuron of the SOM
  Output: 1) nine levels of segmented images, 2) classification of MRI pixels into seven classes: white, white/grey, grey, grey/CSF, CSF, other, background
  Type of ANN: 1) 3x3 SOM for segmentation, 2) MLP (3/7/7) for classification
  Results/remarks: ICC (ri) of 0.91, 0.95, and 0.98 for white matter, gray matter, and ventricular cerebrospinal fluid (CSF), respectively

Wismuller et al. (2004)
  Objective: fMRI
  Input: Five slices with 100 images; resolution 3x3x4 mm
  Output: Tissue classification
  Type of ANN: 1) neural gas (NG) network, 2) SOM, 3) fuzzy clustering based on deterministic annealing
  Results/remarks: Neural gas and fuzzy clustering outperformed KSOM; NG outperformed with respect to quantization error; KSOM outperformed with respect to computational expense

correlation neural network) and unsupervised (fuzzy algorithms) segmentation
techniques.
Lai and Fang (2000) present a hierarchical neural network-based algorithm for
robust and accurate display-window parameter estimation. Instead of image
pixels, wavelet histogram features and spatial statistical features are extracted


A Survey on Various Applications of Artificial Neural Networks 33

from the images and combined into an input vector. The hierarchical neural
network consists of a competitive-layer neural network for clustering, two radial
basis function networks with a bimodal linear estimator for each subclass, and a
data-fusion process that combines the estimates from both estimators to compute
the final display parameters. The algorithm was tested on a wide range of MR
images and displayed satisfactory results.
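The two-stage idea (cluster first, then estimate per subclass) can be illustrated generically. The sketch below is not Lai and Fang's architecture: a nearest-prototype assignment stands in for the competitive layer, and a per-cluster least-squares fit stands in for the RBF estimators; all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for the two-stage scheme: cluster feature vectors, then apply
# a per-cluster linear estimator. Entirely illustrative, not Lai and Fang's
# actual networks or data.
X = rng.normal(size=(200, 4))                  # "image feature vectors"
true_w = np.array([1.0, -2.0, 0.5, 0.0])
y = X @ true_w + 0.1 * rng.normal(size=200)    # "display parameter" to estimate

# Stage 1: competitive layer = assignment to the nearest of three prototypes.
protos = X[rng.choice(len(X), 3, replace=False)]
cluster = np.argmin(((X[:, None] - protos[None]) ** 2).sum(-1), axis=1)

# Stage 2: one least-squares estimator per cluster (stand-in for the RBF nets).
preds = np.empty(len(X))
for c in range(3):
    m = cluster == c
    w, *_ = np.linalg.lstsq(X[m], y[m], rcond=None)
    preds[m] = X[m] @ w

rmse = np.sqrt(np.mean((preds - y) ** 2))
print(f"in-sample RMSE: {rmse:.3f}")
```

Fitting a separate estimator inside each cluster lets simple local models cover a globally complicated mapping, which is the motivation for the hierarchical design.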
A geometric recurrent pulsed random network for extracting precise morphometric information from MRI scans is presented by Gelenbe et al. (1996). The
network is composed of several layers with internal interneural interaction and
recurrence within each individual layer. A correct classification rate of 98.4% was observed, based on 5x5 pixel matrices as input data.
A hybrid neural network method for automated segmentation and classification
of multispectral MRIs of the brain is demonstrated by Reddick et al. (1997). The
automated process uses a Kohonen self-organizing map for segmentation and
an MLP backpropagation network for classifying the different tissue types.
Intraclass correlation of the automated segmentation and classification of
tissues with the standard radiologist's identification yielded coefficients of
r = 0.91 for white matter, r = 0.95 for gray matter, and r = 0.98 for ventricular
cerebrospinal fluid. The fully automated procedure produced reliable MR image
segmentation and classification while eliminating intra- and inter-observer
variability.
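The segmentation stage of such a hybrid pipeline can be sketched with a minimal self-organizing map. The code below is a generic Kohonen SOM on synthetic three-channel "pixels" (stand-ins for T1, T2, and PD intensities), not Reddick et al.'s implementation; the 3x3 grid merely mirrors their nine segmentation levels.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(X, grid=(3, 3), iters=2000, lr0=0.5, sigma0=1.0):
    """Minimal Kohonen SOM; returns a (grid_h*grid_w, n_features) codebook."""
    h, w = grid
    nodes = np.array([(i, j) for i in range(h) for j in range(w)], float)
    W = rng.normal(size=(h * w, X.shape[1]))
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))       # best-matching unit
        d2 = ((nodes - nodes[bmu]) ** 2).sum(axis=1)      # grid distance to BMU
        frac = t / iters
        lr = lr0 * (1 - frac)                             # decaying learning rate
        sigma = sigma0 * (1 - 0.9 * frac)                 # shrinking neighborhood
        hnb = np.exp(-d2 / (2 * sigma ** 2))
        W += lr * hnb[:, None] * (x - W)
    return W

# Synthetic "multispectral pixels": three intensity channels, three tissue-like
# clusters (purely illustrative values).
X = np.vstack([rng.normal(m, 0.1, size=(200, 3)) for m in (0.2, 0.5, 0.8)])
W = train_som(X)                     # 3x3 map -> nine segmentation levels
seg = np.argmin(((X[:, None, :] - W[None]) ** 2).sum(-1), axis=1)
```

In the full pipeline, the resulting segmentation labels would then be fed to a supervised classifier (an MLP in Reddick et al.'s case) to assign tissue classes.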
fMRI based on blood-oxygenation-level-dependent (BOLD) signal changes allows the assessment of brain activity via local hemodynamic variations over time (Ogawa
et al., 1992, 1993).
An alternative approach for fMRI analysis, based on unsupervised clustering, is
suggested by Wismuller et al. (2004). The proposed neural gas (Martinetz et al.,
1993) is a neuronal clustering algorithm that addresses several weaknesses of
classical Kohonen nets. One problem in clustering unknown data with Kohonen
nets is that the net's dimensions must be fixed in advance. When this choice does
not match the dimensionality of the data, topological defects may occur that can
lead to deficient mappings. In contrast, the neural gas has no predefined topology
determining the neighborhood relation between the neurons. The neighborhood
in the neural gas (Figure 2) is determined exclusively by the relation of the
neurons' weights in the input data space: the reference vectors move freely in
the data space. Wismuller et al. (2004)
analyzed fMRI data from subjects performing a visual task. Five slices with 100
images were acquired. The clustering results were evaluated by assessment of
cluster assignment maps, task-related activation maps, and associated time
courses. The major findings were that neural gas and fuzzy clustering outperformed Kohonen’s map in terms of identifying signal components with high
correlation to the fMRI stimulus. The neural gas outperformed the two other


Figure 2. Growing neural gas (from Fritzke, 1997). Panels show the network after a) 0, b) 100, c) 300, d) 1000, e) 2500, f) 10000, and g) 40000 input signals, and h) the resulting Voronoi regions.

methods regarding the quantization error, and Kohonen’s map outperformed the
two other methods in terms of computational expense.
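The rank-based update that distinguishes the neural gas from a Kohonen map can be written down compactly. The sketch below follows the adaptation rule of Martinetz et al. (1993) on synthetic 2-D data; every parameter value is an arbitrary illustrative choice, not one taken from the studies above.

```python
import numpy as np

rng = np.random.default_rng(2)

def neural_gas(X, n_units=6, iters=3000):
    """Minimal neural gas: each reference vector is adapted by an amount that
    decays with its distance *rank* to the input, so no grid topology has to
    be fixed in advance (unlike a Kohonen map)."""
    W = X[rng.choice(len(X), n_units, replace=False)].copy()
    for t in range(iters):
        frac = t / iters
        eps = 0.5 * (0.01 / 0.5) ** frac       # learning rate, 0.5 -> 0.01
        lam = 3.0 * (0.1 / 3.0) ** frac        # neighborhood range, 3 -> 0.1
        x = X[rng.integers(len(X))]
        dists = ((W - x) ** 2).sum(axis=1)
        ranks = np.argsort(np.argsort(dists))  # rank 0 = closest unit
        W += eps * np.exp(-ranks / lam)[:, None] * (x - W)
    return W

# Toy data: three well-separated 2-D blobs.
X = np.vstack([rng.normal(c, 0.05, size=(100, 2))
               for c in ((0, 0), (1, 0), (0, 1))])
codebook = neural_gas(X)

# Quantization error: mean distance of each point to its nearest unit.
qe = np.min(np.linalg.norm(X[:, None] - codebook[None], axis=-1), axis=1).mean()
print(f"quantization error: {qe:.3f}")
```

Because the neighborhood is defined by distance ranks in the input space rather than by grid coordinates, the reference vectors settle onto the data without the topological defects a mis-sized Kohonen grid can produce.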
Similarly, Dimitriadou et al. (2004) compared several fMRI cluster analysis techniques. The clustering algorithms examined were hierarchical, crisp (neural gas,
SOM, hard competitive learning, k-means, maximum and minimum distance, CLARA) and
fuzzy (c-means, fuzzy competitive learning). The results clearly showed that the
neural gas and the k-means algorithm perform significantly better than all the
other methods in their set-up. Overall, the neural gas method seems to be the
best choice for fMRI cluster analysis, as it yields true positives while minimizing
false positives and shows stable results.

Neuromuscular Control
Low Back Pain
Besides the categorization of groups with common degrees of impairment during gait (cf.
Chapter X), one of the most disabling orthopaedic problems is a frequent subject
of classification tasks with ANNs: low back pain and trunk motion. Selected
publications are listed in Table 6.


Bishop et al. (1997) analyzed several forms of MLPs for relating low-back pain
sensed during a set of trunk motions to selected kinematic parameters collected
with a tri-axial goniometer. In comparison to linear classifiers, the ANN
classifier produced superior results, with up to 85% accuracy on test data.
Despite the system's dependence on the low-back classification scheme, the authors
claim to have found a system that could markedly improve the management of
low-back pain in the individual patient.
With a similar goal of classifying the relationship between pain and vertebral
motion, Dickey et al. (2002) collected data of vertebral motions of the lumbar

Table 6. Applications of ANNs in neuromuscular control/orthopaedics: low back pain and trunk motion (? = not specified in reference)

| Authors | Objective | Input | Output | Type of ANN | Results/remarks |
|---|---|---|---|---|---|
| Bishop et al. (1997) | Low back pain | Tri-axial goniometer data | Low back pain | MLP (?/?/?) (RBF) | |
| Dickey et al. (2002) | Low back pain | Kinematic motion parameters derived from percutaneous intra-pedicle screws (L4, L5, S1) | Level of pain (0-10) | MLP (32/10/1); linear discriminant analysis (LDA) | |
| Gioftsos and Grieve (1996) | Chronic low back pain | 242 variables: forces and pressures at feet, knee, and hip, and lumbar movements | | MLP (242/20/3) for classification | 86% were diagnosed correctly. More successfu