This is a book about learning
from experimental data and about transferring human knowledge
into analytical models.
Such tasks belong to the field of
soft computing. Neural networks (NNs) and support vector
machines (SVMs) are the mathematical structures (models) that
stand behind the idea of learning, and fuzzy logic (FL) systems
are aimed at embedding structured human knowledge into workable
algorithms. However, there is no clear boundary between these
two modeling approaches. The notions, basic ideas, fundamental
approaches, and concepts common to these two fields, as well
as the differences between them, are discussed in some detail.
The sources of this book are course material presented by the
author in undergraduate and graduate lectures and seminars,
and the research of the author and his graduate students. The
text is therefore both class- and practice-tested.
The primary idea of the book
is that not only is it useful to treat support vector machines,
neural networks, and fuzzy logic systems as parts of a connected
whole but it is in fact necessary. Thus, a systematic and unified
presentation is given of these seemingly different fields -
learning from experimental data and transferring human knowledge
into mathematical models.
Each chapter is arranged
so that the basic theory and algorithms are illustrated by
practical examples and followed by a set of problems and simulation
experiments. In the author's experience, this approach is the
most accessible, pleasant, and useful way to master this material,
which contains many new (and potentially difficult) concepts.
To some extent, the problems are intended to help the reader
acquire technique, but most of them serve to illustrate and
develop further the basic subject matter of the chapter. The
author feels that this structure is suitable both for a textbook
used in a formal course and for self-study.
How should one read this
book? A kind of newspaper reading, starting with the back pages,
is potentially viable but not a good idea. However, there are
useful sections at the back. There is an armory of mathematical
weaponry and tools containing a lot of useful and necessary
concepts, equations, and methods. More or less frequent trips
to the back pages (chapters 8 and 9) are probably unavoidable.
But in the usual way of books, one should most likely begin
with this preface and continue reading to the end of chapter
1. This first chapter provides a pathway to the learning and
soft computing field, and after that, readers may continue
with any chapters they feel will be useful. Note, however,
that chapters 3 and 4 are connected and should be read in that
order. (See the figure in the Chapters' Survey, which shows
the connections between the chapters, as well as the concise
chapter descriptions given there.)
In senior undergraduate classes,
the order followed was chapters 1, 3, 4, 5, and 6, and chapters
8 and 9 when needed. For graduate classes, chapter 2 on support
vector machines is included as well, and the chapters are followed
in their regular order, working directly through chapters 1-6.
There is some redundancy
in this book for several reasons. The whole subject of this
book is a blend of different areas. The various fields bound
together here used to be separate, and today they are amalgamated
in the broad area of learning and soft computing. Therefore,
in order to present each particular segment of the learning
and soft computing field, one must follow the approaches, tools,
and terminology in each specific area. Each area was developed
separately by researchers, scientists, and enthusiasts with
different backgrounds, so many things were repeated. Thus,
in this presentation there are some echoes but, the author
believes, not too many. He agrees with the old Latin saying
Repetitio est mater studiorum - repetition is the mother of learning.
This provides the second explanation of the 'redundancy' in this
volume.
A few words about the accompanying
software are in order. All the software is based on MATLAB.
All programs run in R11 and R12, i.e., in versions 5 and 6.
The author designed and created
the complete aproxim directory, the entire SVM toolbox for
classification and regression, the multilayer perceptron routine
that includes error backpropagation learning, the first
versions of the core programs for RBF models with n-dimensional
inputs, and some of the core fuzzy logic models. Some programs
date back as far as 1992, so they may not be very elegant.
However, all are effective and perform their allotted tasks
as well as needed.
The author's students played
an important part in creating user-friendly programs with attractive
pop-up menus and boxes. These students came from different
parts of the world, and the software was developed in different
countries - Yugoslavia, the United States, Germany, and New Zealand.
Most of the software was developed in New
Zealand. These facts are mentioned to explain why readers may
find program notes and comments in English, Serbian, and German.
(However, all the basic comments are written in English.) We
deliberately left these lines in various languages as nice
traces of the small modern world. Without the work of these
multilingual, ingenious, diligent students and colleagues,
many of the programs would be less user-friendly and, consequently,
less adequate for learning purposes.
The MATLAB version of the
package LEARNSC (as pre-parsed pseudo-code files, P-files) needed
for the simulation experiments (compatible with Releases R11 and
R12, i.e., with the MATLAB 5 and MATLAB 6 versions, respectively)
is freely downloadable from this site.
Readers interested in the
author's programming solutions may purchase the source MATLAB
code (M-files) of the program package LEARNSC (for
Releases R11 and R12) from the same site. Note that both the
LEARNSC software package that accompanies this book and
each of its routines are for fair use only and free
for all educational purposes. They may not be used for any
kind of commercial activity.
A preliminary draft of this
book was used in the author's senior undergraduate and graduate
courses at various universities in Yugoslavia, Germany, the United
States, and New Zealand. The valuable feedback from the curious
students who took these courses made many parts of this book easier
to read. The author thanks them for that.
Introductory part
In this book no suppositions
are made about preexisting analytical models. There are, however,
no limits to human curiosity and the need for mathematical
models. Thus, when devising algebraic, differential, discrete,
or any other models from first principles is not feasible,
one seeks other avenues to obtain analytical models. Such models
are devised by solving two cardinal problems in modern science
and engineering:
- Learning from experimental data
(examples, samples, measurements, records, patterns, or observations)
by support vector machines (SVMs) and neural networks (NNs)
- Embedding existing structured
human knowledge (experience, expertise, heuristics) into
workable mathematics by fuzzy logic models (FLMs).
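To make these two tasks concrete, here is a minimal MATLAB sketch. It is not part of the book's software, and the data, rules, and numerical values in it are purely illustrative: the first part learns a one-dimensional mapping from noisy samples with Gaussian radial basis functions and a least-squares solution, and the second part turns two linguistic rules into a workable numerical expression by center-of-average defuzzification.

    % Task 1 (illustrative only): learning a 1-D mapping from noisy data
    % with Gaussian radial basis functions and a least-squares solution.
    x = linspace(0, 2*pi, 50)';        % inputs (measurements)
    y = sin(x) + 0.1*randn(size(x));   % noisy targets (experimental data)
    c = linspace(0, 2*pi, 10);         % centers of ten Gaussian basis functions
    s = 0.8;                           % width (shape parameter) of the Gaussians
    n = length(x); m = length(c);
    G = exp(-(repmat(x,1,m) - repmat(c,n,1)).^2 / (2*s^2));  % design matrix
    w = G \ y;                         % output-layer weights by least squares
    yhat = G * w;                      % model output on the training inputs
    plot(x, y, 'o', x, yhat, '-')      % data versus the learned approximation

    % Task 2 (illustrative only): embedding two linguistic rules, e.g.,
    %   IF temperature is low  THEN heater power is high (about 80%)
    %   IF temperature is high THEN heater power is low  (about 20%)
    t     = 12;                                  % current temperature, deg C
    muLo  = exp(-(t -  5)^2 / (2*5^2));          % degree of "temperature is low"
    muHi  = exp(-(t - 25)^2 / (2*5^2));          % degree of "temperature is high"
    power = (80*muLo + 20*muHi) / (muLo + muHi); % center-of-average defuzzification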
These problems seem to be very different,
and in practice that may well be the case. However, after NN
or SVM modeling from experimental data is complete, and after
the knowledge transfer into an FLM is finished, these two models
are mathematically very similar or even equivalent. This equivalence,
discussed in section 6.2, is a very attractive property, and
it may well be used to the benefit of both fields.
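As a brief illustration of the kind of equivalence meant here (the details are in section 6.2), consider one specific, well-known case. A radial basis function network with Gaussian basis functions, per-dimension widths, and a normalized output computes

f(\mathbf{x}) = \frac{\sum_{i=1}^{N} w_i \exp\left(-\sum_j (x_j - c_{ij})^2 / (2\sigma_{ij}^2)\right)}{\sum_{i=1}^{N} \exp\left(-\sum_j (x_j - c_{ij})^2 / (2\sigma_{ij}^2)\right)},

while a fuzzy model with Gaussian membership functions \mu_{ij}(x_j) = \exp\left(-(x_j - c_{ij})^2 / (2\sigma_{ij}^2)\right), product inference, constant (singleton) rule consequents w_i, and center-of-average defuzzification computes

f(\mathbf{x}) = \frac{\sum_{i=1}^{N} w_i \prod_j \mu_{ij}(x_j)}{\sum_{i=1}^{N} \prod_j \mu_{ij}(x_j)}.

Because a product of Gaussians in the individual inputs x_j is a single Gaussian in the input vector, the two expressions coincide term by term: the rule consequents play the role of the output-layer weights, and the membership functions play the role of the basis functions. Under other modeling choices the two models are similar rather than identical.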
The need for a book about
these topics is clear. Recently, many new 'intelligent' products
(theoretical approaches, software and hardware solutions, concepts,
devices, systems, and so on) have been launched on the market.
Much effort has been made at universities and in R&D departments
around the world, and numerous papers have been written on
how to apply NNs, FLMs, and SVMs, and the related ideas of
learning from data and embedding structured human knowledge.
These two concepts and associated algorithms form the new field
of soft computing. They have been recognized as attractive
alternatives to the standard, well established 'hard computing'
paradigms. Traditional hard computing methods are often too
cumbersome for today's problems. They always require a precisely
stated analytical model and often a lot of computation time.
Soft computing techniques, which trade unnecessary precision for
gains in understanding system behavior, have
proved to be important practical tools for many contemporary
problems. Because they are universal approximators of any multivariate
function, NNs, FLMs, and SVMs are of particular interest for
modeling highly nonlinear, unknown, or partially known complex
systems, plants, or processes. Many promising results have
been reported. The whole field is developing rapidly, and it
is still in its initial, exciting phase.
At the very beginning,
it should be stated clearly that there are times when there
is no need for these two novel model-building techniques. Whenever
there is an analytical closed-form model, using a reasonable
number of equations, that can solve the given problem in a
reasonable time, at reasonable cost, and with reasonable accuracy,
there is no need to resort to learning from experimental data
or fuzzy logic modeling. Today, however, these two approaches
are vital tools when at least one of those criteria is not
fulfilled. There are many such instances in contemporary science
and engineering.
The title of the book gives
only a partial description of the subject, mainly because the
meaning of learning is variable and indeterminate. Similarly,
the meaning of soft computing can change quickly and unpredictably.
Usually, learning means acquiring knowledge about a previously
unknown or little known system or concept. Adding that the
knowledge will be acquired from experimental data yields the
phrase statistical learning. Very often, the devices and algorithms
that can learn from data are characterized as intelligent.
The author wants to be cautious by stating that learning is
only a part of intelligence, and no definition of intelligence
is given here. This issue has long been, and still is, addressed
by many other disciplines (notably neuroscience, biology, psychology,
and philosophy). However, staying firmly in the engineering
and science domain, a few comments on the terms intelligent
systems or smart machines are now in order.
Without any doubt the human
mental faculties of learning, generalizing, memorizing, and
predicting should be the foundation of any intelligent artificial
device or smart system. Many products incorporating NNs, SVMs,
and FLMs already exhibit these properties. Yet we are still
far away from achieving anything similar to human intelligence.
Part of a machine's intelligence in the future should be an
ability to cope with a large amount of noisy data coming simultaneously
from different sensors. Intelligent devices and systems will
also have to be able to plan under large uncertainties, to
set the hierarchy of priorities, and to coordinate many different
tasks simultaneously. In addition, the duties of smart machines
will include the detection or early diagnosis of faults, in
order to leave enough time for reconfiguration of strategies,
maintenance, or repair. These tasks will be only a small part
of the smart decision making capabilities of the next generation
of intelligent machines. It is certain that the techniques
presented here will be an integral part of these future intelligent
systems.