Neural network and its classification

Conventionally, the term neural network referred to a network or circuit of biological neurons. Today, the term frequently refers to artificial neural networks (ANNs), which are composed of nodes, or artificial neurons. The term thus has two distinct usages:

Figure

Biological neural networks are made up of real biological neurons that are linked or functionally connected in the central nervous system or peripheral nervous system. In neuroscience, they are often identified as groups of neurons that perform a specific physiological task in laboratory investigation.

Artificial neural networks are made up of interconnected artificial neurons (programming constructs that mimic the properties of biological neurons). ANNs may be used either to gain an understanding of biological neural networks or to solve artificial intelligence problems without necessarily creating a model of a real biological system. The real, biological nervous system is highly complex and includes features that may seem superfluous based on an understanding of artificial networks.

Figure

Generally, a biological neural network is a collection of a set or sets of chemically or functionally linked neurons. A single neuron may be connected to many other neurons, and the total number of neurons and connections in a network may be very large. Connections, called synapses, are usually formed from axons to dendrites, though dendritic microcircuits and other connections are possible. Apart from electrical signaling, there are other forms of signaling that arise from neurotransmitter diffusion and have an effect on electrical signaling. As such, neural networks are extremely complex.

Figure

Artificial intelligence and cognitive modeling try to replicate some properties of neural networks. While similar in their techniques, the former aims to solve particular tasks, while the latter aims to build mathematical models of biological neural systems.

In the field of artificial intelligence, ANNs have been applied successfully to speech recognition, image analysis and adaptive control, and the construction of software agents (in 3D and computer games) or autonomous robots. Most of the ANNs currently employed for artificial intelligence are based on statistical estimation, optimization, and control theory.

The cognitive modeling field involves the physical or mathematical modeling of the behavior of neural systems, ranging from the individual neural level (e.g. modeling the spike response curves of neurons to a stimulus), through the neural cluster level, to the complete organism (e.g. behavioral modeling of the organism's response to stimuli). Artificial intelligence, cognitive modeling, and neural networks are information-processing paradigms inspired by the way biological neural systems process data.

2.2 History of Neural Network

Neural network theory originated in the late 1800s as an effort to describe how the human brain worked. These ideas started being applied to computational models with Turing's B-type machines and the perceptron.

In the early 1950s, Friedrich Hayek was one of the first to posit the idea of spontaneous order in the brain arising out of decentralized networks of simple units (neurons). In the late 1940s, Donald Hebb made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning): Hebbian learning. Hebbian learning is considered to be a 'classic' unsupervised learning rule, and it (and variants of it) was an early model for long-term potentiation.

The perceptron is essentially a linear classifier for classifying data specified by parameters w, b and an output function f = w'x + b. Its parameters are adapted with an ad-hoc rule similar to stochastic steepest gradient descent. Because the inner product is a linear operator in the input space, the perceptron can only perfectly classify a set of data for which the different classes are linearly separable in the input space, while it often fails completely for non-separable data. While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.
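As a sketch, the perceptron's decision rule can be written in a few lines; the weights and bias below are illustrative values chosen by hand for the linearly separable AND function, not learned:

```python
def perceptron_output(w, b, x):
    """Return 1 if the weighted sum w'x + b exceeds 0, else 0."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0

# AND is linearly separable, so hand-picked weights suffice:
w, b = [1.0, 1.0], -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output(w, b, x))  # fires only for (1, 1)
```

For a non-separable function such as XOR, no choice of w and b makes this rule correct on all four inputs, which is exactly the inadequacy discussed above.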

The cognitron (1975) was an early multilayered neural network with a training algorithm. The actual structure of the network and the methods used to set the interconnection weights vary from one neural strategy to another, each with its advantages and disadvantages. Networks can propagate information in one direction only, or they can bounce back and forth until self-activation at a node occurs and the network settles on a final state. The ability for bi-directional flow of inputs between neurons/nodes was introduced with the Hopfield network (1982), and specialization of node layers for specific purposes was introduced through the first hybrid networks.

The parallel distributed processing of the mid-1980s became popular under the name connectionism.

Figure

The rediscovery of the back propagation algorithm was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986 (though back propagation itself dates from 1974). The original network utilized multiple layers of weighted-sum units of the type f = g(w'x + b), where g was a sigmoid or logistic function such as is used in logistic regression. Training was done by a form of stochastic steepest gradient descent. The employment of the chain rule of differentiation in deriving the appropriate parameter updates results in an algorithm that seems to 'back propagate errors', hence the nomenclature. However, it is essentially a form of gradient descent. Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to find the solution without a good starting point. In recent times, networks with the same architecture as the back propagation network are referred to as multi-layer perceptrons. This name does not impose any limitations on the type of algorithm used for learning.
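To make the chain-rule step concrete, here is a minimal sketch of gradient descent on a single weighted-sum unit f = g(w'x + b) with a sigmoid g and squared error; a full back propagation network would repeat this delta computation backwards through each layer. The AND training set, learning rate, and epoch count are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_unit(data, lr=0.5, epochs=2000):
    """Gradient descent on squared error for one unit y = sigmoid(w'x + b)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, t in data:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Chain rule: dE/dz = (y - t) * y * (1 - y) for E = (y - t)^2 / 2
            delta = (y - t) * y * (1.0 - y)
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    return w, b

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # logical AND
w, b = train_unit(data)
```

After training, sigmoid(w'x + b) exceeds 0.5 only for the input (1, 1), i.e. the unit has learned AND from examples alone.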

The back propagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signaling was not obvious at the time, but most importantly because there was no plausible source for the ‘teaching’ or ‘target’ signal.

2.3 Functioning of Neural Network

Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and brain biological architecture is debated.

A subject of current research in theoretical neuroscience is the question surrounding the degree of complexity and the properties that individual neural elements should have to reproduce something resembling animal intelligence.

Historically, computers evolved from the von Neumann architecture, which is based on sequential processing and execution of explicit instructions. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of ‘sensory’ input from external sources. In other words, at its very heart a neural network is a complex statistical processor (as opposed to being tasked to sequentially process and execute).

Figure 5

An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.

In more practical terms neural networks are non-linear statistical data modeling or decision making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.

2.4 Types of Neural Networks

Feed Forward Neural Network – A simple neural network type where synapses are made from an input layer to zero or more hidden layers, and finally to an output layer. The feed forward neural network is one of the most common types of neural network in use. It is suitable for many types of problems. Feed forward neural networks are often trained with simulated annealing, genetic algorithms or one of the propagation techniques.

Figure

Self Organizing Map (SOM) – A neural network that contains two layers and implements a winner-take-all strategy in the output layer. Rather than taking the output of individual neurons, the neuron with the highest output is considered the winner. SOMs are typically used for classification, where the output neurons represent groups into which the input patterns are to be classified. SOMs are usually trained with a competitive learning strategy.
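The winner-take-all step and one competitive learning update can be sketched as follows; the weight vectors, input, and learning rate are illustrative, and for brevity no neighborhood function is used:

```python
def winner(weights, x):
    """Index of the output neuron whose weight vector is closest to x."""
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))

def competitive_update(weights, x, lr=0.3):
    """Pull only the winning neuron's weights toward the input."""
    i = winner(weights, x)
    weights[i] = [wi + lr * (xi - wi) for wi, xi in zip(weights[i], x)]
    return i

weights = [[0.0, 0.0], [1.0, 1.0]]           # two output neurons
i = competitive_update(weights, [0.9, 0.8])  # neuron 1 is closest and wins
```

Repeating this update over many inputs moves each output neuron's weight vector toward the centre of one cluster of inputs, which is how the SOM comes to represent classification groups.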

Hopfield Neural Network – A simple single layer recurrent neural network. The Hopfield neural network is trained with a special algorithm that teaches it to recognize patterns. The Hopfield network indicates that a pattern is recognized by echoing it back. Hopfield neural networks are typically used for pattern recognition.
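A minimal sketch of this behavior: Hebbian (outer-product) training stores bipolar patterns in the weight matrix, and recognition consists of the network echoing a stored pattern back as a stable state, even from a corrupted probe. The stored patterns and the noisy input are illustrative:

```python
def train_hopfield(patterns):
    """Hebbian outer-product storage of bipolar (+1/-1) patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                       # no self-connections
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def recall(w, state, steps=5):
    """Synchronous updates; a recognized pattern is echoed back unchanged."""
    n = len(state)
    for _ in range(steps):
        state = [1 if sum(w[i][j] * state[j] for j in range(n)) >= 0 else -1
                 for i in range(n)]
    return state

stored = [[1, -1, 1, -1, 1, -1], [1, 1, 1, -1, -1, -1]]
w = train_hopfield(stored)
noisy = [1, -1, 1, -1, 1, 1]     # stored[0] with its last bit flipped
print(recall(w, noisy))          # the network echoes back stored[0]
```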

Simple Recurrent Network (SRN) Elman Style – A recurrent neural network that has a context layer. The context layer holds the previous output from the hidden layer and then echoes that value back to the hidden layer's input. The hidden layer thus always receives input from its previous iteration's output. Elman neural networks are generally trained using genetic algorithms, simulated annealing, or one of the propagation techniques. Elman neural networks are typically used for prediction.

Simple Recurrent Network (SRN) Jordan Style – A recurrent neural network that has a context layer. The context layer holds the previous output from the output layer and then echoes that value back to the hidden layer's input. The hidden layer thus always receives input from the previous iteration's output layer. Jordan neural networks are generally trained using genetic algorithms, simulated annealing, or one of the propagation techniques. Jordan neural networks are typically used for prediction.
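The context-layer mechanism common to both styles can be sketched for the Elman case as one forward step; the layer sizes and weights below are illustrative, and a Jordan-style network would copy the output layer (rather than the hidden layer) into the context:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def elman_step(x, context, w_in, w_ctx):
    """One time step: hidden units see the input plus the previous hidden
    output (the context); the new context is a copy of the hidden output."""
    hidden = []
    for h in range(len(w_in)):
        s = sum(w * xi for w, xi in zip(w_in[h], x))
        s += sum(w * ci for w, ci in zip(w_ctx[h], context))
        hidden.append(sigmoid(s))
    return hidden, hidden[:]

w_in = [[0.5, -0.5], [0.3, 0.8]]    # 2 inputs -> 2 hidden units
w_ctx = [[0.2, 0.1], [-0.1, 0.4]]   # 2 context units -> 2 hidden units
context = [0.0, 0.0]
for x in [(1, 0), (0, 1)]:          # feed a short sequence
    hidden, context = elman_step(x, context, w_in, w_ctx)
```

Because the context feeds back each step, the second input is processed in the light of the first, which is what makes these networks suited to prediction over sequences.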

Simple Recurrent Network (SRN) Self Organizing Map – A recurrent self organizing map (RSOM) that has an input and output layer, just as a regular SOM. However, the RSOM has a context layer as well. This context layer echoes the previous iteration's output back to the input layer of the neural network. RSOMs are trained with a competitive learning algorithm, just as a non-recurrent SOM. RSOMs can be used to classify temporal data, or to predict.

Figure

Feed forward Radial Basis Function (RBF) Network – A feed forward network with an input layer, output layer and a hidden layer. The hidden layer is based on a radial basis function. The RBF generally used is the Gaussian function.

Figure

Several RBFs in the hidden layer allow the RBF network to approximate a more complex activation function than a typical feed forward neural network. RBF networks are used for pattern recognition. They can be trained using genetic algorithms, annealing, or one of the propagation techniques. Other means must be employed to determine the structure of the RBFs used in the hidden layer.
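A sketch of the RBF network's forward pass, with Gaussian hidden units; the centres, widths, and output weights below are illustrative rather than trained:

```python
import math

def gaussian_rbf(x, centre, width):
    """Gaussian of the squared distance between the input and the centre."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-d2 / (2.0 * width ** 2))

def rbf_forward(x, centres, widths, out_weights, bias=0.0):
    """Output = weighted sum of the Gaussian hidden-layer activations."""
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centres, widths)]
    return bias + sum(w * h for w, h in zip(out_weights, hidden))

centres = [(0.0, 0.0), (1.0, 1.0)]   # one hidden unit per centre
widths = [0.5, 0.5]
out_weights = [1.0, -1.0]
y = rbf_forward((0.0, 0.0), centres, widths, out_weights)
```

Each hidden unit responds strongly only near its own centre, so the network's output is a sum of localized bumps; choosing the centres and widths is the "other means" the text refers to.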

2.5 Application of Neural Network

Neural Networks in Practice Given this description of neural networks and how they work, what real world applications are they suited for? Neural networks have broad applicability to real world business problems. In fact, they have already been successfully applied in many industries.

Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting needs including:

Sales forecasting

Industrial process control

Customer research

Data validation

Risk management

Target marketing

To give some more specific examples, ANNs are also used in the following paradigms: recognition of speakers in communications; diagnosis of hepatitis; recovery of telecommunications from faulty software; interpretation of multi-meaning Chinese words; undersea mine detection; texture analysis; three-dimensional object recognition; hand-written word recognition; and facial recognition.

Neural networks in medicine Artificial Neural Networks (ANN) are currently a ‘hot’ research area in medicine and it is believed that they will receive extensive application to biomedical systems in the next few years. At the moment, the research is mostly on modeling parts of the human body and recognizing diseases from various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.).

Neural networks are ideal for recognizing diseases using scans since there is no need to provide a specific algorithm for how to identify the disease. Neural networks learn by example, so the details of how to recognize the disease are not needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the quality. The examples need to be selected very carefully if the system is to perform reliably and efficiently.

Modeling and Diagnosing the Cardiovascular System Neural networks are used experimentally to model the human cardiovascular system. Diagnosis can be achieved by building a model of the cardiovascular system of an individual and comparing it with the real-time physiological measurements taken from the patient. If this routine is carried out regularly, potentially harmful medical conditions can be detected at an early stage, thus making the process of combating the disease much easier.

A model of an individual’s cardiovascular system must mimic the relationship among physiological variables (i.e., heart rate, systolic and diastolic blood pressures, and breathing rate) at different physical activity levels. If a model is adapted to an individual, then it becomes a model of the physical condition of that individual. The simulator will have to be able to adapt to the features of any individual without the supervision of an expert. This calls for a neural network.

Another reason that justifies the use of ANN technology, is the ability of ANNs to provide sensor fusion which is the combining of values from several different sensors. Sensor fusion enables the ANNs to learn complex relationships among the individual sensor values, which would otherwise be lost if the values were individually analyzed. In medical modeling and diagnosis, this implies that even though each sensor in a set may be sensitive only to a specific physiological variable, ANNs are capable of detecting complex medical conditions by fusing the data from the individual biomedical sensors.

Electronic noses ANNs are used experimentally to implement electronic noses. Electronic noses have several potential applications in telemedicine. Telemedicine is the practice of medicine over long distances via a communication link. The electronic nose would identify odors in the remote surgical environment. These identified odors would then be electronically transmitted to another site where an odor generation system would recreate them. Because the sense of smell can be an important sense to the surgeon, telesmell would enhance telepresent surgery.

Instant Physician An application developed in the mid-1980s called the “instant physician” trained an auto associative memory neural network to store a large number of medical records, each of which includes information on symptoms, diagnosis, and treatment for a particular case. After training, the net can be presented with input consisting of a set of symptoms; it will then find the full stored pattern that represents the “best” diagnosis and treatment.

Neural Networks in business Business is a diverse field with several general areas of specialization such as accounting or financial analysis. Almost any neural network application would fit into one business area or another.

There is some potential for using neural networks for business purposes, including resource allocation and scheduling. There is also a strong potential for using neural networks for database mining, that is, searching for patterns implicit within the explicitly stored information in databases. Most of the funded work in this area is classified as proprietary. Thus, it is not possible to report on the full extent of the work going on. Most work is applying neural networks, such as the Hopfield-Tank network for optimization and scheduling.

Marketing There is a marketing application which has been integrated with a neural network system. The Airline Marketing Tactician (a trademark abbreviated as AMT) is a computer system made of various intelligent technologies including expert systems. A feed forward neural network is integrated with the AMT and was trained using back-propagation to assist the marketing control of airline seat allocations. The adaptive neural approach was amenable to rule expression. Additionally, the application’s environment changed rapidly and constantly, which required a continuously adaptive solution. The system is used to monitor and recommend booking advice for each departure. Such information has a direct impact on the profitability of an airline and can provide a technological advantage for users of the system.

While it is significant that neural networks have been applied to this problem, it is also important to see that this intelligent technology can be integrated with expert systems and other approaches to make a functional system. Neural networks were used to discover the influence of undefined interactions by the various variables. While these interactions were not defined, they were used by the neural system to develop useful conclusions. It is also noteworthy to see that neural networks can influence the bottom line.

Chapter 3 Fundamentals of learning algorithms

3.1 Learning process of intelligent system

Intelligent computer systems are designed from characteristics associated with intelligence in human behavior.

Examples:

1. Neural networks

2. Fuzzy logic

3. Expert systems

4. Probabilistic reasoning

Types:

1. Hard computing

2. Soft computing

Characteristics:

1. Cognition

a) Awareness

b) Perceiving

c) Remembering

2. Logical inference

3. Pattern recognition

The human brain has two important properties. Through experience, the brain adapts itself to its surrounding environment; as a result, the information-processing capability of the brain changes. When this happens, the brain is said to be plastic.

1. Plastic: the capability to process information and to add new information, while preserving the information it has learned previously.

2. Stable: the ability to remain stable when presented with irrelevant or useless information.

Learning is a fundamental component to an intelligent system, although a precise definition of learning is hard to produce. In terms of an artificial neural network, learning typically happens during a specific training phase. Once the network has been trained, it enters a production phase where it produces results independently. Training can take on many different forms, using a combination of learning paradigms, learning rules, and learning algorithms. A system which has distinct learning and production phases is known as a static network. Networks which are able to continue learning during production use are known as dynamical systems.

Learning implies that a processing unit is capable of changing its input/output behavior as a result of changes in the environment. Since the activation rule is usually fixed when the network is constructed and since the input/output vector cannot be changed, to change the input/output behavior the weights corresponding to that input vector need to be adjusted. A method is thus needed by which, at least during a training stage, weights can be modified in response to the input/output process. A number of such learning rules are available for neural network models.

A learning paradigm is supervised, unsupervised or a hybrid of the two, and reflects the method in which training data is presented to the neural network. A method that combines supervised and unsupervised training is known as a hybrid method. A learning rule is a model for the types of methods to be used to train the system, and also a goal for what types of results are to be produced. The learning algorithm is the specific mathematical method that is used to update the inter-neuronal synaptic weights during each training iteration. Under each learning rule, there are a variety of possible learning algorithms for use. Most algorithms can only be used with a single learning rule. Learning rules and learning algorithms can typically be used with either supervised or unsupervised learning paradigms, however, and each will produce a different effect.

Overtraining is a problem that arises when too many training examples are provided, and the system becomes incapable of useful generalization. This can also occur when there are too many neurons in the network and the capacity for computation exceeds the dimensionality of the input space. During training, care must be taken not to provide too many input examples and different numbers of training examples could produce very different results in the quality and robustness of the network.

3.2 Learning rules of neural network

The learning rules determine how the 'experiences' of a network exert their influence on its future behavior. There are, in essence, three types of learning rules: supervised, reinforcement, and unsupervised.

3.2.1 Supervised learning

The term supervised is used in both a very general and a narrow technical sense. In the narrow technical sense, supervised means the following: if for a certain input the corresponding output is known, the network is to learn the mapping from inputs to outputs. In supervised learning applications, the correct output must be known and provided to the learning algorithm. The task of the network is to find the mapping. The weights are changed depending on the magnitude of the error that the network produces at the output layer: the larger the discrepancy between the output that the network produces (the actual output) and the correct output value (the desired output), the more the weights change. This is why the term error-correction learning is also used.

3.2.2 Reinforcement learning

Reinforcement learning is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment [Sutton and Barto, 1998]. If the teacher only tells a student whether her answer is correct or not, but leaves the task of determining why the answer is correct or false to the student, we have an instance of reinforcement learning. The problem of attributing the error (or the success) to the right cause is called the credit assignment or blame assignment problem.

3.2.3 Unsupervised learning

In unsupervised learning, the agent learns recurring patterns without any tutoring input. Essentially, the neural system detects correlations between neuronal firing patterns, and between those patterns and the structure of inputs to the network. These correlations are strengthened by changes to the ANN weights such that, in the future, portions of a pattern suffice to predict or retrieve much of the remainder.

3.3 Learning rate of neural network

Most network structures undergo a learning procedure during which the synaptic weights W and v are adjusted. The learning rate coefficient determines the size of the weight adjustments made at each iteration and hence influences the rate of convergence. A poor choice of coefficient can result in a failure to converge. If the learning rate coefficient is too large, the search path will oscillate and converge more slowly than a direct descent. If the coefficient is too small, the descent will progress in small steps, significantly increasing the total time to converge.
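The effect of the learning rate coefficient can be illustrated on the simple one-dimensional error surface E(w) = w², whose gradient is 2w; the three rates below are illustrative:

```python
def descend(lr, w=1.0, steps=20):
    """Gradient descent on E(w) = w**2; each step is w <- w - lr * 2w."""
    path = [w]
    for _ in range(steps):
        w = w - lr * 2.0 * w
        path.append(w)
    return path

small = descend(0.01)  # tiny steps: still far from the minimum after 20 steps
good = descend(0.4)    # converges quickly toward w = 0
large = descend(1.1)   # overshoots each step: oscillates with growing amplitude
```

With lr = 1.1 each step multiplies w by -1.2, so the path alternates sign and diverges; with lr = 0.01 each step multiplies w by 0.98, so convergence is very slow.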

3.3.1 Perceptron

The perceptron was presented by Rosenblatt in 1959. The essential innovation was the introduction of numerical weights and a special interconnection pattern. In the original Rosenblatt model the computing units are threshold elements and the connectivity is determined stochastically. Learning takes place by adapting the weights of the network with a numerical algorithm. Rosenblatt's model was refined and perfected in the 1960s and its computational properties were carefully analyzed by Minsky and Papert [15]. In what follows, Rosenblatt's model will be called the classical perceptron and the model analyzed by Minsky and Papert simply the perceptron.

The perceptron forms a network with a single node, a set of input connections, a dummy input that is always set to 1, and a single output lead. The input pattern, which could be a set of numbers, is applied to each of the connections to the node.

The perceptron learning algorithm updates the strength of each connection to the node in such a way that the output from the node falls within some threshold value for each class represented by the input patterns. Thus the perceptron equation for the class label Ck is:

Ck = W0 + W1·I1 + W2·I2 + ... + Wn·In

3.4 Introduction to proposed algorithms

If the output is correct then no adjustment of weights is done:

Wij(k+1) = Wij(k)

If the output is 1 but should have been 0, then the weights are decreased on the active input links:

Wij(k+1) = Wij(k) - η·xi

where η is the learning rate.

If the output is 0 but should have been 1, then the weights are increased on the active input links:

Wij(k+1) = Wij(k) + η·xi

Wij(k+1) is the new adjusted weight and Wij(k) is the old weight.

Step 1: Create a perceptron with (n+1) input neurons X0, X1, ..., Xn, where X0 = 1 is the bias input. Let O be the output neuron.

Step 2: Initialize W = (W0, W1, ..., Wn) to random weights.

Step 3: Iterate through the input patterns Xj of the training set using the current weight set; i.e., compute the weighted sum net_j = Σ Wi·Xi for each input pattern j.

Step 4: Compute the output Yj using the step function

Yj = f(net_j) = 1 if net_j > 0
= 0 otherwise

Step 5: Compare the computed output Yj with the target output for each input pattern j. If all the input patterns have been classified correctly, output the weights and exit.

Step 6: Otherwise, update the weights as given below:

If the computed output Yj is 1 but should have been 0,

Wi = Wi - η·xi

If the computed output Yj is 0 but should have been 1,

Wi = Wi + η·xi

where η is the learning rate.

Step 7: Go to step 3.
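The steps above can be sketched as code; the logical-AND training set, learning rate, and random seed are illustrative:

```python
import random

def train_perceptron(patterns, lr=0.1, max_epochs=100, seed=0):
    """Perceptron training with a bias input x0 = 1 and a step activation."""
    rng = random.Random(seed)
    n = len(patterns[0][0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]   # Step 2: W0..Wn
    for _ in range(max_epochs):
        errors = 0
        for x, target in patterns:                       # Step 3
            xs = [1.0] + list(x)                         # bias input x0 = 1
            net = sum(wi * xi for wi, xi in zip(w, xs))
            y = 1 if net > 0 else 0                      # Step 4: step function
            if y != target:                              # Steps 5 and 6
                errors += 1
                sign = lr if target == 1 else -lr        # raise or lower weights
                w = [wi + sign * xi for wi, xi in zip(w, xs)]
        if errors == 0:                                  # Step 5: all correct
            return w
    return w                                             # epochs exhausted

and_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_set)
```

For linearly separable data such as AND, the perceptron convergence theorem guarantees the loop exits early via Step 5; for non-separable data the epoch limit stops it with the last weight vector.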

Chapter 4 Analysis and Synthesis of shortest path routing

4.1 Fundamentals of Shortest path optimization

The most important algorithms for solving the shortest path problem are:

a) Dijkstra's algorithm – Dijkstra's algorithm, conceived by Dutch computer scientist Edsger Dijkstra in 1956 and published in 1959,[1][2] is a graph search algorithm that solves the single-source shortest path problem for a graph with non-negative edge path costs, producing a shortest path tree. This algorithm is often used in routing, as a subroutine in other graph algorithms, or in GPS technology.
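A sketch of Dijkstra's algorithm using a priority queue; the adjacency-list graph below is illustrative, and all edge weights must be non-negative:

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest distances on a dict of node -> [(nbr, weight)]."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                         # stale queue entry, skip it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd                 # found a shorter path to v
                heapq.heappush(heap, (nd, v))
    return dist

graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 6)],
    "C": [("D", 3)],
}
print(dijkstra(graph, "A"))  # shortest distances from A to every node
```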

b) Bellman–Ford algorithm – The Bellman–Ford algorithm solves the single-source problem even if edge weights may be negative. It computes shortest paths from a single source vertex to all of the other vertices in a weighted digraph. It is slower than Dijkstra's algorithm for the same problem, but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers. The algorithm is usually named after two of its developers, Richard Bellman and Lester Ford, Jr., who published it in 1958 and 1956, respectively; however, Edward F. Moore also published the same algorithm in 1957, and for this reason it is also sometimes called the Bellman–Ford–Moore algorithm. Negative edge weights are found in various applications of graphs, hence the usefulness of this algorithm. However, if a graph contains a "negative cycle", i.e., a cycle whose edges sum to a negative value, then there is no cheapest path, because any path can be made cheaper by one more walk around the negative cycle. In such a case, the Bellman–Ford algorithm can detect negative cycles and report their existence, but it cannot produce a correct "shortest path" answer if a negative cycle is reachable from the source.
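A sketch of the Bellman–Ford algorithm, including the extra relaxation pass that detects a negative cycle reachable from the source; the edge list is illustrative:

```python
def bellman_ford(vertices, edges, source):
    """|V|-1 rounds of edge relaxation; returns None on a negative cycle."""
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                # one extra pass: still improving?
        if dist[u] + w < dist[v]:
            return None                  # negative cycle: no cheapest paths
    return dist

vertices = ["A", "B", "C", "D"]
edges = [("A", "B", 4), ("A", "C", 2), ("C", "B", -1), ("B", "D", 3)]
print(bellman_ford(vertices, edges, "A"))
```

Note the negative edge C→B shortens the route to B below the direct A→B edge, which Dijkstra's algorithm could not handle.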

c) A* search algorithm – The A* search algorithm solves the single-pair shortest path problem, using heuristics to try to speed up the search. A* is a computer algorithm that is widely used in pathfinding and graph traversal, the process of plotting an efficiently traversable path between points, called nodes. Noted for its performance and accuracy, it enjoys widespread use (however, in practical travel-routing systems it is generally outperformed by algorithms which can pre-process the graph to attain better performance). A* uses a best-first search and finds a least-cost path from a given initial node to one goal node (out of one or more possible goals). As A* traverses the graph, it follows a path of the lowest expected total cost or distance, keeping a sorted priority queue of alternate path segments along the way.
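A sketch of A* on a 4-connected grid, using the Manhattan distance as an admissible heuristic; the grid, start, and goal are illustrative (1 marks an obstacle):

```python
import heapq

def astar(grid, start, goal):
    """Length of a least-cost path on a unit-cost grid, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # heuristic: Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    heap = [(h(start), 0, start)]        # queue sorted by f = g + h
    best = {start: 0}
    while heap:
        f, g, node = heapq.heappop(heap)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue                     # stale entry
        r, c = node
        for nr, nc in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]:
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # must detour around the obstacle row
```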

d) Floyd–Warshall algorithm – The Floyd–Warshall algorithm solves the all-pairs shortest path problem. The Floyd–Warshall algorithm (also known as Floyd's algorithm, the Roy–Warshall algorithm, the Roy–Floyd algorithm, or the WFI algorithm) is a graph analysis algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles) and also for finding the transitive closure of a relation R. A single execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of vertices, though it does not return details of the paths themselves. The algorithm is an example of dynamic programming. It was published in its currently recognized form by Robert Floyd in 1962. However, it is essentially the same as algorithms previously published by Bernard Roy in 1959 and also by Stephen Warshall in 1962 for finding the transitive closure of a graph. The modern formulation of the algorithm as three nested for-loops was first described by Peter Ingerman, also in 1962.
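The three nested for-loops can be sketched directly; the adjacency matrix below is illustrative, with float("inf") marking a missing edge:

```python
INF = float("inf")

def floyd_warshall(dist):
    """All-pairs shortest path lengths from an adjacency matrix."""
    n = len(dist)
    d = [row[:] for row in dist]         # work on a copy of the input
    for k in range(n):                   # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

adj = [[0,   3,   INF, 7],
       [8,   0,   2,   INF],
       [5,   INF, 0,   1],
       [2,   INF, INF, 0]]
d = floyd_warshall(adj)
print(d[0][2])  # shortest distance from vertex 0 to vertex 2
```

The dynamic-programming idea is visible in the update: after iteration k, d[i][j] is the shortest path using only intermediate vertices numbered up to k.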

e) Johnson's algorithm – Johnson's algorithm solves the all-pairs shortest path problem, and may be faster than Floyd–Warshall on sparse graphs. It is a way to find the shortest paths between all pairs of vertices in a sparse directed graph. It allows some of the edge weights to be negative numbers, but no negative-weight cycles may exist. It works by using the Bellman–Ford algorithm to compute a transformation of the input graph that removes all negative weights, allowing Dijkstra's algorithm to be used on the transformed graph. It is named after Donald B. Johnson, who first published the technique in 1977.

A similar reweighting technique is also used in Suurballe’s algorithm (1974) for finding two disjoint paths of minimum total length between the same two vertices in a graph with non-negative edge weights.

4.2 Different algorithms of shortest path optimization

The following technique is widely used in many forms because it is simple and easy to understand. The idea is to build a graph of the subnet, with each node of the graph representing a router and each arc representing a communication line (link). To choose a route between a given pair of routers, the algorithm just finds the shortest path between them on the graph. The shortest path concept requires a definition of how path length is measured. Different metrics, such as number of hops, geographical distance, or the mean queuing and transmission delay of a router, can be used. In the most general case, the labels on the arcs could be computed as a function of the distance, bandwidth, average traffic, communication cost, mean queue length, measured delay, and other factors.

There are several algorithms for computing the shortest path between two nodes of a graph. One of them is due to Dijkstra.
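Dijkstra's algorithm can be sketched as follows; this is a minimal heap-based illustration, assuming non-negative edge weights and an adjacency-list representation of the graph:

```python
import heapq

# Minimal sketch of Dijkstra's algorithm with a binary heap. The graph is
# a dict mapping each node to a list of (neighbor, weight) pairs; weights
# must be non-negative. Returns the shortest distance from src to every
# reachable node.
def dijkstra(graph, src):
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                      # stale queue entry, skip it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd              # found a shorter route to v
                heapq.heappush(heap, (nd, v))
    return dist
```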

4.2.1 Flooding

This is another static algorithm, in which every incoming packet is sent out on every outgoing line except the one it arrived on. Flooding generates an infinite number of duplicate packets unless some measures are taken to damp the process.

One such measure is to have a hop counter in the header of each packet, which is decremented at each hop, with the packet being discarded when the counter reaches zero. Ideally, the hop counter is initialized to the length of the path from source to destination. If the sender does not know the path length, it can initialize the counter to the worst case, the full diameter of the subnet.

An alternative technique is to keep track of which packets have been flooded, to avoid sending them out a second time. To achieve this, the source router puts a sequence number in each packet it receives from its hosts. Each router then needs a list per source router telling which sequence numbers originating at that source have already been seen. Any incoming packet that is on the list is not flooded. To prevent the lists from growing without bound, each list should be augmented by a counter, k, meaning that all sequence numbers through k have been seen.
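The sequence-number bookkeeping described above can be sketched as follows; the function name and data layout are illustrative, not taken from any particular implementation:

```python
# Toy sketch of duplicate suppression in flooding. 'seen' maps a source
# router to (k, extras): all sequence numbers <= k have been seen, plus a
# set of higher numbers that arrived out of order.
def already_seen(seen, source, seq):
    k, extras = seen.get(source, (-1, set()))
    if seq <= k or seq in extras:
        return True                       # duplicate: do not flood again
    extras.add(seq)
    while k + 1 in extras:                # advance the counter k past any
        k += 1                            # contiguous run of seen numbers
        extras.discard(k)
    seen[source] = (k, extras)
    return False
```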

A variation of flooding called selective flooding is slightly more practical. In this algorithm, the routers do not send every incoming packet out on every line, but only on those going approximately in the right direction (there is usually little point in sending a westbound packet on an eastbound line unless the topology is extremely peculiar).

Flooding algorithms are rarely used; they appear mostly in distributed systems or in systems with tremendous robustness requirements.

4.2.2 Flow-Based Routing

The algorithms seen above take only the topology into account and do not consider the load. The following algorithm considers both and is called flow-based routing.

In some networks, the mean data flow between each pair of nodes is relatively stable and predictable. Under conditions in which the average traffic from i to j is known in advance and, to a reasonable approximation, constant in time, it is possible to analyze the flows mathematically to optimize the routing.

The idea behind the analysis is that for a given line, if the capacity and average flow are known, it is possible to compute the mean packet delay on that line from queuing theory. From the mean delays on all the lines, it is straightforward to calculate a flow-weighted average to get the mean packet delay for the whole subnet. The routing problem then reduces to finding the routing algorithm that produces the minimum average delay for the subnet.
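As an illustration of this computation, the sketch below assumes each line behaves as an M/M/1 queue, so a line with capacity c bits/sec, mean packet size 1/mu bits, and flow lam packets/sec has mean delay 1/(mu*c - lam); the function name and data layout are assumptions made for the example:

```python
# Sketch of the delay computation behind flow-based routing, assuming
# each line is an M/M/1 queue. 'lines' is a list of (capacity_bps, flow_pps)
# pairs; mu is 1 / (mean packet size in bits). Returns the flow-weighted
# mean packet delay for the whole subnet, in seconds.
def mean_subnet_delay(lines, mu):
    total_flow = sum(lam for _, lam in lines)
    delay = 0.0
    for c, lam in lines:
        t_line = 1.0 / (mu * c - lam)     # M/M/1 mean delay on this line
        delay += (lam / total_flow) * t_line
    return delay
```

For example, two 20-kbps lines each carrying 14 packets/sec of 800-bit packets (so mu*c = 25 packets/sec) each have a mean delay of 1/11 sec, and so does the flow-weighted average.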

This technique requires certain information in advance: first, the subnet topology; second, the traffic matrix; third, the capacity matrix; and finally, a routing algorithm (for further explanation, see the reference cited above).

4.2.3 Distance Vector Routing

Modern computer networks generally use dynamic routing algorithms rather than the static ones described above. Two dynamic algorithms in particular, distance vector routing and link state routing, are the most popular. In this section we will look at the former algorithm; in the following one we will study the latter.

Distance vector routing algorithms operate by having each router maintain a table giving the best known distance to each destination and which line to use to get there. These tables are updated by exchanging information with the neighbors.

The distance vector routing algorithm is sometimes called by other names, including Bellman-Ford or Ford-Fulkerson. It was the original ARPANET routing algorithm and was also used in the Internet under the name RIP and in early versions of DECnet and Novell's IPX. AppleTalk and Cisco routers use improved distance vector protocols.

In this algorithm, each router maintains a routing table indexed by, and containing one entry for, each router in the subnet. This entry contains two parts: the preferred outgoing line to use for that destination and an estimate of the time or distance to that destination. The metric used might be the number of hops, the time delay in milliseconds, the total number of packets queued along the path, or something similar.

The router is assumed to know the "distance" to each of its neighbors. If the metric is hops, the distance is one hop; for queue length metrics, the router examines each queue; for the delay metric, the router can measure it directly with special ECHO packets that the receiver simply timestamps and sends back as fast as it can.

Distance vector routing works in theory but has a serious drawback in practice: although it converges to the correct answer, it may do so slowly. Good news propagates through the subnet in linear time, while bad news suffers from the count-to-infinity problem: no router ever has a value more than one higher than the minimum of all its neighbors. Gradually, all the routers work their way up to infinity, but the number of exchanges required depends on the numerical value used for infinity. One solution to this problem is the split horizon algorithm, in which the distance to a router X is reported as infinity on the line that packets for X are sent on. With this behavior, bad news also propagates through the subnet at linear speed.
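The basic update step can be sketched as follows; the dictionary-based layout of the tables is an assumption made for illustration. When distance vectors arrive from the neighbors, the router adds its own measured delay to each neighbor's estimate and keeps the minimum per destination:

```python
# Sketch of the distance vector (Bellman-Ford) table update.
# vectors:  {neighbor: {dest: neighbor's estimated distance to dest}}
# delay_to: {neighbor: this router's measured delay to that neighbor}
# Returns {dest: (preferred outgoing line, distance estimate)}.
def update_table(vectors, delay_to):
    table = {}
    for nbr, vec in vectors.items():
        for dest, d in vec.items():
            cand = delay_to[nbr] + d      # cost of going via this neighbor
            if dest not in table or cand < table[dest][1]:
                table[dest] = (nbr, cand)
    return table
```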

4.2.4 Link State Routing

The idea behind link state routing is simple and can be stated as five parts. Each router must:

1) Discover its neighbors and learn their network addresses.

2) Measure the delay or cost to each of its neighbors.

3) Construct a packet telling all it has just learned.

4) Send the packet to all other routers.

5) Compute the shortest path to every other router.

In effect, the complete topology and all delays are experimentally measured and distributed to every router. Then Dijkstra's algorithm can be used to find the shortest path to every other router.

4.2.5 Hierarchical Routing

As the network grows larger, the amount of resources necessary to maintain the routing tables becomes enormous and makes routing impossible. This motivates the idea of hierarchical routing, which suggests that routers should be divided into regions, with each router knowing all the details about how to route packets within its own region but knowing nothing about the internal structure of other regions.

Unfortunately, the gains in routing table size and CPU time are not free: the penalty of increased path length has to be paid.

It has been discovered that the optimal number of nesting levels for an N-router subnet is ln N, requiring a total of e ln N entries per router.
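The arithmetic can be illustrated as follows; the subnet size of 720 routers is chosen purely as an example:

```python
import math

# Worked example of the result quoted above: for an N-router subnet, the
# optimal number of hierarchy levels is ln N, giving about e * ln N routing
# table entries per router, versus N entries for flat (non-hierarchical)
# routing.
def table_entries(n):
    levels = math.log(n)                  # optimal number of levels, ln N
    return levels, math.e * levels        # entries per router

levels, entries = table_entries(720)
# about 6.6 levels and about 18 entries per router, instead of 720
```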

4.2.6 Routing for Mobile Hosts

In recent years, more and more people have purchased portable computers under the natural assumption that they can be used all over the world. These mobile hosts introduce a new complication: to route a packet to a mobile host, the network first has to find it. Generally, this requirement is implemented by introducing two new entities into the LAN: a foreign agent and a home agent.

Each time a mobile host connects to a network, it either receives a foreign agent packet or generates a request for a foreign agent; as a result, they establish a connection between them, and the mobile host supplies the foreign agent with its home address and some security information.

After that, the foreign agent contacts the mobile host's home agent and delivers the information about the mobile host.

Subsequently, the home agent examines the received information, and if it accepts the mobile host's security information, it allows the foreign agent to proceed. As a result, the foreign agent enters the mobile host into its routing table.

When a packet for the mobile host arrives at its home agent, the agent encapsulates it and redirects it to the foreign agent where the mobile host is residing. The home agent then returns the encapsulation data to the router that sent the packet, so that all subsequent packets are sent directly to the corresponding router (the foreign agent).

4.2.7 Broadcast Routing

For some applications, hosts need to send messages to many or all other hosts. Broadcast routing is used for that purpose. Several different methods have been proposed for doing this:

1) The source simply sends a separate packet to each of the necessary destinations. One problem with this method is that the source has to have a complete list of the destinations.

2) Flooding. As discussed before, the problem with this method is the generation of duplicate packets.

3) Multidestination routing. In this method, each packet includes a list or a bitmap indicating the desired destinations. When a packet arrives, the router checks all the destinations to determine the set of output lines that will be needed, generates a new copy of the packet for each output line to be used, and includes in each packet only those destinations that are to use that line. In effect, the destination set is partitioned among the lines. After a sufficient number of hops, each packet will carry only one destination and can be treated as a normal packet.
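The partitioning step can be sketched as follows; the representation of the routing table as a destination-to-line dictionary is an assumption for illustration:

```python
# Sketch of destination-set partitioning in multidestination routing:
# each destination in the packet is looked up in the routing table, and
# one copy of the packet is emitted per output line, carrying only the
# destinations reached through that line.
def partition_destinations(dests, routing_table):
    # routing_table: {destination: preferred output line}
    copies = {}
    for d in dests:
        line = routing_table[d]
        copies.setdefault(line, []).append(d)
    return copies                         # {output line: destination sublist}
```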

4) This routing method makes use of a spanning tree of the subnet. If each router knows which of its lines belong to the spanning tree, it can copy an incoming broadcast packet onto all the spanning tree lines except the one it arrived on. Problem: each router has to know the spanning tree.

5) Reverse path forwarding. When a packet arrives, the algorithm checks whether the line the packet arrived on is the same one through which packets are sent to the source; if so, it forwards the packet on all other lines, otherwise it discards it.
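The reverse path forwarding check can be sketched as follows; the function name and data layout are illustrative:

```python
# Sketch of the reverse path forwarding check: a broadcast packet is
# forwarded only if it arrived on the line this router itself would use
# to send packets toward the source; otherwise it is most likely a
# duplicate and is discarded.
def rpf_forward(source, arrival_line, routing_table, all_lines):
    # routing_table: {destination: preferred output line}
    if routing_table[source] != arrival_line:
        return []                         # came the "wrong" way: discard
    return [l for l in all_lines if l != arrival_line]
```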

Chapter 5 Rosenblatt neural network perceptron