Scott Anthony Gazzard
Department of Psychology
University of Sydney
Functional properties of the SWA network
The structural features introduced above can be used to construct a network that propagates a circular wavefront of activation, spreading outward from a cluster of units that are activated simultaneously. The mechanisms of this wavefront propagation are described below and are represented graphically in Figure 2. For the following discussion, consider only the associative layer of the network.
In Figure 2, the associative layer is viewed from above and activation is represented by tone: dark areas represent high activity, light areas represent low activity. Note that in a state of relaxation (prior to external activation of any units) there exists a level of noisy background activation, and that a high level of activation over a number of units is required before further activity is generated. The first step in the formation of an activation wavefront is the simultaneous activation of a cluster of nearby units (Figure 2a). By virtue of the large number of connections to nearby units, the activation of this cluster will tend to activate units in the immediate vicinity (Figure 2b). Because the units of the original cluster adapt, their activation then decreases for a short period. The activated ring of units surrounding the original cluster subsequently activates adjacent units on the outer edge of the ring; units on the inner edge of the ring will not return to a highly active state because of adaptation (the refractory period). Thus the wave of activation propagates outward only.
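The propagation mechanism described above can be illustrated with a minimal sketch, assuming a discrete grid of units that are either resting, active or refractory, in the style of a Greenberg-Hastings cellular automaton. The grid size, the eight-unit neighbourhood, the activation threshold and the length of the refractory period are all assumptions made for illustration and are not taken from the model itself. On a square grid the wavefront is a square ring rather than a true circle.

```python
def seed_cluster(size, centre):
    """Return a size x size sheet with a 3x3 cluster of active units."""
    grid = [[0] * size for _ in range(size)]
    for r in range(centre - 1, centre + 2):
        for c in range(centre - 1, centre + 2):
            grid[r][c] = 1
    return grid


def step(grid, refractory=2, threshold=1):
    """Advance the sheet one tick.

    State 0 is resting, 1 is active, and 2 .. refractory + 1 are
    refractory ticks (hypothetical encoding chosen for this sketch).
    """
    n = len(grid)
    new = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            state = grid[r][c]
            if state == 0:
                # A resting unit fires if enough neighbours are active.
                active = sum(
                    1
                    for dr in (-1, 0, 1)
                    for dc in (-1, 0, 1)
                    if (dr or dc)
                    and 0 <= r + dr < n
                    and 0 <= c + dc < n
                    and grid[r + dr][c + dc] == 1
                )
                new[r][c] = 1 if active >= threshold else 0
            elif state <= refractory:
                # Active and refractory units cannot be re-excited.
                new[r][c] = state + 1
            else:
                new[r][c] = 0  # refractory period over, back to rest
    return new


def active_cells(grid):
    """Coordinates of all currently active units."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, state in enumerate(row)
        if state == 1
    }
```

Because the units behind the ring are still refractory when the ring is active, only resting units on the outer edge can be recruited, which is the outward-only behaviour described in the text.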
In sum, the SWA model is based upon the propagation of circular waves of activation which spread across a single layer of pyramidal neuron-type processing units. With the simple architecture illustrated above (i.e. a single associative layer with sensory and motor zones at opposite ends), the network will be able to associate any single input signal with any number of output patterns. More complex input patterns require a more complicated model, but before considering such complexities it is important to explain the operation of the three-layer network as displayed in Figure 1.
Two additional assumptions must be made before the network will operate as desired;
With the above assumptions made (neither of which is particularly problematic from a physiological viewpoint), any simple input can be associated with any output. When a single receptor site is stimulated on the input layer, coarse coding will result in a cluster of units being activated in the sensory area of the associative layer. The architecture described above will lead naturally to a topographic representation of the input layer in the sensory zone. Therefore, stimulation of different receptor sites will result in different associative-layer clusters being activated. From the activated cluster of units, a wave of activation will spread across the associative layer, travelling from the sensory zone to and through the motor zone. Note that the activation wave passes through all parts of the motor zone, so that every unit in the output layer can potentially be associated with the activation.
However, for the network to be useful, units in the output layer must be able to discriminate between different inputs. This can be done because every cluster of units originally activated in the sensory zone creates a wavefront that has a unique radius and orientation by the time it reaches a given part of the motor zone. The wavefront produced by a particular cluster effectively transmits a 'signature' pattern of activation to the footprint of each motor unit. Figure 4 illustrates this point. Learning to activate a desired output unit in response to activation of a given sensory-zone cluster then simply requires modifying weights among the connections that project from the output unit, so that when the sensory-zone cluster's signature wavefront passes through the output unit's footprint area, the output unit is activated. I wish to stress again that every receptor will cause a wavefront to pass through the entire motor area, so that potentially any receptor site can be associated with any output unit, even though no direct links exist between the two. Hence the model overcomes the awkward requirement of near-direct connections between input and output units in conventional networks. Also note that in this model all learning occurs in the direct connections to output units, so learning algorithms such as backpropagation of error are no longer required.
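The idea of a signature pattern can be made concrete with a small sketch. It assumes that an output unit's footprint is a 5 x 5 patch of associative units, that the wavefront is a thin ring of fixed thickness, and that the footprint is sampled at the moment the ring reaches the footprint centre. The one-shot, Hebbian-style weight rule is also an assumption, since the text does not specify a learning rule. All function names and coordinates are hypothetical.

```python
import math


def footprint_pattern(source, centre, half_size=2, half_width=0.5):
    """Binary snapshot of a footprint as a wavefront passes through it.

    A footprint unit is 'on' if it lies within half_width of the ring
    whose radius equals the distance from the source to the centre.
    """
    radius = math.dist(source, centre)
    pattern = []
    for i in range(-half_size, half_size + 1):
        for j in range(-half_size, half_size + 1):
            d = math.dist(source, (centre[0] + i, centre[1] + j))
            pattern.append(1 if abs(d - radius) < half_width else 0)
    return pattern


def learn(pattern):
    """One-shot rule (assumed): excite from 'on' units, inhibit from the rest."""
    weights = [1.0 if x else -1.0 for x in pattern]
    threshold = sum(pattern) - 0.5
    return weights, threshold


def fires(weights, threshold, pattern):
    """The output unit fires if its weighted footprint input exceeds threshold."""
    return sum(w * x for w, x in zip(weights, pattern)) > threshold


if __name__ == "__main__":
    motor_centre = (20, 5)
    sig_a = footprint_pattern((0, 0), motor_centre)   # cluster for input A
    sig_b = footprint_pattern((0, 10), motor_centre)  # cluster for input B
    weights, threshold = learn(sig_a)
    print(fires(weights, threshold, sig_a), fires(weights, threshold, sig_b))
```

Only the weights on the output unit's own footprint connections are changed, which is in keeping with the claim that no error signal needs to be propagated back through the network.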
The simple three-layer SWA network described above is able to do what conventional three-layer networks do not: form associations between spatially distant input and output units. However, the model as it stands is severely limited in terms of its associative capabilities. For example, it has no ability to combine inputs so that the exclusive-or problem might be learned, since the model only works effectively when one receptor at a time is stimulated. Similarly, there would appear to be no mechanism for encoding information distributed temporally as well as spatially. Nevertheless, I believe that these problems can be solved using SWA principles, though a more complex architecture is required in order to fully exploit the associative power of the SWA network.
SWA networks with multiple associative layers
Figure 5 illustrates the next 'evolutionary' step forward for the SWA model. Essentially this network comprises an input layer, two associative layers, and an output layer. Note that the input layer projects onto the first associative layer only, while units in the output layer project to both associative layers. For single-input to output associations this network will perform identically to the simple three-layer network described above, as the output layer still connects to the first associative layer. The second associative layer becomes important, however, when two inputs are entered simultaneously or temporally near to one another.
Now, one further assumption must be allowed before the dual associative layers become useful. Recall that processing units have the property that the number of connections to other units decreases as a function of distance. Recall also that a certain level of activation in a cluster of units is required to regenerate an activation wave. The former property means that if the distance between layers is greater than the average distance between units within layers (as would be expected), then there will be many connections to units in the same layer, and relatively few connections to units in adjacent layers. Thus, a faint trace of wavefronts traversing one layer will be 'echoed' in adjacent layers. However, because the amplitude of the trace is much weaker than the original, the wavefront trace will not have sufficient amplitude to regenerate itself.
A brief digression is necessary here, in order to highlight a very important feature of wave-dynamic systems. Some of the most interesting and scientifically important properties of waves in physical systems (whether they be matter waves, sound waves or electromagnetic waves) derive from the interference patterns produced when two or more waves interact. Where two waves meet, the amplitudes of the waves are summed at that point, according to the principle of superposition. If the peak of one wave meets the trough of another of equal size, the two waves will cancel each other out. If the peak of one wave meets the peak of another, the amplitude at that point will equal the sum of the amplitudes of the two waves. When two circular spreading wavefronts such as those proposed in the SWA model (or those observed in ripples on the surface of a pond) meet, the result is a unique interference pattern produced by the combination of the two wavefronts. Importantly, the interference pattern changes over time, initially creating a single area of particularly high amplitude which subsequently splits into two areas that move away from one another, as can be seen in Figure 6.
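As a worked illustration of superposition, consider the textbook case of two sinusoidal waves of equal amplitude $A$, wavenumber $k$ and angular frequency $\omega$ that differ only by a phase offset $\varphi$. These symbols are assumptions introduced for the example; the SWA wavefronts themselves are pulses rather than sinusoids.

```latex
\begin{align}
y_1(x,t) &= A \sin(kx - \omega t), \qquad
y_2(x,t) = A \sin(kx - \omega t + \varphi) \\
y_1(x,t) + y_2(x,t) &= 2A \cos\!\left(\tfrac{\varphi}{2}\right)
                       \sin\!\left(kx - \omega t + \tfrac{\varphi}{2}\right)
\end{align}
```

With $\varphi = 0$ (peak meets peak) the combined amplitude is $2A$; with $\varphi = \pi$ (peak meets trough) the factor $\cos(\varphi/2)$ is zero and the waves cancel.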
Returning again to the theoretical SWA model, the principle of wave interference can be utilised to vastly increase the associative power of the network. Consider the effect of simultaneously stimulating two receptor sites on the input layer. This will lead to two particular unit clusters being simultaneously activated in the sensory zone of the first associative layer, which will in turn initiate the propagation of two wavefronts. At a point halfway between the original clusters, interference will produce a concentrated area of high activation near the sensory zone. Recall that a weaker trace of activity on the first associative layer is constantly being transmitted to the second associative layer. Usually, the weaker trace is insufficient to initiate any further activity, but the high level of activation resulting from interference in the first associative layer may result in enough activation in the second associative layer to begin generation of a new wavefront. This second-order wavefront propagates from a cluster in the second associative layer directly above the meeting point of the original wavefronts. Thus, the second-order wavefront represents a combination of the original inputs.
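The thresholding step can be sketched as follows, assuming a one-dimensional slice through the first associative layer along the line joining the two clusters. The coupling strength between layers, the ignition threshold, the wave speed and the pulse width are hypothetical values chosen so that one wavefront alone leaves a sub-threshold trace while two coincident wavefronts do not.

```python
def layer1_activity(x, t, sources, speed=1.0, width=1.0):
    """Summed first-layer activity at position x and time t.

    Each source emits a pulse of unit height that travels away from it
    at the given speed; overlapping pulses add (superposition).
    """
    return sum(
        1.0 for s in sources if abs(abs(x - s) - speed * t) <= width / 2
    )


def ignition_sites(sources, coupling=0.4, threshold=0.6, length=11, steps=11):
    """(time, position) pairs at which the second layer is driven to threshold.

    The second layer receives only a weak trace (coupling < 1) of
    first-layer activity.
    """
    sites = []
    for t in range(steps):
        for x in range(length):
            trace = coupling * layer1_activity(x, t, sources)
            if trace >= threshold:
                sites.append((t, x))
    return sites


if __name__ == "__main__":
    print(ignition_sites([0]))      # one input: trace never reaches threshold
    print(ignition_sites([0, 10]))  # two inputs: ignition where the pulses meet
```

In this sketch the only site that reaches threshold lies midway between the two clusters, which is where a second-order wavefront would be seeded.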
According to now-familiar principles, the second-order wavefront will spread across its associative layer until it passes through its corresponding motor zone, where any motor unit could become associated with its signature activation pattern. Hence, the network can associate a response with either input alone, with a representation of both inputs combined, or with both. Again, the association is performed without the need for pre-existing direct or near-direct connections between elements, so spatially distant inputs can easily be combined. Solving the exclusive-or problem becomes a simple matter of increasing weights on inhibitory connections from the second associative layer to whatever output units are associated with the individual inputs. In this way, the response that is turned 'on' when either input is presented alone can be turned 'off' via second-order activity when both inputs are present. Output units here are connected to all associative layers, projecting a local footprint onto each. This means that output depends not only on one associative layer but on the activation patterns across small areas of all layers, and in fact it would appear that this is necessary for effective discrimination between input patterns.
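The exclusive-or arrangement can be sketched by reducing each wavefront signature to a single binary feature at the output unit. The weights and threshold below are illustrative assumptions; the only requirement taken from the text is that the inhibitory second-order connection is strong enough to cancel the excitation from the two first-order wavefronts.

```python
def output_unit(a_present, b_present, w_a=1.0, w_b=1.0, w_ab=-2.0, threshold=0.5):
    """Thresholded output unit with footprints on both associative layers."""
    first_a = 1.0 if a_present else 0.0  # first-order signature of input A
    first_b = 1.0 if b_present else 0.0  # first-order signature of input B
    # A second-order wavefront arises only when both wavefronts interfere.
    second = 1.0 if (a_present and b_present) else 0.0
    net = w_a * first_a + w_b * first_b + w_ab * second
    return net > threshold


if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, output_unit(a, b))
```

The unit responds to either input alone and is silenced by the second-order activity when both are present.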
The mechanism of wavefront interference allows the combination not only of two inputs presented simultaneously, but also of inputs distributed temporally. In fact, if stimulation of two receptor sites on the input layer is caused by stimulus A and stimulus B, then the system easily discriminates between the following three cases: i) A and B presented simultaneously; ii) A presented before B; and iii) B presented before A. Furthermore, the system is able to discriminate between the case of i) A preceding B by 1 second; and ii) A preceding B by 0.5 seconds. In other words, true temporal encoding (vis-à-vis sequential encoding) of inputs is achieved. Temporal coding is possible because when, for example, stimulus B is presented before stimulus A, the wavefront in the associative layer begins to propagate from the cluster associated with stimulus B before the cluster associated with stimulus A is activated. Thus the point at which the two wavefronts meet will be further from B than from A. This will lead to the propagation of a different second-order wavefront in the second associative layer that will transmit a unique signature activation pattern to second-level motor units. The greater the time lag between stimulation of B and A, the further away from B the point of initial interference becomes. Figure 7 demonstrates these points.
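The dependence of the meeting point on the time lag follows from simple kinematics, assuming both wavefronts travel at the same constant speed v (an assumption; the text does not give a propagation speed). If the clusters are a distance d apart and B is stimulated a time lag before A, the fronts meet at a distance x from B where x = v t and d - x = v (t - lag), which gives x = (d + v * lag) / 2.

```python
def meeting_point_from_b(separation, lag, speed=1.0):
    """Distance from cluster B at which the two wavefronts first meet.

    `lag` is how long B is stimulated before A (negative if A comes
    first). Units are arbitrary but must be consistent.
    """
    if speed * abs(lag) >= separation:
        # The earlier wavefront has already swept over the other cluster.
        raise ValueError("lag too long for the wavefronts to meet between clusters")
    return (separation + speed * lag) / 2.0


if __name__ == "__main__":
    for lag in (0.0, 0.5, 1.0, -1.0):
        print(lag, meeting_point_from_b(10.0, lag))
```

Each lag maps to a different meeting point, so a lag of 1 second and a lag of 0.5 seconds seed second-order wavefronts at different locations, and reversing the order of A and B moves the meeting point to the other side of the midpoint.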
Further considerations of the SWA model
It was argued above that the SWA model with two associative layers will be able to successfully combine information from two inputs presented to the system simultaneously, or distributed over time. Conventional networks are able to exhibit the same behaviour, but they do so using architectures that rely on extensive interconnectedness among processing units. It is hoped that the SWA model may provide a useful connectionist framework without such reliance. Before concluding this discussion there are several further considerations of the SWA model that I wish to highlight briefly.
First, an objection to the SWA model which has already been raised a number of times is that the animal brain is not composed of perfect two-dimensional layers as depicted in the above discussion. Of course, it is true that such a rigid laminar structure is a fiction. However, the layers in the SWA model ought not be taken as literal layers, but as virtual layers. The model described above utilises two-dimensional layers because it is easier to conceptualise circular wavefronts spreading across a flat surface than it is to imagine irregular propagation of activity throughout a three-dimensional space. However, as long as the general assumptions hold (e.g. that some neurons are nearer one another than others, that activation will spread approximately outward from an activation centre, and that activation can spread through 'motor areas'), the principle of superposition should still enable an SWA network to operate. Further, it is not argued that all areas of the brain operate according to SWA principles; this would be obviously untenable. What is being offered is a general model that allows for the association of separate representations without pre-existing connections, in the same way that connectionism itself offers a general model for information processing. It is envisaged that an SWA network would be one of many network types that might be found in the animal brain (see for example the discussion below about pattern recognition).
A second concern relates to the timing of second-order wavefronts during execution of an XOR process (refer to the discussion under Figure 6). It has quite rightly been pointed out that the second-order wavefront used to negate the effects of first-order wavefronts will be delayed. Thus, as the model is presented above, the output unit will not be inhibited until after it has been excited. As was mentioned above, some of the more complicated details were omitted in the introductory discussion for the sake of clarity. One of those details is that an input will not be represented by a single wavefront. Rather, multiple wavefronts will continue to propagate at a constant frequency (dependent upon the units' refractory period) for as long as the original stimulus is present. In the preceding discussion output units have summed inputs spatially. If they instead require spatial and temporal summation of inputs in order to reach threshold, then the timing problem is solved: as long as the second-order wavefront passes through the motor zone before the output unit receives sufficient spatio-temporal excitation, and provided that inhibitory connections are at least as strong as excitatory ones, the output unit will never reach threshold.
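The timing argument can be sketched with a leaky integrator standing in for an output unit that sums its inputs over time. The leak rate, threshold, wavefront period and second-order delay are hypothetical values; the inhibitory weight is set equal in magnitude to the excitatory weight, matching the condition stated above.

```python
def reaches_threshold(excite_times, inhibit_times, w_exc=1.0, w_inh=-1.0,
                      leak=0.9, threshold=2.0, steps=40):
    """Return True if the output unit's potential ever reaches threshold.

    The potential decays by `leak` on every tick and is bumped whenever
    an excitatory or inhibitory wavefront crosses the footprint.
    """
    potential = 0.0
    for t in range(steps):
        potential *= leak
        if t in excite_times:
            potential += w_exc
        if t in inhibit_times:
            potential += w_inh
        if potential >= threshold:
            return True
    return False


PERIOD = 5  # ticks between successive wavefronts (set by the refractory period)
first_order = set(range(0, 40, PERIOD))   # excitatory wavefronts
second_order = set(range(7, 40, PERIOD))  # delayed inhibitory wavefronts

if __name__ == "__main__":
    print(reaches_threshold(first_order, set()))         # single input
    print(reaches_threshold(first_order, second_order))  # both inputs
```

With these values a single input needs several successive wavefronts to drive the unit to threshold, so the delayed inhibitory wavefronts arrive in time to hold the potential below threshold when both inputs are present.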
While a model with two associative layers ought to operate effectively when two or fewer inputs are presented to the system, it is unclear precisely how unstable the system will become when confronted with a larger number of inputs. Introducing larger numbers of inputs will generate many wavefronts and very complex interference patterns. Without actually constructing and testing a model it is difficult to say how the system will operate when confronted with many inputs. It would be expected, however, that a dual-associative-layer model will be insufficient in terms of discriminability and stability. One solution to this problem would be to add more associative layers. Theoretically, each additional layer may operate to reduce or dampen the amount of activation in the system. Thus, in the same way that the model above reduces two wavefronts in the first associative layer to a single wavefront in the second associative layer, it is conceivable that four wavefronts in a first associative layer may reduce to two in a second, and one in a third. Again, however, the precise manner in which even a system with multiple associative layers will operate will be difficult to predict given the non-linear dynamical nature of the system.
It is also unclear whether SWA networks could be made to exhibit the desirable properties of generalisation that are present in conventional networks. The models described in this paper will display a form of generalisation based upon 'perceptual similarity' of stimuli. Inputs that are spatio-temporally similar will create very similar activation wavefronts. Thus, motor units will have most difficulty discriminating between inputs that are near to one another spatially or temporally. In a sense, this inability to discriminate is generalisation: responses given to one input will tend to be given to similar inputs. But it is unlikely that this simple form of generalisation will be a sufficient replacement for conventional models of pattern recognition and the like. Thus, it is proposed that in more advanced models, the associative layer does not interface directly with input units. Instead a more conventional network (e.g. an unsupervised pattern recogniser) could be employed at the lowest levels of the system, processing input before passing a modified representation of the input layer to the associative layer. Properties of generalisation and default assignment may then be provided by the perceptual sub-network, while the associative capabilities of an SWA network increase the whole system's associative power.
Finally, it is interesting to note that the fundamental principles underlying the SWA model were actually described (though in rather general terms) by Karl Lashley as far back as 1951, in his influential paper "The Problem of Serial Order in Behavior". To quote directly from the paper:
I believe, that there exist in the nervous organization, elaborate systems of interrelated neurons capable of imposing certain types of integration upon a large number of widely spaced effector elements; in the one case transmitting temporally spaced waves of facilitative excitation to all effector elements; in the other imparting a directional polarization to both receptor and effector elements...
I can best illustrate this conception of nervous action by picturing the brain as the surface of a lake. The prevailing breeze carries small ripples in its direction, the basic polarity of the system. Varying gusts set up crossing systems of waves, which do not destroy the first ripples, but modify their form, a second level in the system of space coordinates. A tossing log with its own period of submersion sends out periodic bursts of ripples, a temporal rhythm. The bow wave of a speeding boat momentarily sweeps over the surface, seems to obliterate the smaller waves yet leaves them unchanged by its passing, the transient effect of a strong stimulus... (Lashley, 1951, p133).
Atrens, D. and Curthoys, I. (1982) The Neurosciences and Behaviour: An introduction, 2nd edn. Sydney: Academic Press.
Bairaktaris, D. (1994) The problem of temporal order in connectionist networks and its implications for short-term memory modelling. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.
Cairns, D.E., Baddeley, R.J. and Smith, L.E. (1994) Phase constraints on synchronising oscillator networks. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.
Chater, N. and Conkey, P. (1994) Sequence processing with recurrent neural networks. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.
Cleeremans, A. (1993) Mechanisms of Implicit Learning: Connectionist models of sequence processing. Cambridge, MA: MIT Press.
Cleeremans, A., Servan-Schreiber, D. and McClelland, J.L. (1989) Finite state automata and simple recurrent networks. Neural Computation, 1(3), 372-381.
Elman, J. (1990) Finding structure in time. Cognitive Science, 14, 179-211.
Feldman, J.A. and Ballard, D.H. (1982) Connectionist models and their properties. Cognitive Science, 6(3), 205-254.
Fodor, J.A. and Pylyshyn, Z.W. (1988) Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.
Gershon, M.D., Schwartz, J.H. and Kandel, E.R. (1985) Morphology of chemical synapses and patterns of interconnection. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.
Kamin, L.J. (1969) Predictability, surprise, attention, and conditioning. In Campbell and Church (Eds), Punishment and Aversive Behavior, NY: Appleton-Century-Crofts.
Kandel, E.R. (1985) Factors controlling transmitter release. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.
Kandel, E.R. and Siegelbaum, S. (1985) Principles underlying electrical and chemical synaptic transmission. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.
Kentridge, R.W. (1994) Critical dynamics of neural networks with spatially localised connections. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.
Lashley, K.S. (1951) The problem of serial order in behavior. In L.A. Jeffress (Ed.), Cerebral mechanisms in behavior: The Hixon Symposium. New York: Wiley.
McClelland, J.L., Rumelhart, D.E. and Hinton, G.E. (1986) The appeal of parallel distributed processing. In Rumelhart, McClelland and the PDP Research Group (Eds), Parallel Distributed Processing volume 1: Foundations, Cambridge, MA: MIT Press.
Marczell, Zs., Kalmar, Zs. and Lorincz, A. (1995) Generalized skeleton formation for texture segmentation. Neural Network World.
Oaksford, M. and Brown, G.D.A. (Eds) (1994) Neurodynamics and Psychology, London: Academic Press.
Olah, M. and Lorincz, A. (1995) Analog VLSI implementation of grassfire transformation for generalised skeleton formation. Proceedings of ICANN'95.
Pearce, J.M. (1994) Discrimination and categorization. In N.J. Mackintosh (Ed), Animal Learning and Cognition, London: Academic Press.
Reeke, G.N. (1992) Neural nets and neuronal nets: How much like the nervous system should a model be? In Fidia Research Foundation (Eds), Neuropsychology: The Neuronal Basis of Cognitive Function, volume 2, NY: Thieme Medical Publishers, Inc.
Rumelhart, D.E., Hinton, G.E. and McClelland, J.L. (1986) A general framework for parallel distributed processing. In Rumelhart, McClelland and the PDP Research Group (Eds), Parallel Distributed Processing volume 1: Foundations, Cambridge, MA: MIT Press.
Stevens, C.F. (1989) How cortical interconnectedness varies with network size. Neural Computation, 1, 473-479.
Stratford, K.J. (1992) Nerve cell modeling: The benefits of compartmentalized thinking. In Fidia Research Foundation (Eds), Neuropsychology: The Neuronal Basis of Cognitive Function, volume 2, NY: Thieme Medical Publishers, Inc.
Waibel, A. (1992) Modular construction of time-delay neural networks for speech recognition. Neural Computation, 1, 39-46.
Wearden, J. (1994) Prescriptions for models of biopsychological time. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.