Introducing a New Connectionist Model:

The Spreading Waves of Activation network


Scott Anthony Gazzard

Department of Psychology

University of Sydney





Abstract


This paper introduces a novel approach to connectionist cognitive modelling. The Spreading Waves of Activation (SWA) model involves circular waves of activation that radiate across layers of processing units. Unlike conventional connectionist networks, there is no reliance on direct or near-direct pre-existing pathways between input and output units. Rather, inputs lead to the propagation of activation waves which pass throughout local areas of units connected to output units. In this way, particular activation patterns unique to any input can be potentially associated with any output, even though the input and output units may be spatially distant. Further, it is proposed that interference patterns created by multiple activation waves enable the model to truly encode patterns of inputs distributed throughout time as well as those presented simultaneously. It is thus hoped that this approach may provide a solution to several of connectionism's present structural and functional limitations.



Introduction

This paper introduces a new cognitive model based on parallel-processing connectionism, but with a number of unique functional properties. The Spreading Waves of Activation (SWA) model is being presented in the hope of taking a step towards a comprehensive, neurally plausible solution to the problems of temporal encoding and creating configural representations. Essentially, the model is based upon a conventional neural network architecture with several significant modifications to patterns of connectivity. The model encodes information as 'waves' of activation rather than activity patterns among separate processing units. The primary advantages of such a system stem from the principle of superposition in wave-dynamic systems, whereby two or more waves 'interfere' with one another. It is envisaged that by exploiting interference patterns, the system will be able to associate complex input patterns in a way that allows it to efficiently deal with temporal phenomena.

It should be stressed that the model presented below is in its earliest stages of development, and efforts to implement the SWA model as a working network are only beginning. It is therefore likely that at least some of the assertions made in later sections will not prove accurate in a real SWA network. The purpose of this paper is to introduce the SWA operating principles and to encourage commentary and development.

Dealing with temporal phenomena

The conventional three-layer network (comprising input, hidden and output layers) has proven very effective in an increasingly diverse range of tasks involving mapping particular output patterns to input patterns. However, early models could not (and were not designed to) take account of temporal phenomena. They existed in a theoretical temporal vacuum, where output patterns of activation were mapped to input patterns that did not incorporate the dimension of time. If time was considered at all, it was usually with regard to abstract operational constraints such as Feldman's 100-step constraint (Feldman and Ballard, 1982). However, as cognitive scientists have long been aware (cf. Lashley, 1951), in the real world no set of input patterns exists outside of the dimension of time. Neither is real cognition a matter of 'settling' into a static solution based upon stable inputs. Rather, the intelligent system's environment is in a continual state of flux. New inputs are continually arriving and need to be processed with regard for inputs that have been recently apprehended. An AI system must then take account of four dimensions (3 space + 1 time) before even the simplest forms of learning such as classical and instrumental conditioning (which both involve learning time-dependent contingencies) can be properly modelled.

Several methods have been proposed for the incorporation of time into a connectionist framework. One of the earliest and most obvious solutions to the representation of time, known as the "moving window" approach, involves the explicit representation of a system's history as an activation pattern over a set of dedicated processing units. While models employing this approach have proved successful, they are subject to several limitations, and are questionable on the grounds of biological plausibility (see Elman, 1990). An alternative to the moving window paradigm was introduced by Jordan in 1986, whereby the output from a conventional three-layer connectionist network is re-entered as a form of input to hidden units at a subsequent time step. This model was modified by Elman (1990) such that a copy of the activation pattern over hidden units is re-entered as a form of input after a delay period. These networks have been termed Simple Recurrent Networks (SRNs). As SRNs contain some representation of their own recent history, they are able to use this information to mediate behaviour. Functionally, SRNs have demonstrated impressive capabilities, including learning to solve a form of the exclusive-or problem distributed over time, learning complex letter and word sequences constrained by pseudo-grammars, and speech recognition. Further, SRNs have demonstrated the ability to abstract structure from sets of stimulus sequences, and do not require absolute precision in the timing of stimulus presentation (Elman, 1990; Cleeremans, Servan-Schreiber, and McClelland, 1989).

Recently, an approach termed Neurodynamics has been suggested (Oaksford & Brown, 1994), with the aim of specifying neurophysiologically sound systems that take account of change across time. A neurodynamic approach views the intelligent system as an extremely complex non-linear system constantly moving along a trajectory in high-dimensional phase space. The behaviour of the brain is thus unpredictable in terms of specifying exact micro-level patterns of activation, although more general (and stable) patterns of activation may be observed. Indeed, because SRNs re-enter information, they are themselves non-linear systems with the capacity for complex and varied behaviour. The major challenge facing such systems is their potential instability and their sensitivity to initial conditions. Unless systems are naturally stable (in terms of being resistant to minor variations in system parameters), it is unlikely that they will make useful cognitive models. An unstable system may completely alter its behaviour in the face of only minor changes to initial parameter settings, something that neurodynamicists have already discovered (eg. Kentridge, 1994). This instability contrasts greatly with observations of the animal brain, which is able to retain a great deal of functionality in the face of relatively substantial alterations to initial conditions, for example those brought about by lesioning or psychoactive chemicals. However, it is possible to design complex systems based upon a small number of simple operating principles with a high degree of stability and which behave in orderly ways, and this is precisely the intention held when creating the SWA network.

Creating configural representations

Learning theorists have long been interested in the ability of animals to acquire complex associations among elemental features of a stimulus and the stimulus as a whole. Of particular interest is the question of whether a compound stimulus (a stimulus comprising a number of features such as colour, shape and size) is represented as a whole that enters into an associative relationship with other stimuli, or whether associations form separately to individual features. The latter view reflects an elemental view of conditioning, while the former embodies a configural approach. For a number of reasons, it appears necessary to accept some form of configural theory over elemental theories in order to account for findings in animal learning experiments (see Pearce, 1994). Interestingly, connectionists have come to essentially the same conclusion about configural representations, though their motivation stems from the computational limitations inherent in non-configural systems.

To illustrate the problem, consider the typical exclusive-or (XOR) problem that has long concerned connectionists. In learning theory terminology, this corresponds to the A+ B+ AB0 contingency. Initially, the network is trained to associate an arbitrary input pattern (stimulus A) with a particular output pattern (response R). Then, the network is trained to associate a second input pattern (stimulus B, which does not comprise any of the units included in stimulus A) with response R also. After such training the system will respond with output pattern R when either stimulus A or B is presented. Naturally, if both A and B are presented simultaneously to the system, it will also respond with R (and its tendency to do so should be stronger than that elicited by presentation of A or B alone).

However, the system ought to be capable of learning not to respond with R when both A and B are presented, since animals can readily learn such a contingency (Pearce, 1994). This contingency exemplifies the XOR logic problem. Conventional three-layer connectionist systems are also quite capable of learning this contingency, but in order to do so, both input representations (or at least some parts of the representations) must be bound in some way. The binding of stimulus elements effectively creates a new representation, which might be called 'A&B' (whether the representation 'A&B' subsumes 'A' and 'B' or is entirely separate from either element is a separate question which will not be addressed here). Note that I refer here to a representation not necessarily as a physical entity (ie. it is not a designated unit, nor even a collection of units), but rather as a product of the connections themselves -- consistent with the original notions of connectionism. Without such a configural representation the system cannot distinguish between the presentation of A alone or B alone, and the compound stimulus AB. With it, the system can learn to inhibit response R when the compound stimulus is presented, thereby solving the XOR problem. In other words, in order to learn the XOR contingency a system must form a configural representation that is something more than the simultaneous activation of representations of A and B.
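The argument can be made concrete with a minimal sketch, which is an illustration only and not part of the SWA model. It assumes a single linear threshold unit for response R, hand-set weights, and (contrary to the note above that a representation need not be a designated unit) a hypothetical dedicated 'A&B' element, purely for simplicity.

```python
def threshold_unit(inputs, weights, threshold=0.5):
    """Linear threshold unit: returns 1 when the weighted sum exceeds threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def elemental_response(a, b):
    # Only separate excitatory weights from A and from B onto response R.
    return threshold_unit([a, b], [1.0, 1.0])


def configural_response(a, b):
    # Hypothetical 'A&B' element, active only when both stimuli are present,
    # carrying a strong inhibitory weight onto response R.
    a_and_b = a * b
    return threshold_unit([a, b, a_and_b], [1.0, 1.0, -2.0])


for a, b in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(a, b, elemental_response(a, b), configural_response(a, b))
```

With elemental weights alone the unit necessarily responds to the compound AB, since the excitation from A and B simply adds. Only the version with the extra configural element withholds R on the compound trial.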

To date, connectionist models appear to have been very good at demonstrating learning involving composite representations. However, this apparent success incurs the cost of a degree of physiological implausibility. Understandably, cognitive modellers have approached the problem of the complexity of the animal brain by adopting a reductionist approach. Instead of dealing with over 10,000,000,000,000 processing units, they often construct models with fewer than one hundred, believing that the principles that apply to their reduced models can be applied to the animal brain. The assumption is that big systems and small systems behave identically, except that big systems are 'smarter'. However, studies of non-linear systems show us that this will not necessarily be the case - especially in systems as complex as the animal brain - and that a more holistic view of the brain should be pursued.

Most feed-forward three-layer networks are designed so that each unit in the hidden layer is connected to each unit in the input layer, and that each unit in the hidden layer is also connected to each unit in the output layer. That is, every unit can potentially be bound to every other unit in the same layer via a hierarchically superior or inferior unit. This structural characteristic of direct or near-direct connectivity is an essential feature if the network is to learn to map arbitrary input and output patterns. While this connectivity results in extremely efficient networks, an application of the same principle to something approaching the magnitude of the brain would require that each neuron in the brain communicated with up to two thirds of all other neurons, which is clearly unrealistic.

This problem of scaling is a serious limitation for cognitive modellers using conventional connectionism. In the animal brain, how can any two or more representations be combined logically (in the sense of OR, AND, NOT relationships) if there are no pre-existing direct pathways that effectively link every unit to every other unit in any layer? This is the problem, along with the problem of temporal encoding, to be addressed by the SWA model. Notably, both problems share the same solution, since the SWA model is able to deal with representations that are presented simultaneously in precisely the same manner as those that are distributed across time. Furthermore, the model solves these problems without the conventional necessity of prescribing extensive interconnectedness among processing units.

Basic architecture of the SWA network

The basic architecture of the SWA model comprises a large number of pyramidal neuron-like processing units arranged in horizontal layers (note that pyramidal neurons are believed to be the most common type of neuron in the mammalian cortex; Stratford, 1992). The most important property of these processing units is their general pattern of connectivity. As is the case for pyramidal neurons, the probability that any two processing units will be connected decreases as a function of the distance between the two units. That is, any one unit will have a large number of connections to units that are nearby, and progressively fewer connections to units that are farther away. It is not necessary to apply a similar principle to the initial assignment of weights between units. Though it is not essential to the SWA model, these units can be considered linear threshold units that output an oscillatory signal which increases in frequency according to a sigmoidal function of the unit's activation level. The units must have a short refractory period following supra-liminal activation, the functional property of adaptation, or both. Either attribute will result in a unit that remains at relatively low levels of activity for a short period of time after sustained supra-threshold activation.
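The connectivity rule can be sketched as follows. The model only requires that connection probability decreases with distance, so the exponential fall-off, the `length_scale` parameter, and the square grid of unit positions used here are all assumptions made for illustration.

```python
import math
import random


def connection_probability(distance, length_scale=1.0):
    """Assumed exponential fall-off; any decreasing function would serve."""
    return math.exp(-distance / length_scale)


def build_connections(width, height, length_scale=1.0, seed=0):
    """Randomly connect units on a grid, favouring nearby pairs."""
    rng = random.Random(seed)
    units = [(x, y) for x in range(width) for y in range(height)]
    connections = {unit: [] for unit in units}
    for source in units:
        for target in units:
            if source == target:
                continue  # no self-connections in this sketch
            distance = math.dist(source, target)
            if rng.random() < connection_probability(distance, length_scale):
                connections[source].append(target)
    return connections


connections = build_connections(8, 8)
lengths = [math.dist(u, v) for u, targets in connections.items() for v in targets]
print("mean connection length:", sum(lengths) / len(lengths))
```

Most of the connections produced in this way are short, so each unit is richly connected to its neighbours and only sparsely to distant units. Initial weights are deliberately left out, since the text does not require them to depend on distance.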

For the purposes of this introductory discussion, a three-layer network will be described, though ultimately a system comprising more layers will be required, for reasons to be explained below. Consider then, three layers which at a first approximation correspond to the conventional connectionist three-layer networks comprising an input layer, an output layer, and an intervening (hidden) layer. In the SWA model the intervening layer is critical, and will be referred to henceforth as the associative layer, to denote its functional significance. The general pattern of connectivity between layers in this model differs greatly from that of the conventional network, as can be seen in Figure 1. Whereas a conventional network will generally have connections from units across the entire surface of the intervening layer to input and output layers, Figure 1 shows that the associative layer in the SWA model interfaces with the input and output layers in only small bands that extend across the layer's surface. To more easily demonstrate the functional properties of the model, the input and output areas on the associative layer have been placed at opposite ends of the plane; however, this is not a necessary feature of the model. These input and output areas will be referred to as the sensory zone and motor zone, respectively. Figure 1 also indicates that the associative layer is relatively large compared to input and output layers.




Functional properties of the SWA network

The structural features introduced above can be used to construct a network that propagates a circular wavefront of activation which spreads from a cluster of units that are activated simultaneously. The mechanisms of this wavefront propagation are described below, and are represented graphically in Figure 2. Consider for the following discussion only the associative layer of the network.

In Figure 2, the associative layer is viewed from above, and activation is represented by tone; dark areas represent high activity, light areas represent low activity. Note that in a state of relaxation (prior to external activation of any units) there exists a level of noisy background activation, and that a high level of activation over a number of units is required before further activity is generated. The first step in the formation of an activation wavefront is the simultaneous activation of a cluster of nearby units (Figure 2a). By virtue of the large number of connections to nearby units, the activation of this cluster will tend to activate units in the immediate vicinity (Figure 2b). Due to adaptation by the original cluster of units, activation will decrease in these units for a short period. The activation of the ring of units surrounding the original cluster subsequently activates adjacent units on the outer edge of the ring (units on the inner edge of the ring will not return to a highly active state due to adaptation/the refractory period). Thus the wave of activation propagates outward only.
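The propagation mechanism can be illustrated with a small discrete sketch. It is a deliberate simplification and not the proposed implementation: units are assumed to sit on a square grid, to connect only to their eight immediate neighbours, and to take one of three kinds of state (resting, active, refractory). Background noise is omitted, and the parameters `refractory_steps` and `threshold` are hypothetical.

```python
REST, ACTIVE = 0, 1  # any state above ACTIVE counts down a refractory period


def step(grid, refractory_steps=2, threshold=1):
    """Advance every unit by one time step."""
    size = len(grid)
    new_grid = [[REST] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            state = grid[y][x]
            if state == ACTIVE:
                # An active unit becomes refractory on the next step.
                new_grid[y][x] = ACTIVE + refractory_steps
            elif state > ACTIVE:
                # Count down the refractory period, then return to rest.
                new_grid[y][x] = state - 1 if state - 1 > ACTIVE else REST
            else:
                # A resting unit fires if enough neighbours are active.
                active_neighbours = sum(
                    grid[ny][nx] == ACTIVE
                    for ny in range(max(0, y - 1), min(size, y + 2))
                    for nx in range(max(0, x - 1), min(size, x + 2))
                    if (ny, nx) != (y, x)
                )
                if active_neighbours >= threshold:
                    new_grid[y][x] = ACTIVE
    return new_grid


def seeded_grid(size, centre):
    """Resting grid with a 3x3 cluster of active units around centre."""
    grid = [[REST] * size for _ in range(size)]
    cx, cy = centre
    for y in range(cy - 1, cy + 2):
        for x in range(cx - 1, cx + 2):
            grid[y][x] = ACTIVE
    return grid


def active_cells(grid):
    return {(x, y) for y, row in enumerate(grid)
            for x, state in enumerate(row) if state == ACTIVE}


grid = seeded_grid(15, (7, 7))
for _ in range(3):
    grid = step(grid)
print(sorted(active_cells(grid)))
```

Because units behind the front are refractory when their outer neighbours fire, activity moves outward only. The front in this sketch is a square ring rather than a circle, which is an artefact of the eight-neighbour grid. Distance-dependent random connectivity over many units would be expected to give a rounder front.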


In sum, the SWA model is based upon the propagation of circular waves of activation which spread across a single layer of pyramidal neuron-type processing units. With the simple architecture illustrated above (ie. a single associative layer with sensory and motor zones at opposite ends), the network will be able to associate any single input signal with any number of output patterns. More complex input patterns require a more complicated model, but before considering such complexities it is important to explain the operation of the three layer network as displayed in Figure 1.

Two additional assumptions must be made before the network will operate as desired:

i) activation of a single 'receptor site' on the input layer will lead to the activation of a cluster of units in the sensory zone of the associative layer. This will best be achieved via coarse coding, so that information about any single receptor site is encoded a number of times. Ideally, each unit in the associative layer will have a small circular 'footprint' or catchment area in the input layer, and there will be a great deal of overlapping footprints covering any particular receptor site. Figure 3a illustrates how coarse coding may lead to cluster activation in the associative layer, with dark circles representing activated units or receptor sites.
ii) a single motor unit in the output layer will receive input from a number of units in a localised area of the motor zone in the associative layer (see Figure 3b). This principle is the converse of the above, in that a large number of units in a lower-level layer project to a single unit in the superior layer, whereas the former principle entails a single receptor site resulting in a number of units being activated in a superior layer. However, both principles involve single units at higher levels projecting connections to a localised area (the 'footprint') in the preceding layer. Through modification of weights on the many connections to the motor zone, any unit in the output layer could be trained to associate a given wavefront with a given output. This principle will be examined further below.
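Assumption (i) can be sketched in a few lines. The sketch assumes that associative-layer units and receptor sites share one coordinate system (a topographic projection) and that each footprint is a circle of hypothetical radius `footprint_radius`. Neither detail is fixed by the model.

```python
import math


def activated_cluster(receptor, units, footprint_radius=1.5):
    """Units whose circular footprint in the input layer covers the receptor."""
    return {unit for unit in units if math.dist(unit, receptor) <= footprint_radius}


# Sensory-zone units laid out on a small grid (positions are illustrative).
units = [(x, y) for x in range(6) for y in range(6)]

cluster_a = activated_cluster((2, 2), units)
cluster_b = activated_cluster((3, 2), units)
print(len(cluster_a), len(cluster_b), len(cluster_a & cluster_b))
```

A single receptor site therefore activates a whole cluster of units. Neighbouring receptor sites activate overlapping but distinct clusters, which preserves the topography of the input layer.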


With the above assumptions made (neither of which is particularly problematic from a physiological viewpoint), any simple input can be associated with any output. When a single receptor site is stimulated on the input layer, coarse coding will result in a cluster of units being activated in the sensory area of the associative layer. The architecture described above will lead naturally to a topographic representation of the input layer in the sensory zone. Therefore, stimulation of different receptor sites will result in different associative-layer clusters being activated. From the activated cluster of units, a wave of activation will spread across the associative layer, travelling from the sensory zone, to and through the motor zone. Note that the activation wave passes throughout all parts of the motor zone, so that every unit in the output layer can potentially be associated with activation.

However, for the network to be useful, units in the output layer must be able to discriminate between different inputs. This can be done because every cluster of units originally activated in the sensory zone creates a wavefront that has a unique radius and orientation. The wavefront produced by a particular cluster effectively transmits a 'signature' pattern of activation to the footprint of each motor unit. Figure 4 illustrates this point. Learning to activate a desired output unit in response to activation of a given sensory zone-cluster then simply requires modifying weights among the connections that project from the output unit, so that when the sensory zone-cluster's signature wavefront passes through the output unit's footprint area, the output unit is activated. I wish to again stress the point that every receptor will cause a wavefront to pass throughout the entire motor area, so that potentially any receptor site can be associated with any output unit, even though no direct links exist between the two. Hence the model overcomes the awkward requirement of near-direct connections between input and output units in conventional networks. Also note that in this model all learning occurs in the direct connections to output units, so learning algorithms such as backpropagation of error are no longer required.
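The idea of a signature can be sketched as follows. This is a sketch under several assumptions that the text does not specify: the signature is taken to be the set of footprint units lying on the wavefront at the moment it reaches the footprint centre, the footprint is a small square patch, and the output unit's weights are adjusted with a simple perceptron-style error-correction rule.

```python
import math


def signature(source, centre, half_width=2, thickness=0.5):
    """Binary pattern over a square footprint when the front reaches its centre."""
    radius = math.dist(source, centre)
    pattern = []
    for dy in range(-half_width, half_width + 1):
        for dx in range(-half_width, half_width + 1):
            unit = (centre[0] + dx, centre[1] + dy)
            on_front = abs(math.dist(source, unit) - radius) < thickness
            pattern.append(1 if on_front else 0)
    return pattern


def respond(weights, bias, pattern):
    total = sum(w * x for w, x in zip(weights, pattern)) + bias
    return 1 if total > 0 else 0


def train_output_unit(examples, epochs=20, rate=1.0):
    """Adjust only the weights on connections into the output unit."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for pattern, target in examples:
            error = target - respond(weights, bias, pattern)
            bias += rate * error
            weights = [w + rate * error * x for w, x in zip(weights, pattern)]
    return weights, bias


footprint_centre = (10, 10)
signature_a = signature((0, 10), footprint_centre)  # wave arriving from the left
signature_b = signature((10, 0), footprint_centre)  # wave arriving from below
weights, bias = train_output_unit([(signature_a, 1), (signature_b, 0)])
print(respond(weights, bias, signature_a), respond(weights, bias, signature_b))
```

The two sources produce fronts of the same radius but different orientation across the footprint, so the output unit can learn to respond to one and not the other by local weight changes alone, with no error signal propagated to earlier layers.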


The simple three-layer SWA network that has been described above is able to do what conventional three-layer networks have not, that is to form associations between spatially distant input and output units. However, the model as it stands is severely limited in terms of its associative capabilities. For example, the ability to combine inputs so that the exclusive-or problem might be learned is non-existent, since the model only works effectively when one receptor at a time is stimulated. Similarly, there would appear to be no mechanism for encoding information distributed temporally as well as spatially. Nevertheless, I believe that these problems can be solved using SWA principles, though a more complex architecture is required in order to fully exploit the associative power of the SWA network.


SWA networks with multiple associative layers

Figure 5 illustrates the next 'evolutionary' step forward for the SWA model. Essentially this network comprises an input layer, two associative layers, and an output layer. Note that the input layer projects onto the first associative layer only, while units in the output layer project to both associative layers. For single-input to output associations this network will perform identically to the simple three-layer network described above, as the output layer still connects to the first associative layer. The second associative layer becomes important however, when two inputs are entered simultaneously, or temporally near to one another.

Now, one further assumption must be allowed before the dual associative layers become useful. Recall that processing units have the property that the number of connections to other units decreases as a function of distance. Recall also that a certain level of activation in a cluster of units is required to regenerate an activation wave. The former property means that if the distance between layers is greater than the average distance between units within layers (as would be expected), then there will be many connections to units in the same layer, and relatively few connections to units in adjacent layers. Thus, a faint trace of wavefronts traversing one layer will be 'echoed' in adjacent layers. However, because the amplitude of the trace is much weaker than the original, the wavefront trace will not have sufficient amplitude to regenerate itself.


A brief digression is necessary here, in order to highlight a very important feature of wave-dynamic systems. Some of the most interesting and scientifically important properties of waves in physical systems (whether they be matter waves, sound waves or electromagnetic waves) derive from the interference patterns produced when two or more waves interact. Where two waves meet, the displacements of the waves are summed at that point, according to the principle of superposition. If the peak of one wave meets the trough of another of equal amplitude, the two waves will cancel each other out at that point. If the peak of one wave meets the peak of another, the amplitude at that point will equal the sum of the amplitudes of the two waves. When two circular spreading wavefronts such as those proposed in the SWA model (or those observed in ripples on the surface of a pond) meet, a unique interference pattern results from the combined amplitudes of the wavefronts. Importantly, the interference pattern changes over time, initially creating a single area of particularly high amplitude which subsequently splits into two areas that move away from one another, as can be seen in Figure 6.
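Superposition of two circular waves can be checked numerically with a short sketch. It assumes idealised sinusoidal ripples of unit amplitude that do not decay with distance, and the `wavelength` and `speed` values are arbitrary. SWA wavefronts are pulses of unit activity rather than sinusoids, so this illustrates the physical principle only.

```python
import math


def ripple(point, source, t, wavelength=4.0, speed=1.0):
    """Displacement at point and time t due to a ripple started at source at t = 0."""
    r = math.dist(point, source)
    if r > speed * t:
        return 0.0  # the wave has not yet arrived
    return math.cos(2 * math.pi * (r - speed * t) / wavelength)


def superpose(point, sources, t):
    """Principle of superposition: displacements simply add."""
    return sum(ripple(point, source, t) for source in sources)


sources = [(0.0, 0.0), (8.0, 0.0)]
# Equidistant from both sources: the two crests coincide.
print(superpose((4.0, 0.0), sources, t=4.0))
# Path difference of half a wavelength: crest meets trough.
print(superpose((3.0, 0.0), sources, t=5.0))
```

At the midpoint the displacement is twice that of a single wave, while at the second point the two contributions cancel.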


Returning again to the theoretical SWA model, the principle of wave interference can be utilised to vastly increase the associative power of the network. Consider the effect of simultaneously stimulating two receptor sites on the input layer. This will lead to two particular unit clusters being simultaneously activated in the sensory zone of the first associative layer, which will in turn initiate the propagation of two wavefronts. At a point half way between the original clusters, interference will produce a concentrated area of high activation near the sensory zone. Recall that a weaker trace of activity on the first associative layer is constantly being transmitted to the second associative layer. Usually, the weaker trace is insufficient to initiate any further activity, but the high level of activation resulting from interference in the first associative layer may result in enough activation in the second associative layer to begin generation of a new wavefront. This second-order wavefront propagates from a cluster in the second associative layer directly above the meeting point of the original wavefronts. Thus, the second-order wavefront represents a combination of the original inputs.
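The condition for generating a second-order wavefront amounts to a threshold on the attenuated trace. A minimal sketch follows, in which the `coupling` and `threshold` values are hypothetical and chosen only so that one wavefront is insufficient and two overlapping wavefronts are sufficient.

```python
def second_layer_ignites(first_layer_activation, coupling=0.4, threshold=0.6):
    """True if the weak inter-layer trace is strong enough to start a new wavefront."""
    trace = coupling * first_layer_activation
    return trace >= threshold


single_wavefront = 1.0          # activation where one wavefront passes
interference_peak = 1.0 + 1.0   # two wavefronts superposed at their meeting point

print(second_layer_ignites(single_wavefront))
print(second_layer_ignites(interference_peak))
```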

According to now-familiar principles, the second-order wavefront will spread across its associative layer until it passes through its corresponding motor zone where any motor unit could become associated with its signature activation pattern. Hence, the network can associate a response to either input alone, and/or to a representation of both inputs combined. Again, the association is performed without the need for pre-existing direct or near-direct connections between elements, so spatially distant inputs can easily be combined. Solving the exclusive-or problem becomes a simple matter of increasing weights on inhibitory connections to whatever output units are associated with the individual inputs. In this way, the response that is turned 'on' when either input is presented alone can be turned 'off' via second-order activity when both inputs are present. Output units here are connected to all associative layers, projecting a local footprint onto each. This would mean that output depends not only on one associative layer, but on the activation patterns across small areas of all layers, and in fact it would appear that this is necessary for effective discrimination to occur between input patterns.

The mechanism of wavefront interference allows combination of two inputs not only presented simultaneously, but also those distributed temporally. In fact, if stimulation of two receptor sites on the input layer is caused by stimulus A and stimulus B, then the system easily discriminates between the following three cases: i) A and B presented simultaneously; ii) A presented before B; and iii) B presented before A. Furthermore, the system is able to discriminate between the case of i) A preceding B by 1 second; and ii) A preceding B by 0.5 seconds. In other words, true temporal encoding (as opposed to merely sequential encoding) of inputs is achieved. Temporal coding is possible because when, for example, stimulus B is presented before stimulus A, the wavefront in the associative layer begins to propagate from the cluster associated with stimulus B before the cluster associated with stimulus A is activated. Thus the point at which the two wavefronts meet will be further from B than from A. This will lead to the propagation of a different second-order wavefront in the second associative layer that will transmit a unique signature activation pattern to second-level motor units. The greater the time lag between stimulation of B and A, the further away from B the point of initial interference becomes. Figure 7 demonstrates these points.
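The dependence of the meeting point on the time lag can be worked out directly, assuming (as the text does not state explicitly) that both wavefronts travel at the same constant speed. If the clusters are a distance d apart and B precedes A by a lag of dt, the fronts first meet on the line joining them when v*t + v*(t - dt) = d, that is, at a distance (d + v*dt)/2 from B. The function and parameter names below are illustrative.

```python
def first_meeting_distance_from_b(separation, speed, lag):
    """Distance from cluster B at which the two wavefronts first meet.

    separation: distance between the clusters activated by A and B
    speed:      propagation speed, assumed equal and constant for both fronts
    lag:        time by which stimulus B precedes stimulus A (>= 0)
    """
    if lag < 0:
        raise ValueError("lag must be non-negative; swap A and B instead")
    if speed * lag >= separation:
        return None  # B's wavefront passes A's cluster before A is activated
    return (separation + speed * lag) / 2.0


for lag in (0.0, 0.5, 1.0):
    print(lag, first_meeting_distance_from_b(separation=10.0, speed=2.0, lag=lag))
```

With simultaneous presentation the fronts meet half way, and each additional unit of lag moves the meeting point away from B by half the distance a front travels in that time. Different lags therefore seed the second-order wavefront at different locations.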



Further considerations of the SWA model

It was argued above that the SWA model with two associative layers will be able to successfully combine information from two inputs presented to the system simultaneously, or distributed over time. Conventional networks are able to exhibit the same behaviour, but they do so using architectures that rely on extensive interconnectedness among processing units. It is hoped that the SWA model may provide a useful connectionist framework without such reliance. Before concluding this discussion there are several further considerations of the SWA model that I wish to highlight briefly.

First, an objection which has already been raised a number of times to the SWA model is that the animal brain is not composed of perfect two-dimensional layers as depicted in the above discussion. Of course, it is true that such a rigid laminar structure is a fiction. However, the layers in the SWA model ought not be taken as literal layers, but as virtual layers. The model described above utilises two-dimensional layers because it is easier to conceptualise circular wavefronts spreading across a flat surface than it is to imagine irregular propagation of activity throughout a three-dimensional space. However, as long as the general assumptions hold (eg. that some neurons are nearer one another than others, that activation will spread approximately outward from an activation centre, that activation can spread through 'motor areas'), the principle of superposition should still enable a SWA network to operate. Further, it is not argued that all areas of the brain operate according to SWA principles -- this would be obviously untenable. What is being offered is a general model that allows for the association of separate representations without pre-existing connections, in the same way that connectionism itself offers a general model for information processing. It is envisaged that an SWA network would be one of many network types that might be found in the animal brain (see for example the discussion below about pattern recognition).

A second concern relates to timing of second-order wavefronts during execution of an XOR process (refer to the discussion under Figure 6). It has quite rightly been pointed out that the second-order wavefront used to negate the effects of first-order wavefronts will be delayed. Thus, as the model is presented above, the output unit will not be inhibited until after it has been excited. As was mentioned above, some of the more complicated details were omitted in the introductory discussion for the sake of clarity. One of those details is that any input will not be represented by a single wavefront. Rather, multiple wavefronts will continue to propagate at a constant frequency (dependent upon the units' refractory period) for as long as the original stimulus is present. In the preceding discussion output units have summed inputs spatially. If they instead require spatial and temporal summation of inputs in order to reach threshold, then the timing problem is solved: as long as the second-order wavefront passes through the motor zone before the output unit receives sufficient spatio-temporal excitation, and provided that inhibitory connections are at least as strong as excitatory ones, the output unit will never reach threshold.
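The proposed resolution can be sketched with a leaky integrator standing in for an output unit that sums its input over time. The decay rate, threshold, wavefront period and delay below are all hypothetical values, chosen so that one wavefront alone cannot bring the unit to threshold.

```python
def reaches_threshold(excitation, inhibition, threshold=1.5, decay=0.9):
    """Leaky temporal summation: True if the unit's potential ever reaches threshold."""
    potential = 0.0
    for excite, inhibit in zip(excitation, inhibition):
        # Potential decays, then excitatory and inhibitory input is added.
        potential = max(0.0, decay * potential + excite - inhibit)
        if potential >= threshold:
            return True
    return False


steps = 12
# First-order wavefronts cross the footprint every third time step.
excitation = [1.0 if t % 3 == 0 else 0.0 for t in range(steps)]
# Second-order wavefronts arrive one step later, with equal strength.
delayed_inhibition = [1.0 if t % 3 == 1 else 0.0 for t in range(steps)]
no_inhibition = [0.0] * steps

print(reaches_threshold(excitation, no_inhibition))       # single input alone
print(reaches_threshold(excitation, delayed_inhibition))  # both inputs present
```

With a single input, successive wavefronts accumulate until the unit fires. When the delayed second-order wavefronts are also present, each one cancels the excitation left by the preceding first-order wavefront before the next arrives, so the unit stays below threshold even though inhibition always lags excitation.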

While a model with two associative layers ought to operate effectively when two or fewer inputs are presented to the system, it is unclear precisely how unstable the system will become when confronted with a larger number of inputs. Introducing larger numbers of inputs will generate many wavefronts and very complex interference patterns. Without actually constructing and testing a model, it is difficult to speculate how the system will operate when confronted with many inputs. It would be expected, however, that a dual-associative layer model will be insufficient in terms of discriminability and stability. One solution to this problem would be to add more associative layers. Theoretically, each additional layer may operate to reduce or dampen the amount of activation in the system. Thus, in the same way that the model above reduces two wavefronts in the first associative layer to a single wavefront in the second associative layer, it is conceivable that four wavefronts in a first associative layer may reduce to two in a second, and one in a third. Again, however, the precise manner in which even a system with multiple associative layers will operate will be difficult to predict given the non-linear dynamical nature of the system.
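
The layer-by-layer reduction conjectured above amounts to simple arithmetic, sketched below. The sketch assumes that each associative layer merges wavefronts in pairs and that an unpaired wavefront is carried over unchanged; this pairing rule and the function name wavefronts_per_layer are illustrative assumptions, not properties established for the model, and the sketch says nothing about the stability of the resulting dynamics.

```python
def wavefronts_per_layer(n_inputs):
    """Wavefront count in each associative layer under pairwise merging.

    Assumption: every layer halves the number of wavefronts, and an odd
    wavefront left without a partner passes to the next layer unchanged.
    """
    if n_inputs < 1:
        raise ValueError("n_inputs must be at least 1")
    counts = [n_inputs]
    while counts[-1] > 1:
        counts.append((counts[-1] + 1) // 2)  # ceiling of half
    return counts


if __name__ == "__main__":
    for n in (2, 4, 8):
        layers = wavefronts_per_layer(n)
        print(n, "inputs:", layers, "->", len(layers), "associative layers")
```

Under this assumption two inputs need two associative layers and four inputs need three, matching the examples in the text; more generally the number of layers grows only logarithmically with the number of simultaneous inputs.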

It is also unclear whether SWA networks could be made to exhibit the desirable properties of generalisation that are present in conventional networks. The models described in this paper will display a form of generalisation based upon 'perceptual similarity' of stimuli. Inputs that are spatio-temporally similar will create very similar activation wavefronts. Thus, motor units will have most difficulty discriminating between inputs that are near to one another spatially or temporally. In a sense, this inability to discriminate is generalisation -- responses given to one input will tend to be given to similar inputs. But it is unlikely that this simple form of generalisation will be a sufficient replacement for conventional models of pattern recognition and the like. Thus, it is proposed that in more advanced models, the associative layer does not interface directly with input units. Instead, a more conventional network (e.g. an unsupervised pattern recogniser) could be employed at the lowest levels of the system, processing input before passing a modified representation of the input layer to the associative layer. Thus properties of generalisation and default assignment may be provided by the perceptual sub-network, while the associative capabilities of an SWA network increase the whole system's associative power.
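
The claim that spatially similar inputs create very similar wavefronts can be made concrete with a small geometric sketch. It assumes an idealised flat layer in which a wavefront travels at constant speed, so that it reaches a unit after a time proportional to the unit's distance from the input; the grid size, the input positions and the function names are hypothetical. Because two distances to the same unit can differ by no more than the separation of the inputs (the triangle inequality), nearby inputs necessarily produce nearly identical arrival-time patterns.

```python
import math


def arrival_times(source, units, speed=1.0):
    """Time at which a circular wavefront from `source` reaches each unit."""
    sx, sy = source
    return [math.hypot(ux - sx, uy - sy) / speed for ux, uy in units]


def pattern_difference(a, b):
    """Mean absolute difference between two arrival-time patterns."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# A 10 x 10 sheet of units (assumed layout, for illustration only).
units = [(x, y) for x in range(10) for y in range(10)]

base = arrival_times((2.0, 2.0), units)
near = arrival_times((2.5, 2.0), units)  # input 0.5 units away from base
far = arrival_times((8.0, 7.0), units)   # input far from base

if __name__ == "__main__":
    print("near input:", pattern_difference(base, near))
    print("far input: ", pattern_difference(base, far))
```

In this idealised setting the difference for the nearby input can never exceed the 0.5-unit separation of the two inputs, whereas the distant input produces a markedly different pattern. Motor units reading such patterns would therefore tend to respond alike to neighbouring inputs, which is the simple form of generalisation described above.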

Finally, it is interesting to note that the fundamental principles underlying the SWA model were actually described (though in rather general terms) by Karl Lashley as far back as 1951, in his influential paper "The Problem of Serial Order in Behavior". To quote directly from the paper:

I believe, that there exist in the nervous organization, elaborate systems of interrelated neurons capable of imposing certain types of integration upon a large number of widely spaced effector elements; in the one case transmitting temporally spaced waves of facilitative excitation to all effector elements; in the other imparting a directional polarization to both receptor and effector elements...
I can best illustrate this conception of nervous action by picturing the brain as the surface of a lake. The prevailing breeze carries small ripples in its direction, the basic polarity of the system. Varying gusts set up crossing systems of waves, which do not destroy the first ripples, but modify their form, a second level in the system of space coordinates. A tossing log with its own period of submersion sends out periodic bursts of ripples, a temporal rhythm. The bow wave of a speeding boat momentarily sweeps over the surface, seems to obliterate the smaller waves yet leaves them unchanged by its passing, the transient effect of a strong stimulus... (Lashley, 1951, p. 133).



Conclusion

This paper has endeavoured to introduce the Spreading Waves of Activation model, and to discuss some of its potential advantages over conventional networks, with the intention of encouraging commentary and development. Specifically, it was argued that if conventional networks are to be applicable on large scales and with realistic structural elements, they will experience difficulties in encoding temporal phenomena and configural representations -- two of the most basic learning requirements of an intelligent system. SWA principles involve waves of activation that radiate throughout entire layers of units in a system, thus enabling widespread and arbitrary outputs to be associated with any inputs, including inputs distributed in time. The model at present stands only as a theoretical proposition, and awaits implementation in an actual network. Though the model is as yet untested, it may hold the potential to alter the way cognitive modellers think about neural networks, and may even have broader implications for cognitive science in general.






References



Atrens, D. and Curthoys, I. (1982) The Neurosciences and Behaviour: An introduction, 2nd edn. Sydney: Academic Press.

Bairaktaris, D. (1994) The problem of temporal order in connectionist networks and its implications for short-term memory modelling. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.

Cairns, D.E., Baddeley, R.J. and Smith, L.E. (1994) Phase constraints on synchronising oscillator networks. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.

Chater, N. and Conkey, P. (1994) Sequence processing with recurrent neural networks. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.

Cleeremans, A. (1993) Mechanisms of Implicit Learning: Connectionist models of sequence processing. Cambridge, MA: MIT Press.

Cleeremans, A., Servan-Schreiber, D., and McClelland, J.L. (1989) Finite state automata and simple recurrent networks. Neural Computation, 1(3), 372-381.

Elman, J. (1990) Finding structure in time. Cognitive Science, 14, 179-211.

Feldman, J.A., and Ballard, D.H. (1982) Connectionist models and their properties. Cognitive Science, 6(3), 205-254.

Fodor, J.A. and Pylyshyn, Z.W. (1988) Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.

Gershon, M.D., Schwartz, J.H., and Kandel, E.R. (1985) Morphology of chemical synapses and patterns of interconnection. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.

Kamin, L.J. (1969) Predictability, surprise, attention, and conditioning. In Campbell and Church (Eds), Punishment and Aversive Behavior, NY: Appleton-Century-Crofts.

Kandel, E.R. (1985) Factors controlling transmitter release. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.

Kandel, E.R. and Siegelbaum, S. (1985) Principles underlying electrical and chemical synaptic transmission. In E.R. Kandel and J.H. Schwartz (Eds), Principles of Neural Science, 2nd edn. NY: Elsevier Science Publishing.

Kentridge, R.W. (1994) Critical dynamics of neural networks with spatially localised connections. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.

Lashley, K.S. (1951) The problem of serial order in behavior. In L.A. Jeffress (Ed.), Cerebral mechanisms in behavior: The Hixon Symposium. New York: Wiley.

McClelland, J.L., Rumelhart, D.E., and Hinton, G.E. (1986) The appeal of parallel distributed processing. In Rumelhart, McClelland and the PDP Research Group (Eds), Parallel Distributed Processing volume 1: Foundations, Cambridge, MA: MIT Press.

Marczell, Zs., Kalmar, Zs., and Lorincz, A. (1995) Generalized skeleton formation for texture segmentation. Neural Network World.

Oaksford, M., and Brown, G.D.A. (Eds) (1994) Neurodynamics and Psychology, London: Academic Press.

Olah, M. and Lorincz, A. (1995) Analog VLSI implementation of grassfire transformation for generalised skeleton formation. Proceedings of ICANN'95.

Pearce, J.M. (1994) Discrimination and categorization. In N.J. Mackintosh (Ed), Animal Learning and Cognition, London: Academic Press.

Reeke, G.N. (1992) Neural nets and neuronal nets: How much like the nervous system should a model be? In Fidia Research Foundation (Eds), Neuropsychology: The Neuronal Basis of Cognitive Function, volume 2, NY: Thieme Medical Publishers, Inc.

Rumelhart, D.E., Hinton, G.E., and McClelland, J.L. (1986) A general framework for parallel distributed processing. In Rumelhart, McClelland and the PDP Research Group (Eds), Parallel Distributed Processing volume 1: Foundations, Cambridge, MA: MIT Press.

Stevens, C.F. (1989) How cortical interconnectedness varies with network size. Neural Computation, 1, 473-479.

Stratford, K.J. (1992) Nerve cell modeling: The benefits of compartmentalized thinking. In Fidia Research Foundation (Eds), Neuropsychology: The Neuronal Basis of Cognitive Function, volume 2, NY: Thieme Medical Publishers, Inc.

Waibel, A. (1992) Modular construction of time-delay neural networks for speech recognition. Neural Computation, 1, 39-46.

Wearden, J. (1994) Prescriptions for models of biopsychological time. In M. Oaksford and G.D.A. Brown (Eds), Neurodynamics and Psychology, London: Academic Press.



Submitted November 29 1995.
Published January 5 1996.