[Table of Contents]

Theory of Distributed Associative Memory: A Tutorial

Peter Butterworth
Simon Dennis

Contents

Introduction

The Theory of Distributed Associative Memory (TODAM) was introduced by Murdoch (1982) and shares many mathematical assumptions with with Eich's (1982) Composite Holographic Associative Recall Model (CHARM). In this exposition we focus on TODAM, but many of the processes are identical in CHARM.

General features of the model:

  1. Items and events are represented as distributed vectors of abstract attributes;
  2. Item and associative information are differentiated e.g., having studied the pair AB, memory contains information about the occurrence of the items A and B as well as associative information of the joint occurrence of AB;
  3. All item and associative vectors are added to a composite memory vector.
  4. Cued recall and recognition of both individual items and associative pairs are modelled.
  5. The speed - accuracy trade off within recognition is also modelled.

Representations and Processes

Items are represented as random vectors with N elements. Each element is a random sample from a normal distribution with mean µ = 0, and variance o ² = P/N (P = power of vector which is generally set to 1, so o ² = 1/N). Similarity between items is reflected in similar patterning over vector elements so that the dot product of 2 identical vectors = 1.

Associations are represented as the convolution of item vectors (a method which produces the conjunction of two vectors) e.g., (f * g)

where the range of the subscripts on the item vectors (i) varies from - (N - 1)/2 to (N - 1)/2
and the range of subscripts on the convoluted outcome vector (x) varies from - (2N -1)/2 to (2N -1)/2

The convoluted vector bears no resemblance (i.e., is statistically independent from) the contributing item vectors.

Example of Convolution

Consider 2 item vectors f = (f1, f2, f3) and g = (g1, g2, g3).

The convolution f*g =

(f1 g1, f1 g2 + f2 g1, f1 g3 + f2 g2 + f3 g1, f2 g3 + f3 g2, f3 g3)

New vector has (2N -1) elements

e.g., ((2 x 3) -1) = 5 elements

To enable addition of item and associative vectors into common memory vector M, convoluted associative vectors are truncated to the number of elements in each item vector. (N - 1)/2 elements are removed from either end.

( f1 g2 + f2 g1, f1 g3 + f2 g2 + f3 g1, f2 g3 + f3 g2 )

Cued Recall is based upon the correlation of vectors which is denoted with a #. Correlation is the inverse of convolution.

Recognition (similarity/familiarity) is based upon the dot product of vectors

Properties of Convolution, Correlation and Dot Products

(from Eich (1982) and Murdoch (1982))

  1. Convolution is commutative

    f*g = g*f

    (correlation is not, f#g != g#f )

  2. The convolution of a vector and a delta vector (d - 0's on all but the central element (0, 0, 1, 0, 0)) - is the item itself.

    d * g = g

  3. The convolution of a vector and a zero vector (0 = (0, 0, 0, 0, 0)) is a zero vector.

    0 * g = 0

  4. The correlation of a vector with itself is an approximation of a delta vector.

    f # f ~= d

  5. The correlation of 2 unrelated/independent vectors is an approximation of a zero vector.

    f # g ~= 0

  6. The relationship between convolution and correlation can be described as:

    f # (f*g) = (f # f)*g + (f # g)*f + noise

Example of Cued Recall from an Association

Suppose the subject has studied items f and g together. The association would be stored as f*g.

If we subsequently cue with the item f
= f # (f*g) = (f # f)* g + (f # g)* f + noise (from 6)
~=d * g + 0 * f + noise (from 4 & 5)
~=g + 0 + noise (from 2 & 3)
=g' (an approximation to g)

When g' = g; the dot product g . g' = 1.

Storage

For each studied pair - both item (f, g) and associative (f*g) vectors are added to the composite memory vector M. The relative contributions of item and associated information are weighted. Earlier studied items are weighted for forgetting (serial position).

where
a = forgetting parameter
y1 and y2 = item weighting parameters (for f and g)
y3 = associative weighting parameter

and all parameters vary between 0 and 1.

e.g, Composition of M after presentation of the jth pair


   j       Pair                                Mj                            

                                                                             
   1        A-B     A + B + A*B                                              
                                                                             

   2        C-D     C + D + C*D + (A + B + A*B)                              
                                                                             

   3        E-F     E + F + E*F + (C + D + C*D + (A + B + A*B))              
                                                                             



Example of Storage

Consider composite memory vector M before and after study of a word pair (fg). Suppose that f = (1, 0, -1, 1, 0), g = (1, 1, 0, 0, -1), a = .75, y1 and y2 = .5, y3 = .75, and N = 5. Also suppose that before studying the new pair M = (8, 4, -4, 0, 0).

The convolution f * g = (1, 1, -1, 0, 0, 0, 1, -1, 0) which when truncation becomes = (-1, 0, 0, 0, 1).

So Mj = y1 f + y2 g + y3 (f * g) + a Mj-1

= .5 (1, 0, -1, 1, 0) + .5 (1, 1, 0, 0, -1) + .75 (-1, 0, 0, 0, 1) + .75 (8, 4, -4, 0, 0)

= (.5, 0, -.5, .5, 0) + (.5, .5, 0, 0, -.5) + (-.75, 0, 0, 0, .75) + (6, 3, -3, 0, 0)

= (6.25, 3.5, -3.5, .5, .25)

The resulting memory vector still has five components.

Noise is due to:

  1. Forgetting parameter (a);
  2. Increasing number of items and associations in the composite memory vector M;
  3. The truncation of associative vectors which results in a loss of information.

Recognition

Item Recognition

In TODAM, item recognition involves finding the dot product between the cue fkand the composite memory vector M. Scalar outcome is then compared to some criteria (depending upon whether forced-choice or Yes-No recognition task).

Expected Value of Item Recognition

E[fk . M] = y1 Ck

E[gk . M] = y2 Ck

where Ck is a serial position constant for position k, given study of p pairs.

      p-k
Ck = a      
Strength of match is dependent upon:
  1. how strong the item studied
  2. how much has been forgotten

Expected Value of Recognition of New (unstudied) Items

E[h . M] = 0

Associative (Pair) Recognition

Associative recognition in TODAM is the same as item recongition except that it is the convolution of the two test cues fkand gk that is compared against the composite memory vector.

Expected Value of Intact Pair

E[(fk*gk) . M] = y3 Ck

Expected Value of Incorrect (Rearranged) Pair

E[(fk*gl) . M] = 0

Recall

Cued Recall

Cued recall in TODAM involves taking the correlation of the Cue vector f and the composite memory vector M (i.e., f # M). If the cue was studied, one of the terms within M is the convolution of f and g (i.e., f*g).

The output (g') is an approximation to the vector g. The mathematics are similar to those outlined in the example of cued recall from an association, except that there will be additional noise from extra items within M, how early the pair was studied within the list (forgetting parameter), and the process of vector truncation.

Once the output vector g' has been computed it is compared against a list of possible vectors to determine which is closest (highest match) (e.g., g . g'). If the matching strength > criterion, then the model considers recall to be successful. Intrusion errors occur when the output vector is more similar to another item vector than to the target item vector (e.g., h . g' > g . g')

Expected Value for accurate Cued Recall Performance

[g (f # M)] [g (f # (f * g))] [g g']

N ²o^4 = N ² P ² / N ² = P ² = 1

Decisions

Decisions in recall and recognition use an approach (similar to) signal detection theory.

Depends in part upon the specific task;

However, when modelling speed (as well as accuracy) Yes/No Recognition uses 2 criterion parameters.

Yes/No Recognition

Input to decision is output from memory comparison process (see above).

Upper (accept) and Lower (reject) criteria are set.

Extraneous random noise is added to the output from the memory system.

Matching strength may initially fall into either the Yes, No or Wait decision regions.

If within the "wait" region - noise continually varies across time - may eventually put "wait" into a Yes or No decision region.

Depends upon the distance between criteria, rate to which the criteria converge, and variance of the random noise.

Allows simulation of speed-accuracy trade-off

  1. distant criteria => more accurate performance, greater role for random noise in decision process. Larger noise components necessary (on average) & therefore there is likely to be greater no. of iterations necessary to achieve required noise component to reach decision region.

    When criteria are set further apart - more accurate - though slower

  2. close criteria => less accurate performance and less outcomes within the "wait" range and therefore less reliance on the addition of random noise to make decisions, also likely fewer iterations as generally smaller noise effect necessary to achieve criteria (i.e., quicker).

    When criteria are set close together - less accurate - though more rapid

Benefits of TODAM

Todam can account for:

  1. Similarity effects (e.g., see Eich; but consider difficulties with unitary memory vector containing item vectors and associative convoluted vectors)
  2. Independence of item and associative information e.g., recognition failure of recallable words (Tulving and Thomson)
  3. Different weightings to item and associative information (dependent upon prior experience with items or nature of study)
  4. Associative symmetry - study AB; can cue with A for B or cue with B for A. Forward and backward associative strengths should be equivalent (all other things being equal - but not necessarily a desirable quality in all situations)
  5. Speed - accuracy trade-off in recognition.