Notes

1. The difference between Connectionist networks in which the state of a single unit encodes properties of the world (i.e., the so-called 'localist' networks) and ones in which the pattern of states of an entire population of units does the encoding (the so-called 'distributed' representation networks) is considered to be important by many people working on Connectionist models.

Although Connectionists debate the relative merits of localist (or 'compact') versus distributed representations (e.g., Feldman, 1986), the distinction will usually be of little consequence for our purposes, for reasons that we give later. For simplicity, when we wish to refer indifferently to either single unit codes or aggregate distributed codes, we shall refer to the 'nodes' in a network. When the distinction is relevant to our discussion, however, we shall explicitly mark the difference by referring either to units or to aggregates of units.

2. One of the attractions of Connectionism for many people is that it does employ some heavy mathematical machinery, as can be seen from a glance at many of the chapters of the two volume collection by Rumelhart, McClelland and the PDP Research Group (1986). But in contrast to many other mathematically sophisticated areas of cognitive science, such as automata theory or parts of Artificial Intelligence (particularly the study of search, or of reasoning and knowledge representation), the mathematics has not been used to map out the limits of what the proposed class of mechanisms can do. Like a great deal of Artificial Intelligence research, the Connectionist approach remains almost entirely experimental; mechanisms that look interesting are proposed and explored by implementing them on computers and subjecting them to empirical trials to see what they will do. As a consequence, although there is a great deal of mathematical work within the tradition, one has very little idea what various Connectionist networks and mechanisms are good for in general.

3. Smolensky seems to think that the idea of postulating a level of representations with a semantics of subconceptual features is unique to network theories. This is an extraordinary view considering the extent to which Classical theorists have been concerned with feature analyses in every area of psychology from phonetics to visual perception to lexicography. In fact the question whether there are 'sub-conceptual' features is neutral with respect to the question whether cognitive architecture is Classical or Connectionist.

4. Sometimes, however, even Representationalists fail to appreciate that it is representation that distinguishes cognitive from noncognitive levels. Thus, for example, although Smolensky (1988) is clearly a Representationalist, his official answer to the question "What distinguishes those dynamical systems that are cognitive from those that are not?" makes the mistake of appealing to complexity rather than intentionality: "A river... fails to be a cognitive dynamical system only because it cannot satisfy a large range of goals under a large range of conditions." But, of course, that depends on how you individuate goals and conditions; the river that wants to get to the sea wants first to get half way to the sea, and then to get half way more, ..., and so on; quite a lot of goals all told. The real point, of course, is that states that represent goals play a role in the etiology of the behaviors of people but not in the etiology of the 'behavior' of rivers.

5. That Classical architectures can be implemented in networks is not disputed by Connectionists; see for example Rumelhart and McClelland (1986a, p. 118): "... one can make an arbitrary computational machine out of linear threshold units, including, for example, a machine that can carry out all the operations necessary for implementing a Turing machine; the one limitation is that real biological systems cannot be Turing machines because they have finite hardware."

Connectionism and Cognitive Architecture: A Critical Analysis

6. There is a different idea, frequently encountered in the Connectionist literature, that this one is easily confused with: viz., that the distinction between regularities and exceptions is merely stochastic (what makes 'went' an irregular past tense is just that the more frequent construction is the one exhibited by 'walked'). It seems obvious that if this claim is correct it can be readily assimilated to Classical architecture.

7. This way of putting it will do for present purposes. But a subtler reading of Connectionist theories might take it to be total machine states that have content, e.g., the state of having such and such a node excited. Postulating connections among labelled nodes would then be equivalent to postulating causal relations among the corresponding content bearing machine states: To say that the excitation of the node labelled 'dog' is caused by the excitation of nodes labelled [d], [o], [g] is to say that the machine's representing its input as consisting of the phonetic sequence [dog] causes it to represent its input as consisting of the word 'dog'. And so forth. Most of the time the distinction between these two ways of talking does not matter for our purposes, so we shall adopt one or the other as convenient.

8. Sometimes the difference between simply postulating representational states and postulating representations with a combinatorial syntax and semantics is marked by distinguishing theories that postulate symbols from theories that postulate symbol systems. The latter theories, but not the former, are committed to a "language of thought". For this usage, see Kosslyn and Hatfield (1984) who take the refusal to postulate symbol systems to be the characteristic respect in which Connectionist architectures differ from Classical architectures. We agree with this diagnosis.

9. Perhaps the notion that relations among physical properties of the brain instantiate (or encode) the combinatorial structure of an expression bears some elaboration.

One way to understand what is involved is to consider the conditions that must hold on a mapping (which we refer to as the 'physical instantiation mapping') from expressions to brain states if the causal relations among brain states are to depend on the combinatorial structure of the encoded expressions. In defining this mapping it is not enough merely to specify a physical encoding for each symbol; in order for the structures of expressions to have causal roles, structural relations must be encoded by physical properties of brain states (or by sets of functionally equivalent physical properties of brain states).

Because, in general, Classical models assume that the expressions that get physically instantiated in brains have a generative syntax, the definition of an appropriate physical instantiation mapping has to be built up in terms of (a) the definition of a primitive mapping from atomic symbols to relatively elementary physical states, and (b) a specification of how the structure of complex expressions maps onto the structure of relatively complex or composite physical states. Such a structure-preserving mapping is typically given recursively, making use of the combinatorial syntax by which complex expressions are built up out of simpler ones. For example, the physical instantiation mapping F for complex expressions would be defined by recursion, given the definition of F for atomic symbols and given the structure of the complex expression, the latter being specified in terms of the 'structure building' rules which constitute the generative syntax for complex expressions. Take, for example, the expression '(A&B)&C'. A suitable definition for a mapping in this case might contain the statement that for any expressions P and Q, F[P&Q] = B(F[P], F[Q]), where the function B specifies the physical relation that holds between physical states F[P] and F[Q]. Here the property B serves to physically encode (or 'instantiate') the relation that holds between the expressions P and Q, on the one hand, and the expression P&Q on the other.

In using this rule for the example above P and Q would have the values 'A&B' and 'C' respectively, so that the mapping rule would have to be applied twice to pick the relevant physical structures. In defining the mapping recursively in this way we ensure that the relation between the expressions 'A' and 'B', and the composite expression 'A&B', is encoded in terms of a physical relation between constituent states that is identical (or functionally equivalent) to the physical relation used to encode the relation between expressions 'A&B' and 'C', and their composite expression '(A&B)&C'. This type of mapping is well known because of its use in Tarski's definition of an interpretation of a language in a model. The idea of a mapping from symbolic expressions to a structure of physical states is discussed in Pylyshyn (1984a, pp. 54-69), where it is referred to as an 'instantiation function' and in Stabler (1985), where it is called a 'realization mapping'.
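The recursive definition just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Atom` and `And`, the string labels used for elementary "physical states", and the tuple returned by `B` are all hypothetical stand-ins, not anything proposed in the text. The only thing taken from the note is the rule F[P&Q] = B(F[P], F[Q]).

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic symbol such as 'A'."""
    name: str


@dataclass(frozen=True)
class And:
    """A complex expression P&Q built from two constituents."""
    left: "Expr"
    right: "Expr"


Expr = Union[Atom, And]


def B(left_state, right_state):
    # Hypothetical stand-in for the physical relation B: here it is just a
    # tagged tuple that keeps the two constituent states recoverable.
    return ("B", left_state, right_state)


def F(expr):
    """Physical instantiation mapping, defined by recursion on syntax."""
    if isinstance(expr, Atom):
        # Primitive mapping from atomic symbols to elementary states
        # (the 'state_' labels are an assumption made for illustration).
        return "state_" + expr.name
    # Rule from the note: F[P&Q] = B(F[P], F[Q])
    return B(F(expr.left), F(expr.right))


example = And(And(Atom("A"), Atom("B")), Atom("C"))  # (A&B)&C
encoded = F(example)
```

For '(A&B)&C' the rule is applied twice, once with P = 'A&B' and Q = 'C' and once with P = 'A' and Q = 'B', so the same relation `B` encodes constituency at both levels of the resulting structure.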

10. This illustration has not any particular Connectionist model in mind, though the caricature presented is, in fact, a simplified version of the Ballard (1987) Connectionist theorem proving system (which actually uses a more restricted proof procedure based on the unification of Horn clauses). To simplify the exposition, we assume a 'localist' approach, in which each semantically interpreted node corresponds to a single Connectionist unit; but nothing relevant to this discussion is changed if these nodes actually consist of patterns over a cluster of units.

11. This makes the "compositionality" of data structures a defining property of Classical architecture. But, of course, it leaves open the question of the degree to which natural languages (like English) are also compositional.

12. Labels aren't part of the causal structure of a Connectionist machine, but they may play an essential role in its causal history insofar as designers wire their machines to respect the semantical relations that the labels express.

For example, in Ballard's (1987) Connectionist model of theorem proving, there is a mechanical procedure for wiring a network which will carry out proofs by unification. This procedure is a function from a set of node labels to a wired-up machine. There is thus an interesting and revealing respect in which node labels are relevant to the operations that get performed when the function is executed. But, of course, the machine on which the labels have the effect is not the machine whose states they are labels of; and the effect of the labels occurs at the time that the theorem-proving machine is constructed, not at the time its reasoning process is carried out. This sort of case of labels 'having effects' is thus quite different from the way that symbol tokens (e.g., tokened data structures) can affect the causal processes of a Classical machine.

13. Any relation specified as holding among representational states is, by definition, within the 'cognitive level'. It goes without saying that relations that are 'within-level' by this criterion can count as 'between-level' when we use criteria of finer grain. There is, for example, nothing to prevent hierarchies of levels of representational states.

14. Smolensky (1988, p. 14) remarks that "unlike symbolic tokens, these vectors lie in a topological space, in which some are close together and others are far apart." However, this seems to radically conflate claims about the Connectionist model and claims about its implementation (a conflation that is not unusual in the Connectionist literature). If the space at issue is physical, then Smolensky is committed to extremely strong claims about adjacency relations in the brain; claims which there is, in fact, no reason at all to believe. But if, as seems more plausible, the space at issue is semantical then what Smolensky says isn't true. Practically any cognitive theory will imply distance measures between mental representations. In Classical theories, for example, the distance between two representations is plausibly related to the number of computational steps it takes to derive one representation from the other. In Connectionist theories, it is plausibly related to the number of intervening nodes (or to the degree of overlap between vectors, depending on the version of Connectionism one has in mind). The interesting claim is not that an architecture offers a distance measure but that it offers the right distance measure, one that is empirically certifiable.

15. The primary use that Connectionists make of microfeatures is in their accounts of generalization and abstraction (see, for example, Hinton, McClelland, & Rumelhart, 1986). Roughly, you get generalization by using overlap of microfeatures to define a similarity space, and you get abstraction by making the vectors that correspond to types be subvectors of the ones that correspond to their tokens. Similar proposals have quite a long history in traditional Empiricist analysis, and have been roundly criticized over the centuries. (For a discussion of abstractionism see Geach, 1957; that similarity is a primitive relation, and hence not reducible to partial identity of feature sets, was, of course, a main tenet of Gestalt psychology, as well as more recent approaches based on "prototypes".) The treatment of microfeatures in the Connectionist literature would appear to be very close to early proposals by Katz and Fodor (1963) and Katz and Postal (1964), where both the idea of a feature analysis of concepts and the idea that relations of semantical containment among concepts should be identified with set-theoretic relations among feature arrays are explicitly endorsed.
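The two ideas mentioned in this note, similarity as overlap of microfeatures and abstraction as a subvector relation, can be made concrete with a rough sketch. Everything specific here is an assumption made for illustration: the feature names are invented, feature vectors are simplified to sets of active features, and overlap is measured by the Jaccard ratio, which is only one of many possible choices and is not attributed to any of the authors cited.

```python
def overlap(a: frozenset, b: frozenset) -> float:
    """Similarity as shared features over all features (Jaccard ratio)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def is_abstraction_of(type_features: frozenset, token_features: frozenset) -> bool:
    """A 'type' counts as an abstraction of a 'token' if its features are a subset."""
    return type_features <= token_features


# Hypothetical microfeature sets, invented for this example.
DOG = frozenset({"animate", "four-legged", "barks"})
CAT = frozenset({"animate", "four-legged", "meows"})
CUP = frozenset({"has-a-handle", "container"})
ANIMAL = frozenset({"animate"})

dog_cat = overlap(DOG, CAT)   # two shared features out of four in total
dog_cup = overlap(DOG, CUP)   # no shared features
```

On this sketch similarity reduces entirely to partial identity of feature sets, which is exactly the assumption the parenthetical remark in the note says has been contested.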

16. Another disadvantage is that, strictly speaking, it doesn't work; although it allows us to distinguish the belief that John loves Mary and Bill hates Sally from the belief that John loves Sally and Bill hates Mary, we don't yet have a way to distinguish believing that (John loves Mary because Bill hates Sally) from believing that (Bill hates Sally because John loves Mary). Presumably nobody would want to have microfeatures corresponding to these.

17. It's especially important at this point not to make the mistake of confusing diagrams of Connectionist networks with constituent structure diagrams (see section 2.1.2). Connecting SUBJECT-OF with FIDO and BITES does not mean that when all three are active FIDO is the subject of BITES. A network diagram is not a specification of the internal structure of a complex mental representation. Rather, it's a specification of a pattern of causal dependencies among the states of activation of nodes. Connectivity in a network determines which sets of simultaneously active nodes are possible; but it has no semantical significance.

The difference between the paths between nodes that network diagrams exhibit and the paths between nodes that constituent structure diagrams exhibit is precisely that the latter but not the former specify parameters of mental representations. (In particular, they specify part/whole relations among the constituents of complex symbols). Whereas network theories define semantic interpretations over sets of (causally interconnected) representations of concepts, theories that acknowledge complex symbols define semantic interpretations over sets of representations of concepts together with specifications of the constituency relations that hold among these representations.

18. And it doesn't work uniformly for English conjunction. Compare: John and Mary are friends → *John are friends; or The flag is red, white and blue → The flag is blue. Such cases show either that English is not the language of thought, or that, if it is, the relation between syntax and semantics is a good deal subtler for the language of thought than it is for the standard logical languages.

19. It needn't, however, be strict truth-preservation that makes the syntactic approach relevant to cognition. Other semantic properties might be preserved under syntactic transformation in the course of mental processing (e.g., warrant, plausibility, heuristic value, or simply semantic non-arbitrariness). The point of Classical modeling isn't to characterize human thought as supremely logical; rather, it's to show how a family of types of semantically coherent (or knowledge-dependent) reasoning are mechanically possible. Valid inference is the paradigm only in that it is the best understood member of this family; the one for which syntactical analogues for semantical relations have been most systematically elaborated.

20. It is not uncommon for Connectionists to make disparaging remarks about the relevance of logic to psychology, even though they accept the idea that inference is involved in reasoning. Sometimes the suggestion seems to be that it's all right if Connectionism can't reconstruct the theory of inference that formal deductive logic provides since it has something even better on offer. For example, in their report to the U.S. National Science Foundation, McClelland, Feldman, Adelson, Bower & McDermott (1986) state that "... Connectionist models realize an evidential logic in contrast to the symbolic logic of conventional computing" (p. 6; our emphasis) and that "evidential logics are becoming increasingly important in cognitive science and have a natural map to Connectionist modeling" (p. 7). It is, however, hard to understand the implied contrast since, on the one hand, evidential logic must surely be a fairly conservative extension of "the symbolic logic of conventional computing" (i.e., most of the theorems of the latter have to come out true in the former) and, on the other, there is not the slightest reason to doubt that an evidential logic would 'run' on a Classical machine. Prima facie, the problem about evidential logic isn't that we've got one that we don't know how to implement; it's that we haven't got one.

21. Compare the "little s's" and "little r's" of neo-Hullean "mediational" Associationists like Charles Osgood.

22. This way of putting the productivity argument is most closely identified with Chomsky (e.g., Chomsky, 1965; 1968). However, one does not have to rest the argument upon a basic assumption of infinite generative capacity. Infinite generative capacity can be viewed, instead, as a consequence or a corollary of theories formulated so as to capture the greatest number of generalizations with the fewest independent principles. This more neutral approach is, in fact, very much in the spirit of what we shall propose below. We are putting it in the present form for expository and historical reasons.

23. McClelland and Kawamoto (1986) discuss this sort of recursion briefly. Their suggestion seems to be that parsing such sentences doesn't really require recovering their recursive structure: "... the job of the parser [with respect to right-recursive sentences] is to spit out phrases in a way that captures their local context. Such a representation may prove sufficient to allow us to reconstruct the correct bindings of noun phrases to verbs and prepositional phrases to nearby nouns and verbs" (p. 324; emphasis ours). It is, however, by no means the case that all of the semantically relevant grammatical relations in readily intelligible embedded sentences are local in surface structure. Consider: 'Where did the man who owns the cat that chased the rat that frightened the girl say that he was going to move to (X)?' or 'What did the girl that the children loved to listen to promise your friends that she would read (X) to them?' Notice that, in such examples, a binding element (italicized) can be arbitrarily displaced from the position whose interpretation it controls (marked X) without making the sentence particularly difficult to understand. Notice too that the 'semantics' doesn't determine the binding relations in either example.

24. See Pinker (1984, Chapter 4) for evidence that children never go through a stage in which they distinguish between the internal structures of NPs depending on whether they are in subject or object position; i.e., the dialects that children speak are always systematic with respect to the syntactic structures that can appear in these positions.

25. It may be worth emphasizing that the structural complexity of a mental representation is not the same thing as, and does not follow from, the structural complexity of its propositional content (i.e., of what we're calling "the thought that one has"). Thus, Connectionists and Classicists can agree to agree that the thought that P&Q is complex (and has the thought that P among its parts) while agreeing to disagree about whether mental representations have internal syntactic structure.

26. These considerations throw further light on a proposal we discussed in Section 2. Suppose that the mental representation corresponding to the thought that John loves the girl is the feature vector {+John-subject; +loves; +the-girl-object} where 'John-subject' and 'the-girl-object' are atomic features; as such, they bear no more structural relation to 'John-object' and 'the-girl-subject' than they do to one another or to, say, 'has-a-handle'. Since this theory recognizes no structural relation between 'John-subject' and 'John-object', it offers no reason why a representational system that provides the means to express one of these concepts should also provide the means to express the other. This treatment of role relations thus makes a mystery of the (presumed) fact that anybody who can entertain the thought that John loves the girl can also entertain the thought that the girl loves John (and, mutatis mutandis, that any natural language that can express the proposition that John loves the girl can also express the proposition that the girl loves John). This consequence of the proposal that role relations be handled by "role specific descriptors that represent the conjunction of an identity and a role" (Hinton, 1987) offers a particularly clear example of how failure to postulate internal structure in representations leads to failure to capture the systematicity of representational systems.
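The contrast drawn in this note can be sketched as a toy comparison between two data structures. The code is an illustration only, under stated assumptions: the `Thought` record, the `swap_roles` helper, and the use of Python sets for feature vectors are hypothetical devices, not a reconstruction of Hinton's proposal or of any Classical model. The atomic feature labels are the ones used in the note.

```python
from typing import NamedTuple

# Role-specific descriptors treated as atomic features: the labels are
# opaque, so 'John-subject' and 'John-object' share nothing as far as the
# representational system is concerned.
john_loves_girl_atomic = frozenset({"John-subject", "loves", "the-girl-object"})
girl_loves_john_atomic = frozenset({"the-girl-subject", "loves", "John-object"})

shared_atomic = john_loves_girl_atomic & girl_loves_john_atomic


class Thought(NamedTuple):
    """A structured representation with explicit constituent roles."""
    subject: str
    verb: str
    object: str


def swap_roles(thought: Thought) -> Thought:
    # Defined over structure alone: it works for any fillers whatever.
    return Thought(subject=thought.object, verb=thought.verb, object=thought.subject)


john_loves_girl = Thought("John", "loves", "the girl")
girl_loves_john = swap_roles(john_loves_girl)
```

In the atomic encoding the two thoughts overlap only in 'loves', so nothing in the scheme connects the ability to represent one with the ability to represent the other. In the structured encoding a single operation over constituent roles yields one from the other for any choice of fillers.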

27. We are indebted to Steve Pinker for this point.

28. The hedge is meant to exclude cases where inferences of the same logical type nevertheless differ in complexity in virtue of, for example, the length of their premises. The inference from (AvBvCvDvE) and (-B&-C&-D&-E) to A is of the same logical type as the inference from AvB and -B to A. But it wouldn't be very surprising, or very interesting, if there were minds that could handle the second inference but not the first.
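A small sketch can make the point about "same logical type, different length" concrete. The function below is a hypothetical illustration, not part of the original argument: it checks by brute-force truth table that the inference from (X1 v ... v Xn) and (-X2 & ... & -Xn) to X1 is valid, for any number n of disjuncts. The two inferences in the note are the cases n = 5 and n = 2.

```python
from itertools import product


def disjunctive_syllogism_is_valid(n: int) -> bool:
    """Truth-table check that (X1 v ... v Xn), (-X2 & ... & -Xn) entail X1."""
    for values in product([False, True], repeat=n):
        disjunction_holds = any(values)          # X1 v ... v Xn
        negations_hold = not any(values[1:])     # -X2 & ... & -Xn
        conclusion_holds = values[0]             # X1
        if disjunction_holds and negations_hold and not conclusion_holds:
            return False  # a row with true premises and a false conclusion
    return True


short_case = disjunctive_syllogism_is_valid(2)  # AvB, -B, therefore A
long_case = disjunctive_syllogism_is_valid(5)   # AvBvCvDvE, -B&-C&-D&-E, therefore A
```

The same schema covers both cases; only the number of rows to inspect grows (4 rows versus 32), which is the kind of complexity difference the hedge sets aside.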

29. Historical footnote: Connectionists are Associationists, but not every Associationist holds that mental representations must be unstructured. Hume didn't, for example. Hume thought that mental representations are rather like pictures, and pictures typically have a compositional semantics: the parts of a picture of a horse are generally pictures of horse parts.

On the other hand, allowing a compositional semantics for mental representations doesn't do an Associationist much good so long as he is true to the spirit of his Associationism. The virtue of having mental representations with structure is that it allows for structure sensitive operations to be defined over them; specifically, it allows for the sort of operations that eventuate in productivity and systematicity. Association is not, however, such an operation; all it can do is build an internal model of redundancies in experience by altering the probabilities of transitions among mental states. So far as the problems of productivity and systematicity are concerned, an Associationist who acknowledges structured representations is in the position of having the can but not the opener.

Hume, in fact, cheated: he allowed himself not just Association but also "Imagination", which he takes to be an 'active' faculty that can produce new concepts out of old parts by a process of analysis and recombination. (The idea of a unicorn is pieced together out of the idea of a horse and the idea of a horn, for example.) Qua Associationist, Hume had, of course, no right to active mental faculties. But allowing imagination in gave Hume precisely what modern Connectionists don't have: an answer to the question how mental processes can be productive. The moral is that if you've got structured representations, the temptation to postulate structure sensitive operations and an executive to apply them is practically irresistible.


Source: Beakley Brian, Ludlow Peter (eds.). The Philosophy of Mind: Classical Problems/Contemporary Issues, 2nd edition. Bradford Book Publication, 2006. 1080 p.
