36 Comments
Ash B's avatar

This is so in line with my thoughts. Anything we term "logical" is actually based on our experience, and even the new things (new scenarios we come up with) are based on System 1/pattern recognition applied at a meta level. I think we are able to entertain and think through contradictions only in those cases where we can apply System 1 at a meta level. This is interesting to me because if you think of LLMs, they are pattern recognizers, so applying them at the right meta level could unlock a higher level of performance. I have one question though: expertise often involves figuring out what information to keep and what to ignore, in addition to chunking. How does that relate to the System 1 setup above? Are there any ways that chunking can be sped up?

Jared Peterson's avatar

Not sure I entirely follow, but let me take a shot at it.

To take lots of information and condense it into a chunk is a type of pattern matching. In fact, you could argue a "chunk" is just a pattern. For example, as someone who is bad at chess, it takes me quite a while to understand the board and how the pieces relate to each other (e.g., which pieces are at risk, and which ones are putting pressure on which other pieces). But a chess master is able to understand the board pretty quickly in terms of chunks/patterns and will often be able to re-create the board from memory very quickly.

What is interesting is the way a chess master fails when re-creating the board from memory. Sometimes they get the exact positions on the board wrong, but the functional relationship between the pieces stays the same because that is how they are chunking the information. And if you place the pieces randomly on the board, their ability to re-create the board from memory completely disappears - there needs to be a recognizable pattern in order for information to be chunked.
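To make the compression concrete, here is a minimal Python sketch (the chunk definitions and positions are invented for illustration, not real chess theory): a position that fits a known pattern collapses into a few chunks, while a random position costs one memory unit per piece.

```python
# Toy model of chunked recall. The chunk definitions and positions are
# invented for illustration, not real chess theory.

# A "chunk" is a named configuration of pieces an expert recognizes.
KNOWN_CHUNKS = {
    "castled king": frozenset({("K", "g1"), ("R", "f1"), ("P", "g2"), ("P", "h2")}),
    "fianchetto":   frozenset({("B", "g2"), ("P", "f2"), ("P", "h2")}),
}

def memory_units(position):
    """Working-memory cost of a position: one unit per recognized chunk,
    one unit per leftover piece."""
    remaining = set(position)
    units = 0
    for chunk in KNOWN_CHUNKS.values():
        if chunk <= remaining:          # whole pattern present: one chunk
            remaining -= chunk
            units += 1
    return units + len(remaining)       # leftovers stored piece by piece

structured = {("K", "g1"), ("R", "f1"), ("P", "g2"), ("P", "h2"), ("Q", "d1")}
scrambled  = {("K", "a5"), ("R", "h3"), ("P", "b6"), ("P", "c2"), ("Q", "g7")}

print(memory_units(structured))  # 2: one chunk + the queen
print(memory_units(scrambled))   # 5: no pattern, no compression
```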

I have long wanted to do an experiment on this to understand the minimum necessary conditions for developing pattern recognition and efficient chunking. But alas, I'm outside of academia, and that experiment is a little too involved for my current life situation.

Ash B's avatar

I think the crux of my question was around the chunking aspect only. I wonder if we chunk unconsciously, and whether there are certain conditions required for it to occur. One more point was on knowing what to look for (experts are able to look at the relevant details of the problem), a skill which is again developed over time; I wonder if that comes directly from experience, or whether it too has conditions that need to be satisfied for it to develop. One thing I do believe is that the way we pay attention matters a lot here, and that chunking might require us to hold the concepts in working memory together for a sustained amount of time for pattern recognition to develop.

Jared Peterson's avatar

I think the short answer is that I don't know. As I mentioned, I would love to do an experiment on this. The idea I have is to figure out the minimum necessary conditions for pattern recognition and expertise. Could you train someone to recognize good chess moves by showing them thousands of moves (combined with a positive or negative stimulus)? My colleague is extremely skeptical of this, and his thought is that you need to have a good understanding of the functional relationships between pieces in order to recognize patterns and form chunks. I'm 90% convinced by that argument, and I do think someone needs to understand WHY a chess move is good/bad in order to recognize and/or execute on a good move, as meaningful chunks are defined in terms of these functional relationships. But I'm not super confident about this, and I would love to do an actual experiment.
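For what it's worth, the feedback-only half of that experiment is easy to caricature in code. A hypothetical sketch (the "board features" and the hidden rule are invented): a simple perceptron shown thousands of labeled examples picks up the underlying pattern without ever being told why a move is good.

```python
# Toy version of the proposed experiment: learn "good vs. bad" purely
# from right/wrong feedback, with no explanation of WHY. Features and
# the hidden rule are invented stand-ins for board properties.
import random

random.seed(0)

# Hidden rule the learner is never told: a move is "good" if
# 2*center_control + mobility - exposure > 0.
def label(move):
    c, m, e = move
    return 1 if 2 * c + m - e > 0 else -1

weights = [0.0, 0.0, 0.0]
for _ in range(5000):                       # thousands of trials...
    move = [random.uniform(-1, 1) for _ in range(3)]
    guess = 1 if sum(w * x for w, x in zip(weights, move)) > 0 else -1
    feedback = label(move)                  # ...with right/wrong feedback
    if guess != feedback:                   # adjust only when corrected
        weights = [w + feedback * x for w, x in zip(weights, move)]

print([round(w, 2) for w in weights])       # roughly proportional to (2, 1, -1)
```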

Ash B's avatar

I feel like that could be the case, that you could at least figure out whether something is a good move or not. The reason I think that is Japanese chick sexing. In Japan there are experts who can determine whether a chick is male or female just by looking (they do hold it, and there are some steps involved, but it's incredible how they are able to do it). The thing is, they aren't trained with reasoning; they are trained by sorting chicks and getting feedback on whether they were correct or not (this training apparently does take a lot of time). Once they become experts they can determine whether a chick is male or female, but if you ask them how they figure it out, they have no conscious reason (nor do they get one during their training). Essentially they just learn to pattern match over time. I think the chess bit could work similarly. On the point of learning the relationships between pieces and developing a functional understanding, I do think that helps a lot, since it allows one to simulate various moves, which works like a simulation with feedback and in turn allows one to better train on good chess moves. (I don't think anyone has spent time training themselves purely on good vs. bad chess moves, so the jury is still out, but I feel the example I provided gives a solid reason why it would work.) I know you haven't run the experiment, but based on your study so far, do you have any hypothesis about how chunking occurs?

David Gibson's avatar

An interesting argument that seems simple enough: it's all neurons, and all pattern recognition, so there's no reason to speak of different systems. But I'd like to see you explain some of the findings that Kahneman marshalled, since I seem to recall that it was a big book with more than a few facts. Also, Kahneman's theory is more commensurate with our experience of consciousness, which might include recognizing patterns but also imagining possible futures and reasons and objections to those reasons. Finally, the distinction between pattern matching and structured pattern matching is passed off as a fairly small thing, but it seems pretty big to me, big enough to support (if not mandate) a reified distinction like System 1 vs. System 2.

Jared Peterson's avatar

Is there anything specific you think I haven't explained?

Here are the two most relevant empirical facts I can think of which Kahneman uses to justify his System 1/2 framework, and how Kahneman and I differ in our explanations of those facts.

Finding: On the Cognitive Reflection Test (CRT), people routinely get the answers wrong when trying to solve the problems quickly

Kahneman: We rely on System 1 which is error prone

Me: By his own admission, System 1 is a metaphor and not an actual system in the brain, and therefore this is a bad explanation. Instead, we should explain this in terms of people defaulting to pattern matching to things they have seen before. In situations where something doesn't exactly obey previous patterns, this can lead to errors. (I think we probably actually agree on this point, though I am unsure exactly how Kahneman sees the relationship between learned pattern matches and universal heuristics)

Finding: On the CRT, people do better when they reason slowly about the answers

Kahneman: System 2 is a corrective to System 1

Me: But by his own admission, System 2 doesn't exist, and so this explains nothing. We know people are capable of "imagining possible futures and reasons and objections to those reasons," but Kahneman just asserts that this is different from heuristics/pattern matching, and gives the metaphor of System 1 and 2 as a way to differentiate them. But he doesn't provide an actual explanation of what System 2 is, merely its attributes (slow and deliberative). He also doesn't provide any neurological evidence that System 1 and 2 are different; in fact, all neurological evidence points to the idea that they are the same, and System 1 and 2 are not taken seriously in neuroscience.

So I propose an actual cognitive process that explains what is happening when we are thinking slow: a series of pattern matches. This seems to jibe with actual experience, as solving any of the CRT problems requires breaking them down into a series of easier problems that one CAN pattern match correctly. So what need is there to propose System 2? It provides no additional explanatory value over System 1 pattern matching other than to note that sometimes you can pattern match multiple times in a row. But pattern matching multiple times in a row is not a different "cognitive system," and so such language is misleading. As you say, this difference is still a big deal, and I agree. But it is not a big deal in the "completely different cognitive process/system" sense.
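As a toy illustration of "a series of pattern matches," here is the classic CRT bat-and-ball problem broken into steps, each one a sub-problem familiar enough to pattern match directly (a sketch of the decomposition, not a cognitive model):

```python
# "Thinking slow" as a chain of small pattern matches, using the CRT
# bat-and-ball problem: a bat and a ball cost $1.10 together, and the
# bat costs $1.00 more than the ball. How much is the ball?

total, difference = 1.10, 1.00

# Step 1 (familiar pattern: translate words to equations):
#   bat + ball = total,  bat - ball = difference

# Step 2 (familiar pattern: add the equations, halve the sum):
#   2 * bat = total + difference
bat = (total + difference) / 2

# Step 3 (familiar pattern: simple subtraction):
ball = total - bat

print(f"ball = ${ball:.2f}")  # $0.05, not the fast pattern-matched $0.10
```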

Hopefully that addresses your concerns, but let me know what I missed

Tango's avatar

Granted, I read this book a decade ago. But my memory of Kahneman's position is different from your summary here. He was a psychologist, not a neuroscientist. Yes, System 1 & 2 were metaphors and not actual brain regions, because he was describing observed behavior. Lots of psychology works that way. The book has oodles of examples outlining the pattern-matching he did to note these two seemingly distinct approaches to thinking (the "findings" the previous commenter mentioned).

Why does its being a metaphor automatically make it a bad explanation? It would help me understand your point if you directly addressed some of the book's examples. Because, from my current perspective, it seems like you just collapsed his distinction merely because you can name one quality both have in common and because no corresponding structure has yet been found anatomically. (But, for example, the ego is also a metaphor. Parts work runs on metaphor. Etc. And these are extremely helpful/healing/usable concepts. This has me struggling to follow your logic here.)

So what you’re saying - it’s pattern-matching all the way down - doesn’t seem wrong…but I’m also not yet seeing how it invalidates what Kahneman proposed..?

You say, “So what need is there for S2?” and this feels to me like an orthogonal-to-the-point or maybe missing-the-point question. The proposal of S2 was to have a pointer at types of thinking/answering that are distinct from S1 types. Like, he noticed distinct types of pattern-matching and attempted to name them (with terrible names, which I believe he himself agrees are terrible).

So then, from my perspective, it seems like you’re just kinda going ‘nah-uh’ and re-collapsing his distinction because there’s no anatomical foundation behind it and/or because they aren’t literally systems.

Sorry if any of this sounds rude. I’m only writing this all out because I’m really curious to understand what you mean. I love exploring this topic. And I feel like I must be missing something in your argument…

Jared Peterson's avatar

I think you actually understand this issue better than anyone else, which might be the problem. ha!

Kahneman probably would have responded with this exact argument, and so my disagreement isn't really with him. He proposed a 'useful but wrong model,' and there is nothing wrong with that. Useful but wrong models are a hallmark of science.

The problem is that some researchers (and much of the lay public) have reified this metaphor into something it isn't. I think it probably would have been better if Kahneman had labeled the distinction Tortoise and Hare, because at least then it would have been obvious that he wasn't proposing an actual explanation. But as it stands, the terms act as curiosity killers: people accept them as if they explain something, rather than merely describing a distinction (fast vs. slow) that still needs an explanation in terms of actual cognitive processes.

Hopefully that answers your question. Kahneman was doing science communication, and he found a metaphor useful for that purpose (which is great!). But that metaphor has developed a life of its own in ways that are rather misleading.

Tango's avatar

Oh wow, I’m so glad I asked. Thanks for this reply!

Dr. Michael Netzley's avatar

Superb work. This connected a missing piece in my work. I must thank you for that. I definitely will be rereading this.

Jared Peterson's avatar

Glad you found it useful. Will be curious to hear how that missing piece fits into your thinking

Dr. Michael Netzley's avatar

I was not familiar with Stanovich or the three minds. As I dug in, it seems he is saying we can be highly intelligent but still lack good judgment. Developing that third mind…key to my work with senior executives. Lots of experience, but ultimately judgment is less than stellar. It seems he gives me a path to think about that, along with others like Gluck or Weststrate. All new to me and so far extremely useful. Thank you

Chris Schuck's avatar

You've really made nerdy memes your trademark! That rabbit tower abyss you describe recalls this Being John Malkovich moment: https://www.youtube.com/watch?v=R-aI39FC6XE

A few random thoughts:

You could try the terms "articulate" or "present" rather than "represent," if you wanted to avoid that baggage (those latter terms sound more enactive and/or interactive).

It's interesting that you read Kahneman's Type 1/Type 2 framing as a successful example of pattern matching in itself. My first thought was sort of the opposite, that we have a fundamental tendency to posit binaries everywhere which not only makes an artificial "Type 1/Type 2" opposition appealing, but ironically seems like a classically Type 2 way to over-rationalize everything (though maybe it would be a Type 1 error?)

If I recall, Bessis technically focuses on *System 3* as necessary to mediate dialogue between 1 and 2. But here 2 is an idealized model that we can never actually attain, whereas 1 corresponds to a real process. So rather than a division between two systems, it's like a dialogue between the ideal (which we still need, as a point of reference) and the actual process of thinking, but with a meta-reflective layer on top.

"We use such structured thinking precisely when Directly Pattern Matching to an answer is insufficient. In situations where you have insufficient experience for Direct Pattern Matching, you should add structure."

Maybe in some cases it's insufficient experience that necessitates structure, but could there also be certain cases where the very nature of the problem doesn't lend itself to direct matching but only is approachable through structure? I don't have a specific example in mind, just wondering.

Chunking is a nice way to describe the series of smaller pattern matches, but I always associated chunking with expanding memory rather than solving a new problem. This makes me wonder: is pattern matching the same as remembering? Or is one simply a subset of the other?

Jared Peterson's avatar

Some rapid fire responses

There is an argument to be made about how memes communicate ideas easily because of how readily we pattern match to the form of argument a given meme template conveys...

Not sure I like represent, articulate, or present. I’ve been thinking about this for the last year, and I just hate every word.

Kahneman's framing is useful in terms of memorability and faux-scientific credibility. I wouldn't say it is successful in terms of conveying the actual shape of reality.

I actually think the ideal is to get to the point where you can directly pattern match. You want to get to a point where you look at a problem, and you instantly know the answer because you have understood the domain so well that you don't really need to think about it.

Predicting the stock market is too random for pattern matching to be useful, and so you have to create structure. Not sure if that is quite what you mean though. Maybe there are levels of higher mathematics where human intuition is just literally incapable of holding everything in mind, and so you always have to calculate in a series of steps.

Lisa Feldman Barrett would argue that the brain takes inputs and produces an output, and there isn't a principled distinction between constructs like memory, recognition, and prediction. In my updated SAME Cycle paper, I explain Predictive Processing by saying that ‘to predict is to recognize a situation as belonging to a familiar kind and to construct expectations and action-readiness associated with that kind based on that familiarity.’ The argument being that predictive processing/pattern recognition/constructionism are just three different emphases, but ultimately the same thing.
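A minimal sketch of that sentence in code, with invented "kinds," feature vectors, and expectations: prediction as recognizing the nearest familiar kind and emitting the expectations attached to it.

```python
# Toy model of the quoted definition: recognize the nearest familiar
# kind, then construct the expectations attached to it. Kinds, feature
# vectors, and expectations are all invented.
import math

KINDS = {
    "left turn": {"features": (0.9, 0.1, 0.3), "expect": "oncoming traffic; slow down"},
    "checkout":  {"features": (0.1, 0.8, 0.7), "expect": "queue; get payment ready"},
}

def predict(situation):
    """Recognition as prediction: nearest kind wins, expectations follow."""
    kind = min(KINDS, key=lambda k: math.dist(KINDS[k]["features"], situation))
    return kind, KINDS[kind]["expect"]

print(predict((0.85, 0.2, 0.25)))  # ('left turn', 'oncoming traffic; slow down')
```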

Chris Schuck's avatar

Yes, I meant situations like the stock market or other complex problems where it would be either impossible or incoherent to direct pattern match at once (without making the level of abstraction so high it becomes useless).

Two more suggested alternatives to "represent": describe, depict..

I can see how memory might relate to the same process at the level of the brain, but I still think many of these conceptual distinctions remain very important at the level of understanding and discussion. For instance, episodic autobiographical memory seems different from the pattern matching you are talking about (for one thing, you're matching past to past, not present to past). And memory in general serves much more than solving problems and making decisions.

Jared Peterson's avatar

I agree the distinctions can be useful for understanding and discussion. But are they useful in a constructivist sense, or in a realist sense? Like, if everything the brain does is just prediction, then a memory isn't really different from imagining, as both are types of predictions.

Chris Schuck's avatar

Not sure what I can support or justify, but I'm inclined to say it goes beyond just constructivist (or alternatively, that a constructivist account supports more difference than you're describing here). Memory may be a form of imagining, but past is still importantly different from future (plus there is procedural memory). I don't feel comfortable saying everything boils down to prediction, unless we define prediction much, much more broadly than it's normally used (rather than defining everything else more narrowly to fit under that).

Granted, I can't prove any of this - just where I'm at.

Jared Peterson's avatar

Very fair. I think I'm pretty outside of the mainstream on this particular point, so the onus is really on me

Laura Creighton's avatar

1. Got you a subscription. Will have to read a few more before I decide to pay.

2. The bottomless tower of bunnies is truly horrifying.

3. I'm old enough that 'making change' is a mental skill, not a matter of 'tap on the touch screen and do what it says'. And 'quarters' were a thing. Thus 24 * 17 is roughly 17 quarters, which is 425, minus 17 is 408 (written out below). I can do this mentally faster than I can say it. So in Kahneman's introduction, where it was contrasted with a photo of what I read as an angry young woman shouting, I was very confused. The math problem was fast for me. The 'identify the emotion' one I failed to solve. Shouting in anger is not something I have seen much of. Angry people clench their jaws and hold things in. I did a survey of the people in my house at the time. Nobody else knew what the emotion was either. Since most of the people there were ex-military, it is not too surprising that we all converged on 'Sergeant, giving an order she wants heard on the far side of the parade grounds' as the thing this most looked like. But what she felt like? Puzzlement. I wonder how many people didn't get past the introduction because of a failure like this.

4. If you put a space between your footnote superscripts when you want 2 notes to apply to the same text, it will make things easier for those of us trying to tap the footnote with our fingers on a mobile device.
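(The quarters trick from point 3, written out:)

$$24 \times 17 = (25 - 1) \times 17 = 425 - 17 = 408$$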

Jared Peterson's avatar

1. I appreciate it! Keep in mind I offer no benefits other than the fact that I take recommendations of what to write about very seriously from paid subscribers. So please don't become a paid subscriber unless you REALLY like my stuff

2. It was truly unexpected how terrifying it would be

3. I absolutely love this story, and it is very in line with the work of Lisa Feldman Barrett who finds that emotions do not have unique facial expressions

4. Good point that I hadn't considered. I will make sure to do that in the future

Chris Schuck's avatar

I've been meaning to switch to paid so this is a good reminder.

Wild Pacific's avatar

I read it carefully. The multiple examples and references help shape the idea; I don't mind them at all.

Still, I disagree.

The author is describing Mechanisms, not Systems.

I agree that there is no Mechanism 2. Maybe there are some quantum exceptions that we will find later, but in general we are on the same page: only neurons pattern matching.

But there is absolutely a System 2, and 3+ as well.

The Tortoise parable is aptly described as a function of time. Neurons traverse patterns differently at different times.

The initial pattern differs from the second take, and differs a lot from the morning after. Brain areas and cortical columns fire differently; there are orders-of-magnitude different levels of noise and distraction. Users of various substances can see this happen on an accelerated time scale as well.

Overall, the same set of inputs is memorized and then replayed many times. We are ruminators. Eventually this reprocessing is capable of producing different outcomes.

I hope the author reconsiders his radical stance.

Systems are complex and are not the sum of their components, especially when stretched over time and space.

Democura's avatar

Great article, Jared! Kahneman himself is believed to have said that System 1 and 2 are nothing but concepts, and definitely not a representation of the brain.

I wrote about this issue two years ago, though at certain points I use a different terminology. I think you are correct: there is only System 1. Our experience of slower thought arises from a lack of familiarity (expertise), from the complexity of the problem, and from cases where the delineation of the problem itself is difficult. This has to do with how our brains model the world (see: Feldman/Hawkins/Clark). For more, see the link to the post: https://democura.substack.com/p/how-to-think-faster

PS: I definitely like the memes, keep it up!

Jared Peterson's avatar

It's such a relief to read your stuff and to see you using both Barrett and NDM. A lot of the questions I have answered in the comments section involve me citing her work. I would love to write a piece that more formally combines NDM into Barrett's work. Would you be interested in writing something together?

Democura's avatar

Yes, great idea. Just sent you a DM.

Christopher Xu's avatar

I've been thinking about this too! All reasoning can be broken down into pattern matching, or, as I write about in my substack, it's all similarity search at different levels of abstraction:

"What we learn in school is primarily (multi-level) pattern matching. But I think in practice, reasoning becomes a more general form that is similarity search, which softens some of the constraints and allows for problem solving in unique cases where previous experiences are only vaguely relevant.

...I used to think that all useful reasoning could be represented by pattern matching...Real problems actually involve a lot of fuzzy matching, where you never can find all the information necessary, there is no guarantee that there would be an exact match between the current situation and your past experience, and the choice you eventually make is rarely objectively verifiable. This is when strict pattern matching becomes lenient similarity search. To complete the analogy, a Google search is more lenient than regex, but a human expert is more lenient than both, being able to suggest connections whose relevance is defined by a type of similarity that transcends keywords.

...I think that it’s not just the absolute amount of the data, but also this softening of constraints, the widening from pattern matching to similarity search, is why current large language model reasoning expanded capabilities as much as it did [compared to traditional rule-based systems]"

https://christopherxu.substack.com/i/174418340/abstractions-reduce-detail-to-combat-complexity

I'm not a psychologist, but this is really interesting to me and I'm curious to know what you think about this extra parameter of pattern matching "strictness". How do humans deal with cases that aren't perfect matches and how do we measure distance between ideas?
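One way to picture that strictness dial, as a rough sketch with invented examples: the same query run first as a strict pattern and then as a lenient, graded similarity search (difflib's ratio stands in for whatever similarity measure the brain actually computes).

```python
# Toy version of the strictness dial. Examples and the regex are
# invented; difflib's ratio is a stand-in similarity measure.
import re
from difflib import SequenceMatcher

experiences = [
    "customer refuses to pay invoice",
    "client disputes an invoice line item",
    "supplier ships the wrong part",
]

query = "a customer is disputing their invoice"

# Strict pattern matching: all-or-nothing, like regex.
print([e for e in experiences if re.search(r"disputes an invoice", e)])

# Lenient similarity search: graded scores, ranked retrieval.
ranked = sorted(experiences,
                key=lambda e: SequenceMatcher(None, query, e).ratio(),
                reverse=True)
print(ranked)  # the two invoice experiences rank above the supplier one
```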

Jared Peterson's avatar

Looks interesting, I'll check it out.

Strictness is definitely relevant here. Our brains don't operate on a true pattern match. In NDM research (as well as in Lisa Feldman Barrett's work on emotions), we sometimes refer to what's called a "prototype," which is an amalgamation of past similar situations that have blurred together in memory (e.g., I can't remember any specific left-hand turn I've made, as they all blur together). Prototypes organize our thoughts and inform the patterns we look for. But what's interesting is that prototypes are not stored in memory; they are constructed in the moment based on what patterns are recognized. So there is a sort of chicken-and-egg problem (called the Relevance Realization problem, or the Hoffding problem) where it's unclear what comes first: the prototype or the pattern. I don't know how to solve this problem, but I suspect the leniency of pattern recognition is part of the story. Cognitive Flexibility Theory is likely also part of the story, but I haven't worked out quite how these pieces fit.
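A toy sketch of that chicken-and-egg, with invented numbers: the "prototype" is not stored anywhere; it is averaged on the fly from whichever past episodes the current situation happens to retrieve.

```python
# Toy model of a prototype constructed in the moment: retrieve the past
# episodes nearest to the current situation, then average them into an
# ad-hoc prototype. All numbers are invented.
import math

# Past episodes as feature vectors (say, many blurred-together left
# turns plus one unrelated event).
episodes = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.1, 0.9)]

def construct_prototype(situation, k=3):
    """Average the k episodes nearest to the current situation."""
    nearest = sorted(episodes, key=lambda e: math.dist(e, situation))[:k]
    return tuple(sum(dim) / k for dim in zip(*nearest))

# The prototype depends on what the situation retrieves, and retrieval
# depends on what you take the situation to be: the Hoffding problem's
# chicken-and-egg flavor.
print(construct_prototype((0.88, 0.12)))  # ~(0.85, 0.15)
```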

Kevin McLeod's avatar

The brain has the PFC; this runs at beta, it simulates non-sensorily (i.e., it's not tied to the senses), it perceives indirectly, and it descends evolutionarily from the prey-predation affinity. What it does is tell the rest of the brain "what not to do," and the brain makes this decision at gamma speeds. Get the gist? The cortex is operating a to-do list at System 1-like speeds, and the PFC is slower with a "what not to do" list. Add metacognition, interoception, and the rest of the sharp-wave ripple arrays (all wordless, btw) and you get to think through a decision.

Claudio's avatar

Very interesting. Congratulations on the article. The next question behind the curtain is who is doing all this, if it's all neurons. But in fact, it's precisely this epistemology that allows the brain to learn and internalize everything through repetition, in thought or action. As Lisa Barrett would say, that is the free will worth talking about.

Jared Peterson's avatar

I plan on doing a follow-up article at some point where I summarize how this works using Lisa Feldman Barrett's 'concept cascade'

Claudio's avatar

I highly recommend Shamil Chandaria’s Rewire course. He works at Google’s DeepMind and is one of the directors of the Centre for Human Flourishing at Oxford. It covers predictive brain models, deconstruction through contemplative practices, reconstruction (i.e., reprogramming the cascade of concepts), and then an incredible lecture on philosophy and happiness. It’s the best material I’ve come across to date on all these intertwined topics.

Denis Volkov's avatar

The Default Mode Network (DMN) and the Executive Control Network (ECN) are anti-correlated: when one activates, the other is suppressed.

If that's "all pattern matching" with no qualitative distinction, we will need to explain why these two networks are functionally antagonistic at the neural level.

The fact that the brain physically suppresses one mode to engage the other is hard to wave away with "they're both just electrochemical cascades."

What’s your take on functional antagonism between DMN and ECN?

Jared Peterson's avatar

Interesting. Are there any researchers that claim ECN is System 2 and DMN is System 1? It's a claim I've never heard, but I can see the appeal of it. Part of my argument is based on the fact that every neuroscientist I've ever spoken with seems to think System 1/2 are totally silly, and lack any empirical evidence - but none have brought up this argument.

I'm not a neuroscientist, but my understanding is that ECN is related to attention and is more external facing, whereas DMN is more internal facing. So more about different types of problems than different types of cognition. But I'll admit ignorance to this one. What's your theory here?

Claudio's avatar

I also think they’re different things. To me, System 1 and System 2 are much more like ways of thinking than actual systems. I would even prefer to call them Mode 1 and Mode 2. What we have in the brain are networks, and most of the time they work interdependently, although this isn’t the first time I’ve seen indications that the ECN and DMN are alternatives rather than concurrent (perhaps the equivalent of being present, and being lost in thought). On the other hand, it seems clear that we have some hierarchically superior capacity to investigate our own thoughts in order to make decisions, and even to simulate future situations. In other words, the very memory that underpins intuitive reactions (Mode 1) is also consulted in Mode 2 so that we can try to simulate how we will feel in the future about something (which seems to me to be one of the techniques of NDM).

Denis Volkov's avatar

Oh sorry, my argument is a lot simpler. As you noted, even the author of System 1 and System 2 was very explicit that these are fictional characters :)

The question is narrower: ECN handles attention, judgment, directed cognition — broadly aligned with your account. DMN is more generative — prospection, simulation, constructing scenarios. That's documented in the DMN literature (e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC10524518/ ).

So the challenge is: if ECN is 'pattern matching,' is DMN doing something genuinely different, like 'pattern emerging'? Especially if DMN activity involves combining patterns in configurations never experienced, not just extrapolating trends.

I'm uncertain whether that distinction survives scrutiny; you could argue simulation is just complex recombination, which stays within your framework. But that feels like the interesting question anyway!

Another thing about DMN and ECN "fighting each other," which can be felt in practice without any science: when you're writing and stop to proofread or tag notes (a purely directed, analytical task), getting back into generative flow is genuinely hard. Whether that's suppression or something else mechanistically, the phenomenology at least suggests these aren't the same process running at different speeds...