Language Model Contains Personality Subnetworks

(arxiv.org)

34 points | by PaulHoule 5 hours ago

2 comments

  • D-Machine 1 hour ago
    The personality thing seems kind of tautological / uninteresting, as I have pointed out before: https://news.ycombinator.com/item?id=46905692.

    Psychological instruments and concepts (like the MBTI) are constructed from the semantics of everyday language. Personality models (being based on self-report, not actual behaviour) are not models of actual personality, but models of the correlation patterns in the language people use to discuss things semantically related to "personality". It would thus be extremely surprising if LLMs (trained on people's discussions of and thinking about personality) did not also learn similar correlational patterns, and thus produce similar patterns of responses when prompted with questions from personality inventories.

    The real and more interesting part of the paper is the use of statistical techniques to isolate sub-networks which can then be used to emit outputs more consistent with some desired personality configuration. There is no obvious reason to me that this couldn't be extended to other types of concepts, and it kind of reads to me like a way of doing a very cheap, training-free sort of "fine-tuning".
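
    In spirit, I imagine the steering step looking something like the toy sketch below: a mask over hidden units, applied at inference time with no weight updates. All names here are made up and this is only my guess at the mechanism, not the paper's actual code.

        # Toy sketch: steer outputs by keeping only a chosen sub-network's units.
        # Hypothetical; the paper's actual isolation/steering method may differ.
        import torch
        import torch.nn as nn

        class MaskedLinear(nn.Module):
            """Wraps a linear layer and zeroes units outside the sub-network."""
            def __init__(self, linear: nn.Linear, unit_mask: torch.Tensor):
                super().__init__()
                self.linear = linear
                self.register_buffer("unit_mask", unit_mask.float())

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                # Silence everything except the selected "persona" units.
                return self.linear(x) * self.unit_mask

        hidden = nn.Linear(16, 32)         # stand-in for one MLP layer
        mask = torch.zeros(32)
        mask[:8] = 1.0                     # pretend units 0-7 are the sub-network
        steered = MaskedLinear(hidden, mask)
        out = steered(torch.randn(4, 16))  # pure inference; no training involved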

    • Nevermark 27 minutes ago
      Agreed.

      Everything in a model is a correlation of behavior with context and context with behavior.

      "Mindset" is a factor across the continuum of scales.

      Are we solving a math problem or deciding on entertainment? We become entirely "different brains" in those different contexts, as we configure our behavior and reasoning patterns accordingly.

      The study is still interesting. The representation, clustering, and bifurcations of roles may simply be one end of a continuum, but they are still meaningful things to specifically investigate.

    • devmor 59 minutes ago
      Thank you, I came here to say much the same in less eloquent terms.

      It's not surprising to find clustered sentiment in a slice of statistically correlated language. I wouldn't call this a "personality" any more than I would say the front grille of a car has a "face".

      Deterministically isolating these clusters, however, could prove to be an incredibly useful technique for both using and evaluating language models.

      • D-Machine 33 minutes ago
        It's not even really the researchers' fault; academic personality research in psychology is in general philosophically very weak, in that it almost always conflates "models of / talking about personality" with actual personality, and rarely checks whether things like the MBTI or Five-Factor Model actually correlate meaningfully with real behaviours.

        Studies that do find correlations between self-reported personality and actual behaviour tend to find them in a range of something like 0.0 to 0.3, maybe 0.4 if you are really lucky. Since variance explained is the square of the correlation, "personality" measured this way explains at most something like 0.4² = 16% of the variance in behaviour.
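
        A quick sanity check on that arithmetic, in plain Python (nothing paper-specific):

            # variance explained is the square of the correlation coefficient
            for r in (0.1, 0.2, 0.3, 0.4):
                print(f"r = {r:.1f} -> r^2 = {r * r:.0%}")
            # the last line prints: r = 0.4 -> r^2 = 16%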

        • devmor 12 minutes ago
          I don’t think this is limited to this corner of academia, or to academia at all, but I do think it’s a bit irresponsible of them to assume prior rigor in those personality tests.

          On top of that, a confounding issue is that it is human nature to anthropomorphize things. What is more likely to be anthropomorphized than a construct of written language, now the primary method of knowledge transfer between humans? I can’t help but feel that this wishful bias contributes to skipping the due diligence of choosing an appropriate metric to measure with.

  • sarducci 2 hours ago
    To me this suggests that language strongly influences behavior.
    • mitthrowaway2 1 hour ago
      My interpretation is that it's the other way around. The language model trainer's job is to find the network weights that make the model best at compressing the data in the training set. So what this means is that, say, professional work-speak text samples and hacker l33t-speak text samples are different enough that they end up being predicted by different sparse sub-networks; it was apparently too hard to find a smaller solution in which the same sub-network weights predict both outputs.
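
      One crude way to probe that picture (a hypothetical sketch, not the paper's actual technique) is to collect per-unit activations on the two styles of text and treat the units whose means differ most as the style-specific sub-network:

          import torch

          def style_subnetwork(acts_a: torch.Tensor, acts_b: torch.Tensor, k: int = 16):
              """acts_*: (tokens, units) activations gathered on each corpus.
              Returns indices of the k units whose mean activation differs most,
              i.e. a crude candidate sub-network separating style A from style B."""
              diff = (acts_a.mean(dim=0) - acts_b.mean(dim=0)).abs()
              return diff.topk(k).indices

          # Fake data standing in for, say, work-speak vs. l33t-speak activations.
          acts_work = torch.randn(1000, 512)
          acts_leet = torch.randn(1000, 512) + 0.5
          persona_units = style_subnetwork(acts_work, acts_leet)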
    • yorwba 1 hour ago
      All LLM behavior is mediated through language by construction. That doesn't mean the same applies to humans.
    • soulofmischief 1 hour ago
      I think, specifically, that certain psychological modes require different levels of articulation, and language is one way to get there in a bandwidth-limited system.

      See also: https://en.wikipedia.org/wiki/Newspeak

      • PaulHoule 1 hour ago
        People are fascinated by controlling the vocabulary for political purposes, but I think it mostly doesn't work. "Illegal Alien" is the exception that proves the rule.

        Usually it results in an "equal and opposite backlash". Once they started calling children "Special" in school, "Special" became the ultimate insult.

        • D-Machine 57 minutes ago
          It is a wordcel problem, i.e. the belief that language is all there is to modeling reality, even though this is obviously false and has been clearly disproven by decades of research in psychology, cognitive science, and neuroscience. At best we can say that language sometimes has a strong influence on our perceptions of reality.

          EDIT: For a neuroscience reference that also argues why the general perspective is obviously false: https://pmc.ncbi.nlm.nih.gov/articles/PMC4874898/. But really, these things ought to be obvious from introspection.

    • uoaei 1 hour ago
      Language constrains your perception of reality to only the set of concepts conceivable within that language.

      Agents who only speak Rust have no conception of what runtime errors are, for instance. Fascists won't understand concepts like "universal human rights", as in their worldview there is nothing universal about humanity as a whole.

      • PurpleRamen 16 minutes ago
        > Language constrains your perception of reality to only the set of concepts conceivable within that language.

        It's the opposite. People make up new concepts all the time for which they have no words, and then give them names. Language is composable; words and names are just a means to improve communication, to make it faster and more efficient.

        > Agents who only speak Rust have no conception of what runtime errors are, for instance.

        Agents don't really learn. They have a fixed set of data and everything new has to be pressed into the prompt. This is unrelated to language.

      • D-Machine 59 minutes ago
        This is IMO largely false; empirically, the strong Sapir-Whorf hypothesis, strong linguistic relativism, and the idea that language == thought are widely considered disproven [1-3].

        This is also sort of a wordcel take, in that it neglects the many mental structures that are not solely linguistic: visuo-spatial, auditory, kinaesthetic, proprioceptive, emotional, gustatory, maybe even intuitive models, plus symbolic models (which have both linguistic and visuo-spatial aspects). Yes, your models constrain your perception of reality, but it is not clear how important language really is to many of those models (and there is strong evidence it may not matter at all to a lot of cognition [3]).

        [1] https://en.wikipedia.org/wiki/Linguistic_relativity

        [2] https://plato.stanford.edu/archives/sum2015/entries/relativi...

        [3] https://pmc.ncbi.nlm.nih.gov/articles/PMC4874898/

      • PaulHoule 1 hour ago
        I'd argue that people can put words together to make new meanings, or coin new words, when they have to. The real magic of language is not that "we have words for everything" but that we have grammar.