Thursday, March 14, 2024

Religious Believers Normally Do and Should Want Their Religious Credences to Align with Their Factual Beliefs

Next week (at the Southern Society for Philosophy and Psychology) I'll be delivering comments on Neil Van Leeuwen's new book, Religion as Make-Believe. Neil argues that many (most?) people don't actually "factually believe" the doctrines of their religion, even if they profess belief. Instead, the typical attitude is one of "religious credence", which is closer to pretense or make-believe.

Below are my draft comments. Comments and further reactions welcome!

Highlights of Van Leeuwen’s View.

Neil distinguishes factual beliefs from religious credences. If you factually believe something – for example, that there’s beer in the fridge – that belief will generally have four functional features:

(1.) It is involuntary. You can’t help but believe that there’s beer in the fridge upon looking in the fridge and seeing the beer.

(2.) It is vulnerable to evidence. If you later look in the fridge and discover no beer, your belief that there is beer in the fridge will vanish.

(3.) It guides actions across the board. Regardless of context, if the question of whether beer is in your fridge becomes relevant to your actions, you will act in light of that belief.

(4.) It provides the informational background governing other attitudes. For example, if you imagine a beer-loving guest opening the fridge, you will imagine them also noticing the beer in there.

Religious credences, Neil argues, have none of those features. If you “religiously creed” that God condemns masturbators to Hell, that attitude is:

(1.) Voluntary. In some sense – maybe unconsciously – you choose to have this religious credence.

(2.) Invulnerable to evidence. Factual evidence, for example, scientific evidence of the non-existence of Hell, will not cause the credence to disappear.

(3.) Guides actions only in limited contexts. For example, it doesn’t prevent you from engaging in the condemned behavior in the way a factual belief of the same content presumably would.

(4.) Doesn’t reliably govern other attitudes. For example, if you imagine others engaging in the behavior, it doesn’t follow that you will imagine God also condemning them.

Although some people may factually believe some of their religious doctrines, Neil holds that commonly what religious people say they “believe” they in fact only religiously creed.

Neil characterizes his view as a “two map” view of factual belief and religious credence. Many religious people have one picture of the world – one map – concerning what they factually believe, and a different picture of the world – a different map – concerning what they religiously creed. These maps might conflict: One might factually believe that Earth is billions of years old and religiously creed that it is less than a million years old. Such conflict need not be rationally troubling, since the attitudes are different. Compare: You might believe that Earth is billions of years old but imagine, desire, or assume for the sake of argument that it is less than a million years old. Although the contents of these attitudes conflict, there is no irrationality. What you imagine, desire, or assume for the sake of argument needn’t match what you factually believe. There are different maps, employed for different purposes. On Neil’s view, the same holds for religious credence.

There’s much I find plausible and attractive in Neil’s view. In particular, I fully support the idea that if someone sincerely asserts a religious proposition but doesn’t generally act and react as if that proposition is true, they can’t accurately be described as believing, or at least fully believing, that proposition.

However, I think it will be more productive to focus on points of disagreement.

First Concern: The Distinction Is Too Sharp.

Neil generally speaks as though the attitudes of factual belief and religious credence split sharply into two distinct kinds. I’m not sure how much depends on this, but I’m inclined to think it’s a spectrum, with lots in the middle. Middling cases might especially include emotionally loaded attitudes where the evidence is not in-your-face compelling. Consider, for example, my attitude toward the proposition that my daughter has a great eye for fashion. This is something she cares about, an important part of how she thinks of herself, and I sincerely and enthusiastically affirm it. Is this attitude voluntary or involuntary? Well, to some extent it is a reaction to evidence; but to some extent I suspect I hold on to it in part because I want to affirm her self-conception. Is it vulnerable to counterevidence? Well, maybe if I saw again and again signs of bad fashion taste, my attitude would disappear; but it might require more counterevidence than for an attitude in which I am less invested. It’s somewhat counterevidence resistant. Does it guide my inferences across contexts? Well, probably – but suppose she says she wants to pursue a career in fashion, the success of which would depend on her really having a great eye. Now I feel the bubbling up of some anxiety about the truth of the proposition, which I don’t normally feel in other contexts. It’s not a religious credence certainly, but it has some of those features, to some degree.

Another case might be philosophical views. I’m pretty invested, for example, in my dispositionalist approach to belief. Is my dispositionalism vulnerable to evidence? I’d like to hope that if enough counterevidence accumulated, I would abandon the view. But I also admit that my investment in the view likely makes my attitude somewhat counterevidence resistant. Did I choose it voluntarily? I remember being immediately attracted to it in graduate school, when two of my favorite interlocutors at the time, Victoria McGeer and John Heil, both described dispositionalism about belief as underappreciated. I felt its attractions immediately and perhaps in some sense chose it, before I had fully thought through the range of pro and con arguments. In general, I think, students quickly tend to find philosophical views attractive or repellent, even before they are familiar enough with the argumentative landscape to be able to effectively defend their preferred views against well informed opponents; and typically (not always) they stick with the views that initially attracted them. Is this choice? Well, it’s more like choice than what happens to me when I open the fridge and simply see whether it contains beer. If religious credences are chosen, perhaps philosophical attitudes are in a similar sense partly chosen. There might be a social component, too: People you like tend to have this philosophical view, people you dislike tend to have this other one. As for widespread cognitive governance: There’s a small literature on the question of whether the views philosophers endorse in the classroom and in journal articles do, or do not, govern their choices outside of philosophical contexts. I suspect the answer is: partly.

I also suspect that typical religious credences aren’t quite as voluntary, evidentially invulnerable, and context constrained as would be suggested by a sharp-lines picture. Someone who religiously creeds that God condemns masturbators might feel, to some extent correctly, that that position is forced upon them by their other commitments and might be delighted to find and respond to evidence that it is false. And although, as Neil notes, citing Dennett, they might engage in the activity in a way that makes little sense if they literally think they are risking eternal Hell, people with this particular credence might well feel nervous, guilty, and like they are taking a risk which they hope God will later forgive. If so, their credence affects their thinking in contexts beyond Sunday – and maybe generally when it’s relevant.

Second Concern: Much of Neil’s Evidence Can Be Explained by Weak Belief.

Reading the book, I kept being niggled by the idea that much (but not all) of the evidence Neil marshals for his view could be explained if religious people factually believe what they say they believe, but don’t factually believe it with high confidence. On page 226, Neil articulates this thought as the “weak belief” explanation of the seeming irrationality of religious attitudes.

Weak belief can’t be the whole story. Even a 60% confidence in eternal damnation ought to be enough to choke off virtually any behavior, so if the behavior continues, it can’t be a rational reaction to low confidence.

Still, Neil makes much out of the fact that Vineyard members who claim in religious contexts that a shock they experienced from their coffeemaker was a demonic attack will also repair their coffeemaker and describe the shock in a more mundane way in non-religious contexts (p. 78-80). People who engage in petitionary prayer for healing also go to see the doctor (p. 86-88). And people often confess doubt about their religion (p. 93-95, 124-125). Such facts are perhaps excellent evidence that such people don’t believe with 100% confidence that the demon shocked them, that the prayer will heal them, and that the central tenets of their religion are all true. But these facts are virtually no evidence against the possibility that people have ordinary factual belief of perhaps 75% confidence that the demon shocked them, that the prayer will heal, and that their religion is true. Their alternative explanations, backup plans, and expressions of anxious doubt might be entirely appropriate and rational manifestations of low-confidence factual belief.
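To make the arithmetic behind this point concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from Neil’s book: the function name, the toy decision model (prayer heals with probability equal to the believer’s credence, and the doctor independently cures with a fixed probability), and all of the numbers. It is meant only to show that a backup plan can have substantial expected value at 75% confidence and none at 100%.

```python
def backup_plan_gain(credence_prayer_heals, p_doctor_cures, value_of_recovery):
    """Expected gain from also seeing the doctor (hypothetical toy model).

    The doctor only makes a difference in the cases where prayer
    would not have healed.
    """
    p_prayer_fails = 1.0 - credence_prayer_heals
    return p_prayer_fails * p_doctor_cures * value_of_recovery


# Illustrative, made-up numbers (assumptions, not data).
VISIT_COST = 5.0           # hassle and expense of the doctor visit
VALUE_OF_RECOVERY = 100.0  # how much recovery is worth, on the same scale
P_DOCTOR_CURES = 0.9

for credence in (1.0, 0.75):
    gain = backup_plan_gain(credence, P_DOCTOR_CURES, VALUE_OF_RECOVERY)
    worthwhile = gain > VISIT_COST
    print(f"credence={credence:.2f}  expected gain={gain:.1f}  see doctor: {worthwhile}")
```

On these assumptions, seeing the doctor is pointless only at full confidence. At 75% confidence it is the sensible choice, which is the pattern the weak belief explanation predicts.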

Third Concern: If There Are Two Maps, Why Does It Feel Like They Shouldn’t Conflict?

Consider cases where religious credences conflict with mainstream secular factual belief, such as the creationist attitude that Earth is less than a million years old and the Mormon attitude that American Indians descended from Israelites (p. 123-124). There is no rational conflict whatsoever between believing that Earth is billions of years old or that American Indians descended from East Asians and desiring that Earth is not billions of years old and that American Indians did not descend from East Asians. Nor is there any conflict between mainstream secular factual beliefs and imagining or assuming for the sake of argument that Earth is young or that American Indians descended from Israelites. For these attitude pairs, we really can construct two conflicting maps, feeling no rational pressure from their conflict. Here’s the map displaying what I factually believe, and here’s this other different map displaying what I desire, or imagine, or assume for the sake of the present argument.

But it doesn’t seem like we are, or should be, as easygoing about conflicts between our religious attitudes and our factual beliefs. Of course, some people are. Some people will happily say, “I factually think that Earth is billions of years old but my religious attitude is that Earth is young, and I feel no conflict or tension between these two attitudes.” But for the most part, I expect, to the extent people are invested in their religious credences they will reject conflicting factual content. They will say “Earth really is young. Mainstream science is wrong.” They feel the tension. This suggests that there aren’t really two maps with conflicting content, but one map, either representing Earth as old or representing Earth as young. If they buy the science, they reinterpret the creation stories as myths or metaphors. If they insist that the creation stories are literally true, then they reject the scientific consensus. What most people don’t do is hold both the standard scientific belief that Earth is literally old and the religious credence that Earth is literally young. At least, this appears to be so in most mainstream U.S. religious Christian cultures.

A one-map view nicely explains this felt tension. Neil’s two-map view needs to do more to explain why there’s a felt need for religious credence and factual belief to conform to each other. I raised a version of this concern in a blog post in 2022, developing an objection articulated by Tom Kelly in oral discussion. Neil has dubbed it the Rational Pressure Argument.

Neil’s response, in a guest post on my blog, was to suggest that there are some attitudes distinct from belief that are also subject to this type of rational pressure. Guessing is not believing, for example, but your guesses shouldn’t conflict with your factual beliefs. If you factually believe that the jar contains fewer than 8000 jelly beans, you’d better not guess that it actually contains 9041. If you hypothesize or accept in a scientific context that Gene X causes Disease Y, you’d better not firmly believe that Gene X has nothing to do with Disease Y. Thus, Neil argues, it does not follow from the felt conflict between the religious attitude and the factual belief that the religious attitude is a factual belief. Guesses and hypotheses are not beliefs and yet generate similar felt conflict.

That might be so. But the Rational Pressure Argument still creates a challenge for Neil’s two-map view. Guessing and hypothesizing are different attitudes from factual belief, but they use the same map. My map of the jelly bean jar says there are 4000-8000 jelly beans. I now stick a pin in this map at 7000; that’s my guess. My map of the causes of Disease Y doesn’t specify what genes are involved, and because of this vagueness, I can put in a pin on Gene X as a hypothesized cause. The belief map constrains the guesses and hypotheses because the guesses and hypotheses are specifications within that same map. I don’t have a separate and possibly conflicting guess map and hypothesis map in the way that I can have a separate desire map or imagination map.

I thus propose that in our culture people typically feel the need to avoid conflict between their religious attitudes and their factual beliefs; and this suggests that they feel pressure to fit their religious understandings together with their ordinary everyday and scientific understandings into a single, coherent map of how the world really is, according to them.


Thanks for the awesome book, Neil! I philosophically creed some concerns, but I invite you to infer nothing from that about my factual beliefs.

Friday, March 08, 2024

The Mimicry Argument Against Robot Consciousness

Suppose you encounter something that looks like a rattlesnake.  One possible explanation is that it is a rattlesnake.  Another is that it mimics a rattlesnake.  Mimicry can arise through evolution (other snakes mimic rattlesnakes to discourage predators) or through human design (rubber rattlesnakes).  Normally, it's reasonable to suppose that things are what they appear to be.  But this default assumption can be defeated -- for example, if there's reason to suspect sufficiently frequent mimics.
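The point about mimic frequency can be put in terms of Bayes' rule. The following is a minimal sketch in Python, not part of the original argument: the function name and every number in it are made-up assumptions, chosen only to show how the same appearance supports very different conclusions depending on how common mimics are.

```python
def probability_genuine(prior_genuine, p_signal_if_genuine, p_signal_if_not):
    """Bayes' rule: P(genuine | signal observed).

    prior_genuine       -- share of encountered things that are the real article
    p_signal_if_genuine -- chance the real article shows the superficial signal
    p_signal_if_not     -- chance a non-genuine thing shows it (mimic frequency)
    """
    numerator = p_signal_if_genuine * prior_genuine
    denominator = numerator + p_signal_if_not * (1.0 - prior_genuine)
    return numerator / denominator


# Hypothetical numbers: 10% of snakes encountered are rattlesnakes,
# and every rattlesnake looks like one.
rare_mimics = probability_genuine(0.10, 1.0, 0.001)      # mimics are very rare
frequent_mimics = probability_genuine(0.10, 1.0, 0.50)   # mimics are common

print(f"mimics rare:     P(rattlesnake | looks like one) = {rare_mimics:.3f}")
print(f"mimics frequent: P(rattlesnake | looks like one) = {frequent_mimics:.3f}")
```

With rare mimics the default assumption holds up well; with frequent mimics the same appearance leaves it more likely than not that the thing is not a rattlesnake.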

Linguistic and "social" AI programs are designed to mimic superficial features that ordinarily function as signs of consciousness.  These programs are, so to speak, consciousness mimics.  This fact about them justifies skepticism about the programs' actual possession of consciousness despite the superficial features.

In biology, deceptive mimicry occurs when one species (the mimic) resembles another species (the model) in order to mislead another species such as a predator (the dupe).  For example, viceroy butterflies evolved to visually resemble monarch butterflies in order to mislead predator species that avoid monarchs due to their toxicity.  Gopher snakes evolved to shake their tails in dry brush in a way that resembles the look and sound of rattlesnakes.

Social mimicry occurs when one animal emits behavior that resembles the behavior of another animal for social advantage.  For example, African grey parrots imitate each other to facilitate bonding and to signal in-group membership, and their imitation of human speech arguably functions to increase the care and attention of human caregivers.

In deceptive mimicry, the signal normally doesn't correspond with possession of the model's relevant trait.  The viceroy is not toxic, and the gopher snake has no venomous bite.  In social mimicry, even if there's no deceptive purpose, the signal might or might not correspond with the trait suggested by the signal: The parrot might or might not belong to the group it is imitating, and Polly might or might not really "want a cracker".

All mimicry thus involves three traits: the superficial trait (S2) of the mimic, the corresponding superficial trait (S1) of the model, and an underlying feature (F) of the model that is normally signaled by the presence of S1 in the model.  (In the Polly-want-a-cracker case, things are more complicated, but let's assume that the human model is at least thinking about a cracker.)  Normally, S2 in the mimic is explained by its having been modeled on S1 rather than by the presence of F in the mimic, even if F happens to be present in the mimic.  Even if viceroy butterflies happen to be toxic to some predator species, their monarch-like coloration is better explained by their modeling on monarchs than as a signal of toxicity.  Unless the parrot has been specifically trained to say "Polly want a cracker" only when it in fact wants a cracker, its utterance is better explained by modeling on the human than as a signal of desire.

Figure: The mimic's possession of superficial feature S2 is explained by mimicry of superficial feature S1 in the model.  S1 reliably indicates F in the model, but S2 does not reliably indicate F in the mimic.


This general approach to mimicry can be adapted to superficial features normally associated with consciousness.

Consider a simple case, where S1 and S2 are emission of the sound "hello" and F is the intention to greet.  The mimic is a child's toy that emits that sound when turned on, and the model is an ordinary English-speaking human.  In an ordinary English-speaking human, emitting the sound "hello" normally (though of course not perfectly) indicates an intention to greet.  However, a child's toy has no intention to greet.  (Maybe its designer, years ago, had an intention to craft a toy that would "greet" the user when powered on, but that's not the toy's intention.)  F cannot be inferred from S2, and S2 is best explained by modeling on S1.

Large Language Models like GPT, PaLM, and LLaMA are more complex but are structurally mimics.

Suppose you ask ChatGPT-4 "What is the capital of California?" and it responds "The capital of California is Sacramento."  The relevant superficial feature, S2, is a text string correctly identifying the capital of California.  The best explanation of why ChatGPT-4 exhibits S2 is that its outputs are modeled on human-produced text that also correctly identifies the capital of California as Sacramento.  Human-produced text with that content reliably indicates the producer's knowledge that Sacramento is the capital of California.  But we cannot infer corresponding knowledge when ChatGPT-4 is the producer.  Maybe "beliefs" or "knowledge" can be attributed to sufficiently sophisticated language models, but that requires further argument.  A much simpler model, trained on a small set of data containing a few instances of "The capital of California is Sacramento" might output the same text string for essentially similar reasons, without being describable as "knowing" this fact in any literal sense.

When a Large Language Model outputs a novel sentence not present in the training corpus, S2 and S1 will need to be described more abstractly (e.g., "a summary of Hamlet" or even just "text interpretable as a sensible answer to an absurd question").  But the underlying considerations are the same.  The LLM's output is modeled on patterns in human-generated text and can be explained as mimicry of those patterns, leaving open the question of whether the LLM has the underlying features we would attribute to a human being who gave a similar answer to the same prompt.  (See Bender et al. 2021 for an explicit comparison of LLMs and parrots.)

#

Let's call something a consciousness mimic if it exhibits superficial features best explained by having been modeled on the superficial features of a model system, where in the model system those superficial features reliably indicate consciousness.  ChatGPT-4 and the "hello" toy are consciousness mimics in this sense.  (People who say "hello" or answer questions about state capitals are normally conscious.)  Given the mimicry, we cannot infer consciousness from the mimics' S2 features without substantial further argument.  A consciousness mimic exhibits traits that superficially look like indicators of consciousness, but which are best explained by the modeling relation rather than by appeal to the entity's underlying consciousness.  (Similarly, the viceroy's coloration pattern is best explained by its modeling on the monarch, not as a signal of its toxicity.)

"Social AI" programs, like Replika, combine the structure of Large Language Models with superficial signals of emotionality through an avatar with an expressive face.  Although consciousness researchers are near consensus that ChatGPT-4 and Replika are not conscious to any meaningful degree, some ordinary users, especially those who have become attached to AI companions, have begun to wonder.  And some consciousness researchers have speculated that genuinely conscious AI might be on the near (approximately ten-year) horizon (e.g., Chalmers 2023; Butlin et al. 2023; Long and Sebo 2023).

Other researchers -- especially those who regard biological features as crucial to consciousness -- doubt that AI consciousness will arrive anytime soon (e.g., Godfrey-Smith 2016; Seth 2021).  It is therefore likely that we will enter an era in which it is reasonable to wonder whether some of our most advanced AI systems are conscious.  Both consciousness experts and the ordinary public are likely to disagree, raising difficult questions about the ethical treatment of such systems (for some of my alarm calls about this, see Schwitzgebel 2023a, 2023b).

Many of these systems, like ChatGPT and Replika, will be consciousness mimics.  They might or might not actually be conscious, depending on what theory of consciousness is correct.  However, because of their status as mimics, we will not be licensed to infer that they are conscious from the fact that they have superficial features (S2-type features) that resemble features in humans (S1-type features) that, in humans, reliably indicate consciousness (underlying feature F).

In saying this, I take myself to be saying nothing novel or surprising.  I'm simply articulating in a slightly more formal way what skeptics about AI consciousness say and will presumably continue to say.  I'm not committing to the view that such systems would definitely not be conscious.  My view is weaker, and probably acceptable even to most advocates of near-future AI consciousness.  One cannot infer the consciousness of an AI system that is built on principles of mimicry from the fact that it possesses features that normally indicate consciousness in humans.  Some extra argument is required.

However, any such extra argument is likely to be uncompelling.  Given the highly uncertain status of consciousness science, and widespread justifiable dissensus, any positive argument for these systems' consciousness will almost inevitably be grounded in dubious assumptions about the correct theory of consciousness (Schwitzgebel 2014, 2024).

Furthermore, given the superficial features, it might feel very natural to attribute consciousness to such entities, especially among non-experts unfamiliar with their architecture and perhaps open to, or even enthusiastic about, the possibility of AI consciousness in the near future.

The mimicry of superficial features of consciousness isn't proof of the nonexistence of consciousness in the mimic, but it is grounds for doubt.  And in the context of highly uncertain consciousness science, it will be difficult to justify setting aside such doubts.

None of these remarks would apply, of course, to AI systems that somehow acquire features suggestive of consciousness by some process other than mimicry.

Friday, March 01, 2024

The Leapfrog Hypothesis for AI Consciousness

The first genuinely conscious robot or AI system would, you might think, have relatively simple consciousness -- insect-like consciousness, or jellyfish-like, or frog-like -- rather than the rich complexity of human-level consciousness. It might have vague feelings of dark vs light, the to-be-sought and to-be-avoided, broad internal rumblings, and not much else -- not, for example, complex conscious thoughts about ironies of Hamlet, or multi-part long-term plans about how to form a tax-exempt religious organization. The simple usually precedes the complex. Building a conscious insect-like entity seems a lower technological bar than building a more complex consciousness.

Until recently, that's what I had assumed (in keeping with Basl 2013 and Basl 2014, for example). Now I'm not so sure.

[Dall-E image of a high-tech frog on a lily pad]

AI systems are -- presumably! -- not yet meaningfully conscious, not yet sentient, not yet capable of feeling genuine pleasure or pain or having genuine sensory experiences. Robotic eyes "see" but they don't yet see, not like a frog sees. However, they do already far exceed all non-human animals in their capacity to explain the ironies of Hamlet and plan the formation of federally tax-exempt organizations. (Put the "explain" and "plan" in scare quotes, if you like.) For example:

[ChatGPT-4 outputs for "Describe the ironies of Hamlet" and "Devise a multi-part long term plan about how to form a tax-exempt religious organization"]

Let's see a frog try that!

Consider, then, the Leapfrog Hypothesis: The first conscious AI systems will have rich and complex conscious intelligence, rather than simple conscious intelligence. AI consciousness development will, so to speak, leap right over the frogs, going straight from non-conscious to richly endowed with complex conscious intelligence.

What would it take for the Leapfrog Hypothesis to be true?

First, engineers would have to find it harder to create a genuinely conscious AI system than to create rich and complex representations or intelligent behavioral capacities that are not conscious.

And second, once a genuinely conscious system is created, it would have to be relatively easy thereafter to plug in the pre-existing, already developed complex representations or intelligent behavioral capacities in such a way that they belong to the stream of conscious experience in the new genuinely conscious system. Both of these assumptions seem at least moderately plausible, in these post-GPT days.

Regarding the first assumption: Yes, I know GPT isn't perfect and makes some surprising commonsense mistakes. We're not at genuine artificial general intelligence (AGI) yet -- just a lot closer than I would have guessed in 2018. "Richness" and "complexity" are challenging to quantify (Integrated Information Theory is one attempt). Quite possibly, properly understood, there's currently less richness and complexity in deep learning systems and large language models than it superficially seems. Still, their sensitivity to nuance and detail in the inputs and the structure of their outputs bespeaks complexity far exceeding, at least, light-vs-dark or to-be-sought-vs-to-be-avoided.

Regarding the second assumption, consider a cartoon example, inspired by Global Workspace theories of consciousness. Suppose that, to be conscious, an AI system must have input (perceptual) modules, output (behavioral) modules, side processors for specific cognitive tasks, long- and short-term memory stores, nested goal architectures, and between all of them a "global workspace" which receives selected ("attended") inputs from most or all of the various modules. These attentional targets become centrally available representations, accessible by most or all of the modules. Possibly, for genuine consciousness, the global workspace must have certain further features, such as recurrent processing in tight temporal synchrony. We arguably haven't yet designed a functioning AI system that works exactly along these lines -- but for the sake of this example let's suppose that once we create a good enough version of this architecture, the system is genuinely conscious.

But now, as soon as we have such a system, it might not be difficult to hook it up to a large language model like GPT-7 (GPT-8? GPT-14?) and to provide it with complex input representations full of rich sensory detail. The lights turn on... and as soon as they turn on, we have conscious descriptions of the ironies of Hamlet, richly detailed conscious pictorial or visual inputs, and multi-layered conscious plans. Evidently, we've overleapt the frog.

Of course, Global Workspace Theory might not be the right theory of consciousness. Or my description above might not be the best instantiation of it. But the thought plausibly generalizes to a wide range of functionalist or computationalist architectures: The technological challenge is in creating any consciousness at all in an AI system, and once this challenge is met, giving the system rich sensory and cognitive capacities, far exceeding that of a frog, might be the easy part.

Do I underestimate frogs? Bodily tasks like five-finger grasping and locomotion over uneven surfaces have proven to be technologically daunting (though we're making progress). Maybe the embodied intelligence of a frog or bee is vastly more complex and intelligent than the seemingly complex, intelligent linguistic outputs of a large language model.

Sure thing -- but this doesn't undermine my central thought. In fact, it might buttress it. If consciousness requires frog- or bee-like embodied intelligence -- maybe even biological processes very different from what we can now create in silicon chips -- artificial consciousness might be a long way off. But then we have even longer to prepare the part that seems more distinctively human. We get our conscious AI bee and then plug in GPT-28 instead of GPT-7, plug in a highly advanced radar/lidar system, a 22nd-century voice-to-text system, and so on. As soon as that bee lights up, it lights up big!

Tuesday, February 20, 2024

Could Someone Still Be Collecting a Civil War Widow's Pension? A Possibility Proof

In 1865, a 14-year-old boy becomes a Union soldier in the U.S. Civil War. In 1931, at age 80, he marries an 18-year-old woman, who continues to collect his Civil War pension after he dies. Today, in early 2024, she is one hundred and ten years old, still collecting that pension.
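A quick way to check this timeline is to compute the implied birth years. The sketch below is only an illustration in Python: the variable names are mine, and it ignores where birthdays fall within a year, so each age could be off by one.

```python
# Dates and ages taken from the scenario above.
ENLISTMENT_YEAR, AGE_AT_ENLISTMENT = 1865, 14
WEDDING_YEAR, BRIDE_AGE_AT_WEDDING = 1931, 18
CURRENT_YEAR = 2024

# Implied birth years (assuming the birthday has already passed in the named year).
soldier_birth_year = ENLISTMENT_YEAR - AGE_AT_ENLISTMENT
bride_birth_year = WEDDING_YEAR - BRIDE_AGE_AT_WEDDING

soldier_age_at_wedding = WEDDING_YEAR - soldier_birth_year
widow_age_by_end_of_year = CURRENT_YEAR - bride_birth_year
# Early in the year her birthday may not have arrived yet, hence a range.
widow_age_range = (widow_age_by_end_of_year - 1, widow_age_by_end_of_year)

print("Soldier's age at the wedding:", soldier_age_at_wedding)
print("Widow's possible age in early", CURRENT_YEAR, ":", widow_age_range)
```

Nothing here depends on anything beyond subtraction; the possibility proof needs only a very young soldier, a very late marriage, and a very long-lived widow.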

I was inspired to this thought by reflecting about some long-dead people my father knew, who survive in my memory through his stories. How far back might such second-hand memories go? Farther than one might initially suppose -- in principle, back to the 1860s. An elderly philosopher, alive today, might easily have second-hand memories of William James (d. 1910) or Nietzsche (d. 1900), maybe even Karl Marx (d. 1883) or John Stuart Mill (d. 1873).

Second-hand memories have a quality to them that third-hand memories and historical accounts lack. Through my father's and uncle's stories, I feel a kind of personal connection to Timothy Leary (d. 1996), B.F. Skinner (d. 1990), and Abraham Maslow (d. 1970), even though I never met them, in a way I don't to other scholars of the era. It hasn't been so long since their heyday in the 1950s - 1960s, when my father and his brother knew them -- but I might still have several decades in me. My son David, currently a Cognitive Science PhD student at Institut Jean Nicod at ENS in Paris, has also heard such stories, and he could potentially live to see the 22nd century. (My daughter Kate was too young when my father died to have made much of his academic stories.)

The idea that the U.S. might still be paying a Civil War widow's pension is not as ridiculous as it seems. According to this website, the last pension-receiving Union widow died in 2003. According to this website, it was 2008. The last recipient of a Civil War children's benefit died from a hip injury in 2020.

GPT-4 representation of an elderly civil war widow in a cityscape in 2020:

Friday, February 16, 2024

What Types of Argument Convince People to Donate to Charity? Empirical Evidence

Back in 2020, Fiery Cushman and I ran a contest to see if anyone could write a philosophical argument that convinced online research participants to donate a surprise bonus to charity at rates statistically above control. (Chris McVey, Josh May, and I had failed to write any successful arguments in some earlier attempts.) Contributions were not permitted to mention particular real people or events, couldn't be narratives, and couldn't include graphics or vivid descriptions. We wanted to see whether relatively dry philosophical arguments could move people to donate.

We received 90 submissions (mostly from professional philosophers, psychologists, and behavioral economists, but also from other Splintered Mind readers), and we selected 20 that we thought represented a diversity of the most promising arguments. The contest winner was an argument written by Matthew Lindauer and Peter Singer, highlighting that a donation of $25 can save a child in a developing country from going blind due to trachoma, then asking the reader to reflect on how much they would be willing to donate to save their own child from going blind. (Full text here.)

Kirstan Brodie, Jason Nemirow, Fiery, and I decided to follow up by testing all 90 submitted arguments to see what features were present in the most effective arguments. We coded the arguments according to whether, for example, they mentioned children, or appealed to religion, or mentioned the reader's assumed own economic good fortune, etc. -- twenty different features in all. We recruited approximately 9000 participants. Each participant had a 10% chance of winning a surprise bonus of $10. They could either keep the whole $10 or donate some portion of it to one of six effective charities. Participants decided whether to donate, and how much, before knowing if they were among the 10% receiving the $10.

Now, unfortunately, proper statistical analysis is complicated. Because we were working with whatever came in, we couldn't balance argument features, most arguments had multiple coded features, and the coded features tended to correlate between submissions. I'll share a proper analysis of the results later. Today I'll share a simpler analysis. This simple analysis looks at the coded features one by one, comparing the average donation among the set of arguments with the feature to the average donation among the set of arguments without the feature.
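For readers who want to see the shape of such a one-feature-at-a-time comparison, here is a minimal sketch in Python. It is not the code used in the study: the function name `compare_feature`, the toy donation lists, the Welch-style standard error, and the normal approximation to the p-value are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_feature(donations_with, donations_without):
    """Compare mean donation for arguments with vs. without one coded feature.

    Returns (diff, t, p). The p-value is two-sided and uses a normal
    approximation to the t distribution, which is reasonable only for
    large samples (an assumption of this sketch).
    """
    n1, n2 = len(donations_with), len(donations_without)
    m1, m2 = mean(donations_with), mean(donations_without)
    # Welch-style standard error: does not assume equal variances.
    se = sqrt(variance(donations_with) / n1 + variance(donations_without) / n2)
    t = (m1 - m2) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return m1 - m2, t, p


# Hypothetical toy data (dollars donated out of $10), NOT the study's data.
with_feature = [5, 0, 10, 5, 3, 2, 10, 0, 5, 4]
without_feature = [0, 5, 2, 0, 3, 10, 0, 1, 5, 0]

diff, t, p = compare_feature(with_feature, without_feature)
print(f"diff = ${diff:.2f}, t = {t:.2f}, p ~ {p:.3f}")
```

With real data, one would run this once per coded feature, splitting participants by whether the argument they read had that feature.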

There is something to be said, I think, for simple analyses even when they aren't perfect: They tend to be easier to understand and to have fewer "researcher degrees of freedom" (and thus less opportunity for p-hacking). Ideally, simple and sophisticated statistical analyses go hand-in-hand, telling a unified story.

So, what argument features appear to be relatively more versus less effective in motivating charitable giving?

Here are our results, from highest to lowest difference in mean donation. "diff" is the dollar difference in mean donation, N is the number of participants who saw an argument with that feature, n is the number of arguments containing that feature, and p is the statistical p-value in a two-sample t test (without correction for multiple comparisons). All analyses are tentative, pending double-checking, skeptical examination, and possibly some remaining data clean-up.

Predictive Argument Features, Highest to Lowest

Does the argument appeal to the notion of equality?
$3.99 vs $3.39 (diff = $.60, N = 395, n = 4, p < .001)

... mention human evolutionary history?
$3.93 vs $3.39 (diff = $.55, N = 4940, n = 5, p < .001)

... specifically mention children?
$3.76 vs $3.26 (diff = $.49, N = 4940, n = 27, p < .001)

... mention a specific, concrete benefit to others that $10 or a similar amount would bring (e.g., 3 mosquito nets or a specific inexpensive medical treatment)?
$3.75 vs $3.34 (diff = $.41, N = 1718, n = 17, p < .001)

... appeal to the diminishing marginal utility of dollars kept by (rich) donors?
$3.69 vs $3.29 (diff = $.40, N = 2843, n = 27, p < .001)

... appeal to the massive marginal utility of dollars transferred to (poor) recipients?
$3.65 vs $3.25 (diff = $.40, N = 3758, n = 36, p < .001)

... mention, or ask the participant to bring to mind, a particular person who is physically or emotionally near to them?
$3.74 vs $3.40 (diff = $.34, N = 318, n = 3, p = .061)

... mention particular needs or hardships such as clean drinking water or blindness?
$3.56 vs $3.23 (diff = $.30, N = 4940, n = 49, p < .001)

... refer to the reader's own assumed economic good fortune?
$3.58 vs $3.31 (diff = $.27, N = 3544, n = 35, p < .001)

... focus on one, single issue? (e.g. trachoma)
$3.61 vs $3.40 (diff = $.21, N = 800, n = 8, p = .07)

... remind people that giving something is better than nothing? (i.e. corrective for drop-in-the-bucket thinking)
$3.56 vs $3.40 (diff = $.15, N = 595, n = 6, p = .24)

... appeal to the views of experts (e.g. philosophers, psychologists)?
$3.47 vs $3.39 (diff = $.07, N = 2629, n = 27, p = .29)

... reference specific external sources such as news reports or empirical studies?
$3.47 vs $3.40 (diff = $.07, N = 1828, n = 18, p = .41)

... explicitly mention that donation is common?
$3.46 vs $3.41 (diff = $.05, N = 736, n = 7, p = .66)

... appeal to the notion of randomness/luck (e.g., nobody chose the country they were born in)?
$3.43 vs $3.41 (diff = $.02, N = 1403, n = 14, p = .80)

... mention religion?
$3.35 vs $3.42 (diff = -$.07, N = 905, n = 9, p = .48)

... appeal to veil-of-ignorance reasoning or other perspective-taking thought experiments?
$3.29 vs $3.43 (diff = -$.14, N = 4940, n = 8, p = .20)

... mention that giving could inspire others to give? (i.e. spark behavioral contagion)
$3.29 vs $3.43 (diff = -$.14, N = 896, n = 9, p = .20)

... explicitly mention and address specific counterarguments?
$3.29 vs $3.45 (diff = -$.15, N = 1829, n = 19, p = .048)

... appeal to the self-interest of the participant?
$3.22 vs $3.49 (diff = -$.27, N = 2604, n = 22, p < .001)
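Since the p-values above are uncorrected, one quick way to gauge robustness is a Bonferroni threshold across the twenty features. The sketch below is an illustration only, not part of the original analysis: the short feature labels are made up for readability, and every "p < .001" is entered as the upper bound 0.001.

```python
# Reported p-values in the order listed above. "p < .001" is entered as the
# upper bound 0.001 (an assumption for illustration); labels are shorthand.
reported_p = {
    "equality": 0.001,
    "evolutionary history": 0.001,
    "children": 0.001,
    "concrete benefit": 0.001,
    "diminishing marginal utility": 0.001,
    "massive marginal utility": 0.001,
    "particular near person": 0.061,
    "particular hardships": 0.001,
    "economic good fortune": 0.001,
    "single issue": 0.07,
    "something better than nothing": 0.24,
    "expert views": 0.29,
    "external sources": 0.41,
    "donation is common": 0.66,
    "randomness/luck": 0.80,
    "religion": 0.48,
    "veil of ignorance": 0.20,
    "inspire others": 0.20,
    "counterarguments": 0.048,
    "self-interest": 0.001,
}

alpha = 0.05
# Bonferroni: divide the significance level by the number of tests.
threshold = alpha / len(reported_p)
survivors = [name for name, p in reported_p.items() if p <= threshold]

print(f"Bonferroni threshold: {threshold:.4f}")
print(f"{len(survivors)} of {len(reported_p)} features pass:")
for name in survivors:
    print("  " + name)
```

Under these assumptions, only the features reported at p < .001 pass; the counterarguments result (p = .048) does not.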

From this analysis, several argument features appear to be effective in increasing participant donations:

  • mentioning children and appealing to the equality of all people,
  • mentioning concrete benefits (one or several),
  • mentioning the reader's assumed economic good fortune and the relatively large impact of a relatively small sacrifice (the "margins" features), and
  • mentioning evolutionary history (e.g., theories that human beings evolved to care more about near others than distant others).

    Mentioning a particular near person might also have been effective, but since only three arguments were coded in this category, statistical power was poor.

    In contrast, appealing to the participant's self-interest (e.g., that donating will make them feel good) appears to have backfired. Mentioning and addressing counterarguments to donation (e.g., responding to concerns that donations are ineffective or wasted) might also have backfired.

    Now I don't think we should take these results wholly at face value. For example, only five of the ninety arguments appealed to evolutionary history, and all of those arguments included at least two other seemingly effective features: particular hardships, margins, or children. In multiple regression analyses and multi-level analyses that explore how the argument features cluster, it looks like particular hardships, children, and margins might be more robustly predictive -- more on that in a future post. ETA (Feb 19): Where n < 10 arguments, effects are unlikely to be statistically robust.

    What if we combine argument features? There are various ways to do this, but the simplest is to give an argument one point for each of the ten largest-effect features it contains, then perform a linear regression. The resulting model has an intercept of $3.09 and a slope of $.13. Thus, the model predicts that participants who read arguments with none of these features will donate $3.09, while participants who read a hypothetical argument containing all ten features will donate $4.39.
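The arithmetic of that linear model can be spelled out in a few lines. This is only a sketch of the prediction step, using the intercept and slope quoted above; the function name `predicted_donation` is made up for illustration, and the regression fit itself is not reproduced here.

```python
INTERCEPT = 3.09  # predicted donation (dollars) with none of the ten features
SLOPE = 0.13      # added predicted donation (dollars) per feature present


def predicted_donation(feature_count):
    """Predicted mean donation for an argument with `feature_count` features."""
    if not 0 <= feature_count <= 10:
        raise ValueError("feature_count must be between 0 and 10")
    return INTERCEPT + SLOPE * feature_count


for k in (0, 5, 8, 10):
    print(f"{k:2d} features -> ${predicted_donation(k):.2f}")
```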

    Further analysis also suggests that piling up argument features is cumulative: Arguments with at least six of the effective features generated mean donations of $3.89 (vs. $3.37), those with at least seven generated mean donations of $4.46 (vs. $3.38), and the one argument with eight of the ten effective features generated a mean donation of $4.88 (vs. $3.40) (all p's < .001). This eight-feature argument was, in fact, the best performing argument of the ninety. (However, caution is warranted concerning the estimated effect size for any particular argument: With only approximately 100 participants per argument and a standard deviation of about $3, the 95% confidence intervals for the effect size of individual arguments are about +/- $.60.)
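As a rough check on the size of that uncertainty, here is a normal-approximation 95% interval for a single argument's mean donation. It is a sketch under the round numbers stated above (about 100 participants, standard deviation about $3), and it ignores the much smaller uncertainty in the comparison group.

```python
from math import sqrt

SD = 3.0              # approximate standard deviation of donations, in dollars
N_PER_ARGUMENT = 100  # approximate participants per argument
Z95 = 1.96            # normal critical value for a two-sided 95% interval

standard_error = SD / sqrt(N_PER_ARGUMENT)
half_width = Z95 * standard_error

print(f"standard error = ${standard_error:.2f}")
print(f"95% CI half-width = +/- ${half_width:.2f}")
```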

    ------------------------------------------------------

    Last month, I articulated and defended the attractiveness of moral expansion through Mengzian extension. On my interpretation of the ancient Chinese philosopher Mengzi, expansion of one's moral perspective often (typically?) begins with noticing how you react to nearby cases -- whether physically nearby (a child in front of you, about to fall into a well) or relationally nearby (your close family members) -- and proceeds by noticing that remote cases (distant children, other people's parents) are similar in important respects.

    None of the twenty coded features captured exactly that. ("Particular near person" was close, but neither necessary nor sufficient: not necessary, because the coders used a stringent standard for when an argument invoked a particular near person, and not sufficient since invoking a particular near person is only the first step in Mengzian extension.) So I asked UCR graduate student Jordan Jackson, who studies Chinese philosophy and with whom I've discussed Mengzian extension, to read all 90 arguments and code them for whether they employed Mengzian extension style reasoning. He found six that did.

    In accord with my hypothesis about the effectiveness of Mengzian extension, the six Mengzian extension arguments outperformed the arguments that did not employ Mengzian extension:

    $3.85 vs $3.38 (diff = $.47, N = 612, n = 6, p < .001)

    Among those six arguments are both the 2020 original contest winner written by Lindauer and Singer and also the best-performing argument in the present study -- though as noted earlier, the best-performing argument in the current study also had many other seemingly effective features.

    In case you're curious, here's the full text of that argument, adapted by Alex Garinther, and quoting extensively, from one of the stimuli in Lindauer et al. 2020:

    HEAR ME OUT ON SOMETHING. The explanation below is a bit long, but I promise reading the next few paragraphs will change you.

    As you know, there are many children who live in conditions of severe poverty. As a result, their health, mental development, and even their lives are at risk from lack of safe water, basic health care, and healthy food. These children suffer from malnutrition, unsanitary living conditions, and are susceptible to a variety of diseases. Fortunately, effective aid agencies (like the Against Malaria Foundation) know how to handle these problems; the issue is their resources are limited.

    HERE'S A PHILOSOPHICAL ARGUMENT: Almost all of us think that we should save the life of a child in front of us who is at risk of dying (for example, a child drowning in a shallow pond) if we are able to do so. Most people also agree that all lives are of equal moral worth. The lives of faraway children are no less morally significant than the lives of children close to us, but nearby children exert a more powerful emotional influence. Why?

    SCIENTISTS HAVE A PLAUSIBLE ANSWER: We evolved in small groups in which people helped their neighbors and were suspicious of outsiders, who were often hostile. Today we still have these “Us versus Them” biases, even when outsiders pose no threat to us and could benefit enormously from our help. Our biological history may predispose us to ignore the suffering of faraway people, but we don't have to act that way.

    By taking money that we would otherwise spend on needless luxuries and donating it to an effective aid agency, we can have a big impact. We can provide safe water, basic health care, and healthy food to children living in severe poverty, saving lives and relieving suffering.

    Shouldn't we, then, use at least some of our extra money to help children in severe poverty? By doing so, we can help these children to realize their potential for a full life. Great progress has been made in recent years in addressing the problem of global poverty, but the problem isn't being solved fast enough. Through charitable giving, you can contribute towards more rapid progress in overcoming severe poverty.

    Even a donation of $5 can save a life by providing one mosquito net to a child in a malaria-prone area. FIVE DOLLARS could buy us a large cappuccino, and that same amount of money could be used to save a life.

    Friday, February 09, 2024

    Grade Inflation at UC Riverside, and Institutional Pressures for Easier Grading

    Recent news reports have highlighted grade inflation at elite universities: Harvard gave 79% As in 2020-2021, as did Yale in 2022-2023, compared to 67% in 2010-2011. At Harvard, the average GPA has risen from 2.55 in 1950 to 3.05 in 1975 to 3.36 in 1995 to 3.80 now. At Brown, 67% of grades were As in 2020-2021, 10% Bs, and only 1% Cs. It's not just elite universities, however: Grades have risen sharply since at least the 1980s across a wide range of schools.

    I decided to look at UC Riverside's grade distributions since 2013, since faculty now have access to a tool to view this information. (It would be nice to look back farther, but even the changes since 2013 are interesting.)

    The following chart lists grade distributions quarter by quarter for the regular academic year, from 2013 through the present. The dark blue bars at the top are As, medium blue Bs, light blue Cs, and red is D, F, or W.

    [Chart: UCR grade distributions by quarter, 2013 through the present]

    Three things are visually obvious from this graph:

  • First, there's a spike of high grades in Spring 2020 -- presumably due to the chaos of the early days of the pandemic.
  • Second, the percentage of As is higher in recent years than in earlier years.
  • Third, the percentage of DFWs has remained about the same across the period.

    In Fall 2013, 32% of enrolled students received As. In Fall 2023, 45% did. (DFW's were 9% in both terms.)

    One open question is whether the new normal of about 45% As reflects a general trend independent of the pandemic spike or whether the pandemic somehow created an enduring change. Another question is whether the higher percentage of As reflects easier grading or better performance. The term "inflation" suggests the former, but of course data of this sort by themselves don't distinguish between those possibilities.

    The increase in percentage As is evident in both lower division and upper division classes, increasing from 32% to 43% in lower division and from 33% to 49% in upper division.

    How about UCR philosophy in particular? I'd like to think that my own department has consistent and rigorous standards. However, as the figure below shows, the trends in UCR philosophy are similar, with an increase from 26% As in Fall 2013 to 41% As in Fall 2023:

    [Chart: UCR philosophy grade distributions by quarter]

    Lower division philosophy classes at UCR increased from 25% As in Fall 2013 to 40% As in Fall 2023, while upper division classes increased from 26% to 47% As.

    Smoothing out quarter-by-quarter differences, here is the percentage of As, Fall 2013 - Spring 2014 vs Winter 2023 - Fall 2023 for Philosophy and some selected other disciplines at UCR for comparison:
    Philosophy: 27% to 43% (28% to 42% lower, 25% to 46% upper)
    English: 20% to 33% (15% to 28% lower, 38% to 64% upper)
    History: 28% to 52% (23% to 52% lower, 48% to 52% upper)
    Business: 28% to 46% (20% to 24% lower, 29% to 49% upper)
    Psychology: 32% to 51% (33% to 51% lower, 31% to 51% upper)
    Biology: 22% to 38% (28% to 36% lower, 17% to 41% upper)
    Physics: 26% to 39% (26% to 37% lower, 40% to 41% upper)

    As you can see, in some disciplines at some levels, the percentage of As has almost doubled over the ten-year time period.
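To see which discipline-level combinations changed the most, one can turn the before/after percentages listed above into simple growth ratios. The sketch below just copies those figures into a dictionary; the names `A_SHARES` and `growth_ratio` are made up for illustration.

```python
# Percentage of As, Fall 2013 - Spring 2014 vs Winter 2023 - Fall 2023,
# copied from the list above as (before, after) pairs.
A_SHARES = {
    "Philosophy": {"overall": (27, 43), "lower": (28, 42), "upper": (25, 46)},
    "English": {"overall": (20, 33), "lower": (15, 28), "upper": (38, 64)},
    "History": {"overall": (28, 52), "lower": (23, 52), "upper": (48, 52)},
    "Business": {"overall": (28, 46), "lower": (20, 24), "upper": (29, 49)},
    "Psychology": {"overall": (32, 51), "lower": (33, 51), "upper": (31, 51)},
    "Biology": {"overall": (22, 38), "lower": (28, 36), "upper": (17, 41)},
    "Physics": {"overall": (26, 39), "lower": (26, 37), "upper": (40, 41)},
}


def growth_ratio(before, after):
    """Ratio of the later share of As to the earlier share."""
    return after / before


ratios = {
    (discipline, level): growth_ratio(*pair)
    for discipline, levels in A_SHARES.items()
    for level, pair in levels.items()
}

# Print the combinations from largest to smallest relative increase.
for (discipline, level), ratio in sorted(
    ratios.items(), key=lambda item: item[1], reverse=True
):
    print(f"{discipline:<10} {level:<8} x{ratio:.2f}")
```

A ratio near or above 2 marks a discipline and level where the share of As roughly doubled over the period.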

    UCR is probably not unusual in the respects I have described. However, if other people have similar analyses for their own institutions, I'd be interested to hear, especially if the pattern is different.

    I doubt, unfortunately, that students are actually performing that much better. UCR philosophy students in 2023 were not dramatically better at writing, critical thinking, and understanding historical material than were students in 2013. I conjecture that the main cause of grade inflation is institutional pressures toward easier grading.

    I see two institutional pressures toward higher grades and more relaxed standards:

    Teaching evaluations: Generally students give better teaching evaluations to professors from whom they expect better grades.[1] Other things being equal, a professor who gives few As will get worse evaluations than one who gives many As. Since professors' teaching is often judged in large part on student evaluations, professors will tend to be institutionally rewarded for giving higher grades, ensuring happier students who give them better evaluations. Professors who are easier graders, if this fact is known among the student body, will also tend to get higher enrollments.

    Graduation rates: At the institutional level, success is often evaluated in terms of graduation rates. If students fail to complete their degrees or take longer than expected to do so because they are struggling with classes, this looks bad for the institution. Thus, there is institutional pressure toward lower standards to ensure high levels of student graduation and "success".

    There are fewer countervailing institutional pressures toward higher rigor and more challenging grading schemes. If classes are too unrigorous, a school might risk losing its WASC accreditation, but few well-established colleges and universities are at genuine risk of losing their accreditation.

    At some point, the grade "A" loses its strength as a signal of excellence. If over 50% of students are receiving As, then an A is consistent with average performance. Yes, for some inspiring teachers and some amazing student groups, average performance might be truly excellent! But that's not the typical scenario.

    I have one positive suggestion for how to deal with grade inflation. But before I get to it, I want to mention one other striking phenomenon: the variation in the grade distributions between terms for what is nominally the same course. For example, here is the distribution chart for one of the lower division classes in UCR's Philosophy Department:

    [Chart: grade distributions by term for one lower division class in UCR's Philosophy Department]

    The distribution ranges from 11% As in Fall 2014 to 72% As in Fall 2020.

    Some departments in some universities have moved to standardized curricula and tests so that the same class in each term is taught and graded similarly. In philosophy, this is probably not the right approach, since different instructors can reasonably want to focus on different material, approached and graded differently. Still, that degree of term-by-term variation in what is nominally the same class raises issues of fairness to students.

    My suggestion is: sunlight. Let course grade distributions be widely shared and known.

    Sunlight won't solve everything -- far from it -- but I do think that in looking at students' teaching evaluations, seeing the professor's grade distribution provides valuable context that might disincentivize cynical strategies to inflate grades for good evaluations. I've evaluated teaching for teaching awards, for visiting instructors, and for my own colleagues, and I'm struck by how rare it is for information about grade distributions even to be supplied in the context of evaluating teaching. A full picture of a professor's teaching should include an understanding of the range of grades they are distributing and, ideally, random samples of tests and assignments that earn As and Bs and Cs. This situates us to better celebrate the work of professors with high standards and the students in their classes who live up to those high standards.

    Similarly, grade distributions should be made available at the departmental and institutional level. In combination with other evidence -- again, ideally random samples of assignments awarded A, B, and C -- this can help in evaluating the extent to which those departments and institutions are holding students to high standards.

    Student transcripts, too, might be better understood in the context of institutions' and departments' grading standards. This would allow viewers of the transcript to know whether a student's 3.7 GPA is a rare achievement in their institutional context, or simply average performance.

    --------------------------------------------------

    [1] A recent study suggests that grade satisfaction might be the primary driver of the correlation between students' expected grades and their course evaluations, rather than grading leniency per se -- these can come apart when a student is satisfied with their grade as a result of their hard work for it -- but grading leniency is an instructor's easiest path to generating student grade satisfaction, generating the institutional pressure.

    Friday, February 02, 2024

    Swallows and Moles in Philosophy

    In his review (in the journal Science -- cool!) of my recently released book, The Weirdness of the World, Edouard Machery writes:

    There are two kinds of philosophers: swallows and moles. Swallows love to soar and to entertain philosophical hypotheses at best loosely connected with empirical knowledge. Plato and Gottfried Leibniz are paradigmatic swallows. Moles, on the contrary, rummage through mundane facts about our world and aim at better understanding it. Aristotle, William James, and Hans Reichenbach are paradigmatic moles. Eric Schwitzgebel is unabashedly a swallow.

    Machery admits to having a mole's-eye view of the swallows. He praises the book, but he is frustrated by my admittedly wild speculations about radical skepticism, group consciousness, an infinite future, etc.

    Machery's goal in his own recent book Philosophy Within Its Proper Bounds was, he says, "to curtail the flights of fancy with which contemporary philosophers are enamored". The Weirdness of the World celebrates such flights of fancy -- so naturally, Machery and I are going to disagree about the value of wild philosophical speculation.

    Reading Machery's contrast of swallows and moles, I was immediately reminded of how the ancient Chinese philosopher Zhuangzi opens his Inner Chapters:

    There is a fish in the Northern Oblivion named Kun, and this Kun is quite huge, spanning who knows how many thousands of miles. He transforms into a bird named Peng, and this Peng has quite a back on him, stretching who knows how many thousands of miles. When he rouses himself and soars into the air, his wings are like clouds draped across the heavens. The oceans start to churn, and this bird begins his journey toward the Southern Oblivion....

    The quail laughs at him, saying, "Where does he think he's going? I leap into the air with all my might, but before I get farther than a few yards I drop to the ground. My twittering and fluttering between the branches is the utmost form of flying! So where does he think he's going?" (Ziporyn trans., p. 3-4).

    Zhuangzi is the swallowiest of swallows, soaring far beyond mundane empirical facts, wondering if life might be a dream, speculating about trees who measure eight thousand years as a single autumn, and celebrating "spirit men" with skin like ice and snow who eat only wind and dew, riding upon the air and clouds.

    Zhuangzi's quail, however, raises a good point: It's much clearer where you're going if you confine yourself to small hops between familiar branches. The Peng is neither practical nor grounded, and Zhuangzi's philosophy is arguably the same. Zhuangzi's friend Huizi scolds him: "Your words are... big and useless, which is why they are rejected by everyone who hears them" (Ziporyn trans., p. 8).

    In defense against Machery and the quail critique, I offer three thoughts:

    First, if anyone is going to speculate about wild possibilities concerning the fundamental nature of things, philosophers should be among them.

    It would be a sad, gray world if our reasoning was always confined to "proper bounds" and we couldn't reflect on issues like dream skepticism, group consciousness, and infinitude. Shouldn't it be part of the job description of philosophy to explore such ideas, considering what can or should be made of them?

    Such speculations needn't be entirely unconstrained by empirical facts, even if empirical science fails to deliver decisive answers. In The Weirdness of the World my speculations always start from empirical observation. My discussion of dream skepticism engages with the science of dreams; my discussion of group consciousness engages with the science of consciousness; my chapter on the possible infinite future -- collaborative with physicist and philosopher of physics Jacob Barandes -- is grounded in the standard working assumptions of mainstream physics. Scientifically informed philosophers are as well-positioned as anyone to speculate about wild hypotheticals that naturally intrigue us (at least some of us). To stand athwart such speculations, saying "Thou shalt not enter this epistemic wilderness!" is to reject an intrinsically valuable form of human philosophical curiosity.

    Second, we can distinguish two types of swallow: those confident that their wild hypotheses are correct and those who merely entertain and explore such hypotheses.

    Maybe Plato was convinced of the reality of Forms and the recollection theory of memory. Maybe Leibniz was convinced that the world was composed of monads in pre-established harmony. But Zhuangzi was a self-undermining skeptic who appears to have taken none of his wild speculations as established fact.

    I don't argue that the United States definitely has conscious experiences; I argue that if we accept standard materialist approaches to consciousness, they seem to imply that it does and that therefore we should take the idea seriously as a possibility. I don't argue that this is a dream or a short-term simulation; I argue that our ordinary culturally-given understanding of the world and mainstream scientific assumptions combine to justify assigning a non-trivial (maybe about 0.1%) credence to both of those possibilities. Barandes and I don't argue that there definitely is an infinite future in which future counterparts of you enact almost every possible action, but only that it follows from "certain not wholly implausible assumptions".

    When soaring in speculation far beyond the mundane local tree branches, doubt is appropriate. The most natural critique of swallows is that they appear to believe wild things on thin evidence. That critique is harder to sustain when the swallow explicitly treats the speculations as speculations only, rather than as established facts.

    Third, the swallow and the mole can collaborate -- even in the work of a single philosopher. As Jonathan Birch comments on my Facebook post linking to Machery's book review, two of Edouard's paradigmatic examples of moles -- Aristotle and William James -- are probably not best thought of as pure moles, but rather as swallow-moles. They dug around quite a bit in mundane empirical facts, yes. But they sometimes also soared with the swallows. Aristotle speculated on the existence of a supraphysical unmoved mover responsible for the existence of the physical world. James speculated about metaphysical "neutral monism" concerning mind and matter and celebrated religious belief beyond the evidence.

    I too have done a fair bit of mundane empirical work -- for example, on the moral behavior of ethics professors (e.g., here and here), on introspective method (e.g., here and here), and on the consequences of exposure to ethical argumentation (e.g., here and here). Even when I am not myself running the empirical studies, much of my work engages with nitty-gritty empirical detail (e.g., on the history of reports of coloration in dreams, on the cognitive capacities of garden snails, on the accuracy of visual imagery reports, and on psychological measures of well-being).

    Often, I think, deep empirical mole-digging is valuable for one's subsequent speculative soaring. Digging into the details of cosmological models enables better informed speculation about the distant future. Digging into the details of the behavior of ethics students and professors enables better informed speculation about the general relation between ethical reflection and ethical behavior. Digging into the details of dream reports enables better informed speculation about dream skepticism. As Zhuangzi imagines, a low-lying fish can transform into a soaring phoenix.

    No single researcher needs to do both the digging and the soaring, even if some of us enjoy both types of task. But it's valuable to have a whole ecosystem of moles and swallows, foxes and hedgehogs, ants and anteaters, truth philosophers and dare philosophers, and so on.

    I'm honored that Machery counts me among the swallows. I celebrate his moleishness. Let's dig and soar!

    Thursday, January 25, 2024

    Imagining Yourself in Another's Shoes vs. Extending Your Concern: Empirical and Ethical Differences

    [new paper in draft]

    The Golden Rule (do unto others as you would have others do unto you) isn't bad, exactly -- it can serve a valuable role -- but I think there's something more empirically and ethically attractive about the relatively underappreciated idea of "extension" found in the ancient Chinese philosopher Mengzi.

    The fundamental idea of extension, as I interpret it, is to notice the concern one naturally has for nearby others -- whether they are relationally near (like close family members) or spatially near (like Mengzi's child about to fall into a well or Peter Singer's child you see drowning in a shallow pond) -- and, attending to relevant similarities between those nearby cases and more distant cases, to extend your concern to the more distant cases.

    I see three primary advantages to extension over the Golden Rule (not that these constitute an exhaustive list of means of moral expansion!).

    (1.) Developmentally and cognitively, extension is less complex. The Golden Rule, properly implemented, involves imagining yourself in another's shoes, then considering what you would want if you were them. This involves a non-trivial amount of "theory of mind" and hypothetical reasoning. You must notice how others' beliefs, desires, and other mental states relevantly differ from yours, then you must imagine yourself hypothetically having those different mental states, and then you must assess what you would want in that hypothetical case. In some cases, there might not even be a fact of the matter about what you would want. (As an extreme example, imagine applying the Golden Rule to an award-winning show poodle. Is there a fact of the matter about what you would want if you were an award-winning show poodle?) Mengzian extension seems cognitively simpler: Notice that you are concerned about nearby person X and want W for them, notice that more distant person Y is relevantly similar, and come to want W for them also. This resembles ordinary generalization between relevant cases: This wine should be treated this way, therefore other similar wines should be treated similarly; such-and-such is a good way to treat this person, so such-and-such is probably also a good way to treat this other similar person.

    (2.) Empirically, extension is a more promising method for expanding one's moral concern. Plausibly, it's more of a motivational leap to go from concern about self to concern about distant others (Golden Rule) than to go from concern for nearby others to similar more distant others (Mengzian Extension). When aid agencies appeal for charitable donations, they don't typically ask people to imagine what they would want if they were living in poverty. Instead, they tend to show pictures of children, drawing upon our natural concern for children and inviting us to extend that concern to the target group. Also -- as I plan to discuss in more detail in a post next month -- in the "argument contest" Fiery Cushman and I ran back in 2020, the arguments most successful in inspiring charitable donation employed Mengzian extension techniques, while appeals to "other's shoes" style reasoning did not tend to predict higher levels of donation than did the average argument.

    (3.) Ethically, it's more attractive to ground concern for distant others in the extension of concern for nearby others than in hypothetical self-interest. Although there's something attractive about caring for others because you can imagine what you would want if you were them, there's also something a bit... self-centered? egoistic? ... about grounding other-concern in hypothetical self-concern. Rousseau writes: "love of men derived from love of self is the principle of human justice" (Emile, Bloom trans., p. 235). Mengzi or Confucius would never say this! In Mengzian extension, it is ethically admirable concern for nearby others that is the root of concern for more distant others. Appealingly, I think, the focus is on broadening one's admirable ethical impulses, rather than hypothetical self-interest.

    [ChatGPT4's rendering of Mengzi's example of a child about to fall into a well, with a concerned onlooker; I prefer Helen De Cruz's version]

    My new paper on this -- forthcoming in Daedalus -- is circulating today. As always, comments, objections, corrections, connections welcome, either as comments on this post, on social media, or by email.

    Abstract:

    According to the Golden Rule, you should do unto others as you would have others do unto you. Similarly, people are often exhorted to "imagine themselves in another's shoes." A related but contrasting approach to moral expansion traces back to the ancient Chinese philosopher Mengzi, who urges us to "extend" our concern for those nearby to more distant people. Other approaches to moral expansion involve: attending to the good consequences for oneself of caring for others, expanding one's sense of self, expanding one's sense of community, attending to others' morally relevant properties, and learning by doing. About all such approaches, we can ask three types of question: To what extent do people in fact (e.g., developmentally) broaden and deepen their care for others by these different methods? To what extent do these different methods differ in ethical merit? And how effectively do these different methods produce appropriate care?

    Tuesday, January 16, 2024

    The Weirdness of the World: Release Day and Introduction

    Today is the official U.S. release day of my newest book, The Weirdness of the World!

    As a teaser, here's the introduction:

    In Praise of Weirdness

    The weird sisters, hand in hand,
    Posters of the sea and land,
    Thus do go about, about:
    Thrice to thine and thrice to mine
    And thrice again, to make up nine.
    Peace! the charm’s wound up.
    —Shakespeare, Macbeth, Act I, scene iii

    Weird often saveth
    The undoomed hero if doughty his valor!
    —Beowulf, X.14–15, translated by J. Lesslie Hall


    The word “weird” has deep roots in Old English, originally as a noun for fate or magic, later evolving toward its present use as an adjective for the uncanny or peculiar. By the 1980s, it had fruited as the choicest middle-school insult against unstylish kids like me who spent their free time playing with figurines of wizards and listening to obscure science fiction radio shows. If the “normal” is the conventional, ordinary, and readily understood, the weird is what defies that.

    The world is weird -- deeply, pervasively so, weird to its core, or so I will argue in this book. Among the weirdest things about Earth is that certain complex bags of mostly water can pause to reflect on the most fundamental questions there are. We can philosophize to the limits of our comprehension and peer into the fog beyond those limits. We can contemplate the foundations of reality, and the basis of our understanding of those foundations, and the necessary conditions of the basis of our understanding of those foundations, and so on, trying always to peer behind the next curtain, even with no clear method and no great hope of a satisfying end to the inquiry. In this respect, we vastly outgeek bluebirds and kangaroos and are rightly a source of amazement to ourselves.

    I will argue that careful inquiry into fundamental questions about consciousness and cosmology reveals not a set of readily comprehensible answers but instead a complex blossoming of bizarre possibilities. These possibilities compete with one another, or combine in non-obvious ways. Philosophical and cosmological inquiry teaches us that something radically contrary to common sense must be true about the fundamental structures of the mind and the world, while leaving us poorly equipped to determine where exactly the truth lies among the various weird possibilities.

    We needn’t feel disappointed by this outcome. The world is richer and more interesting for escaping our understanding. How boring it would be if everything made sense!

    1. My Weird Thesis

    Consider three huge questions: What is the fundamental structure of the cosmos? How does human consciousness fit into it? What should we value? What I will argue in this book -- with emphasis on the first two questions but also sometimes touching on the third -- is (1) the answers to these questions are currently beyond our capacity to know, and (2) we do nonetheless know at least this: Whatever the truth is, it’s weird. Careful reflection will reveal that every viable theory on these grand topics is both bizarre and dubious. In chapter 2 (“Universal Bizarreness and Universal Dubiety”), I will call this the Universal Bizarreness thesis and the Universal Dubiety thesis. Something that seems almost too preposterous to believe must be true, but we lack the means to resolve which of the various preposterous-seeming options is in fact correct. If you’ve ever wondered why every wide-ranging, foundations-minded philosopher in the history of Earth has held bizarre metaphysical or cosmological views (I challenge you to find an exception!) -- with each philosopher holding, seemingly, a different set of bizarre views -- chapter 2 offers an explanation.

    I will argue that every approach to cosmology and consciousness has implications that run strikingly contrary to mainstream “common sense” and that, partly in consequence, we ought to hold such theories only tentatively. Sometimes we can be justified in simply abandoning what we previously thought of as common sense, when we have firm scientific grounds for thinking otherwise; but questions of the sort I explore in this book test the limits of scientific inquiry. Concerning such matters, nothing is firm -- neither common sense, nor science, nor any of our other epistemic tools. The nature and value of scientific inquiry itself rely on disputable assumptions about the fundamental structure of the mind and the world, as I discuss in chapters on skepticism (chapter 4), idealism (chapter 5), and whether the external world exists (chapter 6).

    On a philosopher’s time scale -- where a few decades ago is “recent” and a few decades from now is “soon” -- we live in a time of change, with cosmological theories and theories of consciousness rising and receding in popularity based mainly on broad promise and what captures researchers’ imaginations. We ought not trust that the current range of mainstream theories will closely resemble the range in a hundred years, much less the actual truth.

    2. Varieties of Cosmological Weirdness

    To establish that the world is cosmologically weird, maybe all that is needed is relativity theory and quantum mechanics.

    According to relativity theory, if your twin accelerates away from you at very high speed, then returns, much less time will have passed for the traveler than for you who stayed here on Earth -- the so-called Twin Paradox. According to the most straightforward interpretation of quantum mechanics, if you observe what we ordinarily consider to be a chance event, there’s also an equally real, equally existing version of you in another “world” who shares your past but for whom the event turned out differently. (Or maybe your act of observation caused the event to turn out one way rather than the other, or maybe some other bizarre thing is true, depending on the correct interpretation of quantum mechanics, but it’s widely accepted that there are no non-bizarre interpretations.) So if you observe the chance decay of a uranium atom, for example, there’s another world branching off from this one, containing a counterpart of you who observes the atom not to have decayed. If we accept that view, then the cosmos contains a myriad of different, equally real worlds, each with different versions of you and your friends and everything you know, all splitting off from a common past.

    I won’t dwell on those particular cosmological peculiarities, since they are familiar to academic readers and well handled elsewhere. However, some equally fundamental cosmological issues are typically addressed by philosophers rather than scientific cosmologists.

    One is the possibility that the cosmos is nowhere near as large as we ordinarily assume -- perhaps just you and your immediate environment (chapter 4) or perhaps even just your own mind and nothing else (chapter 6). Although these possibilities might appear unlikely, they are worth considering seriously, to assess how confident we ought to be in their falsity, and on what grounds. I will argue that it’s reasonable not to entirely dismiss such skeptical possibilities. Alternatively, and more in line with mainstream physical theory, the cosmos might be infinite, which brings its own train of bizarre consequences (chapter 7).

    Another possibility is that we live inside a simulated reality or a pocket universe, embedded in a much larger structure about which we know virtually nothing (chapters 4 and 5). Yet another possibility is that our experience of three-dimensional spatiality is a product of our own minds that doesn’t reflect the underlying structure of reality (chapter 5) or that our sensory experience maps only loosely onto the underlying structure of reality (chapter 9).

    Still another set of questions concerns the relationship of mind to cosmos. Is conscious experience abundant in the universe, or does it require the delicate coordination of rare events (chapter 10)? Is consciousness purely a matter of having the right physical structure, or might it require something non-physical (chapter 2)? Under what conditions might a group of organisms give rise to group-level consciousness (chapter 3)? What would it take to build a conscious machine, if that is possible at all -- and what should we do if we don’t know whether we have succeeded (chapter 11)?

    In each of our heads there are about as many neurons as stars in our galaxy, and each neuron is arguably more structurally complex than any star system that does not contain life. There is as much complexity and mystery inside as out.

    The repeated theme: In the most fundamental matters of consciousness and cosmology, neither common sense, nor early twenty-first-century empirical science, nor armchair philosophical theorizing is entirely trustworthy. The rational response is to distribute our credence across a wide range of bizarre options.

    Each chapter is meant to be separately comprehensible. Please feel free to skip ahead, reading any subset of them in any order.

    3. Philosophy That Closes versus Philosophy That Opens

    You are reading a philosophy book -- voluntarily, let’s suppose. Why? Some people read philosophy because they believe it reveals profound, fundamental truths about the way the world really is and the one right manner to live. Others like the beauty of grand philosophical systems. Still others like the clever back-and-forth of philosophical dispute. What I like most is none of these. I love philosophy best when it opens my mind -- when it reveals ways the world could be, possible approaches to life, lenses through which I might see and value things around me, which I might not otherwise have considered.

    Philosophy can aim to open or to close. Suppose you enter Philosophical Topic X imagining three viable, mutually exclusive possibilities, A, B, and C. The philosophy of closing aims to reduce the three to one. It aims to convince you that possibility A is correct and the others wrong. If it succeeds, you know the truth about Topic X: A is the answer! In contrast, the philosophy of opening aims to add new possibilities to the mix -- possibilities that you hadn’t considered before or had considered but too quickly dismissed. Instead of reducing three to one, three grows to maybe five, with new possibilities D and E. We can learn by addition as well as subtraction. We can learn that the range of viable possibilities is broader than we had assumed.

    For me, the greatest philosophical thrill is realizing that something I’d long taken for granted might not be true, that some “obvious” apparent truth is in fact doubtable -- not just abstractly and hypothetically doubtable, but really, seriously, in-my-gut doubtable. The ground shifts beneath me. Where I’d thought there would be floor, there is instead open space I hadn’t previously seen. My mind spins in new, unfamiliar directions. I wonder, and the world itself seems to glow with a new wondrousness. The cosmos expands, bigger with possibility, more complex, more unfathomable. I feel small and confused, but in a good way.

    Let’s test the boundaries of the best current work in science and philosophy. Let’s launch ourselves at questions monstrously large and formidable. Let’s contemplate these questions carefully, with serious scholarly rigor, pushing against the edge of human knowledge. That is an intrinsically worthwhile activity, worth some of our time in a society generous enough to permit us such time, even if the answers elude us.

    My middle-school self who used dice and thrift-shop costumes to imagine astronauts and wizards is now a middle-aged philosopher who uses twenty-first-century science and philosophy to imagine the shape of the cosmos and the magic of consciousness. Join me! If doughty our valor, mayhap the weird saveth us.

    Friday, January 12, 2024

    Demographic Trends in the U.S. Philosophy Major, 2001-2022 -- Including Total Majors, Second Majors, Gender, and Race

    I'm preparing for an Eastern APA session on the "State of Philosophy" next Thursday, and I thought I'd share some data on philosophy major bachelor's degree completions from the National Center for Education Statistics IPEDS database, which compiles data on virtually all students graduating from accredited colleges and universities in the U.S., as reported by administrators.

    I examined all data from the 2000-2001 academic year (the first year in which they started recording data on second majors) through 2021-2022 (the most recent available year).

    Total Numbers of Philosophy Majors: The Decline Has Stopped

    First, the sharp decline in philosophy majors since 2013 has stopped:

    2001:  5836
    2002:  6529
    2003:  7023
    2004:  7707
    2005:  8283
    2006:  8532
    2007:  8541
    2008:  8778
    2009:  8996
    2010:  9268
    2011:  9292
    2012:  9362
    2013:  9427
    2014:  8820
    2015:  8184
    2016:  7489
    2017:  7572
    2018:  7667
    2019:  8074
    2020:  8209
    2021:  8328
    2022:  7958

    (The decline between 2021 and 2022 reflects a general decline in completions of bachelor's degrees due to the pandemic that year, rather than a trend specific to philosophy.)
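    As a rough sketch of how the trend in the list above can be quantified, the following Python snippet computes percent changes between selected years. The counts are copied from the list; the function name and the choice of comparison years are illustrative assumptions, not part of the original analysis.

```python
# Philosophy bachelor's completions by year, copied from the list above.
completions = {
    2001: 5836, 2002: 6529, 2003: 7023, 2004: 7707, 2005: 8283,
    2006: 8532, 2007: 8541, 2008: 8778, 2009: 8996, 2010: 9268,
    2011: 9292, 2012: 9362, 2013: 9427, 2014: 8820, 2015: 8184,
    2016: 7489, 2017: 7572, 2018: 7667, 2019: 8074, 2020: 8209,
    2021: 8328, 2022: 7958,
}


def pct_change(start_year, end_year):
    """Percent change in completions between two years (hypothetical helper)."""
    start = completions[start_year]
    end = completions[end_year]
    return 100 * (end - start) / start


# Year with the most completions, and the lowest year after that peak.
peak_year = max(completions, key=completions.get)
trough_year = min(
    (year for year in completions if year > peak_year),
    key=completions.get,
)

print(f"Peak year: {peak_year}, lowest later year: {trough_year}")
print(f"Peak to trough: {pct_change(peak_year, trough_year):.1f}%")
print(f"Trough to 2021: {pct_change(trough_year, 2021):.1f}%")
print(f"2021 to 2022: {pct_change(2021, 2022):.1f}%")
```

    Comparing the peak-to-trough change with the change since the trough is one simple way to check that the decline has leveled off.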

    In general, the humanities have declined sharply since 2010, and history, English, and foreign languages and literature continue to decline.  This graph shows the trend:
    [click image to enlarge and clarify]

    The decline in the English major is particularly striking, from 4.5% of bachelor's degrees awarded in 2000-2001 to 1.8% in 2021-2022.  Philosophy peaked at 0.60% in 2005-2006 and has held steady at 0.39%-0.40% since 2015-2016.

    Philosophy Relies on Double Majors

    [Expanded and edited for clarity, Jan 15] Breaking the data down by first major vs second major, we can see that over time an increasing proportion of students have philosophy as their second major.  In some schools, the distinction between "first major" and "second major" is meaningful, with the first indicating the primary major.  In other schools the distinction is not meaningful.  In the 2021-2022 academic year, 24% of students who took a bachelor's degree in philosophy had it listed as their second major.

    [click image to enlarge and clarify]

    From these numbers we can estimate that philosophy students are at least moderately likely to be double majors.  While it's impossible to know what percentage of students who took philosophy as their first major also carried a second major, a ballpark estimate might assume that about half of students with philosophy plus one other major list philosophy first rather than second.  If so, then approximately half of all philosophy majors (48%) are double majors.  Overall, across all majors, only 5% of students double majored.
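    The ballpark estimate above can be written out as a small calculation. This is only a sketch: the function name and the `listed_first_fraction` parameter are hypothetical, and the 0.5 default simply encodes the assumption stated above that about half of philosophy double majors list philosophy first.

```python
def estimate_double_major_share(second_major_share, listed_first_fraction=0.5):
    """Estimate the share of philosophy majors who are double majors.

    second_major_share: fraction of philosophy degrees recorded as a
        second major (0.24 for 2021-2022, per the text above).
    listed_first_fraction: ASSUMED fraction of philosophy double majors
        who list philosophy as their first major.
    """
    if not 0 <= listed_first_fraction < 1:
        raise ValueError("listed_first_fraction must be in [0, 1)")
    # Students with philosophy listed second make up (1 - f) of all
    # philosophy double majors, so the total is second_share / (1 - f).
    return second_major_share / (1 - listed_first_fraction)


# With the half-and-half assumption, 24% second majors implies 48% double majors.
baseline = estimate_double_major_share(0.24)
print(f"Estimated double-major share: {baseline:.0%}")

# The estimate moves with the assumed fraction.
for fraction in (0.4, 0.5, 0.6):
    share = estimate_double_major_share(0.24, fraction)
    print(f"listed_first_fraction={fraction}: {share:.0%}")
```

    Varying the assumed fraction shows how much the 48% figure depends on the half-and-half assumption.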

    The ease of double majoring is likely to influence the number of students who choose philosophy as a major.

    Gender Disparity Is Decreasing

    NCES classifies all students as men or women, with no nonbinary category and no unclassified students.  From the beginning of the available data in the 1980s through the mid-2010s, the percentage of women among philosophy bachelor's recipients hovered steadily between 30% and 34%, not changing even as the percentage of women among all bachelor's recipients increased from 51% to 57%.  However, the last several years have seen a clear decrease in gender disparity, with women now earning 41% of philosophy degrees.

    [click image to enlarge and clarify]

    Black Students Remain Underrepresented in Philosophy Compared to Undergraduates Overall, and Other Race/Ethnicity Data

    NCES uses the following race/ethnicity categories: U.S. nonresident, race/ethnicity unknown, Hispanic or Latino (any race), and among U.S. residents who are not Hispanic or Latino: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, and two or more races.  Before 2007-2008, Native Hawaiian or Other Pacific Islander was included with Asian, and the separate category was reported inconsistently until 2010-2011.  The two-or-more races option was also introduced in the 2007-2008 academic year, again with inconsistent reporting for several years.

    I've charted these categories below.  As you can see, for most categories, the percentages are similar for philosophy and for graduates overall, except that non-Hispanic White is slightly higher for philosophy and non-Hispanic Black significantly lower. In 2021-2022, non-Hispanic Black people were 14% of the U.S. population age 18-24, 10% of bachelor's degree recipients, and 6% of philosophy bachelor's recipients.

    [as usual, click the figures to expand and clarify]

    I interpret the sharp increase in multi-racial students as reflecting reporting issues and an increasing willingness of students to identify as multi-racial.

    It's also worth noting that although philosophy majors are approximately as likely to be Hispanic/Latino as graduates overall, Hispanic/Latino students are underrepresented among bachelor's degree recipients relative to the U.S. population age 18-24 (17% vs 23%). Non-Hispanic American Indian / Alaska Native students are also underrepresented among overall graduates (0.46% vs. 0.84% of the population age 18-24), and maybe particularly so in philosophy (0.37% vs 0.46% in the most recent year).
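    One way to put the percentages quoted in this section on a common scale is a simple representation ratio: a group's share in one population divided by its share in a reference population, where 1.0 would mean parity. The sketch below uses only the figures quoted above; the ratio itself and the helper name are illustrative assumptions, not a measure used in the original analysis.

```python
def representation_ratio(share_in_group, share_in_reference):
    """Ratio of a share to its reference share; 1.0 means parity (hypothetical helper)."""
    return share_in_group / share_in_reference


# (share, reference share) in percent, taken from the text above.
figures = {
    "Black, bachelor's vs. population 18-24": (10, 14),
    "Black, philosophy vs. bachelor's": (6, 10),
    "Hispanic/Latino, bachelor's vs. population 18-24": (17, 23),
    "AI/AN, bachelor's vs. population 18-24": (0.46, 0.84),
    "AI/AN, philosophy vs. bachelor's": (0.37, 0.46),
}

ratios = {
    label: representation_ratio(share, reference)
    for label, (share, reference) in figures.items()
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}")
```

    Because the inputs are rounded percentages, the resulting ratios are approximate, especially for the smallest groups.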