
AI in the cozyweb

From the all-seeing eye of Google, to the AI in every chat room

TLDR

Google search and power-law-follower-count social networks have driven a convergence of the global information hierarchy, a context collapse that gives an appearance of objectively-ordered information. In the search for authentic expression, people are defecting away from these globalised spaces to the “cozyweb”, a loose archipelago of private chats and forums, where context is restored and communication is more meaningful.

AI will change this in two ways: it will fill the public web with generated content, further diminishing the possibility for human connection in public online spaces; and it will accelerate the development of private online cultures by providing them with the ability to ask, answer, and discuss questions about which the AI has information, but within the idioms and norms of their culture.

What if the web has become an increasingly hostile environment for people to express themselves? What if, following that, honest and sincere discussion has migrated away from the public web, to loose networks of private chats and pseudonymous forums?

This is the thesis of the “cozy web”, originated at Ribbonfarm and explained here by Maggie Appleton. As she puts it, “[w]e create tiny underground burrows of Slack channels, WhatsApp groups, Discord chats, and Telegram streams that offer shelter and respite from the aggressively public nature of Facebook, Twitter, and every recruiter looking to connect on LinkedIn”.

The web is characterised as a “dark forest”—like a forest at night, prey creatures hide in the trees and burrows, while only predators remain awake, lurking in wait for something to move. On the web, we’re the prey, and we’re learning not to make too much noise. We can still share the inner details of our lives, but we will do so only in private chats, approval-required locked accounts, or on alt accounts which provide us with enough pseudonymity that we can maintain some separation between our public and private selves.

Social networks are no longer places to share pictures of what you had for breakfast, but sites of culture war that can, seemingly, drive political movements and determine elections; what you post can lead to you losing your job, or not being hired at all; many governments now ask for social media account details as part of the visa-granting process; every link you click is a chance for someone to log your activity and sell the right to market things to you. We’re being watched, whether we like it or not.

In a follow-up article, Maggie asks, in effect, what happens when the people are gone? When the retreat to the cozy web is complete, and the dark forest is all that is left of the public web?

In Maggie’s view, the gap left behind is going to be filled by generative AI. More and more of the content you see on public websites will not be the output of individual humans trying to create something that other people might want to read, view, or listen to. Instead, it will be—wholly or in part—fabricated by generative AIs. In the very short run, this will involve real people turning to ChatGPT to ghost-write their blog posts or even their tweets. Sooner or later we will move beyond this, with virtual AI personas producing content under their “own” names. Because it’s so cheap to produce, there can be lots of it. Because it’s a good enough facsimile of human communication, it will be hard to identify. Many of us will be fooled, or will not care if we’re being fooled, because we like what the AI is telling us.

There are, of course, upsides to this. Some people will like the content that is produced, and there will, after all, be lots of it. Perhaps some otherwise unprofitable niches will be filled. But outside of pure entertainment, it does seem like we’re still going to care about who produces the things we consume, and perhaps even how or why they do so. We would feel quite differently about something produced by an AI than about something produced by an identifiable human, or a group of collaborators.

The rest of Maggie’s essay examines ways in which humans can respond. In particular, how can we pass the “reverse Turing test”, proving to strangers that we’re not computers, but real humans trying to communicate? She makes several suggestions, and although no single idea is likely to solve the problem, they are each applicable in some circumstances, and could be combined to be more effective. She discusses this further on Bryan Kam’s podcast, and Bryan dives into the philosophical implications with Isabela Granic in a later episode.

I find that I agree with Maggie’s big-picture analysis: conversation will continue to migrate from the public web to the cozy-web, at accelerating rates. However, I have a different take on the role that AI will play in this process. In my view, AI is going to be deeply embedded within the cozy-web.

We can very loosely define the public web as being “anything that might show up on a Google search”. “Can you Google it?” is a reasonable proxy for whether a thing has objective existence, and PageRank even gives us the appearance of an objective global ranking of the importance of different sources for a topic. Being “#1 on Google”, or at least on the first page of search results, has been a goal for SEO marketers for over two decades now. Those top spots are worth a lot of money. And, with only a few exceptions, the results are the same for everyone.
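To make the “objective ranking” point concrete, here is a toy PageRank sketch (my own illustration, not Google’s implementation; the four-page link graph is invented). Whatever pages you feed it, the algorithm produces a single global ordering, identical for every searcher.

```python
import numpy as np

# Hypothetical link graph: links[i] lists the pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = len(links)
damping = 0.85

# Column-stochastic transition matrix: each page spreads its rank over its out-links.
M = np.zeros((n, n))
for src, outs in links.items():
    for dst in outs:
        M[dst, src] = 1.0 / len(outs)

# Power iteration: start uniform, repeatedly follow links, with occasional random jumps.
rank = np.full(n, 1.0 / n)
for _ in range(50):
    rank = (1 - damping) / n + damping * M @ rank

print(np.argsort(-rank))  # one global ranking, the same for every user
```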

In truth, Google does vary the order of search results, based on broad criteria like location and language preference. “Right to be forgotten” laws mean that some results appear in some places and not others. There is some individual-level personalisation, but not much. The reason is simple economics: it costs a lot of money to crawl the entire web and to compute the relative importance of each search result, and it would cost a lot more to personalise those results for each individual.

The near-objectivity of Google search has a kind of inherent value to it, as it reinforces a social consensus about which sources of information are important (which is not always the same thing as true). You can tell someone “I found it on Google by searching for [x]”, confident that if they do the same search then they will find the same result. This is useful even if the results aren’t individually optimal.

However, with objectivity comes a kind of impersonality. Google for you is the same as Google for me. Google answers medical queries in the same ways to consultant surgeons as it does to their patients. In “organizing the world’s information and making it universally accessible and useful”, Google also flattens it, discarding valuable context. Part of the appeal of the cozyweb is that what you do and say there isn’t flattened in this way; the denizens of the cozyweb are expected to have a shared context, and can talk to each other using short-hand phrases, memes, and private jargon. This allows for better communication for simple information-theoretic reasons: the more context two agents share, the more meaning they can communicate in the same number of bits. Communities differ not just in their aesthetics, but in their ethical and epistemic norms, and Google has no meaningful concept of these differences.
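A toy illustration of that information-theoretic point (my own sketch, with an invented “codebook” standing in for a community’s shared memes and in-jokes): when two parties share context, a message can be a short reference; without it, the full text has to be sent.

```python
# Parties who share a codebook can point at a whole shared memory with a
# single byte; strangers have to spell everything out.
shared_codebook = {
    0: "the long-running argument about tabs versus spaces, round 47",
    1: "that week the group chat kept renaming itself",
}

def send_with_context(index: int) -> bytes:
    return bytes([index])  # one byte selects an entry both sides already know

def send_without_context(message: str) -> bytes:
    return message.encode("utf-8")  # no shared context: transmit the full text

msg = shared_codebook[0]
print(len(send_with_context(0)), "byte with shared context")
print(len(send_without_context(msg)), "bytes without shared context")
```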

For this reason, it seems like AI is much better-suited to the cozyweb. To Google, the cozyweb doesn’t really make any sense. Leaving aside the fact that Google’s web crawler can’t even see inside your group chat, there’s something about the structure of private communities that is intrinsically difficult to fit into the model of a global information hierarchy. An AI, however, can be trained and fine-tuned on the specific patterns, idioms, and perspectives of a community. This means that the AI can know more both about individuals and about community norms. The AI-for-doctors would answer medical queries in a different way than the AI-for-patients would.

This works because of the concept of foundation models. A foundation model is an AI model that is trained on a very large data set, with the aim of producing generally-useful capabilities. GPT-4 and Stable Diffusion are foundation models, aiming to be generally good at a whole class of things. However, it’s possible to take these foundation models and fine-tune them by adjusting their weights. The LoRA technique has shown that, once you have a foundation model, the cost of fine-tuning it is much lower than the cost of the initial training, and the result is a model that is much better at producing certain kinds of text or images than the foundation model alone.
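As a rough sketch of what that looks like in practice (using the Hugging Face transformers and peft libraries; the model name, hyperparameters, and training data are placeholders, not a recommendation), LoRA freezes the foundation model and trains only small low-rank adapter matrices:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # any open foundation model would do
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects small low-rank matrices into the attention projections;
# only these are trained, so the cost is a fraction of full fine-tuning.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the weights

# ...then train as usual (e.g. with transformers.Trainer) on the community's own text.
```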

This lets AIs do the thing that Google search can’t easily do: adapt to the specific context of a community or an individual. By fine-tuning a model with specific information, the community can create the AI that is most useful to them. Two AIs starting from the same foundation model could end up giving very different answers to different people, and this divergence is a feature rather than a bug. Whereas the universality of Google drives convergence at the cost of context collapse, fine-tuneable AIs diverge in order to give the most appropriate outputs.
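Continuing the sketch above (the adapter paths and prompt are hypothetical), the divergence is quite literal: the same frozen foundation model can carry a different LoRA adapter for each community, and the same question comes back in each community’s own register.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
foundation = AutoModelForCausalLM.from_pretrained(base)

# One community's adapter; the AI-for-patients would load its own adapter
# onto its own copy of the same foundation model.
doctors_ai = PeftModel.from_pretrained(foundation, "adapters/for-doctors")

prompt = "What should I know about statins?"
inputs = tokenizer(prompt, return_tensors="pt")
output = doctors_ai.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```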

For this to work, it has to be possible to fine-tune the model, and right now this isn’t always possible. Open source models like Stable Diffusion are fine-tuneable, even on a home PC (with a beefy GPU, admittedly). Closed-source models like GPT-4 aren’t fine-tuneable by users, although it seems that Microsoft was given access to fine-tune the model for use in Bing Chat (with some… interesting results).

This gives us two ways forward: one in which the fine-tuning is done by a company like OpenAI, which means handing over a lot of personal or community data; and another in which individuals and communities control both their own data and the fine-tuned models. The latter requires open source models like Stable Diffusion.

If we were back in the 90s, hearts and minds brimming with optimism for the bold future of the internet, I imagine we wouldn’t worry too much about the difference between the two. But if the intervening quarter-century has taught us anything, it’s that ownership and control of data matters. “An AI in every chat room” becomes a very different proposition depending on whether that AI is open or closed source, and whether the fine-tuning data is owned and controlled by the same vendor that created the foundation model or by a provider of the community’s choice.

The other big difference from the 90s is that the commercial and business implications are much easier to grasp now. In a way, the prototypical cozyweb is the corporate intranet; corporations are definitely going to want AIs that are specific to their culture and operations. If you want to build a business around fine-tuning foundation models for specific communities, your first customers will be enterprises who care about keeping their corporate DNA private.

For individuals and communities, perhaps this is the moment when decentralised technology becomes feasible. If your community has at least a few gamers in it, then you probably have the collective GPU power to fine-tune your models—and within a few years it might be possible to do this on smartphones.
