Mattias Rost

Associate professor in Interaction Design

From Autonomy to Intent Alignment

Posted on 2025-06-20

I’m picking up on a shift in the narrative around AI agents. It’s subtle, but it’s there. For the past year or so, the dominant story has been about autonomy. Agents that act on our behalf, automate our workflows, and handle tasks end-to-end without our involvement. There’s been talk of 2025 as “the year of the agent”, a moment when AI systems would begin to replace human effort at scale. But what I see emerging instead is something quieter, and potentially more transformative: a move away from autonomy and toward human alignment. Not agents that replace us, but systems that collaborate with us. Not full delegation, but intent alignment through interaction.

The Autonomy Narrative

Autonomy has been the dominant framing. Agents were imagined as machines that could act independently, completing tasks without human input. Essentially replacing us in certain workflows. The appeal is obvious. You tell the machine what to do, and it gets it done. The engineering challenge has been to make these agents robust enough to handle edge cases, interpret instructions correctly, and recover when things go wrong. But this vision also leans heavily on the idea that human involvement is a bottleneck and something to be removed.

But I think this framing is starting to break. What I’m seeing instead is a narrative slowly turning away from full autonomy and towards something more nuanced: the idea that machines can now handle more complex tasks, but not all tasks. And not in isolation. The hard part isn’t just getting the machine to do the thing. It’s getting it to do the thing in a way that makes sense to us, in context, as part of an ongoing process. This turns the problem from an engineering problem into a design problem.

A Shift Toward Collaboration

This shift isn’t loud. It’s not dominating headlines. But it’s meaningful. It is closer to something we’ve seen before: augmentation instead of automation. But even augmentation doesn’t quite capture it. There’s a difference between tools that make us better at what we do and systems that can interpret intent, generate meaningful results, and adapt in conversation with us. It’s less like using a better tool and more like working with a new kind of collaborator.

And that changes the kind of problem we’re dealing with. It’s no longer just about getting the technology to work. It’s about how we work with it. This isn’t just an engineering problem. It’s a design problem. But not interface design in the traditional sense. This is closer to interaction design as relationship design. How we build patterns of engagement, feedback, and co-responsibility. The machines are becoming more capable of producing outcomes that make sense to us directly. That opens up a different kind of design space, one that feels closer to how we design workflows between people.

Systems That Make Sense in Use

This kind of interaction, where the system responds not just to commands, but to context, to intent, to ongoing feedback, starts to resemble how we work with other people. It’s not that the machine understands us in any deep human sense, but that it can interpret enough of our intent to stay in sync with what we’re trying to do. That’s new. And it opens the door to rethinking how we design for human–machine collaboration. Not as a question of interface layout or control, but of coordination, mutual adjustment, and shared activity.

It’s not completely new, of course. There’s been work on AI co-creation for years. Especially in the arts, and more recently in software development. But I think this way of thinking needs to move beyond those domains. If we’re serious about “agents” as the next step in AI, we need to stop imagining them as little autonomous workers and start thinking of them as collaborators. Partners in a process. Not general intelligence, not independent actors. But systems that become useful through interaction.

Intent Alignment Through Interaction

If anything, the move toward “autonomous agents” has masked how much interpretive labor is still required to make these systems actually do what we want. What’s happening now is that more of that interpretive work is being folded into the system itself. Not perfectly, but increasingly well. That’s why I think this moment is not about achieving autonomy, but about deepening collaboration. It’s about aligning intent through interaction.

If this is where things are headed, then the real question isn’t “how autonomous can we make agents?” but rather “how do we want to interoperate with them?” What kinds of interactions support alignment? What kinds of feedback loops actually help the system understand what we mean? These are not just technical challenges, but questions for interaction design.

From Capability to Sensibility

And perhaps most importantly, they’re questions of sensibility. Because not everything can or should be handed off to a machine. There are forms of judgment, care, attention, and context-awareness that aren’t easily captured in prompts or goals. Machines can accomplish a surprising number of tasks now, but that doesn’t mean they can step fully into the human roles those tasks once sat within. That’s why I think this shift is important: it’s a move away from pretending machines can replace us, and toward exploring how they can work with us in meaningful ways.

So let’s not think about autonomy when thinking about agents. Think about collaboration. Think about designing for co-creation, for systems that stay in the loop, interpret intent, and contribute to the work, without ever stepping outside the relationship. And I think this is the world we’re already entering. If you’re not already co-creating with AI, you probably should be. Because that’s not just the future. That’s the shift that’s happening right now.

Beyond Generation - Why “Interpretive AI” Is a Better Name for LLMs

Posted on 2025-05-27

The phrase Generative AI (or GenAI) has rapidly become the go-to term for the class of machine learning systems capable of producing media: text, images, music, and video. It is catchy, easy to remember, and broadly descriptive. However, when applied to large language models (LLMs), the term does more harm than good.

LLMs like GPT-4o, Claude, and others do not simply generate plausible text. They interpret. They respond. They situate their outputs in relation to complex and evolving input contexts. A better name, I argue, is Interpretive AI.

What Makes LLMs Different?

Generative image or music models typically synthesize outputs based on a latent space learned from training data. The act of generation is largely one-directional. These systems do not interpret a user’s sketch or melody in order to generate an image or song. They generate based on a prompt.

By contrast, LLMs are dialogic. They process language in a context-sensitive manner. Each token is generated not in isolation but in relation to the user’s input and the model’s own previous outputs. The system must keep track of what has been said, what is being asked, and what expectations are implied. It does not simply produce. It engages.
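
To make that concrete, here is a minimal sketch of the autoregressive loop behind this behaviour. It assumes the Hugging Face transformers library and the small open GPT-2 model as a stand-in for larger LLMs; the point is not the model, but that every new token is sampled from a distribution conditioned on the prompt and on everything the model has already produced.

```python
# A minimal sketch of context-conditioned generation (assumes the Hugging Face
# transformers library and the small open GPT-2 model as a stand-in for larger LLMs).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# The running context: the user's input plus everything generated so far.
context = tokenizer("The user asked about the weather, and the assistant replied:",
                    return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(context).logits[:, -1, :]       # scores for the next token,
        probs = torch.softmax(logits, dim=-1)          # given the whole context so far
        next_token = torch.multinomial(probs, num_samples=1)
        context = torch.cat([context, next_token], dim=1)  # the output joins the context

print(tokenizer.decode(context[0]))
```

The loop is the whole argument in miniature: each step conditions on a context that the model itself keeps extending, which is what makes the process dialogic rather than one-shot.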

Interpretation as Core Operation

This engagement is not trivial. For a model to maintain coherence across a conversation, to stay on topic, answer questions, offer relevant elaborations, and handle ambiguity, it must perform a form of ongoing interpretation. This interpretation is not semantic in the human sense, but it is functional. The model must infer structure, intent, relevance, and tone in order to continue the interaction in a way that humans find meaningful.

In this light, calling these models merely generative fails to account for the interpretive work they do. They operate by continuously negotiating meaning in a co-constructed context with the user.

Why Terminology Matters

Terminology shapes understanding. When we call something generative, we focus on its outputs. When we call it interpretive, we shift focus to its relational, responsive, and situated behavior. This is not a minor semantic difference. It influences:

  • How we design interfaces
  • How we set expectations for use
  • How we regulate AI systems
  • How we think about responsibility and agency

Interpretive AI: A Clearer Frame

The term Interpretive AI highlights that these systems do more than generate statistically plausible outputs. Their responses are shaped by how they condition each token on a dynamic context, consisting of previous inputs, prior outputs, and linguistic structure. This ongoing adjustment allows them to maintain relevance, coherence, and tone in ways that appear attuned to the evolving interaction, even though they lack understanding in the human sense.

This shift in framing opens up a richer vocabulary for discussing how such systems behave, how they misinterpret, and how they mediate human communication.

Philosophical Considerations

I propose this term intentionally provocatively. Interpretation traditionally implies human understanding, grounded in a condition of being-in-the-world. LLMs do not understand in this sense. But meaning and understanding are not located within the system. Instead, they emerge in relation, unfolding between human and model. In this view, interpretation is a co-operative, situated process, enacted through interaction. Through its ongoing token generation, the LLM opens up a space of possibilities, participating in a kind of relational meaning-making. Interpretive AI is therefore not a claim about internal cognition, but about the model’s relational role in negotiating understanding as a dialogical partner.

Docent Lecture

Posted on 2025-03-20

This morning I gave the lecture for my docent application at the Faculty of Science and Technology at the University of Gothenburg.

I explained that a computer is a simple device that takes input and produces output. Using it, we can make it do things. It can be cumbersome to get a computer to do certain things, but once it can do them, replicating them is often trivial.

I went through four examples from my research, where I have had to make the computer do things: buttons, maps, step counts, and screen time.

With AI, this may come to change. Computers are becoming capable of instructing themselves, and of understanding us on our own terms. What does this mean for the future? Will there be developers? Will there be apps? Will there be computers as we know them? Or will we just have machines that do the things we ask of them?

I took this opportunity to talk about where I see things going next, and what I think could be an interesting future worth exploring.

Watch the video on YouTube here: https://youtu.be/5In2Zn5hx_w

Talk about ChatGPT

Posted on 2023-03-24

I recently gave a public talk at a local meetup, based on the previous blog post. It was recorded and can be watched below.

I start off by describing ChatGPT as a wheel of fortune where every spin generates a new word based on a probability distribution conditioned on the given prompt.
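
For readers who want the metaphor spelled out, here is a toy version of that wheel of fortune. The words and probabilities are invented for illustration; in the real model, the distribution is computed from the prompt and the training data, and it covers tens of thousands of tokens rather than four words.

```python
# A toy "wheel of fortune" for picking the next word. The distribution below is
# made up for illustration; a real model derives it from the prompt and its training.
import random

next_word_probs = {"sunny": 0.5, "rainy": 0.3, "cloudy": 0.15, "purple": 0.05}

def spin(distribution):
    """One spin of the wheel: pick a word in proportion to its probability."""
    words = list(distribution)
    weights = list(distribution.values())
    return random.choices(words, weights=weights, k=1)[0]

prompt = "Tomorrow the weather will be"
for _ in range(3):
    print(prompt, spin(next_word_probs))  # different spins can land on different words
```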

I then talk about how new technology tends to be used to do the things we already do, in the ways we already do them (drawing on Marshall McLuhan). I explain this using the notion of bounded rationality.

I then show how people have been trying to use ChatGPT as an alternative to existing technology, and how it often fails when judged as a substitute for that technology.

I finish by showing how we tend to anthropomorphise ChatGPT because the interface is a chat, and why it is, again, a bad idea to treat it as something with human traits.

To conclude, I explain that the way we try to use it is normal, and that it will take some time before this technology finds its own use. When it does, it will become incredibly powerful. We should therefore start using it now, experimenting and exploring what it is and how it may enable us to do new things, in new ways.

Watch the video on YouTube here: https://www.youtube.com/watch?v=lOfsDaZsh1o


New tech bad at old stuff

Posted on 2022-12-28

With the rise in popularity of ChatGPT and language models in general, I see two camps: those who try to use ChatGPT as a replacement for Google, find that it gives bad results, and conclude it's useless; and those who start using it in new ways, to do things they didn't have technology for before, and find that it is super useful.

When new technology emerges, people have a tendency to try to use it for things we already do. The first TV programs were theatre plays. Marshall McLuhan talked about how it takes time before a new technology gets used for what it affords, rather than just for what we already can do. Mobile phones are not the same thing as the stationary phone in a household: household phones did not allow for micro-coordination, but required times and places to be agreed upon ahead of time.

ChatGPT and language models in general can do things we previously were not able to do. So we have a tendency to try and use it for what we already do. But what is interesting is not how well it can do things we already have good technology for. What is interesting is what new things it can do.

Previous tech could not generate a script for a book. LMs can. It could not be used to write unit tests for already written code, or to solve programming tasks. LMs can. It could not generate plausible text on an arbitrary topic. LMs can.

To evaluate the usefulness of new tech, it is not very relevant to ask how well it can do things we already can do, but to what new things it may allow us to do. And if this new thing is useful, then surely the tech is.

How bad it is at other things becomes rather uninteresting. Don't use it for them.

In closing, here is what ChatGPT thinks Marshall McLuhan would say about this new technology:

One aspect of large language models that might have interested McLuhan is their ability to process and generate human-like language, which he would likely see as an example of how media and communication technologies extend and amplify human capabilities. McLuhan believed that media and communication technologies have the power to transform society and culture by changing the way people think, communicate, and interact with the world around them.

It's possible that McLuhan might view large language models as a significant advancement in the development of artificial intelligence and as a tool that has the potential to transform how people communicate and interact with each other and with machines. He might also have seen the emergence of large language models as an example of how media and communication technologies are constantly evolving and shaping the way we understand and engage with the world.