A few thoughts about Humane’s Ai Pin

As I was gathering some notes for this piece and reading other people’s takes, I found myself in a position very similar to Jesper’s.

His article begins as follows:

The Humane Ai Pin has been announced, a phone alternative trying its best to not be a phone in any way. Humane famously spearheaded by ex-Apple luminaries Imran Chaudhri (with large amounts of the iPhone and multi-touch user experience to his name), Bethany Bongiorno (a Director of Software Engineering from the launch of the original iPad) and counting among its ranks Ken Kocienda (part of the initial Safari/WebKit team and designer of the first software keyboard and typing autocorrect), I’m finding myself wondering what I’m missing. 

There are a lot of people among Humane’s personnel who have an impressive background in understanding user interfaces and human-machine interaction. There are a few aspects of the Ai Pin I find fairly interesting, and I think they nailed both the hardware design of the device and its wearability. I also think that the concept of the Pin, in its most abstract sense, is pretty intriguing. I’ve always believed that technology and machines should serve people and adapt to their needs, instead of the other way round. So, when at the beginning Humane vaguely hinted at working on something following this philosophy, my interest was piqued. I think smartphones have done a lot of good to society, but also a lot of harm when it comes to the interpersonal sphere. And whenever I’ve had the time to be at my most contemplative, I’ve often thought about what the next step beyond smartphones might be. That’s why I was very interested in Humane’s intents and projects. Again, these people are not amateurs at all. Let’s hear them out.

When mentions of this upcoming device being fully ‘AI’-powered started to appear, my interest started waning a little. But wait — I said to myself — perhaps Humane has found an innovative way to make ‘AI’ work. Some kind of left-field implementation. Who knows. 

Then the Ai Pin was announced and demoed. And my reaction was just like that famously memetic GIF of the Star Trek character Jean-Luc Picard, with his resigned, frustrated facepalm.

For me, the most ironic aspect of what doesn’t work with the Ai Pin is that it underwhelms in the two main departments where I least expected it to: user interface and human-machine interaction.

I know nothing about the thought process the people at Humane went through to bring forward the idea of the Pin, but I suspect that a lot of analysis about how people use and interact with smartphones was involved. They must have asked themselves, What can we do to go beyond this? What do we feel is wrong with the way people interact with their phones, and what can we do to improve things? What kind of human-device interaction can make things smoother and frictionless but also make the relationship less device-centric, less addictive? You know, questions like these. It’s rather clear to me that they wanted to come up with a device that could be out of the way as much as possible but also be useful enough in assisting users that they would not miss using a smartphone.

This intent, this design, is worthy of praise. This is difficult territory. I know well all the little things that annoy me about smartphones, the way they’re used, the way I use mine, and I’m sure everyone has their peeves. But if someone asked me point-blank what kind of device or interface or interaction I would create to solve the issue, to make things better, less tech-addictive and more human-focused, I wouldn’t know what to say.

If I were given more time, I’d probably try to start with something people would familiarise themselves with in no time and interact with in ways that are even easier and more intuitive than taking out a smartphone and fiddling with it. Maybe this, too, was something Humane considered in their brainstorming sessions. And maybe the Ai Pin is what they consider a good answer to such a proposition.

But from what I’ve seen in the demonstrations showing how to use the Ai Pin, this device doesn’t seem all that intuitive and easy to pick up, and it doesn’t seem to be more frictionless than using a smartphone. It also faces the basically insurmountable challenge of winning over people who have accumulated years of smartphone-centric habits. But even I — who am terrible at marketing and predicting technology trends — understand that people only change their habits if the reward is receiving more comfort and convenience. The Ai Pin brings awkwardness in every sense of the term, and little else.

Oversimplifying, the Ai Pin is like having a Siri‑, Alexa‑, or Google Assistant-powered smart home device always with you. You wear it. The main form of interaction is you asking it to do things for you or to retrieve information you need. You have to talk to it. There is no really tangible interface. Any visual interaction may happen with a UI that is laser-projected on the palm of your hand. While you’re reading this, do this quick thought experiment and ask yourself, Would I seek to purchase such a device solely based on this description? Would I ditch my iPhone or Android phone for such a device? Yeah, I thought as much.

Ever since Siri was introduced in 2011, I have talked about how fundamentally sceptical I am regarding voice-only interaction with this type of virtual assistant. (It’s hard to find short quotes, but you should go back and read Siri’s fuzziness and friction from 2015 and, more importantly, A few stray observations on voice assistants from 2018.)

  • They’re essentially black boxes, which is a real problem when it comes to feedback. Can I just talk to them in plain language? Do I need to use some kind of formulaic pattern so that my requests have a higher chance to be recognised and acted upon? Does the Assistant understand concatenated questions? (You ask question 1. Assistant responds. You ask question 2 based on the Assistant’s response to question 1. Is the Assistant still ‘following’ you or has it reset?) How does the Assistant handle ambiguity in language and speech?
  • This can lead to friction in the interaction, and I suppose things are not that different from what happens with Siri already (which has been happening since Siri appeared): like I wrote in Siri’s fuzziness and friction, “Siri is the kind of interface where, when everything works, there’s a complete lack of friction. But when it does not work, the amount of friction involved rapidly increases[…]”
  • Another by-product of being black boxes is their questionable reliability, both in how they handle communication failures and in how reliable, i.e. trustworthy, the information they relay is. In these products’ demo videos everything happens flawlessly. In real life, virtual assistants misunderstand you more often than not. Like my dad had suggested in the conversation I reported in my afore-linked piece A few stray observations on voice assistants, “Reliability must be put first with these assistants. They ought to understand you at once, and if they don’t, they ought to allow you to correct them as quickly as possible. Otherwise they’re just like that subordinate at the office who is supposed to help you do the work, but he doesn’t understand or misunderstands what you want him to do, and you end up doing more work to fix the misunderstandings.”
  • The Ai Pin requires a lot of trust on the part of the user. The user must be comfortable wearing a device which essentially constantly monitors its surroundings. And, as hinted above, the user has to trust any response coming from the Pin. Showing that you can hold some almonds and ask the Pin whether you can eat them or not is a cool interaction. But should we trust its response to be factual and correct? Some have already pointed out that in a few usage examples of the Ai Pin made by Humane, the Pin gave incorrect responses, which isn’t exactly trust-building.

Before this list grows and grows, let’s stop for a moment and focus on what I consider the fundamental point of failure of this device (and other similarly-working assistants): people don’t like and don’t want to interact with devices via voice commands and voice-based interaction. They just don’t. There are exceptions at the extremes of the spectrum, like spoiled tech bros on one side, and people with disabilities on the other. But the vast majority of regular people find this kind of interaction awkward, fatiguing, uncomfortable (especially in public) and ultimately inefficient. For the past ten years or so I have accumulated a fair amount of data through personal observation but also through repeated surveys targeting different demographics in the part of the world where I live, and the results have never changed. Only a negligible share of people use these virtual assistants with some frequency. The vast majority still prefer doing things themselves: setting timers, choosing and changing music, looking for places to eat, checking their schedule, finding the shortest route to their destination. In other words, they like to be and feel in control, they find that taking out their smartphone or tablet and checking things themselves is way quicker than phrasing a request to a virtual assistant in just the right way to extract a meaningful response, and they really, really don’t like talking out loud to an inanimate object.

I was talking with a friend recently about this subject, and during our chat another important aspect of this kind of voice-based interaction came to light — and it further explains why a lot of people find it fatiguing. It’s information retention on the part of the user. In the Ai Pin showcase page on Humane’s website, the Catch me up feature is presented like this: Simply say “Catch me up,” and your Ai Pin does all the work of sifting your texts and calls to give you the essence of what you need to know — and saving you precious time for what’s important. The response may vary according to how busy you are and what’s going on at the moment in your life, but I suppose that when you prompt the Pin this way, you’ll still receive a fair amount of information. How much of it will you actually remember? 

I don’t think I’m alone in preferring to go through my stuff myself, using a device where the information is presented clearly and can be interacted with easily and directly, and take note of what’s important. It may take longer, but I end up retaining more information in the process. It’s a more satisfying experience. Perhaps I may miss something, but given the black box nature of these devices, how sure can you be that they have caught everything there was to catch?

The Ai Pin makes the same conceptual mistake behind all the assistants that preceded it: to treat all people as if they were too helpless and clueless to manage even basic stuff. And to grossly miscalculate which tasks people find tedious and are willing to delegate to a machine. These assistants want to assist with stuff people have no problem doing themselves, and they do so through an interaction model that ultimately makes things more awkward, impractical, and longer to accomplish. (On the other hand, it’s a good interaction model for people who have different types of motor or visual disabilities and need assistance when sending and receiving messages, collecting information, etc.)

Perhaps it’s too early to say, but I strongly feel a certain similarity between the Ai Pin and Google Glass. Like I previously noted in A few stray observations on voice assistants:

It has been pointed out how Google Glass has turned out to be a failed attempt as a general-purpose device aimed at the general public, but a more successful one in limited, specialised applications and environments. I believe voice assistants have started with the wrong foot […] I think that if voice assistants had been originally designed having people with disabilities as first and sole target audience (instead of lazy tech dudes), and then gradually extended to everyone else, today they’d be a bit better. 

And:

[T]here’s a big difference when your goal is to develop a tool that makes your life-as-an-able-bodied-person easier (read: spoiled) instead of a tool that makes the life of a disabled person more tolerable. Your able-bodied person’s ‘friction’ is bullshit compared to the real friction of a person with any disability. A useful virtual assistant is one that, first and foremost, addresses a few crucial types of impairments. Design with that in mind, give precedence to solving problems related to the interaction between a person with impairments, develop against those, test against those, then worry about perfectly healthy twenty-somethings who are too inconvenienced to manually select the music they want to play. 

Instead there’s this urge to create The Next Big Thing that will be a hit for everyone, everywhere. And to create it in one fell swoop, skipping all the steps that might help you really get there.

And this insistence on treating ‘AI’ (in quotes, because artificial intelligence doesn’t exist) as a panacea for everything is as misguided as it is tiring. In wanting to feed these hungry ‘AI’ Black Boxes with all kinds of data, and especially personal, sensitive data, we are quickly and surely creating that Big Brother George Orwell warned us about in his novel 1984. A novel that, I feel, is more cited than actually read and understood.

At the end of the day, like my friend and I were saying in our chat, what comes after the smartphone has to be something that is better, more pleasant to use, easier to interact with, more efficient, and able to provide an even more fulfilling experience. A device like the Ai Pin doesn’t fit this description, at least in its current state.

Back to his piece, Jesper wonders:

The Humane Ai Pin didn’t happen by chance and was not lazily extracted from between the couch cushions. A lot of talented people spent a lot of time at it, clearly chasing a deep vision.

So why does it seem so terribly, undeniably off?

My theory (and it’s just that, a theory) is that this final product isn’t exactly the embodiment of Humane’s original idea. That’s what feels off to me, for the most part. I may be completely wrong about this, but the more I look at it, the more I feel that Humane had a much more ambitious concept in the design phase than essentially putting Alexa in an iPod shuffle. Perhaps the technology they would have had to put inside it was still out of reach, or it would have been so expensive to implement and deploy that they would have needed to give the final product a price tag so ridiculous nobody would buy it. At $700, the Ai Pin is already a hard sell as it is.
