AlgoTransparency, Cookies and Middleware

Earlier this week I came across AlgoTransparency, an interest group founded by Guillaume Chaslot, whom many people will recognise from The Social Dilemma. AlgoTransparency has commendable aims. They wish to:

  • Raise public awareness about the lack of transparency provided by the world’s most significant algorithms.
  • Influence international regulators with regards to policy approaches.
  • Pressurise the owners of significant algorithms to make changes to the way they operate.

This is an agenda I can get behind. End users should be provided with simple explanations about how the algorithms that power any given app or website work. And these explanations should satisfy both the need to understand the goal the algorithm has, for instance “time on site”, and (to the extent that it’s possible) the process that led to its final decision.1

There is a comparison to be made here with the EU’s fated “cookie policy”. While many people (myself included) lament the annoyance of consent pop-ups, it’s hard to deny that their presence has had a significant impact on the public’s understanding of how they are being tracked as they surf the web. And while it’s unlikely that many people spend time reading the descriptions of the different cookies a website uses, the mandated openness has created a new dataset for examination by journalists and privacy researchers. I would expect this effect to be replicated with greater algorithm transparency.

Importantly, the cookie policy enables users to opt out of a website’s “non-essential” cookies. In the process of trying to do this you’re likely to come across all sorts of dark patterns that attempt to trick you into submission, but it should still be possible. If it is not, the website’s owner is breaking the law. There is no such option with algorithms. This is partly because many of the algorithms we come into contact with provide essential functionality, but also because the website/app has a perfect monopoly over which algorithms you use.

I raise these points not because I think AlgoTransparency has picked the wrong fight – transparency will lead to a greater public understanding about algorithms and this is a worthy cause – but because transparency can only ever take us so far.

As I find myself regularly writing at the moment, algorithm choice, or more broadly, a marketplace for middleware, should be the direction we are headed. By creating choice for algorithms we diffuse the power they hold to multiple actors while retaining much of the value that comes with large platforms. I would also argue that a marketplace for middleware furthers the transparency cause because it makes transparency a vector by which consumers can vet their choice of algorithm. Developers who are open about the nature of their algorithms and the goals that their success is measured against will surely be more appealing than those who choose to hide this information.

Another idea that I continually return to is that the successful regulation of tech/platforms/algorithms will be made up of many overlapping changes rather than a single headline one. Break ups diffuse power, but aren’t going to make it easier to compete against the network effects that protect YouTube. Full protocol interoperability will support new market entrants but would likely come with new kinds of risk to personal data. Algorithm transparency does little to alter the distribution of power, but will make developers more accountable.

I wonder if a more “agile” approach to regulation is what is needed: smaller changes, call them tests, that try to target specific problems rather than change everything in one go. I’m no expert on the history of regulatory approaches but my gut tells me that this isn’t a strategy that many policy makers would consider.

1 I add this caveat as it's important to remember that full algorithm transparency may not be possible, as Jenna Burrell points out in her excellent paper on opacity within machine learning.

Social Media and Middleware

I had a long conversation last week with some friends who were less than convinced by the idea of algorithm middleware. Their argument was that creating choice for algorithms within social media platforms would increase the likelihood of people seeing harmful content: hate speech, disinformation, lies, bullying, phishing attempts — that kind of thing.

To sufficiently address this point I think it’s necessary to distinguish between the different editorial processes that are involved in publishing content on a social media platform. I see three broad areas:

  • Censorship, meaning the proactive and reactive removal of content from the platform.
  • Labelling, meaning the placement of notices next to content, ostensibly “fact checking”.
  • Ranking, meaning the ordering of content in a user’s feed.

If you assume that the developer of a middleware algorithm becomes responsible for each of these processes, then yes, creating a middleware market would increase the chances of harmful content being seen. In fact, it is inevitable that someone would create a system with minimal, if not zero, editorial oversight, something that would unleash a pretty nightmarish world on its users.

This consideration presents a more challenging flaw in the middleware model in my opinion. Could a middleware developer realistically handle the volume of censorship (moderation) currently carried out by social platforms today? It’s unlikely. Facebook’s Safety and Security budget for 2019 was reported to be nearing $4BN, a cost that would critically reduce the potential entrants to any market.

So what to do? I can see a solution coming from a few sources.

We could mandate that moderation remains the responsibility of the platform owner. If they are benefitting from the advertising revenue, it could be argued that they should foot the bill for moderation. This does nothing for our concern about centralised power, however.

An alternative would be to make the state responsible for moderation, probably through a statutory corporate body or QUANGO like Ofcom (the UK’s telecoms regulator) or the British Board of Film Classification (BBFC). A hybrid of the two could also work. For example, a British Board of Social Media Moderation (BBSMM) could create a framework that the platform was required to adhere to, enforceable by spot checks and fines. This body could also oversee the welfare of content moderators (assuming some were employed locally) as the challenging working conditions they face are well documented.

Whether these rules apply at a user level, meaning the moderation applies wherever the user happens to be located, or at an IP level, meaning it applies wherever the user accesses the platform from, is an interesting question, but not an insurmountable one.

With censorship taken care of, two editorial processes remain: labelling and ranking. Both are less labour intensive and therefore more likely to result in a dynamic middleware market.

Some labelling could be handled by algorithms written by a middleware developer, for example, labelling any content that includes references to Covid vaccines. However, some labelling may require human oversight, for example fact checking organisations like FullFact rely on a team of human checkers.

Fact checking services that rely on human input will not keep up with the velocity of social content. However, real-time fact checking need not be the goal. A user could choose to opt out of content that has been flagged for fact checking, it could be labelled as requiring fact checking, or they could be shown the unchecked content but notified if it is labelled retrospectively. Once again, this is why choice is critical. Decisions as impactful as the few described above should not be left solely in the hands of the platform owner.
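The three options described above could be expressed as a setting that each user picks for themselves. What follows is only a rough sketch to make the idea concrete: the Post type, the FlagPolicy names and the apply_flag_policy function are all hypothetical and do not correspond to any real platform's API.

```python
from dataclasses import dataclass
from enum import Enum


class FlagPolicy(Enum):
    """Hypothetical per-user choices for content awaiting a fact check."""
    HIDE = "hide"            # opt out of flagged content entirely
    LABEL = "label"          # show it, marked as requiring a fact check
    NOTIFY_LATER = "notify"  # show it as-is, notify if it is labelled later


@dataclass
class Post:
    post_id: str
    text: str
    flagged: bool = False  # flagged for fact checking, not yet checked


def apply_flag_policy(posts, policy):
    """Return (visible_texts, watch_list) for one user's chosen policy.

    watch_list holds the ids of posts the user should be notified about
    if a fact checker labels them retrospectively.
    """
    visible = []
    watch_list = []
    for post in posts:
        if not post.flagged:
            visible.append(post.text)
        elif policy is FlagPolicy.HIDE:
            continue
        elif policy is FlagPolicy.LABEL:
            visible.append("[awaiting fact check] " + post.text)
        else:  # FlagPolicy.NOTIFY_LATER
            visible.append(post.text)
            watch_list.append(post.post_id)
    return visible, watch_list
```

The point of the sketch is that the policy is an argument supplied by the user rather than a constant fixed by the platform owner.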

Ranking is the process most suited to a middleware marketplace. Sequentially it follows censorship and labelling, meaning all the content available to rank has been cleared for publication. It’s then down to the user’s chosen algorithm to decide how the content available should be prioritised based on the preference data it has. This is not to say that this is an easy task, but that it is one more suited to a software only solution, one that could be written by a single developer or, in an ideal world, the end user themselves.
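To illustrate how small a ranking-only piece of middleware could be, here is a minimal sketch. It assumes moderation and labelling have already happened upstream, and every name in it (Item, Ranker, newest_first, unlabelled_first, build_feed) is invented for illustration rather than taken from a real platform.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Item:
    item_id: str
    text: str
    age_hours: float
    removed: bool = False  # set upstream by whoever handles moderation
    labels: List[str] = field(default_factory=list)  # set upstream by labelling


# A ranker is any function that orders already-cleared items for one user.
Ranker = Callable[[List[Item]], List[Item]]


def newest_first(items):
    """One possible ranker: purely chronological, newest at the top."""
    return sorted(items, key=lambda item: item.age_hours)


def unlabelled_first(items):
    """Another ranker: push labelled items down, newest first within each group."""
    return sorted(items, key=lambda item: (len(item.labels) > 0, item.age_hours))


def build_feed(items, ranker):
    """Drop removed items, then hand the rest to the user's chosen ranker."""
    cleared = [item for item in items if not item.removed]
    return [item.item_id for item in ranker(cleared)]
```

Swapping newest_first for unlabelled_first changes what the user sees without touching moderation or labelling, which is the separation the paragraph above relies on.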

Returning to the opening question: how does a middleware marketplace reduce the impact of harmful content on social media? The short answer is that it doesn’t. But this misses the key point of the idea. The goal is to reduce the political power of social platforms, not clean up your timeline.

I think these changes can be part of a broader conversation about how we regulate content on social platforms, but that conversation must start with the assumption that allowing social platforms to do it on our behalf is unacceptable. Users, in concert with the state, should have the power to decide what version of the world they want to be exposed to.

Roon, Algorithmic Choice and Platform Middleware

My search for a better way to stream music led me to a service called Roon last week. Roon is great, but rather than write about that, I want to write about what Roon can tell us about algorithmic choice and the notion of platform middleware.1

Let’s start with the idea of abundance. As Albert Wenger puts it in The World After Capital:

Digital information is already on a clear path to abundance: we can make copies of it and distribute them at zero marginal cost, thus meeting the information needs of everyone connected to the Internet.

This abundance has created a new economic challenge. Rather than the challenge of how we distribute scarce resources to a large group of people, we now have to decide which resources people should consume given a (near) infinite supply and, in some cases, a relatively limited amount of time to consume them. Some of these decisions are somewhat banal, for example, which movie or TV show should I watch next? But not all are like this. Which hairdryer should I buy might seem trivial, but when an algorithm decides which product is ranked first (and thus more likely to be bought) we are giving it an extraordinary amount of power. The same is true when we allow an algorithm to decide which news article comes first, or which person’s voice is heard the loudest.

It’s tempting to view algorithms as an exogenous force. Something independent of human biases and thus well placed to make important decisions such as the ones described. However, this couldn’t be further from the truth. Algorithms are socio-technical systems that, even with the best intentions, are imbued with all the biases inherent to humankind. They can be presented as independent while in reality promoting the agenda of their creator.

Centralised control over algorithms with power of this kind is something that society must reject. But that still leaves us with a new challenge of the second industrial revolution: in a world of abundance, how do we decide what people consume?

The state, peer recommendation, organisations, human editors and many other modes of discovery will play their role (as they always have), but algorithms powered by data are part of our future too. If we agree that algorithms represent new forms of power, and aren’t comfortable with the centralisation of that power, something needs to change.

Algorithmic choice is one idea that requires closer examination. If we are able to choose the algorithm that sorts, filters and flags content on our behalf, the inherent power of algorithms becomes more diffuse. Algorithmic choice within e-commerce may allow someone to favour specific brands, products that were made locally or ones that meet certain environmental criteria. Similarly, within social media it could allow for a plurality of moderation services or attempt to make one’s feed more representative of public opinion and less of a filter bubble.

Fortunately, these ideas are beginning to gain traction. Jack Dorsey, the CEO of Twitter, has voiced support for a “marketplace” of algorithms, and at the end of last year Francis Fukuyama released a report that called for the creation and enforcement of algorithmic middleware providers.

This brings me to Roon. First, consider how music streaming platforms like Spotify and Tidal choose to tackle the problem of abundance that we have just highlighted. From their growing libraries of 70M+ songs they present us with ideas of what to listen to next. If you’re like me, you find this grating, as you want to browse the music yourself, but if you like the recommendations you’re limited to the platform’s technology. And, as we already know, those recommendations come with the biases / economic goals of their creators. Now consider how Roon operates. After creating my account, Roon asks me to connect it to my streaming provider (Tidal or Qobuz only, I’m afraid). Roon then imports my entire collection and begins to re-organise the music based on their discovery interface. Similarly, it introduces a whole new set of algorithms to surface music based on my existing collection and based on things it thinks I’ll like.

The critical difference, however, is their economic incentive. Roon’s model isn’t based on micropayments dished out to the holders of music copyright. You can rent it off them for £12.99 a month, or you can purchase a lifetime license for £600.

As an aside, I really like the trend towards offering lifetime access for a single price, but I’ll save that for another post.

Roon is beholden to me, their only customer. Their goal should be to make my listening experience as enjoyable as possible. Tidal, Spotify et al. might argue that this is their goal too, but the truth is that as a multi-sided platform (or aggregator as Ben Thompson likes to call them) their motivation is not entirely aligned with mine.

It’s worth noting that the Roon model creates a significant degree of disintermediation, or, in non-technical terms, I no longer use the Tidal apps to listen to music. Tidal may not be best pleased about this, but at the same time they are still receiving the £29.99 a month I pay them for access to the music library, so the impact on their business is not universally bad. Social media companies may be less comfortable with this outcome, however, as they are “free” to consume but monetised via ads. If I no longer access their front end, their revenues quickly drop to zero. This isn’t a blocker for the middleware model, however. It just requires the platform to integrate a choice of algorithmic middleware themselves.

A basic middleware flow diagram

For those interested in algorithmic choice and the middleware model, Roon, and its relationship with Tidal / Qobuz, is worthy of closer examination than a Friday morning blog post can afford. For my money, Roon is one of the best examples of commercially successful platform middleware available today. It only exists due to a quirk of history that has created multiple layers of rights over how digital music is consumed.

It’s probably obvious from my tone that I think algorithmic choice has the potential to untangle many of the messy issues the last 20 years of platform development has left us with. It is not a panacea, however. We’ll need many new tools to fix the world we have today and, however good I think this one is, it remains just one.

1 If you're interested in learning more about Roon I'd recommend reading this overview from What HiFi as it's not a service that can be explained that easily.

Making my blog open source

Over the weekend I read a post from Brandur about the need for blogs to degrade gracefully over time. It resonated with me for a number of reasons. First, because there are blogs I remember reading that are near impossible to find archives of now. Second, because my own archive is pretty poor too.

It’s the podcast I hosted while at university between 2005 and 2006 that I am most disappointed about losing, however. We used a service called LibSyn, an early podcasting platform, to store and distribute the show, but a few months after I moved to London we stopped podcasting and payments to the platform lapsed. The podcast’s old URL now just 404s, and there is no account associated with any of our email addresses.

So, in the spirit of not letting this happen again, I’m taking Brandur’s advice and publishing my GitHub repo. Should this blog ever be of “significance”, finding an archive of it should be easy enough, and, given that all the content is written in Markdown, extracting the actual words will be pretty easy too.

Using Roam and Scrivener

UPDATE: It appears that in order for this to work correctly you need to have your Roam instance set to public. Following a conversation with someone in tech support at Literature and Latte (the team behind Scrivener) I was told that their web page files are “not able to handle authentication” and will error if the page you link to is not publicly viewable.

Since starting my master’s programme Roam has become an indispensable tool. I use it as my research diary and for more general note taking. The bidirectional links are incredibly powerful, allowing me to easily link ideas together and create new pages on the fly for the purpose of revisiting at a later date. Then there are the slash commands, the knowledge map and lightning fast search. This post is by no means a review, but I thought it was worth mentioning how good I think the app is for starters.

Another new app I’ve fallen for is Scrivener. Scrivener is a word processor in that it competes with the likes of Word and GDocs, but is different from them in that it’s targeted at people who are writing long form content: novelists, screenwriters, academics etc. What’s nice about Scrivener is that it has a dedicated research section, the content of which is ignored when you come to compile your final document. This has been really useful as I can take notes from other papers within Scrivener when working on my own.

This process of making research notes within Scrivener leads to some duplication. In reality, I want my research notes in Roam, so I find myself either copying earlier Roam notes into Scrivener, or, taking the original notes in Scrivener and transferring them the other way when the essay is finished. Fortunately, I found a hacky way to get the best of both worlds.

Launching Roam in a Scrivener web view

It’s possible to launch Roam inside the research section of your Scrivener project using a web view. You’ll be able to navigate your Roam database just as you would within a standard browser, as well as create / edit pages and copy content across for your writing.

To do this, right-click on the research folder, select ‘Add’ and then ‘Web Page…’, punch in your desired Roam URL, sign in, and away you go.

A screenshot showing how you add a web view to Scrivener

I recommend that you name the Scrivener file after the name of the Roam page you link to. This is because each time you navigate away from the Scrivener file it has to reload the Roam page when you return. While this may seem frustrating to begin with, after using it for a while I find it’s actually preferable. It means that each Scrivener file in your research section is dedicated to a specific Roam page, rather than being a general link to your entire Roam database. If you want to have access to more Roam pages, create new Scrivener files, just as you would if you were creating notes in Scrivener without Roam.

A screenshot of Roam working inside Scrivener

The only regular frustration I run into is when I create a new Roam page inside Scrivener, only to lose that page when I return to my writing. The quickest way to combat this is to open the page menu inside Roam, click ‘Share link to this page’ and then create a new Scrivener file using the URL that was just copied to your clipboard. It’s imperfect, but like I said, this is a hack.

Using the page menu in Roam to copy the page's URL

Other than this, the occasional crash from Scrivener, and failed load from Roam, I’ve found this system to be pretty handy. It allows me to keep all my note taking centralised within Roam, and avoids the need to switch between different apps. The latter I find particularly handy given that retaining focus is one of the aspects of writing I find hardest; accessing Roam in this way means I can work for long periods of time with only Scrivener open on my Mac.

Hopefully this is of use to you. Or, if you find a better way to achieve the same thing, be sure to let me know.