A machine-generated book and an algorithmic author

David Beer
Towards Data Science
5 min read · Apr 14, 2019


The comparisons are obvious. In Orwell’s 1984 a machine is used to churn out books for distribution to the masses. Thoughtless, turgid and repetitious, these grey books feed a lifeless culture. It was hard not to think of that image on discovering that artificial intelligence had been used to write an academic book. This automation of writing, we are told, is a rapid and efficient new way to capture large fields of knowledge.

With the exception of the 33-page preface, the book has been ‘automatically compiled by an algorithm’. This algorithm was developed through a collaboration between the publisher, Springer Nature, and the Applied Computational Linguistics lab based at Goethe University Frankfurt/Main. The full book, Springer Nature’s first ‘machine-generated book’, can be downloaded for free.

In place of an author, the book’s cover is adorned with the name of an algorithm: Beta Writer. It’s interesting that they gave this machinic author a forename and surname, presumably so that it doesn’t look too out of place on the cover.

Beta Writer’s style could be described as long-winded and monotone — with four mammoth chapters of lumbering description carved into the content. The chapter titles provide a hint of the lack of finesse. The fourth chapter heading, ‘Models, SOC, Maximum, Time, Cell, Data, Parameters’, is typical of the flat sub-analytic approach. Analysis and subtlety, admittedly, are clearly not the aim. This is a book solely targeted at mechanised summary.

As Springer Nature put it, the ‘book prototype provides an overview of the latest research in the rapidly growing field of lithium-ion batteries’, adding that it ‘aims at helping researchers to manage the information overload in this discipline efficiently’. The book and its algorithmic author Beta Writer are presented as a means of scoping across a large field, providing researchers with the means to overcome the bewildering and overwhelming volume of material found on specific topics. I wonder, though, how far sheer boredom will make its content inaccessible to the reader.

This is not so much writing as an exercise in grouping bits of information. It couldn’t be described as curating; there isn’t enough selection or narrative going on. Instead it uses a ‘clustering routine to arrange the source documents into coherent chapters and sections’, from which it ‘then creates succinct summaries of the articles’. The question then is whether clustering is authorship, or whether a new category of writing and of book is being produced here. Another question is whether this bundling of information from a field advances anything. Books don’t just need to find patterns, they need to colour them in and shape the constellations they encounter.
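Springer Nature does not publish Beta Writer’s actual pipeline beyond that description, but the general idea of a clustering routine, grouping source documents by textual similarity before anything is summarised, can be sketched in a few lines of Python. Everything below is a hypothetical illustration, not the publisher’s method: the greedy single-pass algorithm, the similarity threshold and the toy paper titles are all my own assumptions.

```python
from collections import Counter
import math

def vectorize(doc):
    # Bag-of-words term counts for one document.
    return Counter(doc.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(docs, threshold=0.2):
    # Greedy single-pass clustering: assign each document to the
    # first cluster whose centroid is similar enough, otherwise
    # start a new cluster (a stand-in for whatever routine the
    # real system uses).
    clusters = []  # list of (centroid Counter, member indices)
    for i, doc in enumerate(docs):
        vec = vectorize(doc)
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # fold the doc into the centroid
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return [members for _, members in clusters]

# Invented paper titles, loosely themed on the book's subject.
papers = [
    "anode materials for lithium ion batteries",
    "cathode degradation in lithium ion cells",
    "state of charge estimation models",
    "SOC estimation with kalman filter models",
]
print(cluster(papers))  # → [[0, 1], [2, 3]]
```

The two battery-materials titles land in one cluster and the two state-of-charge titles in another, which is roughly the kind of grouping that could seed chapter headings like the book’s ‘Models, SOC, Maximum, Time, Cell, Data, Parameters’.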

This development is not intended to end with this single book; it is also a marker of an intended future direction. Niels Peter Thomas, Managing Director Books at Springer Nature, has suggested that with this technology the publisher is ‘aiming at shaping the future of book publishing and reading’. Thomas claims that ‘new technologies around Natural Language Processing and Artificial Intelligence offer promising opportunities for us to explore the generation of scientific content with the help of algorithms’. There is a future orientation here, a suggestion that this is the start of a new branch of literature. The human-authored preface to the book also conveys a sense of the book setting something rolling in an uncertain direction:

‘As with many technological innovations we also acknowledge that machine-generated research text may become an entirely new kind of content with specific features not yet fully foreseeable. It would be highly presumptuous to claim we knew exactly where this journey would take us in the future.’

There is a vision of an unknown future in which, it is presumed, machine learning will bring benefits that are unknown yet seemingly certain. Reflecting on what this all means for the role of the author, the human preface adds a little more flesh to that potential future:

‘We foresee that in future there will be a wide range of options to create content—from entirely human-created content to a variety of blended man-machine text generation to entirely machine-generated text. We do not expect that authors will be replaced by algorithms. On the contrary, we expect that the role of researchers and authors will remain important, but will substantially change as more and more research content is created by algorithms.’

The vision here is of variations of human, machine and hybrid texts. Another way to see this is as authorship being placed across a spectrum of automation. It seems that Beta Writer may yet have co-authors. Underpinning this is the idea, a celebration in fact, that a new automated agency will transform knowledge and writing.

In broad terms, being an academic is already increasingly algorithmic; this development is another step in a raft of ongoing shifts. This needs some critical reflection. Like those other moves toward automation, this algorithmic authorship is sold as a move toward ever more enhanced efficiency, convenience and so on. Perhaps the biggest surprise is not that AI can write a book like this, but that it can do it more cheaply than a real human academic.

As AI advances it might be worth returning repeatedly to the question of why. In this particular case, what is the machine adding? We can ask if this is a way of advancing knowledge and understanding, or if it is merely a mechanical packaging and distribution system. It seems it is not so much an exercise in thinking as an attempt at automated information cataloguing and repackaging.

It would be easy to jump to the conclusion that this signals the automated death of the author, replaced by faster and more comprehensive machines, but I suspect that, by demonstrating just what an author is needed for, such books will do the opposite. The things this book lacks actually highlight what an author brings. I’ve no doubt these systems will advance quickly and might be able to replicate authorship more closely in the future, but I wonder if this is desirable. As it stands, this mechanical content lacks insight whilst also being hard to read. To make any field digestible, even in the form of a summary, some vitality is needed.

The Data Gaze: Capitalism, Power and Perception was published in December and is available in paperback and ebook.

If you are interested in technology, media and culture, details of a free weekly newsletter can be found here.


Professor of Sociology at the University of York. His most recent book is The Tensions of Algorithmic Thinking.