Re-inventing the Research Text

There’s been a sustained conversation for a while now about how tech will impact the ways research is produced, read, and propagated. With the advent of complex digital books, for example, researchers will finally be able to store the wealth of raw data and sources they collected during fieldwork and make it immediately accessible to anyone who wants more information, instead of forcing readers to go online and dig through files.

But innovations like this take the book itself (as it currently exists) for granted. Even though it doesn’t strike us this way in everyday life, the book is a profoundly unnatural way of presenting information to others. It requires all the relevant information, regardless of subject, complexity, and source type, to fit within a linear text of typically a few hundred pages. A fascinating question to consider is how the reading experience could change if we were willing to alter the book’s linearity itself.

Consider, for example, the set of texts that proceed axiomatically, that is, by building an elaborate deductive system from a set of basic assumptions. I have in mind works like Newton’s Principia, Wittgenstein’s Tractatus, and Spinoza’s Ethics. I can’t speak for the authors, but for most of us who attempt to read these today, understanding what’s being said usually means frantically flipping back to theorems proven earlier in order to put them together in a way that makes the later theorems intelligible. The biggest hurdle to faster learning here is the linearity our current books impose on us. Smart ebooks could change this, and there are already some indications of how this constraint could be done away with.

A PhD student at Boston College, John Bagby, created visualizations of the entirety of Spinoza’s Ethics, with each node representing a proposition.

Clicking on a node reveals its connections to other nodes and brings up a dialog box stating all the relevant propositions: the one selected, along with its parent and child propositions. Just like that, the linearity that was taken to be constitutive of our reading experience for centuries is shown to be a mere constraint, and the visualization makes the connections far easier to pursue. That isn’t to say that reading Spinoza becomes easy, but it’s undeniable that this makes the text tremendously more accessible, both for beginners attempting to read it and for experienced researchers hunting down some obscure subtlety.
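As a rough illustration of the underlying idea (this is my own sketch, not Bagby's implementation, and the propositions and dependencies below are placeholders), a dependency graph of propositions can be queried for a node's parents and children directly, with no flipping back required:

```python
# Minimal sketch: each proposition is a node, and an edge A -> B means
# "A is cited in the demonstration of B". Propositions here are illustrative.
from collections import defaultdict

dependencies = [          # (premise, conclusion) pairs, invented for illustration
    ("E1p5", "E1p11"),
    ("E1p7", "E1p11"),
    ("E1p11", "E1p14"),
]

parents = defaultdict(set)   # what a proposition relies on
children = defaultdict(set)  # what relies on a proposition

for premise, conclusion in dependencies:
    parents[conclusion].add(premise)
    children[premise].add(conclusion)

def inspect(prop):
    """Mimic the dialog box: the selected node plus its parents and children."""
    return {
        "selected": prop,
        "relies on": sorted(parents[prop]),
        "relied on by": sorted(children[prop]),
    }

print(inspect("E1p11"))
# {'selected': 'E1p11', 'relies on': ['E1p5', 'E1p7'], 'relied on by': ['E1p14']}
```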

As ground-breaking as this is, an obvious drawback is that very few books lend themselves to being transformed in this particular manner. But we shouldn’t be too quick to dismiss its relevance. For far too long we’ve been asking ourselves what the next big idea will be. Perhaps it is time to acknowledge that the future isn’t about a single all-encompassing idea but many ideas, pushing in many different directions. For such a future, however, tech companies will have to stop thinking in terms of delivering a single, clear-cut solution, and instead think in terms of platforms capacious enough to allow different authors, designers, and publishers to push the envelope in their own ways, on their own terms.

Moving away from the "stupid" e-book: An opinionated survey of our options

Earlier this year, Hachette CEO Arnaud Nourry remarked that the ebook is a stupid product, since "it is exactly the same as print, except it’s electronic. There is no creativity, no enhancement, no real digital experience." While shocking in its honesty, the remark also prompts the obvious question: what would a non-stupid ebook look like?

When contemplating how technology can alter the future, there are two risks to look out for. The first is the false positive, where we fantasize recklessly about tech which isn’t actually the revolutionary game-changer it is imagined to be. The second is the false negative, where we are insufficiently sensitive to the potential of something before us. And that’s not even taking seriously the role of sheer luck in making or breaking a product. Still, speculate we must, and so we might as well do it with full self-awareness of the risks undertaken. So what could the next wave of ebooks consist of?

Custom Books

One obvious-seeming answer is to point to personalization. While we may one day have the technical capability for this, I’m still quite skeptical about how popular it would be.

For one, we already have some idea of what personalization could look like. Companies already provide services that insert names into fixed slots in books, allowing you, or anyone you choose, to be the protagonist of the story. It’s an intriguing idea, but also one that strikes me as a gimmick anyone would tire of fast. Admittedly there’s some more room for children’s books to innovate in this regard (for example, “Put Me in the Story” incorporates photos of kids into the books they read), but again I’m not sure the trend can outlast the novelty factor.
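Mechanically, this kind of personalization is little more than substitution into pre-marked slots. A minimal sketch, with an invented template rather than any vendor's actual system:

```python
# Slot-based personalization sketch (invented template, not any company's system):
# the text carries named placeholders, and the "personalized" edition is produced
# by straightforward substitution.
from string import Template

page = Template("$hero crept down the corridor, $friend close behind.")

print(page.substitute(hero="Maya", friend="Arjun"))
# Maya crept down the corridor, Arjun close behind.
```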

Interactive books that draw from video games, where the reader chooses how the plot proceeds and what the character should do, will also be possible. But we already have video games, and besides, if I wanted to “do things”, I would just go outside. Unless books can somehow deliver an adventure that the cutting-edge video game industry cannot, this sort of personalization is unlikely to gain much purchase in the market either.

Perhaps the most radical possibility is that of books custom-written for an individual based on their interests and favorite genres. With the wealth of information about ourselves we store online, anyone brave enough to give a publisher access might be able to get a version of a book written specifically for them! I can conceive of this taking off, but even here I suspect all might not be well. A large part of the book-reading experience, apart from the actual reading, consists in listening to others talk about it, talking about it online and in person with friends, reviewing it and reading the reviews of others, and above all arguing over minute details with others who love or hate it just as passionately. In other words, there are social aspects and rituals predicated on all of us reading the same book, which would be lost if we were all reading different versions. So even if this kind of personalization were possible, our shared culture of reading might have to change considerably, and not necessarily in a positive way.

Interactive Books

A far more promising approach is the incorporation of multimedia in books, which can include audio, video, gifs, maps, AR, and VR. The application in travel guides and books on faraway places is obvious, and I can’t wait to use books that let me see how various locations actually look before booking a vacation, or, perhaps even more importantly, that give a sense of distant places to those who aren’t able to make it there just yet. And children, who’ve shown themselves quite susceptible to the charms of YouTube, will probably be delighted to have their dull school exercise books guided by Dora the Explorer (or someone else less likely to violate copyright).

Other genres might also find surprising potential in multimedia. In high fantasy, for example, it is common for maps to be provided at the beginning of the book and for characters to traverse them during the story. To be able to explore these maps immersively while reading, to get a sense of how the journey proceeds, could enhance the experience significantly (and I might have spent far less time flipping back and squinting at Tolkien’s maps as a teenager).

The desire for a multi-media experience isn’t restricted to children, of course. When the distinguished philosopher G. A. Cohen delivered his Valedictory Lecture at the age of 67, he sent his colleagues a CD recording along with the text of the lecture itself, with a note saying, “please don’t read the text except when listening to the CD, because the text is much less funny unspoken.” And who knows what other applications might be found?

As promising as these enhancements are, some caution is in order. Ever since Our Choice, Al Gore’s “first feature-length interactive book”, appeared in 2011, there have been predictions about the rise of the interactive book, and they have failed to materialize. What this shows, I think, is that while there is definitely space for enhancing the reading experience, readers don’t necessarily want the core experience itself transformed. As fun as map immersion would be, when it comes to the reading itself I still want uninterrupted text, with the enhancements brought up only when desired, and typically desired rarely. For all the talk about change, I can’t really imagine giving up the experience of sustained reading itself.

The fully interactive text then looks like a false positive: something that seemed like an obvious game changer but instead fizzled out. The ability to rotate a windmill in Our Choice just by blowing on the screen, while a cool party trick, has very little use for readers. And having videos disrupt reading is distracting, especially after the novelty wears off.

None of this is to suggest that the ebook, as it is, is the insurmountable pinnacle of innovation. Nourry is right: the current ebook really is stupid! But at least part of the reason for this languishing is that we’ve been a little too taken with tech capabilities instead of asking whether readers would actually find their experience improved over the long term. What publishing needs is a tech philosophy which doesn’t allow current reader preferences to limit change, but which also pays attention to where readers actually are with regard to their habits and needs. Luckily, the now burgeoning industry of publishing-specific tech might mean we have a truly smart ebook sooner than anyone might suspect.

Why Rights and Licensing Automation is Essential to a Publisher's Bottom Line

An Interview with Jane Tappuni, General Manager, IPR License

The rights department is not an area in which publishers tend to invest, and yet it’s one of the key areas of the industry with untapped revenue opportunities. With most rights deals still handled via paper contracts and one-to-one communication between editors and rights holders, it can be a slow process. Furthermore, it’s hard for publishers to keep an accurate accounting of what rights they hold (and of when a license runs out or rights revert to another party), to know how to monetize those rights against current market trends, and, harder still, to close quick deals in order to free up time for the more complicated rights deals that require more thoughtful consideration.

Enter technology. In a blog post earlier this year, we discussed rights deals and smart contracts, and illustrated how we thought they might be useful to publishers: “For publishers, the world of contracts unfortunately continues to be predominantly ruled by paper, creating a lag in transactional payment and royalty collection. But, that doesn’t have to be the case going forward.”
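To make the mechanics a little more concrete, here is a deliberately simplified sketch (not any real blockchain platform, and not IPR License's system; all field names are invented) of how a rights deal could be recorded as an entry in an append-only, hash-linked ledger:

```python
# Simplified illustration of recording rights deals on a hash-linked ledger.
# Not a real blockchain or smart-contract platform; field names are invented.
import hashlib, json, time

ledger = []

def record_deal(work, licensor, licensee, territory, term_years, fee):
    deal = {
        "work": work, "licensor": licensor, "licensee": licensee,
        "territory": territory, "term_years": term_years, "fee": fee,
        "timestamp": time.time(),
        "prev_hash": ledger[-1]["hash"] if ledger else "0" * 64,
    }
    # Hashing each deal together with the previous deal's hash chains the
    # entries, so past records cannot be quietly altered.
    deal["hash"] = hashlib.sha256(
        json.dumps({k: v for k, v in deal.items() if k != "hash"},
                   sort_keys=True).encode()
    ).hexdigest()
    ledger.append(deal)
    return deal

def verify(ledger):
    """Recompute every hash and check the links; True if nothing was tampered with."""
    for i, deal in enumerate(ledger):
        expected = hashlib.sha256(
            json.dumps({k: v for k, v in deal.items() if k != "hash"},
                       sort_keys=True).encode()
        ).hexdigest()
        if deal["hash"] != expected:
            return False
        if i > 0 and deal["prev_hash"] != ledger[i - 1]["hash"]:
            return False
    return True

record_deal("Example Novel", "Acme Press", "Beispiel Verlag", "DE", 5, 12000)
print(verify(ledger))  # True
```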

By automating systems in the rights department, using tools which generate smart contracts that can be resolved and signed in a matter of moments, a publisher can not only increase revenue but also gain a better understanding of the marketplace and make better acquisitions in the future. So why are publishers so hesitant to adopt technology in the rights department?

Jane Tappuni, an expert on the front lines of the rights and licensing industry and General Manager of IPR License, deals with publishers and rights every day. As a platform built to discover, buy, and sell international rights online, IPR License confronts daily the challenges publishers face in this brave new technological world. We asked her to weigh in on how technology can help publishers…or not.


Jane Tappuni, IPR License

PageMajik: How will smart contracts help publishers?

Jane Tappuni: A smart contract can be built on the blockchain and allow the IP to be transacted, or, in simple terms, allow the creator to make money. Smart contracts help you exchange something of value in a transparent, conflict-free way while avoiding the services of a middleman. In publishing this could mean a better way to transact rights, by taking the information out of individual publishing organizations and into a blockchain with smart contracts attached that allow the rights sale to take place.

PageMajik: Do you see smart contracts significantly changing the way publishers handle rights and licensing in the future or will it be a slow adoption over many years in particular sectors of the industry?

Tappuni: Yes, I think there is an opportunity to change and improve the way rights and licensing are handled via a blockchain and smart contract solution. This is a massive behavioural shift from using internal, siloed systems to a shared, verifiable database of sorts. This change in behaviour could take a long time.

PageMajik: When you work with publishers, what have been their biggest concerns about adopting technological improvements in their business?

Tappuni: Their biggest concern is value for money; return on investment is always the number one concern.

PageMajik: Do you see any downside to publishers relying on technology to help improve their business?

Tappuni: Not as long as publishers choose the right technology tools for the problem they want to solve. All too often organizations implement new software to repeat the processes they already have in place. New technology implementations are a good time to really think about process improvement.

PageMajik: With the adoption of smart contracts to secure rights transactions and track royalties, providing more revenue for publishers and freeing up staff to focus on other work, how do you see the international rights and licensing industry changing? Will there be additional challenges to overcome?

Tappuni: I see this as a possible solution to the huge problem of rights tracking. At the moment publishers use a variety of rights solutions to store their rights data, some good and some not so good. This would take rights data out of siloed publishing systems owned by IT and into a secure, accessible arena. The day-to-day role of a rights professional would not change, as they would still be performing a rights sales role, but they would be using a global blockchain solution as a positive tool for rights ownership data.


Jane Tappuni has more than 20 years of publishing experience and is currently the General Manager (consulting) at IPR License, a place to discover and buy international rights and permissions online. IPR License is owned by the Frankfurter Buchmesse, Copyright Clearance Center, and the China South Publishing & Media Group. Jane is a specialist in publishing technology, with a focus on transactional IP management and solutions, and is also a graduate of the Oxford University Saïd Business School Blockchain Strategy Programme.

Is the Science behind AI just Alchemy?

In Primo Levi’s celebrated short story collection The Periodic Table, the story titled “Chromium” illustrates how our collective ways of behaving come to incorporate procedures whose justification no longer applies. When Levi worked in a paint manufacturing company, he found that a certain batch of paint had turned solid due to an accidental excess of chromium oxide. In response, he added ammonium chloride to the paint to make it liquid again, and recommended continuing to do so until that batch was used up. He then left his job, but when he returned 10 years later, he found that people were still adding ammonium chloride despite the bad batch having long been replaced: "And so my ammonium chloride, by now completely useless and probably a bit harmful, is religiously ground into the chromate anti-rust paint on the shore of that lake, and nobody knows why anymore."

According to AI researcher Ali Rahimi, something analogous is happening in the field of AI research today. Last December, he argued that the use of machine-learning algorithms had become a form of alchemy, since the researchers developing and using them don’t know why their algorithms work or why they fail.

Algorithms are tweaked and tested by trial and error to generate success against benchmarks, but it isn’t really possible to pinpoint whether the success is due to the core algorithm or whether peripheral add-ons are doing all the heavy lifting. Rahimi thinks this is an unhealthy state of affairs and urges greater attention to explanations and root causes. He must have been onto something, because his talk received 40 seconds of standing applause from the audience.

Not everyone agrees with Rahimi, however. According to Facebook’s Yann LeCun, Rahimi is fundamentally wrong because, while understanding is certainly good wherever you can get it, understanding often only follows the creation of methods, techniques, and even tricks. To insist that the creation of new technology take place only where understanding is possible would be to cripple innovation. He even makes this claim concrete by arguing that this was precisely why neural nets didn’t get the attention they deserved for over ten years.

Still, I get the sense that Rahimi and LeCun are arguing past each other, because there’s no indication that Rahimi wants the kind of comprehensive understanding that would stifle innovation, so much as a more rigorous approach to avoid pitfalls. In a recent paper, for example, he calls for measures like the following (a rough code sketch of the first two appears after the list):

  • Breaking down performance measures by different dimensions or categories of the data

  • Including full ablation studies of all changes from prior baselines, testing each component change in isolation and a select number in combination

  • Informing understanding of model behavior with intentional sanity checks, such as analysis on counter-factual or counter-usual data outside of the test distribution

  • Finding and reporting areas where a new method does not perform better than previous baselines
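None of this requires exotic tooling. As a minimal sketch (my own illustration, not code from the paper), a per-category performance breakdown and a one-component-at-a-time ablation might look like this:

```python
# Sketch of two of the measures above: per-category breakdown and ablation.
# My own illustration, not code from Rahimi's paper; the data below is fake.
from statistics import mean

def accuracy(predictions, labels):
    return mean(p == y for p, y in zip(predictions, labels))

def breakdown_by_category(predictions, labels, categories):
    """Report accuracy separately for each slice of the data."""
    per_cat = {}
    for cat in set(categories):
        idx = [i for i, c in enumerate(categories) if c == cat]
        per_cat[cat] = accuracy([predictions[i] for i in idx],
                                [labels[i] for i in idx])
    return per_cat

def ablation_study(train_and_eval, components):
    """Re-run evaluation with each component switched off in isolation.

    `train_and_eval` maps a set of enabled components to a score;
    `components` is the full set used by the baseline.
    """
    results = {"baseline": train_and_eval(set(components))}
    for c in components:
        results[f"without {c}"] = train_and_eval(set(components) - {c})
    return results

# Toy usage: a fake scoring function stands in for a real training run.
fake_score = lambda enabled: 0.70 + 0.05 * len(enabled)
print(ablation_study(fake_score, ["dropout", "warmup", "label smoothing"]))
print(breakdown_by_category([1, 0, 1, 1], [1, 0, 0, 1],
                            ["news", "news", "fiction", "fiction"]))
```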

These are clearly not intended to stop progress, but to ensure a more sustainable model of growth. Still, the question of whether this will actually generate better results is one that cannot be answered through armchair philosophy — we’ll simply have to give these methods a shot and see if they prove fruitful.

The State of Automation - Part 4

Over the previous weeks we’ve been analysing the impact automation and disruptive technologies will likely have on the publishing industry. We’ve explored the innovations on the horizon and how the different roles in book publishing will be affected by them in the short-, mid- and long-term future.

Automation will have a massive impact on publishing, there is no doubt whatsoever about that. But whether this impact is negative or positive depends greatly on the industry response. Will publishers let innovation happen to them? Or will they act quickly to understand how new technologies work and can be applied to their organisations, then evolve their working practices and reskill their workforce accordingly?

In The Book Industry Study Group’s “State of Supply Chain” survey conducted earlier this year, 33% of respondents said they were somewhat or very concerned about the potential to be replaced by technology or artificial intelligence. This week, in our final post of this four-part series, we look at survival and what publishers, and those who work in the industry, can do to confront the new reality of what many are calling the fourth industrial revolution.

Knowledge is power

If the last 20 years have taught us anything it’s that rapid innovation can, and will, gobble you up if you’re not prepared for it. And most industries have suffered, some more than others, at the hands of disruptive technologies they were completely ignorant about and ill-prepared to respond to. This is a lesson we all must learn from.

Publishers, who traditionally tend to adopt a rather cautious approach to new technology, will need to know exactly what is around the corner when it comes to automation. Not knowing will mean not being able to respond quickly enough when the world around them is transforming at break-neck speed.

Publishing houses which are aware of these developments, those prepared to take an open-minded approach and start to experiment, and those proactively seeking ways to use automation to their benefit, will automatically be in advantageous positions.

Humans are (still) essential

A survey conducted by Evolve in 2016 revealed that the most in-demand skills in the workplace are “the ability to work cooperatively, flexibly and cohesively”. These soft skills are areas where humans still outdo robots (well, at least for the next 15 years, which is when some experts predict computational power will equal that of the human brain). Recognising this is key.

While AI will do a fantastic job of automating a variety of tasks, in most cases AI technology is at its most powerful when it interacts with humans and benefits from the creativity, imagination and judgement of the human brain. To this end, being able to harness automation-driven technology, play to its strengths, and align it with human capabilities will give publishers an edge.

In the real world, this can be applied in the editorial department, for example, where AI can be used to do the heavy lifting when it comes to proofing manuscripts, but the process will still need to be overseen by human eyes. Or in the production department, where AI can be applied to a great many production tasks, but judgement calls and business-critical decisions, on print runs for example, will still need to be made by humans.

Next gen workforce

Many believe that in a world of automation the only people who will survive will be those who came out of the womb coding, and that only employees with an intimate understanding of the latest tech will be of any use in the future. Although rather exaggerated, this is to some extent true. As technology plays an ever more influential role in our working lives, job seekers who are tech savvy and can prove they have the ability to work alongside the latest innovations will always have an edge.

However, on the other side of the coin, another view is that widespread automation will make those who have heightened emotional intelligence and a softer skill base more in demand, as reflected in this BBC article on “automation resistant skills”.

Either way, it’s highly likely that those who show an innate understanding of technology and a willingness to work with it, while also demonstrating a range of emotional skills, will be the most likely to thrive in an automated workplace, and it is these candidates who will be most valuable to publishers.

Automation is going to change book publishing as we know it beyond all recognition. It will be as gradual as it will be sudden. It will be as beneficial as it will be damaging. Publishers will flourish and perish, and employees will gain and lose. This is what has happened during every major period of disruption since the dawn of time. But the industry has a small window of opportunity to at least learn about how the publishing business might be affected and what sort of steps can be taken to exploit opportunities afforded by automation as opposed to getting left behind.

Ignore the Headlines and Embrace the Bots

In 2018, bots became even more prevalent in the marketplace. According to a study by Distil Networks, a leading bot security company, almost half of web traffic (42.2%) in 2017 was not human. Though some may find this trend surprising or even alarming, bot traffic has been growing consistently for the last five years as more companies add bots into their workflow systems. What has proven a growing concern, according to the media, is the influx and rise of “bad bots.”

Bots can be incredibly helpful, taking on mundane or repetitive tasks and allowing humans the opportunity to do more creative, thoughtful work. Bots have been adopted to handle customer service tasks and to help curate products for individual users, among other activities. But there are also bad bots, which first drew attention when they were used to buy tickets online and then offer those same tickets at a much higher resale price. These bots are also responsible for stealing personal information, social media harassment, disrupting the marketplace, and, in the largest show of bot activity, potentially impacting the 2016 US presidential election. The presence and prevalence of bad bots is increasing too, with bad bot traffic up 10% last year, slightly outflanking good bot traffic (21.8% of total web traffic is bad bots vs. 20.4% good bots).

What makes bots unique is that they tend to mimic human behavior, and mimic it very well. That is what makes bad bots particularly difficult to battle: they are often very hard to detect. The existence and growing pervasiveness of bad bots adds to public concern about the implications of artificial intelligence and whether or not AI can “turn against humans.”

But, as with any technology, security and defense systems are being developed to thwart bad bots. The first piece of legislation, the Better Online Ticket Sales (BOTS) Act, passed in September 2016 to deal with the aforementioned ticket-buying bots (though this continues to be a problem despite the legislation). An op-ed in Fortune earlier this year calls for both private security measures and government intervention, through creating or updating legislation that would levy heavy fines and penalties on the parties creating bad bots.

Some in the technology world are leading the charge against bad bots, including Twitter, which has been questioning 9.9 million accounts thought to be spam or bots, creating more sophisticated authentication procedures, and preventing an average of 50,000 spam or bot accounts a day from being set up.

Though bad bots are a problem and a threat to the marketplace, they should not overshadow the use of good bots to increase efficiency, improve systems, and analyze data in a variety of industries. Headlines scream that bots are bad, but, in reality, half of the bots out there are refining processes, allowing for further creativity, development, and increased revenue.

As Harley Davis, Vice President, France Lab and Decision Management, IBM Hybrid Cloud, writes in a February blog post, “Businesses need solutions that assist in automation rather than simply fulfilling it, handle tasks intelligently and are highly autonomous. These solutions also must deliver customer-centric and personalized experiences, at enormous scale, without a massive back-end operation to prop them up.” The next generation of bots will not simply conduct mundane, repetitive tasks; they will be able to adapt as a company grows and changes, taking on each challenge intelligently. Having a system that can flex as goals and needs change is crucial to progress and advancement as the marketplace transforms.

#CockyGate and the Perils of Trademark Bullying

Trademarks are among the most important ways creative professionals can protect their brand, ensure their fans can easily identify their work, and protect themselves from similar products by others. But trademark allocation brings up tough questions about what a reasonable trademark would consist of, and at what point trademarks are being used unfairly to stifle competition.

Since 2016, novelist Faleena Hopkins had been writing romance novels in her ‘cocky’ series, for example “Cocky Roomie” and “Cocky Biker”. She had written 19 books and sold 600,000 copies in this series, so, wanting to protect her brand, she decided to trademark “cocky” to keep copycat authors from riding on her coattails. When her trademark registration was issued in April 2018, she sent notices to several authors whose books had “cocky” in their titles, informing them of the trademark violation and asking them to either change their titles or face legal action.

Initially, a few authors complied with her demands. Jamila Jasper had published a book titled “Cocky Cowboy” in March 2018, which had the same title as a book Hopkins had published in September 2016, and she was one of the authors to receive a cease and desist letter. She shared a screenshot of it on Twitter.

She wrote on her blog that she decided to err on the side of caution: she unpublished her book and republished it under the title “The Cockiest Cowboy To Have Ever Cocked”, paying to redesign its cover. Although she said she’s trying to remain optimistic, she admitted that “it hurts to be attacked and it hurts to have your integrity questioned”. She also argued that similar, even identical titles are “exceedingly common” in the romance publishing industry, and that it was therefore incredibly unfair to demand that other authors take down books they had already published and refrain from using “cocky”, an incredibly common descriptor in the genre.

The internet agreed with Jasper, and a massive online backlash was unleashed against Hopkins. Writers piled onto her social media accounts, with negative comments far outnumbering likes and retweets or shares. On Facebook, she was inundated with comments from authors and readers declaring that they were going to boycott her. On sites that allow reviews, like Goodreads and Amazon, her books were hit with negative reviews that explicitly referenced how she had targeted indie authors who didn’t have the resources to fight her in court.

Eventually, the Authors Guild and Romance Writers of America challenged her trademarking of a common word in court and won, with the judge ruling that Hopkins’ desired “preliminary injunction censoring the continued publication of various artistic works is unwarranted and unsupported”. Hopkins then decided to abandon her trademark battle, and the #CockyGate saga finally came to an end.

While this particular incident might have ended well, it also reveals that the process through which trademarks are approved allows authors to overstep and try to trademark overly generic phrases, whether intentionally or otherwise. One innovative way to battle this is CockyBot, a Twitter bot that automatically finds fiction-related trademark applications in the US Patent and Trademark Office’s database and tweets them out.

For each application, CockyBot tweets out the phrase being trademarked, the status of the application, the documents submitted, and a link to an Amazon search for products that might be related to the phrase.
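The output-formatting side of such a bot is simple enough to sketch. The following is my own illustration with an invented application record, not CockyBot's actual code:

```python
# Sketch of how a bot might format a tweet for a trademark application.
# The application record below is invented; this is not CockyBot's code.
from urllib.parse import quote_plus

def format_tweet(application):
    amazon_search = "https://www.amazon.com/s?k=" + quote_plus(application["phrase"])
    return (
        f'New fiction-related trademark filing: "{application["phrase"]}"\n'
        f'Status: {application["status"]}\n'
        f'Documents: {application["documents_url"]}\n'
        f'Possibly affected products: {amazon_search}'
    )

example = {
    "phrase": "dragon slayer",                    # hypothetical filing
    "status": "New application - awaiting examination",
    "documents_url": "https://tsdr.uspto.gov/",   # USPTO's public status portal
}
print(format_tweet(example))
```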

While most of the applications seem acceptable, the occasional generic term like “dragon slayer” or “big” still turns up. Clearly, not everyone has learnt from the Hopkins affair.

However, we need to remember that while the kind of behaviour Hopkins engaged in might be unacceptable, trademarks are an essential part of how creative professionals make a living and survive. To keep such incidents from repeating, we need a way for authors to check for similar titles on the market, while not letting frivolous trademarks impede them.

A possible solution is technological. While CockyBot is certainly a step in the right direction, it still relies on human users looking through the Amazon search results themselves to check whether any products from other creators contain the word or phrase being trademarked. What we need going forward is a way for authors to check whether the title they are planning to use is already in use, as well as whether it would violate someone else’s legitimate trademark.
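A first pass at such a check could be as simple as fuzzy matching a candidate title against known titles and registered marks. A minimal sketch with made-up data, not a description of any existing service:

```python
# Minimal sketch (made-up data): flag existing titles that are suspiciously
# similar to the one an author plans to use, and any registered word marks it contains.
from difflib import SequenceMatcher

existing_titles = ["Cocky Roomie", "Cocky Biker", "The Duke's Gambit"]
registered_marks = ["cocky"]  # hypothetical word marks in this genre

def check_title(candidate, threshold=0.8):
    warnings = []
    for title in existing_titles:
        similarity = SequenceMatcher(None, candidate.lower(), title.lower()).ratio()
        if similarity >= threshold:
            warnings.append(f"very similar to existing title: {title!r} ({similarity:.0%})")
    for mark in registered_marks:
        if mark.lower() in candidate.lower().split():
            warnings.append(f"contains registered mark: {mark!r}")
    return warnings or ["no obvious conflicts found"]

print(check_title("Cocky Roomy"))
```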

As the number of books hitting the market increases, and new authors try their hand at writing for niche audiences, it is no longer possible for each person to be mentored individually and taught their way around the industry. Luckily, tech solutions like well-crafted automation can be of enormous help to these newcomers, helping them avoid pitfalls they might not even have imagined were problems.

The State of Automation - Part 3

During the past few weeks we have been looking at how automation may impact the book publishing industry in the future. In the previous post, we started exploring and analysing how many of the different roles within the publishing ecosystem could be affected by this phenomenon, revealing how upper management, HR, legal and financial positions will likely fare.

This week we turn our attention to some of the more traditional roles in publishing to understand what the future of working in the industry could be like.

Editorial: People who aspire to work in publishing out of a love for the written word often have their hearts set on editorial jobs. From discovering new talent to working with writers to refine their work, and from negotiating contracts to correcting manuscripts, editors are very much considered the heart and soul of a publishing house, and their roles are incredibly diverse and multi-faceted. But editorial responsibilities will probably be among those hit hardest by automation.

Ever since Jodie Archer and Matthew L. Jockers famously released The Bestseller Code: Anatomy of the Blockbuster Novel and came up with the bestseller-ometer, the algorithm at the heart of the book’s thesis, much has been said about whether computers can do what was previously considered an incredibly “human” job: that of the commissioning editor.

Understanding complex emotions, what makes us tick, the journey we want a book to take us on, and the characteristics which can ultimately make a book a success: these are the skills at the core of what commissioning editors do. The fact that big data algorithms have been developed, and that machine learning based start-ups such as Intellogo and Archer and Jockers’ very own consultancy, Archer Jockers, have come into existence, shows that this is an aspect of publishing which is ripe for automation. But will we see the role of the commissioning editor replaced? It’s highly doubtful. It’s more likely that the commissioning editors of the future will incorporate AI tools into their role to assist them in uncovering and snapping up potential bestsellers, allowing them to focus on nurturing author relationships and managing other aspects of the book cycle.

Lower down the editorial chain of command is where automation will really take no prisoners. As workflow tools become increasingly sophisticated and integrate machine learning as the new normal, the need for copy editors and proofreaders will diminish, as the new technology sifts through manuscripts checking flow, sense, clarity, consistency, grammar, and even facts. The editorial department of the future looks very different from what it is now, and those looking to enter publishing via the editorial route may find themselves training for a completely different role.
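To give a flavour of the narrowest slice of this, here is a toy consistency check for variant spellings, with assumed sample text; real copyediting tools are of course far more sophisticated:

```python
# Toy consistency check: flag variant spellings that a copy editor would normally
# reconcile by hand. A tiny slice of what such tools do, not a real system.
import re
from collections import Counter

manuscript = "The ebook market grew. Readers bought the e-book eagerly. Every eBook sold."

variants = Counter(m.lower() for m in re.findall(r"\be-?book\b", manuscript, re.IGNORECASE))
if len(variants) > 1:
    print("Inconsistent spellings found:", dict(variants))
# Inconsistent spellings found: {'ebook': 2, 'e-book': 1}
```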

Design: Despite design being considered among the most creative disciplines in publishing, there are various elements of graphic design, in particular, which are succumbing to automation. In this article by Rob Peart, ominously entitled Automation threatens to make graphic designers obsolete, the author argues that much of the work designers do is already ‘prescriptive’ and being affected by automation. He goes on to discuss the work of designer Jon Gold, who is applying machine learning techniques to standard graphic design procedures, using this approach to analyse typefaces and typographic trends, for example. Interestingly, Gold’s pull quote states: “I’m building design tools that try to make designers better by learning about what they’re doing. Augmenting rather than replacing designers.” In publishing, where many companies traditionally opt for a particular house or brand style when it comes to book jackets, typefaces and marketing materials, the automation of many of the more procedural design processes could have an extremely positive impact on the role of the designer, freeing them up to focus on the more creative elements of their job. Designers becoming obsolete is not a likely outcome, certainly not in the short to mid-term future; designers training or retraining to use the latest machine learning-driven tools at their disposal is a far more realistic consequence of automation.

Production: The publishing department we expect to be hit hardest by automation is production. While there will always be a need for production personnel to oversee the supply chain and bring books to market, it is probable that this area will be deeply affected by automation and that junior production roles will be the most at risk. Workflow tools which incorporate machine learning are increasingly automating many key production tasks, such as formatting, layout, typesetting and proofing. They are also facilitating improved lines of communication between different departments, like design and editorial, another important aspect of the production role. To stay in the game, production staff will inevitably become jacks-of-all-trades, equipping themselves with more technical skills as well as the ability to take on editorial and design tasks.

Marketing: There is no doubt that in most marketing circles the arrival of automation is considered a force for good. Applications incorporating AI have flooded the marketplace and are already helping marketers in their day jobs, enabling them to analyse data and trends more efficiently and become more impactful in their roles. In this Forbes article by Andrew Stephen, head of marketing at Oxford’s Saïd Business School, we can see how marketing as an industry is adapting to this new reality and how digital literacy is now such an important currency for existing and aspiring marketers. In terms of how this plays out in publishing, AI can help deliver a much greater and deeper understanding of consumers and readers, so those who empower their marketing departments with these valuable tools will inevitably be one step ahead.

The final post in this four-part series will examine what all this means for publishers, what the industry might look like in the future, and how publishers should consider equipping themselves for automation across the business.

Will Publishers Who Are Investing in Technology Be Better Prepared for the Future?

Earlier this month, Pearson announced that it had hired former Intel executive Milena Marinova to fill the new position of Senior Vice President for Artificial Intelligence (AI) Products and Solutions. As one of the first publishing companies to create such a role, Pearson appears to be jumping headlong into finding ways to use advances in machine learning and automation to better its business.

The Bookseller’s article about the appointment notes that “Marinova said there were untapped opportunities within education where it could draw on digital and advanced AI techniques to the benefit of teachers and learners.” Could it be that it takes someone from outside the publishing industry to see what potential technology can offer publishers?

This isn’t the first time a publisher has brought talent from outside the industry in-house to help with strategic development. Just among the Big Five, Chantal Restivo-Alessi, Chief Digital Officer at HarperCollins, worked in the music industry and banking before joining the ranks at HarperCollins; Nihar Malaviya, Chief Operating Officer at Penguin Random House, worked as a consultant for JP Morgan and directed Bertelsmann’s Business Development; Cara Chirichella, Senior Director of Digital Marketing and Technology at Macmillan, worked in customer engagement.

While some companies may invest in bringing talent from other industries in-house to provide outside perspective and skills, others rely on service providers who can create a tailored program or system to meet the publisher’s unique needs or goals.

At Pearson, Marinova will be focused on “exploring and applying existing and new technologies in artificial intelligence, machine learning, including deep and reinforcement learning, as well as data analytics and personalized learning into current and future products and services,” according to the announcement.

As the market changes and customer desires fluctuate, it is important for publishers to be agile and bring in the talent needed to address those changes. And with automation-driven technology destined to play a major role in all of our futures, in publishing and beyond, having the right people in place, with the right knowledge, might just make all the difference when it comes to future-proofing our businesses.

A Crisis in Discoverability and how we can move towards fixing it

Because there is no single central repository that collects information about scholarly papers from every discipline, it is somewhat hard to estimate the exact number of journals and papers published each year. A conservative estimate was generated by Lutz Bornmann and Ruediger Mutz in their 2014 paper Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references, in which they track all material (papers, books, datasets, and even websites) cited between 1980 and 2012. From this data they found that the rate of scientific output increases by 8–9% every year, meaning total output doubles roughly every nine years. (The dip in recent years can plausibly be chalked up to more recent papers simply not having had enough time to be cited.)
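The arithmetic behind that doubling claim is simple compounding: at a steady annual growth rate r, output doubles after ln 2 / ln(1 + r) years.

```python
# Doubling time under steady compound growth: ln(2) / ln(1 + r).
from math import log

for r in (0.08, 0.09):
    print(f"{r:.0%} annual growth -> doubles every {log(2) / log(1 + r):.1f} years")
# 8% annual growth -> doubles every 9.0 years
# 9% annual growth -> doubles every 8.0 years
```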

Admittedly, a citation-based count is an imperfect measure, because it ignores all those sources that were never cited, as well as those that are simply no longer cited. Still, there is at least a prima facie case that there has been a dramatic increase in the amount of research being produced.

And even this might be understating the actual amount of potentially valuable work produced. One academic estimates that every year 10,000 papers get written within his discipline, competing for around 2,000 publication slots. Those whose papers are rejected don’t just give up, but keep trying to publish in other reputable venues, leading to a backlog that spikes rejection rates to 94%. Since it seems quite plausible that a substantial chunk of those unpublished papers are actually valuable and only missed out because of a lack of space, he advocates “creating a lot more journal space (maybe 3 times as much as we have now) for the additional papers to be published”.

And this isn’t even taking into consideration the effect of the Open Access movement and the trend of sharing results directly on social media and the web, and how the lack of traditional gatekeepers will almost certainly increase how much content gets produced.

What these discussions mean for publishers is that there is going to be an increasing need for efficiently sifting through large quantities of research output, because if relevant work can be located, then it is immaterial how much more unrelated material is added. In other words, discoverability is going to become an increasingly pressing issue.

I speculate that two kinds of tech changes will be necessary if we are going to deal with this issue. The first is increasingly fine-grained tagging of content, which will permit researchers to conduct incredibly precise searches for the topics they’re interested in. This might mean, for example, that instead of settling for a handful of keywords along with the title and author information, books will have to offer chapter-level tagging, providing both more metadata and more precise metadata.

But as the metadata requirements get more demanding, the traditional manual generation of metadata will become increasingly onerous. This will call for machine learning approaches that can rapidly scan content and generate the relevant kinds of metadata, which can then simply be reviewed and approved by a human counterpart. This isn’t going to be a simple requirement, because different kinds of data (photos, paragraphs, etc.) will require quite different technical approaches, some involving the clever manipulation of language rules, and others looking to image identification techniques. And different academic fields might require very different metadata, meaning tech will have to pay close attention to the variety of demands instead of simply producing a generic, high-level solution.
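As one narrow example of what machine assistance could look like here, the sketch below proposes candidate chapter-level tags via TF-IDF keyword extraction (one of many possible approaches; it assumes scikit-learn is available, and the chapter texts are made-up stand-ins):

```python
# Sketch of machine-assisted chapter-level tagging via TF-IDF keyword extraction.
# One of many possible approaches; chapter texts are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer

chapters = {
    "Chapter 1": "Field methods, survey design, and sampling of coastal villages.",
    "Chapter 2": "Interview transcripts on fishing rights, quotas, and local councils.",
    "Chapter 3": "Statistical analysis of catch data and seasonal price variation.",
}

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(chapters.values())
terms = vectorizer.get_feature_names_out()

# Propose the top-weighted terms per chapter as candidate tags,
# to be reviewed and approved by a human editor.
for (title, _), row in zip(chapters.items(), matrix.toarray()):
    top_terms = [terms[i] for i in row.argsort()[::-1][:4]]
    print(title, "->", top_terms)
```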

The increase in scholarly output might seem intimidating, but I prefer to look at it more optimistically since it suggests that we have the good fortune to be living in a time where we are producing more knowledge than we know how to handle. With some clever technical fixes, we should be able to harness this increase in productivity across the board, and effortlessly navigate through these changing times.
