We need to talk about ChatGPT



Writing in The New Yorker last February, sci-fi author Ted Chiang likened ChatGPT to “a blurry JPEG of the web”, which I think is a very good simile. It retains much of the information on the internet, but in the same way that a JPEG retains much of the information of a higher-resolution image – it compresses that information, so what you get is an approximation of the web. If you have access to the internet itself, how much use is a blurry image of it? This is why, in its current version, ChatGPT should not be relied on as a source of factual information.
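
To make the analogy concrete, here is a minimal Python sketch – assuming Pillow and NumPy are installed, with illustrative file names and quality setting – that saves a synthetic image at a very low JPEG quality and measures how far the lossy copy drifts from the original: recognisable on the whole, but with the detail gone.

```python
import numpy as np
from PIL import Image

# Build a synthetic "high-resolution original": a gradient with some noise.
rng = np.random.default_rng(0)
original = (np.linspace(0, 255, 256)[None, :].repeat(256, axis=0)
            + rng.normal(0, 20, (256, 256))).clip(0, 255).astype(np.uint8)
Image.fromarray(original).save("original.png")           # lossless reference

# Save a heavily compressed copy and read it back.
Image.fromarray(original).save("blurry.jpg", quality=5)  # lossy approximation
approximation = np.asarray(Image.open("blurry.jpg"), dtype=np.uint8)

# The copy is recognisable but the fine detail is lost: an approximation,
# not a faithful record - which is the point of the analogy.
error = np.abs(original.astype(int) - approximation.astype(int)).mean()
print(f"mean absolute error per pixel: {error:.1f}")
```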


The shock of the not-so-new


Like most people, I had a moment of shock when I first used ChatGPT. What really strikes you is its capability to imitate humans. Our species is fascinated by what resembles us, and our anthropomorphic bias leads us to attribute human traits to non-human entities. This may help explain why ChatGPT has become a social phenomenon as much as a technological advancement. My guess is that, even if it feels like a breakthrough now, in time ChatGPT will come to seem like another step in an evolution of technology that a gradual alignment of the planets has enabled.


I believe it’s crucial to understand how we got here: there has been a constant build-up over the past 20 years, and one factor is human capital. We have trained more and more engineers and scientists in computer science and other STEM fields, and researchers have developed open-source, standardised tools and built on each other’s work to get to this point. We have also seen an implicit alliance between academia and the technology sector, with companies investing in highly efficient, scalable computing infrastructure.


A new business ecosystem


We may be seeing the appearance of a new ecosystem in which people build many applications on top of LLMs, much as smartphones gave rise to a new creative generation. This could lead to the emergence of new types of business – app-based platforms, for example – that connect directly with users. LLMs could become the go-to platform for entrepreneurs creating new applications, and many corporations are already thinking about how to leverage the technology in customer service.
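
As a rough illustration of what building on such a platform can look like, here is a minimal Python sketch using the OpenAI client library in its 2023 (pre-1.0) form; the model name, system prompt and placeholder API key are assumptions made for the example, not details taken from any particular product.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder - set your own key

def answer_customer(question: str) -> str:
    """Route a customer-service question through a hosted LLM."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You are a polite customer-support assistant."},
            {"role": "user", "content": question},
        ],
        temperature=0.2,  # keep answers conservative for a support setting
    )
    return response["choices"][0]["message"]["content"]

print(answer_customer("How do I reset my password?"))
```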


To open source or not?


Different companies have their own take on this. Meta AI, for example, has been much more open with its LLM, LLaMA. It has not released it as a public chatbot like ChatGPT, but as an open-source library of models to which anyone in the AI community can request access. Meta’s strategy, or so it claims, is to make the tool available, explain how it works, and make it completely open to users and the research community. 
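
In practical terms, that openness means researchers who have been granted access can download the weights and run the models themselves, for example through the Hugging Face `transformers` library. A minimal sketch, assuming the weights have already been obtained and converted into Hugging Face format at an illustrative local path:

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# Assumes access to the LLaMA weights has been granted and they have been
# converted to Hugging Face format at this (illustrative) local directory.
model_dir = "path/to/llama-7b"
tokenizer = LlamaTokenizer.from_pretrained(model_dir)
model = LlamaForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("The main barrier to entry for new LLMs is",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```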


OpenAI’s strategy seems a bit more ambiguous. ChatGPT is “open” in the sense that anyone can use it now (subject to safety measures), and OpenAI charges a fee for that access. The company shares what the algorithm is doing to some extent, but not much is actually known or understood about it. OpenAI’s CEO believes we should not have a fully open-source AI universe because of its potential for harm. This is a trade-off: the more open-source the models are, and the more transparency we have about how the algorithm is built, the more accessible the technology becomes to a wider range of stakeholders. But if it is completely open-source, people with malign intent may repurpose the tool for illegal activities.


Barriers to entry


There’s another question around ease of replication: to what extent is there a barrier to entry for a company, institution or individual that wants to build its own LLM? How easy is it to do this? We need more research in this area. Researchers from Stanford University recently claimed to have developed a small AI language model on a budget of less than $1,000. It’s an interesting study because, if the results are confirmed, it would mean that the barrier to entry is not that high. But I’m sceptical – I think the big tech companies will have a first-mover advantage because they have massive computing infrastructure and huge volumes of data and human feedback. (Shortly after its release, the Stanford research team took its model down, citing misinformation and rising costs.)


Anyone may be able to develop an AI tool cheaply, but that tool may not be trustworthy and may be more limited. The real difficulty is catching the ‘edge’ cases, such as when the tool produces offensive or racist output. The barrier to entry lies in the extent to which we can replicate the fine-tuning these platforms have been engaged in, which is a non-trivial task. If we can’t replicate the models easily, we risk ending up in the hands of the tech giants, who will have a first-mover advantage and will capture much of the value added.
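
To give a sense of what the basic, cheap part of that process looks like, here is a minimal sketch of supervised fine-tuning on instruction/response pairs with the Hugging Face `transformers` library; the base model, the toy examples and the training settings are illustrative assumptions. This is the kind of step the Stanford team reported doing on a small budget – the much harder safety fine-tuning discussed above is not covered here.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for a small open base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy instruction/response pairs; a real run would use tens of thousands.
examples = [
    "Instruction: Summarise the quarterly report.\nResponse: Revenue grew in Q1.",
    "Instruction: Translate 'bonjour' into English.\nResponse: Hello.",
]

class InstructionDataset(torch.utils.data.Dataset):
    """Tokenise each example and use the text itself as the training target."""
    def __init__(self, texts):
        self.encodings = [tokenizer(t, truncation=True, padding="max_length",
                                    max_length=64, return_tensors="pt")
                          for t in texts]
    def __len__(self):
        return len(self.encodings)
    def __getitem__(self, idx):
        item = {k: v.squeeze(0) for k, v in self.encodings[idx].items()}
        item["labels"] = item["input_ids"].clone()
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-demo", num_train_epochs=1,
                           per_device_train_batch_size=2, logging_steps=1),
    train_dataset=InstructionDataset(examples),
)
trainer.train()  # one short pass over the toy data
```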



