Daniel Gillblad is the Director of the Chalmers AI Research Center (CHAIR) as well as the Co-Director of AI Sweden, the Swedish national center for applied artificial intelligence. He has a track record of working with both academia and industry and strives to turn research breakthroughs into applications benefiting Swedish citizens.
In early 2023, AI Sweden released GPT-SW3, the first truly large-scale generative language model for the Swedish language. The model uses the same technical principles as the much-hyped GPT-3, which the general public discovered through ChatGPT, and its goal is to help Swedish organizations build language applications that were not possible before. Indeed, the development process aimed to build this model “as openly, transparently, and collaboratively as possible, with the goal to make the model available to all sectors in Sweden that may have a need for NLP (Natural Language Processing) solutions”.
We are at a stage where generative models can truly assist, and in part replace, human work and creativity. Image models can generate visual representations that are easily mistaken for natural or human-created images, and large language models can generate text on relatively complex subjects in several different styles. As these models grow more capable and cover more modalities such as video, music, programming, and engineering, we will have to adapt to a new way of working in creative disciplines, and to a new reality in which text and representations that could previously only be attributed to human effort may have been generated by a machine, with its specific limitations and its own representation of the world.
While these techniques ultimately are going to increase human productivity and knowledge creation immensely, as with all powerful technologies, there are risks involved. Some of the more direct include:
- Overconfidence in the produced results by users
- Results that could include factual errors and misrepresentations of the world
- Change of roles and tasks in the workforce creating a need for education and organizational change
- Easier generation of disinformation and low-information material
Because we know of these risks and to a degree understand them, we can work to reduce them. There are, however, less understood risks that could prove more consequential. Perhaps the main problem with these models is not the factual errors or false representations they could produce but, for example, their gradual influence on the behavior of human users seeking advice and guidance.
There is definitely a place for European actors in this space, and for the sake of representation and inclusion it is important that not only Europe but all parts of the world take part in these developments. Driven by private actors, the development of these types of models is currently happening largely in the US. Given the importance of this type of technology, Europe needs to engage, but without similar private actors its approach to development likely has to be different. Collaborative efforts to develop open large language models have shown that there are alternative paths to development, but there is a need for concerted efforts by European actors.
Given their capabilities, if these models are available, they need to be broadly accessible, not just available to a chosen few. How this should be done responsibly is still an open question, but we are working from the principle that transparency, openness, and accessibility are critical for an open, democratic society.
As many of these models are quite generally applicable, general safeguards and the management of, for example, bias are somewhat complicated. A government service open to the general public and based on a large language model may need quite different safeguards from a specialized service accessible only to health professionals. Providers of specific applications will have to take responsibility for their impact and quality, but of course the systems must comply with the law in general and, for more sensitive applications, possibly come with documented testing and verification.
The GPT-SW3 project is important in several respects. First, we need the capacity, knowledge, and infrastructure to develop these types of models in Sweden. Second, having these models fully available for testing and specialization, for example for the Swedish healthcare system, is important for new, cutting-edge AI applications in Swedish, and it is an opportunity to collect the feedback needed to develop these models for responsible use in all sectors. Finally, fully open models based on documented data sources are critical for the research and experimentation needed to fully understand the potential and limitations of these models.