Abstract

Vision Language Models (VLMs), such as DALL-E, Midjourney, and Stable Diffusion, have raised significant concerns about authorship and about whether AI-generated images devalue artistic practices and traditions. Recently, some have argued that VLMs should be viewed as another tool that artists use to generate their creative outputs. I defend this position and expand on it by introducing an account of agency on which only biological agents, at least for now, possess the powers necessary to be responsible for the act of creation (for example, the capacity to realize autonomous goal-directed actions and to manipulate affordances). I ultimately argue that although VLMs allow artists to produce high-quality images with minimal technical skill, using VLMs to create artworks that are artistically valued will still require significant ingenuity. Therefore, in my view, concerns that this new tool will blur the lines of authorship and undermine artistic practices and traditions are unwarranted.
Dan Durso (Sat,) studied this question.