Table of Links
2. Prior conceptualisations of intelligent assistance for programmers
3. A brief overview of large language models for code generation
4. Commercial programming tools that use large language models
5. Reliability, safety, and security implications of code-generating AI models
6. Usability and design studies of AI-assisted programming
7. Experience reports
7.1. Writing effective prompts is hard
7.2. The activity of programming shifts towards checking and unfamiliar debugging
7.3. These tools are useful for boilerplate and code reuse
8. The inadequacy of existing metaphors for AI-assisted programming
8.1. AI assistance as search
8.2. AI assistance as compilation
8.3. AI assistance as pair programming
8.4. A distinct way of programming
9. Issues with application to end-user programming
9.1. Issue 1: Intent specification, problem decomposition and computational thinking
9.2. Issue 2: Code correctness, quality and (over)confidence
9.3. Issue 3: Code comprehension and maintenance
9.4. Issue 4: Consequences of automation in end-user programming
9.5. Issue 5: No code, and the dilemma of the direct answer
8. The inadequacy of existing metaphors for AI-assisted programming
8.1. AI assistance as search
In research studies, as well as in reports of developer experiences, comparisons have been drawn between AI programming assistance and programming by searching for and reusing code from the Internet (or from institutional repositories, the same project, or a developer’s previous projects).
The comparison between AI programming assistance and search is a natural one, and there are many similarities. Superficially, both have a similar starting point: a prompt or query that is predominantly natural language (but which may also contain code snippets). From the user’s perspective, both involve an information asymmetry: the user does not know precisely what form the result will take. With both search and AI assistance, any given query will return several results, and the user will need to invest time evaluating and comparing them. In both cases, the user may get only an inexact solution, or indeed nothing like what they want, and may then need to invest further time adapting and repairing the result.
However, there are differences. When searching the web, programmers encounter not just code, but a variety of result types intermingled: code snippets interspersed with human commentary, discussions on forums such as Stack Overflow, videos, and images. A search may also return APIs or libraries related to the query, thus showing results at different levels of abstraction. Search carries signals of provenance: it is often (though not always) possible to determine the source of a code snippet on the web. And there is plentiful information scent to assist with the information foraging task (Srinivasa Ragavan et al., 2016). In this way, programming with search is a mixed media experience.
In contrast, programming with large language models can be said to be a fixed media experience. The only output is tokens (code, comments, and data) that can be represented within the context of the code editor. This has some advantages: the increased speed of code insertion (which is the immediate aim) came up often in experience reports. However, the learning, exploration, and discovery that occur during web search, and the access to a wide variety of sources and media types, are lost. Provenance, too, is lost: it is difficult to determine whether a generation is original to the model or a stochastic parroting of its training data (Bender et al., 2021; Ziegler, 2021). Moreover, due to privacy, security, and intellectual property concerns, the provenance of code generated by large language models may be withheld or even destroyed (Sarkar, 2022). This suggests that future assistance experiences might integrate mixed-media search into programmer assistance tools, or make the models themselves capable of generating more types of result than the simple code autocomplete paradigm of current tools.
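To make the fixed media point concrete, consider a minimal sketch of the interaction, assuming a Copilot-style completion inside a Python editor. The prompt comment, function name, and generated body below are hypothetical illustrations, not the output of any particular model.

```python
# The "query" is nothing more than editor text: a comment plus surrounding
# code. There is no forum thread, video, or source link alongside it.
#
# Hypothetical prompt comment: "return the n most common words in a text file"

from collections import Counter


def most_common_words(path: str, n: int = 10) -> list[tuple[str, int]]:
    # A plausible model completion: plain tokens inserted at the cursor,
    # with no indication of where (or whether) similar code exists online.
    with open(path, encoding="utf-8") as f:
        words = f.read().lower().split()
    return Counter(words).most_common(n)
```

A web search for the same task would instead surface forum discussions of edge cases (punctuation, tokenisation), alternative libraries, and the source of each snippet; here, all of that surrounding context is collapsed into a single token stream inserted at the cursor.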
Authors:
(1) Advait Sarkar, Microsoft Research, University of Cambridge ([email protected]);
(2) Andrew D. Gordon, Microsoft Research, University of Edinburgh ([email protected]);
(3) Carina Negreanu, Microsoft Research ([email protected]);
(4) Christian Poelitz, Microsoft Research ([email protected]);
(5) Sruti Srinivasa Ragavan, Microsoft Research ([email protected]);
(6) Ben Zorn, Microsoft Research ([email protected]).