It's Not Your AI Partner: The Superficial Analogy Of Pair Programming

6 Aug 2025

Abstract and 1 Introduction

2. Prior conceptualisations of intelligent assistance for programmers

3. A brief overview of large language models for code generation

4. Commercial programming tools that use large language models

5. Reliability, safety, and security implications of code-generating AI models

6. Usability and design studies of AI-assisted programming

7. Experience reports and 7.1. Writing effective prompts is hard

7.2. The activity of programming shifts towards checking and unfamiliar debugging

7.3. These tools are useful for boilerplate and code reuse

8. The inadequacy of existing metaphors for AI-assisted programming

8.1. AI assistance as search

8.2. AI assistance as compilation

8.3. AI assistance as pair programming

8.4. A distinct way of programming

9. Issues with application to end-user programming

9.1. Issue 1: Intent specification, problem decomposition and computational thinking

9.2. Issue 2: Code correctness, quality and (over)confidence

9.3. Issue 3: Code comprehension and maintenance

9.4. Issue 4: Consequences of automation in end-user programming

9.5. Issue 5: No code, and the dilemma of the direct answer

10. Conclusion

A. Experience report sources

References

8.3. AI assistance as pair programming

The third common perspective is that AI-assisted programming is like pair programming. GitHub Copilot’s commercial tagline describes it as “your AI pair programmer”. Unlike search and compilation, which are both relatively impersonal tools, the analogy with pair programming evokes a more bespoke experience: assistance from a partner that understands more about your specific context and what you’re trying to achieve. AI-assisted programming does have the potential to be more personalised, to the extent that it can take into consideration your specific source code and project files. As Hacker News commenters write:

“[...] at one point it wrote an ENTIRE function by itself and it was correct. [...] it wasn’t some dumb boilerplate initialization either, it was actual logic with some loops. The context awareness with it is off the charts sometimes.[...]”

“[...] It’s like having the stereotypical “intern” as an associate built-in to your editor. [...] It’s also ridiculously flexible. When I start writing graphs in ASCII (cause I’m just quickly writing something down in a scratch file) it’ll actually understand what I’m doing and start autocompleting textual nodes in that ASCII graph.”

Besides personalisation, the analogy also recalls the conventional role-division of pair programming between “driver” and “navigator”. When programming, one needs to form mental models of the program at many layers: from the specific statement being worked on, to its context in a subroutine, to the role that subroutine plays in a module, to the module within the program. However, code must be written at the statement level, which forces developers to keep this lowest level constantly at the forefront of their working memory. Experienced developers spend more time mapping out their code so that they can spend less time writing it. Research into code display and navigation has explored how different ways of presenting lines of code can help programmers better keep these different layers of mental models in mind (Henley & Fleming, 2014). Pair programming, the argument goes, allows two partners to share the burden of the mental model. The driver codes at the statement and subroutine level while the navigator maps out the approach at the module and program level.
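To make the layers of mental model concrete, the following is a minimal sketch (not taken from the paper) that reports, for a single line of code, the statement itself and the subroutine, module and program that contain it. The function `layers_for_line`, the sample source, and the names `stats_utils` and `report_tool` are all hypothetical and chosen only for illustration.

```python
import ast

# Hypothetical sample source; line 1 is blank because the string starts with a newline.
SOURCE = '''
def mean(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)
'''


def layers_for_line(source, lineno, module_name="stats_utils", program_name="report_tool"):
    """Return the nested contexts (statement, subroutine, module, program) of one line.

    module_name and program_name are assumed labels; a real tool would derive
    them from the file path and project layout.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    enclosing = None
    # ast.walk visits outer nodes before inner ones, so the last match is the innermost function.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.lineno <= lineno <= node.end_lineno:
                enclosing = node.name
    return {
        "statement": lines[lineno - 1].strip(),  # the level at which code is typed
        "subroutine": enclosing,                 # the "driver" works at these two levels
        "module": module_name,                   # the "navigator" works at these two levels
        "program": program_name,
    }


layers = layers_for_line(SOURCE, 5)
for level, value in layers.items():
    print(f"{level:>10}: {value}")
```

In the role division described above, the driver attends to the first two entries and the navigator to the last two.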

By analogy to pair programming, with the AI assistant taking the role of the driver, a solo programmer can now take the place of the navigator. But as we have seen, the experience of programming with AI assistance does not consistently absolve the human programmer of the responsibility for understanding the code at the statement and subroutine level. The programmer may be able to become “lazier [...] about learning various details of syntax and libraries”, but the experience still involves much greater statement-level checking.

While a pair programming session requires a conscious, negotiated decision to swap roles, a solo programmer with an AI assistant might find themselves fluidly traversing the spectrum from driving to navigation, from one moment to the next. This may partially explain why, in a preliminary experiment (n=21) comparing the experience of “pair programming” with GitHub Copilot to programming in a human pair either as driver or navigator, Imai (2022) finds that programmers write more lines of code with Copilot than in a human pair, but these lines are of lower quality (more are subsequently deleted).
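The measurement procedure used in that experiment is not described here. As a purely illustrative sketch of how "lines subsequently deleted" could be quantified, the snippet below compares the lines first written with the lines that survive in a final version. The function `deleted_fraction` and the sample lines are hypothetical and are not data from the study.

```python
import difflib


def deleted_fraction(written, final):
    """Fraction of originally written lines that do not survive into the final version.

    This is an assumed proxy for code quality, not the study's own metric.
    """
    if not written:
        return 0.0
    matcher = difflib.SequenceMatcher(a=written, b=final, autojunk=False)
    # Sum the sizes of all matching blocks: these are lines kept unchanged and in order.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return (len(written) - kept) / len(written)


# Hypothetical example: four lines are written, two survive unchanged.
written_lines = ["a = 1", "b = 2", "c = a + b", "print(c)"]
final_lines = ["a = 1", "c = a + 2", "print(c)"]
print(deleted_fraction(written_lines, final_lines))
```

A higher fraction means more of the initially written code was later removed or rewritten.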

Moreover, meta-analyses of pair programming have shown mixed efficacy of human pair programming on task time, code quality and correctness (Salge & Berente, 2016; Hannay et al., 2009), suggesting that emulating the pair programming experience is not necessarily a good target to aim for. Multiple studies have concluded that the apparent successes of pair programming can be attributed not to the role division into driver and navigator, but rather to the high degree of verbalisation that occurs when pair programmers are forced to rationalise their decisions to each other (Hannay et al., 2009). Others have found that programming in pairs induces greater focus out of a respect for shared time; pair programmers are less likely to read emails, surf the web, or take long phone calls (L. A. Williams & Kessler, 2000). These particular benefits of pair programming are not captured at all by AI assistance tools.

The comparison to pair programming is thus relatively superficial, and today’s experience of AI-assisted programming is not comparable with pair programming to the same extent as it is with search or compilation.

Authors:

(1) Advait Sarkar, Microsoft Research, University of Cambridge;

(2) Andrew D. Gordon, Microsoft Research, University of Edinburgh;

(3) Carina Negreanu, Microsoft Research;

(4) Christian Poelitz, Microsoft Research;

(5) Sruti Srinivasa Ragavan, Microsoft Research;

(6) Ben Zorn, Microsoft Research.


This paper is available on arXiv under a CC BY-NC-ND 4.0 DEED license.