ChatGPT, Symbolic Math, and the Struggle for Accuracy

This is another formula quite similar to (4), but it is also incorrect. As mentioned before, ChatGPT is not a good option when it comes to reasoning, so a symbolic tool is definitely preferred for obtaining the right formula. Note, however, that ChatGPT plugins[10] have recently been announced, in particular Code interpreter, an experimental ChatGPT model that can use Python, handle uploads and downloads, and perform symbolic computations. Even though it is not freely available yet, it might be another way to mitigate the problem directly in the ChatGPT environment.
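To illustrate what "some symbolic tool" could look like in practice, here is a minimal sketch in Python using SymPy. This section does not restate the model behind formula (4), so the Clayton copula is used purely as a stand-in assumption, and the names `C`, `c_candidate` and `max_abs_error` are hypothetical. The idea is to derive the density by symbolic differentiation of the distribution function and then check a candidate formula against it at a few points.

```python
import sympy as sp

# u, v in (0, 1) are the arguments, theta > 0 is the parameter.
u, v, theta = sp.symbols("u v theta", positive=True)

# ASSUMPTION: the Clayton copula serves as a stand-in distribution function.
C = (u**(-theta) + v**(-theta) - 1) ** (-1 / theta)

# The density is the mixed second partial derivative of C.
c = sp.diff(C, u, v)

# Candidate closed form to be checked (here, the known Clayton density).
c_candidate = (1 + theta) * (u * v) ** (-theta - 1) * (
    u**(-theta) + v**(-theta) - 1
) ** (-2 - 1 / theta)


def max_abs_error(points):
    """Largest absolute gap between the derived and the candidate density."""
    gap = sp.lambdify((u, v, theta), c - c_candidate, "math")
    return max(abs(gap(*p)) for p in points)


# A few (u, v, theta) test points; a wrong candidate would show a clear gap.
points = [(0.3, 0.7, 2.0), (0.5, 0.5, 0.5), (0.9, 0.1, 4.0)]
print(max_abs_error(points))
```

A numerical spot check like this does not prove that two formulas are identical, but it is usually enough to expose an incorrect candidate such as the ones returned by ChatGPT.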

Following our communication protocol, let us feed ChatGPT with the right formula, and ask for a corresponding function in three programming languages: 1) MATLAB[11], which represents proprietary software, 2) Python[12], open-source software popular in the AI community, and 3) R[13], open-source software popular in the statistical community. Note that when the output is too wide, we adjust it to fit on the page; we do not alter it in any other way.

After we fed ChatGPT the right formula, it immediately generated functional code. Notice that we used quite a natural and relaxed form of conversation, much like in an email.

As ChatGPT takes into account the previous conversation, we could afford to be extremely concise with our prompts and still get correct solutions. In what follows, we ask for code only in MATLAB to save space. However, the equivalent code in Python and R is shown in the appendices, where all the functions can be easily identified by their names.

3.3 The estimation

In contrast to our struggles with the PDF, we immediately got a correct solution. This may be because code snippets computing ML estimators occur more frequently in ChatGPT’s training data. This pattern (the more general the task, the more frequently we receive a working solution on the first trial) is also observed in other examples throughout this work.
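For readers who want to see the kind of task involved, the following is a minimal Python sketch of an ML estimator for a one-parameter model. It is not the code produced by ChatGPT. It assumes, purely for illustration, the Clayton copula density, and the names `clayton_loglik`, `fit_theta` and `sample_clayton` are hypothetical. The log-likelihood is maximized by a golden-section search over a bounded interval, which presumes the log-likelihood has a single maximum there.

```python
import math
import random


def clayton_loglik(theta, data):
    """Log-likelihood of the (assumed) Clayton copula density for pairs (u, v)."""
    total = 0.0
    for u, v in data:
        s = u ** (-theta) + v ** (-theta) - 1.0
        total += (math.log1p(theta)
                  - (theta + 1.0) * (math.log(u) + math.log(v))
                  - (2.0 + 1.0 / theta) * math.log(s))
    return total


def fit_theta(data, lo=0.01, hi=10.0, tol=1e-6):
    """Golden-section search for the maximizer of the log-likelihood on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = clayton_loglik(c, data), clayton_loglik(d, data)
    while b - a > tol:
        if fc > fd:  # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = clayton_loglik(c, data)
        else:        # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = clayton_loglik(d, data)
    return (a + b) / 2.0


def sample_clayton(n, theta, rng):
    """Demo helper: draw n pairs by inverting the conditional distribution."""
    data = []
    for _ in range(n):
        u, w = 1.0 - rng.random(), 1.0 - rng.random()  # both in (0, 1]
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        data.append((u, v))
    return data


# Usage: simulate data with a known parameter and try to recover it.
rng = random.Random(0)
data = sample_clayton(2000, 2.0, rng)
theta_hat = fit_theta(data)
print(theta_hat)
```

The search bounds and the tolerance are arbitrary choices for this sketch. In practice a library optimizer would likely replace the hand-written search.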

Author:

(1) Jan Górecki, Department of Informatics and Mathematics, Silesian University in Opava, Univerzitní náměstí 1934/3, 733 40 Karviná, Czech Republic ([email protected]).

This paper is available on arXiv under the CC BY 4.0 DEED license.

[10] https://openai.com/blog/chatgpt-plugins

[11] We used version R2020a.

[12] Version 3.9.

[13] Version 4.2.2.
