I Tried Data Analysis ChatGPT Plugin: Every Analyst's Dream or a Nightmare in Disguise

Recently, ChatGPT Plus introduced a collection of ‘My GPTs’ plugins available for use. If you’re a subscriber to the 20 USD per month ChatGPT Plus service, you can access these ChatGPT plugins or even build your own GPT (Generative Pre-trained Transformer) !

ChatGPT Plugin, Image Credit: Livia Ellen

I tried the Genz 4 meme. It’s fun! Hahaha…

Now, I’d like to share my experience with the Data Analysis Plugin from ChatGPT. My expectations were high, especially since this plugin was developed by ChatGPT itself. Moreover, OpenAI has significantly invested in its Codex development, which serves as the foundation for GitHub Copilot.

Experiment

Goal:

I am going to ask ChatGPT to analyze my JSON file.

Data:

The JSON file contains a list of exported chat history I had with ChatGPT. This list includes nested dictionaries, each varying in content length. Notably, some conversations feature extended back-and-forth exchanges, which could introduce complexity in unpacking and analyzing the data structure.

Instead of writing the code from scratch, I will utilize the Data Analysis ChatGPT plugin to analyze this JSON file. For this experiment, I am going to try analyzing this JSON file with 2 different starting prompts: a more precise prompt and another ambiguous one.

Given the provided JSON, I need ChatGPT to give me an idea about:

What are the top 10 topics based on the conversation JSON file – Identification of Key Topics
How many times those topics are mentioned – Frequency Analysis
Lastly, I want it to label each conversation as an action item such as asking for a concept, summary, idea, etc. These labels are just examples, the labels are not limited to those— Conversation Categorization

Let’s do it!

First Attempt: The Ambiguous Prompt Test

For the initial phase of the experiment, I submitted my JSON file to the Data Analysis ChatGPT plugin. The prompt I used was intentionally made ambiguous to assess how the plugin handles less clear instructions. I provide my JSON file to the Data Analysis ChatGPT plugin with this prompt:

**give me idea about :

what are the 10 most topic on this conversation

how many times those words/topic are mentioned

label each conversation as action : for example: asking for concept, summary, idea, etc this is just example, label are not limited to those**

You can see that this prompt is ambiguous in the way I mentioned words/topics on the second goal. Unlike the third goal, I did not provide an example of the second goal.

Here is the first response:

First Response from ChatGPT, Image by Livia Ellen

The Data Analysis ChatGPT plugin works by running the data analysis process using Python Programming. Users can click on the blue code icon to see how it processes the step.

Code Snippet from DA ChatGPT Plugin, Image by Livia Ellen

Seeing the first response, I was amazed by its capability.

The Data Analysis ChatGPT tools do not just answer straight away, instead, they show how exactly coders will tackle this, step-by-step, including solving the error they encounter.

Earlier, I mentioned that the dictionary from this JSON might have a different data structure, which might lead to an error if the unpacking wasn’t handled properly. The Data Analysis plugin managed to handle this issue and try to answer the first goal – what are the top 10 topics based on the conversation JSON file?

Top 10 Result from ChatGPT, image by Livia Ellen

ChatGPT starts with counting the highest-frequency words to find the top 10 topics. Then, they realized that high-frequency words are not always the same as the topic. Some words are just stopwords – words that are mentioned a lot yet are meaningless to the context of the passage.

A notable feature of the Data Analysis ChatGPT plugin is its self-corrective behavior.

— as seen in the underlined text below

Data Analysis ChatGPT asking for recommendations, Image by Livia Ellen

Every time, it asks what I think, I will just say yes to let them decide what is best based on their knowledge.

Ultimately, the Data Analysis ChatGPT plugin succeeded in extracting the main topics from my conversations with ChatGPT after 2 error tries. However, due to my ambiguous prompt mentioning words/topics, it assumed I would need only a 1-word topic – see the orange rectangle below.

Data Analysis ChatGPT Plugin Result, Image by Livia Ellen

Despite this assumption, the results were reasonably accurate. I have talked a lot about Augmented Reality, AI, Education, and Data Science with ChatGPT.

Interesting take

The Data Analysis ChatGPT plugin’s ability to mimic a programmer’s approach is quite remarkable. At times, by the very least – I would assume I have a conversation with a data intern.

Another downside I found, the passage from the image above also shows how the Data Analysis ChatGPT Plugin is trying to use stopwords from NLTK – see the highlighted blue rectangle on the screenshot above. However, the chatGPT server does not allow downloading files to the root folder which might require elevated access for the server. ChatGPT continued this process to find a workaround.

We found an error.. again….

At this point, I am done with this conversation thread. After 3 error tries, I noticed ChatGPT still wanted me to correct a few things by asking a recommendation.

Summary of First Attempt

Pros:

The Data Analysis ChatGPT Plugin managed to show step by step process
It provides a clear explanation
Self-correcting on the first error, so you don’t need to say "fix it" – unlike the general ChatGPT

Cons:

The Data Analysis ChatGPT Plugin has a good understanding of Python and data analytics concepts but needs to be improved in implementing the best practices.
It adds more time for prompting – and debugging.
It feels like talking to a noob data intern
The fact that it does not implement the best practice right away, you will have to wait until the code has been executed if their basic solution will work or not.

Second Attempt: A More Precise Prompt Test

I provided the same JSON file to the Data Analysis ChatGPT plugin with a slightly different prompt – I would not use the word words/topics to avoid the possibility of receiving one-word topic responses, a challenge encountered on the first attempt.

On this attempt, I would also recommend my approach to ChatGPT instead of saying Yes all the time and letting ChatGPT do its things.

**give me idea about :

what are the top 10 topics based on this conversation

how many times those topics are mentioned

label each conversation as action: for example: asking for concept, summary, idea, etc this is just example, label are not limited to those**

Result

After the first prompt,

The Data Analysis ChatGPT Plugin gives the expected answer! This accuracy, achieved without the need for any fine-tuning on my part, indicates a notable improvement in the plugin’s performance.

With that, I will add more stuff to the list of the Pros.

Pros:

Good prompt engineering might equal a good result

Here is the breakdown of the result…

The first part of the first response shows its self-correcting capability without me having to say "fix it" explicitly.

The image below is the second part of the response, it shows the answer to Goals #1 and #2. Notably, in this attempt, the topics identified by the plugin are more granular and detailed compared to the one-word topics generated in the first attempt.

The last part, Data Analysis ChatGPT Plugin has successfully answered Goal #3 – labeling the action for each conversation and counting the occurrences.

I am intrigued with the answer from the Data Analysis ChatGPT Plugin! So far, it performance seems like a dream for the data analyst.

Now, it’s time to move forward!

Let’s provide some feedback…

I am going to ask ChatGPT to fix the categorization of the "General/other" category to be labeled correctly.

Data Analysis Plugin managed to fine-tune its categorization capabilities throughout this experiment.

The Final Challenge for this plugin

So far the performance has been good, let’s export it to a Jupyter Notebook!

This step will test the plugin’s ability to seamlessly integrate its output into a Jupyter Notebook format for further exploration. And hey, we want to test it as a data analyst, so we need the code!

Challenge for Data Analysis Plugin, Image by Livia Ellen

It seems like we have encountered a list of errors. I see that we had a token limit issue, hence it’s failing the notebook generation process.

I told ChatGPT to force-stop the process.

Cons:

The plugin will keep doing the process until it’s done unless we force stop. It appears to lack an in-built break theory or mechanism to autonomously stop when encountering significant issues.

Following this experience, I provided feedback to ChatGPT, emphasizing the need to address and handle this issue.

First Notebook Output, Image by Livia Ellen

The Jupyter Notebook was successfully generated. When I downloaded it, it missed the middle chunk of the code, I had to remind them.

Plugin’s Acknowledgment:

Data Analysis ChatGPT plugin acknowledge the mistake. Good moral GPT! – see highlighted blue box on the image

The Jupyter Notebook which has been generated still missing some parts of the code. So, I gotta gave ChatGPT some encouragement – aiming to motivate and perhaps make the plugin towards a more effective resolution. A data intern would like this kind of encouragement too Haha…

And…

Whoopsie. More error!

Honestly, I am tired of debugging and asking…

I realized this plugin still needs to learn more. The most annoying pain point is I had to ask and remind them, as it does not know the best practice and lack of decision-making about when to cease operations. The continuous cycle of debugging and providing reminders has become exhausting.

Conclusion

This experience has led me to realize that the Data Analysis ChatGPT Plugin still has a considerable learning curve ahead.

To summarize, these are the pros and cons of the Data Analysis ChackGPT Plugin that I gathered from this experiment:

Pros:

The Data Analysis ChatGPT Plugin managed to show step by step process
It provides a clear explanation
Self-correcting on the first error, so you don’t need to say "fix it" – unlike the general ChatGPT
Effective prompt engineering is still needed, good prompt engineering might equal a good result

Cons:

The Data Analysis ChatGPT Plugin has a good understanding of Python and data analytics concepts but is very limited in implementing the best practices – Understanding vs. Implementation
Time-Consuming, adds more time for prompting – and debugging.
Novice-like Interaction, it feels like talking to a noob data intern
Delayed Best Practice Implementation, The fact that it does not implement the best practice right away, you will have to wait until the code has been executed if their basic solution will work or not.
Lack of Autonomous Stop Mechanism— it will keep doing the process until it’s done unless we force stop. It does not have the break theory implemented.

Summing up my experience with the Data Analysis ChatGPT Plugin, it’s clear that it’s more suited for beginners in coding and data analysis. I think it’s a dream come true for learners – however, what you got is a novice coding buddy. Likely because the codex is trained by less experienced programmers and contractors when OpenAI outsources its coders to train it. This makes it a useful tool for learning basics.

However, for professionals handling complex data tasks, the plugin isn’t ready for it. It can be slow because it needs a lot of debugging and doesn’t always follow the best practices in coding. This means more work for the user, making it less efficient for professional use – a nightmare in disguise.

In short, while the Data Analysis ChatGPT plugin is a helpful tool for beginners wanting to learn, it’s not yet ready for expert users who need a more efficient tool for data analysis.

As a professional, I would not use this data analysis tool, like I said, it feels like consulting with a newbie programmer where I had to keep seeking clarification.

I hope these observations provided a comprehensive overview of the current state of the Data Analysis ChatGPT Plugin to you, indicating its potential and the areas that need further development.

Let me know in the comments what are your thoughts about this ChatGPT plugin – a dream come true or a nightmare?

If you like to read more about my Data Science & AI tips, you can take a look at my curated lists:

Data Science & AI

Let’s Connect

Follow me on LinkedIn and Medium
Subscribe for free to my Medium newsletter for email updates!
Please clap, save into your reading list, share, and comment on this article if you found this useful!
Oh Hey! you can clap more than one (50 max)in 1 article – that will help me get a cup of coffee to write my upcoming articles with
or just buy me a real coffee – Yes I love coffee.