How to use ChatGPT to make charts and tables with Advanced Data Analysis



Know what floats my boat? Charts and graphs.

Give me a cool chart to dig into and I’m unreasonably happy. I love watching the news on election nights, not for the vote count, but for all the great charts. I switch between channels all evening to see every possible way that each network finds to present numerical data. 

Is that weird? I don’t think so.

As it turns out, ChatGPT does a great job making charts and tables. And given that this ubiquitous generative AI chatbot can synthesize a ton of information into something chart-worthy, what ChatGPT gives up in pretty presentation it more than makes up for in informational value.

It should come as no surprise to anybody that AI chatbots’ feature sets are changing constantly. As of the time of this update (end of May 2024), OpenAI has just come out with a Mac application and has released its GPT-4o LLM, which is available to both free and paying customers. The GPT-4o version that comes with the added-price Plus version is supposed to have interactive chart features and the ability to interact with the engine longer per session.

But not so much. My free account doesn’t offer GPT-4o yet, because it hasn’t rolled out to all free accounts. And while the paid ChatGPT Plus plan does provide the interactive charts feature in Chrome and Safari, it doesn’t in the Mac app.

This article was last updated when the Advanced Data Analysis features (which included charts) were only available to Plus customers. Even though some of those features are supposed to be available to free customers, my free account doesn’t have them yet, so I’m going to present the rest of this article as if the charting features are only available to Plus customers. If you’re a free customer and you have GPT-4o, feel free to try some of the prompts. Those features may work for you, and undoubtedly will as we move forward in time.

Advanced Data Analysis produces relatively ugly charts. But it rocks. First, let’s discuss where ChatGPT gets its data, then we’ll make some tables.

How to use ChatGPT to make charts and tables

Earlier, we talked about which charting tools are available in which versions of ChatGPT. But there’s more to it than simply charting tools. If you want to use ChatGPT productively, you need to understand what the various editions can do.

ChatGPT free version: This version has historically used the GPT-3.5 large language model (LLM), which isn’t quite as capable as the GPT-4 version. As of May 2024, the GPT-4o LLM is also available to some free users and rolling out over time.

ChatGPT Plus: ChatGPT Plus is OpenAI’s commercial, fully powered version of ChatGPT. Right now, ChatGPT Plus provides three major selection options per session: GPT-3.5, GPT-4, and GPT-4o. It used to offer plugins, but they’ve been replaced by custom GPTs.

The GPT-4 and GPT-4o versions now include DALL-E 3, Bing web access, and Advanced Data Analysis. Some users have reported difficulty using Bing for web access. Most of what we will be doing uses the Advanced Data Analysis component. Even without Bing web access, GPT-4 and GPT-4o report that their training data now includes information up to December 2023.

For much of this article, we will be using the Advanced Data Analysis component of the GPT-4 option. This tool will import data tables in a wide range of file formats. It doesn’t specify a size limit for imported data. It can handle fairly large files, but will break if a file exceeds some undefined level of complexity.

As ChatGPT Plus changes, and it will, we will update you with more information. For now, let’s just look at making some cool charts.

ChatGPT Enterprise: Advanced Data Analysis and plugins are also available in the Enterprise version. You can upload files to Enterprise, and they will remain confidential. Enterprise is also supposed to allow for bigger files and bigger responses. Pricing has not been specified.

Let’s start with an example. For the following demonstration, we’ll be working with the top five cities in terms of population.

List the top five cities in the world by population. Include country.

I asked this question to ChatGPT’s free version and here’s what I got back:

basic-city-list

Screenshot by David Gewirtz/ZDNET

Turning that data into a table is simple. Just tell ChatGPT you want a table:

Make a table of the top five cities in the world by population. Include country.

basic-city-table

Screenshot by David Gewirtz/ZDNET

You can manipulate and customize a table by giving ChatGPT more detailed instructions. Again, using the free version, we’ll add a population count field. Of course, that data is out of date, but it’s presented anyway:

Make a table of the top five cities in the world by population. Include country and a population field

city-table-with-population

Screenshot by David Gewirtz/ZDNET

You can also specify certain details for the table, like field order and units. Here, I’m moving the country first and compressing the population numbers.

Make a table of the top five cities in the world by population. Include country and a population field. Display the fields in the order of rank, country, city, population. Display population in millions (with one decimal point), so 37,833,000 would display as 37.8M.

Note that I gave the AI an example of how I wanted the numbers to display.

city-table-manipulated

Screenshot by David Gewirtz/ZDNET
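
If you want to check the AI’s number formatting yourself, the rule in that prompt is easy to reproduce. Here is a minimal sketch in Python; the function name format_millions is my own invention for illustration, not anything ChatGPT exposes.

```python
def format_millions(population: int) -> str:
    """Format a raw head count as millions with one decimal place."""
    # Divide by one million, keep one digit after the decimal point, add "M".
    return f"{population / 1_000_000:.1f}M"


# The example value given in the prompt above.
print(format_millions(37_833_000))
```

Giving the AI a worked example in the prompt plays the same role as a test case here: it pins down both the rounding and the unit suffix.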

That’s about as far as the free version will take us. From now on, we’re switching to the $20/month ChatGPT Plus version.

ChatGPT Plus with Advanced Data Analysis enabled can make line charts, bar charts, histograms, pie charts, scatter plots, heatmaps, box plots, area charts, bubble charts, Gantt charts, Pareto charts, network diagrams, Sankey diagrams, choropleth maps, radar charts, word clouds, treemaps, and 3D charts.

In this example, we’re just going to make a simple bar chart.

Make a bar chart of the top five cities in the world by population

Chatty little tool, isn’t it?

bar-chart

Screenshot by David Gewirtz/ZDNET

The eagle-eyed among you may have noticed the discrepancy in populations between the previous table shown and the results here. Notice that the table has a green icon and this graph has a purple icon. We’ve jumped from GPT-3.5 (the free version of ChatGPT) to GPT-4 (in ChatGPT Plus). It’s interesting that the differing LLMs have slightly different data. This difference is all part of why it pays to be careful when using AIs, so double-check your work. In our case, we’re just demonstrating charts, but this is a tangible example of where confidently presented data can be wrong or inconsistent.  

One of Advanced Data Analysis’ superpowers is the ability to upload a dataset. For our example, I downloaded the Popular Baby Names dataset from Data.gov. This is a comma-separated file of New York City baby names from 2011 to 2014. Even though it’s a decade out of date, it’s fun to play with.

The dataset I chose for this article is readily available from a government site, so you can replicate this experiment on your own. There are a ton of great datasets available on Data.gov, but I found that many are far too large for ChatGPT to use. 

Once I downloaded this one, I realized it also included information on ethnicity, so we can run a number of different charts from the same dataset.

Click the little upload button and then choose the data file you want to import.

baby-name-import

Screenshot by David Gewirtz/ZDNET

I asked it to show me the first five lines of the file so I’d know more about the file’s format.
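
If you’d rather peek at a file like this before uploading it, a few lines of Python will do the same job. This is only a sketch: the column names and every row in SAMPLE_CSV are stand-ins I made up for illustration, so check them against the file you actually download.

```python
import csv
import io
from itertools import islice

# Hypothetical stand-in for the downloaded file. The column names and all
# values below are assumptions for illustration, not real dataset values.
SAMPLE_CSV = """\
Year of Birth,Gender,Ethnicity,Child's First Name,Count,Rank
2011,FEMALE,HISPANIC,Sofia,5,1
2011,FEMALE,HISPANIC,Sophia,3,2
2012,FEMALE,WHITE NON HISPANIC,Madison,4,1
2013,FEMALE,WHITE NON HISP,MADISON,2,1
2013,MALE,ASIAN AND PACI,Example,1,1
2014,MALE,BLACK NON HISP,Sample,1,1
"""


def head(csv_text, n=5):
    """Return the header row and the first n data rows of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first line holds the column names
    return header, list(islice(reader, n))


header, rows = head(SAMPLE_CSV)
print(header)
for row in rows:
    print(row)
```

For a real file, you would pass an open file handle to csv.reader instead of wrapping a string in io.StringIO.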

I was curious about how the dataset distributed gender names. Here’s my first prompt:

Create a pie chart showing gender as a percentage of the overall dataset

And here’s the result:

green-gender-pie

Screenshot by David Gewirtz/ZDNET

Unfortunately, the dark shade of green makes the numbers difficult to read. Fortunately, you can instruct Advanced Data Analysis to use different colors. I was careful to choose colors that did not reinforce gender stereotypes.

Create a pie chart showing gender as a percentage of the overall dataset. Use light green for male and medium yellow for female.

yellow-green-gender-pie

Screenshot by David Gewirtz/ZDNET
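
It’s worth knowing what a pie chart like this is actually measuring. “Gender as a percentage of the overall dataset” could mean a share of rows or a share of babies counted, and the prompt doesn’t say which. The sketch below takes the simpler reading, share of rows, using a handful of made-up values; the percentages function is hypothetical and only meant as a way to sanity-check a chart’s labels.

```python
from collections import Counter

# Hypothetical gender column, one entry per dataset row (made-up values).
genders = ["FEMALE", "MALE", "FEMALE", "FEMALE", "MALE"]


def percentages(values):
    """Return each distinct value's share of the rows, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


print(percentages(genders))
```

If the chart’s slices don’t match a quick count like this on your own data, ask ChatGPT which interpretation it used.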

As we saw earlier, the data collected includes ethnicity. Here’s how to see the distribution of the various ethnicities New York recorded in the early 2010s:

Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.

And here’s the result. Notice anything?

raw-ethnicity-chart

Screenshot by David Gewirtz/ZDNET

Apparently, New York didn’t properly normalize its data. It used “WHITE NON HISPANIC” and “WHITE NON HISP” together, “BLACK NON HISPANIC” and “BLACK NON HISP” together, and “ASIAN AND PACIFIC ISLANDER” and “ASIAN AND PACI” together. This resulted in inaccurate representations of the data.

One benefit of ChatGPT is that it remembers instructions throughout a session. So I was able to give it this instruction:

For all the following requests, group “WHITE NON HISPANIC” and “WHITE NON HISP” together. Group “BLACK NON HISPANIC” and “BLACK NON HISP” together. Group “ASIAN AND PACIFIC ISLANDER” and “ASIAN AND PACI”. Use the longer of the two ethnicity names when displaying ethnicity.

And it replied:

group-normal

Screenshot by David Gewirtz/ZDNET

Let’s try the chart again, using the same prompt.

Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.

That’s better:

group-fixed

Screenshot by David Gewirtz/ZDNET
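
The grouping instruction amounts to a small lookup table. Here is a sketch of that cleanup in Python, using the three label pairs named above; the normalize_ethnicity helper and the sample labels list are my own illustration, not what ChatGPT runs internally.

```python
from collections import Counter

# Truncated label -> full label, using the three pairs named in this article.
ETHNICITY_ALIASES = {
    "WHITE NON HISP": "WHITE NON HISPANIC",
    "BLACK NON HISP": "BLACK NON HISPANIC",
    "ASIAN AND PACI": "ASIAN AND PACIFIC ISLANDER",
}


def normalize_ethnicity(label):
    """Map a truncated ethnicity label to its longer form."""
    cleaned = label.strip().upper()
    # Labels without an alias (for example "HISPANIC") pass through unchanged.
    return ETHNICITY_ALIASES.get(cleaned, cleaned)


# Hypothetical column values showing both spellings side by side.
labels = ["WHITE NON HISP", "WHITE NON HISPANIC", "HISPANIC", "ASIAN AND PACI"]
print(Counter(normalize_ethnicity(label) for label in labels))
```

Doing this once, before any charting, is the code equivalent of the “for all the following requests” instruction.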

You need to be diligent when looking at results. For example, in a request for top baby names, the AI separated out “Madison” and “MADISON” as two different names:

case-sensitive-baby-names

Screenshot by David Gewirtz/ZDNET

The fix is another session-wide instruction:

For all the following requests, baby names should be case insensitive.
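
In code terms, case-insensitive counting means folding every name to one spelling before adding up. A minimal sketch follows, with made-up counts; count_names is a hypothetical helper, and using title case as the canonical form is my assumption.

```python
from collections import Counter

# Hypothetical (name, count) pairs with mixed capitalization; made-up numbers.
name_rows = [("Madison", 3), ("MADISON", 2), ("Sofia", 4)]


def count_names(rows):
    """Sum counts per name, treating different capitalizations as one name."""
    totals = Counter()
    for name, count in rows:
        # Title case is the assumed canonical form: "MADISON" -> "Madison".
        totals[name.strip().title()] += count
    return totals


print(count_names(name_rows))
```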

Let’s wrap up with a complex chart from one prompt. Here’s our prompt:

For each ethnicity, present two pie charts, one for each gender. Each pie chart should list the top five baby names for that gender and that ethnicity. Use only light colors.

As it turns out, the chart generated text that was too small to read. So, to get a more useful chart, we can export it back out. I’m going to specify both file format and file width:

Export this chart as a 3000 pixel wide JPG file.

export-confirmation

Screenshot by David Gewirtz/ZDNET

And here’s the result:

pie-chart-extravaganza

Screenshot by David Gewirtz/ZDNET

Notice that Sofia and Sophia are very popular, but are shown as two different names. But that’s what makes charts so fascinating.
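
The grouping behind that last prompt (the top five names for each ethnicity and gender) can be sketched in a few lines of Python. The rows here are made up, and top_names is a hypothetical helper. Note that it folds capitalization but, like the chart, still treats Sofia and Sophia as different names.

```python
from collections import Counter, defaultdict

# Hypothetical (ethnicity, gender, name, count) rows; all values are made up.
sample_rows = [
    ("HISPANIC", "FEMALE", "Sofia", 5),
    ("HISPANIC", "FEMALE", "Sophia", 3),
    ("HISPANIC", "FEMALE", "SOFIA", 1),
    ("HISPANIC", "MALE", "Example", 2),
]


def top_names(rows, n=5):
    """Return the n most common names for each (ethnicity, gender) pair."""
    groups = defaultdict(Counter)
    for ethnicity, gender, name, count in rows:
        # Fold capitalization so "SOFIA" and "Sofia" are counted together.
        groups[(ethnicity, gender)][name.title()] += count
    return {
        key: [name for name, _ in counter.most_common(n)]
        for key, counter in groups.items()
    }


for key, names in top_names(sample_rows).items():
    print(key, names)
```

Each key in the result corresponds to one pie chart in the exported image.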

FAQ

How much does it cost to use Advanced Data Analysis?

Advanced Data Analysis comes with ChatGPT Plus. Some of its features are available in GPT-4o for the free version of ChatGPT. ChatGPT Plus is $20/month. Advanced Data Analysis is also included with the Enterprise edition, but pricing for that hasn’t been released yet.

Is the data uploaded to ChatGPT for charting kept private or is there a risk of data exposure?

Assume that there’s always a privacy risk.

I asked this question to ChatGPT and this is what it told me: 

Data privacy is a priority for ChatGPT. Uploaded data is used solely for the purpose of the user’s current session and is not stored long-term or used for any other purposes. However, for highly sensitive data, users should always exercise caution and consider using the Enterprise version of ChatGPT, which offers enhanced data confidentiality.

My recommendation: Don’t trust ChatGPT or any generative AI tool. The Enterprise version is supposed to have more privacy controls, but I would recommend you only upload data that you won’t mind finding its way to public visibility.

Can ChatGPT’s Advanced Data Analysis handle real-time data or is it more suited for static datasets?

It’s possible, but there are some practical limitations. First, the Plus account will throttle the number of requests you can make in a given period of time. Second, you have to upload each file individually. There is the possibility you could use a licensed ChatGPT API to do real-time analytics. But for the chatbot itself, you’re looking at parsing data at rest.

