Analysing Data with ChatGPT-4 Vision
TL;DR: Massive potential for interpreting and explaining data, but it still makes lots of mistakes.
Our household is an early adopter when it comes to electric vehicles (I purchased a PHEV in November 2014 and moved to a BEV in September 2019). Almost five years ago we invested in solar panels and a battery for our home to support our increased energy needs. Anneliese purchased her first BEV in November 2022, adding a second EV to our household.
I calculate that we have saved on average $6,780 per year, meaning we will have paid off our investment by the end of summer 2025.
We generate ~10 MWh per year which is almost exactly how much power the house uses excluding charging the vehicles. Each year we purchase ~5 MWh from the grid that covers charging one BEV that drives ~20,000 km per year. We are on the Contact Energy Good Nights plan that gives us free electricity from 9pm-midnight every day. Our house battery allows us to shift our grid consumption to off-peak times. This means over the last year we paid ~$500 for ~15 MWh of energy consumed. This works out to 3.6c per kWh which is a fraction of the current average price of 33.5c per kWh including line and daily connection charges.
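As a rough check of the arithmetic above, here is a minimal Python sketch using only the rounded figures quoted in this paragraph. Because the inputs are approximations (~$500, ~15 MWh), it gives about 3.3c per kWh rather than the 3.6c quoted, which presumably comes from the unrounded bill figures. The variable names are illustrative.

```python
# Rounded figures quoted in the paragraph above, so the result is approximate.
annual_cost_nzd = 500            # ~$500 paid to the retailer over the last year
annual_energy_kwh = 15 * 1000    # ~15 MWh consumed in total, expressed in kWh
average_price_c = 33.5           # quoted average price, cents per kWh

# Effective price in cents per kWh across all energy consumed.
effective_c_per_kwh = annual_cost_nzd * 100 / annual_energy_kwh

# How that compares with the quoted average price.
fraction_of_average = effective_c_per_kwh / average_price_c

print(f"Effective price: {effective_c_per_kwh:.1f}c per kWh")
print(f"Share of the average price: {fraction_of_average:.0%}")
```

Either way, the effective price is roughly a tenth of the quoted average.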
Since the Clean Car Programme came out in 2017, electric vehicle uptake in NZ has increased by 35%. The increased adoption of BEVs increases the need for electricity: Transpower has estimated that we need to increase electricity generation in NZ by 68% by 2050 to meet our future energy needs.
With the upcoming election on Saturday, the three major parties have policies to address energy and the environment.
Labour: EV charging hubs every 150-200km on main highways (policy here). Rebates of $2000 for solar panels on homes, plus $2000 for a battery (policy here).
Greens: Payments up to $6k and $30k interest-free loans for homeowners to install solar or efficiency upgrades and make tax-deductible carbon-zero upgrades (policy here).
National: Support the rollout of 10,000 EV chargers by 2030, scrap the clean car discount (policy here).
Over the last five years 80% of my EV charging has been at home, and Anneliese has done 100% of her charging at home since she purchased her BEV last November. Given this is a common pattern, investing in solar and batteries for homes is critical.
Two weeks ago OpenAI announced that ChatGPT can now see, hear, and speak.
One of the things that I have been most excited to try is the data analysis component of ChatGPT Vision. The idea is that you can show ChatGPT a complex graph and have ChatGPT interpret and extract insights from the data.
This works by uploading an image to ChatGPT and asking it questions to interpret what it sees.
I started by uploading these two images to ChatGPT and asking some questions of the data.
I asked the following question: "Can you look at these two data tables and charts for a household of four people in New Zealand that generate their electricity from solar panels, supplemented by the grid and store energy in a house battery for use after the sun goes down. They also have two electric cars that they mostly charge at home. Can you explain the graphs to me?"
This is a pretty good start, although there is a mistake in interpreting the colour of the lines in the graphs.
I then asked about the seasonal aspects to power consumption and generation.
ChatGPT did a good job interpreting the seasonal aspects present in the graphs.
I then asked about any anomalies present in the graphs.
In this case ChatGPT was on the right track, looking at the Residual line and relating it to the table data, but it didn't identify the anomalies I was looking for, so I dug a bit deeper.
In this case the answer ChatGPT gave is incorrect and I needed to direct ChatGPT to improve the answer.
I confirmed extended holidays as the reason for reduced household consumption and asked for suggestions on why the May 2023 anomaly might have occurred.
I identified something interesting here which I have seen repeated by ChatGPT Vision a few times now. ChatGPT misread the table with an off-by-one row error, reading the data from April 2023 and interpreting it as May 2023.
ChatGPT corrected its error and suggested the correct reason (increased car usage) amongst its set of four possible reasons for the anomaly.
I then turned my attention to see if ChatGPT could identify the months where the solar panels malfunctioned.
ChatGPT confirmed that the answer can be found by looking for significant negative deviations in the Residual line (Red) of the "Monthly Solar Generation (2019 to 2023 YoY)" graph.
ChatGPT then failed to read this graph correctly and return the answer from the associated table data.
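The idea of looking for significant negative deviations in a residual can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the graphs above: it uses a simple per-calendar-month average as the seasonal baseline, the generation figures are made up, and the function name and threshold are assumptions.

```python
from statistics import mean, stdev

def flag_low_months(series, threshold=2.0):
    """series maps (year, month) -> kWh generated.

    Returns the (year, month) keys whose residual sits more than
    `threshold` standard deviations below the seasonal baseline.
    """
    # Seasonal baseline: average generation for each calendar month.
    by_month = {}
    for (_, month), kwh in series.items():
        by_month.setdefault(month, []).append(kwh)
    baseline = {month: mean(values) for month, values in by_month.items()}

    # Residual: actual generation minus the baseline for that month.
    residuals = {key: kwh - baseline[key[1]] for key, kwh in series.items()}
    spread = stdev(list(residuals.values()))
    return sorted(key for key, r in residuals.items() if r < -threshold * spread)

# Hypothetical monthly generation (kWh), January to December, NOT the real data.
typical = [1200, 1000, 900, 700, 500, 400, 450, 600, 800, 1000, 1100, 1250]
series = {
    (year, month): typical[month - 1] + ((year + month) % 3 - 1) * 20
    for year in range(2020, 2024)
    for month in range(1, 13)
}
# Simulate two outages in the made-up data.
series[(2021, 4)] = 250
series[(2023, 3)] = 150

flagged = flag_low_months(series)
print(flagged)
```

One caveat with this simple baseline is that an outage month drags down its own monthly average, so a large outage partly hides itself. A median baseline would be more robust.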
I then asked ChatGPT to summarise the findings:
This is my first attempt to work with ChatGPT-4 Vision to interpret complex data, and although it didn't get everything right it clearly demonstrated an ability to interpret data and offer potential reasons for the anomalies discovered. This will be a game-changing tool for data analysis in years to come.
#UPDATE Andrew Revell pointed me towards the OpenAI Advanced Data Analysis plugin, which I hadn't been able to access previously.
The Advanced Data Analysis (ADA) plugin and ChatGPT-4 Vision both offer data analysis capabilities, but they focus on different types of data and offer different features.
Advanced Data Analysis (ADA) is designed to work with text-rich documents. It has three main capabilities:
Synthesis: Analyse information from documents to generate new content or insights.
Transformation: Alter the presentation of information without changing its underlying essence.
Extraction: Identify and pull out specific pieces of information from a document.
ADA supports a variety of file types including PDF, Text, PowerPoint, Word, Excel, and Comma-separated values. It allows you to upload up to 10 documents with a file size limit of 500 MB per file.
On the other hand, ChatGPT-4 Vision focuses on image analysis. It allows you to upload photos and uses computer vision to analyse the images.
I uploaded the same table data shared above to ADA and asked ChatGPT to analyse the data.
The analysis was useful and returned the Python code to re-create the generated graphs.
I asked ADA to suggest two months where malfunctioning solar equipment led to less than anticipated solar generation.
In answering this question ADA correctly identified April 2021 but didn't interpret the seasonal aspects of the time series data, so I probed a bit further.
I then got curious as to whether ADA could reproduce in Python the time series decomposition graphs I had created to pass into the vision example.
I then asked the question about the anomalies again:
Now this was curious. Why was February identified instead of March? Sure, February was low, since the solar malfunctioned at the end of February, but it was out for almost all of March before coming back in April.
It turned out that ADA wasn't calculating the residual beyond Feb 2023. I provided my calculations to see if that would help.
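One plausible explanation, and this is an assumption since the article does not record which method ADA used, is that a classical decomposition estimates the trend with a centred moving average. With a 12-month period that leaves the first and last six months without a trend value, and therefore without a residual. The sketch below shows the effect on a made-up series. The date range and values are hypothetical, chosen so the series ends in August 2023.

```python
def centred_trend(values, period=12):
    """Centred 2 x `period` moving average for an even `period`.

    Returns a list the same length as `values`, with None wherever
    the window does not fit inside the series.
    """
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        # The two end points get half weight so the window is centred.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend

# Hypothetical series: 48 months from September 2019 to August 2023.
labels = [(2019 + (8 + i) // 12, (8 + i) % 12 + 1) for i in range(48)]
values = [500 + 10 * i for i in range(48)]  # made-up numbers, not the real data

trend = centred_trend(values)

# Last month for which a trend (and hence a residual) can be computed.
last_index = max(i for i, t in enumerate(trend) if t is not None)
last_label = labels[last_index]
print(last_label)
```

If this is what happened, an outage in the final six months of the data would never show up in the residual, whichever month it fell in.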
In summary, I found the ADA plugin very powerful and incredibly useful for generating the Python code needed to analyse the data. I have created a repo on GitHub of the generated code and graphs.
AI GBB @ Microsoft (9mo)
Nigel Parker - might want to consider the new GPT4 with CodeInterpreter. I think you might get better abilities at data analysis/statistics.
Building collaborative communities and knowledge networks | Children's Book Writer and Illustrator (1y)
Thanks Nigel! We got solar last year and a PHEV this year. Interesting for us to look at your numbers to compare.
Associate Director of Datapay AI Labs (1y)
My experience with prompting GPT is to start by ensuring it understands the structure of the information you want it to process first. I put GPT into a while loop until I am satisfied that all the relevant structure, sub-structure, labelling etc. is clearly in context before analysis. It's not unlike the data cleansing work you need to do before you do ETL.
CEO @ Aceso Health | Digital Health Tech, ex Microsoft (1y)
Going to replicate this at home, this is a neat use case and I too have plenty of solar and charging data. Thanks for the inspiration Nigel, awesome!
Head of AI @ Serko (1y)
I wonder how it would have fared via Advanced Data Analytics on the raw spreadsheet data rather than an image?