GPT-4 Vision API to boost recommender & search algorithms for less than $1k
OpenAI have recently released GPT-4 Vision API, while it is very exciting (AI can now see!), it hasn't generated an avalanche of new ideas yet. So today, I'd like to propose what I think is a no-brainer use case for any e-commerce company - image tagging to improve search or recommendations algorithms.
See the prompt and the JSON outputs below or see it on my GitHub repo
What is GPT-4 Vision API?
Vision API lets GPT-4 'see' the image (that you can send via an API) and analyse it in an impressive amount of detail. And since it is GPT-4, it can leverage its understanding of the world and reasoning capabilities to make sense of what it sees. It is definitely not perfect, but it is the most impressive vision system we have ever had - it understands context & humour, has near perfect OCR, and can work out complex patterns.
Plus, it is really cheap, a mere ~$0.01 per image, meaning that if ASOS were to tag every single one of their 68,000 products that is available on their Black Friday sale, it would cost them about $680 (make it $1,000 for extra prompt tokens).
Results Are Very Impressive
In short, I am very impressed by the results. GPT-4 Vision API recognised the items, who they were for as well as huge amounts of detail about them, including visual humour for the greeting card.
I don't see any reason why you wouldn't want to use this and give your data science team some extra data to make their search and recommender algorithms a boost, all for less that $1k - hard to imagine a better RoI.
Prompt Used to Generate Tags
Apart from the API, you can also try it out in ChatGPT Plus, paste the prompt and upload an image of the product.
JSON Outputs
Below you can see the 4 random images I have generated tags for #1-3 have had a generic prompt, while for #4, I have tailored the prompt to explain the humour and gave it extra specific categories - to demonstrate what you can do with a bit of customisation. Note the understanding of the visual humour, it is very impressive.
AI Transformation Project Manager & AI Engineer | Leading AI-Driven Change to Empower Teams & Drive ROI | Enabling Success in an AI-First Workforce through Tools & Training
1yBeautiful, thank you for sharing!
Building Trustworthy AI Agents for Enterprise | Co-Founder & CTO @ AI Dionic
1yThis is a great idea, and on the surface fairly simple use case. It kind of aligns with what people have been doing already for product descriptions, but the vision API takes it to the next level.