Almost Timely News: 🗞️ Generative AI and the Synthesis Use Case (2024-06-02)

Christopher Penn

Co-Founder and Chief Data Scientist at TrustInsights.ai

Published Jun 2, 2024

👉 Download The Unofficial LinkedIn Algorithm Guide for Marketers!

Content Authenticity Statement

100% of this week's newsletter was generated by me, the human, though the walkthrough video shows the use of generative AI to make the LinkedIn guide. Learn why this kind of disclosure is a good idea and might be required for anyone doing business in any capacity with the EU in the near future.

Watch This Newsletter On YouTube 📺

Click here for the video 📺 version of this newsletter on YouTube »

Click here for an MP3 audio 🎧 only version »

What's On My Mind: Generative AI and the Synthesis Use Case Category

This week, let’s talk about the seventh major use case category for generative AI, especially with regard to large language models. I’ve talked extensively in my keynotes, workshops, and webinars about the six major use case categories:

Generation: making new data, typically in the form of language or images
Extraction: taking data out of other data, like extracting tables from a PDF
Summarization: making big data into small data
Rewriting: turning data from one form to another, like translation
Classification: organizing and categorizing our data, like sentiment analysis
Question answering: asking questions of our data

The seventh category, which is a blend of several of the tasks above but is distinct enough that I think it merits a callout, is synthesis. This is mashing data together to form something new.

Why is this different? Because if we look at the use cases above, all of them except generation are about taking existing data and, in one form or another, getting a smaller or reshaped version of that data out. None of them are about putting data together, and that’s what synthesis is.

What does synthesis look like? Let’s go to a specific, tangible use case. My friend Amber Naslund works for LinkedIn and has been asked a bazillion times how LinkedIn’s algorithm works, why a post did or didn’t appear, etc. To be clear, Amber works in sales leadership, not machine learning or AI. She’s not the right person to ask these questions of, and despite her saying so very publicly, very frequently, people keep asking her.

However, LinkedIn itself has told us how its algorithm works, at length. LinkedIn has an engineering blog in which engineers - the people who actually build LinkedIn’s algorithm - document the technologies, algorithms, techniques, code, and tools they use to create it. From how the LinkedIn graph is distributed across more than a dozen servers globally in real time (which is a ridiculous feat of engineering in itself) to how the feed decides what to show you, the engineers have told us how it works.

So why don’t marketers and sales professionals know this? Because, engineers being engineers, they told us in engineering talk. And they’ve told us across dozens of blog posts, interviews, articles, podcasts, and videos around the web. They didn’t serve it up on a silver platter for us in terms a non-technical marketer can understand…

… and they are under no obligation to do so. Their job is to build tech, not explain it to the general public.

Until the advent of large language models, that meant very technical documents were simply out of reach for the average non-technical marketer. But with large language models - especially those models that have enormous short-term memories (context windows) like Google Gemini 1.5 and Anthropic Claude 3 Opus - we suddenly have the tools to translate technical jargon into terms we can understand and take action on.

But to do that, we need to play digital detective. We need to find all these pieces, gather them in one place… and synthesize them. Glue them together. Put all the puzzle pieces in the lid of the box and sort them so that we can do tasks like question answering and summarization.

So let’s go ahead and do that. I strongly recommend watching the video version of this if you want to see the process, step by step.

First, we need to find the actual data itself. We’ll start with LinkedIn’s engineering blog. Not every post is relevant to how the algorithm works, but we want to identify posts that talk about content in any capacity, from serving it up quickly to sorting it to preventing abuse and spam. Any post talking about content may have clues in it that would be useful.

Then we need to hit the broader web, with an AI-enabled search engine like Bing or Perplexity, something that can interpret large and complicated queries. We ask the search engine to find us interviews with LinkedIn engineers about content, especially on podcasts and on YouTube. Once we find those resources, we convert them to text format, typically with AI-powered transcription software if transcripts or captions aren’t provided. (Power move: YouTube closed captions can usually be downloaded, even in bulk, with free utilities like yt-dlp.)
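For the technically inclined, here is a minimal Python sketch of what the caption step might look like. It is not the exact process from the video. The function names and the videos.txt file are made up for illustration, and it assumes yt-dlp is installed and that the captions arrive in WebVTT format. The first function only builds the command; the second strips timestamps and tags so the text is ready to hand to a language model.

```python
import re
from typing import List


def build_caption_command(urls_file: str, lang: str = "en") -> List[str]:
    """Build a yt-dlp command that saves captions only for every URL in a file."""
    return [
        "yt-dlp",
        "--skip-download",          # captions only, not the video itself
        "--write-subs",             # uploader-provided captions, when they exist
        "--write-auto-subs",        # otherwise fall back to auto-generated captions
        "--sub-langs", lang,
        "--sub-format", "vtt",
        "--batch-file", urls_file,  # plain text file, one video URL per line
    ]


def vtt_to_text(vtt: str) -> str:
    """Reduce a WebVTT caption file to plain running text."""
    kept: List[str] = []
    for raw in vtt.splitlines():
        line = re.sub(r"<[^>]+>", "", raw).strip()  # drop inline tags like <c>
        if not line or "-->" in line:
            continue  # blank lines and timestamp lines
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue  # file header and comment lines
        if kept and kept[-1] == line:
            continue  # auto-generated captions often repeat the previous line
        kept.append(line)
    return " ".join(kept)
```

To actually run the download, pass the list to subprocess.run(build_caption_command("videos.txt"), check=True), then feed each saved .vtt file through vtt_to_text.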

What we don’t want are third-party opinions. Everyone and their cousin has an opinion - usually uninformed - about what they think LinkedIn is doing behind the scenes. We should be careful to exclude that kind of content from our work.
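One way to keep yourself honest here is to log every source in a small manifest and filter it before anything reaches the model. The sketch below is just an illustration: the Source fields, the keep_credible function, and the sample titles are all hypothetical, and the first_party flag is something you set by hand after checking who wrote or spoke in each piece.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    title: str
    kind: str          # "blog", "interview", "paper", or "opinion"
    first_party: bool  # True only if a LinkedIn engineer is the author or speaker


def keep_credible(sources: List[Source]) -> List[Source]:
    """Keep first-party material and drop anything logged as opinion."""
    return [s for s in sources if s.first_party and s.kind != "opinion"]


# Placeholder entries, not real documents.
sources = [
    Source("Engineering blog post about feed ranking", "blog", True),
    Source("Podcast interview with a feed engineer", "interview", True),
    Source("Thread about 'beating the algorithm'", "opinion", False),
]

kept = keep_credible(sources)
for s in kept:
    print(s.kind, "-", s.title)
```

Note that the filter keys on who is speaking, not where the piece is hosted, because an interview with an engineer on someone else's podcast still counts.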

After that, we want to hit up those same AI-powered search engines for academic papers and research by LinkedIn engineers, again about content, especially any kind of sorting, categorization, or ranking algorithms.

Once we’ve gathered up all the goods from as many places as we can find them, we load them into the language model of our choice and ask it to synthesize the knowledge we’ve gathered, discarding irrelevant stuff and summarizing in a single, unified framework all the knowledge related to the LinkedIn feed that we’ve provided. Be careful in prompting to ensure the model uses only the uploaded data; we want to restrict it to credible sources only, those being the ones we’ve provided.
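If you assemble the prompt in code instead of uploading files by hand, the step above might look something like this Python sketch. The wording of the instructions is my illustration here, not the exact prompt from the video, and build_synthesis_prompt and the source tag format are made-up names. It only builds the text; sending it to a model is left to whatever tool you use.

```python
from typing import Dict


def build_synthesis_prompt(documents: Dict[str, str], topic: str) -> str:
    """Wrap each document in a labeled block and add source-restricting instructions."""
    blocks = []
    for i, (name, text) in enumerate(documents.items(), start=1):
        # Labeled blocks let the model (and you) trace claims back to a source.
        blocks.append(f'<source id="{i}" name="{name}">\n{text.strip()}\n</source>')

    instructions = (
        f"You are synthesizing knowledge about {topic}.\n"
        "Use only the sources provided below. Do not draw on outside knowledge.\n"
        "Discard material that is irrelevant to the topic.\n"
        "Combine what remains into a single, unified framework, "
        "and note the source id for each point.\n"
        "If the sources do not cover something, say so instead of guessing."
    )
    return instructions + "\n\n" + "\n\n".join(blocks)


# Placeholder documents, not real excerpts.
docs = {
    "blog_feed_ranking.txt": "Text of an engineering blog post goes here.",
    "podcast_interview.txt": "Transcript of an engineer interview goes here.",
}
prompt = build_synthesis_prompt(docs, "the LinkedIn feed")
print(prompt[:200])
```

The "use only the sources provided" and "say so instead of guessing" lines are the part doing the work: they steer the model toward the material you gathered and away from its training data.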

After we’ve done that, we can convert the framework into a protocol, an actionable guide of practices we can deliver to our social media marketing teams that will help them get more out of LinkedIn - and spare Amber’s inbox.

That’s the power of synthesis. Why is it so important? If you’ve ever worked with a large language model and had it hallucinate - meaning invent something that wasn’t true - it’s because the model is drawing from its long-term memory, its training data. Some of the training data in the model is crap information, patently false stuff. Some of what we’re asking, the model simply might not know. In an effort to be helpful and follow our instructions, the model instead returns the closest matches, which are statistically plausible but factually wrong.

In the case of our LinkedIn synthesis, there are a LOT of people who have a lot of opinions about how LinkedIn works. Very few of them are LinkedIn engineers, and if we want to reduce hallucination - both from an absence of data as well as bad data - we need to bring our own data to the party, like all those documents.

The rule of thumb is this: the more relevant, credible data you bring, the less the model has to lean on its own training data, and the less likely it is to hallucinate.

We have our working guide for how to market on LinkedIn to take advantage of the information provided to us by engineering. If you’d like the PDF copy of this output, you can download it for free from the Trust Insights website in exchange for a form fill - but I would encourage you to try the process out for yourself so you can see firsthand how synthesis works. No matter what, you can safely stop asking Amber how LinkedIn works now.

And so we now have our Magnificent Seven, the Seven Samurai of Generative AI: generation, extraction, summarization, rewriting, classification, question answering, and synthesis. Welcome to the party, synthesis. It’s nice to have you here.

How Was This Issue?

Rate this week's newsletter issue with a single click. Your feedback over time helps me figure out what content to create for you.

Share With a Friend or Colleague

If you enjoy this newsletter and want to share it with a friend/colleague, please do. Send this URL to your friend/colleague:

https://www.christopherspenn.com/newsletter

For enrolled subscribers on Substack, there are referral rewards if you refer 100, 200, or 300 other readers. Visit the Leaderboard here.

ICYMI: In Case You Missed it

Besides the newly updated Generative AI for Marketers course I'm relentlessly flogging, this week we reviewed the big Google SEO leak on the livestream. Don't miss it.

Skill Up With Classes

These are just a few of the classes I have available over at the Trust Insights website that you can take.

Premium

Free

Advertisement: Generative AI Workshops & Courses

Imagine a world where your marketing strategies are supercharged by the most cutting-edge technology available – Generative AI. Generative AI has the potential to save you incredible amounts of time and money, and you have the opportunity to be at the forefront. Get up to speed on using generative AI in your business in a thoughtful way with Trust Insights' new offering, Generative AI for Marketers, which comes in two flavors, workshops and a course.

Workshops: Offer the Generative AI for Marketers half and full day workshops at your company. These hands-on sessions are packed with exercises, resources and practical tips that you can implement immediately.

👉 Click/tap here to book a workshop

Course: We’ve turned our most popular full-day workshop into a self-paced course. The Generative AI for Marketers online course is now available and just updated as of April 12! Use discount code ALMOSTTIMELY for $50 off the course tuition.

👉 Click/tap here to pre-register for the course

If you work at a company or organization that wants to do bulk licensing, let me know!

Get Back to Work

Folks who post jobs in the free Analytics for Marketers Slack community may have those jobs shared here, too. If you're looking for work, check out these recent open positions, and check out the Slack group for the comprehensive list.

Advertisement: Free Generative AI Cheat Sheets

Grab the Trust Insights cheat sheet bundle with the RACE Prompt Engineering framework, the PARE prompt refinement framework, and the TRIPS AI task identification framework AND worksheet, all in one convenient bundle, the generative AI power pack!

Download the bundle now for free!

How to Stay in Touch

Let's make sure we're connected in the places it suits you best. Here's where you can find different content:

My blog - daily videos, blog posts, and podcast episodes
My YouTube channel - daily videos, conference talks, and all things video
My company, Trust Insights - marketing analytics help
My podcast, Marketing over Coffee - weekly episodes of what's worth noting in marketing
My second podcast, In-Ear Insights - the Trust Insights weekly podcast focused on data and analytics
On Threads - random personal stuff and chaos
On LinkedIn - daily videos and news
On Instagram - personal photos and travels
My free Slack discussion forum, Analytics for Marketers - open conversations about marketing and analytics

Advertisement: Ukraine 🇺🇦 Humanitarian Fund

The war to free Ukraine continues. If you'd like to support humanitarian efforts in Ukraine, the Ukrainian government has set up a special portal, United24, to help make contributing easy. The effort to free Ukraine from Russia's illegal invasion needs your ongoing support.

👉 Donate today to the Ukraine Humanitarian Relief Fund »

Events I'll Be At

Here are the public events where I'm speaking and attending. Say hi if you're at an event also:

MAICON, Cleveland, September 2024
Traceone User Conference, Miami, September 2024
MarketingProfs B2B Forum, Boston, November 2024

There are also private events that aren't open to the public.

If you're an event organizer, let me help your event shine. Visit my speaking page for more details.

Can't be at an event? Stop by my private Slack group instead, Analytics for Marketers.

Required Disclosures

Events with links have purchased sponsorships in this newsletter and as a result, I receive direct financial compensation for promoting them.

Advertisements in this newsletter have paid to be promoted, and as a result, I receive direct financial compensation for promoting them.

My company, Trust Insights, maintains business partnerships with companies including, but not limited to, IBM, Cisco Systems, Amazon, Talkwalker, MarketingProfs, MarketMuse, Agorapulse, Hubspot, Informa, Demandbase, The Marketing AI Institute, and others. While links shared from partners are not explicit endorsements, nor do they directly financially benefit Trust Insights, a commercial relationship exists for which Trust Insights may receive indirect financial benefit, and thus I may receive indirect financial benefit from them as well.

Thank You

Thanks for subscribing and reading this far. I appreciate it. As always, thank you for your support, your attention, and your kindness.

See you next week,

Christopher S. Penn