Ron Kohavi’s Post

Vice President and Technical Fellow | Data Science, Engineering | AI, Machine Learning, Controlled Experiments | Ex-Airbnb, Ex-Microsoft, Ex-Amazon

Is there p-hacking in e-commerce A/B testing? A new paper by Alex P. Miller and Kartik Hosanagar (https://lnkd.in/gCBgfWh8) claims they can’t find evidence of it based on 2,270 experiments conducted by 242 firms.

This is very different from Ron Berman et al.’s prior paper, which claimed heavy p-hacking. Key difference: the prior paper looked at Optimizely data from 2014, when the platform encouraged p-hacking (as a feature), but that statistically naïve “feature” was fixed in 2015 (see https://lnkd.in/gbVWtXh and Peter Bordens' post on how he was almost fired: https://lnkd.in/gF9k3vBk). The new paper uses data from a different (unnamed, but “large U.S.-based”) vendor.

P-hacking is the intentional or unintentional misapplication of statistics to achieve statistically significant results using human degrees of freedom, such as ending experiments early, handling outliers post hoc, looking at segments, etc. It used to be a big problem 10 years ago; awareness has since risen, and while I think it still occurs (mostly unintentionally), it is much less frequent today.

Still, I strongly recommend everyone run A/A tests (see Chapter 19 of https://lnkd.in/eWuqBVw) and look at the actual distribution of p-values from your experimentation platform to see if there’s an unreasonable discontinuity around alpha (usually 0.05).

Want to learn more about A/B testing and trust? I teach an interactive 10-hour course (https://bit.ly/ABClassRKLI) and an advanced course (https://lnkd.in/gU9xrezE).

#abtesting #twymansLaw #AATest #peeking

Leonid Pekelis Aisling Scott, Ph.D. Christophe Van den Bulte Uri Simonsohn Ulrich Schimmack
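For readers who want to try the p-value check, here is a minimal sketch (my illustration, not from the paper or the post). It assumes your platform can export one final p-value per experiment; here they are simulated with A/A tests, where p-values should be roughly uniform on [0, 1], so the counts just below and just above 0.05 should be similar:

import numpy as np
from scipy import stats

def discontinuity_check(p_values, alpha=0.05, width=0.01):
    # Count p-values in equal-width windows just below and just above alpha.
    p = np.asarray(p_values)
    below = int(np.sum((p >= alpha - width) & (p < alpha)))
    above = int(np.sum((p >= alpha) & (p < alpha + width)))
    return below, above

# Simulate 2,000 A/A tests: both arms draw from the same distribution.
rng = np.random.default_rng(42)
p_values = []
for _ in range(2000):
    a = rng.normal(0.0, 1.0, 1000)
    b = rng.normal(0.0, 1.0, 1000)  # identical distribution: a true A/A test
    p_values.append(stats.ttest_ind(a, b).pvalue)

below, above = discontinuity_check(p_values)
print(f"p-values just below 0.05: {below}; just above 0.05: {above}")

A pronounced excess just below 0.05 relative to just above it, on your real experiment data, is the discontinuity signature the post describes.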

Gareth Wilson

Senior Product Manager

1mo

Hi Ron Kohavi, you mentioned that pre-2015 Optimizely encouraged p-hacking. As noted in the article, the platform studied here (unnamed) also doesn’t explicitly advise stopping tests only once the target sample size is reached. The article also states it moves metrics to a separate section as soon as they hit statistical significance. How does this differ from what Optimizely was doing, in terms of discouraging p-hacking? It seems like this platform could also be subtly encouraging p-hacking.

Kartik Hosanagar

AI, Entrepreneurship, Digital Transformation, Mindfulness. Wharton professor. Cofounder Yodle, Jumpcut

1mo

Thanks, Ron, for sharing our paper. I believe you saw a prior conference version of this paper many years ago; it took us some time to get it out in print. At some level, it's natural for untrained testers to fall into the trap of looking for statistical significance rather than the truth. It was a pleasant surprise to find a null result in our meta-analysis (and our data are from a similar period as the Berman paper). As an aside, I like your definition: "P-hacking is intentional or unintentional misapplication of statistics to achieve statistically significant results." We once had a referee strongly object to the use of the term for unintentional misuse; it took us a lot of effort to convince them. Keep up your posts on testing. I enjoy them.

Sarnath Kannan

Staff data scientist, MAFer, Georgia Tech Alumni

1mo

Discontinuity around alpha -- that's a great idea! Thanks!!!

Leo Murillo

Software Engineering Manager at Amazon Fulfillment Technologies

1mo

Saved

Michael Hughes

Growth @ Eclipse | Helping Ecommerce brands grow through experimentation

3w

These concepts will always be difficult for those only lightly involved with experimentation to fully grasp. Like Ron Kohavi, we find that running an A/A test is our preferred way of illustrating how a result matures as the sample size grows.
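To make that concrete, here is a hypothetical sketch (my addition, not Michael's) that tracks the running p-value of a single simulated A/A test: with no true difference between the arms, the p-value still wanders and often dips below 0.05 at some interim peek, which is exactly why a result needs to mature to the planned sample size before it is read:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 10_000)  # control
b = rng.normal(0.0, 1.0, 10_000)  # "treatment" drawn from the same distribution

# Peek every 100 users per arm and record the first time p drops below 0.05.
first_dip = None
for n in range(100, 10_001, 100):
    p = stats.ttest_ind(a[:n], b[:n]).pvalue
    if first_dip is None and p < 0.05:
        first_dip = (n, round(p, 3))

print("first interim peek with p < 0.05:", first_dip)  # often not None, by chance
print(f"p-value at the full sample: {stats.ttest_ind(a, b).pvalue:.3f}")

Stopping at that first dip would declare a winner in a test where nothing changed; waiting for the full sample usually brings the p-value back to an unremarkable value.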

Paulo Saavedra

UX Lead Consultant / Strategy and Management / Digital Products / Project Manager

1mo

This is great

Alex P. Miller

Assistant Professor, USC Marshall School of Business

1mo

Thanks for posting, Ron! Our data come from a similar time period (2014-2016). I think you're right to highlight that this was a bigger deal back then, but maybe that makes our results all the more surprising!
