#49 Data Science Tool Building

From DataFramed


Length: 58 minutes
Released: Nov 19, 2018
Format: Podcast episode

Description

Hugo speaks with Wes McKinney, creator of the pandas project for data analysis tools in Python and author of Python for Data Analysis, among many other things. Wes and Hugo talk about data science tool building, what it took to get pandas off the ground, and how Wes approaches building “human interfaces to data” to make individuals more productive. On top of this, they’ll talk about the future of data science tooling, including the Apache Arrow project and how it can facilitate this future, the importance of DataFrames that are portable between programming languages, and building tools that facilitate data analysis work in the big data limit. Pandas initially arose from Wes noticing that people were nowhere near as productive as they could be due to a lack of tooling. The projects he’s working on today, which they’ll discuss, arise from the same place and present a bold vision for the future.

LINKS FROM THE SHOW

DATAFRAMED SURVEY
DataFramed Survey (take it so that we can make an even better podcast for you)

DATAFRAMED GUEST SUGGESTIONS
DataFramed Guest Suggestions (who do you want to hear on Season 2?)

FROM THE INTERVIEW
Wes on Twitter
Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure by Nadia Eghbal
pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Ursa Labs

FROM THE SEGMENTS
Data Science Best Practices (with Ben Skrainka ~17:10)
To Explain or To Predict? (By Galit Shmueli)
Statistical Modeling: The Two Cultures (By Leo Breiman)
The Book of Why (By Judea Pearl & Dana Mackenzie)

Studies in Interpretability (with Peadar Coyle at ~39:00)
Modelling Loss Curves in Insurance with RStan (By Mick Cooney)
Lime: Explaining the predictions of any machine learning classifier
Probabilistic Programming Primer

Original music and sounds by The Sticks.

Titles in the series (100)

Data science is one of the fastest growing industries and has been called the ‘Sexiest job of the 21st Century’. But what exactly is data science? In this podcast, brought to you by DataCamp, Hugo Bowne-Anderson approaches the question by exploring what problems data science can solve rather than defining what data science is. From automated medical diagnosis and self-driving cars to recommendation systems and climate change, come on a journey with experts from industry and academia to explore the industry that will change the course of the 21st century.