On why you should use Jupyter Notebooks: "Notebooks are a great way to tell a story, and telling stories is what all fields should be about. Especially computer science."

This is from an interview with Doug Blank, Head of Research at @cometml. He continues: "To me, a Jupyter notebook is a blank sheet of paper. You can write a story in it. And if you change the name of one of the characters in paragraph one, you have to change the name of the character throughout the whole story."

People often tell me they don't like notebooks because people write bad code in them. Nonsense. Bad developers write bad code. Notebooks have nothing to do with that.

"Some educators feel that it's too open-ended and too flexible, but I disagree - it's a new way of doing computing (...)."

I always recommend that developers learn how to use notebooks. Not as their primary way of writing software but as an alternative tool they can use to experiment, troubleshoot, and become more effective at their work.

The math in my head is simple: A developer who knows how to use notebooks effectively is better than a developer who doesn't. Notebooks aren't a replacement for what you do. They are a boost.

The rest of the interview is pretty good: https://lnkd.in/e7ezJsx7
With a few best practices, it is easy to write good notebooks.
Notebooks can be incredibly liberating - after all, what are they other than a pre-built framework for exploration? You don't have to worry about how charts are rendered (mostly). You don't have to worry about proper annotation (hi, Markdown). You can actually work on what's important - the data. And when you're comfortable with the data and the workflow, you migrate that code into a more robust set of objects or scripts. That extra time spent migrating is more than made up for by the time saved when using the workbook. The important thing is that you accept that there will be migration later, and structure your notebook accordingly.
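One way to structure a notebook with that later migration in mind is to keep the logic in small functions and leave only the exploration calls at cell level. The sketch below is an illustration, not anything from the post: the function names `clean` and `summarize` and the sample records are made up.

```python
from statistics import mean


def clean(records):
    """Drop records whose 'value' field is missing."""
    return [r for r in records if r.get("value") is not None]


def summarize(records):
    """Return the count and mean of the 'value' field."""
    values = [r["value"] for r in records]
    return {"n": len(values), "mean": mean(values)}


# Exploration cell: only calls and inspection live here, so the functions
# above can be moved into a module later without being rewritten.
raw = [{"value": 3.0}, {"value": None}, {"value": 5.0}]
summary = summarize(clean(raw))
print(summary)
```

When the workflow settles, the two functions can be copied into a script or package and the exploration cell thrown away.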
That’s right!! If you dislike people who use notebooks to write bad code, then you don’t dislike notebooks… you dislike those people.
Santiago Valdarrama, have you tried marimo? It takes the notebook experience to a new level. I still think Jupyter is better when you are exploring initial ideas. It has fewer constraints.
I find Jupyter notebooks annoying when it comes to running multiple lines of code at the same time. They're great for learning, though.
Engaging analogy. How do you integrate Jupyter Notebooks effectively in your workflow? 🚀
Agreed, Santiago!
My flow: stage three folders at the root (dev, test, and prod). Each has a .yml, a .env, and a requirements.txt, plus a .venv and then your ipynbs. Use the main.ipynb as an SOP of sorts. Document your project. Add HTML div links for the table of contents. I will often have DEV and PROD Bools so I can easily toggle back and troubleshoot specific blocks. Once I’m good, I convert the ipynb to a .py and move to test. Copy over the other files. New .venv. Reduce the bloat from the Jupyter kernel. Add async. Unit test here. Finally, move to prod. New .venv. Reduce more bloat from requirements.txt. Lean. Clean up the code. Make it bulletproof. Use dev containers to test running lean in a Linux container or whatever. Now… you have it all. Documentation with dev, unit testing and features with .py that are tricky with ipynb, and then your final. Parallelize and async. Then containerize, or at least build some scripts to run your environment and code with an agent. List your .env in your ignore files and only pass those values in prod at container build time through arguments. This way, whichever way the project goes, you’re ready. And you go from a 1-2 GB project to a 200-300 MB lean, quick, Linux-deployable one, but still have the others for reference/docs.
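A minimal sketch of what the DEV and PROD Bools might look like in a notebook cell, together with reading settings from the environment instead of hardcoding them. The variable `APP_STAGE`, the helpers `get_setting` and `load_rows`, and the sample rows are all hypothetical names chosen for illustration, not part of the flow described above.

```python
import os

# Hypothetical stage toggles. APP_STAGE is an assumed variable name that
# would come from the .env file or the container's build arguments.
STAGE = os.environ.get("APP_STAGE", "dev")
DEV = STAGE == "dev"
PROD = STAGE == "prod"


def get_setting(name, default=None):
    """Read a setting from the environment so secrets stay out of the code."""
    value = os.environ.get(name, default)
    if value is None:
        raise KeyError(f"Missing required setting: {name}")
    return value


def load_rows(dev=DEV):
    """Return a tiny in-memory sample in dev; the real source is left as a stub."""
    if dev:
        return [{"id": 1, "value": 10.0}, {"id": 2, "value": 12.5}]
    raise NotImplementedError("Connect the real data source for test/prod here")


# Toggle a specific block while troubleshooting.
if DEV:
    print(f"{len(load_rows(dev=True))} sample rows loaded")
```

Because the toggles are plain module-level values, they carry over unchanged when the ipynb is converted to a .py file.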