Huge thanks to our training team (and their helpers) who set up the Data Engineering course for WeThinkCode. I believe the uptake was unprecedented: https://lnkd.in/dGP64NZ8
Congrats Frank!
💣💥 HUGE updates to my Duke Master in Interdisciplinary Data Science (MIDS) 🦀 Rust for Data Engineering course! 💣💥 https://lnkd.in/eJJkrwze

🔑 I've added key terms and definitions to help cement core concepts. 📚
❓ There are new quizzes to test your knowledge. 🧠
🧪 New hands-on labs let you apply what you learn through interactive coding challenges. 💻
📰 Additional interactivity to dig deeper into key topics at your own pace. 🤓
☁️ Updated demos showcase leveraging the power of Rust across cloud providers like AWS, GCP, and Azure. ⚡️

Topics include:
🦀 Leveraging Rust's performance, safety, and concurrency for data tasks
🦀 Processing large datasets with Polars, Arrow, Parquet, and more
🦀 Building robust data pipelines and ETL flows
🦀 Applying Rust for ML with PyTorch, ONNX, and Hugging Face 🤗
🦀 Querying data in BigQuery, SQLite, and more

Whether you're new to Rust or an experienced coder, you'll gain practical skills to apply Rust for data engineering.

Check out the updates in my course! 👉 https://lnkd.in/eJJkrwze
✨ I build courses: https://lnkd.in/eSK3QYbZ

#RustLang #RustDev #DataEngineering #DataScience #DataAnalysis #DataViz #DataPipelines #ETL #Databases #SQL #BigData #DataFrames #DataWrangling #CloudComputing #AWS #Azure #GCP #ApacheArrow #PyTorch #ONNX #HuggingFace
A few months ago, I posted a roadmap for becoming a data engineer. I couldn't complete that roadmap, so I am launching a new one, more detailed than the previous one. I have also included every course link and drive folder so that anyone can find the right platform to learn.

Drive link for the roadmap: https://lnkd.in/gGJZBKzs

Roadmap for Data Engineering: From today, I will start my journey into data engineering. I will follow the Roadmap of Data Engineering by Darshil Parmar and the research material I found over the last three weeks. Here on LinkedIn, I will be sharing my day-to-day learning for the next 11-13 months.

1] SQL
The most important skill needed to become a data engineer is SQL. These courses cover almost everything with hands-on practice. Follow the courses in this order:
- freeCodeCamp: https://lnkd.in/gPZRnN4E
- Udemy: https://lnkd.in/gXXrbqBG
- Darshil Parmar: https://lnkd.in/efSZQ29x
- For hands-on practice: https://lnkd.in/gc-Npce8

2] Programming Language: Python
- freeCodeCamp: Learn Python - Full Course for Beginners https://lnkd.in/gvGQW7YU
- DataCamp: https://lnkd.in/gqNx6Taz
- Darshil Parmar: https://lnkd.in/efSZQ29x
Learning Python for data engineering also means learning its libraries: NumPy, Pandas, Matplotlib, Seaborn, NLTK, scikit-learn, and statsmodels.

3] Data Warehousing: Snowflake
- Darshil Parmar: https://lnkd.in/efSZQ29x
- Udemy: https://lnkd.in/gqXJnjDy
- Certification: https://lnkd.in/gz9HWcVg

4] Data Processing
There are two types: i) batch processing and ii) real-time processing. For this, I will learn Apache Spark and Databricks; these tools are important for understanding core architecture and high-level APIs. I will also learn PySpark for data processing (see the sketch at the end of this post) and Kafka for real-time streaming.
- Spark Fundamentals: https://lnkd.in/gFeZVhf4
- Databricks: https://lnkd.in/gVvKnD5N
- Real-time streaming (Kafka): https://lnkd.in/ghf_JfGH

5] Workflow Management: Apache Airflow (open source)
- DataCamp: https://lnkd.in/gZpjzYne

6] Knowledge of at Least One Cloud Platform
- Amazon Web Services (AWS): https://lnkd.in/gakgfpny

7] A Few Advanced Tools: Apache Hudi, Iceberg, Datadog

Follow the drive link for more details. Also, follow Darshil Parmar, freeCodeCamp, and Onehouse. #dataengineering
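Since batch processing with PySpark is a core step in this roadmap, here is a minimal sketch of what a first batch job can look like. The file name and column names are illustrative assumptions, not part of any course in the roadmap.

```python
# Minimal PySpark batch-processing sketch (assumes pyspark is installed;
# the file and column names below are illustrative, not from the roadmap).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-example").getOrCreate()

# Read a Parquet dataset, aggregate it by day, and write the result out.
trips = spark.read.parquet("trips.parquet")
daily = (
    trips
    .groupBy(F.to_date("pickup_datetime").alias("day"))
    .agg(F.count("*").alias("trip_count"),
         F.sum("total_amount").alias("revenue"))
)
daily.write.mode("overwrite").parquet("daily_summary.parquet")

spark.stop()
```

The same read-transform-write shape carries over to most batch pipelines; only the sources, transformations, and sinks change.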
I want to learn more about data engineering for machine learning systems, so I started a personal project to explore data architecture for ML training and inference in a recommendation system. I created a demo project and have been writing up what I learnt in a series of blog posts. The demo code is complete, along with 2 of the 3 planned blog posts; the 3rd is released as a working draft since exams hit me this week. You can find each component via the links below:
- GitHub repo: https://lnkd.in/gvutFYzr
- Development log: https://lnkd.in/gN485z53
- Storage and Compute, Offline Training: https://lnkd.in/guTS5-sg
- Storage and Compute, Online Inference: https://lnkd.in/g8B5Knph

Overall, I have learnt a great deal from this project. I have been fascinated with distributed systems since reading Designing Data-Intensive Applications cover to cover on my morning commute. Learning about industry examples at extreme scale, such as S3 and Spark, which I mention in the blog posts, gives me a perspective on how theoretical considerations are implemented in practice. #dataengineering #recsys
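To make the offline-training / online-inference split concrete, here is a tiny sketch of the general pattern, not the project's actual code: train a model offline in batch, publish precomputed results to a low-latency store, and keep request-time serving to a cheap lookup. The data, names, and the dict-as-store are all illustrative assumptions.

```python
# Hedged sketch of the offline-training / online-inference split for a
# recommender; NOT the project's code, just the general pattern. The
# serving store is faked with a dict; production systems would use
# something like Redis or another key-value database.

# --- Offline (batch): train and precompute recommendations ---
interactions = {  # illustrative user -> clicked-items data
    "u1": ["item_a", "item_b"],
    "u2": ["item_b", "item_c"],
}

def train_offline(interactions: dict) -> dict:
    """Precompute top items per user (here: a trivial popularity model)."""
    popularity: dict = {}
    for items in interactions.values():
        for item in items:
            popularity[item] = popularity.get(item, 0) + 1
    top_items = sorted(popularity, key=popularity.get, reverse=True)
    return {user: top_items[:2] for user in interactions}

online_store = train_offline(interactions)  # "publish" to the serving store

# --- Online (request time): a key-value lookup, no model code ---
def recommend(user_id: str) -> list:
    return online_store.get(user_id, [])

print(recommend("u1"))
```

The design point is that the expensive work happens offline on the batch side, so the online path stays fast and simple.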
Starting off with the Data Engineering Zoomcamp!

Week 1:
- Set up a local development environment for streamlined workflows.
- Installed Docker and used Docker Compose to run a PostgreSQL database.
- Imported data from files into PostgreSQL using Python and Pandas (a minimal sketch follows this post).
- Tackled challenging SQL queries to enhance problem-solving skills.
- Created a GCP account and set up the necessary tools.
- Installed Terraform and the GCP CLI to manage cloud infrastructure.
- Deployed and managed GCP resources using Terraform.

This is all part of the great Zoomcamp by Alexey Grigorev, which provides top-notch content and hands-on experience for aspiring data engineers. Looking forward to Week 2, where we'll be diving into workflow orchestration with Mage.AI!

#DataEngineering #GCP #Terraform #SQL #CloudInfrastructure #MageAI
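For anyone curious what that file-to-PostgreSQL step looks like, here is a minimal sketch using pandas and SQLAlchemy; the connection string, file name, and table name are my own illustrative assumptions, not the Zoomcamp's exact code.

```python
# Minimal sketch: load a CSV into PostgreSQL with pandas + SQLAlchemy.
# Assumes psycopg2 is installed; the connection details, file name, and
# table name are illustrative assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/ny_taxi")

# Read the file in chunks so a large dataset doesn't exhaust memory,
# appending each chunk to the target table.
for chunk in pd.read_csv("yellow_tripdata.csv", chunksize=100_000):
    chunk.to_sql("yellow_taxi_data", engine, if_exists="append", index=False)
```

Chunked loading is the usual trick here: it trades a single big in-memory DataFrame for a stream of inserts the database can absorb incrementally.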
Want to crack a top product-based company? Don't worry, I've got you covered! Just focus on the points below:

📌 Medium to hard SQL questions (LeetCode for practice).
📌 Arrays, strings, and linked lists for DSA (GFG and LeetCode for practice).
📌 Python for data transformation (LeetCode for practice; see the sketch after this post).
📌 Spark (understand Spark's internal workings and architecture; expect a lot of scenario-based questions).
📌 Cloud technology (you should know how cloud services interact with each other).
📌 End-to-end data engineering projects.
📌 A complete understanding of your existing project (don't miss questions related to data quality and data governance).

♻ Found this post useful? Repost it!

Follow ABHAY SINGH 😀

Would you like to know more about how to crack a product-based company? Let's connect: https://lnkd.in/g7Dewhyq
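As one illustration of the kind of "Python for data transformation" question that comes up, here is a small sketch of a common deduplication scenario: keep only the latest record per key. The data and column names are made up for the example.

```python
# Hedged example of a common interview-style transformation: keep the
# latest record per user. Data and column names are illustrative.
import pandas as pd

events = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "ts": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-15"]),
    "plan": ["free", "pro", "free"],
})

# Sort by timestamp, then keep the last (latest) row per user.
latest = (
    events.sort_values("ts")
          .drop_duplicates("user_id", keep="last")
          .reset_index(drop=True)
)
print(latest)
```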
Many thanks, Team India. Big congratulations! Let's celebrate together!

Discount code: INDIA20 (20% off) on all combo packages that include courses. The promotion is good for two days. I recommend this reasonably priced course offered by Sagar Prajapati GeekCoders.

Why should you take this course?
- Expert instructors: Learn from professionals with years of experience in data science and data engineering.
- Practical, real-world examples and exercises: Follow along with the hands-on labs.
- Detailed coverage: Learn everything there is to know about Databricks and its ecosystem, from basic ideas to cutting-edge methods.
- Professional growth: Give yourself the tools you need to succeed in today's data-driven job market.

Intended audience:
- IT experts and analysts
- Data scientists and engineers
- Scholars and students
- Anyone with an interest in machine learning and big data

For the latest content from us, don't forget to subscribe, like, and press the notification bell. Let's embark on this informative journey together!

👍 Stay in touch: after completing a course, you can ask questions on the community channel if anything is unclear or you are having trouble with any tasks.
Link: https://lnkd.in/d65tMH3J

I highly recommend you save this post and share it with people who might need help in their career. 📱 Join the Telegram channel to stay updated.
Link: https://lnkd.in/gYJZb-zV

Happy learning! Regards! Together, we can grow and learn. Please share this with your network.

#DataEngineering #DataScience #DataAnalysis #DeepLearning #AI
A single tool is not enough to… …get into Data Engineering.

You might see this all the time on YouTube and elsewhere: "This is the ultimate Spark course to make you a Data Engineer." Let me tell you: a single course, a single tool, will not make you a Data Engineer. A single tool is just not enough!

That's why I have over 30 courses in my Data Engineering Academy - from the basics you need, like Python for Data Engineers or Docker, through platform & pipeline design, understanding all kinds of data stores, and data modeling, to fundamental tools like Snowflake, dbt, Airflow, and more. And yes, of course there is going to be Spark in it too ;)

Besides the courses, we have full end-to-end projects on all the clouds - AWS, Azure, and GCP - with modern data warehouses and so on.

Now, do you need to take all 30 of these courses? Of course not. You only need knowledge in a few specific areas: the basics, as I said - things like Python, computer networking, Git, or Docker; platform & pipeline design, which is always important; a few fundamental tools for ingesting and processing data; and finally how to store data in relational or transactional databases and data warehouses, and how to visualize it. And that's all.

So, take a look at my Academy and only do what you really need! #dataengineer #dataengineering #datascience #bigdata
Three months might seem like a long time, but the growth and learning during this period were truly remarkable. Diving into the #dezoomcamp on January 15th, I was unsure of what to expect. But today, looking back, I can confidently say it was a great experience, with a lot of learning, connecting with like-minded people from all around the world, and having fun.

Here's a quick overview of the tools we used and the tasks we accomplished:
🔹 Docker and Terraform: Streamlining application packaging and infrastructure setup.
🔹 Orchestration with Mage: Efficient management of data pipelines.
🔹 BigQuery: Leveraging its capabilities for data warehousing.
🔹 Analytics Engineering with dbt Labs: Transforming raw data into actionable insights.
🔹 Spark for Batch Processing: Handling large-scale datasets with ease.
🔹 Kafka for Streaming: Designing real-time data processing solutions (a small sketch follows this post).

We also had two really great workshops in between: the first on dltHub (Data Load Tool), and the second on RisingWave and processing real-time streaming data using SQL.

The final project was the real challenge, where we applied everything we learned to build an end-to-end data pipeline in Google Cloud. If you are curious to read more details, check out my latest newsletter: https://lnkd.in/dNVUa-D8

I must admit, this was really hard. But the feeling of accomplishment and learning made it all worthwhile. And I would do it again!

#DataEngineering #python #GCP #BigQuery #Orchestration #Docker #Terraform #PersonalGrowth DataTalksClub
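To give a flavor of the Kafka streaming piece, here is a minimal producer sketch using the kafka-python library; the broker address, topic name, and payload are my own illustrative assumptions, not the course's code.

```python
# Minimal Kafka producer sketch (assumes the kafka-python library and a
# local broker; broker, topic, and payload are illustrative).
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Send a few ride events to a "rides" topic as JSON.
for ride_id in range(3):
    event = {"ride_id": ride_id, "pickup_zone": "Astoria"}
    producer.send("rides", value=event)

producer.flush()  # block until all buffered messages are delivered
```

A consumer on the other side would subscribe to the same topic and process events as they arrive, which is the basic shape of the streaming homework.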
☕ Weekend Caffeinated Insights: Engineering Blog Recommendations

Last week, I shared a list of top books for learning the fundamentals of data engineering. This week, I'd like to highlight some of the best engineering blogs, published by tech companies, that I have followed for many years. I highly recommend these blogs to data engineers; they are invaluable resources for staying updated with the latest developments and best practices in the field.

It's important to recognise the significant contributions the engineering teams at these companies have made to the data engineering ecosystem as they tackle scaling challenges while continuously innovating and advancing the field. Many common open-source technologies used today, such as Apache Kafka, Apache Superset, Apache Airflow, Apache Iceberg, and Apache Hudi, were initially developed by tech giants like LinkedIn, Airbnb, Uber, and Netflix.

Which company engineering blogs do you follow and recommend besides these?

(Links to the listed blogs can be found in the comments section)
Machine Learning Engineer | MEng Data Science (Cum Laude)
Very cool collaboration!