Build scalable and reliable data ecosystems using Data Mesh, Databricks Spark, and Kafka

Key Features:
- Develop modern data skills used in emerging technologies
- Learn pragmatic design methodologies such as Data Mesh and data lakehouses
- Gain a deeper understanding of data governance
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description:
Modern Data Architectures with Python will teach you how to seamlessly incorporate your machine learning and data science work streams into your open data platforms. You'll learn how to take your data and create open lakehouses that work with any technology using tried-and-true techniques, including the medallion architecture and Delta Lake.

Starting with the fundamentals, this book will help you build pipelines on Databricks, an open data platform, using SQL and Python. You'll gain an understanding of notebooks and applications written in Python using standard software engineering tools such as Git, pre-commit, Jenkins, and GitHub. Next, you'll delve into streaming and batch-based data processing using Apache Spark and Confluent Kafka. As you advance, you'll learn how to deploy your resources using infrastructure as code and how to automate your workflows and code development. Since any data platform's ability to handle and work with AI and ML is a vital component, you'll also explore the basics of ML and how to work with modern MLOps tooling. Finally, you'll get hands-on experience with Apache Spark, one of the key data technologies in today's market.

By the end of this book, you'll have amassed a wealth of practical and theoretical knowledge to build, manage, orchestrate, and architect your data ecosystems.

What You Will Learn:
- Understand data patterns including delta architecture
- Discover how to increase performance with Spark internals
- Find out how to design critical data diagrams
- Explore MLOps with tools such as AutoML and MLflow
- Get to grips with building data products in a data mesh
- Discover data governance and build confidence in your data
- Introduce data visualizations and dashboards into your data practice

Who this book is for:
This book is for developers, analytics engineers, and managers looking to further develop a data ecosystem within their organization. While they're not prerequisites, basic knowledge of Python and prior experience with data will help you to read and follow along with the examples.
A beginner's guide to simplifying Extract, Transform, Load (ETL) processes with the help of hands-on tips, tricks, and best practices, in a fun and interactive way

Key Features:
- Explore data wrangling with the help of real-world examples and business use cases
- Study various ways to extract the most value from your data in minimal time
- Boost your knowledge with bonus topics, such as random data generation and data integrity checks

Book Description:
While a huge amount of data is readily available to us, it is not useful in its raw form. For data to be meaningful, it must be curated and refined.

If you're a beginner, then The Data Wrangling Workshop will help to break down the process for you. You'll start with the basics and build your knowledge, progressing from the core aspects behind data wrangling to using the most popular tools and techniques.

This book starts by showing you how to work with data structures using Python. Through examples and activities, you'll understand why you should stay away from traditional methods of data cleaning used in other languages and take advantage of the specialized pre-built routines in Python. Later, you'll learn how to use the same Python backend to extract and transform data from an array of sources, including the internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, the book teaches you how to handle missing or incorrect data, and reformat it based on the requirements from your downstream analytics tool.

By the end of this book, you will have developed a solid understanding of how to perform data wrangling with Python, and learned several techniques and best practices to extract, clean, transform, and format your data efficiently, from a diverse array of sources.

What You Will Learn:
- Get to grips with the fundamentals of data wrangling
- Understand how to model data with random data generation and data integrity checks
- Discover how to examine data with descriptive statistics and plotting techniques
- Explore how to search and retrieve information with regular expressions
- Delve into commonly used Python data science libraries
- Become well-versed with how to handle and compensate for missing data

Who this book is for:
The Data Wrangling Workshop is designed for developers, data analysts, and business analysts who are looking to pursue a career as a full-fledged data scientist or analytics expert. Although this book is for beginners who want to start data wrangling, prior working knowledge of the Python programming language is necessary to easily grasp the concepts covered here. It will also help to have a rudimentary knowledge of relational databases and SQL.