Apache Polaris: The Definitive Guide: Enriching Apache Iceberg Data Lakehouses with an Open Source Catalog

O'Reilly Media, Inc.
Ebook, 258 pages

About this ebook

Revolutionize your understanding of modern data management with Apache Polaris (incubating), the open source catalog designed for Apache Iceberg, the industry-standard table format for data lakehouses. This comprehensive guide takes you on a journey through the intricacies of Apache Iceberg data lakehouses, highlighting the pivotal role of Iceberg catalogs.

Authors Alex Merced, Andrew Madson, and Tomer Shiran explore Apache Polaris's architecture and features in detail, equipping you with the knowledge needed to leverage its full potential. Data engineers, data architects, data scientists, and data analysts will learn how to seamlessly integrate Apache Polaris with popular data tools like Apache Spark, Snowflake, and Dremio to enhance data management capabilities, optimize workflows, and secure datasets.

  • Get a comprehensive introduction to Iceberg data lakehouses
  • Understand how catalogs facilitate efficient data management and querying in Iceberg
  • Explore Apache Polaris's unique architecture and its powerful features
  • Deploy Apache Polaris locally, and deploy managed Apache Polaris from Snowflake and Dremio
  • Perform basic table operations on Apache Spark, Snowflake, and Dremio

About the author

Alex Merced is a senior technical evangelist at Dremio with experience as a developer and instructor. His professional journey includes roles at GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly. He co-authored "Apache Iceberg: The Definitive Guide" published by O'Reilly and has spoken at notable events such as Data Day Texas and Data Council. Alex is passionate about technology, sharing his expertise through blogs, videos, podcasts like Datanation and Web Dev 101, and contributions to the JavaScript and Python communities with libraries like SencilloDB and CoquitoJS.

Andrew Madson is a data leader with 17 years of experience leading technical teams. Currently the Head of Evangelism and Education at Tobiko, the creators of SQLMesh and SQLGlot, Andrew has held senior leadership positions at institutions such as JP Morgan, LPL Financial, MassMutual, and Arizona State University. In addition to leading data teams, Andrew is a professor of data science and analytics at several universities, where he teaches graduate courses in machine learning, statistics, SQL, R, Python, Tableau, and Power BI.

Tomer Shiran is the Founder and Chief Product Officer of Dremio, an open data lakehouse platform that enables companies to run analytics in the cloud without the cost, complexity and lock-in of data warehouses. As the company's founding CEO, Tomer built a world-class organization that has raised over $400M and now serves hundreds of the world's largest enterprises, including 3 of the Fortune 5. Prior to Dremio, Tomer was the 4th employee and VP Product of MapR, a Big Data analytics pioneer. He also held numerous product management and engineering roles at Microsoft and IBM Research, founded several websites that have served millions of users and hundreds of thousands of paying customers, and is a successful author and presenter on a wide range of industry topics. He holds an MS in Computer Engineering from Carnegie Mellon University and a BS in Computer Science from Technion - Israel Institute of Technology.
