Python: End-to-end Data Analysis

Packt Publishing Ltd · eBook · 931 pages

About this eBook

Leverage the power of Python to clean, scrape, analyze, and visualize your data

About This Book

Clean, format, and explore your data using the popular Python libraries and get valuable insights from it
Analyze big data sets; create attractive visualizations; manipulate and process various data types using NumPy, SciPy, and matplotlib; and more
Packed with easy-to-follow examples to develop advanced computational skills for the analysis of complex data

Who This Book Is For

This course is for developers, analysts, and data scientists who want to learn data analysis from scratch. It will provide you with a solid foundation from which to analyze data of varying complexity. A working knowledge of Python (and a strong interest in playing with your data) is recommended.

What You Will Learn

Understand the importance of data analysis and master its processing steps
Get comfortable using Python and its associated data analysis libraries such as Pandas, NumPy, and SciPy
Clean and transform your data and apply advanced statistical analysis to create attractive visualizations
Analyze images and time series data
Mine text and analyze social networks
Perform web scraping and work with different databases, Hadoop, and Spark
Use statistical models to discover patterns in data
Detect similarities and differences in data with clustering
Work with Jupyter Notebook to produce publication-ready figures to be included in reports

In Detail

Data analysis is the process of applying logical and analytical reasoning to study each component of data present in the system. Python is a multi-domain, high-level programming language that offers a range of tools and libraries suitable for all purposes, and it has gradually evolved into one of the primary languages for data science. Have you ever imagined becoming an expert at effectively approaching data analysis problems, solving them, and extracting all of the available information from your data? If so, look no further: this is the course you need!

In this course, we will get you started with Python data analysis by introducing the basics of data analysis and the supporting Python libraries, such as matplotlib, NumPy, and pandas. You will create visualizations by choosing color maps, shapes, sizes, and palettes, then delve into statistical data analysis using distribution algorithms and correlations. You'll then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining. You'll be able to quickly and accurately perform hands-on sorting, reduction, and subsequent analysis, and fully appreciate how data analysis methods can support business decision-making. Finally, you will delve into advanced techniques such as performing regression, quantifying cause and effect using Bayesian methods, and discovering how to use Python's tools for supervised machine learning.

The course provides you with highly practical content explaining data analysis with Python, from the following Packt books:

Getting Started with Python Data Analysis
Python Data Analysis Cookbook
Mastering Python Data Analysis

By the end of this course, you will have all the knowledge you need to analyze data of varying complexity and turn it into actionable insights.

Style and approach

Learn Python data analysis using engaging examples and fun exercises, and with a gentle and friendly but comprehensive "learn-by-doing" approach. It offers you a useful way of analyzing the data that's specific to this course, but that can also be applied to any other data. This course is designed to be both a guide and a reference for moving beyond the basics of data analysis.

About the authors

Phuong Vothihong has an MSc in Computer Science, related to the area of machine learning. After graduating, she worked as a data scientist. She has significant experience in analyzing user behavior and building recommendation systems based on a user's web history. Phuong is interested in reading machine learning, mathematics, and algorithm books, as well as data analysis articles.

Martin Czygan studied German Literature and Computer Science in Leipzig, Germany. He has been working professionally as a software engineer for about 10 years. For the past eight years, he has been delving into Python and still enjoying it. In recent years he has been helping clients to build data-processing pipelines and search and analytics systems. His consultancy can be found at: http://www.xvfz.net.

Ivan Idris was born in Bulgaria to Indonesian parents. He moved to the Netherlands and graduated in experimental physics. His graduation thesis had a strong emphasis on applied computer science. After graduating, he worked for several companies as a software developer, data warehouse developer, and QA analyst. His professional interests are business intelligence, big data, and cloud computing. He enjoys writing clean, testable code and interesting technical articles. He is the author of NumPy Beginner's Guide, NumPy Cookbook, Learning NumPy, and Python Data Analysis, all by Packt Publishing.

Magnus Vilhelm Persson is a scientist with a passion for Python and open source software usage and development. He obtained his PhD in Physics/Astronomy from Copenhagen University's Centre for Star and Planet Formation (StarPlan) in 2013. Since then, he has continued his research in Astronomy at various academic institutes across Europe. In his research, he uses various types of data and analysis to gain insights into how stars are formed. He has participated in radio shows about Astronomy and also organized workshops and intensive courses about the use of Python for data analysis. You can check out his web page at: http://vilhelm.nu.

Luiz Felipe Martins holds a PhD in applied mathematics from Brown University and has worked as a researcher and educator for more than 20 years. His research is mainly in the field of applied probability. He has been involved in developing code for the open source homework system, WeBWorK, where he wrote a library for the visualization of systems of differential equations. He was supported by an NSF grant for this project. Currently, he is an Associate Professor in the Department of Mathematics at Cleveland State University, Cleveland, Ohio, where he has developed several courses in applied mathematics and scientific computing. His current duties include coordinating all first-year calculus sessions.
