Always know what to expect from your data with great_expectations
Great Expectations is a shared, open standard for data quality. It helps data teams eliminate pipeline debt through data testing, documentation, and profiling.
This beginner-level tutorial introduces great_expectations and shows how to bring it into your own data wrangling or exploratory data analysis as a data testing, data validation, or data documentation tool.
Tutorial Level: Beginner
Content Timeline:
- Code & Jupyter Notebook Introduction
- great_expectations — what, why, and how?
- Why do you need great_expectations?
- What actually is great_expectations?
- great_expectations in simple terms
- great_expectations as a data documentation tool
- great_expectations method at a glance
- great_expectations installation
- great_expectations initialization
- great_expectations context
- great_expectations demo with the titanic dataset
- Export and apply great_expectations config
- Working with time-series dataset
- Processing SparkDataFrame
- great_expectations extension — debt_expectations
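As a rough picture of what "great_expectations in simple terms" means, an expectation can be thought of as a named, reusable assertion about a column that reports what failed instead of just raising an error. The sketch below is plain Python and is not the great_expectations API. The function only borrows the library's naming style, and the passenger rows are made-up values shaped loosely like the Titanic dataset.

```python
# Illustrative stand-in only: plain Python, NOT the great_expectations library.

def expect_column_values_to_be_between(rows, column, min_value, max_value):
    """Check that every non-null value in `column` lies in [min_value, max_value].

    Returns a small report dict rather than raising, so the outcome can be
    logged or turned into documentation.
    """
    unexpected = [
        row[column]
        for row in rows
        if row.get(column) is not None
        and not (min_value <= row[column] <= max_value)
    ]
    return {
        "success": not unexpected,          # True only when nothing is out of range
        "unexpected_count": len(unexpected),
        "unexpected_list": unexpected,
    }


# Hypothetical rows (invented for this sketch, not real Titanic records).
passengers = [
    {"Name": "passenger_1", "Age": 22},
    {"Name": "passenger_2", "Age": 38},
    {"Name": "passenger_3", "Age": -4},    # deliberately invalid age
    {"Name": "passenger_4", "Age": None},  # nulls are skipped by this check
]

result = expect_column_values_to_be_between(passengers, "Age", 0, 120)
print(result)
```

The design point is that the check returns a structured result, so the same rule can serve as a test, a validation step in a pipeline, and a line of documentation about the data.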
Library great_expectations GitHub:
GitHub URL for the samples in the Video:
Be Good and Do Good.
Thanks