In machine learning, data clean-up and feature engineering are incredibly time-consuming. And with different people, teams and roles working on different parts of the pipeline, effort is likely being duplicated somewhere along the way.

SageMaker Feature Store helps tackle that problem by providing a central place to store, share and update features.

In this hands-on tutorial, I give a brief overview of what SageMaker Feature Store is and why you might use it. Then we launch SageMaker Studio, upload a Jupyter Notebook, and dive into the code. We import the UCI auto-mpg dataset and apply some simple transformations. Next, we create a Feature Group, first in the SageMaker Studio UI and then in code. Finally, we ingest data and see how to consume it from the online and offline stores.
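To give a feel for the data preparation step before you open the notebook, here is a minimal sketch in plain Python. It is not the notebook's code. It parses a few sample rows written in the layout of the UCI auto-mpg file, drops rows with a missing horsepower value (an assumption on my part about how to handle the `?` marker), and adds the two fields every Feature Store record needs: a record identifier and an event time. The names `car_id`, `event_time`, `parse_row` and `to_records` are hypothetical and chosen only for illustration.

```python
import time

# Column names for the auto-mpg layout (underscored for convenience).
COLUMNS = ["mpg", "cylinders", "displacement", "horsepower", "weight",
           "acceleration", "model_year", "origin", "car_name"]

# Sample rows in the file's layout: whitespace-separated numbers,
# then a tab, then the quoted car name. "?" marks a missing value.
SAMPLE = (
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"chevrolet chevelle malibu"\n'
    '25.0   4   98.00      ?          2046.      19.0   71  1\t"ford pinto"\n'
    '15.0   8   350.0      165.0      3693.      11.5   70  1\t"buick skylark 320"\n'
)


def parse_row(line):
    """Turn one raw line into a dict; missing numeric values become None."""
    numeric_part, name_part = line.split("\t")
    numbers = [None if v == "?" else float(v) for v in numeric_part.split()]
    row = dict(zip(COLUMNS[:-1], numbers))
    row["car_name"] = name_part.strip().strip('"')
    return row


def to_records(text, event_time=None):
    """Build records carrying a record identifier and an event time."""
    event_time = time.time() if event_time is None else float(event_time)
    records = []
    for index, line in enumerate(text.splitlines()):
        if not line.strip():
            continue
        row = parse_row(line)
        if row["horsepower"] is None:
            continue  # assumption: drop incomplete rows rather than impute
        row["car_id"] = index            # hypothetical record identifier
        row["event_time"] = event_time   # Unix timestamp in seconds
        records.append(row)
    return records


if __name__ == "__main__":
    for record in to_records(SAMPLE):
        print(record["car_id"], record["car_name"], record["horsepower"])
```

In the video the same idea is carried out in the notebook with the real dataset; the point of the sketch is only that each row needs a unique identifier and a timestamp before it can be ingested into a Feature Group.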

Resources used in this video:
• Book: Getting Started with SageMaker Studio, by Michael Hsieh: https://www.amazon.com/Getting-Started-Amazon-SageMaker-Studio-ebook/dp/B09QKY4MXF/ref=sr_1_1
• Jupyter Notebook: https://github.com/PacktPublishing/Getting-Started-with-Amazon-SageMaker-Studio/tree/main/chapter04
• Dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data
• You might also be interested in Getting Started with SageMaker Studio: https://youtu.be/91z9s7iboeM
• And Getting Started with SageMaker Data Wrangler: https://youtu.be/tB0WrVlYhc4

If you’re interested in getting AWS certifications, check out these full courses. They include lots of hands-on demos, quizzes and full practice exams. Use FRIENDS10 for a 10% discount!
– AWS Certified Cloud Practitioner: https://academy.zerotomastery.io/a/aff_n20ghyn4/external?affcode=441520_lm7gzk-d
– AWS Certified Solutions Architect Associate: https://academy.zerotomastery.io/a/aff_464yrtnn/external?affcode=441520_lm7gzk-d

00:00 – What is SageMaker Feature Store?
00:54 – Why use SageMaker Feature Store?
03:06 – Online and Offline Feature Stores
04:18 – BOOK: Getting Started with SageMaker Studio, by Michael Hsieh
04:36 – Downloading the Jupyter Notebook for the hands-on demo
05:26 – DISCLAIMER: Check SageMaker pricing if you plan to follow along with the demo!
05:47 – Uploading and launching the Jupyter Notebook in SageMaker Studio
06:35 – Overview of the notebook and the auto data use case
07:19 – Setting up the code and importing the data from the UCI archive
09:04 – Creating a feature group in the SageMaker Studio UI
12:54 – Creating a feature group programmatically from a Jupyter Notebook
14:08 – Ingesting data into the feature group (in batches)
14:58 – Ingesting streaming data into the feature group
15:43 – Accessing data in the offline store (AWS Glue/Athena) programmatically
16:28 – Running a SQL query against the offline store in Amazon Athena
17:29 – IMPORTANT!! Shut down your resources
