Machine Learning Models – Do You Need a Model Catalog or Feature Store When Tracking Experiments?

machine learning, neural networks

Here are some uses I've found for each case:

  • The model catalog lets you document the use case, the machine learning framework that was used to train the model, the version of that framework, and the algorithm used along with its hyperparameters. – Link
  • A feature store is a tool for storing commonly used features. When data scientists develop features for a machine learning model, those features can be added to the feature store. This makes those features available for reuse. – Link
  • Experiment tracking is the process of saving all experiment-related information that you care about for every experiment you run. – Link

From my understanding, good experiment tracking would track models (thus documenting the use case) and training data (thus storing the features). What's the point of the model catalog and feature store then?

Best Answer

The feature store is primarily for reusability. Commonly used features can be pulled directly from the feature store without being re-engineered for each project. This differs from the data recorded by experiment tracking, which covers only the data and features explicitly used in a given experiment. The features saved with an experiment can simply be pointers into the feature store.
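To make the "pointers" idea concrete, here is a minimal sketch in plain in-memory Python. It does not use any real feature store product; `FeatureStore`, `FeatureRef`, and `ExperimentRun`, along with the feature names and values, are hypothetical and exist only for illustration. The experiment record keeps references (name plus version) to shared features instead of copies of them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeatureRef:
    """Pointer to a feature in the store (hypothetical, for illustration)."""
    name: str
    version: int


class FeatureStore:
    """Toy in-memory feature store: maps (name, version) to feature values."""

    def __init__(self):
        self._features = {}

    def register(self, name, values):
        # Registering an existing name again creates the next version.
        latest = max((v for (n, v) in self._features if n == name), default=0)
        ref = FeatureRef(name, latest + 1)
        self._features[(ref.name, ref.version)] = list(values)
        return ref

    def get(self, ref):
        return self._features[(ref.name, ref.version)]


@dataclass
class ExperimentRun:
    """Toy experiment record: stores pointers to features, not the data."""
    run_id: str
    params: dict
    feature_refs: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)


store = FeatureStore()
age_ref = store.register("customer_age_bucket", [2, 3, 1])
spend_ref = store.register("avg_monthly_spend", [120.0, 80.5, 43.2])

# Two experiments reuse the same engineered features without recomputing them.
run_a = ExperimentRun("run-a", {"max_depth": 3}, [age_ref, spend_ref])
run_b = ExperimentRun("run-b", {"max_depth": 5}, [age_ref])

# Resolving the pointers recovers exactly the features a run was trained on.
features_for_a = {ref.name: store.get(ref) for ref in run_a.feature_refs}
```

Because each reference carries a version, a run stays reproducible even if someone later registers a changed definition of the same feature.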

The model catalog described on Oracle's website is for reproducibility and for tracing the source of a deployed model. However, it does not save all of the experiment-related information (tuning runs and so on) the way an experiment tracking process does.
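The difference in scope can be sketched as follows. This is an illustration only, not Oracle's API or any real tracking tool: `TrackedRun`, `CatalogEntry`, and `promote` are hypothetical names, and the metric values, framework name, and version are made up. Experiment tracking keeps a record for every run, while the catalog holds a single documented entry for the model that gets deployed, with a link back to the run it came from.

```python
from dataclasses import dataclass


@dataclass
class TrackedRun:
    """One record per experiment, including runs that are never deployed."""
    run_id: str
    hyperparameters: dict
    metrics: dict


@dataclass
class CatalogEntry:
    """One record per catalogued model, documenting what it is and its origin."""
    model_name: str
    use_case: str
    framework: str
    framework_version: str
    algorithm: str
    hyperparameters: dict
    source_run_id: str  # link back to the experiment that produced the model


def promote(run, model_name, use_case, framework, framework_version, algorithm):
    """Create a catalog entry from the chosen run (hypothetical helper)."""
    return CatalogEntry(
        model_name=model_name,
        use_case=use_case,
        framework=framework,
        framework_version=framework_version,
        algorithm=algorithm,
        hyperparameters=dict(run.hyperparameters),
        source_run_id=run.run_id,
    )


# Experiment tracking: every tuning attempt is kept (illustrative values).
runs = [
    TrackedRun("run-1", {"max_depth": 3}, {"auc": 0.81}),
    TrackedRun("run-2", {"max_depth": 5}, {"auc": 0.86}),
    TrackedRun("run-3", {"max_depth": 8}, {"auc": 0.84}),
]

# Model catalog: only the selected model is documented for deployment.
best = max(runs, key=lambda r: r.metrics["auc"])
entry = promote(
    best,
    model_name="churn-classifier",
    use_case="Predict customer churn",
    framework="example-framework",  # placeholder, not a real library
    framework_version="1.0",        # placeholder version
    algorithm="gradient boosted trees",
)
```

In this sketch the catalog answers "what is deployed and where did it come from", while the list of runs answers "what did we try and how did each attempt perform".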
