Workflow Element Store

  1. Data Partitioning - Train, Validation, & Test
  2. Binning / Discretization
  3. Polynomial Features
  4. Annotation
  5. Handling Missing Data
  6. Feature Selection
  7. Textual Feature Extraction
  8. Data Scaling and Normalization
  9. Handling Imbalanced Classes
  10. Auto-Preprocessing Libraries
  11. Domain-Specific Feature Engineering
  12. Dealing with Outliers
  13. Dimensionality Reduction
  14. Data Transformations
  15. Augmentation
  16. Handling Time-Series Data
  17. Interaction Features
  18. Handling Categorical Data
  19. AutoEDA Libraries
  20. Time-Based Features
  21. Feature Extraction from Images
  22. Handling Noisy Data
  1. Transfer Learning
  2. Forecasting Techniques
  3. Model Comparison
  4. Performance Visualization
  5. Natural Language Processing
  6. Recommendation Engine
  7. Association Rules
  8. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  9. Black-Box Neural Network Models
  10. Binary Classification Techniques
  11. Weight Initialization
  12. Batch Size Selection
  13. Ensemble Techniques
  14. Word Embeddings
  15. Learning Rate Scheduling
  16. Hyperparameter Tuning
  17. Regression Analysis
  18. Transfer Learning
  19. Regularization Techniques
  20. Cross-Validation
  21. Evaluation Metrics
  22. AutoML
  23. Clustering
  24. Regular Monitoring and Logging
  25. Cross-Validation
  26. Reinforcement Learning
  27. Early Stopping
  28. Multiclass Classification Techniques
  29. Data Augmentation
  30. Regularization
  31. External Validation
  32. Network Analytics / Geospatial Analytics
  33. Model Interpretability
  34. Batch Normalization
  1. Apache Airflow
  2. Data Preprocessing Pipeline Models
  3. GitHub Actions
  4. GitHub
  5. Evidently.ai
  6. Kafka Brokers
  7. Code Repository
  8. Databases
  9. Model Registry
  10. Data Warehouse
ML Workflow Advanced - Architecture
  • Element belongs to model
  • Element does not belong to model
Data Sources

Streaming Data

Batch Data

Cloud Storage

Labeled Data

Feature Engineering Pipeline

Experimentation

ML Model

Repository

CI/CD Component

Continuous integration/Continuous delivery

Continuous deployment

Artifact Store

Feature Store System

Offline DB / Online DB

Orchestration Component

Artifact Store

CI/CD Component

Model Registry

Scheduler

Workflow Orchestration Component

Automation ML Workflow Pipeline

Monitoring Component

Model Serving Component

(Prediction on new batch or streaming data)
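
The diagram names its components but does not show how they hand work to one another. The following is a minimal sketch of one possible flow, using only the Python standard library: a feature engineering step, a training step, a model registry, and a serving step that predicts on a new batch. Every name here (`ModelRegistry`, `fit_scaler`, `train`, `run_workflow`, `serve`) is hypothetical and chosen for illustration, and the threshold classifier is a stand-in for whatever ML model the workflow actually uses.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List

Model = Callable[[float], int]


@dataclass
class ModelRegistry:
    """Hypothetical in-memory stand-in for the Model Registry box."""
    versions: Dict[int, Model] = field(default_factory=dict)

    def register(self, model: Model) -> int:
        # Versions are numbered 1, 2, 3, ... in registration order.
        version = len(self.versions) + 1
        self.versions[version] = model
        return version

    def latest(self) -> Model:
        return self.versions[max(self.versions)]


def fit_scaler(raw: List[float]) -> Callable[[float], float]:
    """Feature engineering step: fit min-max scaling on training data only."""
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1.0  # avoid division by zero on constant input
    return lambda x: (x - lo) / span


def train(features: List[float], labels: List[int]) -> Model:
    """Toy model: threshold halfway between the two class means.

    Assumes both classes are present and class 1 has the larger mean.
    """
    pos = mean(f for f, y in zip(features, labels) if y == 1)
    neg = mean(f for f, y in zip(features, labels) if y == 0)
    threshold = (pos + neg) / 2
    return lambda f: int(f >= threshold)


def run_workflow(raw: List[float], labels: List[int],
                 registry: ModelRegistry) -> int:
    """Orchestration step: features -> training -> registry."""
    scale = fit_scaler(raw)
    classify = train([scale(x) for x in raw], labels)
    # Register scaler and classifier together so serving reuses the
    # scaling parameters learned at training time.
    return registry.register(lambda x: classify(scale(x)))


def serve(registry: ModelRegistry, new_batch: List[float]) -> List[int]:
    """Model serving step: predict on a new batch with the latest model."""
    model = registry.latest()
    return [model(x) for x in new_batch]


if __name__ == "__main__":
    registry = ModelRegistry()
    version = run_workflow([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1], registry)
    print(version, serve(registry, [3.0, 7.0]))
```

Bundling the fitted scaler with the classifier before registration is a deliberate choice in this sketch: the serving step then applies the same transformation as training. In the diagram's terms, that responsibility would more likely sit with the Feature Store System and the Artifact Store.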