Workflow Element Store

Data Sources
  1. Public Datasets
  2. Unstructured data (Audio)
  3. APIs and Data Feeds
  4. Pre-existing Data
  5. Crowdsourcing
  6. Web Scraping
  7. Data Logging
  8. Unstructured data (Images / Videos)
  9. Data Collaboration and Partnerships
  10. Mobile Applications or IoT Applications
  11. Structured Data (Tabular)
  12. Surveys and Questionnaires
  13. Data Generation
Data Warehouse / Data Lake
  1. GCS
  2. S3
  3. GCP BigQuery
  4. Azure blob storage
  5. Azure Data Warehouse
  6. Oracle DB
  7. RDBMS
  8. MS SQL server
  9. Informatica
  10. NoSQL DB
  11. AWS Redshift
  12. PostgreSQL
  13. MySQL
EDA, Data Preprocessing & Feature Engineering
  1. Feature Selection
  2. Polynomial Features
  3. Auto-Preprocessing libraries
  4. Time-Based Features
  5. AutoEDA libraries
  6. Binning
  7. Domain-Specific Feature Engineering
  8. Handling Noisy Data
  9. Textual Feature Extraction
  10. Data Scaling and Normalization
  11. Handling Categorical Data
  12. Interaction Features
  13. Feature Extraction from Images
  14. Dimensionality Reduction
  15. Dealing with Outliers
  16. Handling Imbalanced Classes
  17. Encoding Categorical Variables
  18. Logarithmic Transform
  19. Handling Time-Series Data
  20. Handling Missing Data
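
A few of the preprocessing elements above (handling missing data, data scaling and normalization, logarithmic transform, binning) can be illustrated with a minimal standard-library sketch. The function names and sample values are hypothetical and chosen only for illustration; a real pipeline would more likely rely on a dedicated preprocessing library.

```python
import math
from statistics import mean


def impute_mean(values):
    """Handling missing data: replace None with the mean of observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Data scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Logarithmic transform: log(1 + x), defined for x > -1."""
    return [math.log1p(v) for v in values]


def equal_width_bins(values, n_bins):
    """Binning: assign each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


raw = [2.0, None, 4.0, 10.0]            # hypothetical feature column
filled = impute_mean(raw)
scaled = min_max_scale(filled)
bins = equal_width_bins(filled, n_bins=2)
```

Mean imputation is only one of several strategies for missing data; it is used here because it needs no extra assumptions about the column.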
Model Selection
  1. Supervised Learning-Regression
  2. Train-Test Split
  3. Supervised Learning-Binary Classification
  4. Ensemble Techniques
  5. Data Partitioning
  6. Supervised Learning-Multiclass Classification
  7. Unsupervised Learning
  8. Forecasting
  9. Time Series Analysis
  10. Blackbox Techniques
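
The train-test split and data partitioning elements above can be sketched as follows. This is a minimal illustration, assuming the rows fit in a Python list; the helper names are hypothetical. The sequential variant keeps the original order, which is the usual choice for forecasting and time series data so that the test set lies after the training set.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Random split: shuffle a copy of the rows, then cut off the test part."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def sequential_split(rows, test_fraction=0.2):
    """Sequential split: the last rows become the test set, order preserved."""
    rows = list(rows)
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


rows = list(range(10))                   # hypothetical dataset of 10 rows
train, test = train_test_split(rows)
seq_train, seq_test = sequential_split(rows)
```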
Model Training & Hyperparameter Tuning
  1. Regular Monitoring and Logging
  2. Cross-Validation
  3. Hyperparameter Tuning
  4. Early Stopping
  5. Data Partitioning (Sequential)
  6. Regularization
  7. Gradient Clipping
  8. Data Augmentation
  9. Ensemble Methods
  10. Weight Initialization
  11. Train-Test Split
  12. Batch Size Selection
  13. Learning Rate Scheduling
  14. Batch Normalization
  15. Transfer Learning
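
Two of the training elements above, early stopping and learning rate scheduling, can be sketched without any framework. The class and function names, the step-decay schedule, and the loss values are assumptions made for illustration, not a prescribed implementation.

```python
def step_decay(initial_lr, epoch, drop=0.5, every=10):
    """Learning rate scheduling: multiply the rate by `drop` every `every` epochs."""
    return initial_lr * (drop ** (epoch // every))


class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss         # improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1         # no improvement this epoch
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience):
    """Return the 1-based epoch at which training would stop."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)


# Hypothetical validation losses: the best value is reached at epoch 3.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]
stopped_at = epochs_run(losses, patience=2)
```

With a patience of 2, this run stops before the later improvement at epoch 7, which shows the trade-off the patience setting controls.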
Model Evaluation
  1. Model Comparison
  2. Evaluation Metrics
  3. Hyperparameter Tuning
  4. Cross-Validation
  5. Performance Visualization
  6. External Validation
  7. Train-Test Split
  8. Model Interpretability
  9. Data Partitioning
  10. Regularization Techniques
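
The cross-validation and evaluation metrics elements above can be sketched as follows. The sketch assumes contiguous, unshuffled folds and a binary label where 1 is the positive class; both helper names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Cross-validation: yield (train_idx, test_idx) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def precision_recall_f1(y_true, y_pred, positive=1):
    """Evaluation metrics for one class, computed from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


folds = list(k_fold_indices(n_samples=5, k=2))
metrics = precision_recall_f1([1, 0, 1, 1], [1, 1, 0, 1])
```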
Model Deployment
  1. A/B Testing
  2. Model Health Monitoring
  3. Web APIs - Flask, FastAPI, etc.
  4. Alerting and Notification
  5. Security Considerations
  6. Model Drift
  7. Bias and Fairness Assessment
  8. Performance Metrics
  9. Continuous Integration and Deployment (CI/CD)
  10. Error Analysis
  11. Model Retraining and Updating
  12. Containerization
  13. Prediction Logging
  14. Model Registry
  15. Edge Deployment
  16. Cloud Deployment
  17. Documentation and Reporting
  18. Serverless Computing
  19. Model Serialization
  20. Streamlit
  21. Concept Drift Detection
  22. Model Versioning
  23. Feedback Collection
  24. Documentation and API Documentation
  25. Model Monitoring and Maintenance
  26. Monitoring and Logging
  27. Data Drift Monitoring
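
As one illustration of the data drift monitoring element above, the sketch below compares a training-time feature distribution with live values using the population stability index (PSI). PSI is one possible drift measure, chosen here as an assumption; the bin edges, sample values, and function names are hypothetical.

```python
import math


def histogram_fractions(values, edges):
    """Fraction of values per bin; the last bin includes its right edge.

    Values outside the edges are not counted in any bin.
    """
    counts = [0] * (len(edges) - 1)
    last = len(edges) - 2
    for v in values:
        for i in range(len(edges) - 1):
            in_bin = edges[i] <= v < edges[i + 1]
            if in_bin or (i == last and v == edges[i + 1]):
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def population_stability_index(expected, actual, edges, eps=1e-6):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    exp_frac = histogram_fractions(expected, edges)
    act_frac = histogram_fractions(actual, edges)
    psi = 0.0
    for e, a in zip(exp_frac, act_frac):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        psi += (a - e) * math.log(a / e)
    return psi


edges = [0.0, 0.5, 1.0]
training = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
live = [0.6, 0.7, 0.8, 0.9, 0.95, 0.85, 0.1, 0.75]

psi_same = population_stability_index(training, training, edges)
psi_shift = population_stability_index(training, live, edges)
```

A value of zero means the binned distributions match; larger values indicate a larger shift. Any alerting threshold would need to be chosen per feature and use case.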
End User Device
  1. End User Machine
  2. Mobile
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model

Feature Store
(Online / Offline)

Data Sources

Data Warehouse / Data Lake

EDA, Data Preprocessing & Feature Engineering

Model Selection

Model Training & Hyperparameter Tuning

Model Evaluation

Model Deployment

End User Device

Model Registry
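
The architecture above can be read as an ordered set of stages, where each model uses some elements from the element store and not others. The sketch below is one hypothetical way to represent that. It assumes the stage order follows the order in which the stages are listed, leaves out the Feature Store and Model Registry as side components, and uses an invented model name and class.

```python
from dataclasses import dataclass, field

# Stage order is an assumption taken from the listing order in the diagram.
STAGES = [
    "Data Sources",
    "Data Warehouse / Data Lake",
    "EDA, Data Preprocessing & Feature Engineering",
    "Model Selection",
    "Model Training & Hyperparameter Tuning",
    "Model Evaluation",
    "Model Deployment",
    "End User Device",
]


@dataclass
class ModelWorkflow:
    """Tracks which store elements belong to one model, grouped by stage."""

    name: str
    elements: dict = field(default_factory=dict)   # stage -> set of elements

    def add(self, stage, element):
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.elements.setdefault(stage, set()).add(element)

    def belongs(self, stage, element):
        """True if the element belongs to this model at the given stage."""
        return element in self.elements.get(stage, set())

    def ordered(self):
        """Return the used stages in workflow order with sorted elements."""
        return [(s, sorted(self.elements[s])) for s in STAGES if s in self.elements]


workflow = ModelWorkflow("example-model")          # hypothetical model name
workflow.add("Model Deployment", "Containerization")
workflow.add("Data Sources", "Public Datasets")
```

Calling `workflow.ordered()` returns the stages in workflow order regardless of the order in which elements were added.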