Workflow Element Store

Data Sources
  1. Public Datasets
  2. Data Generation
  3. Unstructured data (Images / Videos)
  4. Web Scraping
  5. Crowdsourcing
  6. Data Logging
  7. Data Collaboration and Partnerships
  8. Surveys and Questionnaires
  9. Pre-existing Data
  10. Unstructured data (Audio)
  11. Mobile Applications or IoT Applications
  12. APIs and Data Feeds
  13. Structured Data (Tabular)
Data Warehouse / Data Lake
  1. S3
  2. GCP BigQuery
  3. Azure Blob Storage
  4. Informatica
  5. MS SQL Server
  6. AWS Redshift
  7. PostgreSQL
  8. NoSQL DB
  9. Azure Data Warehouse
  10. RDBMS
  11. GCS
  12. MySQL
  13. Oracle DB
EDA, Data Preprocessing & Feature Engineering
  1. Textual Feature Extraction
  2. Time-Based Features
  3. Dimensionality Reduction
  4. Handling Noisy Data
  5. Handling Time-Series Data
  6. Domain-Specific Feature Engineering
  7. Handling Categorical Data
  8. Dealing with Outliers
  9. Interaction Features
  10. Binning
  11. Polynomial Features
  12. Handling Missing Data
  13. Handling Imbalanced Classes
  14. Data Scaling and Normalization
  15. Feature Selection
  16. AutoEDA libraries
  17. Logarithmic Transform
  18. Auto-Preprocessing libraries
  19. Feature Extraction from Images
  20. Encoding Categorical Variables
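
Three of the preprocessing steps listed above (Handling Missing Data, Data Scaling and Normalization, Encoding Categorical Variables) can be illustrated with a minimal pure-Python sketch. The function names and sample values are hypothetical and chosen only for illustration; a real pipeline would more likely rely on a library such as pandas or scikit-learn.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each label to a 0/1 vector with one position per distinct category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical sample columns (assumptions, not data from the workflow).
ages = [20, None, 40, 30]
colors = ["red", "blue", "red"]

filled = impute_mean(ages)        # missing value replaced by the column mean
scaled = min_max_scale(filled)    # values now lie in [0, 1]
encoded = one_hot_encode(colors)  # categories sorted as ["blue", "red"]
```

Mean imputation and min-max scaling are only one option each; median imputation or standardization may suit skewed or outlier-heavy columns better.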
Model Selection
  1. Train-Test Split
  2. Blackbox Techniques
  3. Forecasting
  4. Ensemble Techniques
  5. Supervised Learning - Regression
  6. Data Partitioning
  7. Supervised Learning - Multiclass Classification
  8. Supervised Learning - Binary Classification
  9. Unsupervised Learning
  10. Time Series Analysis
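
As a rough illustration of the Train-Test Split item above, the following sketch shuffles a dataset and holds out a fraction for testing. The function name, the default fraction, and the sample rows are assumptions made for this example, not part of the workflow itself.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: 20 row identifiers.
rows = list(range(20))
train, test = train_test_split(rows, test_fraction=0.25, seed=42)
```

A shuffled split like this assumes the rows are independent. For the forecasting and time-series items in the same list, a sequential split that keeps the most recent observations for testing is usually the safer choice.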
Model Training & Hyperparameter Tuning
  1. Early Stopping
  2. Data Partitioning - Sequential
  3. Transfer Learning
  4. Gradient Clipping
  5. Learning Rate Scheduling
  6. Batch Normalization
  7. Train-Test Split
  8. Regular Monitoring and Logging
  9. Regularization
  10. Ensemble Methods
  11. Data Augmentation
  12. Batch Size Selection
  13. Cross-Validation
  14. Weight Initialization
  15. Hyperparameter Tuning
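
The Early Stopping item above can be sketched as a patience rule applied to per-epoch validation losses. The loss values below are simulated stand-ins for a real training loop, and the function name and `patience` default are assumptions for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss), stopping once the validation loss
    has failed to improve for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training; later epochs are never evaluated
    return best_epoch, best_loss


# Simulated validation losses (hypothetical values).
simulated_losses = [0.90, 0.70, 0.60, 0.65, 0.66, 0.50]
best_epoch, best_loss = early_stopping_epoch(simulated_losses, patience=2)
```

With `patience=2` the loop stops after the second non-improving epoch, so the final, lower loss is never reached. This is the usual trade-off of early stopping: a larger patience costs more training time but is less likely to stop on a temporary plateau.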
Model Evaluation
  1. Train-Test Split
  2. Model Interpretability
  3. Performance Visualization
  4. External Validation
  5. Cross-Validation
  6. Hyperparameter Tuning
  7. Evaluation Metrics
  8. Data Partitioning
  9. Regularization Techniques
  10. Model Comparison
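
For the Evaluation Metrics item above, a minimal sketch of the common binary-classification metrics is shown below. The labels and predictions are made-up values, and the function name is hypothetical.

```python
def binary_classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
```

Accuracy alone can be misleading on imbalanced classes, which is why precision, recall, and F1 are reported alongside it here.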
Model Deployment
  1. Model Serialization
  2. Streamlit
  3. Concept Drift Detection
  4. Continuous Integration and Deployment (CI/CD)
  5. Containerization
  6. Alerting and Notification
  7. Serverless Computing
  8. Model Health Monitoring
  9. Edge Deployment
  10. Prediction Logging
  11. Feedback Collection
  12. Bias and Fairness Assessment
  13. Model Monitoring and Maintenance
  14. Cloud Deployment
  15. Web APIs - Flask, FastAPI, etc.
  16. Data Drift Monitoring
  17. Performance Metrics
  18. Documentation and API Documentation
  19. Documentation and Reporting
  20. A/B Testing
  21. Security Considerations
  22. Model Retraining and Updating
  23. Model Registry
  24. Model Versioning
  25. Monitoring and Logging
  26. Error Analysis
  27. Model Drift
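
The Model Serialization and Model Versioning items above can be sketched by storing a model's parameters together with a version tag and restoring them later. This example assumes a toy linear model and a JSON payload; the function names, the payload layout, and the version string are all hypothetical, and real deployments often use formats such as pickle, joblib, or ONNX instead.

```python
import json


def serialize_model(weights, bias, version):
    """Serialize a toy linear model's parameters with a version tag."""
    return json.dumps({"version": version, "weights": weights, "bias": bias})


def load_model(payload):
    """Rebuild a predict function from a serialized payload."""
    params = json.loads(payload)

    def predict(features):
        # Linear model: dot(weights, features) + bias
        weighted = sum(w * x for w, x in zip(params["weights"], features))
        return weighted + params["bias"]

    return params["version"], predict


# Hypothetical parameters for illustration only.
payload = serialize_model(weights=[0.5, -1.0], bias=2.0, version="1.0.0")
version, predict = load_model(payload)
prediction = predict([4.0, 1.0])
```

Keeping the version inside the payload lets a serving layer log which model produced each prediction, which in turn supports the prediction logging and retraining items in the same list.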
End User Device
  1. Mobile
  2. End User Machine
ML Workflow Beginner - Architecture
  • Element belongs to model
  • Element does not belong to model
Feature Store (Online / Offline)
Data Sources
Data Warehouse / Data Lake
EDA, Data Preprocessing & Feature Engineering
Model Selection
Model Training & Hyperparameter Tuning
Model Evaluation
Model Deployment
End User Device
Model Registry