Workflow Element Store

  1. APIs and Data Feeds
  2. Data Collaboration and Partnerships
  3. Databases - SQL
  4. Flat files
  5. Surveys and Questionnaires
  6. Mobile Applications or IoT Applications
  7. Databases - NoSQL
  8. Public Datasets
  9. Experiments (DoE)
  10. Feedback Data
  11. Web Scraping
  1. GCP Dataflow
  2. AWS Kinesis
  3. MS SQL Server
  4. ETL/ELT pipeline
  5. Oracle DB
  6. AWS Glue
  7. Azure Stream Analytics
  8. MySQL
  9. GCP Data Fusion
  10. RDBMS
  11. MongoDB
  12. AWS S3
  13. Apache Kafka
  14. AWS Redshift
  15. Azure blob storage
  16. Azure Data Factory (ADF)
  17. Azure Synapse
  18. GCS
  19. GCP BigQuery
  20. PostgreSQL
  21. AWS RDS
  1. Feature Selection
  2. Augmentation
  3. Feature Extraction from Images
  4. Interaction Features
  5. Handling Categorical Data
  6. Data Partitioning - Train, Validation, & Test
  7. Auto-Preprocessing libraries
  8. Annotation
  9. Handling Time-Series Data
  10. Polynomial Features
  11. Domain-Specific Feature Engineering
  12. Handling Imbalanced Classes
  13. AutoEDA libraries
  14. Data Transformations
  15. Handling Missing Data
  16. Textual Feature Extraction
  17. Binning / Discretization
  18. Handling Noisy Data
  19. Dealing with Outliers
  20. Time-Based Features
  21. Data Scaling and Normalization
  22. Dimensionality Reduction
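
The "Data Partitioning - Train, Validation, & Test" element in the list above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `partition`, the 70/15/15 default fractions, and the seed are assumptions chosen for illustration, not values taken from the workflow itself.

```python
import random


def partition(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducible splits
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains is the test set
    return train, val, test


if __name__ == "__main__":
    train, val, test = partition(range(100))
    print(len(train), len(val), len(test))
```

A random shuffle like this is usually unsuitable for time-series data, where the split should respect chronological order instead.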
  1. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  2. Data Augmentation
  3. Batch Normalization
  4. Batch Size Selection
  5. Regular Monitoring and Logging
  6. Performance Visualization
  7. Cross-Validation
  8. Natural Language Processing
  9. Early Stopping
  10. AutoML
  11. Hyperparameter Tuning
  12. Word Embeddings
  13. Regularization Techniques
  14. Multiclass Classification Techniques
  15. Clustering
  16. Weight Initialization
  17. Regularization
  18. Binary Classification Techniques
  19. Reinforcement Learning
  20. Ensemble Techniques
  21. External Validation
  22. Blackbox - Neural Network Models
  23. Model Interpretability
  24. Transfer Learning
  25. Evaluation Metrics
  26. Regression Analysis
  27. Association Rules
  28. Recommendation Engine
  29. Network Analytics / Geospatial Analytics
  30. Model Comparison
  31. Learning Rate Scheduling
  32. Forecasting Techniques
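
The Cross-Validation element in the list above can be sketched in plain Python as follows. This is an illustrative k-fold splitter, not the workflow's own implementation; the names `k_fold_indices` and `cross_validate`, and the `fit` and `score` callables, are hypothetical. In practice scikit-learn's `KFold` and `cross_val_score` cover the same ground.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


def cross_validate(xs, ys, fit, score, k=5):
    """Average a user-supplied score over k folds.

    `fit(xs, ys)` returns a model; `score(model, xs, ys)` returns a number.
    Both are caller-provided (hypothetical) callables.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx]))
    return sum(scores) / len(scores)
```

Each sample lands in exactly one validation fold, so every observation is used for both training and validation across the k rounds.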
  1. Data Preprocessing pipeline models
  2. Code repository
  3. Data warehouse
  4. Model registry
  5. Databases
  1. FastAPI
  2. Edge Deployment
  3. Concept Drift Detection
  4. Bias and Fairness Assessment
  5. Data Drift Monitoring
  6. Flask
  7. Cloud Deployment
  8. Performance Metrics
  9. Model Health Monitoring
  10. Model Serialization
  11. Feedback Collection
  12. Streamlit
  13. Containerization
  14. Alerting and Notification
  15. Model Drift
  16. Prediction Logging
  17. Serverless Computing
  18. Model Versioning
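
One way to approach the Data Drift Monitoring element in the list above is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in live traffic. The sketch below is an assumption about how such a check might look, not a method the workflow prescribes; the function name `psi`, the equal-width binning, and the `eps` floor for empty bins are illustrative choices.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / n_bins            # equal-width bins over the reference range

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]
    live = [0.5 + i / 200 for i in range(100)]   # shifted toward the upper half
    print(psi(reference, reference), psi(reference, live))
```

Identical samples give a PSI of zero, and the value grows as the live distribution moves away from the reference. A commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift, but the threshold that should trigger an alert depends on the feature and the use case.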
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  1. Data Collection: API Stream, Web crawler (Python, Selenium)
  2. Data Ingestion
  3. Data Landing Zone: Store Data from all the Sources
  4. Data Cleaning / Preprocessing: Derived & Base features
  5. Data Training & Modelling

Inference Pipeline
  1. Input Data for Forecasting: Input Data
  2. Cleaned & Processed Data
  3. Inference: pickle, Joblib
  4. Streamlit, Inference API
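
The inference pipeline above lists pickle and Joblib as ways to hand a trained model to the inference step. The following is a minimal sketch of that hand-off using the standard-library `pickle` module. The "model" here is a hypothetical stand-in (a dict holding a training mean), and the names `fit_mean_forecaster`, `predict`, `save_model`, `load_model`, and `demo` are assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def fit_mean_forecaster(values):
    """Hypothetical stand-in for training: the 'model' is just the training mean."""
    return {"kind": "mean_forecaster", "mean": sum(values) / len(values)}


def predict(model, horizon):
    """Forecast `horizon` steps ahead by repeating the stored mean."""
    return [model["mean"]] * horizon


def save_model(model, path):
    """Training side: serialize the fitted model to disk."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Inference side: restore the model.

    Only unpickle files from a trusted source, since loading a pickle
    can execute arbitrary code.
    """
    with open(path, "rb") as fh:
        return pickle.load(fh)


def demo():
    """Round trip: train, save, reload, and forecast three steps."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.pkl"
        save_model(fit_mean_forecaster([1.0, 2.0, 3.0]), path)
        return predict(load_model(path), horizon=3)


if __name__ == "__main__":
    print(demo())
```

Joblib exposes a similar pair of calls, `joblib.dump(model, path)` and `joblib.load(path)`, and is often preferred for models that carry large NumPy arrays. A Streamlit app or an inference API would typically call the equivalent of `load_model` once at startup and `predict` per request.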