Workflow Element Store

  1. Surveys and Questionnaires
  2. Mobile Applications or IoT Applications
  3. Data Collaboration and Partnerships
  4. Databases - NoSQL
  5. Feedback Data
  6. Public Datasets
  7. Flat files
  8. APIs and Data Feeds
  9. Databases - SQL
  10. Experiments (DoE)
  11. Web Scraping
  1. Azure Data Factory (ADF)
  2. AWS S3
  3. ETL/ELT pipeline
  4. AWS Kinesis
  5. RDBMS
  6. GCP Dataflow
  7. AWS RDS
  8. Oracle DB
  9. Azure Synapse
  10. Azure Blob Storage
  11. MS SQL Server
  12. GCP Data Fusion
  13. Azure Stream Analytics
  14. MySQL
  15. Google Cloud Storage (GCS)
  16. AWS Glue
  17. Apache Kafka
  18. PostgreSQL
  19. MongoDB
  20. AWS Redshift
  21. GCP BigQuery
  1. Data Scaling and Normalization
  2. AutoEDA libraries
  3. Domain-Specific Feature Engineering
  4. Data Transformations
  5. Textual Feature Extraction
  6. Binning / Discretization
  7. Handling Imbalanced Classes
  8. Handling Missing Data
  9. Handling Noisy Data
  10. Feature Selection
  11. Augmentation
  12. Data Partitioning - Train, Validation, & Test
  13. Auto-Preprocessing libraries
  14. Dimensionality Reduction
  15. Feature Extraction from Images
  16. Interaction Features
  17. Time-Based Features
  18. Handling Time-Series Data
  19. Handling Categorical Data
  20. Annotation
  21. Dealing with Outliers
  22. Polynomial Features
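
Three of the preprocessing elements listed above (Data Partitioning, Handling Missing Data, and Data Scaling and Normalization) can be sketched together in plain Python. This is a minimal illustration and not a prescribed implementation: the helper names `train_val_test_split`, `fit_preprocessor`, and `transform`, the split fractions, and the sample values are all assumptions made for the example. The data is partitioned first, and the median and scaling bounds are fitted on the training split only, so no information from the validation or test data leaks into preprocessing.

```python
import random
import statistics

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows reproducibly and cut them into three disjoint parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def fit_preprocessor(train_values):
    """Learn the imputation median and min/max bounds from training data only."""
    observed = [v for v in train_values if v is not None]
    return {
        "median": statistics.median(observed),
        "lo": min(observed),
        "hi": max(observed),
    }

def transform(values, params):
    """Impute missing entries (None) with the median, then min-max scale."""
    span = params["hi"] - params["lo"]
    scaled = []
    for v in values:
        if v is None:
            v = params["median"]
        scaled.append(0.0 if span == 0 else (v - params["lo"]) / span)
    return scaled

# Hypothetical single-feature dataset with two missing readings.
raw = [4.0, None, 10.0, 6.0, None, 8.0, 2.0, 12.0, 6.0, 10.0]

train, val, test = train_val_test_split(raw)
params = fit_preprocessor(train)
train_x, val_x, test_x = (transform(part, params) for part in (train, val, test))
```

Because the bounds come from the training split, validation or test values may fall slightly outside [0, 1]; that is expected and mirrors what happens with unseen data at inference time.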
  1. Early Stopping
  2. Recommendation Engine
  3. Model Comparison
  4. Reinforcement Learning
  5. Batch Size Selection
  6. Transfer Learning
  7. Learning Rate Scheduling
  8. Cross-Validation
  9. Word Embeddings
  10. Regression Analysis
  11. Regularization Techniques
  12. Data Augmentation
  13. Multiclass Classification Techniques
  14. Natural Language Processing
  15. Association Rules
  16. Blackbox - Neural Network Models
  17. Ensemble Techniques
  18. Performance Visualization
  19. External Validation
  20. Network Analytics / Geospatial Analytics
  21. Evaluation Metrics
  22. Regularization
  23. Hyperparameter Tuning
  24. Forecasting Techniques
  25. Binary Classification Techniques
  26. Model Interpretability
  27. Batch Normalization
  28. AutoML
  29. Clustering
  30. Regular Monitoring and Logging
  31. Weight Initialization
  32. GridSearchCV, RandomizedSearchCV, BayesSearchCV
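
Several modelling elements in the list above (Cross-Validation, Regularization, Hyperparameter Tuning, Evaluation Metrics) interact in a grid search. The sketch below shows the idea in plain Python rather than through a library class such as GridSearchCV. The one-feature ridge model, the synthetic data, the candidate grid, and every function name here are assumptions chosen to keep the example small.

```python
import random

def fit_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]

def cv_score(xs, ys, lam, k=5):
    """Average validation MSE over k folds for one regularization strength."""
    scores = []
    for train, val in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        scores.append(mse(w, [xs[i] for i in val], [ys[i] for i in val]))
    return sum(scores) / len(scores)

def grid_search(xs, ys, grid, k=5):
    """Evaluate every candidate and return the one with the lowest CV error."""
    results = {lam: cv_score(xs, ys, lam, k) for lam in grid}
    best = min(results, key=results.get)
    return best, results

# Synthetic data: y is roughly 3 * x plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(1, 41)]
ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]

grid = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam, results = grid_search(xs, ys, grid)
print("best lambda:", best_lam)
```

A randomized or Bayesian search would keep `cv_score` as the objective and change only how candidate values are proposed.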
  1. Model Registry
  2. Code Repository
  3. Data Preprocessing Pipeline Models
  4. Data Warehouse
  5. Databases
  1. Model Versioning
  2. Data Drift Monitoring
  3. Containerization
  4. Concept Drift Detection
  5. Prediction Logging
  6. Flask
  7. Bias and Fairness Assessment
  8. Performance Metrics
  9. Model Health Monitoring
  10. Model Drift
  11. Edge Deployment
  12. Feedback Collection
  13. FastAPI
  14. Cloud Deployment
  15. Serverless Computing
  16. Streamlit
  17. Alerting and Notification
  18. Model Serialization
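
Data Drift Monitoring and Alerting and Notification, both listed above, are often paired: a statistic compares live feature values against the training data and an alert fires when it crosses a threshold. The sketch below uses the Population Stability Index (PSI) as one possible statistic. The function names, the bin count, and the 0.25 alert threshold (a common rule of thumb, not a value taken from this document) are assumptions.

```python
import math
import random

def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate reference (all values equal): avoid dividing by zero

    def proportions(values):
        counts = [0] * bins
        for v in values:
            i = int((v - lo) / width)
            counts[min(max(i, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

def check_drift(reference, current, threshold=0.25):
    """Return the PSI score and whether it exceeds the alert threshold."""
    score = psi(reference, current)
    return {"psi": score, "alert": score > threshold}

# Synthetic feature values: one sample from the training distribution,
# one from a distribution whose mean has shifted by two standard deviations.
rng = random.Random(42)
training_feature = [rng.gauss(0.0, 1.0) for _ in range(1000)]
same_distribution = [rng.gauss(0.0, 1.0) for _ in range(1000)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(1000)]

print(check_drift(training_feature, same_distribution))
print(check_drift(training_feature, shifted))
```

In a deployed system the `alert` flag would feed whatever notification channel is in use; that wiring is outside the scope of this sketch.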
ML Workflow Intermediate - Architecture
  • Element belongs to the model
  • Element does not belong to the model
Training Pipeline
  1. Data Collection: API Stream, Web crawler (Python, Selenium)
  2. Data Ingestion
  3. Data Landing Zone: Store data from all the sources
  4. Data Cleaning / Preprocessing: Derived & base features
  5. Data Training & Modelling
Inference Pipeline
  1. Input Data for Forecasting: Input Data
  2. Cleaned & Processed Data
  3. Inference: pickle, Joblib, Streamlit, Inference API
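
The hand-off between the training pipeline and the inference pipeline can be sketched as follows: training cleans the data, fits a model, and serializes it; inference loads the serialized artifact and produces forecasts. This is only an illustration under stated assumptions. The linear-trend model, the forward-fill cleaning rule, the file name `model.pkl`, and all function names are hypothetical, and the standard-library `pickle` module stands in for either of the pickle/Joblib options named in the architecture.

```python
import pickle
import statistics
import tempfile
from pathlib import Path

def clean(series):
    """Cleaning step: forward-fill missing (None) readings; leading Nones are dropped."""
    out, last = [], None
    for v in series:
        if v is None:
            v = last
        if v is not None:
            out.append(v)
            last = v
    return out

def train(series):
    """Fit y = intercept + slope * t by least squares on t = 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = statistics.fmean(series)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_mean - slope * t_mean, "n_train": n}

def forecast(model, steps):
    """Extrapolate the fitted trend for the next `steps` time points."""
    start = model["n_train"]
    return [model["intercept"] + model["slope"] * (start + h) for h in range(steps)]

# Training pipeline: clean, fit, serialize the model artifact.
history = [10.0, 12.0, None, 16.0, 18.0, 20.0]
model = train(clean(history))
artifact = Path(tempfile.mkdtemp()) / "model.pkl"
artifact.write_bytes(pickle.dumps(model))

# Inference pipeline: load the artifact and forecast the next two points.
loaded = pickle.loads(artifact.read_bytes())
predictions = forecast(loaded, steps=2)
print(predictions)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code. An Inference API or Streamlit front end would wrap the load-and-forecast step shown in the last three lines.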