Workflow Element Store

  1. Data Collaboration and Partnerships
  2. Flat files
  3. Public Datasets
  4. Feedback Data
  5. APIs and Data Feeds
  6. Experiments (DoE)
  7. Databases - NoSQL
  8. Surveys and Questionnaires
  9. Web Scraping
  10. Databases - SQL
  11. Mobile Applications or IoT Applications
  1. AWS RDS
  2. GCS
  3. AWS Kinesis
  4. Azure Synapse
  5. GCP BigQuery
  6. Azure Blob Storage
  7. Azure Stream Analytics
  8. AWS S3
  9. MS SQL Server
  10. MongoDB
  11. MySQL
  12. Oracle DB
  13. RDBMS
  14. ETL/ELT pipeline
  15. PostgreSQL
  16. GCP Data Fusion
  17. Apache Kafka
  18. AWS Redshift
  19. GCP Dataflow
  20. Azure Data Factory (ADF)
  21. AWS Glue
  1. Handling Imbalanced Classes
  2. Time-Based Features
  3. Data Scaling and Normalization
  4. Feature Extraction from Images
  5. Dimensionality Reduction
  6. Data Partitioning - Train, Validation, & Test
  7. Handling Missing Data
  8. Handling Categorical Data
  9. Augmentation
  10. AutoEDA libraries
  11. Auto-Preprocessing libraries
  12. Binning / Discretization
  13. Interaction Features
  14. Textual Feature Extraction
  15. Handling Noisy Data
  16. Polynomial Features
  17. Feature Selection
  18. Data Transformations
  19. Annotation
  20. Domain-Specific Feature Engineering
  21. Dealing with Outliers
  22. Handling Time-Series Data
  1. Regularization
  2. Learning Rate Scheduling
  3. Clustering
  4. Ensemble Techniques
  5. Natural Language Processing
  6. Forecasting Techniques
  7. Batch Size Selection
  8. Network Analytics / Geospatial Analytics
  9. Binary Classification Techniques
  10. AutoML
  11. Weight Initialization
  12. External Validation
  13. Multiclass Classification Techniques
  14. Association Rules
  15. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  16. Word Embeddings
  17. Recommendation Engine
  18. Evaluation Metrics
  19. Blackbox - Neural Network Models
  20. Batch Normalization
  21. Regression Analysis
  22. Regularization Techniques
  23. Cross-Validation
  24. Transfer Learning
  25. Data Augmentation
  26. Hyperparameter Tuning
  27. Model Interpretability
  28. Model Comparison
  29. Reinforcement Learning
  30. Early Stopping
  31. Regular Monitoring and Logging
  32. Performance Visualization
  1. Model Registry
  2. Data Warehouse
  3. Code Repository
  4. Data Preprocessing Pipeline Models
  5. Databases
  1. Feedback Collection
  2. Streamlit
  3. Data Drift Monitoring
  4. Model Versioning
  5. Bias and Fairness Assessment
  6. Concept Drift Detection
  7. Model Health Monitoring
  8. Cloud Deployment
  9. Performance Metrics
  10. FastAPI
  11. Flask
  12. Model Drift
  13. Model Serialization
  14. Prediction Logging
  15. Edge Deployment
  16. Serverless Computing
  17. Alerting and Notification
  18. Containerization
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling
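
The training pipeline stages above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the architecture's actual implementation: the two source functions, the record fields (day, sales), the derived feature (day_index), and the least-squares model are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from statistics import mean


def api_stream():
    # Hypothetical records standing in for a live API feed.
    return [{"day": 1, "sales": 10.0}, {"day": 2, "sales": 12.0},
            {"day": 3, "sales": None}]


def web_crawler():
    # Hypothetical records standing in for crawled pages.
    return [{"day": 4, "sales": 16.0}, {"day": 5, "sales": 18.0}]


def ingest(sources):
    """Data Ingestion: store data from all sources in one landing zone."""
    landing_zone = []
    for source in sources:
        landing_zone.extend(source())
    return landing_zone


def clean(records):
    """Data Cleaning / Preprocessing: drop records with a missing target."""
    return [r for r in records if r["sales"] is not None]


def add_features(records):
    """Keep the base feature (day) and add a derived one (day_index)."""
    first_day = min(r["day"] for r in records)
    return [{**r, "day_index": r["day"] - first_day} for r in records]


def train(rows):
    """Data Training & Modelling: fit a least-squares line sales ~ day_index."""
    xs = [r["day_index"] for r in rows]
    ys = [r["sales"] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def training_pipeline():
    # Collection -> ingestion -> cleaning -> features -> training.
    landing_zone = ingest([api_stream, web_crawler])
    features = add_features(clean(landing_zone))
    return train(features)


model = training_pipeline()
print(model)
```

Keeping each stage as its own function mirrors the boxes in the diagram, so a stage can be swapped (for example, a real crawler in place of web_crawler) without touching the others.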

Inference Pipeline
Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
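
The inference pipeline stages above (input data, cleaned and processed data, inference) can be sketched as follows. This is a rough illustration only: the MODEL parameters, the day field, and the day_index feature are hypothetical placeholders for whatever the trained model and its preprocessing actually use.

```python
# Hypothetical parameters standing in for a previously trained model.
# first_day is stored so the derived feature is computed as it was in training.
MODEL = {"slope": 2.0, "intercept": 10.0, "first_day": 1}


def clean(records):
    """Drop inputs that lack the field the model needs."""
    return [r for r in records if r.get("day") is not None]


def add_features(records, first_day):
    """Apply the same derived feature that was used at training time."""
    return [{**r, "day_index": r["day"] - first_day} for r in records]


def infer(rows, model):
    """Inference: score each cleaned and processed row."""
    return [model["intercept"] + model["slope"] * r["day_index"] for r in rows]


def inference_pipeline(input_data, model=MODEL):
    # Input data -> cleaned and processed data -> inference.
    processed = add_features(clean(input_data), model["first_day"])
    return infer(processed, model)


forecasts = inference_pipeline([{"day": 6}, {"day": None}, {"day": 7}])
print(forecasts)
```

The point of the sketch is that inference reuses the cleaning and feature steps with parameters fixed at training time; if the two paths diverge, the model is scored on inputs it was never fitted to.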