Workflow Element Store

  1. Surveys and Questionnaires
  2. Databases - SQL
  3. APIs and Data Feeds
  4. Web Scraping
  5. Mobile Applications or IoT Applications
  6. Data Collaboration and Partnerships
  7. Databases - NoSQL
  8. Experiments (DoE)
  9. Public Datasets
  10. Feedback Data
  11. Flat files
  1. MongoDB
  2. MS SQL Server
  3. GCP BigQuery
  4. AWS Redshift
  5. MySQL
  6. Azure blob storage
  7. ETL/ELT pipeline
  8. AWS S3
  9. Azure Data Factory (ADF)
  10. AWS Glue
  11. GCP Dataflow
  12. AWS Kinesis
  13. Google Cloud Storage (GCS)
  14. Azure Synapse
  15. AWS RDS
  16. Azure Stream Analytics
  17. PostgreSQL
  18. RDBMS
  19. GCP Data Fusion
  20. Apache Kafka
  21. Oracle DB
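
As a rough illustration of how a flat file could flow through a small ETL step into an RDBMS table, here is a minimal sketch using only the Python standard library. The table name `landing_readings`, the column names, and the sample CSV are assumptions made up for this example; a real pipeline would target one of the stores listed above rather than in-memory SQLite.

```python
import csv
import io
import sqlite3


def load_csv_into_table(conn, csv_text, table="landing_readings"):
    """Extract rows from CSV text, cast types, and load them into a table.

    The table name is a trusted constant here; never build SQL from
    untrusted input this way.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (sensor_id TEXT, reading REAL)"
    )
    # Transform: trim identifiers and cast readings to float.
    cleaned = [(r["sensor_id"].strip(), float(r["reading"])) for r in rows]
    # Load: parameterized bulk insert.
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", cleaned)
    conn.commit()
    return len(cleaned)


# Hypothetical sample data standing in for a flat file.
conn = sqlite3.connect(":memory:")
loaded = load_csv_into_table(conn, "sensor_id,reading\na1, 20.5\nb2,21.0\n")
average = conn.execute("SELECT AVG(reading) FROM landing_readings").fetchone()[0]
```

The same extract, transform, load shape applies whether the target is SQLite, PostgreSQL, or a cloud warehouse; only the connection and the SQL dialect change.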
  1. Feature Extraction from Images
  2. Auto-Preprocessing libraries
  3. Annotation
  4. Domain-Specific Feature Engineering
  5. Handling Noisy Data
  6. Dimensionality Reduction
  7. AutoEDA libraries
  8. Feature Selection
  9. Interaction Features
  10. Data Scaling and Normalization
  11. Handling Categorical Data
  12. Handling Missing Data
  13. Polynomial Features
  14. Dealing with Outliers
  15. Augmentation
  16. Handling Time-Series Data
  17. Data Partitioning - Train, Validation, & Test
  18. Binning / Discretization
  19. Handling Imbalanced Classes
  20. Time-Based Features
  21. Textual Feature Extraction
  22. Data Transformations
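
Three of the preprocessing elements listed above (handling missing data, data scaling and normalization, and data partitioning) can be sketched in a few lines of standard-library Python. The function names and default split fractions below are assumptions chosen for illustration, not part of any particular library.

```python
import random
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle rows and partition them into train, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test
```

In practice the imputation mean and the scaling bounds would be computed on the training partition only and then reused on the validation and test partitions, so that no information leaks from held-out data.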
  1. Weight Initialization
  2. Performance Visualization
  3. Evaluation Metrics
  4. Ensemble Techniques
  5. Model Comparison
  6. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  7. Multiclass Classification Techniques
  8. Batch Normalization
  9. Regression Analysis
  10. Blackbox - Neural Network Models
  11. Regularization
  12. Natural Language Processing
  13. Network Analytics / Geospatial Analytics
  14. Learning Rate Scheduling
  15. Binary Classification Techniques
  16. AutoML
  17. Cross-Validation
  18. Model Interpretability
  19. Forecasting Techniques
  20. External Validation
  21. Clustering
  22. Word Embeddings
  23. Data Augmentation
  24. Association Rules
  25. Batch Size Selection
  26. Regular Monitoring and Logging
  27. Hyperparameter Tuning
  28. Transfer Learning
  29. Regularization Techniques
  30. Early Stopping
  31. Recommendation Engine
  32. Reinforcement Learning
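
Cross-validation, one of the evaluation elements listed above, rests on a simple index-splitting scheme. The following is a minimal sketch of k-fold splitting without shuffling; `k_fold_indices` is a hypothetical helper written for this example, and libraries such as scikit-learn provide production-ready equivalents.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds.

    When n_samples is not divisible by k, the first folds receive
    one extra sample each.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
```

Each sample appears in exactly one validation fold, so averaging a metric over the folds uses every observation for validation once.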
  1. Data Preprocessing pipeline models
  2. Code repository
  3. Data warehouse
  4. Model registry
  5. Databases
  1. Model Health Monitoring
  2. Cloud Deployment
  3. Feedback Collection
  4. Streamlit
  5. Flask
  6. Containerization
  7. Edge Deployment
  8. Serverless Computing
  9. Performance Metrics
  10. Alerting and Notification
  11. Concept Drift Detection
  12. Data Drift Monitoring
  13. Model Versioning
  14. Model Serialization
  15. Prediction Logging
  16. FastAPI
  17. Bias and Fairness Assessment
  18. Model Drift
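
Data drift monitoring, listed above, is often implemented by comparing the distribution of a feature at training time with its distribution in production. One common statistic for this is the Population Stability Index (PSI). The sketch below is a minimal standard-library version; the equal-width binning, the bin count, and the `eps` floor for empty bins are assumptions for illustration, not a prescribed method.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """Compute PSI between a reference sample and a new sample.

    Bins are equal-width over the range of the reference (expected) sample;
    values outside that range fall into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate reference sample: avoid division by zero

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A PSI near zero indicates matching distributions, and larger values indicate a stronger shift; the alerting threshold is a project-specific choice.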
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Stores data from all sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
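
The two pipelines above share their cleaning and feature steps: the training pipeline fits a model on data from the landing zone, and the inference pipeline applies the same preprocessing to new input before predicting. The sketch below illustrates that structure with a toy one-weight model. All function names, the `x`/`y` record fields, and the derived feature `x_squared` are hypothetical and exist only for this example.

```python
def clean(records):
    """Data Cleaning / Preprocessing: drop records with a missing x."""
    return [r for r in records if r.get("x") is not None]


def add_features(records):
    """Derived & base features: keep x and add the derived x_squared."""
    return [{**r, "x_squared": r["x"] ** 2} for r in records]


def training_pipeline(landing_zone):
    """Fit a toy model y ~ weight * x_squared by least squares."""
    features = add_features(clean(landing_zone))
    num = sum(r["x_squared"] * r["y"] for r in features)
    den = sum(r["x_squared"] ** 2 for r in features)
    return {"weight": num / den}


def inference_pipeline(model, input_data):
    """Apply the same cleaning and feature steps, then predict."""
    features = add_features(clean(input_data))
    return [model["weight"] * r["x_squared"] for r in features]


# Hypothetical records standing in for the data landing zone.
landing_zone = [{"x": 1, "y": 2}, {"x": 2, "y": 8}, {"x": None, "y": 5}]
model = training_pipeline(landing_zone)
predictions = inference_pipeline(model, [{"x": 3}])
```

Reusing `clean` and `add_features` in both pipelines keeps the features seen at inference consistent with those seen during training.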