Workflow Element Store

  1. Databases - NoSQL
  2. Databases - SQL
  3. Web Scraping
  4. APIs and Data Feeds
  5. Public Datasets
  6. Surveys and Questionnaires
  7. Data Collaboration and Partnerships
  8. Flat files
  9. Mobile Applications or IoT Applications
  10. Experiments (DoE)
  11. Feedback Data
  1. AWS S3
  2. AWS Glue
  3. AWS Redshift
  4. ETL/ELT pipeline
  5. Apache Kafka
  6. GCP Data Fusion
  7. GCP BigQuery
  8. GCS
  9. Azure Stream Analytics
  10. MongoDB
  11. AWS Kinesis
  12. AWS RDS
  13. Azure Blob Storage
  14. Azure Synapse
  15. MS SQL Server
  16. GCP Dataflow
  17. MySQL
  18. PostgreSQL
  19. RDBMS
  20. Azure Data Factory (ADF)
  21. Oracle DB
  1. Feature Selection
  2. Handling Missing Data
  3. Interaction Features
  4. Time-Based Features
  5. Handling Noisy Data
  6. Polynomial Features
  7. Handling Time-Series Data
  8. Feature Extraction from Images
  9. Handling Imbalanced Classes
  10. Data Transformations
  11. Data Scaling and Normalization
  12. Dimensionality Reduction
  13. Domain-Specific Feature Engineering
  14. Binning / Discretization
  15. Data Partitioning - Train, Validation, & Test
  16. Textual Feature Extraction
  17. Auto-Preprocessing libraries
  18. Annotation
  19. Augmentation
  20. Handling Categorical Data
  21. Dealing with Outliers
  22. AutoEDA libraries
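
The data partitioning element in the list above (train, validation, and test) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name partition, the 70/15/15 split, and the fixed seed are assumptions made for the example, not values taken from the workflow.

```python
import random


def partition(rows, train=0.7, val=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists.

    The fractions and seed are illustrative defaults (assumptions).
    """
    if not (0 < train < 1 and 0 <= val < 1 and train + val < 1):
        raise ValueError("train and val must leave room for a test set")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


train_set, val_set, test_set = partition(range(100))
```

Shuffling before slicing suits independent rows; for time-series data a chronological split would usually be the safer choice.
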
  1. Word Embeddings
  2. Model Interpretability
  3. Cross-Validation
  4. Early Stopping
  5. Weight Initialization
  6. Transfer Learning
  7. Reinforcement Learning
  8. Data Augmentation
  9. Evaluation Metrics
  10. Batch Normalization
  11. Regression Analysis
  12. Model Comparison
  13. Regularization
  14. Network Analytics / Geospatial Analytics
  15. Forecasting Techniques
  16. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  17. Regularization Techniques
  18. Learning Rate Scheduling
  19. Ensemble Techniques
  20. Binary Classification Techniques
  21. Clustering
  22. Regular Monitoring and Logging
  23. Recommendation Engine
  24. Multiclass Classification Techniques
  25. Natural Language Processing
  26. Black Box - Neural Network Models
  27. Association Rules
  28. Hyperparameter Tuning
  29. Performance Visualization
  30. AutoML
  31. External Validation
  32. Batch Size Selection
  1. Model Registry
  2. Databases
  3. Code Repository
  4. Data Warehouse
  5. Data Preprocessing Pipeline Models
  1. FastAPI
  2. Data Drift Monitoring
  3. Prediction Logging
  4. Performance Metrics
  5. Model Versioning
  6. Edge Deployment
  7. Cloud Deployment
  8. Model Health Monitoring
  9. Serverless Computing
  10. Model Drift
  11. Alerting and Notification
  12. Concept Drift Detection
  13. Containerization
  14. Streamlit
  15. Bias and Fairness Assessment
  16. Model Serialization
  17. Flask
  18. Feedback Collection
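
As one possible illustration of the data drift monitoring element in the list above, the sketch below computes a Population Stability Index (PSI) between a reference sample and a live sample. PSI is only one of several drift measures and is swapped in here as an example; the function name psi, the bin count, and the eps floor are assumptions, not part of the workflow.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample.

    Bins are equal-width over the range of the reference sample; values
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]   # training-time feature values
live = [0.5 + i / 200 for i in range(100)]  # shifted production values
drift_score = psi(reference, live)
```

A score of zero means the two binned distributions match. Any alerting threshold (values around 0.25 are often quoted as a rule of thumb) should be treated as an assumption to tune per feature.
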
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
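
The two pipelines above can be sketched as a chain of small functions, with the inference pipeline reusing the same cleaning and feature steps as training. This is a toy illustration under stated assumptions: the function names, the sample rows, the is_weekend derived feature, and the group-mean "model" are all hypothetical stand-ins, not the components of the actual architecture.

```python
from statistics import mean


def collect():
    """Data Collection: hard-coded stand-ins for an API stream and a web crawler."""
    api_stream = [{"day": "1", "sales": "10"}, {"day": "2", "sales": "12"},
                  {"day": "6", "sales": "20"}]
    web_crawler = [{"day": "3", "sales": None}, {"day": "7", "sales": "22"},
                   {"day": "4", "sales": "14"}]
    return api_stream + web_crawler


def ingest(rows, landing_zone):
    """Data Ingestion: store raw rows from all sources in the landing zone."""
    landing_zone.extend(rows)
    return landing_zone


def clean(rows):
    """Data Cleaning: drop rows with missing values and cast fields to float."""
    return [{k: float(v) for k, v in row.items()}
            for row in rows if all(v is not None for v in row.values())]


def add_features(rows):
    """Keep the base feature (day) and add a derived one (is_weekend)."""
    return [{**row, "is_weekend": int(row["day"]) % 7 in (0, 6)} for row in rows]


def train(rows):
    """Toy model: mean sales per value of the derived feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row["is_weekend"], []).append(row["sales"])
    return {flag: mean(values) for flag, values in groups.items()}


def infer(model, rows):
    """Inference: look up the prediction for each cleaned, featurized row."""
    return [model[row["is_weekend"]] for row in rows]


# Training pipeline
landing_zone = []
model = train(add_features(clean(ingest(collect(), landing_zone))))

# Inference pipeline: same cleaning and feature steps, no target column
input_data = [{"day": "13"}, {"day": None}, {"day": "9"}]
predictions = infer(model, add_features(clean(input_data)))
```

Sharing clean and add_features between the two pipelines is the design point of the sketch: it keeps inference inputs in the same shape the model saw during training.
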