A technical deep-dive into building and scaling machine learning operations within the Malaysian enterprise context.
Malaysian enterprises are rapidly moving from AI proofs-of-concept to production-grade deployment, necessitating robust MLOps frameworks that handle regional data residency and latency requirements.
Continuous integration and continuous delivery for machine learning differs fundamentally from traditional software CI/CD. Beyond code changes, ML pipelines must also respond to data changes and model performance degradation — neither of which triggers a conventional code commit. This "three-axis" complexity is why many organisations that excel at DevOps still struggle to ship ML models reliably. A production-grade ML CI/CD pipeline consists of four stages: data validation (Great Expectations or Soda Core checking schema, distributions, and referential integrity), model training (parameterised, reproducible runs tracked in MLflow or Weights & Biases), evaluation gates (automated comparison of candidate model against champion on holdout data), and deployment (blue/green or canary rollout via Kubernetes with traffic splitting). For Malaysian enterprises using cloud infrastructure, AWS SageMaker Pipelines, Azure ML Pipelines, and Google Cloud Vertex AI Pipelines all offer managed orchestration that satisfies data residency requirements when deployed in ap-southeast-1 or asia-southeast1 regions. The choice between them typically comes down to existing cloud commitments rather than pure technical merit.
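The evaluation-gate stage described above can be sketched as a simple promotion check. This is a minimal illustration, not any specific tool's API: the function name `passes_evaluation_gate` and the metric-dictionary shape are assumptions, and a real pipeline would pull these metrics from its tracking server.

```python
def passes_evaluation_gate(candidate: dict, champion: dict,
                           min_delta: float = 0.0) -> bool:
    """Promote the candidate only if it matches or beats the current
    champion on every tracked holdout metric (higher is better).

    Both dicts are assumed to share the same metric names, e.g.
    {"auc": 0.91, "f1": 0.80}. `min_delta` can require a strict
    improvement margin before allowing promotion.
    """
    return all(candidate[m] >= champion[m] + min_delta for m in champion)
```

In practice this check sits between the training and deployment stages: a failed gate stops the pipeline and the champion keeps serving traffic.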
Deploying a model is the beginning of the MLOps journey, not the end. Production ML systems degrade silently — unlike broken software, a degraded model still returns responses, just increasingly wrong ones. This makes continuous monitoring non-negotiable for any ML system touching revenue or risk decisions. Model monitoring covers three distinct failure modes: data drift (the distribution of incoming features shifts from training data), concept drift (the relationship between features and target changes, even if feature distributions remain stable), and infrastructure drift (latency, throughput, or error rates change). Each requires different monitoring approaches. For Malaysian financial services, BNM RMiT Section 10.54 explicitly requires evidence of ongoing model performance monitoring with defined escalation thresholds. This regulatory requirement has accelerated MLOps adoption in the banking sector — CIMB, Maybank, and RHB all now run dedicated model risk teams that review monitoring dashboards weekly and trigger retraining when PSI exceeds 0.2 on key features.
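The PSI threshold mentioned above can be computed with the standard Population Stability Index formula. This is a minimal sketch, not any bank's production implementation: the function name and the choice of quantile-based binning from the training sample are assumptions.

```python
import numpy as np

def population_stability_index(expected, actual, bins: int = 10) -> float:
    """PSI between a training-time (expected) and live (actual) sample
    of one feature. Bin edges come from the expected distribution's
    quantiles; outer edges are widened to capture out-of-range live values.
    Rule of thumb: < 0.1 stable, 0.1-0.2 moderate shift, > 0.2 significant.
    """
    eps = 1e-6  # avoid log(0) / division by zero in empty bins
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))
```

A monitoring job would run this per feature on each scoring batch and raise an escalation ticket whenever the result crosses the 0.2 threshold.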
A feature store is the centralised repository that allows data scientists across an organisation to discover, share, and reuse the engineered features that power ML models. Without a feature store, every team independently re-engineers the same features — customer tenure, transaction velocity, churn probability — creating duplicated effort, inconsistent definitions, and training-serving skew. Training-serving skew is among the most pernicious bugs in production ML: the feature transformation logic used during model training differs subtly from the logic used at inference time, causing silent accuracy degradation that can persist undetected for months. A feature store solves this by maintaining a single implementation of each feature transformation that serves both training and inference paths. For Malaysian enterprises, the build vs buy decision for feature stores has become clearer: open-source solutions (Feast, Hopsworks Community) are viable for organisations with strong platform engineering capacity, while managed offerings (AWS SageMaker Feature Store, Google Vertex AI Feature Store) are appropriate for organisations prioritising operational simplicity. The recurring cost of managed solutions is typically justified by the elimination of platform engineering overhead.
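The single-implementation principle can be illustrated with a hypothetical transaction-velocity feature. `transaction_velocity` is an invented name, and real feature stores add registration and materialisation mechanics on top; the point is simply that one function body serves both paths.

```python
from datetime import datetime, timedelta

def transaction_velocity(txn_timestamps, as_of: datetime,
                         window_days: int = 7) -> int:
    """Count of transactions in the trailing window ending at `as_of`.

    The same function is imported by the offline training job (with
    historical `as_of` values, preserving point-in-time correctness)
    and by the online inference service (with `as_of` = now), so the
    transformation logic cannot drift between the two paths.
    """
    cutoff = as_of - timedelta(days=window_days)
    return sum(1 for t in txn_timestamps if cutoff < t <= as_of)
```

Training-serving skew typically creeps in when this logic is duplicated, once in SQL for the training set and once in application code for serving; a feature store makes the shared implementation the only source of truth.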
Reproducibility is the foundation of trustworthy ML — the ability to exactly reproduce any past model, including its training data, code, hyperparameters, and environment. Regulatory frameworks including BNM RMiT and the forthcoming NAIO AI Accountability Guidelines require evidence of reproducibility for material AI models. Data versioning tools (DVC, Delta Lake, Apache Iceberg) extend version control concepts from code to datasets. By tagging specific snapshots of training data alongside model checkpoints and code commits, teams can recreate any past experiment exactly — critical for debugging production issues and for responding to regulator inquiries about how a model was trained. The practical implementation pattern that works best for Malaysian enterprises uses a three-tier data versioning approach: raw data versioned in object storage (S3 or GCS) using Delta Lake for ACID transactions, processed feature datasets versioned in the feature store with semantic versioning, and model artefacts versioned in MLflow with full lineage back to data and code versions.
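The lineage tying model artefacts back to data and code versions can be sketched as a small manifest with a deterministic fingerprint. The `ModelLineage` class and its field names are illustrative assumptions, not MLflow's API; in practice the same information would be logged as run tags or parameters.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ModelLineage:
    """Everything needed to recreate a training run exactly."""
    data_snapshot: str   # e.g. a Delta Lake table version or DVC tag
    code_commit: str     # git SHA of the training code
    hyperparams: tuple   # sorted (key, value) pairs, for determinism
    environment: str     # e.g. a container image digest

    def fingerprint(self) -> str:
        """Stable ID: identical inputs always yield the same hash, so
        two runs with the same fingerprint are reproducible duplicates."""
        blob = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()[:12]
```

Storing this manifest next to each model checkpoint is what turns a regulator's "how was this model trained?" from an archaeology exercise into a lookup.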
Scaling ML infrastructure presents unique challenges in the Malaysian context: limited availability of GPU compute compared to US/EU regions, data residency requirements that restrict use of certain global endpoints, and a talent market where Kubernetes and Kubeflow expertise commands significant premium. The most pragmatic path for mid-market Malaysian enterprises is a hybrid architecture: managed training infrastructure (AWS SageMaker, Google Vertex AI) for compute-intensive workloads, combined with self-managed inference serving (Kubernetes on cloud VMs) for latency-sensitive production endpoints. This balances cost efficiency with control over the production environment. For organisations that have outgrown managed services, Kubeflow on GKE or EKS with Istio service mesh provides enterprise-grade ML platform capabilities. The investment threshold — roughly 10+ data scientists and 20+ production models — is where platform engineering for MLOps becomes clearly ROI-positive.
The organisational structure of an MLOps function is as important as the technology choices. Three models have emerged in Malaysian enterprises: the centralised platform team (a dedicated MLOps team that owns shared infrastructure and serves all business units), the embedded model (MLOps engineers sit within data science teams in each business unit), and the federated model (a small central platform team sets standards while embedded engineers implement them). The federated model consistently outperforms the alternatives for organisations with 3+ business units and 15+ data scientists. It provides the standardisation benefits of the centralised model without the bottleneck, and maintains the business context benefits of the embedded model without the fragmentation. Role clarity within the MLOps function is also critical. The ML Engineer role — distinct from both Data Scientist and DevOps Engineer — owns the production ML platform: training pipelines, model serving infrastructure, monitoring systems, and feature store maintenance. Without this dedicated role, MLOps responsibilities fall between teams and production systems become brittle.