DOI: 10.18586/msufbd.1792556 ISSN: 2147-7930

A Research on the Native Cloud Monitoring with the Help of Site Reliability Engineering Insights

Canberk Koç, Bora Uğurlu, Bahadir Karasulu
The demand for scalable and portable applications has accelerated the adoption of distributed systems and cloud computing. Site Reliability Engineering (SRE) has become essential for ensuring the reliability and availability of such systems. This study compares the native cloud monitoring services of Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure within the SRE framework. A PaaS-based sample application was developed in Golang using a microservice architecture, an API Gateway, and event-driven patterns, supported by relational databases and cloud-native messaging services. The providers’ monitoring tools, along with third-party solutions, were integrated to assess system health. The evaluation focused on objective SRE metrics: variety, cost, alert latency, data collection frequency, and retention. The findings show that GCP provides detailed VM-level metrics without additional configuration in API Gateway setups and achieves the highest message-handling capacity in event-driven architectures. The study offers a comparative analysis of monitoring capabilities and insights into each platform's suitability for reliability-centric deployments.
