Probus Insurance Broker Private Limited is one of India’s leading Insurtech platforms, offering a wide range of insurance solutions across life, health, motor, travel, commercial, home, and marine insurance categories. Established in 2002 and headquartered in Mumbai, Probus partners with over 29 insurance providers and operates through a hybrid business model combining digital platforms with an extensive nationwide Point of Sale Person (PoSP) network. With a rapidly growing customer base and large volumes of operational and financial data, the organization required a scalable and intelligent data management platform to support analytics, reporting, governance, and business growth.
Business Challenge
As Probus Insurance expanded its digital operations and data ecosystem, the organization faced multiple challenges related to data engineering, governance, operational visibility, and scalability. The existing on-premises infrastructure lacked a centralized data catalog, making it difficult for teams to track schemas, manage metadata, and maintain consistency across multiple datasets and reporting systems. Manual ETL development and maintenance increased operational overhead, introduced inconsistencies, and raised the risk of transformation errors. Troubleshooting failures across disconnected processing tools was time-consuming because pipeline visibility was limited and there was no centralized monitoring. Frequent schema changes across source databases required manual downstream modifications, resulting in pipeline failures, delayed reporting, and operational inefficiencies. The organization also lacked automated alerting mechanisms, making it difficult to proactively identify failures in data transfer and transformation jobs. In addition, data security, backup, and disaster recovery capabilities within the legacy environment were inadequate for handling sensitive financial and operational data at scale.
Goals & Objectives
The primary objective was to build a secure, scalable, and fully automated cloud-native data platform capable of centralizing analytics, automating ETL operations, improving observability, and strengthening governance. The organization also aimed to improve data quality, reduce operational dependency on manual ETL workflows, enable self-service analytics, and establish robust disaster recovery capabilities.
Solution Approach
Pentagon System & Services designed and implemented a modern AWS-native data engineering and analytics platform centered around AWS Glue, Amazon Redshift, Amazon S3, and AWS Database Migration Service (DMS). The existing on-premises SQL Server databases were consolidated and migrated to Amazon Redshift using AWS DMS over a secure AWS Site-to-Site VPN connection, ensuring encrypted data transfer and compliance with internal security standards. To eliminate manual ETL dependencies, AWS Glue ETL Jobs were implemented for automated serverless data transformation workflows. The platform enabled ingestion of structured and semi-structured datasets from Amazon S3, transformation using PySpark, and loading into Amazon Redshift with integrated business logic, validation, and data quality checks. AWS Glue Crawlers were deployed to automatically scan datasets, detect schema changes, and continuously update the AWS Glue Data Catalog, creating a centralized metadata repository accessible across analytics and reporting tools.
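The schema-change detection performed by the crawlers can be pictured as a comparison between the columns already recorded in the catalog and the columns observed in a new scan. The sketch below is a minimal, plain-Python illustration of that comparison. It is not the AWS Glue implementation or the code used in this project, and `SchemaDiff`, `diff_schemas`, and the sample column names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Differences between a cataloged schema and a newly observed one."""
    added: dict = field(default_factory=dict)    # column -> new type
    removed: dict = field(default_factory=dict)  # column -> old type
    retyped: dict = field(default_factory=dict)  # column -> (old type, new type)

    @property
    def has_drift(self) -> bool:
        # Any added, removed, or retyped column counts as drift.
        return bool(self.added or self.removed or self.retyped)


def diff_schemas(cataloged: dict, observed: dict) -> SchemaDiff:
    """Compare two {column_name: type_name} mappings."""
    diff = SchemaDiff()
    for name, new_type in observed.items():
        if name not in cataloged:
            diff.added[name] = new_type
        elif cataloged[name] != new_type:
            diff.retyped[name] = (cataloged[name], new_type)
    for name, old_type in cataloged.items():
        if name not in observed:
            diff.removed[name] = old_type
    return diff


# Hypothetical example: the columns below are illustrative only.
cataloged = {"policy_id": "bigint", "premium": "double", "issued_on": "date"}
observed = {"policy_id": "bigint", "premium": "decimal(12,2)", "channel": "string"}

drift = diff_schemas(cataloged, observed)
if drift.has_drift:
    print("added:", drift.added)
    print("removed:", drift.removed)
    print("retyped:", drift.retyped)
```

Separating added, removed, and retyped columns matters in practice: an added column can often be absorbed by downstream jobs, while a removed or retyped column is more likely to need review before loading.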
Implementation Approach
To improve operational visibility and governance, Amazon CloudWatch, AWS CloudTrail, and AWS Glue job logging were integrated for centralized monitoring, auditability, and lineage tracking across ingestion, transformation, and reporting layers. Automated alerting mechanisms were implemented using Amazon SNS, enabling real-time notifications for ETL failures, schema mismatches, and performance bottlenecks. The platform was designed to handle schema drift using AWS Glue Crawlers and DMS transformation rules, significantly reducing manual intervention and improving resilience against evolving data structures. Security controls were strengthened using AWS IAM role-based access policies, AWS KMS encryption for data at rest, and VPC endpoints to eliminate public internet exposure for Amazon S3, AWS Glue, and Amazon Redshift services. To strengthen business continuity, Amazon Redshift snapshots with cross-region copy and Amazon S3 versioning were configured, enabling reliable backup, disaster recovery, and compliance alignment with defined RTO and RPO objectives. Amazon QuickSight was implemented as the centralized business intelligence layer, enabling secure, role-based dashboards with Row-Level Security (RLS) for operational and leadership teams. The entire infrastructure was provisioned and automated using AWS CloudFormation, ensuring consistency, repeatability, and infrastructure-as-code governance across the environment.
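As an illustration of the SNS-based alerting for ETL failures, the sketch below turns a "Glue Job State Change" event, as delivered by Amazon EventBridge, into the arguments for an SNS notification. The case study does not describe how the alerts were actually wired, so the function names, the choice of failure states, the job name, and the topic ARN are all assumptions made for the example. The publisher is passed in as a parameter so that the logic can be exercised without AWS credentials.

```python
import json

# Assumption: only these terminal states are treated as alert-worthy.
FAILURE_STATES = {"FAILED", "TIMEOUT"}


def build_alert(event: dict):
    """Return SNS publish arguments for a failed Glue job event, else None."""
    if event.get("detail-type") != "Glue Job State Change":
        return None
    detail = event.get("detail", {})
    state = detail.get("state")
    if state not in FAILURE_STATES:
        return None
    job = detail.get("jobName", "unknown-job")
    body = {
        "job": job,
        "state": state,
        "runId": detail.get("jobRunId"),
        "message": detail.get("message"),
    }
    return {
        "Subject": f"Glue job {job} {state}"[:100],  # SNS subjects allow at most 100 characters
        "Message": json.dumps(body, indent=2),
    }


def notify(event: dict, publish, topic_arn: str) -> bool:
    """Publish an alert through `publish` if the event is a failure."""
    alert = build_alert(event)
    if alert is None:
        return False
    publish(TopicArn=topic_arn, **alert)
    return True


# Hypothetical event and topic, with an in-memory stand-in for the publisher.
sent = []
sample_event = {
    "source": "aws.glue",
    "detail-type": "Glue Job State Change",
    "detail": {
        "jobName": "policies_to_redshift",
        "state": "FAILED",
        "jobRunId": "jr_example",
        "message": "Column premium not found",
    },
}
notify(sample_event, lambda **kwargs: sent.append(kwargs),
       "arn:aws:sns:ap-south-1:123456789012:etl-alerts")
```

In a deployed handler, `publish` could be the `publish` method of a `boto3.client("sns")` object, which accepts the same `TopicArn`, `Subject`, and `Message` keyword arguments.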
- AWS Glue
- AWS Glue Crawlers
- AWS Database Migration Service (AWS DMS)
- Amazon Redshift
- Amazon S3
- Amazon QuickSight
- Amazon Athena
- Amazon CloudWatch
- AWS CloudTrail
- Amazon SNS
- AWS Lambda
- AWS IAM
- AWS KMS
- AWS CloudFormation
Results
The implementation enabled Probus Insurance to successfully modernize its data engineering and analytics ecosystem with a scalable and highly automated AWS-native architecture. The organization achieved a significant improvement in operational efficiency by reducing ETL processing and manual intervention efforts by more than 40%. Centralized metadata management and automated schema handling improved data consistency, governance, and pipeline reliability while reducing the impact of schema drift and transformation failures. Real-time monitoring, automated alerting, and centralized observability strengthened operational visibility and accelerated troubleshooting capabilities across the platform. The deployment of Amazon QuickSight enabled self-service business intelligence and real-time analytics for multiple user groups through secure and role-based dashboards. Enhanced security controls, encryption mechanisms, and disaster recovery strategies improved compliance readiness, data protection, and operational resilience for business-critical workloads. The result was a future-ready, scalable, and secure analytics platform capable of supporting Probus Insurance’s growing digital operations and data-driven business strategy.





