Healthcare
Data Engineering
A US company develops, markets, and sells natural health products to healthcare professionals and select health food retailers. To increase product sales, its teams need an integrated view of existing sales and the ability to derive real-time insights from it. However, data silos pose a significant challenge. Trigent proposed a strategy that balances the need for a unified, resilient data engineering infrastructure with protecting existing investments in business applications where possible.
The client is a US company that develops, markets, and sells natural health products to healthcare professionals and select health food retailers. Its product line spans categories such as women’s well-being, therapeutics, and liposomal products, to name a few. The company’s product development process draws on a critical understanding of market data, extensive R&D, and the application of the latest scientific research. Its sales, operations, and administration departments work in tandem to provide customers with flawless service.
To increase product sales, the Sales and Marketing teams need an integrated view of existing sales and real-time insights to support demand-focused promotions.
However, data silos pose a significant challenge and make visualization difficult. The company needs to extract terabytes of data relating to marketing campaigns, sales promotions, past region-wise demand forecasts, product-wise performance, sales by channel, and more. Data from multiple sources needs to be processed and loaded into a data warehouse system, where it can then be analyzed. The data warehouse solution needs to be:
Additionally, the company aspires to develop a single-source-of-truth forecast that the product, sales, and marketing teams can all use, making the planning process more efficient.
To realize its goal of data-driven decision-making, the company worked with Trigent to design an integrated solution that provides a unified view of the sales and marketing data. Trigent proposed a strategy that balances the need for a unified, resilient data engineering and data analytics infrastructure with protecting existing investments where possible.
The first step was to understand the current software implementation, the flow of data between systems, the corresponding business processes, and the issues faced by the technical and product teams. Trigent studied the company’s overall product portfolio, product categories, distribution channels, and existing tools in order to design an architecture that addresses future needs.
The next step was to collate data from the existing systems using RESTful APIs and to process that data to address issues of format or structure, availability, and accuracy.
Logical Architecture
Trigent suggested leveraging Amazon Managed Workflows for Apache Airflow (MWAA) for workflow management, orchestration set-up, and end-to-end operation of the data pipelines. The team designed a scalable and reliable data pipeline and data infrastructure using AWS Glue ETL workflows. The proposed data pipeline was customized to the company's business needs: data was ingested from several sources such as customer order databases, payment gateways, and emails, to name a few.
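As a rough illustration of what orchestration means here, the sketch below orders a set of pipeline steps so that each one runs only after the steps it depends on. It uses Python's standard `graphlib` module rather than the Airflow API, and the step names (`extract_orders`, `transform_unified`, and so on) are hypothetical placeholders, not the client's actual tasks.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps (assumed names, for illustration only).
# Each key may run only after every step in its dependency set has finished.
PIPELINE = {
    "extract_orders": set(),
    "extract_payments": set(),
    "extract_emails": set(),
    "transform_unified": {"extract_orders", "extract_payments", "extract_emails"},
    "load_warehouse": {"transform_unified"},
    "refresh_forecast": {"load_warehouse"},
}


def execution_order(pipeline):
    """Return one valid run order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


order = execution_order(PIPELINE)
print(order)
```

A workflow manager such as Airflow adds scheduling, retries, and monitoring on top of this kind of dependency graph; the sketch covers only the ordering.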
The team proposed using Amazon MSK (Managed Streaming for Apache Kafka) to stream data from all the available sources. The raw data extracted from disparate sources would then be transformed into a unified format and structure for model building and data recovery, which would ultimately facilitate discovering insights. The transformed data would then be loaded into the warehouse on the cloud data platform. Amazon Redshift was selected for the data warehouse.
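The transformation into a unified format can be sketched as one mapping function per source, each producing the same record shape. Everything specific below is an assumption made for illustration: the input field names (`id`, `total`, `order_ref`, `amount_cents`, `ts`), the unified schema, and the treatment of all timestamps as UTC.

```python
from datetime import datetime, timezone


def from_order_db(row):
    """Map a hypothetical order-database row to the unified schema."""
    created = datetime.strptime(row["created"], "%Y-%m-%d %H:%M:%S")
    return {
        "source": "order_db",
        "order_id": str(row["id"]),
        "amount_usd": round(float(row["total"]), 2),
        "channel": row["channel"].strip().lower(),
        # Assumption: the database stores naive timestamps in UTC.
        "ordered_at": created.replace(tzinfo=timezone.utc).isoformat(),
    }


def from_payment_gateway(event):
    """Map a hypothetical payment-gateway event to the unified schema."""
    return {
        "source": "payment_gateway",
        "order_id": str(event["order_ref"]),
        # Assumption: the gateway reports amounts in integer cents.
        "amount_usd": round(event["amount_cents"] / 100, 2),
        "channel": event.get("channel", "unknown").strip().lower(),
        # Assumption: "ts" is a Unix timestamp in seconds.
        "ordered_at": datetime.fromtimestamp(event["ts"], tz=timezone.utc).isoformat(),
    }


unified = [
    from_order_db({"id": 17, "total": "49.90", "channel": " Retail ",
                   "created": "2023-01-05 14:30:00"}),
    from_payment_gateway({"order_ref": "17", "amount_cents": 4990, "ts": 0}),
]
```

Because every source lands in the same shape, downstream loading and modelling code only needs to handle one record layout.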
For building the cloud data warehouse solution, Trigent suggested Amazon Redshift for hosting and processing terabytes of data and running thousands of highly performant queries in parallel, thereby enabling the company to capture, store and analyze large volumes of data and deliver real-time insights to the product, sales, and marketing teams.
Additionally, the research and development team designed a demand forecasting solution for the next quarter by leveraging ML techniques. Algorithms were designed with the help of Amazon SageMaker to improve (that is, lower) the weighted absolute percentage error (WAPE). Time series and artificial neural network (ANN) models were used to build the demand forecasting solution. If adopted, the solution would surface readily interpretable outputs such as forecast sales values and percentage changes in sales, to name a few.
Useful insights were derived from the demand forecast and from anomaly detection. Using Trigent's data visualization solution, the team designed dashboards, reports, and visuals from the ML-generated insights. These customized reports would enable the teams to make informed decisions. The team built the reports with several tools, including Power BI, Tableau, and Amazon QuickSight.
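The case study does not spell out how WAPE was computed. A minimal sketch, assuming the commonly used definition (sum of absolute forecast errors divided by the sum of absolute actual values), looks like this; the sample figures are invented for illustration.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|.

    Returned as a fraction (0.05 means 5%). Lower is better.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_error / total_actual


# Invented monthly unit sales and forecasts for one product.
actual_units = [100, 200, 300]
forecast_units = [110, 190, 330]
score = wape(actual_units, forecast_units)  # (10 + 10 + 30) / 600
```

Because the errors are weighted by total actual volume, high-volume products dominate the score, which is often the desired behavior when the forecast drives sales planning.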
Image Credit: AWS
Image Credit: Microsoft