Effective Re-platforming of Music & Entertainment Data from AWS to Google Cloud Platform

  • SOLUTION PILLAR : Smart Analytics
  • WORKLOAD : Data Warehouse
  • VERTICAL : Media & Entertainment

Overview

Our client, a major American music corporation, produces media and entertainment data and operates in more than 60 countries. As part of its key business operations, the client maintains a comprehensive music catalogue covering various forms of music consumption. It also regularly handles data from its key distribution partners (Spotify, Apple, Pandora, Rhapsody, YouTube and others) to obtain timely insights into country-wise sales and revenue from its proprietary albums and recording labels. To handle the massive volume of incoming Music and Entertainment (M&E) data, the client used the Amazon Redshift data warehouse for storage and Amazon S3 for data lake support.

Challenges

While trying to achieve its business needs, the client faced the following challenges:

  • Increased Costs: Redshift is billed by hourly usage, with pricing determined by cluster size, which made it costly at low query volumes. In addition, because the incoming M&E data from streaming partners was distributed across nodes only periodically, the cluster was unable to handle the increased data volume.
  • Frequent Performance Latencies: The existing system introduced a time lag of nearly 24 hours, caused by longer data retrieval, processing time and client waiting hours. Gaining timely insights from high-volume streaming data thus became difficult.
  • Increased Complexity: A critical drawback of Redshift was that it required constant low-level tuning of the virtualized hardware and database configurations. Managing its “data clusters” demanded additional expertise and flexibility in resource management.
  • Difficulty in Scalability: As data volumes from streaming partners grew exponentially, the existing system could not scale up to process data faster, even with additional investment. This had a proportionate impact on downstream processes, i.e., deep-dive analyses of sales data (LYSD, Last Year Same Day), which feed key tactical business decisions.
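
To make the LYSD comparison mentioned above concrete, here is a minimal Python sketch. It assumes that "last year same day" means the same weekday 52 weeks (364 days) earlier; the case study does not define the term, so the offset, the function name and the sample figures are all illustrative assumptions.

```python
from datetime import date, timedelta
from typing import Dict, Optional

# Assumption: LYSD = same weekday 52 weeks earlier (364 days back).
# Some teams use the same calendar date instead; adjust as needed.
LYSD_OFFSET_DAYS = 364


def lysd_change(daily_sales: Dict[date, float], day: date) -> Optional[float]:
    """Return the fractional change in sales versus the LYSD baseline.

    Returns None when either day is missing or the baseline is zero.
    """
    baseline_day = day - timedelta(days=LYSD_OFFSET_DAYS)
    current = daily_sales.get(day)
    baseline = daily_sales.get(baseline_day)
    if current is None or not baseline:
        return None
    return (current - baseline) / baseline


# Hypothetical figures, for illustration only.
sales = {date(2023, 3, 10): 1000.0, date(2024, 3, 8): 1250.0}
change = lysd_change(sales, date(2024, 3, 8))  # compares against 2023-03-10
```

Because every such comparison needs both the current day and a day roughly a year back, delays in loading either side of the data directly delay the analysis.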

Solution Highlights

To address these challenges, the client was looking for an offshore technical partner with the right business expertise to re-platform and migrate its data from the legacy infrastructure to a more scalable and cost-optimized data warehouse solution, i.e., from AWS to Google Cloud Platform. The move offered faster data processing and insight generation, for better business agility.

The Music and Entertainment data, previously stored in the Redshift data warehouse and the Amazon S3 data lake, was migrated incrementally to Google BigQuery, with Google Cloud Storage providing data lake support.
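
One common way to run an incremental migration is to move exported files in batches and track a "watermark" timestamp of what has already been loaded. The Python sketch below illustrates that idea only; the case study does not describe the actual mechanism, and `ExportedFile`, `next_batch` and the example URIs are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class ExportedFile:
    """A file exported from the source warehouse (hypothetical record type)."""
    uri: str
    modified_at: datetime


def next_batch(
    files: List[ExportedFile], watermark: datetime, batch_size: int
) -> Tuple[List[ExportedFile], datetime]:
    """Pick the oldest files newer than the watermark.

    Returns the batch and the new watermark to persist after a successful
    load. The batch may exceed batch_size when several files share the
    cut-off timestamp, so that none of them is skipped on the next run.
    """
    pending = sorted(
        (f for f in files if f.modified_at > watermark),
        key=lambda f: f.modified_at,
    )
    batch = pending[:batch_size]
    if not batch:
        return [], watermark
    cutoff = batch[-1].modified_at
    batch = [f for f in pending if f.modified_at <= cutoff]
    return batch, cutoff


# Hypothetical export listing, for illustration only.
files = [
    ExportedFile("s3://example-bucket/sales/2024-01-01.parquet", datetime(2024, 1, 1)),
    ExportedFile("s3://example-bucket/sales/2024-01-02.parquet", datetime(2024, 1, 2)),
    ExportedFile("s3://example-bucket/sales/2024-01-03.parquet", datetime(2024, 1, 3)),
]
batch, watermark = next_batch(files, datetime(2024, 1, 1), batch_size=1)
```

Persisting the watermark only after a batch has loaded successfully means a failed run can simply be retried from the previous watermark.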

Google BigQuery offered the following advantages:

  • Improved Features and Decreased Costs: BigQuery can handle and store massive data sets and exposes them through a SQL interface familiar from RDBMS-based systems. Its managed resource model abstracts away the details of the underlying hardware, database and other configurations. Query pricing is based on the amount of data the queries process rather than on cluster size, which decreased overall costs significantly.
  • Reduced Latencies: BigQuery offered quicker response times and better performance options, along with superior usability, performance and cost for analytical use cases, especially at scale.
  • Manageability and Usability: Google BigQuery supports data optimization for fast queries, effective utilization of resources over time, query response times of a few minutes, and minimal database tuning. It also offers superior performance compared with distributing data across a defined number of nodes. One caveat is that BigQuery’s performance can fluctuate substantially: the same query against the same data set may run twice (or half) as fast on different days, especially for SQL-like queries against multi-terabyte data sets.
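
The cost argument above can be illustrated with a small break-even calculation between an always-on cluster billed per node-hour and a model billed per terabyte scanned. This is a rough Python sketch: the rates and node count are hypothetical placeholders, not actual AWS or Google prices, and storage charges are ignored.

```python
def cluster_monthly_cost(
    nodes: int, hourly_rate_per_node: float, hours: float = 730.0
) -> float:
    """Cost of an always-on cluster billed per node-hour."""
    return nodes * hourly_rate_per_node * hours


def on_demand_monthly_cost(tb_scanned: float, rate_per_tb: float) -> float:
    """Cost of queries billed by the amount of data they scan."""
    return tb_scanned * rate_per_tb


def break_even_tb(
    nodes: int, hourly_rate_per_node: float, rate_per_tb: float, hours: float = 730.0
) -> float:
    """Monthly TB scanned at which both pricing models cost the same."""
    return cluster_monthly_cost(nodes, hourly_rate_per_node, hours) / rate_per_tb


# Hypothetical rates, for illustration only.
fixed = cluster_monthly_cost(nodes=4, hourly_rate_per_node=1.0)
light_usage = on_demand_monthly_cost(tb_scanned=100.0, rate_per_tb=5.0)
threshold = break_even_tb(nodes=4, hourly_rate_per_node=1.0, rate_per_tb=5.0)
```

Below the break-even volume, the pay-per-scan model is cheaper under these assumptions, which matches the observation that fixed cluster pricing is costly at low query volumes.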

Other key GCP components used:

  • Google Compute Engine: an infrastructure-as-a-service (IaaS) solution providing virtual machine instances to host the workloads for the M&E data volume serving music consumers.
  • Google Cloud Storage: a data lake solution analogous to Amazon S3, used to store large, unstructured data sets from the incoming media and entertainment data.
  • Google Cloud Dataflow: a data processing service intended for ETL Operations and Analytics, to support real-time big data processing of the unstructured business data.

Results

  • Performance improvement of up to 75%, due to the adoption of the robust and cost-optimized Google BigQuery, which simplifies performance management and bases pricing on the amount of data processed by queries.
  • Reduced data latency: After migration to Google Cloud, tasks could be completed within 1 hour, a considerable reduction from the earlier lag of nearly 24 hours.
  • Significant reduction in downtime: In line with industry best practices, the client’s infrastructure can be kept on par with the data scale and query patterns. The M&E data has been running on the new Google BigQuery system for the past year, with minimal downtime.
  • Support for near real-time updates (over 4 hours), based on inputs from 10 to 15 partners and more than 70 priority partners, covering downloads and other live streaming options.

Agile GCP Labs, previously known as Agile iSS, was founded in 2014 and rebranded as Agile GCP Labs in 2019.