Delivering a Unified Data Architecture for Sony Computer Entertainment America

With a vast amount of incoming user gaming data and no unified data architecture in place, SCEA had difficulty deriving any real value or actionable insight from the data. TEKsystems® partnered with SCEA to provide a Big Data solution.

Client Profile

Sony Computer Entertainment America LLC (SCEA), a division of Sony Corporation of America Inc., distributes and markets the PlayStation game console in North America; develops and publishes PlayStation software; and manages the U.S. third-party licensing program. SCEA has partnered with TEKsystems since 2012.

Industry Landscape

Companies from every industry can find tremendous value in Big Data. With the right Big Data solution for each unique business, meaningful insight can be captured and leveraged to make informed decisions that can have a financial impact on the organization. For example, consumer data can be used to understand specific customer segments and help tailor marketing messages; buying behavior data can inform financial decisions and optimize supply chain management strategies.

While the potentially vast amount of business intelligence (BI) offers great promise, digesting and synthesizing its meaning can be a complex undertaking. Big Data technologies are maturing at a rapid pace, and keeping up requires hands-on knowledge and expertise from IT teams. New iterations of technologies, each with different technical nuances, can be released in as little as a week, making it extremely difficult to stay current with the latest technology and to introduce or maintain Big Data initiatives.

Skill Sets Provided

Hadoop®, Hive™, Oozie, Flume, Shark™, Spark™, Sqoop™, Cloudera Impala, Cloudera Manager, Hadoop Distributed File System (HDFS), MapReduce, Oracle Database


SCEA was planning for the release of its PlayStation® 4 console (PS4™), the next evolution of the company’s series of gaming consoles. With the technology changes since the previous console version, PlayStation 3 (PS3™), SCEA’s IT Department would not be able to capture user gaming data the way they wanted to. User gaming data encompasses presale product information from manufacturing, point-of-sale data, data from activating the product and post-activation game play data. Whether the end customer purchased the console directly from SCEA’s website (i.e., online) or from a retailer (i.e., offline), the data was coming in from different sources, and the client had no way of viewing it in an integrated, unified form.

Critical BI could potentially be derived from user gaming data, which could be leveraged to inform decisions across the organization. For example, SCEA’s Marketing Department could use gamer data to extract customer segmentation information and identify trends among consumer groups, then customize its messages and marketing strategies to certain audience segments. Alternatively, SCEA’s manufacturing business function could use gaming data to derive user behaviors and patterns, measure the volume of users playing particular games and for how long, and adapt that insight to guide decisions regarding the supply chain.

But SCEA faced several key challenges in preparing for the PS4 console rollout:

  1. Acquiring the data: Data collection technology for PS3 wasn’t up to date with the requirements of PS4; SCEA would need a platform that was compatible with the new console and could capture hundreds of millions of user interactions on a daily basis.
  2. Storing the data: SCEA needed a cost-effective way to store the vast amount of user gaming data.
  3. Processing the data: Once acquired and stored, the data would require in-depth and complex processing in order to derive meaning and value. SCEA needed a data processing solution that could scale with the growing volume of data.

Using the traditional technology infrastructure supporting PS3, data processing could take up to 72 hours. Many variables could change in that amount of time, and SCEA knew that increasing the speed at which it could derive meaning from gamer data would allow it to better anticipate consumers’ needs. A faster data processing mechanism was needed.

SCEA selected a small group from its internal IT team to deliver a Big Data solution; however, the team lacked the necessary in-house expertise, and the effort was ultimately unsuccessful. SCEA realized the need for an external IT services partner with hands-on experience that could support not only the current installation process, but also the post-installation discovery and roadmapping.


TEKsystems proposed a Big Data solution that would enable SCEA to capture the hundreds of millions of user gaming data points and translate the data into something meaningful. Our unified data architecture would enable the integration of heterogeneous data sets (e.g., online and offline data) so that SCEA would have a standard mechanism for evaluating, comparing and deriving value from the data. Specifically, we would enable the following data points to be captured:

  • Presell – Manufacturing information about each unique product (e.g., location manufactured, timeframe of shipment from manufacturing facility to purchase location)
  • Post-sale – Purchase information, such as the name and location of the specific retailer
  • Pre-activation – A measure of whether and when the product was activated
  • Post-activation – Information about the consumer’s behaviors once the product has been activated (i.e., Is the consumer actively playing the console? Which game(s) are being played? How frequently is the consumer playing? What is the duration of play time?)
  • Post-play – Information and trends around the consumer’s behaviors that can be leveraged to develop user-specific marketing strategies (e.g., coupons or other sales promotions to encourage use)
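One way to picture how these lifecycle data points could share a single structure is the minimal sketch below. It is illustrative only: the record type, the field names (`console_id`, `channel`, `detail`) and the `timeline` helper are assumptions made for this example, not the schema used in the engagement.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The five lifecycle stages listed above."""
    PRESELL = "presell"
    POST_SALE = "post_sale"
    PRE_ACTIVATION = "pre_activation"
    POST_ACTIVATION = "post_activation"
    POST_PLAY = "post_play"


@dataclass(frozen=True)
class ConsoleEvent:
    console_id: str                 # hypothetical unique product identifier
    stage: Stage
    occurred_at: datetime
    channel: Optional[str] = None   # "online" or "offline", when known
    detail: Optional[str] = None    # e.g. a retailer name or game title


def timeline(events, console_id):
    """Return one console's events in chronological order."""
    return sorted(
        (e for e in events if e.console_id == console_id),
        key=lambda e: e.occurred_at,
    )
```

Keying every event on the same product identifier is what would let presell, sale, activation and play data be read as one history per console.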

To achieve the desired outcome and enable the generation of these key data points, TEKsystems recommended using an Apache Hadoop ecosystem leveraging a number of Hadoop components, including HDFS, Hive, Impala, Oozie and Flume. The data would be brought into the environment via a connecting data layer, cloud or raw file format. Using MapReduce and Hive, data would be processed and pushed into a Hadoop grid. We would set up a staging layer for massaging and cleansing the data. Then, based on SCEA’s business requirements, we would leverage Apache Sqoop technology to organize, aggregate and transfer the data to an Oracle Database. Once the data had been pushed into Oracle, we would then use traditional BI tools, such as Oracle Business Intelligence Enterprise Edition (OBIEE), for reporting.
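The staging and aggregation steps described above can be sketched in a few lines of plain Python. This is a conceptual illustration, not the Hive or MapReduce jobs that were actually built; the field names (`console_id`, `game`, `minutes_played`) and both functions are hypothetical.

```python
from collections import defaultdict


def cleanse(raw_rows):
    """Staging step: drop unusable rows and normalize the rest."""
    cleaned = []
    for row in raw_rows:
        console_id = (row.get("console_id") or "").strip()
        game = (row.get("game") or "").strip().lower()
        minutes = row.get("minutes_played")
        if not console_id or not game or minutes is None:
            continue  # required field missing
        try:
            minutes = float(minutes)
        except (TypeError, ValueError):
            continue  # play time is not numeric
        if minutes < 0:
            continue  # negative play time is invalid
        cleaned.append(
            {"console_id": console_id, "game": game, "minutes_played": minutes}
        )
    return cleaned


def aggregate_by_game(cleaned_rows):
    """Summarize to (game, distinct players, total minutes), sorted by game."""
    totals = defaultdict(lambda: {"players": set(), "minutes": 0.0})
    for row in cleaned_rows:
        entry = totals[row["game"]]
        entry["players"].add(row["console_id"])
        entry["minutes"] += row["minutes_played"]
    return sorted(
        (game, len(entry["players"]), entry["minutes"])
        for game, entry in totals.items()
    )
```

In a design like the one described, compact summary rows of this kind, rather than the raw events, are what would be worth transferring to a relational database for reporting.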


We successfully introduced a unified data architecture into SCEA’s IT environment, based on our recommended approach. We provided a team of seven people who were highly skilled and experienced in delivering Big Data solutions.

Specifically, our team included three on-site resources and four off-shore resources based out of the TEKsystems Hyderabad Solution Center in Hyderabad, India. Our team offered Java, SQL and extract/transform/load (ETL) competencies and programming backgrounds, which enabled them to integrate traditional BI systems with SCEA’s Big Data for improved analysis. Using the Hadoop ecosystem, we were able to reduce the time required for data to become available from 72 hours to as little as two hours—a significant time savings that would allow for real-time business decisions and actions.

Building a scalable framework

Our team collaborated with the client’s Foster City, California-based BI team throughout the engagement. From there, we could obtain the initial raw form of data and input it into a processing system, standardizing it into a certain structure and format. Considering the millions of data points coming in, getting it into a standard structure was essential to being able to analyze and interpret the data. The newly standardized data format better positioned departments such as marketing and finance to absorb the data, run their own analytics and derive meaning and value—something these groups were previously unable to achieve.
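As a rough illustration of what standardizing raw data into one structure can involve, the sketch below maps an online and an offline sale record onto a common shape. The source field names (`serial`, `order_date`, `SerialNo`, `Store`, `SaleDate`), the date formats and the output keys are all assumptions made for this example, not the formats SCEA received.

```python
from datetime import date, datetime


def normalize_online(rec):
    """Map a hypothetical web-store record onto the common structure."""
    return {
        "console_id": rec["serial"],
        "channel": "online",
        "retailer": "SCEA website",
        "sold_on": date.fromisoformat(rec["order_date"]),
    }


def normalize_offline(rec):
    """Map a hypothetical retailer feed record onto the common structure."""
    return {
        "console_id": rec["SerialNo"],
        "channel": "offline",
        "retailer": rec["Store"].strip(),
        "sold_on": datetime.strptime(rec["SaleDate"], "%m/%d/%Y").date(),
    }


def unify(online_records, offline_records):
    """Combine both sources into one list sorted by sale date."""
    rows = [normalize_online(r) for r in online_records]
    rows += [normalize_offline(r) for r in offline_records]
    return sorted(rows, key=lambda row: row["sold_on"])
```

Once both sources share the same keys and types, downstream teams can query them together without knowing where a given record originated.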

As a result, SCEA can now not only capture structured data forms (i.e., online and offline data), but also has the potential to capture non-structured data, meaning free-form text from sources such as YouTube, Facebook and Twitter. Although SCEA did not need non-structured data at the time of the engagement, our team deliberately built the unified data architecture as a scalable framework so that, should SCEA desire it in the future, the architecture can absorb non-structured data in addition to structured data forms.

Prior to our implementation of the unified data structure, data from online and offline sales could not be integrated and processed together. With our solution, SCEA is able to bring in data from both sources, an improvement that will enable better management of the supply chain. For example, if historic trends show a particular retailer in a specific location tends to have a shortage of stock during a peak shopping period, SCEA can plan around that and increase the product volume there while decreasing it elsewhere.

Extracting meaningful data

Prior to our partnership, SCEA had visibility into the user gaming data, but making sense of it was an issue. It was nearly impossible for the client to get anything meaningful out of the hundreds of millions of data points because of how fast technology changes. With our support, SCEA is able to derive value from the data and make actionable business decisions. The client is now able to capture and evaluate critical data points, including presell, post-sale, pre-activation and post-play insights.

Informing business decisions

The unified data architecture has the potential to positively impact SCEA’s bottom line. One specific example of leveraging the architecture to drive a positive and direct financial impact is through customer retention. Consider a consumer getting discouraged by the level of difficulty of a game. The data may indicate a declining interest from the consumer in playing that particular game. The client’s marketing team can use the data to identify behaviors and patterns, and in turn, cross-sell and up-sell the product in real time. In this instance, they could provide a coupon to push the consumer past the level they were frustrated with, or they could push a sale or other option to the consumer. This was a key outcome, as keeping existing customers happy is less costly than acquiring new ones.
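The idea of spotting declining interest can be made concrete with a simple heuristic, sketched below. The function name, the four-week minimum and the 50 percent threshold are assumptions chosen for illustration; they do not describe the analytics SCEA's marketing team used.

```python
def is_engagement_declining(weekly_minutes, drop_ratio=0.5):
    """Flag a player whose recent play time has fallen sharply.

    weekly_minutes: play time per week for one game, oldest week first.
    Returns True when the average of the more recent half is below
    drop_ratio times the average of the earlier half.
    """
    if len(weekly_minutes) < 4:
        return False  # too little history to judge a trend
    mid = len(weekly_minutes) // 2
    earlier = sum(weekly_minutes[:mid]) / mid
    recent = sum(weekly_minutes[mid:]) / (len(weekly_minutes) - mid)
    return earlier > 0 and recent < drop_ratio * earlier
```

A flag like this could serve as the trigger for the kind of coupon or promotion described above, with the threshold tuned against real retention data.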

Mitigating risks

Notably, implementing a Big Data solution is a relatively new and highly complex undertaking for any business. Big Data technologies are constantly evolving and maturing; the technologies can change in a short period of time, quite possibly on a weekly basis. For example, in a span of six months supporting this initiative for SCEA, there were a number of different versions of Hadoop released. Because of this, Big Data is a moving target that makes introducing a solution into enterprise environments a challenging task. Every step of the way was a learning process for SCEA, from scoping the project, to execution, to go-live.

To mitigate the risk of potential delays and interruptions from the client’s learning curve, our team provided guidance and support throughout every step of the process, making certain we were successful with each step. We started with a phase of discovery and roadmapping, where we mapped the system landscape and all current issues. After we mapped the system landscape, our team adopted the Agile methodology and iteratively interacted with SCEA to confirm we were aligned with their requirements. We also worked closely with SCEA’s Infrastructure Team, having recurring touchpoints to ensure all necessary hardware and support were scalable. Throughout this engagement, we took a highly collaborative approach, which was critically important to our success; we worked closely with the client, allowing us to obtain a detailed understanding of their issues and infrastructure.

Key Success Factors

Agile Methodology

Traditionally, our team would follow a Waterfall approach where we would talk to the client, take in all the requirements, create a technical solution, obtain client approval and then implement the solution. This would have provided SCEA with no visibility into our process until after implementation. Instead, by adopting an Agile methodology, we kept the client informed of our progress throughout the engagement. This iterative approach ensured the client was close to the engagement. We did not wait to fix all problems until after implementation; rather, we held standing meetings with SCEA that allowed for continuous learning and improvement.

Technical strength

This was a complex undertaking and providing SCEA the right people was critical to our success. Given that Big Data technologies are evolving with speed, there were new releases throughout the engagement. There were challenges and problems that emerged as technologies developed. We provided a core team with rich technical expertise and Big Data competencies that ensured our unified data architecture was a success and accommodated SCEA’s needs.


Our team was highly collaborative with a variety of different stakeholders throughout this Big Data initiative, which proved critical to our ability to deliver. For example, we kept SCEA engaged and informed of our progress every step of the way and ensured they understood each step of the Big Data solution. We collaborated with the infrastructure team and held recurring touchpoints to gain a detailed understanding of their data issues and goals, and ensure we had all the necessary hardware to support our scalable solution. We also worked closely with external stakeholders if needed, including the Hadoop vendor, in order to ensure optimal performance.
