By Jeff Weinstein – April 15, 2011
Here at comScore, we’re always interested in discovering new applications for our data by combining it with other available sources. And what better way to unleash this innovation than by enlisting the help of some of America’s best and brightest? This past weekend, comScore hosted a social media case study at Carnegie Mellon University along with the Heinz College School of Information Systems and Management. The goal was to develop a new analytics product for the motion picture industry that forecasts box office performance and evaluates marketing decisions based on digital analytic inputs.
The emergence of social media offers the potential to better predict box office performance, and the value this can create is enormous. Hollywood movies gross approximately $30 billion per year, so the market is very large to begin with. The average production budget for a major studio movie often tops $70 million, and another $35-40 million is typically spent on marketing. Each year a few runaway hits often compensate for the majority of movies that fail to break even. However, once a movie has been completed (whether it’s a blockbuster or a bust), production becomes a sunk cost and the role of the movie marketer is to maximize yield on ticket sales. Put another way, a movie marketer’s job is to manage risk: minimize the downside on movies unlikely to be a strong box office draw, and maximize ticket sales for the ones likely to be breakout hits.
The competition featured 21 teams of 1 to 5 participants each. Teams came from many of Carnegie Mellon’s graduate programs, including the Heinz College School of Information Systems and Management (ISM), the Tepper School of Business (MBA), the computer science master’s program, and others. The competition began with each team submitting a one-page overview of their approach to building a predictive analytics product. The teams’ ideas blended comScore’s digital consumer behavioral data with externally available data obtained through sources such as open APIs or web page scraping.
The variety of student backgrounds made for a very interesting combination of submissions featuring very different approaches to the assignment. Computer science students outlined sentiment analysis algorithms for social media streams, relevance and influence ranking systems for social media targeted marketing, and television and radio voice sentiment analysis. Other students walked us through multivariable logistic regressions to isolate important predictive factors during different stages of a movie’s release (pre-release, opening weekend, ongoing). Many submissions focused on the importance of social media as a predictive input, and a few had particularly clever solutions that also incorporated offline inputs and historical signals, including movie performance by genre, actor, director, etc. Other groups incorporated various macroeconomic indicators, which can provide strong signals of general consumer behavior. Despite these very different approaches to the project, one constant across all groups was an extraordinarily high level of creativity.
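For readers unfamiliar with the technique, here is a minimal sketch of what a multivariable logistic regression of this kind might look like. It is not any team's actual model. The three inputs (scaled trailer views, tweet sentiment, a scaled consumer confidence index), the "hit" versus "not a hit" target, and every number in the toy data set are made-up assumptions for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    # Estimated probability that a movie is a "hit" given its features.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    # Plain batch gradient descent on the average log-loss.
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical toy data, NOT real comScore or box office figures.
# Columns: [trailer_views_scaled, tweet_sentiment, consumer_confidence_scaled]
ROWS = [
    [0.9, 0.8, 0.5],
    [0.8, 0.6, 0.6],
    [0.7, 0.9, 0.4],
    [0.2, 0.3, 0.5],
    [0.1, 0.4, 0.6],
    [0.3, 0.1, 0.4],
]
LABELS = [1, 1, 1, 0, 0, 0]  # 1 = hit, 0 = not a hit (assumed target)

WEIGHTS, BIAS = fit_logistic(ROWS, LABELS)

if __name__ == "__main__":
    new_movie = [0.6, 0.7, 0.5]  # hypothetical pre-release inputs
    print("Estimated hit probability:", predict_proba(WEIGHTS, BIAS, new_movie))
    print("Fitted weights:", WEIGHTS)
```

The size and sign of each fitted weight give a rough indication of how strongly an input is associated with the outcome, which is the sense in which such a model can help isolate important predictive factors. A real product would need far more data, properly scaled inputs, and out-of-sample validation.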
The entries were judged by a committee from comScore and CMU: Linda Abraham (Chief Marketing Officer), Dean Logan (Senior Director, Cross Media), Jeff Weinstein (Director, Research and Development) and CMU professors Ari Lightman (Practice Professor, Digital Media and Marketing) and Michael Smith (Associate Professor of Information Technology and Marketing). We selected the five best initial proposals and invited those teams to deliver an in-depth pitch of their idea the following afternoon. The panel was universally impressed with the technical, analytical, design, and marketing creativity of the final five groups.
The winning product was a multivariable logistic regression model from Asad Sheth and Patrick Clary of the Tepper School of Business. We were impressed by their detailed and inventive approach to data collection, which included YouTube trailer visitation from the comScore panel, sentiment analysis from Twitter, economic indicators, weighting based on social media influence, and prior sentiment correlation to box office outcome. Their writing was clear and concise and their presentation skills were fantastic. They dove deep into the model training method and also outlined market positioning, a tiered pricing strategy, and valuation steps. Their technical backgrounds (both had undergraduate degrees in computer science) were well balanced with their overall business acumen, which enabled them to deliver a memorable and convincing pitch. Way to go guys, and congratulations on being selected as the winning group from amongst a crowded and talented field!