Center Events

Economics of Data Workshop

December 7–8, 2018 | Silverman Hall 245A, 3501 Sansom St.; Room 401B, 3401 Walnut St.

Friday, December 7th: Silverman Hall 245A, 3501 Sansom St.

Saturday, December 8th: Room 401B, 3401 Walnut St.

This workshop is devoted to the ways in which data is restructuring existing competitive markets (e.g., by enabling better price discrimination, changing the boundaries of the firm, and creating monopoly power), as well as the design of ‘data marketplaces.’ It will run over two days. The first day will begin with speakers who will give us a bird’s-eye view of the role of data. On the second day, leading specialists will focus on particular aspects of the economics of data.

Organized by Annie Liang, Christopher Yoo, Aviv Nevo, Michael Kearns, and Rakesh Vohra.


Day 1:

11:15 am – 12:15 pm: Michael Schwarz, Corporate Vice President, Chief Economist, Microsoft Corporation

“Some Open Questions about the Value of Data and Simple but Useful Concept of Data Q”

How will the productivity gains from AI advances be shared? Is data becoming a strategic asset as precious as gold, or is it becoming plentiful like water? What will emerge as the dominant business model for AI? Why is the choice of KPIs becoming increasingly important? These are important open questions for both scientists and practitioners. Companies that are the most able to leverage existing data assets are most likely to survive and prosper. Yet few companies, if any, make systematic attempts to measure how agile they are in using data. I describe a simple concept, Data Q, that attempts to provide a crude but useful measurement of an organization’s ability to take advantage of the data it already owns.


12:15 pm – 1:15 pm: LUNCH

1:15 pm – 2:15 pm: Garrett van Ryzin, Cornell Tech and Head of Marketplace Labs at Lyft

“Data Science and the Transportation Tech Revolution”

Sparked by the growth of ridesharing startups (Uber, Lyft), the transportation tech sector has already produced some of the highest-valued startups in history. Soon after this initial ridesharing wave, attention turned to autonomous vehicles, with significant R&D efforts launched by tech giants (Google, Tesla, Apple), major automakers (GM, Daimler, Ford, FCA), ridesharing companies (Uber, Lyft), and mobility startups (Adaptiv, Nutonomy, Mobileye). Over the past year, the tech world pivoted yet again to scooters (Bird, Lime) and bikes (Jump, Motivate), birthing yet another round of billion-dollar startups. In this talk, I examine the technological and economic forces behind this wave of innovation and investment in transportation and why it creates an unprecedented opportunity for data science.

2:15 pm – 2:45 pm: BREAK

2:45 pm – 3:45 pm: Harikesh Nair, Stanford University and Chief Business Strategy Scientist at JD.com

“Data-driven Ad-Tech and Mar-Tech at JD.com”

I will discuss the JD data ecosystem and how data and algorithms drive JD’s advertising and marketing technology platform. I will also talk about some ways in which data drives eCommerce strategy.

6:00 pm: Dinner for all the speakers


Day 2:

8:30 am – 9:00 am: LIGHT BREAKFAST

9:00 am – 9:45 am: Munther Dahleh, MIT

“A Marketplace For Data: An Algorithmic Solution”

In this work, we aim to create a data marketplace: a robust real-time matching mechanism to efficiently buy and sell training data for machine learning tasks. While the monetization of data and pre-trained models is an essential focus of industry today, there does not exist a market mechanism to price training data and match buyers to vendors while still addressing the associated (computational and other) complexity. The challenge in creating such a market stems from the very nature of data as an asset: (i) it is freely replicable; (ii) its value is inherently combinatorial due to correlation with signal in other data; (iii) prediction tasks and the value of accuracy vary widely; (iv) the usefulness of training data is difficult to verify a priori without first applying it to a prediction task. As our main contributions we: (i) propose a mathematical model for a two-sided data market and formally define the key associated challenges; (ii) construct algorithms for such a market to function and rigorously prove how they meet the challenges defined. We highlight two technical contributions: (i) a new notion of “fairness” required for cooperative games with freely replicable goods; (ii) a truthful, zero-regret mechanism for auctioning a particular class of combinatorial goods, based on Myerson’s payment function and the Multiplicative Weights algorithm. These might be of independent interest. This is joint work with Anish Agarwal, Tuhin Sarkar, and Devavrat Shah.


9:45 am – 9:55 am: BREAK

9:55 am – 10:40 am: Sven Seuken, University of Zurich

“The Design of a Combinatorial Data Market”

In this paper, we propose a market design solution for a data market. We focus on four specific challenges: (1) different providers have the capability to produce different sets of databases; (2) to answer typical queries from buyers, two or more databases must be joined; (3) data providers have high fixed costs for producing a database; and (4) buyers have combinatorial values over which databases are produced and thereby become available in the marketplace. The key idea of our solution is to use a reverse auction for the sellers, a posted-price mechanism for the buyers, and a fixed-point iteration algorithm for finding an outcome that balances the two sides of the market. Via simulations, we show how our market distributes the surplus between buyers and sellers. In particular, we demonstrate that our design rewards providers of “unique” data much more than providers of “common” data. This is joint work with Dmitry Moor, Tobias Grubenmann, and Abraham Bernstein.


10:40 am – 10:50 am: BREAK

10:50 am – 11:35 am: Alessandro Acquisti, Carnegie Mellon University

“Who Benefits From the Data Economy?”

In the public debate over the data economy, many claims are being made concerning the benefits that can be accrued from the collection and analysis of consumer data. Not all of those claims have been empirically validated or scrutinized. I will present a series of ongoing studies that aim to understand and estimate how the economic value extracted from consumer data is being allocated to different stakeholders.

11:35 am – 12:35 pm: LUNCH

12:35 pm – 1:20 pm: Amit Gandhi, Department of Economics, University of Pennsylvania, formerly Chief Economist, Microsoft Cloud

“Economic Priors and Regularization”

Structural Econometrics and Machine Learning are both methods for pursuing high-dimensional decision problems in the presence of uncertainty. In this talk we explore the general theme of bridging the gaps between these approaches through the use of economic priors as a basis for regularization in such problems. This can help achieve “explainability” of decisions, via the natural connection of economic theory to many problem domains, while still benefiting from the powerful optimization, tuning, and automation techniques arising from ML/AI algorithms. We develop the idea using capacity planning and demand estimation as examples.


1:20 pm – 1:30 pm: BREAK

1:30 pm – 2:15 pm: Michael Bailey, Facebook

“Social Networks and Economic Decision Making”

Social networks play a foundational role in our decision making: where to live, who to marry, which trashy TV show to binge on next. But there has been very little empirical work on the role of networks in our economic decision making. I will give an overview of some of the work we are doing on the computational social science team at Facebook around network formation and the dynamics of networks, and of what we’ve learned about how networks influence our beliefs and decision making, including recent work on the impact of networks on product adoption and house purchases.

2:15 pm – 2:25 pm: BREAK

2:25 pm – 3:10 pm: Ginger Zhe Jin, Department of Economics, University of Maryland

“The Short Run Effect of GDPR on Technology Venture Investment”

The General Data Protection Regulation (GDPR) came into effect in the European Union in May 2018. We study its short-run impact on investment in new and emerging technology firms. Our findings indicate negative post-GDPR effects on EU ventures, relative to their US counterparts. The negative effects manifest in the overall dollar amounts raised across funding deals, the number of deals, and the dollar amount raised per individual deal.


3:10 pm – 3:20 pm: BREAK

3:20 pm – 4:05 pm: Dirk Bergemann, Department of Economics, Yale University

“Markets for Information”

We offer a comprehensive analysis of information markets through an integrated model of consumers, information intermediaries, and firms. The model embeds a large set of applications ranging from sponsored search advertising to credit scores to information sharing among competitors. We consider the ex ante sale of information and relate it to different products that brokers, advertisers, and publishers use to trade consumer information online. We discuss the endogenous limits to the trade of information that derive from its potential adverse use for consumers. This is joint work with Alessandro Bonatti (MIT) and Tan Gan (Yale University).


