Ran Wei, PhD, is the Gonzales Brothers Professor of Journalism in the School of Journalism & Mass Communications at the University of South Carolina, USA. A former TV journalist, active media consultant, and incoming Editor-in-Chief of Mass Communication & Society, his research focuses on media effects in society and digital new media, including wireless computing and mobile media.

Webster, J. G., & Ksiazek, T. B. (2012). The dynamics of audience fragmentation: Public attention in an age of digital media. Journal of Communication, 62(1), 39-56.

# Abstract

Audience fragmentation is often taken as evidence of social polarization.

We offer a theoretical framework for understanding fragmentation and advocate for more audience-centric studies.

We find extremely high levels of audience duplication across 236 media outlets, suggesting overlapping patterns of public attention rather than isolated groups of audience loyalists.

# Three factors that shape fragmentation

## Media Providers

The most obvious cause of fragmentation is a steady growth in the number of media outlets and products competing for public attention.

## Media Users

What media users do with all those resources is another matter. Most theorists expect them to choose the media products they prefer. Those preferences might reflect user needs, moods, attitudes, or tastes, but their actions are “rational” in the sense that they serve those psychological predispositions.

## Media Measures

Media measures exercise a powerful influence on what users ultimately consume and how providers adapt to and manage those shifting patterns of attendance. Indeed, information regimes can themselves promote or mitigate processes of audience fragmentation.

# Three different ways of studying fragmentation

## Media-centric fragmentation

An increasingly popular way to represent media-centric data is to show them in the form of a long tail (Anderson, 2006).

Concentration can be summarized with any one of several statistics, including Herfindahl–Hirschman indices (HHIs) and Gini coefficients (see Hindman, 2009; Yim, 2003).

### Herfindahl–Hirschman indices (HHIs)

The Herfindahl index (also known as the Herfindahl–Hirschman Index, or HHI) is a measure of the size of firms in relation to the industry and an indicator of the amount of competition among them. It is defined as the sum of the squares of the market shares of the firms within the industry (sometimes limited to the 50 largest firms), where the market shares are expressed as fractions. The result is proportional to the average market share, weighted by market share. As such, it ranges from $1/N$ (for $N$ equal-sized firms, approaching 0 as $N$ grows) to 1.0 (a single monopolistic producer).

$HHI = \sum_{i=1}^{N}(X_i/X)^2 = \sum_{i=1}^{N}S_i^2$

• $X_i$ - size of the $i$-th firm
• $X$ - total size of the market
• $S_i$ - market share of the $i$-th firm
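As a quick illustration of the definition above, here is a minimal Python sketch that computes the HHI from raw sizes. The function name `hhi` and the sample numbers are illustrative assumptions, not data from the paper.

```python
def hhi(sizes):
    """Herfindahl-Hirschman index from raw firm (or outlet) sizes X_i."""
    total = sum(sizes)                      # X, the total market size
    if total <= 0:
        raise ValueError("total market size must be positive")
    shares = [x / total for x in sizes]     # S_i = X_i / X, as fractions
    return sum(s * s for s in shares)       # sum of squared shares


# Hypothetical markets, for illustration only.
print(hhi([25, 25, 25, 25]))   # four equal firms -> 0.25 (= 1/N)
print(hhi([100, 0, 0, 0]))     # a single monopolist -> 1.0
```

Applied to audience data, the sizes would be the audience of each outlet, so a lower value indicates a more fragmented market.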

### Gini Coefficients

The Gini coefficient is a measure of statistical dispersion intended to represent the income or wealth distribution of a nation’s residents, and is the most commonly used measure of inequality. A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of 1 (or 100%) expresses maximal inequality among values (e.g., for a large number of people, where only one person has all the income or consumption, and all others have none, the Gini coefficient will be very nearly one).

If $x_i$ is the wealth or income of person $i$, and there are $n$ persons, then the Gini coefficient $G$ is given by:

$G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}|x_i-x_j|}{2n\sum_{i=1}^{n}x_i}$
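The pairwise form of the Gini coefficient (sum of absolute differences over all pairs $i, j$, divided by $2n$ times the total) can be sketched directly in Python. The function name `gini` and the example values are assumptions for illustration. The double loop is quadratic in $n$, so this is only meant for small inputs.

```python
def gini(values):
    """Gini coefficient via the mean-absolute-difference formula."""
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Sum |x_i - x_j| over every ordered pair (i, j).
    diff_sum = sum(abs(xi - xj) for xi in values for xj in values)
    return diff_sum / (2 * n * total)


# Hypothetical distributions, for illustration only.
print(gini([1, 1, 1, 1]))    # perfect equality -> 0.0
print(gini([0, 0, 0, 10]))   # one holder has everything -> 0.75 (= (n-1)/n)
```

The second example shows why maximal inequality gives a value "very nearly one" only for large $n$: with this formula the maximum is $(n-1)/n$.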

## User-centric fragmentation

This approach focuses on each individual’s use of media, which is fragmentation at the microlevel. Most of the literature on selective exposure would suggest that people will become specialized in their patterns of consumption.

It seems that user-centric studies are generally designed to describe typical users or identify types of users. They rarely “scale up” to the larger issues of how the public allocates its attention across media.

## Audience-centric fragmentation

A useful complement to the media- and user-centric approaches described above would be an “audience-centric” approach. This hybrid approach is media-centric in the sense that it describes the audience for particular media outlets.

### How to Build Network

• Node - media
• Edge - duplication of audience(percent) - how to compute?

The enlarged portion shows the link (i.e., the level of duplication) between a pair of nodes, NBC Affiliates and the Yahoo! brand, where 48.9% of the audience watched NBC and also visited a Yahoo! Web site during March 2009.

### Network Metrics

• expected duplication vs. observed duplication

Our approach was to compare the observed duplication between two outlets to the “expected duplication” due to chance alone. Expected duplication was determined by multiplying the reach of each outlet. So, for example, if outlet A had a reach of 30% and outlet B a reach of 20%, then 6% of the total audience would be expected to have used each just by chance. If the observed duplication exceeded the expected duplication, a link between two outlets was declared present (1); if not, it was absent (0) (see Ksiazek, 2011, for a detailed treatment of this operationalization).
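The link rule described above can be written as a small Python sketch. The function names are assumptions, and reach and duplication are expressed as fractions of the total audience rather than percentages.

```python
def expected_duplication(reach_a, reach_b):
    """Duplication expected by chance alone: the product of the two reaches."""
    return reach_a * reach_b


def has_link(observed, reach_a, reach_b):
    """Return 1 if observed duplication exceeds the chance expectation, else 0."""
    return 1 if observed > expected_duplication(reach_a, reach_b) else 0


# Worked example from the text: reaches of 30% and 20% give about 6% by chance.
print(expected_duplication(0.30, 0.20))
print(has_link(0.10, 0.30, 0.20))   # observed 10% exceeds 6% -> 1
print(has_link(0.04, 0.30, 0.20))   # observed 4% is below 6% -> 0
```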

• degree score: converted into percent

For each outlet, the number of links is totaled to provide a degree score. For ease of interpretation, we converted these totals to percentages. So, for example, if an outlet had links to all the other 235 outlets, its degree score was 100%. If it had links to 188 outlets, its degree score was 80%.
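A minimal Python sketch of this conversion, assuming the network is stored as a symmetric 0/1 adjacency matrix (a plain list of lists, which is an assumption about the data layout, not something the paper specifies):

```python
def degree_scores(adjacency):
    """Degree of each node as a percentage of the other n - 1 nodes."""
    n = len(adjacency)
    scores = []
    for i, row in enumerate(adjacency):
        links = sum(row[j] for j in range(n) if j != i)   # ignore self-links
        scores.append(100.0 * links / (n - 1))
    return scores


# Toy 3-node network: node 0 is linked to both others.
toy = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
print(degree_scores(toy))      # [100.0, 50.0, 50.0]

# The arithmetic from the text: 188 links out of 235 possible.
print(100.0 * 188 / 235)       # 80.0
```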

• network centralization score (degree centrality?)

To provide a summary measure across the entire network of outlets, we computed a network centralization score. This score summarizes the variability or inequality in the degree scores of all nodes in a given network (Monge & Contractor, 2003) and is roughly analogous to the HHI (see Hindman, 2009; Yim, 2003) that measures concentration in media-centric research. Network centralization scores range from 0% to 100%. In this application, a high score indicates that audiences tend to gravitate to a few outlets (concentration), whereas a low score indicates that audiences spread their attention widely across outlets (fragmentation).
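The paper does not spell out the formula in the passage quoted here. One common choice is Freeman's degree centralization, which compares every node's degree with the maximum degree and normalizes by the value for a star network. The sketch below assumes that definition; the function name is illustrative.

```python
def degree_centralization(degrees):
    """Freeman degree centralization, as a percentage (assumed definition).

    `degrees` holds the raw number of links of each node in an
    undirected network without self-links.
    """
    n = len(degrees)
    if n < 3:
        raise ValueError("need at least three nodes")
    max_deg = max(degrees)
    spread = sum(max_deg - d for d in degrees)
    # (n - 1)(n - 2) is the spread of a star network, the most centralized case.
    return 100.0 * spread / ((n - 1) * (n - 2))


print(degree_centralization([4, 1, 1, 1, 1]))   # star network -> 100.0
print(degree_centralization([4, 4, 4, 4, 4]))   # complete network -> 0.0
```

Under this definition a score near 0% means that degree scores are nearly equal across outlets.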

# Result

The distribution shows that almost all 236 outlets have high levels of audience duplication with all other outlets (i.e., degree scores close to 100%). Furthermore, the network centralization score is 0.86%. This suggests a high level of equality in degree scores and thus evidence that the audience of any given outlet, popular or not, will overlap with other outlets at a similar level.

For instance, the Internet brand Spike Digital Entertainment reaches only 0.36% of the population, but its audience overlaps with close to 70% of the other outlets. Although we do not have data on individual media repertoires, these results suggest that repertoires, though quite varied, have many elements in common. The way users move across the media environment does not seem to produce highly polarized audiences.

The graph_tool module provides a Graph class and several algorithms that operate on it. The internals of this class, and of most algorithms, are written in C++ for performance, using the Boost Graph Library.

Python modules are usually very easy to install, typically requiring nothing more than pip install for basically any operating system. For graph-tool, however, the situation is different. This is because, in reality, graph-tool is a C++ library wrapped in Python, and it has many C++ dependencies such as Boost, CGAL and expat, which are not installable via Python-only package management systems such as pip. Because the module lives between the C++ and Python worlds, its installation is done more like a C++ library than a typical Python module. This means it inherits some of the complexities common in the C++ world that some Python users do not expect.

The easiest way to get going is to use a package manager, for which the installation is fairly straightforward. This is the case for some GNU/Linux distributions (Arch, Gentoo, Debian & Ubuntu) as well as for MacOS users using either Macports or Homebrew.

Use brew to install graph-tool on MacOS

brew installed another python into /usr/local/Cellar/python/2.7.13/bin/python, so we can open that python to use graph-tool.

So far, it seems that graph-tool cannot be installed directly into Anaconda, as the following comment notes:

You cannot really mix homebrew with anaconda without defeating the whole purpose of isolated environments. I would try to find a way to install graph-tools directly into your anaconda environment. It seems that there are packages on anaconda cloud. But I am not sure how easy it is to install those. – cel Dec 24 ‘15 at 7:50

# Question

There is an equation of exponential truncated power law in the article below:

Gonzalez, M. C., Hidalgo, C. A., & Barabasi, A. L. (2008). Understanding individual human mobility patterns. Nature, 453(7196), 779-782.

like this:

It is an exponentially truncated power law. There are three parameters to be estimated: rg0, beta and K. We now have the radius of gyration (rg) for several users.

You can directly get rg and prg data as the following:

How can I use these data of rgs to estimate the three parameters above? I hope to solve it using python.

# Answer

Following @Michael’s suggestion, we can solve the problem using scipy.optimize.curve_fit.
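A minimal sketch of such a fit follows. It assumes the functional form reported in Gonzalez et al. (2008), $P(r_g) = (r_g + r_g^0)^{-\beta}\exp(-r_g/K)$, and uses synthetic `rg` and `prg` arrays as a stand-in for the real data, which is not reproduced here. Fitting the logarithm of the model is a choice made for this sketch, because the probabilities span many orders of magnitude. The starting values and bounds are also assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit


def truncated_power_law(rg, rg0, beta, K):
    """P(rg) = (rg + rg0)^(-beta) * exp(-rg / K)."""
    return (rg + rg0) ** (-beta) * np.exp(-rg / K)


def log_model(rg, rg0, beta, K):
    """Logarithm of the model, so small probabilities are not ignored."""
    return -beta * np.log(rg + rg0) - rg / K


# Synthetic stand-in data generated from known parameters (not real data).
true_params = (5.8, 1.65, 350.0)
rg = np.logspace(0, 3, 50)
prg = truncated_power_law(rg, *true_params)

# Fit log(prg); the bounds keep rg0, beta and K non-negative.
popt, pcov = curve_fit(
    log_model,
    rg,
    np.log(prg),
    p0=(1.0, 1.0, 100.0),
    bounds=([0.0, 0.0, 1e-6], [np.inf, np.inf, np.inf]),
)
rg0_hat, beta_hat, K_hat = popt
print(rg0_hat, beta_hat, K_hat)
```

With real, noisy data the square roots of the diagonal of `pcov` give rough standard errors for the three estimates.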

# Reference

scipy.optimize.curve_fit(f, xdata, ydata, p0=None, sigma=None, absolute_sigma=False, check_finite=True, bounds=(-inf, inf), method=None, jac=None, **kwargs)

Use non-linear least squares to fit a function, f, to data.

Assumes ydata = f(xdata, *params) + eps

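81 days of data, more than 100 GB: an awkward amount of data. I have spent a lot of time tinkering with it, yet I keep circling around the edges and still know nothing about the characteristics of the data itself. At this rate, when will I ever really understand this data?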

The dataset is composed of 81 days’ user records of Baidu Reading. Each file has columns as below:

Now I plan to track how attention flows across the different books people read. What I need to do is extract, for every user in every daily file, the sequence of distinct books (in the order of ‘time_trigger’) along with the duration of each. That is clearly a job that can be run in parallel.

I wrote a function that can get one user’s attention flows, defined as below:
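As an illustration only, a function of this kind might look like the sketch below. It assumes each record is a `(time_trigger, book_id)` pair for a single user, with numeric timestamps; `book_id` is a hypothetical column name. It also assumes that consecutive records of the same book form one reading episode whose duration is the last timestamp minus the first.

```python
from itertools import groupby


def attention_flow(records):
    """Collapse one user's records into an ordered list of (book_id, duration).

    `records` is an iterable of (time_trigger, book_id) pairs.
    """
    ordered = sorted(records)                     # order by time_trigger
    flow = []
    # groupby merges consecutive records that share the same book_id.
    for book, run in groupby(ordered, key=lambda rec: rec[1]):
        times = [t for t, _ in run]
        flow.append((book, times[-1] - times[0]))
    return flow


# Hypothetical records for one user, for illustration only.
sample = [(0, "a"), (10, "a"), (15, "b"), (40, "b"), (50, "a")]
print(attention_flow(sample))   # [('a', 10), ('b', 25), ('a', 0)]
```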

I once tried to run this function for each person in each file, which took far too long: after 9 hours of computation, only 5 files had been processed. I gave up.

A better approach is to do this job in parallel, which takes 3 steps.

• The Python code is designed to process only one day’s file
• The Linux shell is used to manage each Python process
• The logging package is imported to log real-time output and status.

The advantage of using the logging module is that it writes real-time information to files whether or not your ssh session to the server is still open. By contrast, output recorded with print cannot be seen after the ssh session closes, and output written to a file by hand does not appear instantly because of file I/O buffering.
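A minimal sketch of this logging setup, using only the standard library. The function names, the log format, and the progress interval are assumptions, and the per-record work is left as a placeholder. `logging.FileHandler` flushes after every record it emits, which is why the messages show up in the file right away.

```python
import logging


def make_logger(log_path):
    """Create a logger that appends timestamped messages to log_path."""
    logger = logging.getLogger(log_path)
    logger.setLevel(logging.INFO)
    if not logger.handlers:                       # avoid duplicate handlers
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def process_day(lines, logger, report_every=2):
    """Walk through one day's records and log progress as it goes."""
    n = 0
    for n, line in enumerate(lines, start=1):
        # Placeholder: the real per-record processing would go here.
        if n % report_every == 0:
            logger.info("processed %d records", n)
    logger.info("done: %d records", n)
    return n


# Example usage (one process per daily file, hypothetical file names):
# logger = make_logger("day_01.log")
# with open("day_01.txt") as fh:
#     process_day(fh, logger, report_every=100000)
```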

# logging

I used watch to show the content of log files every 10 seconds.

I started several Python processes, each processing one day’s file, and all 24 CPUs of my server were put to work. That was very satisfying and exciting!

Computational Social Science

David Lazer, Alex Pentland, Lada Adamic, Sinan Aral, Albert-László Barabási, Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, Marshall Van Alstyne

## Question 1: Network Science and Big Data

What are your impressions of the way that network science has gone? A lot of it increasingly (since small worlds especially) focuses on the shape of the network, rather than the attributes of nodes. Do you think that’s the right way forward? Is there anything big missing from network sociology, or a direction that you think it should be going in? Will “small data” networks be drowned out by big data?

How is network science going?

If you look back at my original paper with Strogatz it has “collective dynamics” right there in the title—it was always the relationship between structure and behavior that we thought was interesting, not structure for its own sake.

We also didn’t intend for the “small world” model that we proposed to be interpreted as a realistic model of network structure; rather we were trying to make a conceptual point that even subtle changes in micro-structure could have dramatic effects on macro-structure and hence possibly also macro-behavior.

I’ve also come to believe that modeling exercises that are unconstrained by data have a tendency to gravitate toward phenomena that are mathematically interesting, which is no guarantee of empirical relevance. Fortunately I think that in recent years we’ve seen more emphasis on studying both network structure and collective behavior empirically. (What matters most is consistency between mathematical derivation and empirical knowledge.)

What questions truly need big data?

For some questions, such as when we are interested in rare events or estimating tiny effect sizes, it is indeed necessary to have a very large number of observations; in some of our recent work on diffusion, for example, it turns out that a billion observations is not excessive. But for other questions, the scale of the data is much less important than its type or quality. Sometimes it matters that a sample is unbiased or representative; other times it is important to have proper randomization in order to infer causality; and other times still it is important simply that you have instrumented the outcome variable of interest. Regardless, the point is that the data is relevant to the question you’re asking, not how big or small it is.

## Question 2: Why is the sociological imagination less clear than economic thinking?

Economists have been pretty successful at clearly articulating a set of core concepts that have spread out into the broader world and form the basis for economic thinking: supply and demand; markets; externalities; people respond to incentives; perverse incentives; sunk costs; exogenous shocks; etc. Since your early work brought together core sociological concepts (namely social influence and the Matthew effect): i) Do you think sociology should try to reorganize itself around core, “Soc 101” concepts that every introductory class would cover? We often talk about the sociological imagination, but that is much less clear than economic thinking.

The social reality is too complex

This is a tough one. One of the things I’ve always liked about sociology is its embrace of multiple viewpoints, both in terms of theory and methodology. Personally I think social reality is too complex to be adequately accounted for by any single theoretical framework, a point that Merton made very eloquently many years ago in his article on middle-range theories. Unfortunately I don’t think his argument was properly understood at the time (e.g. by rational choice theorists) and I don’t think it is still.

Perhaps that’s because simple universal frameworks are institutionally powerful even when they’re scientifically questionable. And that’s why it’s a tough question: because I think that one reason why economics is so much more influential than sociology in government, in the media, and in society, is precisely because economists can articulate a fairly coherent worldview that they can all (by and large) get behind, whereas sociologists can’t really agree on anything. Economists are therefore in a much better position to offer answers to questions that people care about, whereas sociologists tend to point out all the ways in which the question is more difficult than the questioner realized. Even if the sociologist’s response better reflects our true understanding of the world, it’s no surprise that most people would prefer to listen to the economist.

Sociologists should try to solve some nontrivial but solvable problems to reach a consensus.

That said, I wouldn’t advocate sociology trying to develop a single set of core concepts just to compete with economics. Rather I would propose that sociologists identify a small set of nontrivial real-world problems that we believe we can actually solve, or at least make some meaningful progress towards solving, and then demonstrating that progress. Identifying nontrivial but solvable social problems isn’t easy, nor do I think that solving problems is the only measure of progress in a discipline. So I certainly wouldn’t advocate that everyone drop what they’re doing to work on these problems, or even try to agree on what they should be. But I do think that being able to point to a set of problems that sociologists have arguably “solved” would greatly enhance our collective reputation and help us to attract more students.

ii) If you could pick a handful of sociological concepts and then have everyone outside of sociology learn them, and they’d be as familiar as the economic examples listed above, what would they be?

A book called Everything is Obvious: Once you Know the Answer

I wrote a book a few years ago called Everything is Obvious: Once you Know the Answer about the failures of commonsense reasoning and how we systematically ignore them. I think the contents of that book is pretty close to the list of concepts I would like everyone to understand, including: the nature of common sense itself; the difference between rational choice and behavioral conceptions of individual decision making; cumulative advantage and intrinsic unpredictability; the fallacy of the representative individual; the perils of ex-post explanations and dangers of “overfitting” to known outcomes; the consequences of overfitting for predictions about the future; and the implications of all of these problems for practical matters of predicting success, rewarding performance, deciding what is fair, and even what is knowable. I wouldn’t claim that these concepts constitute a core of knowledge comparable to core concepts in economics, nor do I think it would help students directly solve real-world problems of the kind I just advocated for, but I do think it would teach students some epistemic modesty and might eventually lead to more intelligent public discourse about these problems. I’m not sure the book has accomplished any of that, but that’s why I wrote it.

## Question 4: Try to compare sociology PhDs with industry demands

You made a transition from academia to the private sector. One of the ways that people have suggested improving the sociology PhD job market is to make work in the private sector a clearer option from the beginning. What do you think about sociology PhDs and the private sector? How do you think sociology PhDs could or should be going about it? What skills should they develop? How should they present themselves to companies?

Big data need theoretical knowledge to build valuable insights.

It’s true that companies are increasingly excited about extracting value from data, which has made data science a very in-demand skill set. I also think that companies are starting to appreciate that truly valuable insight requires more than just good computational and statistical chops—some degree of theoretical knowledge is also required in order to ask the right questions, define the right metrics, and avoid basic errors of sampling bias, causal inference etc. This latter trend is much earlier in its life cycle than the former, but I think as companies learn more about the complexities and compromises associated with “big data” they will increasingly demand data scientists with social scientific training.

The bright side and downsides for sociology PhDs to go to industry.

So on the bright side I think that there is real potential for sociology PhDs to find intellectually rewarding work in industry. The downside is that in order to realize any value from their sociological training they also need a level of technical skill that is well beyond what students can expect to learn in the vast majority of sociology PhD programs. In our postdoc hiring we are starting to see a handful of strong candidates with sociology PhDs (up from zero just a couple of years ago), so that’s encouraging. But I suspect that these students mostly figured it out on their own or took it upon themselves to find the relevant courses in other departments. Which is fine, and if I were a current sociology PhD that’s what I would do, but I think it would be better for the field to provide a more systematic level of training.

## Question 5: The differences between writing for AJS and Nature

You’ve been very successful publishing in journals such as AJS, while targeting broader audiences through high-impact journals such as Nature and Science. How is writing for an AJS audience different from writing for Nature and Science? Where would you send your manuscript if it was rejected by Science? Do you think more sociologists should be looking to publish beyond our traditional journals in order to reach a broader community of scholars?

Different readers need different writing strategies.

Writing for AJS is completely different from writing for Nature and Science in almost every sense: length, style, treatment of related literature, acceptable methodology, conception of theory, presentation of results…everything. It’s also different from writing for computer science conference proceedings and physics journals, and all of those different outlets are also different from one another. I also occasionally write magazine articles, op-eds and trade books, and those are also all different in their own way. Learning to write for different disciplinary outlets and in different styles is time consuming and sometimes frustrating—because different groups of readers care about such different things. But I think it’s an effort that sociologists should make.

Try to speak to computer scientists in their language

The fact is that with very few exceptions researchers in other disciplines don’t read sociology journal articles, and when they do they find them incredibly long and tedious. For example, all that effort that we devote to situating our work is completely lost on most computer scientists, so when they get to the results section they wonder why it was necessary to write 40 pages in order to explain one table of regression coefficients. Given that CS is a much bigger and more powerful discipline than sociology, if we want to have an impact on them or convince them that we are worth taking seriously, we will have to speak to them in their language and probably in their own publication venues.

A single high impact paper is worth many low impact papers.

On the other hand a single high impact paper is worth many low impact papers, so from a career perspective it’s not necessarily a waste of time to devote a year or two to getting something into a top journal. I do often wish that we could find a more efficient way to publish our research without compromising quality, and in that regard online-only, open-access journals like PLoS One and Sociological Science have some appealing properties. But the reality is that we live in a highly competitive world where attention is scarce; so my fear is that if we stopped using A-journal publications as a differentiator, the likely substitute (relentless self-promotion on social media anyone?) might be even worse.

## Question 6: How to choose between academia and industry

You spent several years at the Columbia Sociology Department. During your time there you mentored several prominent junior scholars including Baldassari and Salganik. How was your experience being an academic sociologist and why did you decide to leave for industry? Will you consider returning to Academia?

Why leave Columbia for Yahoo! Research?

I really loved my time at Columbia but around 2006 it started to dawn on me that, whether it liked it or not, sociology was going to become a computational science, much as biology had become a computational science in the early 1990s. All around us social data were exploding in volume and variety, from email to social networking services to online experiments of the kind I did with Matt (Salganik). It also occurred to me, however, that sociologists weren’t well equipped to handle this transition and that if we were going to make rapid progress we would need computer scientists to help, and possibly psychologists and economists as well. Columbia is now pretty open to interdisciplinary collaborations of this sort, and their data science institute is a great example of that openness, but at the time it was very hard to see how it would work within the confines of traditional academic departments.

I was also having difficulty recruiting grad students with rigorous mathematical and computational backgrounds (as you noted there were some like Matt and Delia and also Gueorgi Kossinets, but they were really the exceptions), and raising funding to support the whole thing. Towards the end I felt like I was spending all my time writing grant proposals or sitting in meetings and almost no time doing actual research. So when Prabhakar Raghavan called me from Yahoo! to ask if I would come and help them set up a social science research unit it was very tempting. Even then I wasn’t sure I would do it, and certainly didn’t expect to do it for long, but it really worked out wonderfully and now I’ve been at Yahoo! and Microsoft Research for longer than I was at Columbia.

Doing Research more purely at Microsoft or Yahoo!

Perhaps surprisingly, I think the biggest difference between my experience at Columbia compared with Microsoft (or at Yahoo!) is that I now spend much more time doing and thinking about research. The other big difference is that, in contrast with most university faculty, I am surrounded (literally: we all sit in cubicles in an open plan office) by researchers from different disciplinary backgrounds including psychology, economics, physics, and computer science. One of my colleagues once observed that university departments comprise lots of people with similar training interested in different problems, whereas research labs like ours comprise lots of people with different training interested in the same problems. I think that’s roughly true, and it completely changes the nature of how we work, which is highly collaborative, interdisciplinary, and very problem oriented. That is not to say that we only do “applied” research; we do some of that, but we also do a lot of basic science and publish all our work in all the same venues as our colleagues in universities. Rather what it means is that we are more concerned with the relevance of our work to real-world problems and less concerned about what particular disciplinary tradition it fits into.

Would I ever consider returning to academia? I don’t know. I’m very happy at Microsoft right now: I work with fantastic colleagues, we get amazing PhD student interns every summer, and we work on a wide variety of extremely interesting problems. It’s been a great experience and every day I’m grateful to have the job that I have. So although I wouldn’t rule out returning to academia one day I’m not in any hurry to leave.

## Question 7: Common sense and its importance for sociology

You recently wrote an article on common sense and its importance for sociology. What was the intuition for it?

Sociologists conflate causal explanations with explanations that “make sense” of outcomes they have observed

As I mentioned, I recently wrote a book about how people rely on common sense more than they realize, and in so doing end up persuading themselves that they understand much more about the world than they actually do. In the course of writing the book, it occurred to me that sociologists make many of the same mistakes that other people do. Just like other people, that is, sociologists conflate causal explanations with explanations that “make sense” of outcomes they have observed, unconsciously substitute representative individuals for collectives, overfit their explanations to past data, and fail to check their predictions. I didn’t belabor this point in the book because, as I mentioned earlier, I wanted it to be an advertisement for sociological thinking, not a critique of it. Nevertheless I thought the implication was pretty clear, so I was disappointed that some of my colleagues who liked the book’s appeal to non-sociologists seemed to think it had nothing to say to them. I decided that if I wanted them to get the message I would have to sharpen it up a lot, and also make it a bit more constructive; so that’s what I tried to do in that paper.

## Question 8: Changes needed for Sociology department and some tips for current sociology graduates

If you were in charge of a Sociology department and could implement any change you’d like, what specific changes would you introduce to its graduate training program? Is there something that current sociology graduate students aren’t doing that they should be doing?

Changes needed for Sociology department

As I mentioned earlier, I think that a data science sequence (e.g. data acquisition, cleaning and management; basic concepts and programming languages for parallel computing; advanced statistics, including methods of causal inference; some basic machine learning; design and construction of web-based experiments) would be super useful for sociology graduate students, and would make them both better social scientists and also much more attractive to prospective employers. There are already a handful of courses of this sort being trialed in various places, including Stanford, Columbia, and Princeton, and sociology departments could work with their colleagues in other departments to pull together a reasonable sequence from existing pieces. It would take some effort and probably resources, but I don’t think it’s unfeasible.

Some tips for current sociology graduates

In the meantime, as I mentioned earlier: if I were a current sociology grad student, I would be busy taking courses in computer science and statistics to augment my sociology training. I would also look around for any groups doing computational social science and ask to join them.

Joining a new interdisciplinary field like computational social science is an adventure.

The downside of new, interdisciplinary fields is that nobody really knows what is involved or what the standards are, so you have to be prepared to take some risks and also to feel out of your depth much of the time. The upside is that it can be incredibly stimulating, and there is the possibility of doing something genuinely new. I think computational social science is in that phase now, so it’s a great time for ambitious and creative students to dive in and see what they can do.

Sociological theory, if it is to advance significantly, must proceed on these interconnected planes: 1. by developing special theories from which to derive hypotheses that can be empirically investigated and 2. by evolving a progressively more general conceptual scheme that is adequate to consolidate groups of special theories.

— Robert K. Merton, Social Theory and Social Structure

…what might be called theories of the middle range: theories intermediate to the minor working hypotheses evolved in abundance during the day-by-day routine of research, and the all-inclusive speculations comprising a master conceptual scheme.

— Robert K. Merton, Social Theory and Social Structure

Our major task today is to develop special theories applicable to limited conceptual ranges — theories, for example, of deviant behavior, the unanticipated consequences of purposive action, social perception, reference groups, social control, the interdependence of social institutions — rather than to seek the total conceptual structure that is adequate to derive these and other theories of the middle range.

— Robert K. Merton

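## Total systems of theory and theories of the middle range

…we are able to form limited theories and to predict general tendencies and prevailing laws of causation, which would for the most part not hold true if extended to all mankind, but which have a certain truth if confined to particular nations… By narrowing the range of observation, by confining ourselves to certain types of communities, and by stating the facts faithfully, it is possible to enlarge the scope of political theory. By adopting this method, we can increase the number of true political axioms drawn from the facts and, at the same time, make them fuller, more vivid, and more solid. In contrast to mere hollow generalities, they resemble Bacon’s middle axioms, which are generalized expressions of fact yet remain close enough to practice to serve as guides in the business of life.

– George Cornewall Lewis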

With the introduction of the middle range theory program, he advocated that sociologists should concentrate on measurable aspects of social reality that can be studied as separate social phenomena, rather than attempting to explain the entire social world. He saw both the middle-range theory approach and middle-range theories themselves as temporary: when they matured, as natural sciences already had, the body of middle range theories would become a system of universal laws; but, until that time, social sciences should avoid trying to create a universal theory.

– Mjøset, Lars. 1999. “Understanding of Theory in the Social Sciences.” ARENA working papers.

Network Diversity and Economic Development

by Nathan Eagle, Michael Macy, Rob Claxton

Heterogeneous social ties may generate these opportunities from a range of diverse contacts (1, 2):

• greater job mobility (14, 15)
• higher salaries (9, 16, 17)
• opportunities for entrepreneurship (18, 19)
• increased power in negotiations (20, 21).

Although these studies suggest the possibility that the individual-level benefits of having a diverse social network may scale to the population level, the relation between network structure and community economic development has never been directly tested (22).

As policy-makers struggle to revive ailing economies, understanding this relation between network structure and economic development may provide insights into social alternatives to traditional stimulus policies.

The communication network data were collected during the month of August 2005 in the UK. The data contain more than 90% of the mobile phones and greater than 99% of the residential and business landlines in the country.

The resulting network has $65 \times 10^6$ nodes, $368 \times 10^6$ reciprocated social ties, a mean geodesic distance (minimum number of direct or indirect edges connecting two nodes) of 9.4, an average degree of 10.1 network neighbors, and a giant component (the largest connected subgraph) containing 99.5% of all nodes (23).

Although the nature of this communication data limits causal inference, we were able to test the hypothesized correspondence between social network structure and economic development using the 2004 UK government’s Index of Multiple Deprivation (IMD), a composite measure of relative prosperity of 32,482 communities encompassing the entire country (24), based on income, employment, education, health, crime, housing, and the environmental quality of each region (25). Each residential landline number was associated with the IMD rank of the exchange in which it was located, as shown in Fig. 1.

We developed two new metrics to capture the social and spatial diversity of communication ties within an individual’s social network. We quantify topological diversity as a function of the Shannon entropy.

High diversity scores imply that an individual splits her time more evenly among social ties and between different regions.
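As a rough sketch, the entropy-based social diversity can be computed as the Shannon entropy of the shares of communication volume across an individual's contacts, normalized by the logarithm of the number of contacts: $D(i) = -\sum_j p_{ij}\log p_{ij} / \log k_i$, with $p_{ij} = V_{ij}/\sum_j V_{ij}$. The function name, the sample volumes, and the convention of returning 0 for fewer than two contacts are assumptions made for this example.

```python
import math


def social_diversity(volumes):
    """Normalized Shannon entropy of one person's call volumes per contact."""
    active = [v for v in volumes if v > 0]        # contacts with any volume
    k = len(active)
    if k < 2:
        return 0.0                                # assumed convention
    total = sum(active)
    entropy = -sum((v / total) * math.log(v / total) for v in active)
    return entropy / math.log(k)                  # scale to the range 0..1


# Hypothetical call volumes (e.g. minutes per contact), for illustration only.
even = social_diversity([10, 10, 10, 10])     # time split evenly
skewed = social_diversity([97, 1, 1, 1])      # one tie dominates
print(even, skewed)
```

An even split gives a score close to 1, while concentrating almost all volume on one tie pushes the score toward 0.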

Diversity was constructed as a composite of Shannon entropy and Burt’s measure of structural holes, by using principal component analysis (PCA). A fractional polynomial was fit to the data.

Eagle N, Macy M, Claxton R. Network diversity and economic development[J]. Science, 2010, 328(5981): 1029-1031.