Yesterday I successfully defended my dissertation! It was very exciting, after five years’ worth of work, to see the project come together so well. Many thanks go to my committee, Dr. Thomas E. Flores, Dr. Agnieszka Paczynska and Dr. Todd LaPorte (who was remote, so missed the picture) – all three played a huge role in guiding me through the arduous process of getting from coursework, to proposal, to final product.
Countless family, friends and professional colleagues played a major role in this project as well. The list would be too long for one post, but I give my sincerest thanks to all of you who gave feedback, listened to new ideas, and were there to provide encouragement when it got hard – this project wouldn’t have gotten finished were it not for all of you!
The events in Turkey last night were nothing short of astounding – the world watched a NATO country, in which all was normal as late as happy hour, descend into political chaos as a coup was attempted, then return by morning to a tenuous balance with President Erdoğan still (apparently) in charge. While the final outcome was driven by loyalists being more militarily and politically powerful than the anti-Erdoğan contingent, the population’s perceptions about where authority lay, and thus what action to take, were critical as well. The process of meaning making among the population about what was going on, and the importance of both mass communication and authority in settling the events, mirrored some of the findings in my dissertation research. People maximize their information gathering after a shock to make meaning from events, and as the information cycle evolves, the authority of sources is identified and collective decisions are made. Last night’s events were live tweeted, shared on Facebook in real time, and broadcast through all manner of media throughout the night. This culminated with President Erdoğan taking to FaceTime to give an interview and reassert his control over the country.
At the end of the night it wasn’t the broadcast media that directly beamed Erdoğan’s message out; it was him on an iPhone, FaceTiming remotely. In the partial information of the social media and news churn, the person endowed with the authority to make decisive calls cut through and focused both the discussion and the collective action going forward. The medium that he used was secondary to the importance of having a voice of authority broadcast into a chaotic information environment. While the situation is still fluid, a quick check of the BBC, Washington Post, NY Times, LA Times, Süddeutsche Zeitung and Paris Match front pages shows that all of them report the coup has failed. That’s the power of authority, even in a complex media churn.
Kieran Healy, a sociologist at Duke University, had an interesting take on the role of internet-based media in this coup. He points out that there were people downplaying the role of social media and broadcast technology in preventing the coup, and he counters the argument with an interesting comparative analysis of King Juan Carlos’s role in stopping the attempted 23-F coup in Spain in 1981. But what really caught my eye in his post was his discussion of the importance of mass communication in supporting collective action processes. Social media and the digital information environment played a huge role in how this attempted coup played out, and the interplay between authority and information medium was key in this process.
My dissertation research looked specifically at people’s preferences for information sources and mediums during shocks, such as election violence, natural disasters, or in this case an attempted coup. Social scientists, such as James Fearon and David Laitin, know that people on the whole don’t like chaos and in most cases will find ways to cooperate and maintain stability. In my research people do this by developing a common conception of the event, then identifying the sources of authority and the mediums on which to find their message. In a modern, hyper-connected digital environment people can now participate in massive collective action processes because everyone has multiple options for information gathering and sharing. This connectivity keeps people involved in a collective meaning making process – even when people didn’t know exactly what was going on throughout the night, they were engaged and the narrative remained fluid. In the case of Turkey the military could never consolidate the message.
With a fluid narrative, people wait to consolidate into a collective action – there’s not enough information to decide whether to submit to the military or stick with the government. Overall it seems people preferred the government, and in spite of a broadcast media shutdown, once Erdoğan got his message out it spread quickly and provided enough information symmetry to turn the collective tide against the military faction behind the coup attempt. What last night’s attempted coup demonstrated is the importance of digital media in preventing the military from consolidating the narrative enough to control the populace, as well as the power of authority to cut through a chaotic information space and solidify collective action during a shock to the political and social system.
I watched from a distance on Twitter as the World Bank hosted its annual data event. I would love to have attended – the participants were a pretty amazing collection of economists, data professionals and academics. This tweet seemed to resonate with a theme I’ve been focused on the last week or so: There is a data shortage such that even the most advanced countries can’t measure the Sustainable Development Goals (SDGs).
I replied to this tweet with a query about whether there was evidence of political will among EU member states to actually collect this data. In keeping with the “data is political” line that I started on last week, political will is important because the European Statistical System relies heavily on EU member states’ statistics offices to provide data. The above tweet highlights two things for me – there needs to be a conversation about where the existing data comes from, and there need to be MPs or MEPs (legislative representatives) at meetings like the World Bank’s annual data event.
Since Eurostat and the European Statistical System were the topic of the tweet, I’ll focus on how they gather statistics. Most of my expertise is in their social and crime stats so I’ll speak to those primarily, but it’s important to note that the quality and quantity of any statistic are based on its importance to the collector and end user. Eurostat got its start as a hub for data on the coal and steel industries in the 1950s, and while its mandate has grown significantly the quality and density of the economic and business indicators hosted on its data site reflect its founding purpose. Member states provide good economic data because states have decided that trade is important – there is a compelling political reason to provide these statistics. Much of this data is available at high levels of granularity, down to the NUTS 3 level. It’s mostly eye-wateringly boring agricultural, land use, and industrial data, but it’s the kind of stuff that’s important for keeping what is primarily an economic union running smoothly(-ish).
If we compare Eurostat’s economic data to its social and crime data, the quality and coverage decrease notably. This is when it’s important to ask where the data comes from and how it’s gathered – if 2/3 of the data necessary to measure the SDGs isn’t available for Europe (let alone, say, the Central African Republic) we need to be thinking clearly about why we have the data we have, and the values that undergird gathering good social data. Eurostat statistics that would be important to measuring the SDGs might include the SILC surveys that measure social inclusion, and general data on crime and policing. The SILC surveys are designed by Eurostat and implemented by national statistics offices in EU member states. The granularity and availability vary depending on the capacity of the national stats office and the domestic laws regarding personal data and privacy. For example, some countries run the SILC surveys at the NUTS 2 level while others administer them only at the national level. A handful of countries, such as France, do the surveys at the individual level and produce panel data. The problem is that the SILC data has mixed levels of availability due to national laws regarding privacy – for example, if you want the SILC panel data you have to apply for it and prove you have data storage standards that meet France’s national laws for data security.
Crime and police data is even more of an issue. Eurostat generally doesn’t collect crime data directly from member states. They have an arrangement with the UN Office on Drugs and Crime where crime and police data reported to the UN by EU member states gets passed to Eurostat and made available through their database. One exception is a dataset of homicide, robbery and burglary in the EU from 2008-2010 that is disaggregated down to the NUTS 3 level. When I spoke with the crime stats lead at Eurostat about this dataset he explained that it was a one-off survey in which Eurostat worked with national statistics offices to gather the data; in the end it was so time consuming and expensive that it was canceled. Why would such a rich data collection process get the axe? Because it’s an established fact that crime statistics can’t be compared across jurisdictions due to definitional and counting differences. So funders reasonably asked: What’s the point of spending a lot of money and time collecting data that isn’t comparable in the first place?
A key problem I see in the open data discussion is a heavy focus on data availability with relatively little focus on why the data we have exists in the first place, and by extension what would go into gathering new SDG-focused data (e.g. the missing 2/3 noted in the opening tweet). Some of this is driven by, in my opinion, an overconfidence in, or fetishization of, ‘big data’ and crowdsourced statistics. Software platforms are important if you think the data availability problem is just a shortage of capacity to mine social networks, geospatial satellite feeds and passive web-produced data. I’d argue though that the problem isn’t collection ability, and that the focus on collection and validation of ‘big data’ distracts from the important political discussion of whether societies value the SDGs enough to put money and resources into filling the 2/3 gap with purpose-designed surveys instead of mining the internet’s exhaust hoping to find data that’s good enough to build policy on.
I’m not a Luddite crank – I’m all for using technology in innovative ways to gather good data and make it available to all citizens. Indeed, ‘big data’ can provide interesting insights into political and social processes, so finding technical solutions for managing reams and reams of it is important. But there is something socially and politically important about allocating public funds for gathering purpose-designed administrative statistics. When MPs, members of Congress, or MEPs allocate public funds they are making two statements. One is that they value data-driven policy making; the other, more important in my opinion, is that they value a policy area enough to use public resources to improve government performance in it. For this reason I’d argue that data events which don’t have legislative representatives featured as speakers are missing a key chance to talk seriously about the politics of data gathering. Perhaps next year instead of having a technical expert from Eurostat tell us that 2/3 of the necessary data for measuring the SDGs is missing, have Marianne Thyssen, the Commissioner for Employment, Social Affairs and Inclusion whose portfolio covers Eurostat, come and take questions about EU and member state political will to actually measure the SDGs.
The World Bank’s data team, as well as countless other technical experts at stats offices and research organizations, are doing great work when it comes to making existing data available through better web services, APIs, and open databases. But we’re only having 50% of the necessary discussion if the representatives who set budgets and represent the interests of constituents aren’t participating in the discussion of what we value enough to measure, and what kind of public resources it will take to gather the necessary data.
The last two posts I wrote focused on the social and political structures that drive data collection and availability. In these posts I was primarily talking about statistics in wealthy countries, as well as developing countries that aren’t affected by conflict or violence. When it comes to countries that are beset by widespread conflict and violence, all the standard administrative structures that would normally gather, process and post data are either so compromised by the politics of conflict that the data can’t be trusted, or worse they just don’t exist. Without human security and reliable government structures, talking about data selection and collection is a futile exercise.
Conflict data, compared to other administrative data, is a bit of a mash-up. There are long term data collection projects like the Correlates of War project and the UCDP data program, both of which measure macro issues in conflict and peace such as combatant types, conflict typologies, and fatalities. Because both projects have long timelines in their data they are considered the best resources for quantitatively studying violence and war. Newer data programs include the Armed Conflict Location and Event Data project and the Global Database of Events, Language, and Tone. These projects take advantage of geographic and internet-based data sources to examine the geographic elements of conflict. There are other conflict data projects that use communication technologies to gather local-level data on conflict and peace, including Voix des Kivus and the Everyday Peace Indicators project.
This is just a sample of projects and programs, but the main thing to note is that they are generally hosted by universities and the data they gather is oriented toward research as opposed to public administration. Administrative data is obviously a different animal than research data (though researchers often use administrative data and vice versa). To be useful it has to be consistent, statistically valid in terms of sampling and collection technique, and available through some sort of website or institutional application. If the aim of the international community is to measure the twelve Goal 16 Targets in the Sustainable Development Goals, particularly in countries affected by conflict, international organizations and donors need to focus on how to develop the structures that collect administrative data.
We can look to existing models of how to gather data, particularly sensitive data on things like violence. Household surveys are a core tool for gathering administrative data, but gathering representative samples takes a lot of work. It also requires a stable population and reliable census data. For example, if a statistical office gets tasked by a ministry of justice to run a survey on crime victimization, the stats office would need to interview as many victims as possible to develop sampling tranches. The U.S. Bureau of Justice Statistics National Crime Victimization Survey is an excellent example of a large-scale national survey. One only needs to read the methodology section to grasp how large an undertaking this is; the government needs the capacity to interview over 150,000 respondents twice a year, citizens need to be stable enough to have a household, and policing data needs to be good enough at the local level to identify victims of crime. Reliable administrative statistics, especially about sensitive topics like crime victimization and violence, require functional government, stable populations, and effective local data collection capacity.
While many countries can measure the Goal 16 Targets, countries affected by conflict and violence (the ones that we should be most interested in from a peacebuilding perspective) fundamentally lack the political and social structures necessary to gather and provide reliable administrative data. Proposing a solution like “establish a functioning state with solid data collection and output processes at the local and national level” sounds comically simplistic, but for many conflict-affected states this is the level of discussion – talking about what kind of data to collect is an academic exercise unless issues of basic security, population stability and institutional capacity are dealt with first.
I published a post yesterday about how administrative data is produced. In the end I claimed that data gathering is an inherently political process. Far from being comparable, scientifically standardized representations of general behavior, public data and statistics are imbued with all the vagaries and unique socio-administrative preferences of the country or locality that collects them.
Administrative criminal statistics are an interesting starting point if someone wants to understand how data reflects the vagaries of administrative structures. If someone thought “I would really like to compare crime rates across European Union member states” they would probably be surprised to learn that unless they just compare homicide rates it’s impossible to compare crime rates between countries. This is not only because definitions of different crimes are different between countries (though the UNODC has done a lot of work to at least standardize definitions), but the actual events of crime are counted differently. For example, Germany uses what’s called “principal offense” counting – this means that in the event that multiple crimes are committed at the same time, the final statistics only count the most serious crime. Belgium doesn’t use this counting method, so its crime statistics look much higher than Germany’s on paper. The University of Lausanne’s Marcelo Aebi, arguably the expert on comparative criminal statistics, published an excellent paper on comparing criminal statistics and the problems posed by different counting procedures (pages 17-18 for those who just want the gist).
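To make the counting difference concrete, here is a toy sketch in Python. It is not how any national statistics office actually tabulates crime; the incidents and the severity ranking are made up purely for illustration. It simply counts the same set of events twice, once counting every offense and once under a “principal offense” rule.

```python
from collections import Counter

# Hypothetical severity ranking (higher = more serious); illustrative only.
SEVERITY = {"homicide": 3, "robbery": 2, "burglary": 1}

# Hypothetical incidents: each inner list holds the offenses committed in one event.
incidents = [
    ["burglary", "robbery"],
    ["burglary"],
    ["robbery", "homicide", "burglary"],
]


def count_all_offenses(incidents):
    """Count every offense in every incident."""
    return Counter(offense for incident in incidents for offense in incident)


def count_principal_offense(incidents):
    """Count only the most serious offense in each incident."""
    return Counter(max(incident, key=SEVERITY.__getitem__) for incident in incidents)


all_counts = count_all_offenses(incidents)
principal_counts = count_principal_offense(incidents)

print("All offenses:", sum(all_counts.values()), dict(all_counts))
print("Principal offense only:", sum(principal_counts.values()), dict(principal_counts))
```

The underlying events are identical, yet the first method reports six offenses and the second reports three. That gap comes entirely from the counting rule, which is why totals produced under different rules can’t be lined up side by side.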
Aebi makes a crucial point in the conclusion of his article: Statistics are social constructs and each society has a different way of constructing them. Statistics represent the things we have valued. The past tense is important here; when we see data it’s showing us the past (the 2016 Global Peace Index uses numbers from 2015, for example), and thus represents what we valued at the time. Data can be used to build and test models of potential future events, but there is no such thing as ‘future data’. The value in data is that it can help citizens and policy makers understand what worked, or didn’t work, so that policies and behaviors can be adjusted going forward.
Of course institutional and administrative behavior is often resistant to trends in data (or very comfortable with data that supports the status quo). This can be for valid, or at least non-nefarious, reasons. For example the Sustainable Development Goals (SDGs) rely heavily on GDP as an economic indicator. The SDGs are supposed to represent sustainable growth and social development into the future, so it’s interesting that they use an economic indicator that many experts and organizations view as quite flawed.
Why would the SDGs rely so heavily on GDP then? For one, it’s a reliable indicator – everyone at least has some vague idea of what it represents. Two, it’s got a long history – we have tracked it for decades. Three, most of the people who created the SDGs come from backgrounds where GDP is a standard indicator – they pick targets and data based on their professional and institutional experience. They didn’t do this because they’re jerks. They did it because GDP represents the standard, if flawed, way that we measure economic performance. They probably also did it because gathering new data is an expensive, time-consuming process that everyone says is important [for someone else to pay for].
This is all to say: If you want better public data, or to at least understand why the public data you have seems to reflect the status quo instead of telling you how to break out of it, it’s imperative to understand the qualitative political, social and administrative behaviors inherent to the place or people you’re researching. Once you’ve got that, you can start the political process of organizing the resources to get newer, better data to formulate newer, better public policy and/or smash the status quo.
The 2016 Global Peace Index (GPI) launched recently. Along with its usual ranking of most to least peaceful countries it included a section analyzing the capacity for the global community to effectively measure progress in the Sustainable Development Goals (SDGs), specifically Goal 16, the peace goal. The GPI’s analysis of statistical capacity (pp. 73-94) motivates a critical question: Where does data come from, and why does it get produced? This is important, because while the GPI notes that some of the Goal 16 targets can be measured with existing data, many cannot. How will we get all this new data?
Some of the data necessary to measure the Targets for Goal 16 is available. I’d say the GPI’s findings can probably be extended to the other goals, so we’ll imagine for the sake of argument that we can measure 50-60% of the 169 Targets across all the SDGs with the data currently available globally. How will we get the other 40-50%? To deal with these questions it’s important to know who collects data: The primary answer is of course national statistics offices. These are the entities tasked by governments with managing statistics across a country’s ministries and agencies, as well as doing population censuses. Other data organizations include international institutions and polling firms. NGOs and academic institutes gather data too, but I’d argue that the scale of the SDGs means that governments, international organizations and big polling firms are going to carry the primary load. Knowing the Who, we can now get to the How.
National statistics offices (NSOs) should be the place where all data that will be used for demonstrating a nation’s progress toward goals is gathered and reported. In a perfect world NSOs would have necessary resources for collecting data, and the flexibility to run new surveys using innovative technologies to meet the rapidly evolving data needs of public policy. This is of course not how NSOs work. Much of what happens in a statistics office is less about gathering new data, and more about making sure what exists is accessible. In my experience NSOs have a core budget for census taking, but if new data has to be collected the funding comes from another government office. This last bit is important: NSOs do not generally have the authority to go get whatever data is necessary. If NSOs are going to be the primary source for data that will be used to measure the SDGs, it is critical that legislatures provide funding to government offices for data gathering.
International organizations are the next place we might look to for data. The World Bank, in my opinion, is the gold standard for international data. United Nations agencies also collect a fair amount of data. What sets the Bank apart is that they do some of their own data collection. Most international organizations’ data though is actually just NSO data from member states. For example, when you go to the UN Office on Drugs and Crime’s database, most of what you’ll find are statistics that were voluntarily reported by member states’ statistical offices. The UN, World Bank, OECD and myriad other organizations do relatively little of their own data gathering; much of their effort is spent making sure that the data they are given is accessible. Unless legislatures in member states provide funding to government agencies to gather data, and the government agrees to share the data with international organizations, most international institutions won’t have much new data.
Polling firms such as Gallup gather international survey data that is timely, accurate and covers a wide range of topics relevant to the SDGs. Unfortunately their data is expensive to access. As for-profit entities they have a level of flexibility to gather new data that statistics offices don’t, but this level of flexibility is very expensive to maintain. A problem arises too when Gallup (and similar firms) decide that the data necessary to measure the SDGs is not commercially viable to gather and sell access to. In this case legislatures would need to provide funding to government agencies to hire Gallup to gather data that is relevant to measuring progress toward the SDGs.
There is a pattern in the preceding paragraphs. All of them end with the legislature or representative body of government having to provide funding for data gathering. How we gather data (the funding, budgeting, administration, and authority) is entirely political. This is a key issue that gets lost in a lot of discussion around ‘open data’ and demands for data-driven policy making. It is too easy to fall into a trap where data gets treated as a neutral, values-free thing, existing in a plane outside the messy politics of public administration. The Global Peace Index does a good service by highlighting where there are serious gaps in the necessary data for tracking the SDG Targets. This leads us to the political question of financing data collection.
If the UN and the various stakeholders who developed the Sustainable Development Goals can’t make the case to legislatures and parliaments that investments in data gathering and statistical capacity are politically worthwhile, it is entirely likely that the SDGs will go unmeasured and we’ll be back around a table in 2030 hacking away at the same development challenges while missing the harder conversation about the politics necessary to drive sustainable change.
I’ve arrived and settled into Munich until early July, and along with a few trips to other parts of Germany and Brussels, it should be a good stay on the Continent. At the moment version two of the dissertation is under review, so hopefully by early June I’ll have feedback and an idea of when a defense will be scheduled. *Fingers crossed*
In the meantime, I’m gearing up for the academic job market with a search primarily focused on Europe. The fun part of the search will be developing job market papers, of which a few are underway:
“Peace Durability, ICTs and Peacekeeping: Technology investments and post-conflict economic growth through peacekeeping operations,” with Nicholas Bodanac.
My colleague Nicholas Bodanac and I propose that peacekeeping missions’ use of ICTs can play a significant role in spurring economic development, by providing initial capital to encourage ICT infrastructure investment in post-conflict settings. This initial investment in ICT infrastructure can continue to have benefits, as education, business and government make increasing use of communications systems over time. This can support economic and political development, contributing to more durable peace during and after a peacekeeping mission.
“Peacekeeping and the ‘Crowd’: How does Crowdsourcing Technology Support United Nations Peacekeeping Operations?”
I presented this paper at ISA back in 2014: “Multi-dimensional peacekeeping operations have evolved significantly since the late 1990s, but the capacity to meet the needs of local populations has lagged behind the political expectations of operational capacity. Efforts to improve data on local-level violence in Liberia, and local reporting of violence via SMS text messaging in eastern Democratic Republic of Congo have demonstrated the United Nations Department of Peacekeeping Operations’ (UNDPKO) interest in improving local-level information collection and public opinion. This paper will use these two cases to frame methods for localized data collection, then extend them by discussing how different crowdsourcing technologies and methods can be used by the UNDPKO. It will close with an analysis of the theoretical issues associated with technology-aided peacekeeping, and policy challenges that come with integrating crowdsourcing into the UNDPKO’s information collection and management processes.”
“Is the (Technology) Tail Wagging the (Development) Dog: How do private technology partners affect the goals of development programming?”
I’m revisiting this paper, which I presented in Singapore in early 2015: “Communications technologies (ICTs) have played an increased role in development programming since the early 2000s. Chief among these technologies are mobile phones, which have been integrated into everything from public health and education to disaster management and public administration. The key question from the standpoint of how these programs affect local populations is whether they are designed based on the needs of the beneficiaries, or on path-dependent organizational decisions to use particular technologies based on popular trends or commercial availability. This paper will explore the political economy of using privately held technology and data services for international development, laying out theoretical and policy implications related to privacy issues, digital divides and ownership.”
“When Information Becomes Action: Information Technology Use in Kenya and Samoa when Managing Collective Action Problems During Crisis.”
Basically my dissertation research, but shortened and tightened up for a comparative politics journal.
“Are We Innovating Yet? Understanding how organizations and project leaders design crowdsourcing programs in conflict settings”
Also from my dissertation, a deeper look at a selection of crowdsourcing projects implemented for disaster response, conflict analysis, and violence prevention. The main question I ask is: How have organizations and project designers integrated mobile technologies into their interventions, and have these programs led to institutional innovation in humanitarian response and violence prevention? Does the use of these types of technologies lead to greater interaction between local and international actors, or is crowdsourcing merely an alternative mechanism for surveillance and data collection?
Other than writing, I’ll be up in Berlin getting acquainted with the peacebuilding/conflict research scene June 7-11, and I’ll be in Brussels June 22-26 for the European Political Science Association meeting presenting the paper on peace durability, ICTs and peacekeeping. Should be a fun June!
After a lovely year in Sydney as a research fellow with IEP I’ll be headed back to the Northern Hemisphere to finish my dissertation. I should be defending it this summer – once it’s done, it’ll be on to new and exciting research!
This also means that I now have the freedom and time to start posting here again. While the year was fun it was also busy, which meant limited time to blog. I’m planning on some data-oriented posts, which will be fun.
The schedule for the next month or two includes presenting at the Midwest Political Science Association meeting, hopefully popping into the Tech4Dev conference on short notice, giving a paper with the inimitable Nicholas Bodanac at the European Political Science Association meeting in June, and hopefully circling all the way back to the Build Peace Conference in September. If you’re going to be at any of these, or anywhere in between, give me a shout!