Aug 2024

State of open data in Russia during the war

Since the beginning of the full-scale invasion of Ukraine in February 2022, Russian authorities have been regularly removing data from public access. Nearly 600 datasets disappeared from the "open data" sections of official websites of federal executive bodies. However, despite military censorship, access to many data fields is preserved. Openness infrastructure continues to operate due to the high inertia of the bureaucratic system and the middle-level bureaucrats' efforts to protect their turf.

This report was first published in the Russian Analytical Digest.

To Hide or Not to Hide

In the summer of 2023, following drone attacks on Moscow in May, Russian federal agencies conducted an inventory of public data. According to our sources in the government, the Ministry of Economic Development, which oversaw this task, set the goal of categorizing each dataset into one of three categories:

  • Critical, requiring immediate removal from the website (several dozen datasets);
  • Sensitive, requiring temporary removal from the website until it could be transferred to a closed circuit accessible through the “Gosuslugi” portal (about 2% of datasets);
  • Not requiring special action (more than 97% of datasets).

Datasets requiring immediate removal included information on infrastructure locations, such as lists of thermal power stations or power transmission lines, as well as statistics on oil, gas, and coal production. Sensitive datasets included other geodata (such as topographical plans or road infrastructure information), lists of infrastructure objects, and registries of licenses (e.g., for the turnover of alcoholic products or waste management)—information that, according to officials, could be used in planning military attacks or imposing sanctions.

Two weeks after the drone attacks on Moscow, a dataset with atmospheric characteristics by altitude, which could theoretically have been used in drone development, was removed from the website of Roshydromet, the federal agency responsible for monitoring and forecasting weather, climate, and environmental conditions. Later, departments gradually removed lists and registries of potentially attackable objects—power stations, combined heat and power plants, power lines, and other similar objects. A total of 36 federal agencies hid datasets with addresses of their institutions and territorial departments (out of 55 agencies that posted such data), among them not only security and infrastructure executive bodies, but also, for example, the Ministry of Education and Rosalkogoltabakkontrol, the agency responsible for regulating the production, distribution, and sale of alcoholic beverages and tobacco products.

This is just one of many examples of the increasing “closedness” of the Russian state. Since the beginning of the full-scale invasion of Ukraine in February 2022, Russian authorities have been regularly removing data from public access. According to our calculations, nearly 600 datasets have been removed from the “Open Data” sections of official websites of federal executive bodies in the ensuing 2.5 years. Another 360 datasets from federal executive bodies (Ministry of Defense, Federal Penitentiary Service, Ministry of Justice, Ministry of Economic Development, and Ministry of Sports) disappeared along with the state Open Data Portal, as these agencies did not repost the files on their own websites.

The number of datasets cannot be considered a universal measure of data openness, as they contain varying amounts of data. Some datasets contain a single indicator for a specific year, while others span 10–20 years. One-sixth of the removed files (101 datasets) contain only administrative data (addresses and phone numbers of institutions, lists of public events, or lists of information systems). Low informativeness and relevance are common characteristics of datasets that Russian authorities publish as “open data.” One-third of the remaining 1,800 files contain administrative data, and one-third have not been updated for two years or longer.

Data not formatted as “open” (machine-readable), including various registries, statistical reporting forms, and textual reports containing macroeconomic, financial, crime, and social benefits indicators, were also deleted, sometimes retroactively for all previous years, or ceased to be updated. The exact volume of hidden data is difficult to assess due to the chaotic nature of statistical and open data publication: files in different formats were placed in different sections of websites and sometimes appeared as interactive web widgets.

On the Path to Legitimacy

The fact that we now have a reason to talk about data closure indicates a fairly high level of openness in previous years. Over the last 30 years, the Russian state has gone through three phases regarding openness (Begtin et al., 2019):

  • 1991–2012: formation of the legislative base for implementing the concept of openness and open data; first projects on data disclosure (State Procurement); launch of the “Open Government” initiative.
  • 2012–2018: striving for maximum openness; “Open Government” and other institutions working for open data.
  • 2018 to present: abandonment of the previous concept; gradual transition to a paternalistic model of relations between power and citizens; creeping rollback of initiatives for open data.

Why did the Russian state move toward openness? The literature provides several potential answers, related to internal and external legitimacy and their uses by the governing authorities.

First, the Russian government was influenced by a desire for international legitimization and integration with supranational institutions. For instance, in 2012 Russia, then a member of the G8, expressed its willingness to join the Open Government Partnership, a multilateral initiative that secures commitments from national and sub-national governments to promote open government. In 2013, the OECD, in partnership with Rosstat, the governmental statistics agency, assessed the quality of Russian official statistics and their compliance with international standards, in particular the system of publication of statistical information (OECD 2013).

Second, the desire to attract foreign investment required the implementation of international transparency standards. Some autocracies may disclose information (especially related to economic performance) as a signal to the international community and potential investors. Maerz (2016) finds that economic globalization and international pressure stimulate non-competitive autocracies to publish some information and develop other factors promoting transparency, such as electronic government. Comparing several post-Soviet countries that have adopted Open Government Initiatives, the author concludes that the degree of transparency was higher in Russia than in other cases. The improving quality of governance correlates with foreign direct investment (FDI) in Russia, per World Bank data (World Bank n.d.). Hence, openness became a mechanism for ensuring the security of investments.

Third, the prospect of domestic legitimation encouraged the Russian government to disclose information and make it publicly available. Information flows are important for good governance (Islam 2006). Maerz (2016) shows that competitive authoritarian regimes adopt e-government and open data initiatives mainly for internal legitimation purposes. The government publishes not only datasets but also detailed information about the government itself, its responsibilities, legislative texts, and documents. All this information, together with established e-government processes, improves the quality of bureaucratic performance and consequently increases government approval. Beazer and Reuter (2019) analyze economic performance data and show that the Kremlin party United Russia is punished by voters for poor economic performance in places where mayors were appointed. This suggests that the state should care about such performance, which would explain its interest in collecting data to inform governance and decision-making.

Fourth, the authoritarian technocratic model of state governance presupposed intensive digitalization and reliance on expertise, for which data were also required. According to the theory of informational autocracy (Guriev and Treisman 2020), modern autocrats forego the use of ideology or mass repression, instead focusing on fully controlling the information sphere and creating the perception that they are competent economic managers. However, it can be hard to lie about economic indicators. As such, some autocrats actually work to improve the quality of government and state capacity, which goes together with a certain degree of accountability and the free flow of information. Given that there are few end users of the “raw data” nationally, it does not pose a great danger to the regime as long as media access thereto is controlled by the authorities.

At the same time, in regimes where elections do not play the primary role in selecting politicians at any level of power, accountability takes a different form. So-called “long route accountability” (Dewachter et al. 2018) presumes greater centralization of power and stricter bureaucratic oversight. In this model, bureaucrats are not directly responsible to citizens. Rather, citizens voice their dissatisfaction to higher-level politicians, who in turn influence outcomes for lower-level bureaucrats (usually through punishments ranging from formal reprimands to loss of resources).

The high level of digitization and centralization characteristic of state information systems leads to many details becoming available simply as a by-product of administrative processes—data become an artifact of the “digital paternalism” model.

In Russia, for example, there is a very high level of openness of judicial data. The “Justice” information system, launched in 2006, is used by courts across the country in their daily work, and also helps citizens monitor the judicial process. Simultaneously, researchers and journalists have the opportunity to collect this data and study the functioning of the judicial system. While the official module aggregating judicial data stopped working at the beginning of 2024, it remains possible to collect data directly from court websites. Journalists and researchers have developed special tools for this purpose, such as the judicial data parser of the “To Be Precise” project.

The peak of the movement toward openness was the creation of specialized institutions that were supposed to spearhead the openness agenda at the federal level. In February 2012, “Open Government” appeared, but in the six years of its existence, it did not receive either sufficient powers or sufficient funding, which ultimately made its work less effective than planned.

Nevertheless, thanks to “Open Government,” standards of openness for federal executive bodies were adopted and both the concept of “open data” and technical requirements for publication, lists, and procedures for data provision were defined. Federal authorities published 2,200 machine-readable datasets on their websites.

By the time of its closure for “technical maintenance” in March 2023, the Open Data Portal contained 27,000 datasets. Most of them (84%) were first uploaded during the period of “Open Government,” with updates peaking in 2017.

Overall, the portal was more often an object of criticism by researchers than a “flagship” of open data in Russia. As of early 2023, 60% of datasets had never been updated, 30% had never been downloaded, and only 2% (470 datasets) had been downloaded a hundred or more times.

But from 2018, the regime increasingly moved from the model of “accountability through openness” to a paternalistic model of interaction with citizens. In such a top-down model, open data practices, which involve transparency and free access to governmental data, were not considered a priority.

Reasons for Data Secrecy

The first signs of a rollback, or at least a slowdown, of the openness initiative emerged after the start of Vladimir Putin’s third term in 2012 and intensified following the annexation of Crimea in 2014. Confrontations with Western countries led Russia to lose interest in international legitimization through participation in supranational openness initiatives. In 2013, Russia postponed joining the Open Government Partnership (OGP), an organization created for international exchange of experiences in implementing principles of openness in government management.

In 2014, Russia withdrew from international cooperation in the field of openness, which had been one of the tasks of “Open Government.” At the same time, the country suspended negotiations to join the Organization for Economic Cooperation and Development (OECD). Following the dissolution of the G8 in 2014, there was no further mention of its Open Data Charter, which Russia had joined in the summer of 2013.

Another factor militating against openness was that it had facilitated anti-corruption investigations, which had become a major driver of Russian opposition politics in the 2010s and posed a serious threat to the regime. In 2016, the disappearance of the names of the sons of Prosecutor General Yuri Chaika from the real estate registry gained widespread attention. Shortly after an investigation by Aleksei Navalny and the Anti-Corruption Foundation (see chaika.navalny.com), their names were replaced with special codes. In 2017, amendments were made to the “Law on State Protection” that formally allowed officials to hide information about themselves and their families from public registries.

However, those factors did not lead to an abrupt change in trends. Inertia meant that a number of openness initiatives continued to develop for some time.

The year 2022 became a turning point. Since then, the scale of data closure has been unprecedented. Three main groups of indicators appear to have been targeted for closure.

  • Economic indicators that increase Russia’s vulnerability to sanctions. This category includes data that potentially facilitate the imposition of sanctions against the Russian state and business sectors. Six main groups of indicators have been closed, including macroeconomic and financial data, foreign trade, government procurement, state property, officials’ incomes, hydrocarbon extraction, production, and banking reports. At least 15 agencies have hidden 93 datasets. Meanwhile, the authorities often use very formal arguments, and the logic behind their actions is bureaucratic (for example, export and import data were hidden “to avoid speculation”). There is no way to assess to what extent the concealed data actually pose a danger and to what extent their concealment is lobbied for by interest groups (for example, companies that benefit from reduced transparency) or carried out by bureaucrats who seek to shield themselves from potential consequences.
  • War-related information used in journalistic investigations. There are much earlier examples of data being closed off after being used in an investigation. However, if previously such data were mostly related to corruption, now they concern any areas even indirectly related to the war. Here, the logic is driven by media popularity: data is removed not because of its specificity, but after it becomes the subject of a journalistic article. This category includes four groups of indicators: mortality from external causes, the number of disabled persons, the number of prisoners, and data on social benefits and allowances. At least six agencies have hidden 12 datasets. Much of this data had been used to indirectly assess the extent of Russian military losses in the war.
  • Data on social and economic issues. Since 2022, there has been significant movement toward hiding data that could potentially generate negative publicity for the government. This category is complex due to the swathes of data that have been obscured, making it difficult to estimate the exact number of datasets and indicators affected. This includes data on crime, microloans, environmental pollution, injuries in emergencies, and the condition of the aviation fleet. These data are not directly related to military actions but may reflect the negative impact of war and sanctions on Russian society. Possibly, some indicators have been “closed” preemptively, before they attract media attention and fall into category two.

Still Not a “Black Box”: How Can We Study Russia Despite Declining Transparency?

The authorities’ actions so far do not appear to amount to a deliberate strategy. Rather, government bodies react situationally to apparent or potential threats. Often, the deletion of datasets, especially technical ones, is more of a bureaucratic formality: the data are hidden inconsistently, with some entities removing everything and others only specific files. Moreover, deleted information can sometimes be found on websites in the form of text or tables.

The removal of a particular dataset is often the result not of a direct order from above but of a decision made by individual officials. For example, RosTrud, which for many years ranked as the most open government body, unexpectedly deleted over a dozen datasets about social payments, most of which were unrelated to the war.

The roll-back of openness initiatives has not yet led to outright data secrecy. The closure of data still has a gradual, albeit relentless, character. Despite military censorship, access to data pertaining to many policy domains is preserved.

Bureaucratic inertia plays a role here: some individuals responsible for open data, who have been in their positions since more democratic times, continue to publish information out of habit. This inertia and the continued operation of these institutions can sometimes counteract the trend toward decreased transparency. Additionally, the continued publication of data may represent an effort by mid-level bureaucrats to protect their turf (Bach 2021).

Some data cannot easily be removed from access because an infrastructure of state regulation and management is built around them. This infrastructure relies on the availability of such data to function effectively, making it challenging to restrict access without disrupting essential regulatory processes.

What’s more, although the Russian state is no longer seeking international legitimization and the movement toward openness has stalled, the search for sources of internal legitimacy and the technocratic nature of the state governance model relying on informatization provide hope that access to data will be preserved for some time. In the current climate, if citizens still have the right to access data, it is not because civil control is perceived as a good thing, but rather due to the “state as a service” paradigm, which implies that the state will help you solve problems if you use technocratic methods to influence it rather than political ones.

Claims about the poor quality of Russian data and widespread falsifications are also greatly exaggerated, as we demonstrate in our study “Can Russian Data Be Trusted? A Hazard Map of Official Statistics.” Typically, cases of direct manipulation of indicators (such as mortality statistics) are well-known to specialists. Most falsifications occur at the middle and lower levels due to attempts to implement centralized management by indicators, focusing on achieving strictly set target values (Kalgin 2016). At the federal level, as shown in 2022–2024, the state prefers to hide sensitive data rather than engage in direct distortion of statistics.

Additionally, new digital projects are emerging that simplify access to data or even open up new data that was previously unavailable (Kokorin et al. 2024). For example, OVD-Info publishes data on political repression, the RIMA project archives independent Russian media, and the Cedar project gives researchers access to electoral statistics, judicial data, and many other sources that can be obtained upon request.

For a detailed timeline of data closure in Russia, see the Appendix.

With a contribution from Evgeniya Mitrokhina, PhD candidate at the University of Wisconsin-Madison.

References

  • Tobias Bach. 2021. “Organizing Accountability in Transnational Governance: The European Commission’s Role in Negotiating and Implementing the EU’s Free Trade Agreements.” ARENA Working Papers 8/2021. University of Oslo.
  • Quintin H. Beazer and Ora John Reuter. 2019. “Who Is to Blame? Political Centralization and Electoral Punishment under Authoritarianism.” The Journal of Politics 81, no. 2: 648–662.
  • I. Begtin, V. Burov, A. Sakoyan, O. Parkhimovich et al. 2019. Otkrytost’ gosudarstva v Rossii. Ekspertnyi doklad. Schetnaia palata, ANO “Informatsionnaia kultura,” ANO “Tsentr perspektivnykh upravlencheskikh reshenii.” https://www.infoculture.ru/wp-content/uploads/2019/06/Otkrytost-doklad.pdf?ysclid=lv058qqcui899214467.
  • Cedar. 2024. “Can Russian Data Be Trusted? A Hazard Map of Official Statistics.” https://cedarus.io/research/russian-statistics.
  • Sara Dewachter, Annalijn Conklin, Nathaniel Mason, Lucinda Gosling, Susan Watt and Ellen Chappell. 2018. “Beyond the Short versus Long Accountability Route Dichotomy: Using Multi-Track Accountability Pathways to Study Performance of Rural Water Services in Uganda.” World Development 102: 158–169.
  • Sergei Guriev and Daniel Treisman. 2020. “A Theory of Informational Autocracy.” Journal of Public Economics 186: 104158.
  • Roumeen Islam. 2006. “Does More Transparency Go Along with Better Governance?” Economics & Politics 18, no. 2: 121–167.
  • Alexander Kalgin. 2016. “Implementation of Performance Management in Regional Government in Russia: Evidence of Data Manipulation.” Public Management Review 18, no. 1: 110–138.
  • Dmitrii Kofanov, Vladimir Kozlov, Alexander Libman and Nikita Zakharov. 2023. “Encouraged to Cheat? Federal Incentives and Career Concerns at the Sub-National Level as Determinants of Under-Reporting of COVID-19 Mortality in Russia.” British Journal of Political Science 53, no. 3: 835–860.
  • Dmitry Kokorin, Dmitriy Gorskiy, Elizaveta Zubiuk, and Tetiana Kotelnikova. 2024. “Known Unknowns: Studying Russia in Condition of Growing Non-Transparency.” International Foreign Relations Journal 55, no. 1: 23–42.
  • Seraphine F. Maerz. 2016. “The Electronic Face of Authoritarianism: E-government as a Tool for Gaining Legitimacy in Competitive and Non-Competitive Regimes.” Government Information Quarterly 33, no. 4: 727–735.
  • OECD. 2013. “OECD Assessment of the Statistical System and Key Statistics of the Russian Federation.” Accessed May 15, 2024. https://www.oecd.org/sdd/Assessment-of-the-Statistical-System-and-Key-Statistics-of-the-Russian-Federation.pdf.
  • OVD-Info. 2024. “Reports and Data on Political Persecution in Russia.” https://en.ovdinfo.org/reports.
  • RIMA Media. 2024. “Russian Independent Media Archive.” https://rima.media/en.
  • World Bank. n.d. “Foreign Direct Investment, Net Inflows (% of GDP).” https://data.worldbank.org/indicator/BX.KLT.DINV.WD.GD.ZS.