March 2024

How to uncover electoral fraud in Russia using statistics: a complete guide

Vladimir Putin won the presidential election for the fifth time, receiving 87% of the vote, a record in the history of modern Russia, on an equally record turnout of 77%. With these came a record level of falsifications. Cedar’s team has studied the detailed results of 12 federal elections during Putin's reign and documented various types of violations using electoral statistics methods. This is how Russian elections transformed from being relatively fair to a complete imitation of the voting process.

In 2024, Vladimir Putin received 22 million anomalous votes
The first elections without an "honest core"
The number of "honest" regions fell fourfold compared to the year 2000
How the Kremlin's electoral strategy has changed over the last 24 years
What does the record scale of falsifications in the 2024 elections indicate
How do mathematical methods for detecting electoral fraud work

In 2024, Vladimir Putin received 22 million anomalous votes

Vladimir Putin won the presidential election for the fifth time, receiving 87% of the vote, a record in the history of modern Russia, on an equally record turnout of 77%. For the first time since 2008, there were only four candidates for the presidency, with the winner leading the nearest competitor by 83 percentage points.

No anti-war candidates were allowed to participate in the election. In February, Russia’s Central Election Commission (CEC) denied registration to Boris Nadezhdin, declaring just over nine thousand of the 105,000 signatures collected by the politician's team invalid. Another candidate, a journalist and former deputy of the Rzhev City Council, Ekaterina Duntsova, who called for stopping military actions in Ukraine, was not even allowed to collect signatures. The commission found 100 errors in the statements of her initiative group, in some cases involving a typo of a single character.

Analysis of data from 93,000 polling stations indicates at least 22 million anomalous votes cast for Putin in the recent presidential election. Excluding electronic voting, this figure accounts for 35% of all votes for Putin.

To assess the share of anomalies in Putin's results, we used the method of physicist and electoral statistics expert Sergey Shpilkin, as well as other statistical tools. Shpilkin's method helps identify how many votes were added to the winning candidate through vbrosy (ballot stuffing: an election fraud technique in which ballots are added to the ballot box on behalf of absent voters, either during voting or when the final protocols are filled out). We detail the methodology for counting anomalous votes in the last chapter.

Our assessment of anomalous votes likely represents the lower bound of the actual level of violations. We did not include the format of remote electronic voting (REV), available in 27 subjects of the Russian Federation, as well as in occupied Crimea and Sevastopol, in our calculations. REV sites cannot be analyzed alongside offline sites because the concept of turnout has a different meaning: it refers to those who pre-registered for REV and ultimately used it. Therefore, turnout for REV is typically above 90%, but it cannot be compared with offline sites.

Despite the challenges in quantitatively analyzing the scale of falsifications, we can still assert that the 2024 elections surpassed all previous presidential and parliamentary elections in terms of falsifications. In almost every previous election campaign, vote manipulations were actively used to promote state-backed candidates.

The first elections without an "honest core"

In our analysis, we relied on electoral data from twelve federal elections held from 2000 to 2024, including the vote on amendments to the Russian Constitution in 2020. This data covers more than one million polling stations and contains the exact numbers of registered voters and of those who voted, as well as the final distribution of votes. This allowed us to assess the volume of falsifications over the entire 24-year period, both at the federal level and by region.

A unique feature of the 2024 elections was their excessive "contamination" – there were so many falsifications that standard analysis methods come with caveats. The "honest core" of votes, which is necessary to analyze anomalies, is virtually absent. Most votes on the histogram of distribution are concentrated in the area usually considered to be vbrosy – in the "tail."

In a fair election, the turnout and the leader's result should show a distribution close to normal – a dome-shaped curve with a clearly expressed peak. As shown in the graph below, this is exactly how the 2000 presidential elections looked. In terms of falsifications, they indeed showed the minimum level of election fraud – 1.9 million votes were added to Putin's result. At that time, falsifications were only carried out in individual regions, mainly the republics of the North Caucasus.

In the case of mass vbrosy, the distribution changes shape: since vbrosy increase both turnout and the result, the right part of the distribution "lifts" upwards. This is what we observe in the histograms for 2004, 2008, 2020, and 2024: a typical picture for elections where vbrosy were actively used.

Sharp peaks at round values of turnout are the result of another falsification technique: drawing up protocols, or risovki. Where drawing was actively used, the distribution shows a phenomenon analysts in Russia called the "Churov saw" (named after Vladimir Churov, who headed the Central Election Commission from 2007 to 2016). Usually, both techniques are used simultaneously. The year 2008 is indicative in this regard: the entire right part of the distribution of votes for the president is heavily "distorted" by both vbrosy and risovki.

Our analysis indicates that the scale of falsifications in federal elections increased from one electoral cycle to another, with parliamentary elections being falsified more actively than presidential ones. This is likely because the ruling party traditionally has a lower trust rating in society than the president. In this respect, 2024 was exceptional – the number of anomalous votes in the presidential election broke records for all parliamentary campaigns.

The number of "honest" regions fell fourfold compared to the year 2000

The 2024 elections broke records in terms of regional falsifications as well. In 41 subjects of the Russian Federation, the "honest core" of votes was virtually invisible, typically indicating an extremely high level of violations that could not be measured. Among the regions where a quantitative assessment was still possible, the Voronezh region led in terms of election falsifications, with nearly half of all votes (48%) for Putin being anomalous. The Samara (39%), Lipetsk (38%), and Kaliningrad regions (34%), as well as the Republic of Buryatia (36%), came close to this mark.

There were only 15 regions where the share of falsified votes was less than 10% in the recent elections. These included Arkhangelsk (2%), Kirov (4%), and Tomsk regions (4%), as well as Altai Krai (5%). Surprisingly, according to Shpilkin's method, few falsifications were found in the Moscow region (2%), although in the 2021 parliamentary elections and the voting on constitutional amendments, the region was one of the most "unfair" in the country.

This time, other falsification techniques, less detectable by Shpilkin's method, might have been used there, such as transferring votes from losing candidates to the winner. Moreover, at least 799 thousand residents of the Moscow region voted electronically — the highest figure in the country after Moscow, where REV was used by 71% of the population or more than 3.5 million of all voters.

The degree of regions' involvement in falsifications varied depending on the campaign. Many of them repeatedly moved from being regions with a noticeable level of falsifications to being heavily falsified, and vice versa. This depended on the political situation, personnel decisions, and the level of opposition activity in the elections.

Regions with less than 10% of vbrosy were considered to have relatively fair elections. Most regions failed to stay in this category throughout all 24 years of Putin's rule. Here are those that nearly managed:

Sverdlovsk region;
Khabarovsk region (except for 2016, but the excess of 3 percentage points could be caused by the method's margin of error);
Vologda region (except for 2020);
Karelia (except for 2011);
Yaroslavl region (until 2021).

Regions with 10% to 30% of vbrosy from the final result for the state-backed candidate were considered "noticeably falsified". This group consistently included about 15 regions, among them:

Saint Petersburg
Arkhangelsk region
Kostroma region
Orenburg region
Novgorod region

Regions with more than 30% of vbrosy were characterized as "heavily falsified". Such levels of falsifications were often shown by Chuvashia, Primorsky Krai, Tuva, Penza, and Rostov regions. Since 2020, Pskov, Khakassia, and Chuvashia regions moved to this category.

Regions with total manipulations consistently included some ethnic republics – Kabardino-Balkaria, Karachay-Cherkessia, Dagestan, Chechnya, Ingushetia, from 2004 – Ossetia and Tatarstan. Whereas in 2000 there were only four totally falsified regions, by 2024 their number had increased tenfold.
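The thresholds used in this section can be summarized in a few lines of code. This is only an illustrative sketch: the function name and the use of `None` for regions where no estimate is possible are assumptions made for the example, not part of the original analysis.

```python
def classify_region(anomalous_share):
    """Map the share of anomalous votes (in percent of the leader's result)
    to the categories described in the text.

    `None` stands for regions where the "honest core" is not visible and
    no quantitative estimate is possible (an assumption of this sketch).
    """
    if anomalous_share is None:
        return "total manipulations"
    if anomalous_share < 10:
        return "relatively fair"
    if anomalous_share <= 30:
        return "noticeably falsified"
    return "heavily falsified"


# Figures quoted in the article for 2024
print(classify_region(2))     # Arkhangelsk region
print(classify_region(48))    # Voronezh region
print(classify_region(None))  # a region with no visible honest core
```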

How the Kremlin's electoral strategy has changed over the last 24 years

2000 — 2008: From anomalous regions to the abolition of gubernatorial elections

The 2000 presidential election showed the least amount of falsifications throughout Putin's era. Then, the manipulations were geographically concentrated in the national republics (predominantly in Dagestan, Tatarstan, and Bashkortostan), but the volume of anomalies at the federal level, by our estimates, did not exceed 5% of all votes for Putin or 1.8 million votes. The distribution of turnout and the leading candidate's result in these elections was close to normal and resembled a "bell" with almost symmetrical edges.

During Vladimir Putin's first presidential term, the federal government consolidated control over regional elites and electoral procedures. Just three years later, in the 2003 parliamentary elections, the amount of falsifications grew to 16% of votes for "United Russia". By his second term, the country saw the formation and expansion of a cohort of regions with ubiquitous manipulations: against the backdrop of general apolitical sentiment during elections, they consistently demonstrate high voter turnout and record votes for pro-government candidates. From 2003 to 2004, their number increased threefold, to 13. This group then included not only the republics of the North Caucasus, Mordovia, and Tatarstan, where signs of open competition were never observed, but also Kemerovo, Oryol, Kalmykia, and the Yamalo-Nenets regions.

Having been re-elected for a second term in 2004, Vladimir Putin signed a federal law abolishing direct elections for regional heads. This radical revision of the federalism principle was justified by "anti-terrorism" considerations – the president logically linked the reform to the Beslan tragedy, threats of territorial losses, and the need to strengthen federal control over rebellious national republics. Since then, providing the ruling party with the necessary number of votes in elections has become one of the key unofficial criteria for the effectiveness of appointed governors.

By the 2007 parliamentary elections, falsifications became a nationwide trend, and the regime in Russia, according to the Polity IV index, finally moved from the category of "electoral democracies" to "competitive authoritarianism". The share of regions with a level of falsifications above 10% doubled, and the number of regions with relatively "honest" elections fell to 23, home to 39% of voters. According to our calculations, in 2007, "United Russia" received at least 12 million anomalous votes. Dmitry Medvedev collected the same amount of anomalous votes in the 2008 presidential elections.

A distinctive feature of the 2008 presidential campaign was the record turnout at the time: 76.3%. A key role in achieving it was played by just 3,793 precinct electoral commissions (PECs). They recorded anomalously round turnout figures, that is, multiples of five: 65%, 70%, 75%, 80%, 85%, and 90%. In other words, results that in a fair count are no more likely to appear than any other value (such as 65.23% or 73.54%) were reported by 4.1% of precinct commissions in 2008.

Disregarding the scenario in which several thousand PECs in different parts of the country showed the same result to the tenth of a percent, this anomaly indicates manual adjustment of final protocols. The number of votes added through the drawing up of protocols can be calculated separately. According to our calculations, due to such direct intervention, Medvedev secured himself 3.6 million votes in the 2008 election (some of these votes are already accounted for in the 12 million anomalous votes calculated using the Shpilkin method).
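The check for round turnout figures described above can be sketched as follows. This is a minimal illustration, not the code used for the article: the input format (pairs of registered voters and ballots cast), the function names, and the choice to compare turnout rounded to a tenth of a percent are all assumptions. In real data, very small stations can hit round values by chance, so the share found this way should be compared with what neighboring non-round values show.

```python
# Round turnout values named in the text, in percent
ROUND_TURNOUTS = (65, 70, 75, 80, 85, 90)


def has_round_turnout(registered, voted):
    """True if turnout, rounded to a tenth of a percent, is one of the round values."""
    if registered <= 0:
        return False
    tenths = round(1000 * voted / registered)  # turnout in tenths of a percent
    return tenths % 10 == 0 and tenths // 10 in ROUND_TURNOUTS


def round_turnout_share(stations):
    """Share of stations (0..1) reporting a round turnout figure."""
    stations = list(stations)
    if not stations:
        return 0.0
    hits = sum(has_round_turnout(reg, voted) for reg, voted in stations)
    return hits / len(stations)


# Hypothetical stations: (registered voters, ballots cast)
sample = [(1000, 700), (1000, 735), (2000, 1300), (1500, 1001)]
print(round_turnout_share(sample))
```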

2011 — 2012: Protest Explosion and Strategy Breakdown

The 2011 State Duma elections were technically only slightly "dirtier" than the previous ones: the amount of manipulations identified by the Shpilkin method was comparable to the electoral cycle of 2007-2008.

However, the 2011 election campaign was preceded by a powerful consolidation of opposition forces and mobilization of the protest electorate. The strategy of voting for any party other than "United Russia" and thousands of independent observers at polling stations, trained to recognize falsifications, caught the Kremlin off guard. According to exit polls, "United Russia" was definitely losing its constitutional majority. This was also indicated by the initial results announced by the Central Election Commission. However, the final result changed the balance of power. "United Russia" received one of its lowest results in history, 49.3% of the votes, almost 15 percentage points less than in the previous elections, but still retained a majority of the seats in parliament.

Observers across the country witnessed gross violations of regulations and ballot falsifications. The evident proof of manipulation, published in independent media, prompted a series of mass street protests. The largest protests took place in Moscow at Bolotnaya Square and Sakharov Avenue, gathering up to 150,000 people, while military columns were brought into the capital.

The regime responded with force to the protests of "angry citizens". Surprisingly, it also responded by briefly creating electoral "transparency". Immediately following the series of winter rallies and in anticipation of the presidential elections in March 2012, Putin, who was then Prime Minister, addressed the CEC: "I propose and ask the CEC to install CCTV cameras at all polling stations in the country, of which we have over 90 thousand. At all of them. And let them work around the clock - day and night."

Apparently, the presidential administration decided that easily detectable and observable violations triggered protests and were an obvious reason for opposition mobilization, thus deciding to temporarily suppress the fraud. The pattern of vote distribution and turnout in the 2012 presidential elections was radically different from the similar chart for the 2011 parliamentary elections. The elections were only three months apart, and such drastic changes in voting character cannot be explained by external factors (media coverage, pre-election coalitions) — they can only be attributed to the CEC's direct decision to abandon the previous scale of falsifications.

The decision to limit falsifications was particularly noticeable in the electoral data for Moscow, where the protests were the most extensive, and the "effect" from the temporary abandonment of manipulative technologies was most visible.

Similar changes were seen in many regions. In addition to Moscow, 24 more subjects of the Russian Federation returned to the category of regions with relatively honest elections in 2012, for example the Astrakhan, Oryol, Tyumen, and Chelyabinsk regions and the Republic of Adygea. The number of vbrosy and risovki across the country fell by a factor of 2.2 in just three months.

2016 — 2024: The end of the thaw and transition to controlled voting

The thaw only lasted until the 2016 State Duma elections. Indeed, during this election campaign, efforts were made to move falsifications away from large cities to rural and district precincts, where turnout is traditionally higher and election monitoring by observers is less intense. The overall volume of falsifications returned to the 2011 level, creating a clearly expressed second "hump" on the national chart at precincts with higher turnout. Meanwhile, opposition voters from large cities had fewer obvious reasons for irritation at their polling stations. Learning from the previous Duma campaign, the leaders of large cities deliberately sought to suppress turnout. In Moscow, it was 35.2%, in Saint Petersburg – 32.5%; across the country, nearly 57.5 million voters did not show up at the polling stations. This did not prevent the CEC from recognizing the results, and "United Russia" obtained a constitutional majority in the State Duma for the third time.

Parliamentary elections are considered a rehearsal for presidential ones — they test the system's capabilities and see how electoral technologies work in a new context. The "quiet" Duma elections of 2016, in this sense, predetermined the strategy for several cycles ahead and the further degradation of the quality of electoral procedures. The Kremlin embarked on a general demobilization of the electorate with a clearly controlled targeted mobilization of administratively dependent voter groups.

The method of identifying electoral anomalies shows that the number of regions with minimal falsifications (less than 10%) grew 2.5-fold by the 2018 elections. The nominal number of anomalous votes in the 2018 campaign dropped to the level of the 2004 elections.

However, this indicates not so much an intention to "play by the rules" but rather a shift from direct falsifications to a strategy of pressuring dependent electorates. Specifically during this period, the practice of compulsory mobilization of public sector employees and employees of state-dependent enterprises was streamlined. The presidential administration launched an "information system" across the country, appointing regional administrations responsible for its implementation in each subject of the Russian Federation. Their task was to "maximally inform the staff about the elections and convince employees to vote."

Simultaneously, representatives of the CEC negotiated the details of "information support" for the elections with large enterprises. "The scheme does not imply coercion to vote," sources close to the Kremlin assured journalists. However, numerous testimonies suggest otherwise: managers began demanding photos of filled-in ballots from employees, and students were asked to register at polling stations before being allowed to take exams.

Furthermore, a strategy played out to eliminate any possibility of political competition in the elections. This was achieved by a blatantly dull electoral campaign and the absence of real competitors. The State Duma limited the rights of independent observers, reducing their number and capabilities, tightened the registration rules for candidates, and, according to the report by the Committee of Civic Initiatives, effectively shifted the registration process to a "manual informal control" mode.

Thus, by 2018, the factor of internal electoral competition effectively disappeared from the system, presidential elections were no longer considered a platform for challenge, and administrative pressure on voters increasingly became the tool for falsifications.

In the 2021 parliamentary elections, such a strategy led to 14.1 million anomalous votes adding 17 percentage points to "United Russia's" result — half of the votes for "United Russia" were vbrosy.

The year before, this enabled the setting of a historical record for anomalies in absolute numbers during the voting on constitutional amendments. The voting lasted several days with the possibility of remote voting. This nullified the work of observers, but had little impact on the ability to coerce the dependent electorate to vote, according to testimonies from doctors, teachers, and housing and utility workers. According to official CEC data, slightly more than half (57.7 million) of all citizens eligible to vote expressed their support for the amendments, which "reset" Vladimir Putin's previous presidential terms. We estimate the number of anomalous votes during this voting to be at least 28.6 million. Excluding these anomalous votes, support would not be 78% of voting electors, but 65%.

In 2020, according to our calculations, just over 7,000 polling stations, or only 7.6% nationwide, engaged in risovki of protocols. They secured almost 7 million votes for the authorities. This remains the record level of such falsifications.

By 2024, the Russian regime had fully fine-tuned workplace mobilization technologies, involving at least 105,000 organizations in "controlled voting". Many employees were coerced into voting, especially through electronic voting (REV). The REV option, a pilot for the presidential election level, appeared mainly in protest regions, where pro-government candidates traditionally score below the national average. E-voting testing was not conducted in any of the regions with total falsifications.

What does the record scale of falsifications in the 2024 elections indicate

Researchers in electoral statistics believe that the foundations of the electoral system in modern Russia were undermined even before Vladimir Putin came to power: the administration of Boris Yeltsin resorted to, albeit limited, falsifications during the 1996 elections, which allowed him to bypass the Communist Party candidate Gennady Zyuganov in some regions in the second round. However, over 24 years of Putin's era, electoral manipulations have evolved from a side effect of democracy during the transition period into the main mechanism for the regime's self-perpetuation.

The electoral cycle of 2003-2004 was a turning point for the Russian administrative system. It was the last time the parliamentary elections ended unpredictably: “United Russia” received 37.6% of the votes, barely maintaining a parliamentary majority. With Putin's re-election for a second term in 2004 and the subsequent electoral reform, elections in Russia definitively changed their meaning, and the country's political system transitioned into a stage of “competitive authoritarianism”.

However, the institution of elections and their outcomes remain important, not for external legitimacy in the eyes of the international community, as is commonly assumed, but as a signal of regime strength and a test of its resilience. In the Russian authoritarian context, the main recipients of these signals are not so much the loyalists as the internal “rent-seekers” who need a guarantee of the security of their assets in the following year. The preservation of these assets depends, among other things, on the incumbent's ability to garner votes, regardless of the method: through honest voting, falsifications, administrative resources, or preventing competitors from participating in elections.

Record turnout and a historic result following the Belarusian scenario represent a deliberate demonstration of strength, akin to "flexing muscles". The record percentage in elections is typical of the transition phase of an authoritarian regime into its hegemonic, usually more repressive, phase. Demonstrating hegemony and an unchallenged superiority in the political space is intended to finally demobilize opposition-minded voters. Boycotting elections, or reluctance to participate in the absence of real, non-negotiated competitors, benefits the regime as it opens up more opportunities for falsifications.

However, ballot falsifications and vote counting directly on election day are only part of a more complex strategy. The system has many other levers to control the electoral process. Electoral statistics do not reveal such elements as hindering the media campaigns of alternative candidates, preventing them from running in elections, or politically motivated criminal prosecution.

Parliamentary structures usually function as a tool for co-optation. Through them, authoritarian systems often try to involve opponents in the “systemic” field to neutralize protest potential and increase their resilience. After the mass protests of 2011-12, the State Duma virtually lost this function. With few exceptions, parties newly created by independent actors are deprived of any chance of entering parliament. The deputies of the State Duma themselves made sure of this by complicating the process for all those wishing to nominate candidates, campaign, and monitor elections.

For the regime, it has become more relevant to undermine the remnants of deputies' autonomy in single-member districts and to ensure “United Russia” gets much more than two-thirds of the seats in the State Duma (i.e., a constitutional majority). These were the instructions given to the party by the presidential administration, for example, before the last elections in 2021, all to enhance the appearance of support for the authority and overall consolidation.

Under Putin, presidential elections lost any hint of real competition. Alternative candidates traditionally play the roles of extras or spoilers, whose task is to fragment the protest electorate's votes. For the 2024 elections, the presidential administration decided not to allow even spoilers, apparently taking into account the mistake of Alexander Lukashenko, who allowed Svetlana Tikhanovskaya to participate in the presidential elections in Belarus in 2020.

At the same time, the degradation of the Russian electoral system has not yet reached the stage where any possibility of influencing the voting is completely closed off, and breakthroughs happen even in authoritarian regimes. Research shows that opposition strategies supporting a "compromise candidate" can have a tangible impact even amidst falsifications. Recent examples include the united coalition in Malaysia, which managed to win the national elections in 2018, ending a sixty-year dictatorship, and the effect of the participation in the 2020 Belarusian elections of Svetlana Tikhanovskaya, who picked up the baton from her husband, human rights activist Sergei Tikhanovsky, imprisoned after deciding to run for president.

In Russia, this was evident, for example, in the opposition strategy of “Smart Voting” during the municipal elections in 2019, and in the recent presidential elections, when the opposition mobilized the protest electorate to vote against Putin by spoiling ballots or voting for any other candidate, while social media platforms were flooded with videos of burning ballot boxes, ballots doused in paint, and “protest” queues at polling stations.

While this does not lead to substantial changes, it undermines the demonstration of the total control over the situation that is so important to the regime. The fact that opposition can initiate civic activity even in such harsh conditions is a good signal for Russia.

How do mathematical methods for detecting electoral fraud work

Shpilkin method

Mathematical methods for detecting electoral fraud, such as the method developed by physicist and electoral statistics expert Sergey Shpilkin, are based on analyzing the distribution of votes for different candidates in relation to voter turnout at each polling station. This approach is particularly effective in majoritarian systems with a clear "leader" in the vote, whether it's a party or an individual candidate. Unlike many other methods, Shpilkin's approach provides a quantitative estimate, albeit approximate, of the scale of falsifications.

The model assumes that the percentage of votes for a candidate and voter turnout should not depend on each other; electoral behavior in a given area with a given population is consistent, meaning who each voter decides to vote for does not depend on how many people decide to vote at the polling station.
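One simple way to probe this independence assumption is to compute the correlation between turnout and the leader's share across polling stations. The sketch below is illustrative only: the input format (registered voters, ballots cast, votes for the leader), the function name, and the choice of an unweighted Pearson coefficient are assumptions of the example, and a single coefficient is a much cruder diagnostic than the histograms discussed in this chapter.

```python
from math import sqrt


def turnout_share_correlation(stations):
    """Unweighted Pearson correlation between turnout and the leader's share.

    `stations` is an iterable of (registered, voted, leader_votes) tuples.
    Values near zero are what the independence assumption predicts;
    a strong positive value means the leader does better where turnout is higher.
    """
    turnouts, shares = [], []
    for registered, voted, leader_votes in stations:
        if registered > 0 and voted > 0:
            turnouts.append(voted / registered)
            shares.append(leader_votes / voted)
    n = len(turnouts)
    if n < 2:
        raise ValueError("need at least two stations with voters")
    mean_t = sum(turnouts) / n
    mean_s = sum(shares) / n
    cov = sum((t - mean_t) * (s - mean_s) for t, s in zip(turnouts, shares))
    var_t = sum((t - mean_t) ** 2 for t in turnouts)
    var_s = sum((s - mean_s) ** 2 for s in shares)
    if var_t == 0 or var_s == 0:
        raise ValueError("turnout or share is constant; correlation undefined")
    return cov / sqrt(var_t * var_s)


# Hypothetical stations where the leader's share rises with turnout
suspicious = [(100, 40, 20), (100, 60, 36), (100, 80, 56), (100, 90, 72)]
print(round(turnout_share_correlation(suspicious), 2))
```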

In honest elections, turnout and results should show a distribution close to normal. According to the central limit theorem, in a large dataset, the majority of the data will cluster around the mean value; the more extreme the value, the less frequently it occurs. When Shpilkin's method is applicable, "honest" voting in a two-dimensional distribution will look like a clear cloud, the center of which approximately coincides with the average values of turnout and the candidate's result in the region.

Mathematically, its shape is described by the Gaussian curve.

This pattern was observable in most regions in the year 2000, for example, even in the Kemerovo region, which has been known for systematic falsifications since 2004.

If the density of the "cloud" decreases unevenly from the center to the edges, forming a long "tail" of polling stations with high results and turnout (upwards and to the right), it is a sign of falsifications. This anomaly is especially noticeable against the backdrop of candidates with a normal distribution of turnout and votes.

It's important to note that the assumption that the distribution must take the form of a Gaussian curve is not a necessary condition for applying the method. In fact, a "normal" distribution in the strict mathematical sense cannot exist in electoral data, as repeatedly stated by the method's author. Elections cannot be viewed as a collection of independent variables, as electoral behavior is influenced by many factors, from the socio-cultural characteristics of a region to the level of information coverage or the effectiveness of pre-election coalitions. Some polling stations, districts, or even entire regions may show results different from the "average" not only because of falsifications. This argument about population heterogeneity was brought up by the Central Election Commission in response to electoral analysts' publications about mass falsifications.

However, empirical data confirm the method's effectiveness. In examples from heterogeneous countries like Poland, Germany, and Spain, we see that there is no "comet tail," meaning no clear linear dependence of the leader's result on turnout, which indicates the absence of detectable falsifications. The same is true for most regions of Russia in the year 2000. As Shpilkin suggests, heterogeneities are smoothed out in a large dataset, which ensures the method's functionality.

According to the method, in honest elections, the distributions of turnout and votes for the "leader" (Putin, Medvedev, "United Russia," or votes for amendments to the Constitution) and all other candidates are identical in shape and differ in absolute value due to the different number of votes. Vbrosy increase both the turnout and the result, but only for one of the candidates. Falsifications can be identified by summing the results of "all but the leader" and multiplying them by a coefficient so that the curve for "all others" matches the curve for the "leader" in the area where we see a distribution resembling the "normal" distribution. In other words, we consider those areas where the shape of the "leader's" curve by turnout begins to diverge from alternative candidates. The area painted red on the graph represents the excessive "win" for the pro-government candidate. We will call this area anomalous votes.
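The procedure described above can be approximated in code. The following is a simplified sketch under stated assumptions, not the implementation used for the article: the text does not say how the matching area is chosen, so here the turnout bin where "all others" receive the most votes is taken as the peak of the honest core, the coefficient is the ratio of the two curves in that bin, and only the leader's excess to the right of the peak is counted. The input format, function name, and bin width are hypothetical, and everything that is not a vote for the leader (including invalid ballots) is treated as "all others".

```python
from collections import defaultdict


def anomalous_votes(stations, bin_width=1.0):
    """Estimate the leader's excess votes from per-station results.

    `stations` is an iterable of (registered, voted, leader_votes) tuples.
    Returns the summed excess of the leader's curve over the rescaled
    "all others" curve at turnouts above the peak bin.
    """
    leader = defaultdict(float)   # leader's votes per turnout bin
    others = defaultdict(float)   # all other ballots per turnout bin
    last_bin = int(100 // bin_width) - 1
    for registered, voted, leader_votes in stations:
        if registered <= 0 or voted <= 0:
            continue
        turnout = 100.0 * voted / registered
        b = min(int(turnout // bin_width), last_bin)
        leader[b] += leader_votes
        others[b] += voted - leader_votes

    if not others:
        return 0.0
    peak = max(others, key=others.get)   # assumed center of the honest core
    if others[peak] == 0:
        return 0.0
    k = leader[peak] / others[peak]      # coefficient matching the two curves

    excess = 0.0
    for b in leader:
        if b > peak:
            excess += max(0.0, leader[b] - k * others[b])
    return excess


# Two hypothetical "honest" stations and one with inflated turnout and result
sample = [(1000, 500, 250), (1000, 500, 250), (1000, 900, 800)]
print(anomalous_votes(sample))
```

With real data, the result is sensitive to the bin width and to how the honest core is located, which is one reason the estimate should be read as approximate.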

The methodology's limitations

We do not believe that all anomalous votes can be explained solely by vbrosy and risovki. The Shpilkin method has a number of significant limitations, and its accuracy depends on several conditions.

Firstly, it works best on a large amount of data, as it relies on the "law of large numbers" in probability theory. The larger the number of polling stations and the voters registered at them, the more adequately the model works. If the sample is small, the likelihood of distortions increases. Therefore, one should not expect a distribution close to normal with a small number of polling stations; identifying the "honest core" of votes in such data is difficult. For example, the Nenets Autonomous Okrug, which has only 34 thousand voters, is poorly assessed by this methodology.

To improve the accuracy of our assessment, we first removed from the sample regions with fewer than 180 thousand voters (there were five such regions) and polling stations with fewer than 100 registered voters (4.4% of all polling stations in 2021). These are special or temporary stations, such as those at military units, in hospitals, or on ships. Some regions have been reorganized and merged over the past 24 years; we treated them as a single subject of the Russian Federation even in the years before the merger.
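A filter of this kind might look like the sketch below. The record layout (dictionaries with the keys "region" and "registered") and the function name filter_stations are hypothetical, and the sketch assumes that a region's size is counted before small stations are dropped, which the text does not specify.

```python
from collections import defaultdict

MIN_REGION_VOTERS = 180_000   # threshold for regions, from the text
MIN_STATION_VOTERS = 100      # threshold for polling stations, from the text

def filter_stations(stations):
    """Drop stations in small regions and stations with few registered voters.

    stations -- list of dicts with hypothetical keys "region" and "registered"
    """
    # Total registered voters per region (assumption: computed on all stations)
    region_totals = defaultdict(int)
    for station in stations:
        region_totals[station["region"]] += station["registered"]

    return [
        station for station in stations
        if region_totals[station["region"]] >= MIN_REGION_VOTERS
        and station["registered"] >= MIN_STATION_VOTERS
    ]
```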

The second limitation is that the Shpilkin method cannot always identify the "honest core" of votes. In particular, in some regions with excessive falsifications, the "true" result and real turnout, against which falsifications could be measured, are not visible. To determine whether the method is applicable to a particular region, we also constructed a two-dimensional histogram of the distribution of turnout and the leader's result, where each point represents a polling station.

We considered a region not amenable to quantitative analysis if one of the following conditions was met:

Practically all stations are concentrated in the diagram area with turnout and results significantly higher than the average values for these parameters across Russia (typically, in such regions, both turnout and results at the majority of polling stations are around 80-100%);
Noticeable traces of integer anomalies signaling "falsified protocols" (horizontal and/or vertical stripes) are visible even in the area of the distribution with lower turnout;
The cloud is so blurred that determining the area of "honest" votes is impossible.

A classic example of a region where the chosen method does not work is Chechnya. In almost every federal election, the pro-government candidate there receives the highest result in the country, close to 100%. Since elections in Chechnya are completely falsified, the two-dimensional histogram shows a dense "core," but it is located at high values. A high core is an indirect sign that the region engaged in massive falsification of protocols and entirely fabricated the desired result.

However, reconstructing the actual turnout and preferences of the Chechen population is impossible. The only opportunity to see them was in the 2018 elections, when observers from Navalny's team were allowed into Chechnya and other "electoral sultanates," and individual polling stations showed results close to the overall indicators for the country.

Such regions cannot be subjected to quantitative analysis. But it can be confidently stated that falsifications there were ubiquitous. We marked these regions as "highly falsified" and did not include them in the quantitative assessment of vbrosy.

The chosen method cannot assess all manipulations with votes. It is best at identifying vbrosy, although there are limitations here too: for example, if ballots were stuffed into the ballot box not only for the leader but also, to divert attention, for alternative candidates, the Shpilkin model will underestimate the falsifications. The method only partially captures falsified protocols at the vote counting stage and poorly detects the transfer of votes from losing candidates to the winner. In this case, the "leader's" result grows, but the turnout increases insignificantly.

In summary, three limitations of the Shpilkin method can be highlighted:

1. We cannot cover absolutely all regions in all federal elections with quantitative analysis. In some regions, falsifications are so large-scale that the boundary between real and fabricated results cannot be established. The number of such regions changes from election to election. There were only four in 2000 and 2003, and 39 in 2020.

2. The method may underestimate the scale of falsifications if methods less obvious than vbrosy for the leading candidate are used: falsified protocols, transferring votes from losers to the winner, vbrosy for alternative candidates. The result may also be underestimated due to averaging over a large amount of data.

3. The estimate of vbrosy at the regional level may also be slightly overstated due to differences between rural and urban populations. Research on electoral statistics notes that rural stations traditionally have a higher turnout than urban ones, which could cause some votes to erroneously fall into the anomalous voting zone. It's best to regard these estimates as preliminary. The most indicative evidence of falsifications is the comparison of their scale over time, from one election to another, within the same region. In this case, radical changes in voting patterns cannot be explained by heterogeneity.

To further check the applicability of the Shpilkin method to individual regions, we summed, for each year, the estimated falsifications across the regions for which the method works and compared these sums with the estimate obtained from aggregated data for all of Russia. The all-Russia calculation included only the regions used in the regional analysis. The two estimates were similar, which supports the validity of the analysis. Minor differences are explained by the limitations of the Shpilkin method described above.

The method for assessing integer anomalies

The method for assessing integer anomalies is another way to detect falsifications, specifically risovki, where the members of the polling station commission enter a fabricated result into the final protocol regardless of the actual number of ballots in the box.

Partially detecting such falsifications and assessing their scale is possible by counting integer anomalies, that is, an excess of polling stations where either the turnout or the winner's result takes on whole values. In an honest vote count, round numbers should occur no more often than any other numbers.

If integer values occur too frequently, it indicates falsifications; there can be no alternative explanation for "spikes" at whole values. People tend to invent whole numbers — a member of the polling station commission is more likely to give Putin exactly 80% or 82%, but not 82.32%.

On a two-dimensional diagram showing the distribution of turnout and the leader's result, integer anomalies appear as horizontal (corresponding to whole values of the leader's result) and vertical (corresponding to whole turnout) stripes; on a one-dimensional diagram, they appear as sharp "peaks" at whole values.

One method for quantitatively assessing such falsifications was proposed by Dmitry Kobak, a statistician and researcher at the University of Tübingen: to assess the excess of such stations, he suggests using the Monte Carlo method. The advantage of this method is that it allows for assessing the statistical significance of the result, that is, whether such a number of whole values could plausibly result from chance. However, there are also several drawbacks.
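A Monte Carlo check in this spirit can be sketched as follows. This is a simplified illustration and not Kobak's actual code: the function names, the ±0.05 percentage point tolerance for calling a share "whole," and the choice to redraw each station's leader votes from a binomial distribution with the observed share are assumptions made for the example.

```python
import numpy as np

def integer_share_count(votes, ballots, tol=0.05):
    """Number of stations whose leader share is within tol of a whole percent."""
    pct = 100.0 * np.asarray(votes) / np.asarray(ballots)
    return int(np.sum(np.abs(pct - np.round(pct)) <= tol))

def monte_carlo_integer_test(leader_votes, ballots, n_sims=1000, seed=0):
    """Compare the observed count of whole-percent stations with chance."""
    leader_votes = np.asarray(leader_votes, dtype=np.int64)
    ballots = np.asarray(ballots, dtype=np.int64)
    rng = np.random.default_rng(seed)

    observed = integer_share_count(leader_votes, ballots)
    share = leader_votes / ballots  # observed leader share at each station

    simulated = np.empty(n_sims)
    for i in range(n_sims):
        # Redraw every station's leader votes around its observed share
        sim_votes = rng.binomial(ballots, share)
        simulated[i] = integer_share_count(sim_votes, ballots)

    expected = float(simulated.mean())
    # Share of simulations with at least as many whole-percent stations
    p_value = (1 + int(np.sum(simulated >= observed))) / (1 + n_sims)
    return observed, expected, p_value
```

The difference between the observed and expected counts gives an estimate of the excess number of stations, and the p-value shows how unlikely the observed count would be by chance alone.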

Firstly, this method only allows for estimating the number of stations, not the number of excess votes for the winner. Secondly, it smooths out integer peaks but does not consider that the "honest" value is actually between the peaks, i.e., at non-integer values (for example, we would consider votes in the turnout interval from 80.2% to 80.9% as "honest"). Because of this, the quantitative result of falsifications obtained by this method will be underestimated.

We propose another method for assessing integer anomalies, based on the assumption that the "honest" election result is the result obtained at non-integer values, i.e., between the peaks. Our method also allows for estimating the number of anomalous votes, not just the number of stations with anomalies (although it is also applicable to them). Detailed calculations are provided in the appendix below.

The values we obtained are approximately twice as high as Kobak's calculations, which can be explained by averaging the peaks. But our calculations are likely also underestimated: surely not all dishonest stations indicated whole percentages. As with vbrosy, a more indicative assessment here is the evaluation of falsifications over time, from year to year, rather than absolute values.

The idea behind our method is to use a two-dimensional histogram of the absolute number of votes for the leader, with turnout and the leader's result in percentages on the axes. The projection of this histogram on the "leader's result" axis is shown in the previous part of the article.

To count anomalies only by turnout or only by the leader's result, one could simply subtract a smoothed curve from the real data on such a one-dimensional graph. However, from one-dimensional graphs, it's impossible to get an estimate that includes both turnout anomalies and the leader's result anomalies since some stations exhibit both anomalies. On a two-dimensional histogram, it's challenging to reconstruct a smoothed image without peaks, as the number of points in a 0.1% interval by turnout and result is too low.

Therefore, we used a different approach, based on counting the density of points on a two-dimensional diagram. First, we'll demonstrate it by counting the number of "dishonest" stations, and then extend it to counting the number of "dishonest" votes.

We assumed that the peaks are located in intervals [n-0.1%, n+0.2%], where n is a whole number percentage, both for turnout and result (our estimates hardly change with slight variations of these intervals). The grid of integer anomalies in these intervals is shown in the graph above.

We considered the density of points at non-integer percentage values as "honest." Therefore, we counted the points in the areas between the peaks (marked in blue) and multiplied this count by the coefficient S/Sfair, where S is the total area and Sfair is the area corresponding to values between whole percentages. This extends the "honest" density to the whole diagram and gives an estimate of the total number of stations that did not "draw" whole values in the protocols (the same value that would have resulted from the smoothed picture minus the peaks).

To get the number of stations where protocols were "drawn," accordingly, you need to subtract this value from the total number of stations.

Similarly, you can calculate the "drawn" votes. In this case, you need to sum not the number of polling stations but the number of votes cast at them. Continuing the analogy with density, the number of votes in this case is equivalent to the point mass.

This value is also subtracted from the total number of votes for the leader, thus obtaining the excess of "drawn" votes.
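The two calculations above can be combined into one sketch. It is an illustration under stated assumptions, not the code used for the article: the function names are hypothetical, edge effects near 0% and 100% are ignored, and S/Sfair is computed from the interval widths alone (with peaks in [n-0.1%, n+0.2%], seven tenths of each axis are "honest," so Sfair/S = 0.7 × 0.7 = 0.49).

```python
import numpy as np

def in_peak(pct, lo=0.1, hi=0.2):
    """True where a percentage lies in [n - lo, n + hi] for a whole n."""
    frac = np.mod(np.asarray(pct, dtype=float), 1.0)
    return (frac <= hi) | (frac >= 1.0 - lo)

def estimate_drawn(turnout_pct, result_pct, weights=None, lo=0.1, hi=0.2):
    """Estimate "drawn" stations (weights=None) or "drawn" votes.

    weights -- votes for the leader at each station; if omitted, every
               station counts as one point.
    """
    turnout_pct = np.asarray(turnout_pct, dtype=float)
    result_pct = np.asarray(result_pct, dtype=float)
    if weights is None:
        weights = np.ones_like(turnout_pct)
    weights = np.asarray(weights, dtype=float)

    # "Honest" points lie between the peaks on both axes
    fair = ~in_peak(turnout_pct, lo, hi) & ~in_peak(result_pct, lo, hi)

    # S / Sfair: the honest share of each axis is 1 - (lo + hi)
    area_ratio = 1.0 / (1.0 - (lo + hi)) ** 2

    honest_total = weights[fair].sum() * area_ratio
    return float(weights.sum() - honest_total)
```

Called without weights, the function returns the estimated number of stations with "drawn" protocols; called with the leader's votes as weights, it returns the estimated number of "drawn" votes.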

One of the arguments in favor of Monte Carlo simulations is the supposed dependence of the number of integer anomalies on station size: at smaller stations, a whole-number result is more likely to occur by chance. In our method, the proportion of stations with anomalies barely depends on the lower limit of station size. Where such a dependence does appear, it runs the other way: the higher the cutoff size of the stations we consider, the higher the proportion of dishonest stations, which cannot be explained by the method's limitations. Rather, it might be explained, for example, by the fact that urban stations with a larger number of voters may be more often involved in falsifications.

Research and infographics by: Alesya Sokolova, Katya Lakova, Aleksandr Bogachev

With a special contribution from Margarita Zavadskaya, senior research fellow at the Finnish Institute of International Affairs.

The code used to prepare this material can be found in our GitHub account.