LEAKED: IRS Attempts to Trace Monero in Chainalysis Training Video, Response from Monero Community

22 Oct 2024

We’re living in a comic book. A legendary cat and mouse game is taking place, with the world’s financial privacy at stake. Tyrannical regulatory bodies are obsessed with ensuring no one can exchange money without their knowledge. They join forces with international law enforcement and corporate technology partners that weaponize surveillance on the blockchain.

Who stands in their way? A bunch of Monero nerds and nodes.

The ultimate saga of cryptoanarchy continues, and anyone who values their money is caught in the middle. Read this article to discover the assault on private blockchains and how you can protect yourself.

Many thanks to the talented ULY.XYZ for the art in this article.

Intro

It's no secret that the IRS and other agencies have been looking to trace private cryptocurrencies like Monero. Quickly gaining popularity amongst darknet markets and cybercriminals, Monero is now the de-facto choice for any financial exchange where confidentiality is paramount.

Faced with the impossible task of tracing Monero, the IRS humbled themselves and asked for help in an open request. The leading cryptocurrency surveillance platform, Chainalysis (or, as they like to call themselves, a ‘blockchain data’ platform), answered the call.

This relationship blossomed into a $20M contract to support the activities of different IRS departments including the Criminal Investigation Division, Cyber & Forensic Services, Small Business / Self-Employed, and Office of Fraud Enforcement.

Does the IRS have visibility into Monero and other cryptocurrencies?

A newly leaked training video gives us an inside look into their techniques and capabilities. Today we’ll go deep into how financial surveillance is conducted and more importantly, how people can protect themselves, featuring a response from the Monero Research Lab.

Putting Out an S.O.S

Quickly, let's pore over the history of relevant crypto surveillance. In September 2020, the IRS CID (Criminal Investigation Division) put out the request. They were looking for ways to investigate distributed ledger transactions involving private cryptos such as:

Bitcoin Lightning
XMR
ZEC
DASH
GRIN
KMD (Komodo)
XVG
ZEN

The IRS wanted to be able to pinpoint behavior associated with a particular user to conduct investigations, and associate their activity with other users, gathering open-source intelligence such as their names, all packaged in a neat little GUI (graphical user interface).

Two companies won the contract, from Sept 2020 to Sept 2021, although only Chainalysis is of note as they were able to deliver a working product.

Three months after the contract ended, Chainalysis put out a press release advertising support for tracing Lightning Network transactions in December 2021.

The IRS obviously liked what they were seeing, and Chainalysis won a $22M contract granting the IRS 620 yearly licenses to the software, access to an API, educational materials, training, and passes to conferences. You can find the redacted contract here.

When reviewing the training, it's important to remind ourselves that these efforts represent the highest levels of proficiency in financial surveillance. A grand cat and mouse game involving regulatory bodies, law enforcement, and centralized software providers up against a bunch of Monero nerds and nodes, spread around the globe.

Tracing Monero: A Kiddy Investigation

The leaked presentation was an IRS Criminal Investigations office hours session, by a Cybercrime investigator from Chainalysis Government Solutions.

In a 35 minute video, they gave an informative presentation on the Monero cryptocurrency, its history, and the challenges it presents investigators. They end the presentation with a demonstration of a ‘real-life’ investigation using Chainalysis tooling.

I’ve downloaded and re-uploaded the video for you on our Odysee channel if you want to watch it in full. I’ve written about the important parts below.

Bitcoin and Monero: We Are Not The Same

The privacy preserving properties of Monero became evident in the investigation. For those who don’t know much about Monero, read through this section. If you know the fundamentals you can skip ahead.

Ring Confidential Transactions & Stealth Addresses

Like Bitcoin, Monero is based on an Unspent Transaction Output (UTXO) model. Unlike Bitcoin, transaction addresses and amounts do not appear on the blockchain.

Stealth addresses hide real Monero public addresses. Stealth addresses are automatically generated for every transaction in Monero. Transactions sent to a public Monero address are transformed and sent to a one-time stealth address which is unlinkable to the real address. The real Monero address is never published on the blockchain.

Ring Confidential Transactions (RCT) hide transaction amounts. They have been a mandatory feature of Monero since September 2017, hiding transaction amounts while letting block validators verify that the outputs of each transaction match the inputs (ensuring Monero isn’t generated out of thin air). This is possible through the cryptographic magic of Pedersen commitments, which verify that the sum of the encrypted inputs matches the sum of the encrypted outputs. This is a grade school simplification of a deep topic; for more details see the official Monero documentation and this excellent article by Teemu explaining the math behind Pedersen commitments.
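To make the "sums match without revealing amounts" idea concrete, here is a toy sketch in Python. It is not how Monero implements it: Monero uses elliptic curve points, while this sketch uses plain modular arithmetic with made-up parameters (P, G, H and the amounts 7, 5, 9, 2 and fee 1 are all my assumptions) purely to show the arithmetic. It is not secure and should not be used for anything real.

```python
import secrets

# Toy parameters, NOT secure: chosen only to make the arithmetic visible.
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3           # base for the blinding factor (assumption for this sketch)
H = 5           # base for the amount (assumption for this sketch)
ORDER = P - 1   # exponents can be reduced modulo P - 1 (Fermat's little theorem)


def commit(amount: int, blinding: int) -> int:
    """Commit to an amount: G^blinding * H^amount mod P."""
    return (pow(G, blinding, P) * pow(H, amount, P)) % P


def combine(commitments: list[int]) -> int:
    """Multiply commitments, which adds the hidden amounts and blindings."""
    total = 1
    for c in commitments:
        total = (total * c) % P
    return total


def balances(inputs: list[int], outputs: list[int], fee: int) -> bool:
    """Check inputs == outputs + fee using only the commitments."""
    # The fee is public, so it is committed with a blinding factor of zero.
    return combine(inputs) == (combine(outputs) * commit(fee, 0)) % P


def example_transaction(out_amounts=(9, 2), fee=1) -> bool:
    """Build commitments for inputs of 7 and 5 and the given outputs."""
    r1, r2 = secrets.randbelow(ORDER), secrets.randbelow(ORDER)
    inputs = [commit(7, r1), commit(5, r2)]
    # Output blindings must add up to the input blindings.
    b1 = secrets.randbelow(ORDER)
    b2 = (r1 + r2 - b1) % ORDER
    outputs = [commit(out_amounts[0], b1), commit(out_amounts[1], b2)]
    return balances(inputs, outputs, fee)
```

A validator running `balances` never sees 7, 5, 9 or 2, only the commitments. If the sender tries to create an extra coin (outputs of 10 and 2 instead of 9 and 2), the check fails.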

The end result is that Monero addresses and amounts can’t be analyzed on the blockchain. You will see this on display when we discuss the investigation.

Mixing Things Up

In Bitcoin, every transaction is signed by the private key of the sender to prove who sent it. This makes things simple, but it also associates transactions with the sender, and the inputs of each transaction can be traced back to the outputs and the user who held the coins last.

In Monero, Ring Signatures put a stop to this and are arguably the most important privacy feature in Monero, working in tandem with Stealth Addresses and RCT.

Ring Signatures hide the sender in a transaction. In Monero, Ring Signatures are another mandatory feature. Ring Signatures work by taking the money you want to send (1 real input) and mixing it with a number of decoys, which are real outputs from real transactions chosen pseudo-randomly from the blockchain.

Your private key and the decoy public keys are used together to create a ring signature, which then signs the transaction for the group. A transaction signed by a ring signature can be cryptographically verified to come from the group of keys, however it's infeasible to figure out which of the keys actually made the signature.

Monero started requiring ring signatures in 2016, however users could choose the size of their mixing pool. Because ring sizes are public to the Monero blockchain, custom ring sizes harmed transaction privacy and in 2018, the Monero network enforced a standard ring size of 11. Today in 2024, the current ring size is 16.
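As a rough picture of what a ring looks like to an observer, here is a minimal Python sketch. The function name `build_ring` and the integer output ids are hypothetical, and the decoys are picked uniformly at random for simplicity, which is an assumption of this sketch and not Monero's actual decoy selection algorithm. No signature is computed here; the sketch only shows how the real input hides among 15 decoys.

```python
import random

RING_SIZE = 16  # the ring size the article gives for 2024


def build_ring(real_output: int, chain_outputs: list[int],
               rng: random.Random) -> list[int]:
    """Return a ring holding the real output plus RING_SIZE - 1 decoys."""
    # Decoys are other outputs that already exist on the chain.
    candidates = [o for o in chain_outputs if o != real_output]
    decoys = rng.sample(candidates, RING_SIZE - 1)
    # Sorting means the position in the ring says nothing about
    # which member is the real spend.
    return sorted(decoys + [real_output])


ring = build_ring(real_output=42, chain_outputs=list(range(1000)),
                  rng=random.Random(1))
```

Anyone reading the blockchain sees all 16 members of `ring` and, absent other information, has no way to tell which one is output 42's real spend.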

Triple Edged Sword

To summarize, the privacy preserving properties of Monero work together to prevent observers from seeing transaction amounts, public addresses, and participants in a transaction. With these fundamental components of transactions under wraps, how could Chainalysis begin to trace Monero? As we’ll see, Chainalysis has some tricks up their sleeve.

What’s Visible?

Although fundamental components of Monero transactions are hidden, there are still some attributes of transactions that can be used in investigations. Here are the important features mentioned in the presentation.

Fees paid: On the blockchain this will be the exact XMR value, but within Chainalysis’ tool, they will be represented as a multiplier of the current default rate. (1x, 2x, 5x, 10x…)

Mixins: Size of the ring signature, number of decoy outputs which will depend on the current mandatory ring size for Monero.

Unlock Time: Transactions can lock their outputs so that the outputs cannot be spent until X blocks have passed. 10 blocks is the minimum.
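The fee multiplier in particular is easy to picture. Here is a small Python sketch of how a fee could be turned into a multiplier and flagged when it doesn't match a common tier. The tier list (1x, 2x, 5x, 10x) is taken from the presentation as described above, while the function names, the tolerance, and the example fee numbers are my own assumptions and not anything shown in the Chainalysis tool.

```python
def fee_multiplier(fee_per_byte: float, default_per_byte: float) -> float:
    """Express a fee as a multiple of the default rate."""
    return fee_per_byte / default_per_byte


def is_standard(multiplier: float, tiers=(1, 2, 5, 10),
                tolerance: float = 0.05) -> bool:
    """True if the multiplier is within 5% of one of the listed tiers."""
    return any(abs(multiplier - t) <= tolerance * t for t in tiers)


# Hypothetical fees in atomic units per byte.
common = fee_multiplier(40_000, 20_000)  # exactly twice the default rate
unusual = fee_multiplier(66_000, 20_000)  # 3.3x, matches no listed tier
```

A transaction like `unusual` stands out from the crowd, which is why wallets that compute fees in a non-standard way can leave a fingerprint on every transaction they create.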

Chainalysis’ Internal Monero Tool

Chainalysis has a Monero block explorer which lists the most recent blocks in the Monero blockchain.

Drilling down into a block and a transaction takes you to the transaction overview. The transaction overview has the most basic information at the top with inputs and outputs split at the bottom. In the top section (transaction features box) we see:

IP Address (only before the Dandelion update Nov 2021)
Number of inputs
Number of outputs
Fee structure (1x, 2x…)
Transaction Heuristics (not explained)

When looking at inputs and outputs we see that decoys are greyed out and struck through. Chainalysis claims it has methods to detect whether those decoys were actually previously spent. We’ll dissect this a bit more in the next section.

If there’s a special RPC note in the IP column of the transaction, that means a user connected directly to Chainalysis’ Monero node, exposing their IP address.

There’s also timing data between the first and second time an output was observed on the blockchain, indicated in the column containing a number of milliseconds (ms).

A ‘Real-Life’ Investigation

The mock investigation is based on a real investigation that targeted Darknet Market administrators allegedly working out of Colombia. It starts with one other major piece of information: a list of about 70 transaction hashes from an external swap service, MorphToken.

These swaps of interest are from BTC to XMR, and the transactions typically have 2 outputs: one is the real output (to the person initiating the swap) and one goes back to MorphToken as their change.

The presenter loads the transaction hashes from a Google doc and pulls up the summaries for those transaction IDs. The tool automatically identifies the transactions as MorphToken swaps, which are then highlighted with an orange color.

The IP addresses of important services are also highlighted as part of the investigation.

With the initiating transactions identified, the investigator then looks for co-spends, meaning future transactions on the Monero blockchain that spend multiple outputs from the original set of MorphToken outputs. Co-spends are suspicious because only the real holder of the funds could spend multiple outputs, and it would be highly unlikely for multiple outputs from that set to show up as decoys in an unrelated transaction.

The investigator picks the most likely co-spend transaction to drill down into: one where all 4 inputs include a co-spend. It's very likely the target moving their XMR to the next point. The target in this case also connected directly to a malicious Chainalysis node, but looking up their IP pointed to a VPN. No luck this time.

Because 4 co-spends are present in this transaction, it's extremely likely that those are the genuine inputs and so the other decoy inputs are greyed out. The presenter calls this an indicator of ‘common control’.
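To show why a co-spend is such a strong signal, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the pool of 1,000,000 candidate outputs, and the simplification that decoys are picked uniformly at random (Monero's real decoy selection is not uniform). The 70 known outputs and the 4-input transaction come from the investigation described above.

```python
def cospend_count(tx_rings: list[list[int]], known_outputs: set[int]) -> int:
    """Count inputs whose ring contains at least one known output."""
    return sum(
        1 for ring in tx_rings
        if any(member in known_outputs for member in ring)
    )


def chance_probability(num_known: int, pool_size: int,
                       decoys_per_ring: int, num_inputs: int) -> float:
    """Rough odds that every input of an unrelated transaction
    picks up a known output purely as a decoy."""
    # Probability that a single ring contains none of the known outputs.
    p_clean = (1 - num_known / pool_size) ** decoys_per_ring
    # All inputs must independently contain at least one known output.
    return (1 - p_clean) ** num_inputs


single = chance_probability(70, 1_000_000, 15, 1)
all_four = chance_probability(70, 1_000_000, 15, 4)
```

Under these assumptions a single ring touching the known set by accident is already rare, and four out of four is vanishingly unlikely, which matches the presenter's confidence in calling it ‘common control’.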

On the output side, the investigator looks for highlighted IP addresses that have already been cross-referenced with exchanges, like ChangeNow.

At this point Chainalysis would pass this information over to law enforcement which would ask ChangeNow for more information on the transactions.

Because there are still a lot of decoys, it's unclear what the true outputs are. These are shots in the dark.

The investigator goes back to another transaction that shows ‘common control’, and has 2 co-spends in its 2 inputs.

Another highlighted IP address points to Exodus wallet, which the investigator claims doesn’t log information of the users that use it and would be another shot in the dark.

Then the investigator chooses a transaction without any co-spends as a transaction of interest. This transaction is also a MorphToken swap, and there’s some additional information at the top.

Well, a lot of information, including how much BTC was deposited and how much XMR was sent out (160.96 XMR), as well as what address it was sent to.

The first output here is easily identified because it shows up in the list of MorphToken transactions as an output that would be confirmed to be spent later by MorphToken.

The other output is likely to be the user and the decoys have been ruled out - we’ll talk about why that might be later.

Following this output to its next transaction we could assume that we’re getting close to the user. It's easy to determine that the original MorphToken swap output is an input in this transaction.

We also notice an output with an RPC IP address, indicating that the user connected directly to a malicious Chainalysis node.

It's another VPN, this time Slovakian.

We follow the transaction forward one more time to find a transaction with all of the decoys ruled out and an RPC IP address from a direct connection to the node. This time it's in Colombia and not protected by a VPN. It's likely the target!

Now that they had the target’s IP address, they fed it into another Chainalysis tool named Reactor, which would scan for all transactions related to the IP.

It found multiple transactions related to a Centralized Exchange and a Merchant Services entity, both of which likely require KYC.

This was the final lead that was needed to find the identity and apprehend the subject.

Cheating on The Investigation

When I watched this for the first time, I had many questions. Why MorphToken? Who was running the investigation? How’d they rule out so many transactions? How did they learn about the transactions in the first place?

Before I go into this, it's important to note that this is my own speculation, based off of seeing specific information about the swap inside the Chainalysis tool. This was exchange specific information including the amounts of BTC and Monero, which could only really come from MorphToken.

A quick search suggests that this mock investigation was very likely based on a real FBI investigation in March 2020 against Darknet Market actors.

This information comes from the BlueLeaks documents, a 270GB leak of law enforcement files, including an FBI intelligence report outlining Darknet Market investigations in 2020, referencing use of a ‘proprietary software tool that analyzed financial transactions of the Bitcoin blockchain’.

That almost certainly sounds like Chainalysis.

Here’s an FBI Expressions of Likelihood chart in case you needed to translate my words into percentages of chance.

Assuming that this is the same investigation, then we must note that the FBI was able to retrieve data from the MorphToken swap service. Although MorphToken was outside of US jurisdiction (based in Panama) and didn’t require KYC, they still cooperated with the FBI.

A quote from the report:

This assessment is made with high confidence, based on FBI investigations, blockchain analysis, use of proprietary software, information from MorphToken, and information obtained from Darknet sites and forums that cater to DNM actors.

MorphToken had some level of cooperation and could have provided more information about the transactions including the internal IDs associated with trade requests on the platform along with the real transactions that sent XMR to the target of the investigation.

In fact we can get a window of what data was fed into Chainalysis by MorphToken in the tool itself.

With the genuine transaction date, time, and amounts - you can trace this transaction forward and eliminate decoys based off of the spending patterns (decoy outputs occurring days from the original transaction are obvious fakes).

There were 70 transactions associated with the user in the investigation, and if Chainalysis knew the timing of these real transactions then it could more easily disqualify decoys based on the times.

This is a known weakness of the decoy selection algorithm, as Rucknium, an anonymous statistician of the Monero Research Lab, and Monero contributor j-berman have pointed out.
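Here is a minimal Python sketch of that kind of temporal filtering, as I understand it. This is my own guess at the logic, not Chainalysis' actual heuristic: the function name, the one hour window, and the sample timestamps are all hypothetical. The idea is that if an exchange hands over the exact payout time of a swap, ring members created far from that time can be set aside.

```python
from datetime import datetime, timedelta


def plausible_members(ring: list[tuple[str, datetime]],
                      known_payout_time: datetime,
                      window: timedelta = timedelta(hours=1)) -> list[str]:
    """Keep ring members created within `window` of the known payout."""
    return [
        output_id for output_id, created_at in ring
        if abs(created_at - known_payout_time) <= window
    ]


payout = datetime(2020, 3, 1, 12, 0)
ring = [
    ("a", datetime(2020, 2, 20, 8, 30)),  # created days earlier
    ("b", datetime(2020, 2, 27, 23, 5)),  # created days earlier
    ("c", datetime(2020, 3, 1, 12, 4)),   # created minutes after the payout
]
candidates = plausible_members(ring, payout)
```

With precise timing from the exchange, a ring of many members can collapse to a single candidate. Without that outside information, the investigator has nothing to anchor the window to.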

Response from the Monero community

We had the privilege of asking the Monero community our questions from this article, and received a response from Rucknium - an anonymous statistician of Monero Research Lab who’s conducted some of the first professional statistical analyses of the Monero blockchain and contributed work through Monero’s CCS (community crowdfunding system), including current work on improving Monero’s Decoy Selection algorithm.

The following are comments from Rucknium in response to questions:

What do you think about the training video?

What I didn't see in the video was any discussion of the false positive rate of these techniques. That's a big criticism I have about many blockchain surveillance companies, as a scientist: There is little evaluation of the uncertainty inherent in their findings. That can easily cause false accusations, just like people in the past were falsely accused based on unscientific analysis of arson patterns, ballistics, and bite marks in criminal forensics.

What about malicious nodes on the Monero network? Is there a way to distinguish them and prevent them from collecting data on users?

Spy nodes can play two roles. First, they can act as malicious remote nodes to de-anonymize users who do not run their own nodes and instead use remote nodes to submit transactions to the network. If users connect to those remote nodes without any proxy like a VPN, Tor, or I2P, then their home IP addresses can be exposed.

Second, they can listen for transactions as they are relayed between nodes to try to find which node was the first one to broadcast the transaction, which is the actual source node of the transaction. As the Chainalysis employee said, the Dandelion++ protocol implemented in Monero in 2020 made this type of de-anonymization attack much more difficult. There's an alternative to Dandelion++ called Clover that could provide better privacy in certain cases. Myself and other Monero Research Lab researchers may evaluate Clover for possible implementation in Monero.

No one needs permission to join the Monero network. It is decentralized. There is not a reliable way to know which nodes may be spy nodes if the spies decide to blend in, but Monero's node connection code tries to connect to a diversity of IP addresses in the IP address space to avoid connecting to too many nodes that may be controlled by one entity.

There are a couple of solutions to the remote node problem. First, users can run a node on their own computer instead of relying on a remote node that may be malicious. In the most recent version of the Monero GUI wallet, pruning was enabled by default ( https://github.com/monero-project/monero-gui/releases/tag/v0.18.3.4 ). Pruning cuts the required disk space to run a node in half. Before the change in the pruning default, I performed an analysis of the safety of having more pruned nodes on the network in Appendix B of https://github.com/Rucknium/misc-research/blob/main/Monero-Black-Marble-Flood/pdf/monero-black-marbl...

If users cannot run their own node, they can use a proxy like Tor to connect to remote nodes. There are still some risks when using remote nodes like nodes lying about the necessary transaction fees, but at least a proxy will shield a user's IP address from the malicious node. Users can ask someone they trust to run a node for them, and only connect to that node.

The Chainalysis tools mentions heuristics they use to rule out decoy transactions, what could these be and how do you think their investigation worked?

In their case study, a large consolidation transaction was helpful to Chainalysis to generate a hypothesis of which ring members were the real spend. They had information about transaction outputs sent by a single coin swapper, MorphToken*. Large consolidations are known to be risky with ring signatures when an adversary has a large amount of information about which outputs a single user owns. Chainalysis basically performs an Eve-Alice-Eve (EAE) attack, an attack that the Monero Research Lab has theorized. Chainalysis uses the consolidation transaction for the first leg of the attack and then the IP address gathered by a malicious remote Monero node for the second leg.

Different transaction fees were at the top of their list of ways to distinguish transactions. I worked on fee uniformity a lot last year. I developed a formula for the privacy risk of the non-uniform fees, identified non-uniform fees in the blockchain data, and asked Exodus wallet to fix their non-standard Monero fees. Justin Berman fixed some fee uniformity issues in MyMonero a few years ago.

Links:
https://github.com/Rucknium/misc-research/blob/main/Monero-Fungibility-Defect-Classifier/pdf/classify-real-spend-with-fungibility-defects.pdf
https://github.com/Rucknium/misc-research/tree/main/Monero-Nonstandard-Fees
https://old.reddit.com/r/Monero/comments/176e1zr/privacy_advisory_exodus_desktop_users_update_to/
https://github.com/mymonero/mymonero-core-cpp/pull/36

Follow up: Was MorphToken cooperating with the attacker?

In my opinion, the video isn't completely clear about this. Maybe there are three ways that the parent transaction outputs could have been labeled "MorphToken":

1) (Least amount of info given to Chainalysis.) There was no formal relationship between Chainalysis and MorphToken. Chainalysis used its spy nodes to figure out which node was broadcasting MorphToken-related transactions. This kind of analysis would have a lot of error because of Dandelion++.

2) (Medium amount of info given to Chainalysis.) MorphToken told Chainalysis which transactions it sent, but did not give any other info. This would mean that Chainalysis would know which transaction outputs MorphToken was responsible for, but not which user they were sent to. So this would allow Chainalysis to narrow things down.

3) (Most info given to Chainalysis.) MorphToken collected a lot of information about swaps, including possible personally identifying info about people like IP address and which coins they were swapping from. (i.e. if a cluster of bitcoin addresses were swapping frequently with MorphToken, then likely the same user was swapping many times.) This info is possibly given to Chainalysis.

Takeaways from a User Perspective

Motivated investigators will likely compel exchanges to hand over additional information about target transactions, including dates, times, and amounts.
You absolutely need to avoid using a wallet that logs data or cooperates with law enforcement. Luckily there are many open-source wallets for this, including Monerujo, Stack Wallet, and others.
Co-spends will land you in the most trouble. They happen when you receive large amounts of Monero from one place and decide to send most or all of it elsewhere. Break up large transactions if you can.
Timing attacks (temporal analysis) can also be used to rule out decoy transactions. Consider randomizing the time between transactions as your Monero moves, and avoid making transactions in close succession.
It's very likely that entities are surveilling the blockchain by running malicious nodes (or even proxies that forward traffic to genuine nodes). That’s why it's so important to run your own node. And never forget to use a VPN or onion routing.

Conclusion

All cryptocurrency is very likely being surveilled. The training video illustrated the most sophisticated techniques used by state actors, such as cooperating with blockchain surveillance firms and exchanges. In Monero’s case, all of this external information was needed.

In other cryptocurrencies with transparent ledgers such as Bitcoin, tracing funds is trivial. Due to the widespread popularity of KYC exchanges, Bitcoin transactions can easily be translated into financial associations between different users.

Given that several IRS departments are working with Chainalysis, there’s a high possibility that these types of techniques will be used to identify those avoiding taxes using cryptocurrency.

Bitcoin funds can be traced to their source and marked as ‘dirty’ funds, leading to situations like Binance being fined $4.4B for transacting with Iranian groups.

In my opinion, the future looks grim for Bitcoin and other cryptocurrencies that do not have any privacy preserving strategies.

Take Back Our Tech

Let's use technology that doesn't use us.
