(Prev title: How China Detects and Blocks TLS Proxies)

Abstract

Starting in June 2021, the Great Firewall (GFW), operated by the government of CHN, began blocking and interfering with the TLS proxies that internet users in CHN widely rely on to bypass the country's internet censorship. We found evidence pointing toward a hidden reputation system for overseas IP addresses, and we exposed a common weakness of modern TLS proxies.

Introduction

Starting in June 2021, we saw a large number of reports that TLS proxies had begun to malfunction in CHN. While connections to port 443 on many proxy servers timed out from within CHN, other affected servers still had their HTTPS decoy websites reachable from CHN. This unexpected behavior may indicate that CHN is testing new technology to sabotage TLS proxies without harming normal TLS-based web browsing.

In this report, we summarize our investigation of this incident, including our hypotheses, our experiments and their data, our verification tool(s), and our final conclusion.

Background of Proxy usage in CHN

It is well known that CHN has the most censored internet in the world, if we do not count PRK’s underdeveloped internet. While most internet users in CHN are satisfied with the domestic internet ecosystem, many people still try to access services offered only in other countries. Circumventing the Great Firewall (GFW), the infrastructure-level internet censorship system of CHN, is known in CHN as wall-breaking, or Fanqiang (literally, "getting over the wall").

Many popular traffic proxying tools and protocols have been used mostly or only for wall-breaking. Listed from oldest to newest:

  • OpenVPN (extinct due to low availability)
  • Shadowsocks/ShadowsocksR (endangered due to accurate port blocking through TCP timeouts/RST)
  • Project V toolsets and their forks/derivations (in good health)
  • Trojan-GFW (in good health)

As new wall-breaking tools are developed and become popular in China, the GFW adapts to attack the emerging technology.

Attack

Against proxy

On the experimental TLS servers we set up for CHN users, we observed a large number of TLS handshake failures caused by connection resets, which effectively prevent a TLS proxy from functioning correctly. However, we did not discover any major difference between a TLS handshake initiated by a normal web browser and one initiated by a TLS proxy client (e.g. Trojan-Qt5, Windows x86_64). As the creator of the Trojan-GFW protocol claims, Trojan-GFW uses a “real handshake” to establish the TLS connection with the proxy server.

Based on our prior knowledge of TLS, our initial guess pointed toward the TLS fingerprint leaked in the ClientHello message. We therefore experimented with many client implementations that send different ClientHello messages, including one using a TLS fingerprint extracted from the latest version of a mainstream browser.

However, connectivity to the proxy server did not improve at all.

Against generic TLS connections

We also captured packets to explain the difference in behavior between a real web browser and a proxy client, and the pcap shows something interesting. The latest version of Google Chrome sends up to 9 ClientHello messages with 3 unique fingerprints to the server in a single attempt to load a webpage hosted on one of the proxy servers being interfered with. This indicates that Google Chrome also suffers from RST injection, but it retries several times before giving up and showing ERR_CONNECTION_RESET to the user. The TLS proxy clients we tested, on the other hand, do not retry; they abandon the connection attempt as soon as they see an RST.
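The difference between a client that retries and one that gives up on the first RST can be sketched in a few lines of Python. This is only an illustration of the idea, not the retry logic of Chrome or of any proxy client: the function name `dial_with_retry`, its parameters, and the linear backoff are all assumptions made for this example, and `dial` stands for any caller-supplied function that opens a brand-new connection and completes the TLS handshake.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def dial_with_retry(dial: Callable[[], T], max_attempts: int = 3,
                    backoff_s: float = 0.0) -> T:
    """Call dial() until it succeeds or max_attempts resets have been seen.

    Only ConnectionResetError is retried. Any other error propagates
    immediately, because it is not the failure mode discussed here.
    (Hypothetical helper for illustration.)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return dial()  # each call must open a fresh connection
        except ConnectionResetError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the reset to the caller
            time.sleep(backoff_s * attempt)  # linear backoff; 0 disables it
    raise AssertionError("unreachable")
```

A proxy client that currently gives up on the first reset corresponds to `max_attempts=1`; raising the limit only helps when resets are injected on some attempts rather than on every one.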

We also found supporting discussion on several internet technology forums in CHN. According to reports from many internet users and server administrators [1][2][3], the quality of TLS connections deteriorated over the past month. Most reports mention the same symptom: “Connection reset by peer”, which is usually caused by RST injection.

Verification

We also ran several experiments to verify our different guesses. Using uTLS, we were able to forge any known ClientHello message and therefore forge TLS fingerprints.

TLS Fingerprint Discrimination

Fig 1. TLS Connection Attempts made toward different servers with various TLS fingerprints

The data does not show any sign of TLS-fingerprint-based discrimination.

Server-side supported ALPN Discrimination

Fig 2. TLS Connection Attempts made toward the same server with various server-side ALPN compatibility and TLS fingerprints

The data does not show any sign of server-side-ALPN-based discrimination.

SNI Discrimination

Fig 3. TLS Connection Attempts made toward 2 servers with various SNI and Hostname

The data does not show any sign of SNI-based or Hostname-based discrimination.

Other information and Conclusion

We noticed that the RST is observed only from vantage points on the same ISP, China Telecom. The only server free from RST (us-cn2gia-1) is the one routed through AS4809 CN2. In other words, the RST attack happens only on AS4134 CHINANET-BACKBONE. This observation might indicate that the attack is initiated not by the GFW but by China Telecom.

Also, according to the data we showed in Fig 1, the attack strength varies across different targets. This observation might indicate that this attack is based on a reputation system.
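One way to make this kind of comparison concrete is to group handshake trials by target and by fingerprint, then look at how far the success rates spread in each grouping. The sketch below is an assumption about how such a tally could be done, not the published verification tool; the record layout, the function names, and the trial data are all made up for illustration and are NOT the measurements behind Fig 1.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One trial: (target, fingerprint, handshake_succeeded). Hypothetical layout.
Trial = Tuple[str, str, bool]


def success_rate_by(trials: Iterable[Trial], key_index: int) -> Dict[str, float]:
    """Handshake success rate grouped by one field of each trial.

    key_index 0 groups by target, key_index 1 groups by fingerprint.
    """
    ok = defaultdict(int)
    total = defaultdict(int)
    for trial in trials:
        key = trial[key_index]
        total[key] += 1
        if trial[2]:
            ok[key] += 1
    return {key: ok[key] / total[key] for key in total}


def spread(rates: Dict[str, float]) -> float:
    """Gap between the best and the worst group."""
    return max(rates.values()) - min(rates.values())


# Made-up trials, for illustration only.
example_trials = [
    ("server-a", "chrome", False), ("server-a", "firefox", False),
    ("server-a", "chrome", True), ("server-a", "firefox", True),
    ("server-b", "chrome", True), ("server-b", "firefox", True),
    ("server-b", "chrome", True), ("server-b", "firefox", True),
]
by_target = success_rate_by(example_trials, 0)
by_fingerprint = success_rate_by(example_trials, 1)
print("spread by target:", spread(by_target))
print("spread by fingerprint:", spread(by_fingerprint))
```

A large spread across targets combined with a small spread across fingerprints is the pattern that would point toward per-IP treatment rather than fingerprint matching.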

In conclusion:

  • The GFW does not show the ability to distinguish TLS proxy traffic from normal HTTPS web browsing traffic in real time.
  • The GFW is not using purely fingerprint-based discrimination. We do not know whether the fingerprint matters at present, but it is not a major deciding factor.
  • The statistics give a strong signal of a backend IP reputation system. This may indicate that the GFW is analyzing TLS traffic and that proxy traffic differs from real web browsing traffic.
  • We found no evidence of SNI or ALPN discrimination.
  • TLS proxies that do not retry on RST are especially vulnerable to this random RST injection attack. A browser-like retry mechanism might be needed for robustness.

Credits

My advisor, Professor Eric Wustrow at CU Boulder

Professor J. Alex Halderman at UMich

Jack Wampler at CU Boulder

And everyone else from Refraction Networking who withstood my spamming-like updates on this incident

Software Tool used during this project:

Refraction Networking/uTLS

https://tlsfingerprint.io/

Also, the source code of the attack verification tool is published on GitHub: https://github.com/Gaukas/GFW-2021Summer-TLS-RST-Incident Because this attack is now inactive, you may not be able to reproduce observations that agree with my experiments.

Appendix

Updated 7/28/2021 16:00 UTC

The attack has not been observable on any server for several days. We are now closing the case and moving on.

Updated 7/23/2021 16:30 UTC

The attack is observable again, but ONLY on one server I control.

In addition, a minor timeout problem appears on all targets.

There is no sign of server-side ALPN discrimination.

Updated 7/23/2021 00:33 UTC

The attack is still inactive.

Internet discussion[1][2][3] indicates this attack is:

  1. Limited to AS 4134 CHINANET-BACKBONE (in contrast to CN2 AS4809).
  2. Bidirectional

The ASN relation matches our data: our only unaffected IP also belongs to our only server that enforces AS4809 routing.

We did not get a chance to verify the claim of bidirectional behavior.

Updated 7/22/2021 18:36 UTC

The attack is no longer active. I will keep an eye on it for the next 48 hours.

Updated 7/22/2021 00:00 UTC

Thanks to Prof. Halderman and Jack, I checked the SNI. Sadly, there is no strong evidence of SNI sniffing.

Updated 7/21/2021 00:00 UTC

I have published the experimental tool I used to check/confirm the attack on GitHub:

https://github.com/Gaukas/GFW-2021Summer-TLS-Proxy-Attack

Latency (ping) in ms from test node to all targets, in the same order:
65, 69, 93, 191, 63, 199, 131, 133

Packet loss, in the same order:
8%, 12%, 2%, 0%, 0%, 0%, 0%, 0%

According to the data we collected today, there is no sign of a strong correlation between TLS fingerprints (from the ClientHello) and TLS handshake success rates. Instead, there is a strong sign of a hidden IP reputation system within the GFW. Using such a reputation system, the GFW could selectively attack different servers with different intensities.

Original Post 7/20/2021 00:00 UTC

Our first guess was a TLS fingerprint-based attack. Thanks to the TLSFingerprint.io project, we made use of the fact that different TLS clients may initiate a TLS connection with noticeable differences, which are enough to categorize users. If a TLS proxy client uses an unpopular fingerprint, requests made by that client are easy to separate from normal browser-based HTTPS requests.
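To illustrate how a ClientHello can be reduced to a single comparable value, here is a minimal sketch using the publicly documented JA3 layout (version, cipher suites, extensions, curves, and point formats as decimal numbers, hashed with MD5, with GREASE values ignored). This is a swapped-in technique for illustration: TLSFingerprint.io defines its own fingerprint format, and nothing here claims to reproduce it or the GFW's method. The function names are my own, and the field values in the usage note are arbitrary examples.

```python
import hashlib
from typing import Sequence


def is_grease(value: int) -> bool:
    """RFC 8701 GREASE values look like 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version: int, ciphers: Sequence[int], extensions: Sequence[int],
               curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Build the JA3 text form: fields joined by ',', values joined by '-'."""
    def field(values: Sequence[int]) -> str:
        # Order is preserved on purpose: it is part of the fingerprint.
        return "-".join(str(v) for v in values if not is_grease(v))

    return ",".join([str(version), field(ciphers), field(extensions),
                     field(curves), field(point_formats)])


def ja3_hash(version: int, ciphers: Sequence[int], extensions: Sequence[int],
             curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

For example, `ja3_string(771, [0x0A0A, 4865, 4866], [0, 23], [29, 23], [0])` drops the GREASE cipher and yields `771,4865-4866,0-23,29-23,0`. Two clients that offer the same cipher suites in a different order produce different hashes, which is why a proxy client with its own TLS stack stands out from a browser.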

By collecting fingerprints from a variety of TLS proxy client implementations, we found that most popular TLS proxy clients use uncommon TLS fingerprints, which makes them vulnerable to the attack we imagined. The only exception is XTLS/Xray-core, which noticed and fixed the TLS fingerprinting vulnerability in release v1.4.2 (big thanks to uTLS) by forging the fingerprints of major browsers. However, even with Xray-core in use, the TLS proxies are still unavailable. So far, at least, we can say the GFW isn’t using a fingerprint-based attack in real time.

After I set up a test node in China to capture some packets (PCAP), I saw strange things happening:

  • Google Chrome also had a difficult time connecting to the decoy sites. It kept changing the fingerprint used for the handshake; the first time I tried to connect, 6 ClientHello messages with 3 different fingerprints were involved.
  • cURL requests to decoy sites suffered SSL handshake failures due to connection reset.
  • TLS proxy requests are not guaranteed to fail; some requests still succeed.

This has been interesting. By expanding the experiments, I found that more than half of the TLS handshakes to a proxy server under attack experience an RST before the handshake completes. As a rough conclusion: the GFW does not show the ability to distinguish TLS proxy traffic from normal HTTPS traffic in real time.
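A measurement like "more than half of the handshakes see an RST" can be approximated with the Python standard library alone. The sketch below is an assumption about how such a probe could look, not the published verification tool: it uses Python's default TLS stack (so it cannot forge fingerprints the way uTLS does), and the names `classify`, `probe_once`, and `reset_fraction` are hypothetical.

```python
import socket
import ssl
from collections import Counter
from typing import Iterable, Optional


def classify(exc: Optional[BaseException]) -> str:
    """Map the result of one handshake attempt to a coarse outcome label."""
    if exc is None:
        return "ok"
    if isinstance(exc, ConnectionResetError):
        return "reset"  # what an injected RST usually surfaces as
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, ssl.SSLError):
        return "tls_error"
    return "other"


def probe_once(host: str, port: int = 443, timeout_s: float = 5.0) -> str:
    """Open a TCP connection, run one TLS handshake, and label the outcome."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as raw:
            # wrap_socket performs the handshake before returning.
            with context.wrap_socket(raw, server_hostname=host):
                return classify(None)
    except OSError as exc:  # covers resets, timeouts, and ssl.SSLError
        return classify(exc)


def reset_fraction(outcomes: Iterable[str]) -> float:
    """Share of attempts that ended in a connection reset."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return counts["reset"] / total if total else 0.0
```

Running `probe_once` in a loop against a server you control and passing the collected labels to `reset_fraction` gives a rough reset rate for that vantage point and target.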
