<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>How the Great Firewall of China Detects and Blocks Fully Encrypted Traffic</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>01/01/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10461783</idno>
					<idno type="doi"></idno>
					<title level='j'>USENIX Security Symposium</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Mingshi Wu</author><author>Jackson Sippe</author><author>Danesh Sivakumar</author><author>Jack Burg</author><author>Peter Anderson</author><author>Xiaokang Wang</author><author>Kevin Bock</author><author>Amir Houmansadr</author><author>Dave Levin</author><author>Eric Wustrow</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[One of the cornerstones in censorship circumvention is fully encrypted protocols, which encrypt every byte of the payload in an attempt to "look like nothing". In early November 2021, the Great Firewall of China (GFW) deployed a new censorship technique that passively detects, and subsequently blocks, fully encrypted traffic in real time. The GFW's new censorship capability affects a large set of popular censorship circumvention protocols, including but not limited to Shadowsocks, VMess, and Obfs4. Although China had long actively probed such protocols, this was the first report of purely passive detection, leading the anti-censorship community to ask how detection was possible. In this paper, we measure and characterize the GFW's new system for censoring fully encrypted traffic. We find that, instead of directly defining what fully encrypted traffic is, the censor applies crude but efficient heuristics to exempt traffic that is unlikely to be fully encrypted traffic; it then blocks the remaining non-exempted traffic. These heuristics are based on the fingerprints of common protocols, the fraction of set bits, and the number, fraction, and position of printable ASCII characters. Our Internet scans reveal what traffic and which IP addresses the GFW inspects. We simulate the GFW's inferred detection algorithm on live traffic at a university network tap to evaluate its comprehensiveness and false positives. We show evidence that the rules we inferred have good coverage of what the GFW actually uses. We estimate that, if applied broadly, it could potentially block about 0.6% of normal Internet traffic as collateral damage. Our understanding of the GFW's new censorship mechanism helps us derive several practical circumvention strategies. We responsibly disclosed our findings and suggestions to the developers of different anti-censorship tools, helping millions of users successfully evade this new form of blocking.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Fully encrypted circumvention protocols are a cornerstone of censorship circumvention solutions. Whereas protocols like TLS begin with a handshake that comprises plaintext bytes, fully encrypted (randomized) protocols, such as VMess <ref type="bibr">[23]</ref>, Shadowsocks <ref type="bibr">[22]</ref>, and Obfs4 <ref type="bibr">[7]</ref>, are designed such that every byte in the connection is functionally indistinguishable from random. The idea behind these "looks like nothing" protocols is that they should be difficult for censors to fingerprint and therefore costly to block.</p><p>On November 6, 2021, Internet users in China reported that their Shadowsocks and VMess servers were being blocked <ref type="bibr">[10]</ref>. On November 8, an Outline <ref type="bibr">[42]</ref> developer reported a sudden drop in use from China <ref type="bibr">[69]</ref>. The start of this blocking coincided with the sixth plenary session of the 19th Central Committee of the Chinese Communist Party <ref type="bibr">[1,</ref><ref type="bibr">4]</ref>, which was held on November 8-11, 2021. Blocking these circumvention tools represents a new capability in China's Great Firewall (GFW). To our knowledge, although China has been using passive traffic analysis and active probing together to identify Shadowsocks servers since May 2019 <ref type="bibr">[5]</ref>, this is the first time the censor has been able to block fully encrypted proxies en masse in real time, based entirely on passive traffic analysis. The importance of fully encrypted protocols to the entire anti-censorship ecosystem and the mysterious behaviors of the GFW motivate us to explore and understand the underlying mechanisms of detection and blocking.</p><p>In this work, we measure and characterize the GFW's new system for passively detecting and censoring fully encrypted traffic. 
We find that, instead of directly defining what fully encrypted traffic is, the censor applies at least five sets of crude but efficient heuristics to exempt traffic that is unlikely to be fully encrypted traffic; it then blocks the remaining non-exempted traffic. These exemption rules are based on common protocol fingerprints, a crude entropy test using the fraction of set bits, and the fraction, position, and maximum contiguous count of ASCII characters in the first TCP payload.</p><p>Due to the black-box nature of the GFW, our inferred rules may not be exhaustive; however, we evaluate our inferred rules on real-world traffic from a network tap at CU Boulder, and provide evidence that our rules have significant overlap with the GFW's. We also find that the inferred detection algorithm would block roughly 0.6% of all connections on our network tap. Our Internet scans show that, possibly to mitigate over-blocking caused by false positives, the GFW strategically monitors only 26% of connections, and only those to specific IP ranges of popular data centers.</p><p>We also analyze the relationship between this new form of passive censorship and the GFW's well-known active probing system <ref type="bibr">[5]</ref>; the two operate in parallel. We find that the active probing system also relies on this traffic analysis algorithm but applies additional rules based on packet length. Consequently, the circumvention strategies that can evade this new blocking will also prevent the GFW from identifying and subsequently active-probing the proxy servers.</p><p>We derive various circumvention strategies from our understanding of this new censorship system. 
We responsibly and promptly shared our findings and circumvention suggestions with the developers of various popular anti-censorship tools, including Shadowsocks <ref type="bibr">[22]</ref>, V2Ray [59], Outline <ref type="bibr">[42]</ref>, Lantern <ref type="bibr">[20]</ref>, Psiphon <ref type="bibr">[21]</ref>, and Conjure <ref type="bibr">[33]</ref>. These circumvention strategies have been widely adopted and deployed since January 2022, helping millions of users bypass this new censorship. As of February 2023, all circumvention strategies these tools adopted are reportedly still effective in China.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Background</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Traffic Obfuscation Strategies</head><p>Tschantz et al. divide approaches to obfuscating censorship circumvention traffic into two types: steganographic and polymorphic [57, &#167; V]. The goal of steganographic proxies is to make circumvention traffic look like allowed traffic; the goal of polymorphism is to make circumvention traffic not look like forbidden traffic.</p><p>The two most common approaches to achieving steganography are mimicking and tunneling. Houmansadr et al. <ref type="bibr">[39]</ref> conclude that mimicking a protocol is fundamentally flawed and suggest that tunneling through allowed protocols is a more censorship-resistant approach. Frolov and Wustrow <ref type="bibr">[35]</ref> demonstrate that even when a tunneling approach is used, it still requires effort to perfectly align protocol fingerprints with popular implementations in order to avoid blocking by protocol fingerprints. For instance, in 2012, China and Ethiopia deployed deep packet inspection to detect Tor traffic by its uncommon cipher suites <ref type="bibr">[44,</ref><ref type="bibr">55,</ref><ref type="bibr">67]</ref>. Censorship middlebox vendors have previously identified and blocked meek <ref type="bibr">[29]</ref> traffic based on its TLS fingerprint and SNI value <ref type="bibr">[28]</ref>.</p><p>To avoid this complexity, many popular proxies opt for polymorphic designs. A common way to achieve polymorphism is to fully encrypt the traffic payload, starting from the first packet in a connection. Without any plaintext or fixed header structure to fingerprint, the censor cannot easily identify proxy traffic with regular expressions or by looking for specific patterns in traffic. This design was first introduced in Obfuscated OpenSSH in 2009 <ref type="bibr">[16]</ref>. 
Since then, it has been employed by Obfsproxy <ref type="bibr">[24]</ref>, Shadowsocks <ref type="bibr">[22]</ref>, Outline <ref type="bibr">[42]</ref>, VMess <ref type="bibr">[23]</ref>, ScrambleSuit <ref type="bibr">[68]</ref>, Obfs4 <ref type="bibr">[7]</ref>, and partially used in Geph4 <ref type="bibr">[58]</ref>, Lantern <ref type="bibr">[20]</ref>, Psiphon3 <ref type="bibr">[21]</ref>, and Conjure <ref type="bibr">[33]</ref>.</p><p>Fully encrypted traffic is often referred to as "looks like nothing" traffic, or misunderstood as "having no characteristics"; however, a more accurate description would be "looks like random". In fact, such traffic does have an important characteristic that sets it apart from other traffic: Fully encrypted traffic is indistinguishable from random. Since there are no identifiable headers, traffic will have high entropy homogeneously throughout the entire connection, even in the first data packet. By contrast, even encrypted protocols like TLS have relatively low-entropy handshake headers that convey supported versions and extensions.</p><p>In </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Active Probing Attacks and Defenses</head><p>In active probing attacks, the censor sends well-crafted payloads to a suspected server and measures how it responds. If the server responds to these probes in an identifiable way (e.g., lets the censor use it as a proxy), the censor can block it. As early as August 2011, the GFW was observed to send seemingly random payloads to foreign SSH servers that accepted SSH logins from China <ref type="bibr">[49]</ref>. In 2012, the GFW first looked for a unique TLS cipher suite to identify Tor traffic; it then sent active probes to the suspected servers to confirm its guess <ref type="bibr">[64,</ref><ref type="bibr">66,</ref><ref type="bibr">67]</ref>. In 2015, Ensafi et al. conducted a detailed analysis of the GFW's active probing attacks against various protocols <ref type="bibr">[27]</ref>. Since May 2019, China has deployed a censorship system to detect and block Shadowsocks servers in two steps: it first uses the length and entropy of the first packet payload in each connection to passively identify possible Shadowsocks traffic, and then sends various probes, in different stages, to the suspected servers to confirm its guess <ref type="bibr">[5]</ref>. In response, researchers proposed various defenses against active probing attacks, including consistent server reactions <ref type="bibr">[9,</ref><ref type="bibr">34]</ref> and application fronting <ref type="bibr">[36,</ref><ref type="bibr">45]</ref>. Shadowsocks, Outline, and V2Ray have incorporated probe-resistant designs <ref type="bibr">[5,</ref><ref type="bibr">19,</ref><ref type="bibr">32,</ref><ref type="bibr">34,</ref><ref type="bibr">43,</ref><ref type="bibr">71]</ref>, which kept them unblocked in China from September 2020 <ref type="bibr">[5]</ref> until the blocking in November 2021 <ref type="bibr">[10]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Methodology</head><p>We crafted and sent various test probes between hosts inside and outside of China, letting them be observed by the GFW. We observed the GFW's reactions by capturing and comparing traffic on both endpoints. This logging allows us to identify any dropped or manipulated packets, as well as active probes.</p><p>Experiment timeline and vantage points. We summarize the timeline and vantage point usage of all major experiments in Table <ref type="table">1</ref>. In total, we used ten VPSes in TencentCloud Beijing (AS45090) and one VPS in AlibabaCloud Beijing (AS37963). We did not observe any differences in the censoring behavior between our vantage points within China or any affected external vantage points. We used four VPSes in DigitalOcean San Francisco (AS14061): three of them were affected by the new censorship, and the other one was not. We turned these four VPSes into sink servers; that is, the servers listen on all ports from 1 to 65535 to accept TCP connections, but do not send any data back to the client. We also employed two machines at CU Boulder (AS104) for Internet scanning and live traffic analysis. We checked the IP addresses of our VPSes against the IP2Location database <ref type="bibr">[3]</ref>, confirming that their geolocations are as reported by their providers.</p><p>Triggering censorship. Because fully encrypted traffic is indistinguishable from random data, in addition to using actual circumvention tools, we developed measurement tools that send random data to trigger blocking in our study. The tools complete a TCP handshake, send a random payload of a given length, and then close the connection.</p><p>Using residual censorship to confirm blockings. 
Similar to how the GFW blocks many other protocols <ref type="bibr">[13,</ref><ref type="bibr">14,</ref><ref type="bibr">17,</ref><ref type="bibr">63]</ref>, after a connection triggers the censorship, the GFW blocks all subsequent connections having the same 3-tuple (client IP, server IP, server port) for 180 seconds. This residual censorship allows us to confirm blocking by sending follow-up connections from the same client to the same port of the server. We make five TCP connections one by one with a one-second interval in between. If all five connections fail, we conclude that the 3-tuple is blocked. Once a 3-tuple is blocked, we do not use it for further tests in the next 180 seconds.</p><p>Accounting for probabilistic blocking with repeated tests. We often had to make multiple connections with the same payload before we observed blocking. In Section 6.3, we explain that this is because the GFW employs a probabilistic blocking strategy, where censorship is only triggered approximately a quarter of the time. To account for this probabilistic behavior, we send the same payload in up to 25 connections before drawing any conclusion about whether it is blocked. If we can successfully make 25 connections with the same payload in a row, then we conclude that the payload (or server) is not affected by this censorship. If, after sending the payload at least once, a sequence of 5 subsequent connection attempts time out (due to residual censorship), we label the payload (and server) as affected by censorship. We use this method of repeated connections to measure blocked payloads in all the tests throughout our study.</p></div>
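<div xmlns="http://www.tei-c.org/ns/1.0"><p>The repeated-test procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the measurement tool used in the study: the names classify_payload and scripted, the "inconclusive" label, and the connect callback (which stands in for opening a fresh TCP connection, sending the payload, and reporting whether the attempt succeeded) are all hypothetical. The one-second spacing between attempts is omitted.</p><p><![CDATA[
```python
from typing import Callable

MAX_SUCCESSES = 25      # stop after this many successful connections
CONFIRM_TIMEOUTS = 5    # consecutive timeouts that indicate residual censorship


def classify_payload(connect: Callable[[], bool]) -> str:
    """Label a payload by repeatedly sending it over fresh TCP connections.

    `connect` is a hypothetical callback: it opens a new connection to the
    same server port, sends the payload, and returns True on success or
    False if the attempt timed out.
    """
    successes = 0
    timeouts_in_a_row = 0
    while successes < MAX_SUCCESSES:
        if connect():
            successes += 1
            timeouts_in_a_row = 0
            continue
        timeouts_in_a_row += 1
        if timeouts_in_a_row == CONFIRM_TIMEOUTS:
            # Timeouts only count as censorship once the payload got through
            # at least once; otherwise the server may simply be unreachable.
            return "affected" if successes > 0 else "inconclusive"
    return "not affected"


def scripted(outcomes, default):
    """Build a fake `connect` that replays `outcomes`, then returns `default`."""
    it = iter(outcomes)
    return lambda: next(it, default)
```
]]></p><p>For example, classify_payload(scripted([True, True, True], False)) models a payload that gets through three times before residual censorship sets in, and is labeled as affected.</p></div>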
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Characterizing the New Censorship System</head><p>We conduct experiments to understand how the GFW detects and blocks fully encrypted connections. As detailed in Table <ref type="table">1</ref>, between November 6, 2021 and May 18, 2022, we used three VPSes in China and three sink servers in the US to conduct our experiments. During the same period, we also used one VPS in AlibabaCloud Beijing (AS37963) to repeat all our experiments. We did not observe any differences in the censoring behavior among our vantage points within China or any affected external vantage point. On February 16, 2023, we reran our experiments and confirmed that all detection rules still held. This time, we used one VPS in TencentCloud Beijing and one sink server in DigitalOcean San Francisco. Algorithm 1 presents a high-level overview of the GFW's detection rules we inferred, and Figure <ref type="figure">1</ref> illustrates examples of these inferred rules in action. While we cannot infer the order in which these rules are applied or whether they are exhaustive, our experiments confirm specific components of the GFW's censorship strategy. We find that, instead of directly defining what fully encrypted traffic is, the censor applies at least five sets of crude but efficient heuristic rules to exempt traffic that is unlikely to be fully encrypted traffic; it then blocks the remaining non-exempted traffic. These exemption rules are based on common protocol fingerprints, a crude entropy test using the fraction of bits set, and the fraction, position, and maximum contiguous count of ASCII characters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 1</head><p>The GFW uses at least five heuristic rules to detect and block fully encrypted traffic. The censor applies this algorithm to TCP connections sent from China to certain IP subnets and employs probabilistic blocking (Section 6). Allow a connection to continue if the first TCP payload (pkt) sent by the client satisfies any of the following exemptions:</p><p>Ex1: popcount(pkt) / len(pkt) &#8804; 3.4 or popcount(pkt) / len(pkt) &#8805; 4.6.</p><p>Ex2: The first six (or more) bytes of pkt are in [0x20, 0x7e].</p><p>Ex3: More than 50% of pkt's bytes are in [0x20, 0x7e].</p><p>Ex4: More than 20 contiguous bytes of pkt are in [0x20, 0x7e].</p><p>Ex5: It matches the protocol fingerprint for TLS or HTTP.</p><p>Block if none of the above hold.</p></div>
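<div xmlns="http://www.tei-c.org/ns/1.0"><p>Algorithm 1 can be written as a short Python sketch. It is an illustration of the inferred rules, not the censor's implementation. Several details are assumptions made only for this sketch: the rule order (the order the GFW uses is unknown), the handling of payloads shorter than six bytes in Ex2, and the TLS check, which here tests only the record type and the first version byte and leaves the second version byte unconstrained. The function names are hypothetical. Probabilistic blocking and the restriction to certain destination subnets are not modeled.</p><p><![CDATA[
```python
from typing import Optional

# HTTP verbs that are explicitly exempted; the trailing space is required.
HTTP_PREFIXES = (b"GET ", b"PUT ", b"POST ", b"HEAD ")


def is_printable(byte: int) -> bool:
    """True for bytes in the printable ASCII range 0x20-0x7e."""
    return 0x20 <= byte <= 0x7E


def longest_printable_run(pkt: bytes) -> int:
    """Length of the longest contiguous run of printable bytes."""
    best = run = 0
    for byte in pkt:
        run = run + 1 if is_printable(byte) else 0
        best = max(best, run)
    return best


def exemption(pkt: bytes) -> Optional[str]:
    """Return the first matching exemption, or None if pkt would be blocked."""
    if not pkt:
        raise ValueError("expected a non-empty first payload")
    n = len(pkt)
    ones = sum(bin(byte).count("1") for byte in pkt)
    # Ex1 in integer arithmetic: 3.4 = 17/5 and 4.6 = 23/5 bits per byte.
    if 5 * ones <= 17 * n or 5 * ones >= 23 * n:
        return "Ex1"
    # Ex2 (assumption: payloads shorter than six bytes do not qualify).
    if n >= 6 and all(is_printable(b) for b in pkt[:6]):
        return "Ex2"
    if 2 * sum(is_printable(b) for b in pkt) > n:
        return "Ex3"
    if longest_printable_run(pkt) > 20:
        return "Ex4"
    # Ex5 (assumption: the second TLS version byte is left unconstrained).
    is_tls = n >= 3 and pkt[0] in (0x16, 0x17) and pkt[1] == 0x03
    is_http = pkt[:5].upper().startswith(HTTP_PREFIXES)
    if is_tls or is_http:
        return "Ex5"
    return None
```
]]></p><p>Applied to the 256 single-byte patterns repeated 100 times, this sketch returns None for exactly 40 of them, which agrees with the count of blocked patterns reported in Section 4.1.</p></div>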
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Entropy Exemption (Ex1)</head><p>We observed that the fraction of bits set influences whether a connection is blocked. To determine this, we sent repeated connections to our server and observed which were blocked.</p><p>In each connection, we sent one of 256 different byte patterns, consisting of 1 byte repeated 100 times (e.g., \x00\x00\x00..., \x01\x01\x01..., ..., \xff\xff\xff...). We sent each pattern in 25 connections to our server, and observed if any patterns resulted in blocking subsequent connections, indicating the payload triggers blocking. We found 40 byte patterns triggered blocking, while the remaining 216 patterns did not. Example patterns that were blocked include \x0f\x0f\x0f..., \x17\x17\x17..., and \x1b\x1b\x1b... (and 37 others). All of the blocked patterns consist of bytes with exactly 4 (out of 8) bits that were 1 (for instance, \x1b in binary is 00011011). We hypothesized that the number of set bits (1 bits) per byte may play a role, as uniformly random data will have close to the same number of total 1s and 0s in binary. In effect, this is essentially measuring the entropy of the bits within the client's packet.</p><p>We confirmed this by sending combinations of bytes that were individually allowed, but together resulted in being blocked. For example, both \xfe\xfe\xfe... and \x01\x01\x01... were not blocked individually, but these bytes sent together as \xfe\x01\xfe\x01... resulted in blocking. We note \xfe\x01 has 8 (out of 16) bits set to 1 (an average of 4 bits per byte set), while \xfe has 7 out of 8, and \x01 has 1 of 8 set, explaining why individually they are allowed, but together they are blocked.</p><p>Of course, random or encrypted data will not always have exactly half of the bits set to 1. 
We tested how close to half the fraction of set bits needed to be for the GFW to block, by sending a sequence of 50 random bytes (400 bits) with an increasing number of bits set. We produced 401 bitstrings with 0-400 bits set to 1, and shuffled each string, yielding a set of random strings with 0-8 bits set per byte (in increments of 0.02 bits/byte). For each string, we made 25 connections and sent the string to observe if it triggered subsequent connections to be blocked. We found that all strings with &#8804; 3.4 or &#8805; 4.6 bits/byte set were not blocked, while strings with between 3.4 and 4.6 bits/byte set were blocked.</p><p>There was a single exception to this for a string with 4.26 bits/byte set, which we determined was not blocked due to having over 50% of its bytes be printable ASCII characters; we show next that this is an exemption rule (Ex3). We repeated our experiment and confirmed that other strings with the same number of bits set but fewer printable ASCII characters are indeed blocked.</p><p>In summary, we find that the GFW exempts a connection if the fraction of bits set in the client's first data packet deviates from half. This corresponds to a crude measure of entropy: random (encrypted) data will have close to half of the bits set to 1, while other protocols usually have fewer 1 bits per byte due to plaintext or zero-padded protocol headers. For instance, Google Chrome version 105 sends a TLS Client Hello with an average of only 1.56 bits set per byte, falling outside the censorship range, owing to padding with zeros.</p></div>
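<div xmlns="http://www.tei-c.org/ns/1.0"><p>The bit-count sweep described above can be reproduced offline with a short Python sketch. The function names and the use of Python's random module are assumptions made for illustration; the original probe generator is not shown in the text. The sketch builds a 50-byte string with exactly k set bits at shuffled positions and checks whether its bits-per-byte value falls strictly between 3.4 and 4.6. It considers only Ex1, so it would not predict the printable-ASCII exception noted for the 4.26 bits/byte string.</p><p><![CDATA[
```python
import random
from typing import Optional


def payload_with_set_bits(k: int, n_bytes: int = 50,
                          rng: Optional[random.Random] = None) -> bytes:
    """Return n_bytes bytes with exactly k one-bits at shuffled positions."""
    n_bits = 8 * n_bytes
    if not 0 <= k <= n_bits:
        raise ValueError("k must be between 0 and 8 * n_bytes")
    rng = rng or random.Random()
    bits = [1] * k + [0] * (n_bits - k)
    rng.shuffle(bits)
    value = int("".join(map(str, bits)), 2) if bits else 0
    return value.to_bytes(n_bytes, "big")


def popcount(pkt: bytes) -> int:
    """Total number of bits set to 1 in pkt."""
    return sum(bin(byte).count("1") for byte in pkt)


def in_blocked_band(pkt: bytes) -> bool:
    """True if 3.4 < popcount/len < 4.6, using exact integer arithmetic."""
    ones, n = popcount(pkt), len(pkt)
    return 17 * n < 5 * ones < 23 * n


# One probe for each bit count from 0 to 400, as in the experiment.
rng = random.Random(2021)
probes = [payload_with_set_bits(k, rng=rng) for k in range(401)]
blocked_counts = [k for k, p in enumerate(probes) if in_blocked_band(p)]
```
]]></p><p>For 50-byte payloads the band corresponds to 171 through 229 set bits, since 3.4 and 4.6 bits/byte equal 170 and 230 bits in total. The same popcount helper gives 7 for \xfe, 1 for \x01, and 8 for \xfe\x01, matching the combination test above.</p></div>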
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">ASCII Characters Exemption (Ex2-4)</head><p>We observed several exceptions to the bit-counting rule we discovered in Section 4.1. For instance, the pattern \x4b\x4b\x4b... was not blocked, despite having exactly 4 bits set per byte. Indeed, there are 70 byte values (8 choose 4) that have exactly 4 bits set, but our analysis found that only 40 of those triggered censorship. What about the other 30?</p><p>These other 30 byte values all fall within the byte range that comprises the printable ASCII characters, 0x20-0x7e. We conjecture that the GFW exempts these characters to allow "plaintext" (human-readable) protocols.</p><p>We found three ways in which the GFW exempts connections based on printable ASCII characters in the first packet payload from the client: if the first six bytes are printable (Ex2); if more than half of the bytes are printable (Ex3); or if it contains more than 20 contiguous printable bytes (Ex4).</p><p>First six bytes are printable (Ex2). We observe that the GFW exempts a connection from blocking if its first 6 bytes fall within the printable byte range 0x20-0x7e. If there are characters outside this range in the first 6 bytes, then a connection may be blocked, assuming it does not have other exempting properties (for example, fewer than 3.4 bits per byte set). We tested this by generating messages where the first n bytes were sourced from different character sets (such as ASCII printable characters) and the rest of the message consisted of random unprintable characters. We find that for n &lt; 6, we observe censorship, but for n &#8805; 6 where the first n bytes are ASCII printable characters, no blocking occurs.</p><p>More than half of the first packet is printable (Ex3). If more than half of all bytes in the first packet fall into the printable ASCII range 0x20-0x7e, the GFW exempts the connection. We tested this by sending packets consisting of 10 bytes of characters outside this range (e.g., 0xe8), followed by a repeating sequence of 6 bytes: 5 within the range (e.g., 0x4b), and one outside. We repeat this 6-byte sequence 5 times, and then pad the end of the string with n bytes outside the range (in Python notation: "\xe8" * 10 + ("\x4b" * 5 + "\xe8") * 5 + "\xe8" * n). This experiment gives us a variable-length pattern that decreases the fraction of bytes in the printable ASCII range as we increase n. We find that for n &lt; 10, connections are not blocked, while for n &#8805; 10 they are. This corresponds to blocking when the fraction of printable characters is less than or equal to half, and not blocking when greater than half.</p><p>We design our probes to avoid triggering other GFW exemptions, such as bit counts (Ex1), printable prefixes (Ex2), or runs of printable characters (Ex4). For example, we use 0x4b and 0xe8 as our printable and non-printable characters respectively, since they both have exactly 4 bits set. This prevents the GFW from exempting our connection from blocking due to the bit count rule (Ex1) discussed previously. In addition, we avoid having contiguous runs of printable 0x4b characters, as we observed that such runs can also exempt a connection from blocking, which we discuss next. We repeated our experiments with other patterns that also met these constraints (e.g., 0x8d and 0x2e), and observed the same results.</p><p>Figure 1(c). Contiguous printable exemption (Ex4): the GFW counts the maximum number of contiguous printable bytes, and exempts a connection if the value is more than 20 bytes.</p><p>More than 20 contiguous bytes are printable (Ex4). A contiguous run of printable characters can also exempt blocking, even if the total fraction of printable characters is less than half. 
To test this, we sent a pattern of 100 bytes of a character outside the printable range (0xe8) with a varying number of contiguous bytes from the printable range (we used 0x4b). Our payload started with 10 bytes of 0xe8, followed by n bytes of 0x4b, and then 90 - n bytes of 0xe8, for a total length of 100 bytes. We varied n from 0 to 90, and sent each of the 91 payloads in 25 connections to our server. We found that with n &#8804; 20, the connection was blocked. For n &gt; 20, the connection was not blocked, indicating that the presence of a run of printable characters exempts a connection from blocking. Of course, for n &gt; 50, the connection is also exempt because of Ex3.</p><p>Other encodings. We tested whether Chinese characters in the first packet were exempted from blocking in the same way as printable ASCII characters are. We used strings of 6-36 Chinese characters encoded in UTF-8, as well as GBK (identical to GB2312 for the characters we used). All of these tests were blocked, suggesting that there is no exemption for Chinese characters. It is possible that the presence of Chinese characters in these encodings is rare, or that parsing these encodings adds unjustified complexity since it is hard to know where an encoded string starts or ends.</p></div>
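<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Ex3 and Ex4 probes described in this section are simple enough to rebuild and check offline. The sketch below is illustrative only; the function names are hypothetical, and the byte values 0xe8 and 0x4b are the ones quoted in the text. It confirms the arithmetic behind the observed thresholds: the Ex3 probe contains 25 printable bytes out of 40 + n, so the printable fraction exceeds one half exactly when n &lt; 10, and the Ex4 probe has a longest printable run of exactly n bytes.</p><p><![CDATA[
```python
def is_printable(byte: int) -> bool:
    """True for bytes in the printable ASCII range 0x20-0x7e."""
    return 0x20 <= byte <= 0x7E


def ex3_probe(n: int) -> bytes:
    """10 non-printable bytes, five (5 printable + 1 non-printable) groups,
    then n non-printable padding bytes."""
    return b"\xe8" * 10 + (b"\x4b" * 5 + b"\xe8") * 5 + b"\xe8" * n


def ex4_probe(n: int) -> bytes:
    """100 bytes: 10 non-printable, a run of n printable, 90 - n non-printable."""
    return b"\xe8" * 10 + b"\x4b" * n + b"\xe8" * (90 - n)


def more_than_half_printable(pkt: bytes) -> bool:
    return 2 * sum(is_printable(b) for b in pkt) > len(pkt)


def longest_printable_run(pkt: bytes) -> int:
    best = run = 0
    for byte in pkt:
        run = run + 1 if is_printable(byte) else 0
        best = max(best, run)
    return best


# Byte values with exactly 4 of 8 bits set, split by printability.
four_bit_values = [b for b in range(256) if bin(b).count("1") == 4]
printable_four_bit = [b for b in four_bit_values if is_printable(b)]
```
]]></p><p>The last two lists have 70 and 30 entries respectively, which leaves the 40 non-printable values that triggered blocking in Section 4.1.</p></div>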
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Common Protocols Exemption (Ex5)</head><p>We observe that the GFW explicitly exempts two popular protocols, presumably to avoid blocking them by mistake. The GFW appears to infer protocols from the first 3-6 bytes of the client's packet: if they match the bytes of a known protocol, the connection is exempted from blocking, even if the rest of the packet does not conform to the protocol. We tested six common protocols and found that the TLS and HTTP protocols are explicitly exempted. This list may not be exhaustive, as there may be other exempted protocols we did not test.</p><p>TLS. TLS connections start with a TLS Client Hello message, and the first three bytes of this message cause the GFW to exempt the connection from blocking. We observe that the GFW exempts any connection whose first three bytes match a regular expression that corresponds to the one-byte record type, followed by a two-byte version. We enumerated all 256 patterns of 'XX\x03\x03' followed by 97 bytes of random data, and found all patterns were blocked except those that start with either 0x16 (corresponding to the Handshake TLS record type, used in the Client Hello) or 0x17 (corresponding to the Application Data record type). While normal TLS connections do not begin with Application Data <ref type="bibr">[52,</ref><ref type="bibr">53]</ref>, when TLS is used over Multipath-TCP (MPTCP) <ref type="bibr">[31]</ref>, it is common for one of the TCP subflows to be used for the Client Hello and for other subflows to send Application Data immediately after the TCP connection is established <ref type="bibr">[15]</ref>. As of today, only TLS versions 0x03[0x00-0x03] have been defined <ref type="bibr">[52,</ref><ref type="bibr">53]</ref>, but the GFW allows even later (not yet defined) versions.</p><p>HTTP. The byte pattern used by the censor to identify HTTP traffic is simply the method followed by a space. 
If a message starts with GET , PUT , POST , or HEAD , the connection will be exempt from blocking. The space character (0x20) after each verb is necessary to exempt connections from blocking. Omitting this space character, or replacing it with any other byte, will not exempt the connection. The other HTTP methods (OPTIONS , DELETE , CONNECT , TRACE , PATCH ) fall into the ASCII printable exemption (Ex2), as the first 6 bytes are printable characters. We find that the method is case-insensitive: GeT , get , and similar variations are exempt. Typos in the verb (e.g., TEG ) are not exempt.</p><p>Non-exempted protocols. We tested other common protocols that have no explicit exemption: SSH, SMTP, and FTP would still be exempt as they all start with at least 6 bytes of printable ASCII (rule Ex2). DNS-over-TCP is exempt by rule Ex1 because it contains a large fraction of zeros. However, if a large enough amount of random data were appended after a DNS-over-TCP message, it would be blocked.</p><p>This observation raises the question of why the censor has explicit rules to exempt TLS and HTTP, but not other protocols. After all, the censor does not need to exempt these two protocols explicitly: HTTP will commonly be exempt by printable ASCII for the first 6 bytes (rule Ex2), and TLS Client Hello messages have relatively low bit-wise entropy (rule Ex1), owing to many zero fields. Nonetheless, the censor may employ these simple but efficient rules to quickly exempt the bulk of traffic (TLS and HTTP) from the more in-depth analysis of calculating the popcount, fraction of ASCII, etc.</p></div>
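<div xmlns="http://www.tei-c.org/ns/1.0"><p>A short check illustrates why only four HTTP verbs need their own rule when a probe consists of a verb, a space, and non-printable filler. This is a sketch under that assumption about the probe layout; the names and the filler byte 0xe8 are chosen for illustration. A verb plus its trailing space supplies six or more printable bytes for OPTIONS, DELETE, CONNECT, TRACE, and PATCH, which is enough for Ex2, but only four or five bytes for GET, PUT, POST, and HEAD.</p><p><![CDATA[
```python
EXPLICIT_VERBS = ["GET", "PUT", "POST", "HEAD"]
OTHER_VERBS = ["OPTIONS", "DELETE", "CONNECT", "TRACE", "PATCH"]


def probe(verb: str, filler_len: int = 90) -> bytes:
    """Verb, one space, then non-printable filler (an assumed probe layout)."""
    return verb.encode("ascii") + b" " + b"\xe8" * filler_len


def first_six_printable(pkt: bytes) -> bool:
    """Ex2: the first six bytes all lie in the printable range 0x20-0x7e."""
    return len(pkt) >= 6 and all(0x20 <= b <= 0x7E for b in pkt[:6])


covered_by_ex2 = {v: first_six_printable(probe(v))
                  for v in EXPLICIT_VERBS + OTHER_VERBS}
```
]]></p><p>In this model, covered_by_ex2 is False for the four explicitly exempted verbs and True for the other five, consistent with the behavior reported above. A complete real-world request line would usually satisfy Ex2 regardless of the verb.</p></div>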
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">How the GFW Disrupts Connections</head><p>Once the GFW detects fully encrypted traffic using Algorithm 1, it blocks the subsequent traffic as described below.</p><p>Packets are dropped from client to server. We triggered the GFW's blocking and compared the captured packets from both the sending client and receiving server. We observe that after triggering blocking, the client's packets are dropped by the GFW, and do not reach the server. However, packets sent by the server are not blocked and are still received at the client.</p><p>UDP traffic is not affected. The new censorship system is limited to TCP. Sending a UDP datagram with a random payload cannot trigger the blocking. Additionally, once a 3-tuple (client IP, server IP, server port) is blocked due to a triggering TCP connection, UDP datagrams to or from the same (server IP, server port) are not affected. Because of the absence of UDP blocking, users may experience odd behavior while using Shadowsocks: they can still access websites or use apps that rely on UDP (e.g., QUIC or FaceTime), but cannot access websites that use TCP. This is because Shadowsocks proxies TCP traffic with TCP and proxies UDP traffic with UDP. Not detecting or blocking UDP traffic may reflect the censor's "worse is better" engineering mindset. From a practical view, the current TCP blocking can already effectively paralyze these popular circumvention tools, while employing UDP censorship requires additional resources and adds complexity to the censorship system.</p><p>Traffic on all ports can get blocked. We set up a sink server listening on all ports from 1 to 65535 in the US. We then let our client in China continuously make connections with 50-byte random payloads to each port of the US server and stop when a port got blocked. We find that blocking can happen on all ports from 1 to 65535. 
Therefore, running circumvention servers on an unusual port cannot mitigate the blocking. We also do not observe any difference in the censor's behavior among ports.</p><p>The duration of residual censorship is affected by the number of ongoing residual blocks. We find that once this new censorship system blocks a connection, it continues to drop all subsequent TCP packets having the same 3-tuple (client IP, server IP, server port) for 120 or 180 seconds. This behavior is often referred to as "residual censorship" <ref type="bibr">[13,</ref><ref type="bibr">14,</ref><ref type="bibr">17,</ref><ref type="bibr">63]</ref>. Unlike some other residual censorship systems <ref type="bibr">[13]</ref>, the GFW's residual censorship timer does not reset when additional packets are sent.</p><p>We also find that the GFW seems to limit the number of connections it residually blocks at any given time. We let our clients in China repeatedly make connections to 500 ports of a single server simultaneously. In each connection, the client sent 50 bytes of random data and then closed the connection. We recorded the duration of each occurrence of residual censorship. As shown in Figure <ref type="figure">2</ref>, in comparison to the 180 s duration when only one port is blocked, the residual censorship duration in this experiment decreased dramatically.</p><p>Figure 2: Residual censorship duration. When we repeatedly send 50-byte random data to 500 ports of a single server simultaneously, the residual censorship time decreases dramatically. About 40% of the blockings lasted only 10 s, shorter than the 180 s duration when only one port was blocked. This suggests that the GFW may limit the number of connections it residually blocks at any given time.</p></div>
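<div xmlns="http://www.tei-c.org/ns/1.0"><p>The residual censorship behavior described above (blocking keyed on a 3-tuple, with a timer that does not reset on further packets) can be modeled with a small table of expiry times. This is a toy model for reasoning about measurements, not the GFW's implementation; the class name and the fixed 180 s default are assumptions, and the apparent cap on concurrent residual blocks is deliberately not modeled because its mechanism is unknown.</p>

```python
class ResidualBlocklist:
    """Toy model: block a (client IP, server IP, server port) 3-tuple for a fixed time."""

    def __init__(self, duration=180.0):
        self.duration = duration      # seconds; the text reports 120 or 180
        self._expiry = {}             # 3-tuple -> absolute expiry time

    def is_blocked(self, client_ip, server_ip, server_port, now):
        key = (client_ip, server_ip, server_port)
        expiry = self._expiry.get(key)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[key]     # the block has lapsed
            return False
        return True

    def trigger(self, client_ip, server_ip, server_port, now):
        """Start a block unless one is already active; an active timer is never reset."""
        if not self.is_blocked(client_ip, server_ip, server_port, now):
            self._expiry[(client_ip, server_ip, server_port)] = now + self.duration
```

<p>Because the key omits the client port, every new connection from the same client to the same server port is dropped until the timer lapses, regardless of how many packets are sent in the meantime.</p></div>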
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5">How the GFW Reassembles Flows</head><p>In this section, we examine how the GFW's new censorship system reassembles flows and considers flow directions.</p><p>A complete TCP handshake is necessary. We observe that sending a SYN packet followed by a PSH+ACK packet containing random data (without the server completing its end of the handshake) is not sufficient to trigger blocking. The blocking is thus harder to exploit for residual censorship attacks <ref type="bibr">[13]</ref>.</p><p>Only client-to-server packets can trigger the blocking. We find that the GFW not only checks whether the random data is sent to a destination IP address that falls in an affected IP range, but also blocks only if the random data is sent from client to server. The server here is defined as the host that sends a SYN+ACK during the TCP handshake.</p><p>We learned this by setting up four experiments between the same two hosts. In the first experiment, we let the Chinese client connect and send random data to the foreign server; in the second experiment, we still let the Chinese client connect to the foreign server, but let the foreign server send random data to the client; in the third experiment, we let the US client connect and send random data to the Chinese server; in the fourth experiment, we let the US client connect to the Chinese server, but then let the Chinese server send random data to the US client. Only connections in the first experiment were blocked.</p><p>The GFW only examines the first data packet. The GFW appears to analyze only the first data packet in a TCP connection, without reassembling flows that span multiple data packets. We tested this with the following experiment. After a TCP handshake, we send the first data packet with only one byte of payload \x21. After waiting for one second, we then send the second data packet with a 200-byte random payload. 
We repeated the experiment 25 times, but the connections never got blocked. This is because after seeing the first data packet, the GFW had already exempted the connections by rule Ex3, as the payload contained 100% printable ASCII. In other words, if the GFW reassembled multiple packets into a flow during its traffic analysis, it would have been able to block these connections.</p><p>We found that the GFW does not wait until seeing an ACK response from the server to block a connection. We configured our server to drop any outgoing ACK packets with an iptables rule. We then made connections with 200-byte random payloads to the server. The GFW still blocked these connections even though the server never sent any ACK packets.</p><p>The GFW waits at least 5 minutes for the first data packet. We examine how long the GFW monitors a TCP connection after the TCP handshake, but before it sees the first data packet. From the observation that a complete TCP handshake is required to trigger the blocking, we infer the GFW may be stateful. It is thus reasonable to suspect the GFW only monitors a connection for a limited amount of time, as it can be expensive to maintain state forever without expiring it.</p><p>Our client completed TCP handshakes and then waited for 100, 180, or 300 seconds before sending 200 bytes of random data. We then repeated the experiment but used iptables rules to drop any RST or TCP keepalive packets in case they helped the GFW keep the connection state active. We found that these connections still triggered blocking, suggesting the GFW maintained connection state for at least five minutes.</p></div>
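<div xmlns="http://www.tei-c.org/ns/1.0"><p>The split-packet experiment above can be sketched as a helper that writes one printable byte, pauses, and then writes the random payload. This is a minimal sketch rather than the measurement code used in the study: the function name and parameters are hypothetical, and it assumes the caller passes an already connected stream socket.</p>

```python
import os
import socket
import time


def send_split_first_packet(sock, delay=1.0, tail_len=200):
    """Send a single '!' byte, wait, then send random bytes on a connected socket."""
    sock.sendall(b"\x21")         # first data packet: one printable byte
    time.sleep(delay)             # gap so the two writes leave as separate packets
    tail = os.urandom(tail_len)   # second data packet: random payload
    sock.sendall(tail)
    return tail
```

<p>With a one-second gap there is no unacknowledged data pending at the second write, so the two writes would normally leave as separate segments on a real TCP connection; a packet capture is still the only way to confirm how the bytes were actually segmented.</p></div>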
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Relation with the Active Probing System</head><p>As introduced in Section 2.2, the GFW has been sending active probes to Shadowsocks servers since 2019 <ref type="bibr">[5]</ref>. In this section, we study the relationship between this newly discovered real-time blocking system and the existing active probing system. By conducting designed measurement experiments and analyzing historical datasets, we show that while these two censorship systems work in parallel, the current traffic analysis module of the active probing system applies all five sets of exemption rules summarized in Algorithm 1 and Figure <ref type="figure">1</ref>, with one additional rule that examines the payload length of the first data packet. We also show evidence that the traffic analysis algorithm used by the active probing system <ref type="bibr">[5]</ref> may have evolved since 2019.</p><p>Table 2: One US host is known to be affected by the current blocking system, while the other US host is unaffected. In total, our client in China repeatedly sent around 170k connections to each port of the two US servers. The only exception is that, when residual censorship was triggered and the client could not make connections to the affected server, the total number of successful legitimate connections was around 33k.</p><p>Active probing experiment. Prior to the deployment of this new real-time blocking system, inferring the traffic analysis algorithm of the active probing system was extremely challenging, if possible at all. This is because the GFW employs an arbitrary delay between seeing a triggering connection and sending active probes <ref type="bibr">[5, &#167;3.5]</ref>, which makes it hard to account for which probes by the GFW are triggered by which connections we send. Now that we have inferred a list of traffic detection rules of this new blocking system in Section 4, we can test whether a payload exempted by Algorithm 1 will also escape suspicion by the active probing system. 
We conducted the experiments between May 19, 2022 and June 8, 2022. As shown in Table <ref type="table">2</ref>, we crafted 14 different types of payloads: three of them are random data with lengths of 2, 50, and 200 bytes; the remaining 11 were data with various lengths that will only be exempted by exactly one of the exemption rules in Algorithm 1. We then sent the same 14 payloads from a VPS in Tencent Cloud Beijing China, to 14 ports of two different hosts in DigitalOcean San Francisco US. One US host is known to be affected by the current blocking system, while the other US host is unaffected. This way, if we received any probes from the GFW, we know certain exemption rules used by the current blocking system are not used by the active probing system.</p><p>In total, our client in China sent around 170k connections to each port of the two US servers. We then took steps to isolate the GFW's probes from other Internet scanners'. We check the source IP address of each probe against IP2Location database <ref type="bibr">[3]</ref> and AbuseIPDB <ref type="bibr">[2]</ref>. We do not consider it as a probe from the GFW if it was a non-Chinese IP or from a known spammer IP address. We further check if the probe belongs to any known types of probes sent by the GFW.</p><p>The two systems work independently. The new censorship machine makes its blocking decisions purely based on passive traffic analysis, without relying on China's well-known active probing infrastructure <ref type="bibr">[5,</ref><ref type="bibr">27,</ref><ref type="bibr">64,</ref><ref type="bibr">66,</ref><ref type="bibr">67]</ref>. We know this because, while the GFW still sends active probes to the servers, in more than 99% of the tests, the GFW did not send any active probes to the server before blocking a connection. For example, as summarized in Table <ref type="table">2</ref>, we made 33,119 connections but only received 179 active probes. 
Indeed, similar to the findings by prior work [5, &#167;4.2], active probes are rarely triggered.</p><p>We want to emphasize that this finding does not mean that defenses against active probing are not necessary or not important anymore <ref type="bibr">[5,</ref><ref type="bibr">9,</ref><ref type="bibr">34]</ref>. On the contrary, we believe that the GFW's reliance on purely passive traffic analysis is partially because Shadowsocks, Outline, VMess, and many other censorship circumvention implementations have adopted effective defenses against active probing <ref type="bibr">[5,</ref><ref type="bibr">9,</ref><ref type="bibr">19,</ref><ref type="bibr">32,</ref><ref type="bibr">34,</ref><ref type="bibr">43,</ref><ref type="bibr">71]</ref>. The fact that the GFW still sends active probes to servers implies that the censor still attempts to use active probing to accurately identify circumvention servers whenever possible.</p><p>The active probing system applies the five exemption rules, with one additional length rule, to suspect traffic. This experiment suggests two points. First, similar to the findings by Alice et al. [5, &#167;4.2], the active probing system applies an additional rule to examine the length of the connection. In our case, only connections with 200-byte payloads ever triggered the active probing, not ones with 2 bytes or 50 bytes. Second, the traffic exempted by any of the five rules discovered in Algorithm 1 will also not trigger the active probing system.</p><p>The active probing system has evolved since 2019. We want to know if the same detection rules in Algorithm 1 were historically used to trigger active probing. To analyze it, we obtained 282 payloads that got replayed (and thus once triggered the GFW) in the low-entropy experiment from Alice et al. [5, &#167;4.1]. We then wrote a program to determine if a payload would be exempted by the current blocking system, and fed the program with the obtained 282 payloads. 
As a result, 45 payloads that previously triggered active probing were exempted (by rule Ex3). On May 19, 2022, we repeatedly sent these 45 payloads through the GFW, confirming that they were indeed exempted from the current blocking. For each payload, we made 25 connections with it from a VPS in TencentCloud Beijing to a sink server in DigitalOcean SFO. This result suggests that the GFW has likely updated the traffic analysis module of its active probing system since 2020. In addition, the probes sent by the current GFW are also different from those observed in 2020 <ref type="bibr">[5, &#167;3.2]</ref>. The new probes are essentially random payloads that are distributed in trios of 16, 64, and 256 bytes. For each of these lengths, the GFW sent about the same number of probes: 48, 46, and 47 to one server, and 238, 228, and 233 to the other.</p></div>
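<div xmlns="http://www.tei-c.org/ns/1.0"><p>The probe-isolation steps in this section (discard non-Chinese sources and known abusive sources, then look at probe lengths) can be sketched as a simple filter and tally. The two lookup callables below are hypothetical stand-ins for queries against a geolocation database and an abuse list, not real APIs of IP2Location or AbuseIPDB, and matching only on the three reported lengths is a simplification of checking for known GFW probe types.</p>

```python
from collections import Counter

# Probe lengths reported in the text for the current probes.
GFW_PROBE_LENGTHS = (16, 64, 256)


def tally_candidate_probes(probes, country_of, is_known_abuser):
    """Count probes per length after filtering out unlikely GFW sources.

    probes: iterable of (source_ip, payload_bytes) pairs.
    country_of, is_known_abuser: caller-supplied lookups (hypothetical helpers).
    """
    counts = Counter()
    for src_ip, payload in probes:
        if country_of(src_ip) != "CN":
            continue                      # not from a Chinese address
        if is_known_abuser(src_ip):
            continue                      # known scanner or spammer
        if len(payload) in GFW_PROBE_LENGTHS:
            counts[len(payload)] += 1
    return counts
```

<p>Roughly equal counts across the three lengths, as in the numbers reported above, would be consistent with probes being sent in trios.</p></div>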
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Understanding the Blocking Strategies</head><p>In this section, we conduct measurement experiments to characterize the censor's blocking strategies. We find that, possibly to mitigate false positives and reduce operation costs, the censor strategically limits the scope of blocking to specific IP ranges of popular data centers, and it applies a probabilistic blocking strategy to 26% of all connections to these IP ranges.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Internet Scanning Experiment</head><p>On May 12, 2022, we performed a 10% IPv4 Internet scan on TCP port 80, from a server located at CU Boulder. Following prior work that identifies unreliable hosts in Internet scans <ref type="bibr">[41]</ref>, we remove IPs that respond with a TCP window of 0 (as we cannot send them data), or do not accept a subsequent connection. This leaves us with 7 million scannable IPs. We then randomly and equally split these 7 million IP addresses into nine subsets, and assigned each to one of our nine vantage points in the TencentCloud Beijing datacenter. We then ran a measurement program that we wrote and installed on all nine vantage points. For each IP, the program connects to its port 80 sequentially up to 25 times, with a one-second interval in between. In each connection, we send the same 50 bytes of random data that can trigger the blocking. If we see 5 consecutive connections time out (fail to connect) after we have sent data, we label the IP as affected. Otherwise, if all 25 connections succeed, we label the IP as unaffected. We label IPs that we cannot connect to at all as unknown (e.g., the server is down, or a network failure unrelated to the GFW prevents us from connecting in the first place).</p><p>We also repeated this process but sent 50 bytes of \x00, which does not trigger blocking by the GFW. If a server is marked as affected in this test, it is likely due to the server blocking us, and not the GFW, and we remove these IPs from our results. This leaves just over 6 million IPs.</p><p>Finally, we remove "ambiguous" results that may be due to intermittent network failures or unreliable vantage points. Specifically, we remove IPs that either of our random or zero scans labelled unknown (we were never able to connect), or that had intermittent connection timeouts (e.g., several connections timed out, but not 5 consecutively). 
This leaves 5.5 million IPs that we can easily label as unaffected (all 25 connections succeeded) or affected (at some point it appeared blocked after we sent random data).</p></div>
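<div xmlns="http://www.tei-c.org/ns/1.0"><p>The labeling procedure described in this section can be written as a small function over the ordered connection outcomes for one IP. This is a sketch of the stated rules rather than the measurement program itself; the function name and the "ambiguous" label for intermittent failures are naming choices made here.</p>

```python
def label_ip(outcomes, max_tries=25, needed_timeouts=5):
    """Label one IP from ordered booleans (True means the connection succeeded)."""
    if not any(outcomes):
        return "unknown"              # never able to connect at all
    seen_success = False
    run = 0
    for ok in outcomes:
        if ok:
            seen_success, run = True, 0
        elif seen_success:
            run += 1                  # timeout after data has been sent at least once
            if run >= needed_timeouts:
                return "affected"
    if len(outcomes) == max_tries and all(outcomes):
        return "unaffected"
    return "ambiguous"                # intermittent timeouts; removed from the results
```

<p>The same function would be applied to both the random-payload scan and the all-zero control scan, and an IP labeled affected in the control scan would be discarded as blocked by the server rather than by the GFW.</p></div>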
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Not All Subnets/ASes are Affected Equally</head><p>Of the 5.5 million processed IPs, 98% are unaffected by the GFW's blocking, suggesting that China is fairly conservative in employing this new censorship. We group these 5.5 million IP addresses into their allocated IP prefixes and ASes, using pyasn with an AS database from April 2022 <ref type="bibr">[51]</ref>. For IP prefixes larger than /20, we break the allocation into a set of /20 prefixes to keep allocations roughly the same size. Our 5.5 million IPs comprise 538 unique ASes that have at least 5 results, and the vast majority of these are largely unaffected by the GFW's blocking.</p><p>Figure <ref type="figure">3</ref> shows the distributions of the fraction of affected ASes and /20 prefixes. We found that more than 90% of ASes are affected in an all-or-nothing way: either all IP addresses we tested in the AS are affected by the GFW's blocking, or no IP addresses we tested in the AS are affected. We also observe that only a few ASes are affected: over 95% of ASes see less than 10% of their IPs affected, and only 7 ASes see more than 30% of their IPs affected.</p><p>Figure <ref type="figure">4</ref>: Top affected ASNs. We observe that not all ASes are affected, and even within each AS, different prefixes are affected differently. For each AS, we looked at each /20 in its network and calculated the fraction of IPs blocked in each /20 subnet. The results were very close to all-or-nothing: either all IPs in a /20 were affected, or none were.</p><p>Figure <ref type="figure">4</ref> shows the top affected ASes. While this is skewed toward larger ASes (which have more IPs in our scan), it shows both ASes that are heavily affected (e.g., Alibaba US, Constant) and ones that are not (Akamai, Cloudflare). In addition, some ASes have a mix of affected and not affected prefixes (Amazon, Digital Ocean, Linode). 
All of the affected or partly-affected ASes we see are popular VPS providers that could be used to host proxy servers, while large unaffected ASes do not typically sell VPS hosting to individual customers (e.g. CDNs).</p></div>
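<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-/20 analysis above can be approximated with the Python standard library alone. The sketch below simply buckets every scanned IPv4 address into its enclosing /20 and computes the affected fraction per bucket; it does not reproduce the allocation lookup done with pyasn (prefixes smaller than /20 are not kept intact here), and the function name is hypothetical.</p>

```python
import ipaddress
from collections import defaultdict


def affected_fraction_by_slash20(results):
    """Map each /20 to the fraction of tested IPs in it that were labeled affected.

    results: iterable of (ip_string, is_affected) pairs from a scan.
    """
    totals = defaultdict(lambda: [0, 0])          # prefix -> [affected, tested]
    for ip, is_affected in results:
        prefix = ipaddress.ip_network(f"{ip}/20", strict=False)
        totals[prefix][0] += int(is_affected)
        totals[prefix][1] += 1
    return {str(p): a / n for p, (a, n) in totals.items()}
```

<p>An all-or-nothing pattern would show up as fractions clustered at 0.0 and 1.0, with few values in between.</p></div>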
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">Characterizing Probabilistic Blocking</head><p>As introduced in Section 3, we send up to 25 connections with the same payload before drawing any conclusions about blocking. This is necessary because the censor implements blocking probabilistically. In other words, sending a random payload to an affected server once would only sometimes trigger blocking; however, if one keeps making connections with the same payload to the affected server, blocking will eventually occur. This raises the question of what the probability is for a connection to get blocked, and why the censor implements blocking only probabilistically.</p><p>Estimating the blocking rate. From our 10% Internet scan (Section 6.1), there were 109,489 IP addresses that we labeled as affected. As shown in Figure <ref type="figure">5</ref>, the distribution of the number of successful random data connections we can make to each IP address before getting blocked fits a geometric distribution. This result suggests that the blocking of each connection is independent, with a probability of 26.3%.</p><p>Figure <ref type="figure">5</ref>: CDF of the number of successful connections from our client in China to each of 109,489 affected IP addresses before getting blocked. We made up to 25 connections to port 80 of each IP address. The distribution fits a geometric distribution, suggesting the blocking of each connection is independent, with a probability of p = 26.3%.</p><p>Why probabilistic blocking is used. We conjecture that the censor employs probabilistic blocking for two possible reasons. First, it allows the censor to examine only about one-fourth of connections, reducing computation resources. Second, it helps the censor reduce the collateral damage to non-circumvention connections. 
While this reduction also comes at the expense of lower true positives, the residual censorship may make up for it: once a connection is determined to be blocked, subsequent connections are also blocked for several minutes after, making it difficult for proxy users to successfully connect once detected. This may also further support prior claims that censors put more emphasis on reducing their false positive rate than in achieving a high true positive rate <ref type="bibr">[57]</ref>.</p></div>
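<div xmlns="http://www.tei-c.org/ns/1.0"><p>If each connection is blocked independently with probability p, the number of successful connections before the first block follows a geometric distribution. The sketch below shows the standard maximum-likelihood estimate of p and the chance of being blocked within k attempts. The function names are hypothetical, and the estimate ignores the truncation at 25 attempts per IP, so it is only an approximation of how such a fit could be made.</p>

```python
def estimate_block_probability(successes_before_block):
    """Geometric MLE: p = n / (n + total successes observed before each block)."""
    n = len(successes_before_block)
    return n / (n + sum(successes_before_block))


def prob_blocked_within(k, p=0.263):
    """Probability that at least one of k independent connections is blocked."""
    return 1.0 - (1.0 - p) ** k
```

<p>With p = 26.3%, the probability of making 25 connections to an affected IP without ever being blocked is 0.737 raised to the 25th power, which is under 0.1%, so 25 attempts are enough to separate affected from unaffected IPs with few misses.</p></div>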
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Evaluating the GFW's Detection Rules</head><p>In this section, we evaluate the false positive rate and comprehensiveness of the GFW's detection rules we inferred in Section 4. To determine the impact this blocking may have on regular traffic, we simulate the inferred detection rules on traffic from our university network without actually blocking any traffic. Different from the GFW, we simulate the detection rules against all TCP connections observed without limiting the detection to 26% of connections to specific IP ranges of popular data centers. We expect to see little to no circumvention traffic in this network, and any traffic that would be blocked under detection rules likely represents false positive blocking. We find that the inferred detection algorithm would block roughly 0.6% of all connections on our network. Due to the black-box nature of the GFW, our inferred rules may only be a subset of what the GFW uses; however, we show that all connections that Algorithm 1 would block were indeed blocked when we sent their prefixes along with random data through the GFW, suggesting our inferred rules have good coverage of what the GFW uses.</p><p>Figure <ref type="figure">6</ref>: Common exemptions. For each connection on the CU Boulder tap, we determine which rules in Algorithm 1 would exempt it from being blocked. We divide the exemption rule Ex5 in Section 4.3 into 3-, 4-, and 5-byte patterns and present them in three rows for fine-grained classification. We analyze 1.7 billion connections collected from July 2022 until September 2022. For brevity, this graph only shows intersections with a count greater than 1,000,000. We observe 37 different intersections of exemptions in the full set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1">Traffic Analysis Experiment</head><p>We have access to a 40 Gbps network tap at CU Boulder that allows us to process copies of all incoming and outgoing packets on our campus. Using this, we collected a dataset comprising only destination port numbers and the first 6 bytes of payload data for connections that do not already satisfy the other exemption rules in Algorithm 1. More precisely, we implemented a custom packet analysis tool using PF_RING <ref type="bibr">[50]</ref>.</p><p>For each connection, we inspected the first data packet sent by the client. We ensured that the packet has a correct TCP checksum, and that its sequence number is the first expected data packet after the TCP handshake in the connection (making sure we have not missed the first data packet). For connections that are not exempted by Algorithm 1 (i.e., those we expect to be blocked), we logged the destination port and the first six bytes of the connection to help identify its protocol. We performed this collection between July 2022 and September 2022. In total, we analyzed 1.7 billion connections and logged 442,928 unique 6-byte prefixes of would-be-blocked connections. For each of these 442,928 6-byte prefixes, we appended the same 194 bytes of random data to make a 200-byte payload. We then repeatedly sent each payload past the real GFW in September 2022 to test whether it was indeed blocked, or whether instead there were exemptions we had not previously identified. For each payload, we made up to 25 connections from a VPS in TencentCloud Beijing to a sink server in DigitalOcean SFO.</p><p>Figure 7: The first 6 bytes of blocked connections. For the 9.7 million (0.6%) connections from our tap that would be blocked under the GFW rules we inferred, we count the occurrences of their unique first 6 bytes. The most popular 6-byte prefix appears in over 479 thousand connections (5.0%), meaning a rule that explicitly allowed this 6-byte value could reduce the GFW's false-positive rate by this amount.</p></div>
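<div xmlns="http://www.tei-c.org/ns/1.0"><p>The prefix logging and payload construction described above can be sketched as follows: count the first 6 bytes of each would-be-blocked first data packet, then build one 200-byte test payload per unique prefix by appending a shared 194-byte random suffix. The helper names are hypothetical and this is not the PF_RING-based tool used in the study.</p>

```python
import os
from collections import Counter


def count_prefixes(first_packets):
    """Count the first 6 bytes of each would-be-blocked first data packet."""
    return Counter(p[:6] for p in first_packets if len(p) >= 6)


def build_test_payloads(prefixes, suffix=None):
    """Append one shared 194-byte random suffix to every 6-byte prefix."""
    suffix = os.urandom(194) if suffix is None else suffix
    return {prefix: prefix + suffix for prefix in prefixes}
```

<p>Reusing one suffix for every prefix keeps the random tail constant across tests, so any difference in blocking between two payloads can be attributed to the 6-byte prefix.</p></div>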
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2">Experiment Results and Analysis</head><p>Estimating the false positive rate. In total, we analyzed 1.7 billion connections on our network between July 2022 and September 2022. For each connection, we determine which rules in Algorithm 1 would exempt it from being blocked. As shown in Figure <ref type="figure">6</ref>, we observe that on average 0.6% of TCP connections from our tap would be blocked under the GFW's detection rules we inferred.</p><p>There are at least two strategies the censor employs to reduce the false positive rate. First, as introduced in Section 6, the GFW only applies this censorship to a fraction of IP subnets. This decision may be an attempt to mitigate the base-rate problem faced by the censor <ref type="bibr">[11]</ref>. Since relatively few connections in total are proxy connections, even a small false positive rate (such as 0.6%) would result in blocking mostly benign traffic, if applied broadly. By narrowing the scope of IPs it is applied to, China can reduce the collateral damage of its censorship. Second, as explored in Section 6.3, even for traffic towards this subset of IP subnets, the GFW is observed to block only about one-quarter of all connections, reducing the false positive rate to roughly one-fourth of what it would otherwise be.</p><p>It is possible that the 0.6% of connections we identified are in fact fully encrypted proxies. To investigate this possibility, we count the occurrences of each unique 6-byte prefix among the connections that would be blocked under the GFW's rules. If these connections were all truly fully encrypted proxies, we would expect to see a uniform distribution over the 256^6 possible 6-byte values. 
Otherwise, if there are 6-byte values that occur frequently, they could be headers of popular protocols, indicating false positives in the GFW's blocking.</p><p>Figure <ref type="figure">7</ref> shows the distribution of the first 6 bytes of all 9.7 million connections from our tap that would be blocked under the GFW rules we inferred. In addition, Table <ref type="table">3</ref> shows the top 6-byte values from would-be-blocked connections. While we are not able to identify many of these protocols, their frequency along with their low entropy indicates that they are not likely to be fully encrypted proxies.</p><p>Estimating the comprehensiveness of the inferred rules. Among the 442,928 payloads we crafted and sent past the real GFW, we found that only one prefix was exempted by the GFW, which alerted us to the TLS Application Data prefix exemption (\x17\x03[\x00-\x09]). We added this exemption to our inferred rules (Ex5). This result suggests our inferred rules have good coverage of what the GFW uses.</p></div>
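<div xmlns="http://www.tei-c.org/ns/1.0"><p>The uniformity argument above can be made concrete with a short calculation. If the 9.7 million would-be-blocked connections had uniformly random 6-byte prefixes, the expected number of connection pairs sharing a prefix would be tiny. The sketch below uses the rounded figures quoted in the text and the idealized assumption of a uniform distribution over all 6-byte values; the function name is hypothetical.</p>

```python
def expected_colliding_pairs(n_connections, prefix_len=6):
    """Expected number of connection pairs with equal prefixes under uniformity."""
    space = 256 ** prefix_len                    # number of possible prefixes
    return n_connections * (n_connections - 1) / 2 / space


pairs = expected_colliding_pairs(9_700_000)      # roughly 0.17 pairs expected
top_share = 479_000 / 9_700_000                  # share of the most popular prefix
```

<p>Under uniformity one would expect well under one repeated prefix in the whole dataset, whereas a single observed prefix accounts for about 5% of the would-be-blocked connections, which supports reading these connections as false positives.</p></div>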
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">Circumvention Strategies</head><p>Our understanding of this new censorship system allows us to derive multiple circumvention strategies. In Section 8.1 and Section 8.2, we introduce two widely adopted countermeasures that have been helping users in China bypass censorship since January 2022 and October 2022, respectively. We discuss other circumvention strategies in Appendix A. We responsibly and promptly shared our findings and suggestions with the developers of various popular anti-censorship tools that have millions of users, which we detail in Section 8.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.1">Customizable Payload Prefixes</head><p>The exemption rules Ex2 and Ex5 from Algorithm 1 only look at the first several bytes in a connection, allowing the GFW to efficiently exempt traffic that is not fully encrypted; however, this lends itself to a potential countermeasure. Specifically, we propose prepending a customizable prefix to the payload of the first packet in a (circumvention) connection.</p><p>Customizable IV header. Shadowsocks connections begin with an Initialization Vector (IV), which is 16 or 32 bytes long depending on the encryption cipher <ref type="bibr">[22]</ref>. As introduced in Section 4.2, turning the first six (or more) bytes of the IVs into printable ASCII will exempt connections by the rule Ex2. Similarly, turning the first three, four, or five bytes of the IVs into common protocol headers will exempt connections by the rule Ex5 (e.g., turning the first three bytes of an IV into 0x16 0x03 0x03). These countermeasures require minimal changes to the client and no changes to the server, and therefore have been adopted by many popular circumvention tools <ref type="bibr">[48,</ref><ref type="bibr">56,</ref><ref type="bibr">62,</ref><ref type="bibr">72]</ref>. Restricting the first few bytes of a 32-byte IV to be printable ASCII will not reduce the randomness to the point that it affects the security of encryption. For example, even fixing the first six bytes to printable ASCII still leaves the IVs with 26 random bytes, which is still more than a typical 16-byte IV.</p><p>Limitations. This is a stopgap solution and could potentially be blocked by the censor fairly easily. The censor may skip the first several bytes and apply the detection rules to the rest of the data in a connection. Protocol mimicry is also difficult in practice <ref type="bibr">[39]</ref>. The censor can enforce stricter detection rules, or actively probe a server to check if it is genuinely running TLS or HTTP. 
Nevertheless, the fact that this strategy still works as of February 2023, more than one year since its adoption by many popular circumvention tools in January 2022, underscores that even simple solutions can be effective against finite-resourced censors <ref type="bibr">[8,</ref><ref type="bibr">30,</ref><ref type="bibr">57]</ref>.</p></div>
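<div xmlns="http://www.tei-c.org/ns/1.0"><p>The customizable IV header idea can be illustrated with a helper that fixes the leading bytes of an IV and fills the remainder with random bytes. This is a minimal sketch under stated assumptions: the function name and constant are hypothetical, it is not the code of any Shadowsocks implementation, and it does not address how a particular cipher consumes the IV.</p>

```python
import os

# Example three-byte prefix mentioned in the text (resembles a TLS record header).
TLS_LIKE_PREFIX = b"\x16\x03\x03"


def make_prefixed_iv(prefix: bytes, iv_len: int = 32) -> bytes:
    """Return an IV of iv_len bytes that starts with prefix and is otherwise random."""
    if len(prefix) > iv_len:
        raise ValueError("prefix is longer than the IV")
    return prefix + os.urandom(iv_len - len(prefix))
```

<p>For instance, a six-byte printable prefix on a 32-byte IV leaves 26 random bytes, which is the figure used in the security argument above.</p></div>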
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.2">Altering Popcount</head><p>As introduced in Section 4.1, the GFW exempts a connection if its first data packet has an average popcount-per-byte &#8804; 3.4 or &#8805; 4.6 (Ex1). Based on this observation, one can increase (decrease) the popcount by inserting additional ones (zeroes) into the packet to bypass censorship. We introduce and analyze a flexible scheme that alters the popcount-per-byte to any given value or range. We implemented this scheme on Shadowsocks-rust [54] and Shadowsocks-android <ref type="bibr">[6]</ref>, helping users in China bypass censorship since October 2022 <ref type="bibr">[8]</ref>.</p><p>In January 2023, a large-scale circumvention service in China (that asked not to be named) also implemented a version of this scheme and found similar success. At a high level, we take original fully encrypted packets as input; by operating only on the ciphertexts, we do not risk violating confidentiality. When sending a packet, we first compute its average popcount-per-byte; if the value is greater than 4, then we determine how many one-bits we would have to add to the packet in order to obtain a popcount over 4.6. Conversely, if the popcount is less than 4, then we determine how many zero-bits we would have to add to decrease the popcount to less than 3.4. In either case, we append the necessary number of one- or zero-bits to the original ciphertext and then append 4 bytes denoting the number of bits added, ultimately giving us a bit-string B that has a popcount-per-byte that would not subject it to censorship.</p><p>Of course, simply appending ones or zeroes would be easy to fingerprint. To address this, we do bit-level random shuffling. In particular, we leverage the existing shared secrets, such as the password, as a seed to deterministically construct a permutation vector. 
In each connection, we update this permutation vector and use it to shuffle all the bits in the bit-string B before sending it. To decode, the receiver first updates the permutation vector and then uses it to un-shuffle the bit-string; then it reads the last 4 bytes to determine the number of bits added, removes that number of bits, and is thus able to recover the original (fully encrypted) packet.</p><p>In practice, we take two additional steps to further obfuscate the traffic. First, since it is an obvious fingerprint if all connections share the same popcount-per-byte value, we set the goal value to a parameterizable range. Second, since the 4-byte length tag in plaintext may be a fingerprint, we encrypt it (the same way these circumvention tools encrypt proxy traffic).</p><p>This scheme has several advantages. First, the scheme supports parameterizable popcount-per-byte in case the GFW updates its popcount rule to block an even larger range. Second, because of its careful design, there are no obvious fingerprints that would signal to the censor that this is a popcount-adjusted packet. Finally, it incurs low overhead; it adds only as many ones (or zeroes) as are strictly necessary (padded to the nearest byte). In the worst case, increasing the popcount from 4 to 4.6, this incurs only about 17.6% overhead. As a result, it could feasibly be applied not just to the first packet, but to every packet in the connection, thereby insulating it against future updates to the censor that might look past the first packet.</p></div>
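<div xmlns="http://www.tei-c.org/ns/1.0"><p>The encode and decode steps of the popcount-altering scheme can be sketched as below. This is a simplified illustration, not the released implementation: the function names are hypothetical, the permutation is seeded from SHA-256 of the secret and a per-connection counter fed into Python's non-cryptographic PRNG (an assumption made here), padding is added in whole bytes, a packet at exactly 4 set bits per byte is padded with ones, the ciphertext is assumed non-empty, and the 4-byte length tag is left unencrypted and targets a fixed threshold rather than a parameterizable range.</p>

```python
import hashlib
import random

LOW, HIGH = 3.4, 4.6   # exemption thresholds of rule Ex1 as stated in the text


def popcount_per_byte(data: bytes) -> float:
    """Average number of set bits per byte."""
    return sum(bin(b).count("1") for b in data) / len(data)


def _permutation(secret: bytes, counter: int, nbits: int) -> list:
    # Assumption: the seed is derived from the shared secret and a connection counter.
    seed = hashlib.sha256(secret + counter.to_bytes(8, "big")).digest()
    perm = list(range(nbits))
    random.Random(seed).shuffle(perm)
    return perm


def encode(ciphertext: bytes, secret: bytes, counter: int) -> bytes:
    add_ones = popcount_per_byte(ciphertext) >= 4
    fill = b"\xff" if add_ones else b"\x00"
    pad_bytes = 0
    while True:
        # Body is ciphertext, then padding, then a 4-byte count of padding bits.
        body = ciphertext + fill * pad_bytes + (8 * pad_bytes).to_bytes(4, "big")
        avg = popcount_per_byte(body)
        if (add_ones and avg >= HIGH) or (not add_ones and LOW >= avg):
            break
        pad_bytes += 1
    bits = "".join(format(b, "08b") for b in body)
    perm = _permutation(secret, counter, len(bits))
    shuffled = "".join(bits[i] for i in perm)      # shuffling keeps the popcount
    return int(shuffled, 2).to_bytes(len(body), "big")


def decode(packet: bytes, secret: bytes, counter: int) -> bytes:
    shuffled = "".join(format(b, "08b") for b in packet)
    perm = _permutation(secret, counter, len(shuffled))
    bits = [""] * len(shuffled)
    for out_pos, in_pos in enumerate(perm):
        bits[in_pos] = shuffled[out_pos]           # undo the shuffle
    body = int("".join(bits), 2).to_bytes(len(packet), "big")
    added_bits = int.from_bytes(body[-4:], "big")
    return body[: len(body) - 4 - added_bits // 8]
```

<p>The worst-case overhead quoted above follows from the same arithmetic: for n bytes at exactly 4 set bits per byte, adding k all-ones bytes gives (4n + 8k)/(n + k) = 4.6, so k/n = 0.6/3.4, or about 17.6%, ignoring the 4-byte tag.</p></div>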
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.3">Responsible Disclosure</head><p>On November 16, 2021, ten days after the GFW deployed this new blocking <ref type="bibr">[10]</ref>, we revealed its details to the public <ref type="bibr">[37,</ref><ref type="bibr">38]</ref>. As our understanding of the blocking developed, we derived and evaluated different circumvention strategies. We responsibly and promptly shared our findings and suggestions with the developers of various popular anti-censorship tools that have millions of users, including Shadowsocks <ref type="bibr">[22]</ref>, V2Ray [59], Outline <ref type="bibr">[42]</ref>, Lantern <ref type="bibr">[20]</ref>, Psiphon <ref type="bibr">[21]</ref>, and Conjure <ref type="bibr">[33]</ref>. Below we describe our disclosure and the responses from the anti-censorship community in detail.</p><p>On January 13, 2022, we shared our first circumvention strategy with a group of developers. This solution, detailed in Section 8.1, requires minimal code changes to the clients and no changes to the servers. By January 14, 2022, Shadowsocks-rust developer zonyitoo, V2Ray developer Xiaokang Wang, and Sagernet developer nekohasekai had already added this circumvention solution as an option to their clients <ref type="bibr">[48,</ref><ref type="bibr">62,</ref><ref type="bibr">72]</ref>. On October 4, 2022, database64128 implemented a user-customizable version of this strategy on Shadowsocks-go <ref type="bibr">[18]</ref>. On October 25, 2022, Outline developers adopted a highly customizable solution for their client <ref type="bibr">[56]</ref>. On October 14, 2022, we released a modified Shadowsocks <ref type="bibr">[8]</ref> that employed the popcount-altering strategy we detailed in Section 8.2.</p><p>As of February 14, 2023, all circumvention strategies adopted by these tools are reportedly still effective in China. 
In January 2023, Outline developers reported that the number of Outline servers (that opted in to anonymous metrics) had doubled since they adopted the mitigation above. In January 2023, a large circumvention service provider in China (that asked not to be named at this time) also implemented our proposed scheme and has likewise found success.</p><p>While we did not study countries other than China, our proposed circumvention strategies are reported to also work in Iran, another country that reportedly blocks and throttles fully encrypted proxies <ref type="bibr">[65]</ref>. On February 13, 2023, Lantern developers reported that the adopted protocol "accounted for the majority of our Iran traffic" since January 2023. On February 13, 2023, a different circumvention service provider reported that, after enabling Outline's mitigation feature in November 2022, their services turned from being completely blocked to serving 850k daily users from Iran.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9">Ethics</head><p>Censorship measurement research carries an element of risk and responsibility which we take seriously. Our research involves handling sensitive network traffic, scanning large numbers of hosts, and performing network measurements in a sensitive country. Due to the sensitive nature of this work, we approached our institution's IRB with our detailed research plan for review. While the IRB determined that the work does not involve human subjects (and thus does not require IRB review), we have designed and implemented extensive precautionary efforts to minimize potential risks and harms. In this section, we discuss these risks and detail the precautionary measures we adopted to manage and mitigate them.</p><p>Traffic analysis. We worked closely with our university's network operators, who have extensive experience in managing such projects, to deploy our network measurement tool in a way that complies with the network use policy and respects user privacy. We designed our experiments to avoid collecting potentially sensitive information, such as IP addresses, that could identify individuals. We collected minimal information and focused on aggregate statistics to avoid potentially identifying individuals. Specifically, we only analyzed the very first TCP data packet in each connection and ignored any subsequent packets. In addition, we only logged the first six bytes of data and kept an aggregate count of their occurrences; no raw traffic was ever inspected by a human or logged. We practiced the principle of least privilege, giving only a subset of our team access to this data.</p><p>Internet scanning. To minimize the risk of overwhelming servers when performing Internet-wide scans, we followed the best practices outlined in prior work in Internet scanning and wide-scale censorship measurement <ref type="bibr">[26,</ref><ref type="bibr">60]</ref>. 
We set up a dedicated webpage, along with a reverse DNS record pointing to it, on port 80 of our scanning host at CU Boulder. The webpage explains what data our scanning collects, and offers ways to opt out of future scans. During our entire experiment period, we received and honored seven removal requests, which is typical based on past experiences scanning the Internet [25, &#167;5.3] [26, &#167;5.1]. Our follow-up scans to these servers were low-bandwidth: we sent less than 100 bytes for each request, and we made only one connection to each server at a time to avoid overwhelming their network or connection pool resources.</p><p>The use of vantage points. Active censorship measurement from within censored countries requires additional considerations and prudent evaluation. We first explored the possibility of performing the measurement remotely but confirmed that this censorship could not be triggered from outside of China. While it may be low risk to have sensitive queries observed by the censor, we followed standards similar to those discussed in prior work to limit the number of these sensitive queries we sent <ref type="bibr">[5]</ref>. In particular, we only sent queries on port 80 to servers that were listening on that port, and made no concurrent connections to the same server to avoid overwhelming server operations.</p><p>Our research team consulted experts with a deep understanding of the nature and legal concerns of Chinese censorship, who helped us make informed decisions on which VPS providers to use and how to use them. We selected two large-scale VPS providers run by well-known commercial companies in order to avoid any potential legal risks to individuals. We registered our VPSes with the accurate identity and contact information of one of our researchers who is neither a citizen of nor resides in China. We received no complaints from the providers throughout our research. 
As done in prior work <ref type="bibr">[5]</ref>, we did not inform these large VPS providers of the experiments ahead of time, to avoid potential experiment bias (e.g., interference in results) or placing potential legal obligations or burdens on the VPS providers.</p><p>We also managed the risk of getting any server blocked by the GFW, whether temporarily or in the long term. For all hosts we controlled in this study, we assigned dedicated IP addresses to them to avoid blocking shared IP addresses. In addition, we rented our non-censoring network hosts from a VPS provider that permits censorship circumvention usage and even offers automatic installation of circumvention tools. Similar to the findings in prior work on residual censorship in China <ref type="bibr">[13,</ref><ref type="bibr">14,</ref><ref type="bibr">17,</ref><ref type="bibr">63]</ref>, we tested using our own servers and confirmed that the GFW never blocked any of our machines' IP addresses for more than 180 seconds, and that the blocking only affected traffic from our clients to the servers, without interfering with traffic from others. Knowing that our servers had been used for five months without ever experiencing long-term blocking, we proceeded to perform our large-scale scans.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="10">Conclusion</head><p>In this work, we exposed and studied China's latest censorship system that dynamically blocks fully encrypted traffic in real time. This powerful new form of censorship has affected many mainstream circumvention tools partially or in full, including Shadowsocks, Outline, VMess, Obfs4, Lantern, Psiphon, and Conjure. We conducted extensive measurements to infer various properties of the GFW's traffic analysis algorithm and evaluated its comprehensiveness and false positives against real-world traffic. We used our knowledge of this new censorship system to derive effective circumvention strategies. We responsibly disclosed our findings and suggestions to the developers of different anti-censorship tools, helping millions of users successfully evade this new form of blocking.</p><p>bytes are printable ASCII. One straightforward way to satisfy this property would be to simply base64-encode all of the encrypted traffic. This, too, is only a stopgap solution; base64-encoded data is easy to detect, and the censor could simply base64-decode and then apply its rules. Although it is effective against the GFW today, we do not consider it a long-term solution.</p><p>More than 20 contiguous bytes of printable ASCII. The GFW exempts connections if the first packet has more than 20 contiguous bytes of printable ASCII. One way to satisfy this is to base64-encode only a small portion of the fully encrypted packet, or even just insert at least 21 printable ASCII characters into the ciphertext. While we believe this would be more difficult to detect than base64-encoding the entire packet, it also strikes us as a short-term stopgap.</p><p>All of the above countermeasures can be implemented on the client-side only, without requiring support from the proxy server. 
This is possible by applying an idea from prior work <ref type="bibr">[12]</ref>: sending a packet such as the ones described above that gets processed by the censor but not by the proxy. For instance, prior to sending the actual first packet of the connection, the client could send a packet that satisfies one of the above rules but that has a broken checksum (which the censor will not check, but the proxy will) or a limited TTL (large enough to reach the censor but not the destination). While these techniques were first verified against Iran's Protocol Filter, we have verified that these same approaches work against the GFW's blocking of fully encrypted traffic. Although this provides an encouragingly easy path for deployment, it alone does not elevate these stopgap solutions to longer-term ones.</p></div></body>
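<back><div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of the contiguous-printable-ASCII countermeasure discussed above, the following sketch base64-encodes only the head of a ciphertext and checks the result against the "more than 20 contiguous bytes" rule. It is a hypothetical example, not code from any of the tools named in this paper: the function names and the 18-byte head length are our own choices, the printable range 0x20 to 0x7E is an assumption about how the rule classifies bytes, and the ciphertext is assumed to be at least 18 bytes long. Unlike the client-only variants described above, this form requires the receiver to undo the encoding.</p>
<p><![CDATA[
```python
import base64

HEAD = 18          # hypothetical: 18 raw bytes become 24 base64 characters
ENCODED_HEAD = 24  # length of base64.b64encode() output for 18 bytes

def is_printable(b: int) -> bool:
    """Assumed definition of printable ASCII: 0x20 through 0x7E."""
    return 0x20 <= b <= 0x7E

def longest_printable_run(data: bytes) -> int:
    """Length of the longest run of consecutive printable bytes."""
    best = cur = 0
    for b in data:
        cur = cur + 1 if is_printable(b) else 0
        best = max(best, cur)
    return best

def exempt_by_contiguous_ascii(first_packet: bytes, threshold: int = 20) -> bool:
    """True if the packet has more than `threshold` contiguous printable bytes."""
    return longest_printable_run(first_packet) > threshold

def encode_head(ciphertext: bytes) -> bytes:
    """Base64-encode only the first HEAD bytes; the rest is sent unchanged."""
    return base64.b64encode(ciphertext[:HEAD]) + ciphertext[HEAD:]

def decode_head(packet: bytes) -> bytes:
    """Inverse of encode_head, run by the receiver."""
    return base64.b64decode(packet[:ENCODED_HEAD]) + packet[ENCODED_HEAD:]
```
]]></p>
<p>Encoding the head adds 6 bytes per packet in this configuration. As the text notes, a censor could detect and undo such an encoding, so this remains a stopgap.</p></div></back>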
		</text>
</TEI>
