# RSS For Twitter Feeds #RSS #Nitter #CAPTCHA There is an almost infinite sea of posts and videos on tools, packages, updates, vulnerabilities, frameworks, ideas, etc. out there, and I strongly believe that keeping an eye over what's new is beneficial. I am always learning new things from stuff that floats across my feeds. Even if I never use the thing, I often pick up ideas or ways of doing things which I can apply in a different context or a completely unrelated way. The way I do this is primarily via RSS. Over the years, I have accrued many feeds, mostly blogs, and a few tech news sites which overlap with my areas of interest. There are, however, a number of people who choose to share their thoughts via 𝕏, meaning there is no RSS feed I can subscribe to. ## Nitter Nitter is "A free and open source alternative Twitter front-end focused on privacy and performance.". One of its killer features is that it provides an RSS feed for each account, problem solved! For a while... 𝕏 started limiting access to the public APIs which Nitter relied on, this pretty much killed the project. Only a couple of [public instances](https://github.com/zedeus/nitter/wiki/Instances) have been kept alive. I believe they were able to do this because they implement strict rate limiting, filtering and CAPTCHAs. This obviously blocked RSS readers. The Nitter instance I use is [nitter.poast.org](https://nitter.poast.org), and when you first visit it shows a 'Verifying Your Browser' message for a couple of seconds and then lets you in, no 'click the checkbox' or 'select all the pictures of road signs'. I got curious... What exactly was being verified and how? ## Reverse Engineering The CAPTCHA I opened dev-tools and hit 'Copy as cURL', then just started taking chunks out until it stopped working. All that was required was: ```bash curl 'https://nitter.poast.org/xkcd/rss' \ -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:138.0) Gecko/20100101 Firefox/138.0' \ -H 'Cookie: res={HEXADECIMAL_VALUE}' ``` The magic value stored in the `res` cookie was the key. One internet search for `"Verifying your browser" "res="` revealed [simon987/ngx_http_js_challenge_module](https://github.com/simon987/ngx_http_js_challenge_module). This is a "Simple javascript proof-of-work based access for Nginx". This was perfect! [Proof of work](https://en.wikipedia.org/wiki/Proof_of_work) challenges are designed to deter SPAM by requiring each user to do a little work. For an individual, the effort is trivial, but for a bot farm the costs start adding up. All I needed was to extract the proof algorithm, automatically generate values for the `res` cookie, and then I could set the cookie when making requests for the RSS feed. What I also loved about this wasn't a hack or a working-around, I was going along with the sites' requirements. Looking at the source of the module revealed a large blob of obfuscated JS. I dumped the source into Claude and asked it to explain how it worked. Turns out the vast majority of the code is a JS implementation of SHA1 presumably for older browsers, the actual algorithm is really simple. Here it is ported to PHP: ```php function solveChallenge(string $challenge): string { $i = 0; $byteOffset = hexdec($challenge[0]); // Get position from first character of challenge while (true) { $solution = $challenge . $i; $solutionHash = sha1($solution, true); if (ord($solutionHash[$byteOffset]) === 0xB0 && ord($solutionHash[$byteOffset + 1]) === 0x0B) { return $solution; } $i++; } } ``` The execution time of this seems to vary wildly. It can be as high as 8 seconds or as little as 500Ξs, the vast majority of the runs I did completed in ~10ms. All that was left to do was extract the challenge value from the verification page, I choose to use a dead simple and admittedly fragile regex: `{regex}[A-Z0-9]{40}`. It works.