A Captive Audience.
Published 16 March 2025 at Yours, Kewbish. 2,341 words. Subscribe via RSS.
Introduction
Finding free WiFi has been very ingrained into how I use the Internet. Until last summer, I didn’t have a data package on my phone plan. Whenever I’d be out and about off-campus, I had to rely on free WiFi. I remember scurrying around for scraps of signal on the boundaries of whatever coffee shop had guest WiFi in order to ask my friends where they were or check maps. When I open my laptop in any new library, café, or airport, the first thing I do is connect to the free WiFi. On Etsy, there are thousands of results when I search for “free wifi sign”, so business owners and digital artists alike must be aware that free WiFi is highly in demand.
I’ve been lucky to grow up in a city and in contexts where WiFi was more or less available when needed: I put off getting data for as long as I did because I was always either at university, or at home, or on the bus between the two, all of which had free WiFi. WiFi has never felt like a commodity or resource to be carefully rationed. I’ve always just kind of expected that it’d be there.
This last year, I’ve had the opportunity to do a lot of travelling, and that also means connecting to a lot of free WiFi while in transit and in new environments. Fun fact: if you use NetworkManager on Linux, you can run sudo ls /etc/NetworkManager/system-connections
to get a list of all of your previous connections, which by default are saved to enable autoconnection whenever the network is in range again. I’ve accumulated a bunch of train networks: I have the UP Express, VIA Rail, and Eurostar connections saved. I also have many a coworking space and café, as well as a sprinkling of hotels.
All this travelling and all this free WiFi also meant a lot of WiFi captive portals. Captives are the interstitial page that usually displays the company’s information and the standard usage terms and conditions before you’re able to meaningfully access the network. You might have to provide your email or phone number for data harvesting purposes, and in the world of airline WiFi, they’re also used for requiring payment upfront, but I won’t be focusing on paid WiFi here. After you submit the form, you’re released from your Internet captivity and can begin normally using the network. They’re a staple of free WiFi access, and I’d suspect most folks’ brains don’t even process the form before just trying to click through as fast as possible.
Despite these portals popping up on an everyday basis, have you ever wondered how they work — how they’re triggered, usually without your interaction? In April last year, I was in Seattle for OSSNA2024, and after getting back to my hotel, I was trying to log in to the guest WiFi network to check my email. I signed my phone in first, which worked without a hitch: a terms and conditions page popped up and asked for my room number before I could get access. Same goes for my parents’ devices. But on my laptop, I was only able to connect to the network — my status bar applet was showing full connection and I had an IP address assigned — but I couldn’t figure out how to get the captive portal to show up so I could actually get access to anything. After some educated futzing around with the URLs of past free WiFi portals that were still in my browser history, I managed to open the hotel’s WiFi portal from an old, unrelated gstatic.com/generate_204
link. Since then, I’d been wondering how this worked — why did captive portals seem to be all using this gstatic
thing? Why did my phone always get a portal while my laptop was hit-or-miss?
This is a post about captive portals, HTTP hijacking, and the hardcoded URLs that tech companies use to trigger popups. I’ll also take a brief detour into the world of Parisian libraries and how I automated one of their captive page logins. I hope this’ll serve as a fun behind-the-scenes look at what happens when you hit up a new coffee shop and connect to their network for the first time.
O Popup, Wherefore Art Thou?
When you connect to a WiFi network that uses a captive portal, the IP address assignment takes place as usual with DHCP. Your machine and the DHCP server (usually your router) perform a two round-trip exchange to offer, request, and acknowledge your assigned IP address.
From this point onwards, there’s a couple ways that captive portals are implemented.
- For one, DHCP servers can return DNS server addresses in their responses. A free WiFi network can set their firewall to only allow captive users to use the network’s DNS server for all resolution at first, and then only return the IP address of the captive portal page for all lookups. The firewall can be configured to allow noncaptive users to use other DNS servers once they’ve authenticated.
- Otherwise, networks might use HTTP redirects: all web traffic is pointed first at an intermediary server that returns a redirect to the captive portal page. When your client device first connects to the network, it sends a request to a standard captive portal detection URL) and checks the status code to see if the network allows unfettered access or if it redirects to a portal page. This only works with HTTP sites, though, because with HTTPS, the router can’t do anything with the intercepted traffic: it doesn’t have the decryption keys, so it can’t modify it; if it returns a different certificate with the response, the browser will show a certificate error. If a site uses HSTS, any attempts to downgrade the traffic to HTTP to redirect to the captive portal will fail. HTTPS sites are therefore difficult for captive portals to intercept.
- Besides all this hijacking, there’s an RFC to introduce a new standardized DHCP option for routers to inform clients of the captive portal enforcement. The proposed option would contain the URI of the captive portal page. For networks not using a captive portal, the RFC specifies a sentinel value to use to avoid the client having to perform captive portal detection. There’s another RFC for specifying a similar HTTP-based API.
The exact URLs used for captive portal detection differ per client. The gstatic.com/generate_204
that I used in Seattle to trigger the captive portal redirect is controlled by Google, and Chrome has a similar link. iOS tries to load captive.apple.com/ when connecting. Both of these are HTTP links, so can be redirected or hijacked as necessary by the portal. If a captive portal is in place, the 204
status or success page won’t load (and may replace itself with the captive portal’s page). Another site that’s often recommended is neverssl.com
, which exists for the sole purpose of having a reliable HTTP url to load (that won’t be redirected by the browser to an upgraded HTTPS connection). I use this to force the captive portal page to load whenever Chrome’s detection doesn’t quite work.
I think my captive portal detection was hit-or-miss in the past because I would sometimes have HTTP sites open in Chrome previously, which would auto-load after starting my laptop or reconnecting to WiFi, and sometimes I’d have tabs full of HTTPS only. The latter is more frequently the case because most sites nowadays default to HTTPS. Chrome will remember which sites do support HTTPS and automatically load the HTTPS versions, even if you type in a HTTP URL. However, Chrome has a whole page on portal detection, so I’m not sure why their solution has been so spotty in the past.
Linux’s NetworkManager
also has some configuration options to automatically open captive portal pages on desktop managers that don’t automatically do so. You’ll have to write a dispatcher script to start a browser instance if NetworkManager
detects that the network’s connectivity state is a portal. NetworkManager
’s portal detection uses the ping.archlinux.org
URL, but this can be configured via the [connectivity]
option. I haven’t set this up, since just opening the browser and visiting a HTTP site has worked well enough for me.
The upshot for captive portals on the Internet is that you’ll need some sort of web browser on any device, and I’d bet this is part of why web browsers are so prevalent on devices like TVs where the interaction feels awkward. There’s a lot of complaining on various electronics community forums about one’s TV not connecting to captive networks: these browsers likely don’t implement proper portal detection, so folks have had to come to the same conclusion of putting in the portal’s URL directly, since sometimes even redirects don’t work.
Sainte-Geneviève, Meet NetworkManager
The Bibliothèque Sainte-Geneviève is one of the main public libraries in Paris. It’s right by the Panthéon, which is a nice view if you get stubbornly stuck in the hours-long queue for a spot that sometimes snakes across the end of the block1. I really like working there when I’m not in the lab — in general, high ceilings and big windows are my jam, and the BSG’s vaguely Art Nouveau roof design and vaulted ceiling with tonnes of windows checks all my boxes.
What’s more, the BSG provides both a lovely working environment and free WiFi for registered readers. WiFi is blocked behind a captive portal asking for your library card number and account password: by default your birthday in DDMMYYYY format. I was writing part of this post at the BSG, so out of curiosity I poked around at the portal page.
This page just fires an AJAX login request based on the card number and password you provide. The zoneid
variable is set by a separate zone.js
script, which seems to statically set it to 0.
Some networks, particularly those that require a login and password, do a better job at remembering users after captive portals. However, because this captive portal came up every time I visited, I wondered if I could automate login with a simple shell script so I wouldn’t have to fill the form in myself.
I learned that Linux’s NetworkManager
supports dispatcher scripts, which are run whenever a connection changes. It passes several environment variables, including the CONNECTION_ID
SSID, and two arguments: the interface name and event type (like “up”). The scripts must have certain permissions set as well. You can reference the dispatcher documentation for more details.
For my use case, a simple script wrapping a curl
command was enough. This script will run the curl
on every connection change to an “up” state on all connections with the SSID “BSG Public”2.
# in /etc/NetworkManager/dispatcher.d/bsg.sh
# run:
# $ sudo chown root:root /etc/NetworkManager/dispatcher.d/bsg.sh
# $ sudo chmod 755 /etc/NetworkManager/dispatcher.d/bsg.sh
if [ "$2" = "up" ] && [ "$CONNECTION_ID" = "BSG Public" ]; then
curl -X POST "http://portail.bsg.univ-paris3.fr:8000/api/captiveportal/access/logon/0/" \
-H "Content-Type: application/json" \
-d '{"user": "[USERNAME]", "password": "[DDMMYYYY]"}'
fi
exit $?
And with that, I no longer had to fumble for my library card to log in every time I connected. I haven’t tried poking into other captive WiFi network pages to see what they require behind the scenes, but I figure it’d be easy enough to scrape pages to find the “accept conditions” checkbox and network request and automate logins with a similar curl command.
Conclusion
TL;DR: whenever you connect to a new network, if you get certificate errors on all your HTTPS sites or can’t access anything and don’t get a captive portal page, open neverssl.com
, which should get you to the captive portal page.
I wrote this post primarily to share this neverssl.com
tip, which would have saved me a lot of time retrying free WiFi connections with no luck and resigning myself to doing work on my phone. Whenever I encounter networking problems, I tend to chalk it up to it “just being a Linux thing”, but in this case, captive portal implementations are unstandardized and slightly janky on all providers. In this case, Linux and NetworkManager
actually let you hook into network connections more easily, and as I’ve shown, we can automate portal logins relatively easily even if the underlying portal implementation doesn’t support it.
In my earlier section on captive portal implementations, I mentioned there were a couple RFCs on better standardizing captive portal interactions. While the RFCs themselves date to around five years ago, it seems like there’s been some progress on supporting them in both iOS and Android, but there’s still work to be done for adding support to other major DHCP servers and router hardware. Overall, though, providers seem to have converged on a set of detection and portal display techniques that work for most cases, so I don’t know if there’ll be much movement in this area going forward.
Diving into captive portals was fun — there’s so many other similar interactions on the Web that are ubiquitous but never something that you really think about. Getting to dig to the spec and RFC level for a small, self-contained area as well as coming up with a little artifact is a nice way to get an overview of an API or area. Writing shorter technical posts is also quite refreshing after my recent run of heavier explainers: more rabbitholes (and café free WiFi connections) to come in the future!
-
As I’m finishing this blog post, the queue is to the end of the block, so I’ve been resigned to sitting on one of the little rock benches across the street just barely in range of the WiFi. ↩︎
-
If you want to limit the script to only a specific connection (and not just all networks called “BSG Public”), you can also look into the
.nmconnection
files in/etc/NetworkManager/system-connections
to find its UUID and update the script to check the$CONNECTION_UUID
variable instead. This UUID will differ per device even on the same connection, so for the sake of making this script more universal I opted to use theCONNECTION_ID
. ↩︎