A playbook for ethical engineering on the web.
The core principles of progressive enhancement:
- Basic content should be accessible to all web browsers
- Basic functionality should be accessible to all web browsers
- Sparse, semantic markup contains all content
- Enhanced layout is provided by externally linked CSS
- End-user web browser preferences are respected
Everything changed in the early 2010s with the ascendancy of Google's Chrome browser. Google marketed its browser aggressively, and users quickly abandoned Internet Explorer; developers followed. Finally, between Firefox and Chrome, we had two popular standards-compliant browsers that could carry the web forward. Internet Explorer support quickly dropped out of the web development metagame, and supporting the browser went from mandatory to being seen as a painful chore, and a barometer of whether any particular employer had long-term tech vision.
Progressive enhancement died with Internet Explorer.
With the fall of Internet Explorer and the mass adoption of standards-compliant web browsers, we could for the first time rely on consistent behavior. The least common denominator had been raised, significantly. Visions of the web that had been held back for years by Microsoft's awful browser could finally be attained, and the web developer community became drunk with power. Single-page apps rose in popularity as quickly as the client-side frameworks that enabled them; jQuery and MooTools became increasingly irrelevant as DOM manipulation was now reliable via native APIs; browser vendors raced to implement new web standards and APIs as fast as they could be created. The last decade has certainly been an exciting time to be a web developer!
Web browsers are bad operating systems.
Imagine an operating system where all software development must be done in QBasic; it's an operating system that downloads, installs and executes any program the moment it is encountered, without your permission, and it leaks information about your activities to anyone who wants to spy on you. No, it's not Windows 10; you've basically imagined a modern web browser.
Browsers were not designed to be operating systems.
Mozilla and Google found out the expensive way that their browsers are not great as operating systems. It turns out that, despite peoples' hopes and dreams to the contrary, browsers aren't particularly good app execution environments. Again, be honest with yourself: when was the last time you saw a web app with performance or capability similar to a native app compiled for the platform it's running on? We've been promised this ideal since Progressive Web Apps became a thing, but in practice it has rarely panned out. The closest I've seen are apps like Slack or Postman running on the Electron framework, but they tend to be slow and buggy (though, to be fair, they do work).
You probably don't need to build an "app."
Consider this: in the 16 years that this blog has existed in one form or another, we've seen generational advances in computing hardware that have put orders of magnitude more performance into the hands of consumers. Today we have mid-range cell phones that blow my 2004 development PC away in terms of CPU, graphics, memory and storage speed. And yet, the perceived performance I've experienced using the web on better hardware throughout the years has not increased, and in many cases is worse. I doubt I'm alone in experiencing this. So what happened?
New features carelessly expose private data.
Example 1: WebRTC IP address leakage
WebRTC is the peer-to-peer realtime communications standard that enables voice and video calls in web browsers; popular examples include Slack and Google Hangouts. This was released in browsers as far back as 2012, well before the standard was considered stable (which wasn't until 2018).
Because the flaw stemmed from a fundamental problem in WebRTC's peer negotiation API, it could not be fixed without breaking WebRTC itself and all the apps that used it. It took many years for a proper workaround to be standardized, and browsers only deployed the fix very recently with the release of Chrome 76 in August 2019, and Firefox 74 last month.
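For a sense of what was exposed: ICE candidate lines inside WebRTC's session descriptions carry raw IP addresses, including private LAN addresses, and a page could read them without ever placing a call. Here's a hedged sketch of the parsing side only (the `RTCPeerConnection` setup itself is browser-only, and the function name is my own):

```javascript
// Sketch: ICE candidate lines from WebRTC's SDP carry raw IP addresses.
// "typ host" candidates expose local/LAN addresses -- the heart of the leak.
// (RTCPeerConnection itself is browser-only; this just parses its output.)
function ipsFromCandidates(candidateLines) {
  const re = /candidate:\S+ \d+ (?:udp|tcp) \d+ (\S+) \d+ typ (\w+)/i;
  const found = [];
  for (const line of candidateLines) {
    const m = line.match(re);
    if (m) found.push({ ip: m[1], type: m[2] });
  }
  return found;
}
```

A page could gather these simply by creating a peer connection and listening for `onicecandidate` events; no call placed, no permission prompt shown.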
In some jurisdictions and situations, IP addresses are considered Personally Identifiable Information (PII) and are protected under privacy regulations like GDPR and CCPA. Mishandling of PII can expose a website operator to significant liability. But beyond the potential legal ramifications, the WebRTC situation was just a breach of trust and a bad look for browser vendors.
Example 2: Web MIDI API dumps a list of connected USB devices
A year ago I was playing around with the MIDI API in Chrome while working on a software synthesizer, when I noticed a pretty serious problem: Chrome's implementation of the API carelessly skipped over the user agent permission prompt, which is part of the spec. Instead, Chrome would immediately dump a list of connected USB MIDI devices, with no user notice or consent, whenever a call to `requestMIDIAccess` was made.
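As a hedged sketch of the flow the spec intends, here is what calling the API looks like; per the spec, the user agent is supposed to prompt for permission before any device names are revealed (outside a browser this simply reports no support):

```javascript
// Hedged sketch of the Web MIDI flow as specified: requestMIDIAccess() is
// supposed to be gated behind a user-agent permission prompt before any
// device identities are revealed. Outside a browser this reports no support.
async function listMidiDevices() {
  if (typeof navigator === "undefined" || !navigator.requestMIDIAccess) {
    return []; // Web MIDI not available in this environment
  }
  const access = await navigator.requestMIDIAccess(); // spec: may prompt the user
  return [...access.inputs.values()].map(
    (d) => `${d.manufacturer} ${d.name}` // exactly the identifying data at stake
  );
}
```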
When I first wrote about this, I noted that the leakage of USB devices could be used to deanonymize users, but didn't believe this was widely known or used by any bad actors. However, last week I was doing some "security research" with a more recent build of Chromium, which has fixed the permission problem, and I ran into this prompt on popular porn site xhamster.com:
The totality of exposed data can precisely identify users.
The Electronic Frontier Foundation (EFF) has a great write-up and demonstration of device fingerprinting techniques via their Panopticlick Project:
When you visit a website, you are allowing that site to access a lot of information about your computer's configuration. Combined, this information can create a kind of fingerprint — a signature that could be used to identify you and your computer. Some companies use this technology to try to identify individual computers.
While Mozilla has attempted to mitigate some common fingerprinting techniques with Firefox's Enhanced Tracking Protection, much of their approach relies on blacklists (aka "whack-a-mole"). For Chrome, Google has business incentives as the world's biggest advertiser to not give a shit about fingerprinting, as the techniques are primarily used by shady adtech players (aka their competition), and thus strengthen Google's lobbying position as "one of the good guys" that they leverage to widen the regulatory moat around their Better Ads coalition, maintaining their dominance over the entire industry.
But who is this really harming?
Later, around the time of the 2016 election, my executive director got this notice from Google while logging into his Gmail account (we were told it was Russians):
Here in the U.S., where we have laws and civil rights, malware and phishing attacks are mostly just annoying. It's never fun to have your data stolen and it's expensive to keep buying new laptops every time someone gets infected. But in countries like Iran or China, where civil rights are suppressed and activists frequently "disappear," these types of targeted attacks can have life-or-death consequences. In Edward Snowden's disclosures of National Security Agency surveillance practices, he revealed that the NSA piggybacks off U.S. ad networks to precisely target individuals. It's a fair bet that hostile foreign governments are doing the same sort of thing with less-savory foreign ad networks. This is why human rights workers must take care to protect themselves from trackers, scripts and browser fingerprinting.
Another example: Grandma gets pwned
Ethical engineering matters.
Remember — users are human beings and privacy is a human right.
While we can always write code that respects peoples' privacy preferences, it's worth noting that we can only minimize the potential for harm. Whenever we write code and people execute it, there is always a basic assumption of risk on both sides:
- that our code could be used in malicious ways we never intended; and
- that the act of running our code could expose a person to harm beyond our ability to control for.
These risks are inherent to all software and should not block us from releasing our work. However, the question of "should this software even exist?" and other ethical considerations are relevant, and they're covered in the next section.
A playbook for ethical engineering on the web
Again, ethical engineering is using the tools at our disposal to write code that minimizes harm and respects peoples' human rights. I've painted with some broad strokes above, but now I'll focus on some actionable steps that any web developer can take to apply an ethical mindset to their work. This is not an exhaustive list, and I invite you to share your own ideas in the comments below.
1. Respect the Do Not Track (DNT) header.
If a person's browser is passing the DNT header, do not place any cookies without their explicit permission. Explicit permission could mean logging into your site by providing credentials, OAuth or similar; alternatively, the person could opt in to cookies by clicking a link or submitting a form indicating their permission.
If a person has opted in to cookies, pass a `Tk: C` response header as described in section 6.2 of the W3C Tracking Preference Expression draft.
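One way to wire this up, sketched as a small pure helper (the function name is my own, not from any library, and the default-deny behavior when no preference is sent is my own assumption):

```javascript
// Hedged sketch: derive the Tk response header from the DNT request header
// and explicit consent. Values follow the W3C TPE draft: "C" means tracking
// with consent, "N" means not tracking.
function trackingStatus(dntHeader, hasExplicitConsent) {
  if (hasExplicitConsent) return "C"; // person opted in (login, form, etc.)
  if (dntHeader === "1") return "N";  // DNT is set: place no cookies
  return "N"; // assumption: default to not tracking when no preference is sent
}

// A request handler would then only set cookies when the status is "C", e.g.:
// res.setHeader("Tk", trackingStatus(req.headers["dnt"], session.consented));
```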
You may also wish to create a Do Not Track Policy in accordance with the EFF's well-known format. Doing so will inform other software that your domain respects peoples' tracking preferences.
3. Avoid the use of third-party scripts, trackers, and analytics.
Embedding code from origins you do not control presents a couple of problems: you give a third party the ability to track everyone who visits your site, and you give them the ability to run arbitrary code on your pages if they are ever compromised or turn malicious.
For site analytics, consider using Matomo as a locally-hosted alternative to Google Analytics.
4. If you must embed a remote script, enable Subresource Integrity (SRI).
Subresource Integrity (SRI) is a security feature that enables browsers to verify that resources they fetch (for example, from a CDN) are delivered without unexpected manipulation. It allows you to include a cryptographic hash with any remote `<script>` tag, which the browser verifies on load. If the script does not match its SRI hash, the browser blocks it.
Subresource Integrity is a great way of keeping honest origins honest. It is a safeguard against a remote origin being compromised or otherwise behaving maliciously, because a script will just be blocked if they try any funny business.
5. Learn about and defend against common security threats.
The web is a dangerous place. As developers we have a responsibility to act with a basic level of competence to protect the security of our software and the people who use it. While no one can anticipate every possible security threat, everyone can learn the basic threats that face many sites and are widely exploited across the web. This low-hanging fruit includes Cross-Site Scripting (XSS), SQL Injection, and Cross-Site Request Forgery (CSRF). MDN has a great write-up on Website security which could be considered required reading for a beginner.
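As a taste of the XSS portion of that reading list: the core defense is escaping untrusted text before interpolating it into HTML. A minimal sketch (real projects should prefer a vetted templating library that escapes by default):

```javascript
// Escape the five HTML-significant characters in untrusted text so the
// browser renders it as text instead of markup. This covers element content
// and quoted attributes; URLs and inline scripts need context-aware handling.
function escapeHtml(untrusted) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return untrusted.replace(/[&<>"']/g, (c) => map[c]);
}

escapeHtml('<script>alert("pwned")</script>');
// -> "&lt;script&gt;alert(&quot;pwned&quot;)&lt;/script&gt;"
```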
6. Use HTTPS!
HTTPS encrypts all traffic to and from your server. This prevents all flavors of malicious actors from intercepting and monitoring peoples' data in transit. HTTPS used to be an expensive hassle to set up, but now it's free and easy thanks to the Let's Encrypt project. It's [current year] — use HTTPS!
7. Implement a Content Security Policy (CSP).
Using a CSP is especially important if your site allows people to post content (for example a commenting system). Typically you would attempt to sanitize user-generated content (UGC) and prevent people from embedding remote scripts on your site, but you might not be able to foresee all the "creative" ways malicious individuals could try to get around your best efforts. With CSP, you can block any scripts being loaded on your page from all but a pre-approved set of origins, among many other benefits. This gives you a good layer of defense against many common attacks.
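A policy is just a response header. Here's a sketch of assembling one from a directive map; the helper name is my own and the origins are placeholders:

```javascript
// Assemble a Content-Security-Policy header value from a directive map.
// Origins below are placeholders; tailor the allow-list to your own site.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([directive, sources]) => `${directive} ${sources.join(" ")}`)
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://cdn.example.com"], // nothing else may run
  "object-src": ["'none'"],
});
// Send as: Content-Security-Policy: <policy>
```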
8. Consider releasing under a Free and Open Source Software (FOSS) license.
Taking the previous steps will help protect peoples' right to Privacy. Software can also protect other human rights, including the right to Free Speech.
Free Speech is often a flashpoint for controversy on the web, largely due to the platform effect of centralized social media and discussion platforms—in order to protect the goals of any sufficiently large platform that hosts user-generated content, it becomes necessary to moderate (aka censor) speech. This applies to the largest social media sites and the smallest discussion forums. For example, if you're running a gardening forum, you may not wish to allow discussions that devolve into presidential politics; it is simply not relevant to your purpose in offering the site.
Software released under a FOSS license guarantees people four essential freedoms:
- The freedom to run the program for any purpose.
- The freedom to study how the program works, and change it to make it do what you wish.
- The freedom to redistribute and make copies so you can help your neighbour.
- The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits.
You may not wish to allow certain types of speech on your own servers, but by releasing Free Software, people have the freedom to run your software on their own servers for their own purposes. This protects peoples' Free Speech and also helps in the effort to re-decentralize the Internet. Look to the Mastodon Project as a perfect example of decentralized FOSS that protects Free Speech.
9. Try to minimize harm; consider whether your software should even exist.
All too often, engineers release software not because it's a particularly good idea, but simply because they can. To be an ethical engineer, it's crucial to weigh the benefits your software offers the people who use it against the foreseeable harm it could do, both to those same people and to society at large. Only you and your personal ethics can determine where the tipping point is, where the benefits are outweighed by the foreseeable harm. But if you would consider your own software harmful, it is worth holding back and rethinking your approach.
If you want to learn more about ethics in tech, I'd highly recommend the following resources: