Hello, I’m Minh Nguyen (though I style myself Minh Nguyễn, with all the wonderful diacritics), a graduate of St. Columban School and St. Xavier High School and currently a sophomore at Stanford University. Passing by my dorm room, you might’ve seen me staring at the monitor, the monitor mutually staring back, as I type… click… type… click— blog

February 22, 2019

The Benson Street Bridge, or “Rainbow Bridge”, marks the city limit between Reading and Lockland, Ohio. Residents are fond of mentioning a sign that hangs over the bridge, proclaiming both Cincinnati suburbs to be “Where Friends Meet”. But if you talk to enough people from the surrounding area, you eventually hear whispers about a less friendly sign that used to be posted at the city limit, warning nonwhites not to set foot in Reading. My mother used to ride the bus with an African American person who still avoided that town, even though she said the whites-only sign hadn’t been up since the 1960s. You can imagine there must’ve been robust enforcement of that policy for it to have wound up on a welcome sign in the first place.

I first heard about her experience while growing up on the other side of town in Loveland. I was still too young and sheltered to take notice of the occasional weird remark or special treatment that in hindsight was almost certainly related to my ethnicity. But that story struck me, because we previously lived in Reading and we went to church and shopped there for several years after. To a child like me, Reading seemed like a perfectly normal community, not the sort of place that only a generation earlier might’ve treated nonwhites like criminals or kept us out.

Benson Street Bridge in 1951
The Benson Street Bridge is pictured in a 1951 book. A replica of this original “Where Friends Meet” sign was raised over the bridge last year. No whites-only sign is visible in the photo. If the sign was indeed posted on Benson Street, it’s possible that it could’ve been posted just after the bridge, remaining out of view, such as where a Neighborhood Watch sign is seen in a photo from 1998 or a roadside welcome sign is seen in Google Street View in 2018.

Reading’s decades-long history as a sundown town goes unmentioned in official town histories. A 1951 book commemorating Reading’s centennial tells some tall tales about Reading’s history but omits any mention of the ban on blacks. Surrounding the mayor’s embellished historical account are advertisements by local businesses praising Reading as a “progressive town”. In fact, it took me many hours of scouring local newspaper archives before I finally found a single published confirmation of the ban, from 1912.

Fear Crowd as Suspect Faces Girls
On July 5, 1912, The Cincinnati Post, a pro-labor paper, mentioned Reading’s ban on blacks to explain why an armed mob of Reading villagers would invade neighboring towns to capture a black suspect falsely accused of attacking two white women. Another daily paper, The Cincinnati Enquirer, covered the same incident but downplayed the mob and mentioned nothing about a ban on blacks.

Even if the community felt little need to advertise its ban in the press, those who needed to know about it did get the message. Census records show a purely white Reading, extremely unusual for the area and for a town of its size. Until the 1970s, Reading’s black population was usually nonexistent and never broke two percent of the total population. By contrast, neighboring towns had plenty of black residents, especially Lincoln Heights, the first black-led municipality in the North.

Reading population by race, from incorporation to present
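| Census | 1860 | 1870 | 1880 | 1890 | 1900 | 1910 | 1920 | 1930 | 1940 | 1950 | 1960 | 1970 | 1980 | 1990 | 2000 | 2010 |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| White | 1,225 | 1,575 | ? | ? | 3,076 | 3,985 | 4,463 | 5,664 | 6,079 | 7,833 | 12,816 | 14,213 | 12,563 | 11,755 | 10,579 | 9,251 |
| African-American | 5 | 0 | ? | ? | 0 | 0 | 77 | 59 | 0 | 0 | 1 | 60 | 196 | 172 | 361 | 756 |
| Other | 0 | 0 | ? | ? | 0 | 0 | 0 | 0 | 0 | 3 | 15 | 30 | 84 | 111 | 352 | 378 |
| Total | 1,230 | 1,575 | 2,680 | ? | 3,076 | 3,985 | 4,540 | 5,723 | 6,079 | 7,836 | 12,832 | 14,303 | 12,843 | 12,038 | 11,292 | 10,385 |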
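For anyone who wants to check the "never broke two percent" claim against the table, here is a minimal Python sketch. The figures are copied from the African-American and Total rows above; 1880 and 1890 are left out because the table has no by-race figures for them. The names `CENSUS`, `black_share`, and `peak_share` are mine, chosen only for illustration.

```python
# (African-American residents, total population) per census year,
# copied from the table above. 1880 and 1890 are omitted because the
# table lists "?" for their by-race figures.
CENSUS = {
    1860: (5, 1230),
    1870: (0, 1575),
    1900: (0, 3076),
    1910: (0, 3985),
    1920: (77, 4540),
    1930: (59, 5723),
    1940: (0, 6079),
    1950: (0, 7836),
    1960: (1, 12832),
    1970: (60, 14303),
    1980: (196, 12843),
    1990: (172, 12038),
    2000: (361, 11292),
    2010: (756, 10385),
}


def black_share(year):
    """Return the African-American share of the population, in percent."""
    black, total = CENSUS[year]
    return 100.0 * black / total


def peak_share(through_year):
    """Return (year, percent) for the highest share up to and including through_year."""
    years = [y for y in CENSUS if y <= through_year]
    peak_year = max(years, key=black_share)
    return peak_year, black_share(peak_year)
```

Calling `peak_share(1970)` picks out 1920, at roughly 1.7 percent, which is the highest share in any census through 1970.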

Other communities have had their ugly side too. Loveland was where the KKK held regional conventions and staged photo-ops in the 1960s and 1970s. As recently as the 1990s, when I lived there, they almost put up their cross in front of City Hall, after Cincinnati put an end to their annual rallies downtown. Yet despite the problems African Americans in Loveland may have faced, they were still able to reside there in greater numbers. Reading’s demographics stood apart from its neighbors, a fact that was surely clearer in person than in a federal census report. The sundown town phenomenon was especially pernicious because it was, in a sense, uncontroversial in communities like Reading.

As I continued to search newspaper archives for additional information about Reading’s race relations, I stumbled upon articles about dozens of other towns across the country that had driven out their black residents, either violently or by threat of violence, and become sundown towns. Each time, I edited the town's Wikipedia article to acknowledge the town’s former antagonism towards blacks. In at least one instance, I found that a passage on race relations had already been added to the article, only to be removed by a local resident who perceived it to be “playing the race card”, or blowing out of proportion what was historically a fact of life. But this is not a case of judging yesterday’s norms by today’s morals. Back then, southern Democratic newspapers often mocked northerners and Republicans as hypocrites for criticizing Jim Crow laws but, on the other hand, implementing an extreme form of segregation not often found in the South. Some northern papers also noted the existence of sundown towns with regret. People knew it was wrong.

By now, such policies have been swept under the rug so thoroughly that many current residents are legitimately unaware of their hometown’s sordid past. For them, seeing it suddenly appear in the town’s Wikipedia article can feel like an attack on their own character. But silence can only perpetuate injustice. There’s at least one person who still bears the burden of that long-gone sign at the Reading city limit. She will continue to experience that stigma until the community acknowledges its past and declares a definitive end to institutional racism as publicly as it started. Mentioning sundown towns in these Wikipedia articles is the least I can do to start much-needed dialogue before it’s too late to give people like her some closure.

January 2, 2019

This April marked ten years since I started mapping for OpenStreetMap. Then as now, mapping has always been a hobby for me, even as I picked up a day job writing OSM-powered software.

I seem to have a knack for playing double duty, awkwardly. That was on full display this year, as I juggled personal and employer-sponsored engagements back-to-back at State of the Map U.S. in Detroit, then traveled to WikiConference North America in Columbus to repeat the process. Fortunately, my first trip overseas, for a workshop at the international State of the Map in Milan, was less demanding. It was also an opportunity to witness firsthand the political fault lines running through the OSM community.

For a couple years, some community members have insisted on distinguishing between hobbyists on the one hand, and commercial interests on the other hand, with the phrase “craft mapper”. Some mappers literally wear the term as a badge of pride, in response to a blog post that questioned their influence on the project’s methods and atmosphere.

Windmill

To hear it from them, craft mappers seem to be defined by angst about a future where computer vision and machine learning and import bots and teams of paid mappers run circles around hobbyists. But I prefer to focus on the motivation rather than the method: call me a “grassroots mapper”. It’s an important distinction, and not only because I’m a teetotaler.

At its best, grassroots mapping is about leading by example and having fun all the while. Freed from short-term business justifications or even the need to focus on anything at all, grassroots mappers are continually at the vanguard of what’s worth mapping. I don’t have to have a plan for mapping all the restaurants in order to map just a few that keep getting neglected by industry. There doesn’t have to be a point to mapping corn mazes, or lavishing world-class attention upon a provincial town. What looks like tilting at windmills today will before long be the domain of paid mappers and artificial intelligence, as we move on to the next set of ideas to prove out.

Yet grassroots mapping is fundamentally uneven and small-scale, which makes the results suboptimal for just about any practical use case. As a grassroots mapper, I want my mapping to benefit ordinary people. Ordinary people need guarantees around reliability and consistency before they can even consider the special touches we put on the map. It would be unfair and counterproductive to saddle grassroots mappers’ spirits with the daily grind of making the map reliable and consistent. That’s what imports, organized mapping, and bots are for. The map can achieve that level of quality, and we can still have our fun, but only as long as the OSM project remains a big tent, welcoming all the help it can muster.

September 16, 2018

Recently, I discovered that the author of several Firefox add-ons has turned some of them into spyware designed to steal passwords from unsuspecting users. This post details the classic man-in-the-browser Trojan horse attack to raise awareness about shortcomings in Firefox’s WebExtensions API and the process for approving add-ons for distribution via the Firefox Add-ons website.

A gift for Troy

A year ago, Mozilla eliminated the so-called “legacy” extension platform in favor of a new WebExtensions API largely copied from Google Chrome’s extension API. This caused my AVIM extension to become incompatible with Firefox 57 and above, inconveniencing many Vietnamese-speaking Firefox users who depended on it to browse and communicate in their native language. I remain committed to eventually rewriting AVIM as a WebExtension, but there are significant technical hurdles.

In the meantime, two developers took advantage of the void left by AVIM. One created an extension that opens their homepage, from which they served ads alongside a download link for someone else’s desktop-based IME. Another, a self-described cybersecurity practitioner who goes by “domdomrung”, ported several add-ons from Google Chrome to Firefox, including a rudimentary “Vietnamese Input Method” extension. Both authors’ extensions had a blatant SEO character, but that in itself didn’t violate Mozilla’s add-on policies on security and privacy.

Vietnamese Input Method; SafeKids
Vietnamese Input Method (left) and SafeKids (right) on the Firefox Add-ons website

In April, domdomrung published a minor update to Vietnamese Input Method that automatically went out to existing users. The previous month, domdomrung had also updated a “SafeKids” extension that ostensibly provided child monitoring functionality. Buried within both updates were five lines of malicious code that log users’ keystrokes and send them to domdomrung’s website. Specifically, the code injects a script into every webpage that logs each keystroke to local storage. Every 3.6 seconds or whenever the user presses the Enter or Return key, if at least five characters have been typed, the script loads a tracking pixel from domdomrung’s domain, blog.mybloggertricks.org, using an Image constructor.

The putative image’s URL includes the typed characters and the current webpage’s URL. The resource at this URL is not an image but rather an HTML document that redirects to Google. Nonetheless, like any tracking pixel, merely accessing the URL is enough to populate the server logs with the payload. In this case, the payload includes a full browsing history and is very likely to include user names and passwords.
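If you suspect an extension of phoning home this way, the request URLs in your browser's network log are the place to look. Below is a minimal Python sketch that decodes such a URL so you can see what it carries. The host `tracker.example` and the parameter names `k` and `u` are placeholders I made up for illustration; they are not the actual URL format these add-ons used.

```python
from urllib.parse import parse_qs, urlsplit


def describe_beacon(url):
    """Split a suspicious request URL into its host and decoded query parameters."""
    parts = urlsplit(url)
    # parse_qs percent-decodes values and returns a list per parameter;
    # keep only the first value of each for readability.
    params = parse_qs(parts.query, keep_blank_values=True)
    return {
        "host": parts.hostname,
        "params": {name: values[0] for name, values in params.items()},
    }


# Hypothetical beacon URL, shaped like a tracking pixel request.
EXAMPLE_URL = (
    "https://tracker.example/pixel.gif"
    "?k=hello%20world&u=https%3A%2F%2Fbank.example%2Flogin"
)
```

Running `describe_beacon(EXAMPLE_URL)` shows the typed text and the page address in the clear, which is all a server log needs to capture them.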

Surprise

Mozilla’s add-on policy requires every add-on to “disclose how the add-on collects, uses, stores and shares user data in the privacy policy” and “expects that the add-on limits data collection whenever possible”. Vietnamese Input Method had no privacy policy. Collecting keystrokes in this way provides no obvious user benefit. It’s impossible to know what domdomrung is doing with the data, but it isn’t difficult to imagine the data being used for nefarious purposes. (Some popular input method extensions do send keystrokes to a server but justify it as a necessary step for predictive suggestions. Such extensions do raise security and privacy concerns but are not necessarily malicious.)

SafeKids privacy policy
The SafeKids extension’s privacy policy says one thing in English to get by the review process and another thing in Vietnamese to attract downloads.

Meanwhile, SafeKids has a bilingual privacy policy that attempts to justify keylogging as a legitimate function that gives parents control over their children’s browsing behavior. Parenting questions aside, it’s worth noting that only the English portion of the privacy policy says, “I and mozilla cant not view your log” [sic]. That statement can’t be true, given how the extension phones home with browsing histories and keystrokes. Similar language is nowhere to be found in the Vietnamese portion of the privacy policy. I suspect domdomrung used this statement to deceive Mozilla Add-ons reviewers who speak English but not Vietnamese. It worked.

A bug in SafeKids makes it possible to identify at least two victims of this extension. The add-on aggressively added the keylogging script to every HTML document, failing to distinguish ordinary webpages from HTML documents serving as rich text editors, which are popular on forums. As a result, the malicious code is present verbatim in forum posts and elsewhere. Where possible, I have notified these individuals of the need to uninstall the malicious add-ons.

In July, a Firefox user left a review complaining about Vietnamese Input Method’s keylogging behavior. However, the add-on remained available for download. On September 1st, I discovered this add-on and reported it to Mozilla, and they removed it a couple days later. However, they haven’t added it to the blocklist, so users who have installed this extension continue to suffer this breach of privacy. On September 9th, I also reported SafeKids to Mozilla. It remains available for download.

Vietnamese Input Method reviews
I wasn’t the first to notice the malicious behavior in Vietnamese Input Method.

On September 16th, I filed Bugzilla bugs 1491716 and 1491717 to add Vietnamese Input Method and SafeKids to the blocklist, which will quickly disable any existing installations. I urge Mozilla to act promptly to delete SafeKids and blocklist both add-ons.

Thanks to Mozilla for deleting and blocklisting both add-ons.

Prevention and preemption

This incident underscores the privacy risk posed by keyboarding software and undermines Mozilla’s claim that WebExtensions is inherently more secure than the legacy add-on platform that it replaced. Granted, with WebExtensions, a security vulnerability such as AVIM’s 2009 eval() bug can’t as easily escalate into a full-blown attack on the local machine. Still, as long as an extension has access to the keyboard and the network simultaneously, it’s all too easy for an unscrupulous add-on author to steal personal information en masse. The irony isn’t lost on me that my own subsequent efforts to keep AVIM secure through sandboxing frequently drew scrutiny from reviewers, scrutiny that these add-ons clearly didn’t receive.

Mozilla should audit existing add-ons for undeclared tracking pixel usage and reimpose a human review process for add-on updates, just as there was before WebExtensions. A human review process can help ensure that add-on developers aren’t using “clean” first versions as cover for future malicious updates. As it is, WebExtensions and the Firefox Add-ons website give users a false sense of security.
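As a rough idea of what a first-pass audit could look for, here is a minimal Python sketch that flags extension source code where an `Image` object is constructed and a `.src` is assigned from a dynamically built string. This is my own heuristic, not anything Mozilla is known to run; the patterns and the `DYNAMIC_HINTS` list are assumptions, and a determined author could evade them with trivial obfuscation. It would only narrow down what a human reviewer should read first.

```python
import re
from typing import List

# Heuristic patterns (assumptions, not an official rule set).
IMAGE_CTOR = re.compile(r"\bnew\s+Image\s*\(")
SRC_ASSIGN = re.compile(r"\.src\s*=\s*([^;\n]+)")

# Signs that a URL is being assembled at runtime rather than hard-coded.
DYNAMIC_HINTS = ("+", "${", "encodeURIComponent", "location")


def flag_tracking_pixels(source: str) -> List[str]:
    """Return the right-hand sides of suspicious `.src = ...` assignments."""
    if not IMAGE_CTOR.search(source):
        return []
    findings = []
    for match in SRC_ASSIGN.finditer(source):
        expression = match.group(1).strip()
        if any(hint in expression for hint in DYNAMIC_HINTS):
            findings.append(expression)
    return findings
```

A static image such as `img.src = "logo.png"` passes untouched, while a URL concatenated with page data gets reported for a closer look.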

Beyond additional scrutiny, WebExtensions needs a dedicated, secure input method API. Such an API would isolate input method logic in an environment that lacks access to the network or indeed the rest of the webpage. Network access for predictive input methods could require a separate privilege. Ideally, an input method API would work throughout the browser, including in the search bar, as AVIM does. The lack of this functionality is a frequent complaint among users of Google Chrome’s input method extensions, including Google Input Tools and AVIM “lite”.

There is clearly a need for browser-based input method editors, as seen in the former popularity of AVIM for Firefox and the continuing popularity of input method extensions for Chrome. As Mozilla eliminates what’s left of legacy extensions, users shouldn’t have to forgo their privacy in order to communicate in their own language. An add-on platform that facilitates legitimate extensions such as AVIM can keep malicious add-on authors from taking advantage of these users.

September 4, 2018

Give them an inch and they’ll take a mile. For all the good that can come of the “Edit” button on open content sites like Wikipedia and OpenStreetMap – disseminating knowledge, giving a voice to marginalized communities, facilitating humanitarian initiatives – anyone deeply involved with such projects can tell you it all just barely works. It’s a daily miracle that the projects haven’t collapsed under the weight of graffiti, spam, and outright lies. To hear it from countless educators, Wikipedia simply can’t be trusted.

On Thursday, Mapbox and its customers fell victim to very prominent vandalism in OpenStreetMap, in which the label for New York City got renamed to something juvenile and offensive. A few weeks earlier, Wikimedia Maps was also affected by the same act of vandalism, which included numerous slurs, frustrating Wikipedia administrators who felt that OpenStreetMap doesn’t have its act together. Both episodes were painful for me to witness, as a longtime proponent of collaboration between the Wikipedia and OpenStreetMap communities and, obviously, as a Mapbox employee. In the days since, there’s been quite a bit of discussion in the OpenStreetMap U.S. Slack workspace about how OpenStreetMap and the consumers of its data could better prevent or mitigate such attacks. Even someone seeking credit for the attack has joined in with suggestions for improving “security”. True to form, there have been renewed calls for Wikipedia-style article protection or Google Maps–style moderation. But either approach is a poor fit for an open project built on geographic data.

Nevv York
Is it a feature or a bug that the New York City node can be so easily modified? If not for the project’s openness, it’s unlikely that this city’s name would have been translated into so many languages and that the surrounding neighborhood would have gained so much detail.

A Wikipedia article can be protected, disabling the “Edit” button for everyone except administrators, quite effectively preventing it from being vandalized. But what would the equivalent be on OpenStreetMap? Protecting a single feature, such as the New York City node, would only invite a vandal to place an asinine node right next to it. (On Wikipedia, you can create a “Nevv York City” article full of junk, but nothing would ever link to it, so the impact would be minimal.) Protecting the area around the New York City node, meanwhile, would deprive that city’s residents of the ability to contribute local knowledge to the project. OpenStreetMap is still incomplete enough that we can’t afford to lock down any portion of the map, not least a fast-changing city full of potential contributors.

The Wikipedia community has always viewed article protection as a public admission of failure, only to be used as a last resort. Given that the project’s slogan is “an encyclopedia that anyone can edit”, why should so many important articles be permanently closed to editing? Wikipedia has tried to replace after-the-fact countervandalism with several different moderation systems. The most recent was finally adopted by the German Wikipedia but was only adopted on a limited basis on a handful of articles at the English Wikipedia. Meanwhile, countless pages remain permanently protected. To say that moderation hasn’t taken off is quite an understatement.

What’s more, instituting a peer review process for OpenStreetMap would entail more than just flipping a switch. While verifying a Wikipedia edit might entail spot-checking cited sources, OpenStreetMap prizes unpublished local knowledge, so to truly review changes and stave off hoaxes could in many cases require in-person visits. Google Maps and Foursquare, two commercial, crowdsourced mapping projects, actively recruit locals to spend all their time curating and groundtruthing, yet they still suffer from rampant vandalism. OpenStreetMap already encourages groundtruthing, but any hard requirement along those lines would either be roundly ignored or lead to the immediate death of the project.

To be sure, there’s more to countervandalism than locking things down. The vandalism that propagated to Wikimedia Maps, then Mapbox, lasted less than two hours on OpenStreetMap before the community reverted the changes and banned the vandal’s user account. Mapbox has deployed increasingly sophisticated tools, including machine learning, for automatically detecting and blocking vandalism that does make it past the OpenStreetMap community. Wikipedia has done much the same for its content to good effect, which is why its administrators found it so frustrating that vandalism would still work its way in via embedded maps.

Last week’s incident was an exception, proving the adage that an adversary only has to get lucky once. That style of vandalism would probably have been caught by Wikipedia’s extensive system of blacklists and abuse filters, which prevent vandals from even saving blatantly bad edits. But a persistent vandal – this one used JOSM, the OpenStreetMap editor so advanced I steer clear of it – will eventually find a way around any blacklist or filter. And if the goal is to reduce the amount of time that a vandal’s work remains on the site, then any solution will also have the undesired effect of helping the vandal learn new tricks faster, like a virus that evolves more rapidly in a Petri dish.

For my part, I think the OpenStreetMap ecosystem places far too much emphasis on bad actions while doing very little to identify and block bad actors. As a Wikipedia administrator for the last 15 years, I’ve seen firsthand the lengths that people will go to evade content-based countervandalism. If you do battle long enough with a persistent vandal, it’s only a matter of time before well-meaning contributors give up due to the inconvenience of avoiding benign words and any article of interest. You can maintain the project’s health much more effectively by targeting the malicious individual.

At any given time, a bewildering number of IP addresses and IP address ranges are blocked from editing either temporarily or permanently. Open proxies are blocked on sight. There are even tools for ferreting out sockpuppets and sleeper accounts and blocking them proactively, along with rigorous accountability to prevent administrators from abusing ordinary users’ privacy. OpenStreetMap needs to adopt similar antiabuse tools based on user identities, if it is to have any hope against the hordes of script kiddies that now realize the map is subject to vandalism.
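The core of an IP-based block is a simple membership test. Here is a minimal Python sketch using the standard library's `ipaddress` module; the function name `is_blocked` and the block list format are my own invention for illustration, not how MediaWiki or OpenStreetMap actually store blocks, and the addresses come from ranges reserved for documentation.

```python
import ipaddress


def is_blocked(address, blocked_ranges):
    """Return True if the address falls inside any blocked range.

    Each entry in blocked_ranges is a single address or a CIDR range,
    written as a string. IPv4 and IPv6 entries can be mixed freely.
    """
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(entry) for entry in blocked_ranges)


# Hypothetical block list.
BLOCKED = ["192.0.2.0/24", "203.0.113.5", "2001:db8::/32"]
```

The hard part in practice is everything around this check: deciding which ranges to block, for how long, and how to avoid shutting out the good-faith contributors who share them.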

But more than any technical solution, the best approach I’ve found to fighting vandalism requires no software changes at all. OpenStreetMap needs to double down on building the most jaw-droppingly detailed map imaginable. At Wikipedia, articles on uncontroversial subjects suffer from less frequent vandalism as they develop from stubs into detailed articles. You’d think a more complete encyclopedia article would give the vandals more to tear apart, but in fact the opposite is true. I see this trend most clearly with articles about high schools, a favorite target of vain and profane adolescents:

High school article vandalism, September 2017–2018
This chart suggests that the most frequently vandalized Wikipedia articles about high schools are stub articles (up to around a thousand words) as opposed to more fully developed articles. Specifically, the chart counts the number of times an abuse filter was triggered within the past year, normalized by the number of page views during that time period (to account for some schools being more well-known than others). The abuse filter doesn’t track every instance of vandalism that occurs on the article; even better, it tracks edits that the wiki blocks outright, as well as those that the wiki flags for human review. I chose a time period of one year because vandalism of school articles waxes and wanes according to the school calendar.
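The normalization described above boils down to a single ratio. Here is a minimal Python sketch; the choice to express it per thousand page views is my assumption for readability, and the function name is made up for illustration.

```python
def hits_per_thousand_views(filter_hits, page_views):
    """Abuse filter hits over a period, normalized per 1,000 page views.

    Dividing by page views keeps a well-known school from looking worse
    than an obscure one simply because more people visit its article.
    """
    if page_views <= 0:
        raise ValueError("page_views must be positive")
    return 1000.0 * filter_hits / page_views
```

With made-up numbers, an article that triggered the filter 12 times over 4,000 views scores 3.0, while one that triggered it 12 times over 40,000 views scores 0.3, even though the raw counts are identical.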

A school article tends to be vandalized by two groups of people: the school’s own students and those of the school’s sports rivals. I suspect the former group is far less likely to vandalize an article that represents their school surprisingly well. I’m unsure why the rival schools’ students also tend to leave the article alone, but I wonder if it’s because the article’s length and detail make it look like less of a toy to be kicked around. Maybe an article or a map needs a critical mass of credibility to stave off these acts of immaturity. If so, that’s good news for OpenStreetMap, because so many of the community’s efforts – importing building footprints, adding turn lanes and speed limits, refining the cartography of popular map renderers – are primarily about parity with user expectations and thus about building trust with the user.

OpenStreetMap would do well to ignore calls to indiscriminately lock down content or lock out good-faith contributors. Technical barriers to entry are never a sound way to grow a community-oriented project. Instead, with a modicum of well-considered, identity-based antiabuse measures, the project’s contributors can go about their business, drowning out the vandals that already make too much of their dumb luck.

December 31, 2017

2016 paved the way for a 2017 that took me in a couple new directions but mostly fell along the same themes.

This summer, I traveled to the Wikimania conference in Montréal to continue promoting closer ties between the Wikimedia and OpenStreetMap movements. There’s a long road ahead, but tight coordination between the projects is feeling more inevitable now than it did back in 2015.

Mapbox also sent me to State of the Map U.S. in Boulder to make the case that OpenStreetMap needs a mobile software ecosystem to stay relevant. I’m still busy crafting that OSM-powered map library for iOS. But business needs sent me on a detour building a turn-by-turn navigation library to complement it. Well, I love detours, or longcuts, as I remind myself after forgetting to make that left turn for the dozenth time. Maybe I need a smartphone after all.

Then again, I love the fact that I can be an unabashed roadgeek and get paid for it. The U.S. road system is fantastically idiosyncratic, so the state of the art in navigation software falls quite short still. My typical hobbyist obsession with route shields and the like can ultimately benefit the motoring public through better software.

When it comes to navigation, my formal job qualifications amount to riding shotgun on the hour-long bus ride home from school – lest I get motion sickness – plus navigating from the backseat during road trips, keeping one eye on the radar detector and the other on the Watchman. But if nothing else, those experiences help me counter the Calicentrism that shows up in surprising ways in this field.

Speaking of California, OpenStreetMap’s coverage of San José is really looking up these days. As I briefly mentioned in Boulder, our coverage of points of interest is beginning to rival more established sources. To prove it, I manually counted the entries of the local phone book, thereby cementing my reputation among Mapboxers as a phone geek.

Mozilla finally killed off support for the extension platform that made Firefox a household name and kept the browser relevant during all these years of Chrome hegemony. Mozilla couched it as a speed boost, but Vietnamese speakers quickly discovered my keyboarding extension, AVIM, among the casualty list. They really have no good alternative for writing in their language. Hopefully I’ll be able to revive AVIM atop Firefox’s new extension architecture (really, Chrome’s) in the new year. It’ll involve some lobbying and mucking around in Firefox internals, which is a road I didn’t anticipate taking when I took over that extension a decade ago.

Another hobby of mine succumbed to technical debt this year: the blog you’re reading. It’s hobbling along again, thanks to a last-minute upgrade. But it’s only a matter of time before I have to move it off Movable Type. It’s been a solid 15 years or so.

Time flies. I flew a bit this year too, but not enough to shake the roadgeek out of me.


This weblog is licensed under a Creative Commons License.
