Planet Topic Maps

July 29, 2016

Patrick Durusau

QRLJacking [July 28, 2016]

QRLJacking — Hacking Technique to Hijack QR Code Based Quick Login System by Swati Khandelwal.

I put today’s date in the title so several years from now when a “security expert” breathlessly reports on “terrorists” using QRLJacking, you can easily find that it has been in use for years.

For some reason, “security experts” fail to mention that governments, banks, privacy advocates and numerous others in all walks of life and business use cybersecure services. Maybe that’s not a selling point for them. You think?

In any event, Swati gives a great introduction to QRLJacking, starting with:

Do you know that you can access your WeChat, Line and WhatsApp chats on your desktop as well using an entirely different, but fastest authentication system?

It’s SQRL, or Secure Quick Response Login, a QR-code-based authentication system that allows users to quickly sign into a website without having to memorize or type in any username or password.

QR codes are two-dimensional barcodes that contain a significant amount of information such as a shared key or session cookie.

A website that implements QR-code-based authentication system would display a QR code on a computer screen and anyone who wants to log-in would scan that code with a mobile phone app.

Once scanned, the site would log the user in without typing in any username or password.

Since passwords can be stolen using a keylogger, a man-in-the-middle (MitM) attack, or even brute force attack, QR codes have been considered secure as it randomly generates a secret code, which is never revealed to anybody else.

But, no technology is immune to being hacked when hackers are motivated.

Following this post and the resources therein, you will be well prepared for when your usual targets decide to “upgrade” to SQRL, or Secure Quick Response Login.
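
The login flow Swati describes can be modeled in a few lines. This is a toy, in-memory sketch and not any real service’s implementation: the class QrLoginServer, its method names and the session id "browser-123" are all hypothetical, and a real site would render the token as a QR code rather than return it.

```python
import secrets


class QrLoginServer:
    """Toy model of a QR-code login service (hypothetical, in-memory only)."""

    def __init__(self):
        self.pending = {}    # login token -> browser session id
        self.logged_in = {}  # browser session id -> user name

    def show_qr(self, browser_session):
        """Issue a random one-time token; a real site would show it as a QR code."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = browser_session
        return token

    def scan(self, token, phone_user):
        """Called by the already-authenticated phone app after scanning the code."""
        browser_session = self.pending.pop(token, None)
        if browser_session is None:
            return False  # unknown or already used token
        self.logged_in[browser_session] = phone_user
        return True


server = QrLoginServer()
token = server.show_qr("browser-123")
server.scan(token, "alice")
print(server.logged_in)
```

In this model the server logs in the browser session that requested the code; the phone only vouches for it.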


PS: There is a well-known pattern in this attack, one that is true for other online security systems. Do you see it?

by Patrick Durusau at July 29, 2016 01:53 AM

U.S. Climate Resilience Toolkit

Bringing climate information to your backyard: the U.S. Climate Resilience Toolkit by Tamara Dickinson and Kathryn Sullivan.

From the post:

Climate change is a global challenge that will require local solutions. Today, a new version of the Climate Resilience Toolkit brings climate information to your backyard.

The Toolkit, called for in the President’s Climate Action Plan and developed by the National Oceanic and Atmospheric Administration (NOAA), in collaboration with a number of Federal agencies, was launched in 2014. After collecting feedback from a diversity of stakeholders, the team has updated the Toolkit to deliver more locally-relevant information and to better serve the needs of its users. Starting today, Toolkit users will find:

  • A redesigned user interface that is responsive to mobile devices;
  • County-scale climate projections through the new version of the Toolkit’s Climate Explorer;
  • A new “Reports” section that includes state and municipal climate-vulnerability assessments, adaptation plans, and scientific reports; and
  • A revised “Steps to Resilience” guide, which communicates steps to identifying and addressing climate-related vulnerabilities.

Thanks to the Toolkit’s Climate Explorer, citizens, communities, businesses, and policy leaders can now visualize both current and future climate risk on a single interface by layering up-to-date, county-level, climate-risk data with maps. The Climate Explorer allows coastal communities, for example, to overlay anticipated sea-level rise with bridges in their jurisdiction in order to identify vulnerabilities. Water managers can visualize which areas of the country are being impacted by flooding and drought. Tribal nations can see which of their lands will see the greatest mean daily temperature increases over the next 100 years.  

A number of decision makers, including the members of the State, Local, and Tribal Leaders Task Force, have called on the Federal Government to develop actionable information at local-to-regional scales.  The place-based, forward-looking information now available through the Climate Explorer helps to meet this demand.

The Climate Resilience Toolkit update builds upon the Administration’s efforts to boost access to data and information through resources such as the National Climate Assessment and the Climate Data Initiative. The updated Toolkit is a great example of the kind of actionable information that the Federal Government can provide to support community and business resilience efforts. We look forward to continuing to work with leaders from across the country to provide the tools, information, and support they need to build healthy and climate-ready communities.

Check out the new capabilities today at!

I have only started to explore this resource but thought I should pass it along.

Of particular interest to me is the integration of data/analysis from this resource with other data.


by Patrick Durusau at July 29, 2016 01:28 AM

July 28, 2016

Patrick Durusau

greek-accentuation 1.0.0 Released

greek-accentuation 1.0.0 Released by James Tauber.

From the post:

greek-accentuation has finally hit 1.0.0 with a couple more functions and a module layout change.

The library (which I’ve previously written about here) has been sitting on 0.9.9 for a while and I’ve been using it successfully in my inflectional morphology work for 18 months. There were, however, a couple of functions that lived in the inflectional morphology repos that really belonged in greek-accentuation. They have now been moved there.

If that sounds a tad obscure, some additional explanation from an earlier post by James:

It [greek-accentuation] consists of three modules:

  • characters
  • syllabify
  • accentuation

The characters module provides basic analysis and manipulation of Greek characters in terms of their Unicode diacritics as if decomposed. So you can use it to add, remove or test for breathing, accents, iota subscript or length diacritics.

The syllabify module provides basic analysis and manipulation of Greek syllables. It can syllabify words, give you the onset, nucleus, coda, rime or body of a syllable, judge syllable length or give you the accentuation class of word.

The accentuation module uses the other two modules to accentuate Ancient Greek words. As well as listing possible_accentuations for a given unaccented word, it can produce recessive and (given another form with an accent) persistent accentuations.
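
To make “in terms of their Unicode diacritics as if decomposed” concrete, here is a minimal sketch using only Python’s standard unicodedata module. It does not use greek-accentuation or reproduce its API; the helper names marks and strip_accents are made up for illustration.

```python
import unicodedata

# Combining marks used for Greek diacritics (Unicode code points).
PSILI = "\u0313"        # smooth breathing
DASIA = "\u0314"        # rough breathing
OXIA = "\u0301"         # acute accent
VARIA = "\u0300"        # grave accent
PERISPOMENI = "\u0342"  # circumflex

ACCENTS = {OXIA, VARIA, PERISPOMENI}


def marks(ch):
    """Return the set of combining marks on one (possibly precomposed) character."""
    decomposed = unicodedata.normalize("NFD", ch)
    return {c for c in decomposed if unicodedata.combining(c)}


def strip_accents(word):
    """Remove accents but keep breathings and the base letters."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(c for c in decomposed if c not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


print(DASIA in marks("\u1f41"))  # omicron with rough breathing
print(strip_accents("\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"))
```

Working on the decomposed (NFD) form means a test such as “does this vowel carry a rough breathing?” is a simple membership check, whatever precomposed code point the text happened to use.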

Another name from my past and a welcome reminder that not all of computer science is focused on recommending ephemera for our consumption.

by Patrick Durusau at July 28, 2016 09:32 PM

Free & Interactive Online Introduction to LaTeX

Free & Interactive Online Introduction to LaTeX by John Lees-Miller.

From the webpage:

Part 1: The Basics

Welcome to the first part of our free online course to help you learn LaTeX. If you have never used LaTeX before, or if it has been a while and you would like a refresher, this is the place to start. This course will get you writing LaTeX right away with interactive exercises that can be completed online, so you don’t have to download and install LaTeX on your own computer.

In this part of the course, we’ll take you through the basics of how LaTeX works, explain how to get started, and go through lots of examples. Core LaTeX concepts, such as commands, environments, and packages, are introduced as they arise. In particular, we’ll cover:

  • Setting up a LaTeX Document
  • Typesetting Text
  • Handling LaTeX Errors
  • Typesetting Equations
  • Using LaTeX Packages

In part two and part three, we’ll build up to writing beautiful structured documents with figures, tables and automatic bibliographies, and then show you how to apply the same skills to make professional presentations with beamer and advanced drawings with TikZ. Let’s get started!

Since I mentioned fonts earlier today (Learning a Manifold of Fonts), it seems only fair to post about the only typesetting language that can take full advantage of any font you care to use.

TeX was released in 1978 and it has yet to be equaled by any non-TeX/LaTeX system.

It’s almost forty (40) years old, widely used and still sui generis.

by Patrick Durusau at July 28, 2016 09:15 PM



From the webpage:

MorganaXProc is an implementation of W3C’s XProc: An XML Pipeline Language written in Java™. It is free software, released under GNU General Public License version 2.0 (GPLv2).

The current version is 0.95 (public beta). It is very close to the recommendation with all related tests of the XProc Test Suite passed.

News: MorganaXProc 0.95-11 released

You can follow <xml-project/> on Twitter: @xml_project and peruse their documentation.

I haven’t worked my way through A User’s Guide to MorganaXProc but it looks promising.


by Patrick Durusau at July 28, 2016 08:13 PM

Entropy Explained, With Sheep

Entropy Explained, With Sheep by Aatish Bhatia.

Entropy is relevant to information theory, encryption, Shannon, but I mention it here because of the cleverness of the explanation.

Aatish sets a very high bar for taking a difficult concept and creating a compelling explanation that does not involve hand-waving and/or leaps of faith on the part of the reader.

Highly recommended as a model for explanation!


by Patrick Durusau at July 28, 2016 07:34 PM

What That Election Probability Means [500 Simulated Clinton-Trump Elections]

What That Election Probability Means by Nathan Yau.

From the post:

We now have our presidential candidates, and for the next few months you get to hear about the changing probability of Hillary Clinton and Donald Trump winning the election. As of this writing, the Upshot estimates a 68% probability for Clinton and 32% for Donald Trump. FiveThirtyEight estimates 52% and 48% for Clinton and Trump, respectively. Forecasts are kind of all over the place this far out from November. Plus, the numbers aren’t especially accurate post-convention.

But the probabilities will start to converge and grow more significant.

So what does it mean when Clinton has a 68% chance of becoming president? What if there were a 90% chance that Trump wins?

Some interpret a high percentage as a landslide, which often isn’t the case with these election forecasts, and it certainly doesn’t mean the candidate with a low chance will lose. If this were the case, the Cleveland Cavaliers would not have beaten the Golden State Warriors, and I would not be sitting here hating basketball.

Fiddle with the probabilities in the graphic below to see what I mean.

As always, visualizations from Nathan are a joy to view and valuable in practice.

You need to run it several times but here’s the result I got with “FiveThirtyEight estimates 52% and 48% for Clinton and Trump, respectively.”
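
If you want to try the idea without the graphic, here is a minimal sketch of the same kind of experiment. It is not Nathan’s code: the function name simulate_elections, the seeds and the assumption that each simulated election is an independent draw with a fixed win probability are mine.

```python
import random


def simulate_elections(p_clinton, n=500, seed=None):
    """Simulate n independent elections; return (clinton_wins, trump_wins)."""
    rng = random.Random(seed)
    clinton = sum(1 for _ in range(n) if rng.random() < p_clinton)
    return clinton, n - clinton


# Three separate runs of 500 simulated elections at a 52% win probability.
for run in range(3):
    clinton, trump = simulate_elections(0.52, n=500, seed=run)
    print(f"run {run}: Clinton {clinton}, Trump {trump}")
```

Each run gives a slightly different split, which is the point: a 52% probability describes the long-run share of wins, not the margin of any one election.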


You have to wonder what a similar simulation for breach/no-breach would look like for your enterprise?

Would that be an effective marketing tool for cybersecurity?

Perhaps not if you are putting insecure code on top of insecure code but there are other solutions.

For example, having state legislatures prohibit the operation of escape from liability clauses in EULAs.

Assuming there is someone who has read one in sufficient detail to draft legislation. ;-)

That could be an interesting data project. Anyone have a pointer to a collection of EULAs?

by Patrick Durusau at July 28, 2016 07:05 PM

Saxon-JS – Beta Release (EE-License)


From the webpage:

Saxon-JS is an XSLT 3.0 run-time written in pure JavaScript. It’s designed to execute Stylesheet Export Files compiled by Saxon-EE.

The first beta release is Saxon-JS 0.9 (released 28 July 2016), for use on web browsers. This can be used with Saxon-EE or later.

The beta release has been tested with current versions of Safari, Firefox, and Chrome browsers. It is known not to work under Internet Explorer. Browser support will be extended in future releases. Please let us know of any problems.

Saxon-JS documentation.

Goodies from the documentation:

Because people want to write rich interactive client-side applications, Saxon-JS does far more than simply converting XML to HTML, in the way that the original client-side XSLT 1.0 engines did. Instead, the stylesheet can contain rules that respond to user input, such as clicking on buttons, filling in form fields, or hovering the mouse. These events trigger template rules in the stylesheet which can be used to read additional data and modify the content of the HTML page.

We’re talking here primarily about running Saxon-JS in the browser. However, it’s also capable of running in server-side JavaScript environments such as Node.js (not yet fully supported in this beta release).

Grab a copy to get ready for discussions at Balisage!

by Patrick Durusau at July 28, 2016 03:20 PM

Web Design in 4 minutes

Web Design in 4 minutes by Jeremy Thomas.

From the post:

Let’s say you have a product, a portfolio, or just an idea you want to share with everyone on your own website. Before you publish it on the internet, you want to make it look attractive, professional, or at least decent to look at.

What is the first thing you need to work on?

This is more for me than you, especially if you consider my much neglected homepage.

Over the years my blog has consumed far more of my attention than my website.

I have some new, longer material that is more appropriate for the website so this post is a reminder to me to get my act together over there!

Other web design resource suggestions welcome!

by Patrick Durusau at July 28, 2016 02:56 PM

Learning a Manifold of Fonts

Learning a Manifold of Fonts by Neill D.F. Campbell and Jan Kautz.


The design and manipulation of typefaces and fonts is an area requiring substantial expertise; it can take many years of study to become a proficient typographer. At the same time, the use of typefaces is ubiquitous; there are many users who, while not experts, would like to be more involved in tweaking or changing existing fonts without suffering the learning curve of professional typography packages.

Given the wealth of fonts that are available today, we would like to exploit the expertise used to produce these fonts, and to enable everyday users to create, explore, and edit fonts. To this end, we build a generative manifold of standard fonts. Every location on the manifold corresponds to a unique and novel typeface, and is obtained by learning a non-linear mapping that intelligently interpolates and extrapolates existing fonts. Using the manifold, we can smoothly interpolate and move between existing fonts. We can also use the manifold as a constraint that makes a variety of new applications possible. For instance, when editing a single character, we can update all the other glyphs in a font simultaneously to keep them compatible with our changes.

To get a realistic feel for this proposal, try the interactive demo!

One major caveat:

In another lifetime, I contacted John Hudson of Tiro Typeworks about the development of the SBL Font series:


The origins of that project are not reflected on the SBL webpage, but the difference between John’s work and that of non-professional typographers is obvious even to untrained readers.

Nothing against experimentation with fonts but realize that for truly professional results, you need to hire professionals who live and breathe the development of high quality fonts.

by Patrick Durusau at July 28, 2016 02:43 PM

First Steps In The 30K Hillary Clinton Email Hunt

No, no tips from “Russian hackers,” but rather from the fine staff at the Wall Street Journal (WSJ).

You may have heard of the WSJ. So far as I know, they have never been accused of collaboration with Russian hackers, Putin or the KGB.

Anyway, the WSJ posted: Get and analyze Hillary Clinton’s email, which reads in part as follows:

In response to a public records request, the U.S. State Department is releasing Hillary Clinton’s email messages from her time as secretary of state. Every month, newly released messages are posted to as PDFs, with some metadata.

This collection of tools automates downloading and helps analyze the messages. The Wall Street Journal’s interactive graphics team uses some of this code to power our Clinton inbox search interactive.

Great step-by-step instructions on getting set up to analyze Clinton’s emails, with the one caveat that I had to change:

pip install virtualenv

to:

sudo pip install virtualenv

With that one change, everything ran flawlessly on my Ubuntu 14.04 box.

Go ahead and get set up to analyze the emails.

Tomorrow: Clues from this data set to help in the hunt for the 30K deleted Hillary Clinton emails.

by Patrick Durusau at July 28, 2016 01:37 AM

July 27, 2016

Patrick Durusau

The Right to be Forgotten in the Media: A Data-Driven Study

The Right to be Forgotten in the Media: A Data-Driven Study.


Due to the recent “Right to be Forgotten” (RTBF) ruling, for queries about an individual, Google and other search engines now delist links to web pages that contain “inadequate, irrelevant or no longer relevant, or excessive” information about that individual. In this paper we take a data-driven approach to study the RTBF in the traditional media outlets, its consequences, and its susceptibility to inference attacks. First, we do a content analysis on 283 known delisted UK media pages, using both manual investigation and Latent Dirichlet Allocation (LDA). We find that the strongest topic themes are violent crime, road accidents, drugs, murder, prostitution, financial misconduct, and sexual assault. Informed by this content analysis, we then show how a third party can discover delisted URLs along with the requesters’ names, thereby putting the efficacy of the RTBF for delisted media links in question. As a proof of concept, we perform an experiment that discovers two previously-unknown delisted URLs and their corresponding requesters. We also determine 80 requesters for the 283 known delisted media pages, and examine whether they suffer from the “Streisand effect,” a phenomenon whereby an attempt to hide a piece of information has the unintended consequence of publicizing the information more widely. To measure the presence (or lack of presence) of a Streisand effect, we develop novel metrics and methodology based on Google Trends and Twitter data. Finally, we carry out a demographic analysis of the 80 known requesters. We hope the results and observations in this paper can inform lawmakers as they refine RTBF laws in the future.

Not collecting data prior to laws and policies seems to be a trademark of the legislative process.

Otherwise, the “Right to be Forgotten” (RTBF) nonsense that only impacts searching and then only in particular ways could have been avoided.

The article does helpfully outline how to discover delistings, starting from 283 known delisted links.

Seriously? Considering that Facebook has 1 Billion+ users, much ink and electrons are being spilled over a minimum of 283 delisted links?

It’s time for the EU to stop looking for mites and mole hills to attack.

Especially since they are likely to resort to outright censorship as their next move.

That always ends badly.

by Patrick Durusau at July 27, 2016 09:55 PM

The Hillary Clinton 30K Email Hunt – Defend Your Nation’s Honor – Enter Today!

Would-be strongman (US President) Donald Trump insulted North Korean, Chinese, East European, to say nothing of American hackers today:

Donald J. Trump said Wednesday that he hoped Russia had hacked Hillary Clinton’s email, essentially encouraging an adversarial foreign power’s cyberspying on a secretary of state’s correspondence.

“Russia, if you’re listening, I hope you’re able to find the 30,000 emails that are missing,” Mr. Trump said, staring directly into the cameras. “I think you will probably be rewarded mightily by our press.”

(Donald Trump Calls on Russia to Find Hillary Clinton’s Missing Emails by Ashley Parker.)

Russia’s name has been thrown around recently, like the “usual suspects” in Casablanca, but that’s no excuse for Trump to insult other worthy hackers.

No slight to Russian hackers but an open competition between all hackers is the best way to find the 30K deleted Clinton emails.

Trump hasn’t offered a cash prize but think of the street cred you would earn for your nation/group!

Don’t limit yourself to the deleted emails.

Making Clinton’s campaign security the equivalent of an extreme string bikini results in bragging rights as well.

by Patrick Durusau at July 27, 2016 06:56 PM

July 26, 2016

Patrick Durusau

Gasp! “The Jihadists’ Digital Toolbox:…”

The Jihadists’ Digital Toolbox: How ISIS Keeps Quiet on the Web by Jett Goldsmith.

From the post:

As the world dives deeper into the digital age, jihadist groups like ISIS and the Taliban have taken increasingly diverse measures to secure their communications and espouse their actions and ideas across the planet.

Propaganda has been a key measure of any jihadist group’s legitimacy since at least 2001, when al-Qaeda operative Adam Yahiye Gadahn established the media house As-Sahab, which was intended to spread the group’s message to a regional audience throughout Pakistan and Afghanistan.

Over the years, jihadist propaganda has taken a broader and more sophisticated tone. Al-Qaeda published the first issue of its digital newsmagazine, Inspire, in June of 2010. Inspire was aimed at an explicitly Western audience, and intended to call to jihad the would-be mujahideen throughout Europe and the United States.

When ISIS first took hold in Iraq and Syria, and formally declared its caliphate in the summer of 2014, the group capitalized on the groundwork laid by its predecessors and established an expansive, highly sophisticated media network to espouse its ideology. The group established local wilayat (provincial) media hubs, and members of its civil service distributed weekly newsletters, pamphlets, and magazines to citizens living under its caliphate. Billboards were posted in major cities under its control, including in Raqqah and Mosul; FM band radio broadcasts across 13 of its provinces were set up to deliver a variety of content, from fatwas and sharia lessons to daily news, poetry, and nasheeds; and Al-Hayat Media Center distributed its digital newsmagazine, Dabiq, in over a dozen languages to followers across the world.

Jett covers:

  • Secure Browsers
  • Proxy Servers and VPNs
  • Propaganda Apps (read cellphone apps)
  • Encrypted Email
  • Mobile Privacy Apps
  • Encrypted Messages

That Jihadists or anyone else are using these tools may be a surprise to some Fortune or Economist readers, but every conscious person associated with IT can probably name one or more instances for each category.

I’m sure some Jihadists drive cars, ride hoverboards, or bicycles, but dramatic recitations on those don’t advance a discussion of Jihadists or their goals.

Privacy software is a fact of life in all walks and levels of a digital environment.

Crying “Look! Over there! Someone might be doing something we don’t like!” isn’t going to lead to any useful answers, to anything. Including Jihadists.

by Patrick Durusau at July 26, 2016 09:02 PM

July 25, 2016

Patrick Durusau

PornHub Payday! $20,000!

PornHub Pays Hackers $20,000 to Find Zero-day Flaws in its Website by Wang Wei.

From the post:

Cyber attacks get bigger, smarter, more damaging.

PornHub launched its bug bounty program two months ago to encourage hackers and bug bounty hunters to find and responsibly report flaws in its services and get rewarded.

Now, it turns out that the world’s most popular pornography site has paid its first bounty payout. But how much?

US $20,000!

Not every day that a porn site pays users!

While PHP has fixed the issue, be mindful there are plenty of unpatched versions of PHP in the wild.

Details of this attack can be found at: How we broke PHP, hacked Pornhub and earned $20,000 and Fuzzing Unserialize.

Any estimate of how many non-patched PHP installations are on sites ending in .gov or .com?

by Patrick Durusau at July 25, 2016 09:32 PM

Accessing IRS 990 Filings (Old School)

Like many others, I was glad to see: IRS 990 Filings on AWS.

From the webpage:

Machine-readable data from certain electronic 990 forms filed with the IRS from 2011 to present are available for anyone to use via Amazon S3.

Form 990 is the form used by the United States Internal Revenue Service to gather financial information about nonprofit organizations. Data for each 990 filing is provided in an XML file that contains structured information that represents the main 990 form, any filed forms and schedules, and other control information describing how the document was filed. Some non-disclosable information is not included in the files.

This data set includes Forms 990, 990-EZ and 990-PF which have been electronically filed with the IRS and is updated regularly in an XML format. The data can be used to perform research and analysis of organizations that have electronically filed Forms 990, 990-EZ and 990-PF. Forms 990-N (e-Postcard) are not available within this data set. Forms 990-N can be viewed and downloaded from the IRS website.

I could use AWS but I’m more interested in deep analysis of a few returns than analysis of the entire dataset.

Fortunately the webpage continues:

An index listing all of the available filings is available at s3://irs-form-990/index.json. This file includes basic information about each filing including the name of the filer, the Employer Identification Number (EIN) of the filer, the date of the filing, and the path to download the filing.

All of the data is publicly accessible via the S3 bucket’s HTTPS endpoint at No authentication is required to download data over HTTPS. For example, the index file can be accessed at and the example filing mentioned above can be accessed at (emphasis in original).

I open a terminal window and type:


which as of today, results in:

-rw-rw-r-- 1 patrick patrick 1036711819 Jun 16 10:23 index.json

A trial grep:

grep "NATIONAL RIFLE" index.json > nra.txt

Which produces:

{"EIN": "530116130", "SubmittedOn": "2014-11-25", "TaxPeriod": "201312", "DLN": "93493309004174", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201423099349300417", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2013-12-20", "TaxPeriod": "201212", "DLN": "93493260005203", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201302609349300520", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2012-12-06", "TaxPeriod": "201112", "DLN": "93493311011202", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201203119349301120", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "396056607", "SubmittedOn": "2011-05-12", "TaxPeriod": "201012", "FormType": "990EZ", "LastUpdated": "2016-06-14T01:22:09.915971Z", "OrganizationName": "EAU CLAIRE NATIONAL RIFLE CLUB", "IsElectronic": false, "IsAvailable": false},
{"EIN": "530116130", "SubmittedOn": "2011-11-09", "TaxPeriod": "201012", "DLN": "93493270005081", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201132709349300508", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2016-01-11", "TaxPeriod": "201412", "DLN": "93493259005035", "LastUpdated": "2016-04-29T13:40:20", "URL": "", "FormType": "990", "ObjectId": "201532599349300503", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},

We have one errant result, the “EAU CLAIRE NATIONAL RIFLE CLUB,” so let’s delete that, re-order by year and the NATIONAL RIFLE ASSOCIATION OF AMERICA result reads (most recent to oldest):

{"EIN": "530116130", "SubmittedOn": "2016-01-11", "TaxPeriod": "201412", "DLN": "93493259005035", "LastUpdated": "2016-04-29T13:40:20", "URL": "", "FormType": "990", "ObjectId": "201532599349300503", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2014-11-25", "TaxPeriod": "201312", "DLN": "93493309004174", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201423099349300417", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2013-12-20", "TaxPeriod": "201212", "DLN": "93493260005203", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201302609349300520", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2012-12-06", "TaxPeriod": "201112", "DLN": "93493311011202", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201203119349301120", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2011-11-09", "TaxPeriod": "201012", "DLN": "93493270005081", "LastUpdated": "2016-03-21T17:23:53", "URL": "", "FormType": "990", "ObjectId": "201132709349300508", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},

Of course, now you want the XML 990 returns, so extract the URLs for the 990s to a file, here nra-urls.txt (I would use awk if it is more than a handful):
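
The exact command is not shown here. One way to do the filtering, re-ordering and URL extraction in a single pass is sketched below in Python rather than awk. The function name filing_urls, the sample records and their example.org URLs are hypothetical; only the field names (OrganizationName, TaxPeriod, URL) come from the index entries above.

```python
import json


def filing_urls(lines, org_name):
    """Return download URLs for one organization, most recent tax period first.

    Each line is one JSON object as grep saved it from index.json,
    possibly followed by a trailing comma.
    """
    records = []
    for line in lines:
        line = line.strip().rstrip(",")
        if not line:
            continue
        record = json.loads(line)
        # Exact name match drops near misses; entries without a URL are skipped.
        if record.get("OrganizationName") == org_name and record.get("URL"):
            records.append(record)
    records.sort(key=lambda r: r["TaxPeriod"], reverse=True)
    return [r["URL"] for r in records]


# Hypothetical sample lines standing in for the contents of nra.txt.
sample = [
    '{"TaxPeriod": "201312", "URL": "https://example.org/a.xml", "OrganizationName": "EXAMPLE ORG"},',
    '{"TaxPeriod": "201012", "OrganizationName": "OTHER EXAMPLE ORG CLUB"},',
    '{"TaxPeriod": "201412", "URL": "https://example.org/b.xml", "OrganizationName": "EXAMPLE ORG"},',
]
print("\n".join(filing_urls(sample, "EXAMPLE ORG")))
```

On the real data you would pass the open nra.txt file in place of sample and write the returned list, one URL per line, to nra-urls.txt.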

Back to wget:

wget -i nra-urls.txt


-rw-rw-r-- 1 patrick patrick 111798 Mar 21 16:12 201132709349300508_public.xml
-rw-rw-r-- 1 patrick patrick 123490 Mar 21 19:47 201203119349301120_public.xml
-rw-rw-r-- 1 patrick patrick 116786 Mar 21 22:12 201302609349300520_public.xml
-rw-rw-r-- 1 patrick patrick 122071 Mar 21 15:20 201423099349300417_public.xml
-rw-rw-r-- 1 patrick patrick 132081 Apr 29 10:10 201532599349300503_public.xml

Ooooh, it’s in XML! ;-)

For the XML you are going to need: Current Valid XML Schemas and Business Rules for Exempt Organizations Modernized e-File, not to mention a means of querying the data (may I suggest XQuery?).
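
Before writing queries it helps to see which elements a filing actually contains. The sketch below uses Python’s standard ElementTree, not XQuery. The sample document, its namespace and its element names are made up and far simpler than a real 990 return; element_counts and local_name are illustrative helpers.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tag names."""
    return tag.rsplit("}", 1)[-1]


def element_counts(xml_text):
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


# Hypothetical, heavily simplified stand-in for a downloaded *_public.xml file.
sample = """<Return xmlns="urn:example:return">
  <ReturnHeader><Filer><Name>EXAMPLE ORG</Name></Filer></ReturnHeader>
  <ReturnData>
    <Officer><Name>A</Name></Officer>
    <Officer><Name>B</Name></Officer>
  </ReturnData>
</Return>"""

for name, count in element_counts(sample).most_common():
    print(name, count)
```

For a downloaded filing, read the file’s text and pass it to element_counts; the most frequent names are usually the repeating groups worth querying first.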

Once you have the index.json file, with grep, a little awk and wget, you can quickly explore IRS 990 filings for further analysis or to prepare queries for running on AWS (such as discovery of common directors, etc.).


by Patrick Durusau at July 25, 2016 07:36 PM

Software Heritage – Universal Software Archive – Indexing/Semantic Challenges

Software Heritage

From the homepage:

We collect and preserve software in source code form, because software embodies our technical and scientific knowledge and humanity cannot afford the risk of losing it.

Software is a precious part of our cultural heritage. We curate and make accessible all the software we collect, because only by sharing it we can guarantee its preservation in the very long term.
(emphasis in original)

The project has already collected:

Even though we just got started, we have already ingested in the Software Heritage archive a significant amount of source code, possibly assembling the largest source code archive in the world. The archive currently includes:

  • public, non-fork repositories from GitHub
  • source packages from the Debian distribution (as of August 2015, via the snapshot service)
  • tarball releases from the GNU project (as of August 2015)

We currently keep up with changes happening on GitHub, and are in the process of automating syncing with all the above source code origins. In the future we will add many more origins and ingest into the archive software that we have salvaged from recently disappeared forges. The figures below allow to peek into the archive and its evolution over time.

The charters of the planned working groups:

Extending the archive

Evolving the archive

Connecting the archive

Using the archive

on quick review did not seem to me to address the indexing/semantic challenges that searching such an archive will pose.

If you are familiar with the differences in metacharacters between different Unix programs, that is only a taste of the differences that will be faced when searching such an archive.
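A small illustration of the metacharacter problem, using only the Python standard library: the same pattern string means one thing as a shell-style glob and another as a regular expression. The pattern and file names are invented for the example.

```python
import fnmatch
import re


def interpretations(pattern, name):
    """Return (matches_as_glob, matches_as_regex) for the same pattern string."""
    as_glob = fnmatch.fnmatchcase(name, pattern)           # shell-style wildcard
    as_regex = re.fullmatch(pattern, name) is not None     # regular expression
    return as_glob, as_regex


for name in ["abc.c", "aaxc"]:
    print(name, interpretations("a*.c", name))
```

As a glob, "a*.c" matches "abc.c" but not "aaxc". As a regular expression it is the other way around, because "*" repeats the preceding "a" and "." matches any character. Multiply that by every language and tool in a universal archive.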

Looking forward to learning more about this project!

by Patrick Durusau at July 25, 2016 12:49 AM

July 23, 2016

Patrick Durusau

Wikileaks Mentions In DNC Email – .000718%. Hillary To/From Emails – .000000% (RDON)

Cryptome tweeted today:


Would you believe that Hillary Clinton is more irrelevant than Wikileaks?

Consider the evidence:

Search for at Search the DNC email database

Scrape the 533 results, as of Saturday, 23 July 2016, into a file.

Grep for and pipe that to another file.

Clean out the remaining markup, insert line returns for commas in cc: field, lowercase and sort, then uniq.
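The cleanup step can be sketched in a few lines of Python. This is not the exact pipeline I ran, just the same operations in order: strip leftover markup, split on commas, lowercase, sort and de-duplicate. It assumes each field holds bare addresses rather than "Name <address>" pairs, and the sample addresses are placeholders.

```python
import re


def unique_addresses(header_lines):
    """Strip markup, split on commas, lowercase, then sort and de-duplicate."""
    seen = set()
    for line in header_lines:
        # Drop leftover HTML tags. Assumes addresses are not wrapped in <...>.
        text = re.sub(r"<[^>]+>", " ", line)
        for part in text.split(","):      # one address per entry
            address = part.strip().lower()
            if address:
                seen.add(address)
    return sorted(seen)


# Placeholder input standing in for scraped to/from/cc fields.
scraped = [
    "<td>Alice@Example.org, bob@example.org</td>",
    "<td>bob@example.org</td>",
]
print(unique_addresses(scraped))
```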


  1. – Adrienne K. Elrod
  2. – never a sender
  3. – Dennis Cheng
  4. – never a sender
  5. – Justin Klein
  6. – Josh Schwerin
  7. – Kathleen Gasperine
  8. – Lindsay Roitman
  9. – never a sender
  10. – Mary Rutherford Jennings
  11. – no author
  12. – 1 post, no sig
  13. – Zac Petkanas

That’s right! From January of 2015 until May of 2016, Hillary Clinton apparently had no emails to or from the DNC.

I find that to be unlikely to say the least.

What’s your explanation for the absence of Hillary Clinton emails to and from the DNC?

My explanation is that Wikileaks is manipulating both the data and all of us.

Here’s a motto for data leaks: Raw Data Or Nothing (RDON)

Say it, repeat it, demand it – RDON!

by Patrick Durusau at July 23, 2016 09:30 PM

Yes Luis, There Is A Fuck You Emoji

Luis Miranda, Communications Director of the DNC asks:


Yes, there is a Fuck You emoji!

For example, here is the Google version:


I don’t know if Luis is still looking for an answer to that question but if so, consider it answered!

Searching the DNC email database can be amusing, even educational, as the question from Luis demonstrates, but I would prefer the ability to browse and to download the dataset for deeper analysis.

What have you found in the DNC email database?

by Patrick Durusau at July 23, 2016 01:29 AM

July 22, 2016

Patrick Durusau

Write Chelsea Manning

Write Chelsea Manning

From the post:

Thank you for supporting WikiLeaks whistle-blower US Army Private Chelsea (formerly Bradley) Manning! You can write her today. As of April 23, 2014, a Kansas district judge has approved PVT Manning’s request for legal name change, and you can address your envelopes to her as “Chelsea E. Manning.”

Mail must be addressed exactly as follows:


Notes regarding this address:

  • Do not include a hash (“#”) in front of Manning’s inmate number.
  • Do not include any title in front of Manning’s name, such as “Ms.,” “Mr.,” “PVT,” “PFC,” etc.
  • Do not include any additional information in the address, such as “US Army” or “US Disciplinary Barracks.”
  • Do not modify the address to conform to USPS standards, such as abbreviating “North,” “Road,” “Fort,” or “Kansas.”
  • For international mail, either “USA” or “UNITED STATES OF AMERICA” are acceptable on a separate line.

What you can send Chelsea

Chelsea Manning is currently eligible to receive mail, including birthday or holiday cards, from anyone who wishes to write. You are also permitted to mail unframed photographs. …

I contacted the project and was advised that the best gift for Chelsea is:

…money order or cashiers check made out to “Chelsea E. Manning” and mailed to her postal address. These funds will be deposited into Chelsea’s prison account. She uses this account to make phone calls, purchase stamps, and buy other small comfort items not provided by the prison.

Let Chelsea know you appreciate her bravery and sacrifice!

by Patrick Durusau at July 22, 2016 09:46 PM

July 21, 2016

Patrick Durusau

Introspection For Your iPhone (phone security)

Against the Law: Countering Lawful Abuses of Digital Surveillance by Andrew ‘bunnie’ Huang and Edward Snowden.

From the post:

Front-line journalists are high-value targets, and their enemies will spare no expense to silence them. Unfortunately, journalists can be betrayed by their own tools. Their smartphones are also the perfect tracking device. Because of the precedent set by the US’s “third-party doctrine,” which holds that metadata on such signals enjoys no meaningful legal protection, governments and powerful political institutions are gaining access to comprehensive records of phone emissions unwittingly broadcast by device owners. This leaves journalists, activists, and rights workers in a position of vulnerability. This work aims to give journalists the tools to know when their smart phones are tracking or disclosing their location when the devices are supposed to be in airplane mode. We propose to accomplish this via direct introspection of signals controlling the phone’s radio hardware. The introspection engine will be an open source, user-inspectable and field-verifiable module attached to an existing smart phone that makes no assumptions about the trustability of the phone’s operating system.

If that sounds great, you have to love their requirements:

Our introspection engine is designed with the following goals in mind:

  1. Completely open source and user-inspectable (“You don’t have to trust us”)
  2. Introspection operations are performed by an execution domain completely separated from the phone’s CPU (“don’t rely on those with impaired judgment to fairly judge their state”)
  3. Proper operation of introspection system can be field-verified (guard against “evil maid” attacks and hardware failures)
  4. Difficult to trigger a false positive (users ignore or disable security alerts when there are too many positives)
  5. Difficult to induce a false negative, even with signed firmware updates (“don’t trust the system vendor” – state-level adversaries with full cooperation of system vendors should not be able to craft signed firmware updates that spoof or bypass the introspection engine)
  6. As much as possible, the introspection system should be passive and difficult to detect by the phone’s operating system (prevent black-listing/targeting of users based on introspection engine signatures)
  7. Simple, intuitive user interface requiring no specialized knowledge to interpret or operate (avoid user error leading to false negatives; “journalists shouldn’t have to be cryptographers to be safe”)
  8. Final solution should be usable on a daily basis, with minimal impact on workflow (avoid forcing field reporters into the choice between their personal security and being an effective journalist)

This work is not just an academic exercise; ultimately we must provide a field-ready introspection solution to protect reporters at work.

You need to copy those eight requirements out to a file for editing. When anyone proposes a cybersecurity solution, reword them as appropriate and present them as your user requirements.

An artist’s conception of what protection for an iPhone might look like:


Interested in protecting reporters and personal privacy? Follow Andrew ‘bunnie’ Huang’s blog.

by Patrick Durusau at July 21, 2016 09:24 PM

An analysis of Pokémon Go types, created with R

An analysis of Pokémon Go types, created with R by David Smith.

From the post:

As anyone who has tried Pokémon Go recently is probably aware, Pokémon come in different types. A Pokémon’s type affects where and when it appears, and the types of attacks it is vulnerable to. Some types, like Normal, Water and Grass are common; others, like Fairy and Dragon are rare. Many Pokémon have two or more types.

To get a sense of the distribution of Pokémon types, Joshua Kunst used R to download data from the Pokémon API and created a treemap of all the Pokémon types (and for those with more than 1 type, the secondary type). Johnathon’s original used the 800+ Pokémon from the modern universe, but I used his R code to recreate the map for the 151 original Pokémon used in Pokémon Go.

If you or your dog:


need a break from Pokémon Go, check out this post!

You will get some much needed rest, polish up your R skills and perhaps learn something about the Pokémon API.

The Pokémon Go craze brings to mind the potential for alternative location-based games, for example, games built around accessing locations that require steady nerves and social engineering skills. That definitely has potential.

Say a spy-vs-spy character at a location near a “secret” military base? ;-)

by Patrick Durusau at July 21, 2016 08:35 PM

Why You Can’t Keep Secrets (Or Be Cybersecure)

Why You Can’t Keep Secrets by William M. Arkin.

From the post:

I started thinking about this talk by polling friends in Washington to see if there were any good new jokes about secrecy. In other parts of the world, political jokes are often the purest expression of zeitgeist, so I thought a current favorite — you know, some knee slapper about the new Executive Order on classification, or one about the latest string of Bill Gertz’ leaks — would provide astute insight.

No dice though; people inside the beltway have never been renown for their humor.

In May, however, I was in Beirut, and the number of jokes about the Syrians were impressive.

Here’s my favorite.

Hafez Assad is with Bill Clinton and Jacques Chirac on the Mississippi River to negotiate Syria’s withdrawal from Lebanon. Assad drops his watch into the river and when he bend over the deck railing to look for it, snapping alligators thrust up from the deep. Clinton tells one of the Marine guards to retrieve President Assad’s watch. The Marine goes to the edge, looks over at the alligators and says to the President Mr. President, you know we live in the greatest country on earth, and therefore I can decline an unlawful order. If I jump in to retrieve Mr. Assad’s watch I would die, and besides I have a family…

So Chirac, thinking he can tweak the American nose says to a French soldier, jump in the water and retrieve Assad’s watch. The legionnaire snaps to attention and runs to dive in, but he then looks over and sees the snapping alligators, and turns to Chirac and says Monsieur President, you know our democracy is even older than America, and besides, I have a family…

So Assad whispers something in the ear of a Syrian soldier, who runs to the railing and without hesitation, jumps in the water, swims through the alligators, retrieves the watch, and returns safely to the boat. The Marine and the Legionnaire, both amazed, crowd around the Syrian to ask what Assad said.

Well, the soldier explains, I too have a family…


So what does this have to do with secrecy?

To me, it is a real world reminder that to level any kind of indictment about the evils of U.S. government secrecy is to be trivial. One only has to visit places like the Middle East to appreciate how free our system is.

Given the current events in Syria, a timely posting of a speech that Arkin made:

…twenty years ago to military and industry officers and officials at the annual U.S. Air Force National Security Leadership Course, Maxwell AFB, Alabama, delivered on 14 August 1996.

The central difficulty of both secrecy and cybersecurity is captured by the line:

Anyone knows that in order to preserve real secrets, they need to be identified.

As opposed to the blanket classification of nearly every document, memo, draft, email, etc., which is nearly the current practice in the Obama administration, you have to pick which secrets are truly worth protecting. And then protect them.

As Arkin points out, to do otherwise generates a climate where leaks are a routine part of government and generates suspicion even when the government, perhaps by accident, is telling the truth.

The same principle is true for cybersecurity. Have you identified the components of your network and the level of security appropriate to each one? Or do VPs still have write access to the accounting software?

For meaningful secrecy or cybersecurity, you must have explicit identification of what is to be secret/secure and what steps are taken to bring that about. Anything less and you won’t be able to keep secrets and/or have cybersecurity. (Ask the Office of Personnel Management (OPM), for example.)

by Patrick Durusau at July 21, 2016 08:08 PM

Twitter Nanny Says No! No!


For the other side of this story, enjoy Milo Yiannopoulos’s Twitter ban, explained by Aja Romano, where Aja is supportive of Twitter and its self-anointed role as arbiter of social values.

From my point of view, the facts are fairly simple:

Milo Yiannopoulos (formerly @Nero) has been banned from Twitter on the basis of his speech and the speech of others who agree with him.

What more needs to be said?

I have not followed, read, reposted or retweeted any tweets by Milo Yiannopoulos (formerly @Nero). And would not even if someone sent them to me.

I choose to not read that sort of material and so can anyone else. Including the people who complain in Aja’s post.

The Twitter Nanny becomes censor in insisting that no one be able to read tweets from Milo Yiannopoulos (formerly @Nero).

I’ve heard the argument that the First Amendment doesn’t apply to Twitter, which is true, but irrelevant. Only one country in the world has the First Amendment as stated in the US Constitution but that doesn’t stop critics from decrying censorship by other governments.

Or is it only censorship if you agree with the speech being suppressed?

Censorship of speech that I find disturbing, sexist, racist, misogynistic, dehumanizing, transphobic, homophobic, supporting terrorism, is still censorship.

And it is still wrong.

We only have ourselves to blame for empowering Twitter to act as a social media censor. Central point of failure and all that jazz.

Suggestions on a free speech alternative to Twitter?

by Patrick Durusau at July 21, 2016 07:36 PM

Troubling State of Security Cameras? Cybersecurity Spam

The Troubling State of Security Cameras; Thousands of Devices Vulnerable by Ali Raza.

From the post:

The recent Lizard Squad hack which resulted in a lot of CCTV cameras targeted and hijacked by a DDOS attack has highlighted the need for better security cameras. A study conducted by Protection1 shows how many security agencies do not take things seriously, Protection1 report.

The Lizard Squad hack is not the first instance of security cameras being overridden and used to spy on people. The widespread hack has brought to light once again just how many security cameras are under operation without any sort of protection, making them sitting ducks for any hacker with moderate skills. The CCTV cameras in the US that were attacked by the Lizard Squad hack were used in a wide range of areas from home security and traffic cams to cameras in banks and restaurants.

The ease of carrying out this attack prompted security company Protection1 to investigate the matter. The rising levels of sophistication of hacking tools and the incompetence of security personnel to keep in touch with hackers have made hunting much simpler for hackers. In a bid to understand just how serious the situation is, Protection1 analyzed 6,000 unsecured or open cameras all over the United States of America to find out which companies do not take your security seriously. They pulled data from the cameras using and mapped and analyzed the locations to generate results.

Ali re-uses all the graphics from the Protection1 report, which is itself written in a very summary fashion. There is no in-depth coverage of the cameras or the techniques used to access them.

Be aware that Protection1 is a home/business security monitoring type company and not likely to interest cybersecurity fans.

As far as the “troubling state of security cameras,” that depends upon who you ask.

If you are selling security solutions, it is click-bait for customers who want to be more secure.

If you are selling surveillance, access and data collection services, such cameras are additional data sources.

by Patrick Durusau at July 21, 2016 03:49 PM

The History of Cartography

The History of Cartography

From the webpage:

The first volume of the History of Cartography was published in 1987 and the three books that constitute Volume Two appeared over the following eleven years. In 1987 the worldwide web did not exist, and since 1998 book publishing has gone through a revolution in the production and dissemination of work. Although the large format and high quality image reproduction of the printed books (see right column) are still well-suited to the requirements for the publishing of maps, the online availability of material is a boon to scholars and map enthusiasts.

On this site the University of Chicago Press is pleased to present the first three volumes of the History of Cartography in PDF format. Navigate to the PDFs from the left column. Each chapter of each book is a single PDF. The search box on the left allows searching across the content of all the PDFs that make up the first six books.

Links to the parts, which are then divided into separate PDF files of each chapter:

Volume One: Cartography in Prehistoric, Ancient, and Medieval Europe and the Mediterranean

Volume Two: Book 1: Cartography in the Traditional Islamic and South Asian Societies

Volume Two: Book 2: Cartography in the Traditional East and Southeast Asian Societies

Volume Two: Book 3: Cartography in the Traditional African, American, Arctic, Australian, and Pacific Societies

Volume Three: Cartography in the European Renaissance, Part 1

Volume Three: Cartography in the European Renaissance, Part 2

Unless you want to index the parts for yourself, remember the search box at this site that searches across all six books.

This can be a real time sink, deeply educational but a time sink nonetheless.

by Patrick Durusau at July 21, 2016 01:02 AM

What’s the “CFR” and Why Is It So Important to Me?

What’s the “CFR” and Why Is It So Important to Me? Government Printing Office (GPO) blog, GovernmentBookTalk.

From the post:

If you’re a GPO Online Bookstore regular or public official you probably know we’re speaking about the “Code of Federal Regulations.” CFRs are produced routinely by all federal departments and agencies to inform the public and government officials of regulatory changes and updates for literally every subject that the federal government has jurisdiction to manage.

For the general public these constantly updated federal regulations can spell fantastic opportunity. Farmer, lawyer, construction owner, environmentalist, it makes no difference. Within the 50 codes are a wide variety of regulations that impact citizens from all walks of life. Federal Rules, Regulations, Processes, or Procedures on the surface can appear daunting, confusing, and even may seem to impede progress. In fact, the opposite is true. By codifying critical steps to anyone who operates within the framework of any of these sectors, the CFR focused on a particular issue can clarify what’s legal, how to move forward, and how to ultimately successfully translate one’s projects or ideas into reality.

Without CFR documentation the path could be strewn with uncertainty, unknown liabilities, and lost opportunities, especially regarding federal development programs, simply because an interested party wouldn’t know where or how to find what’s available within their area of interest.

The authors of CFRs are immersed in the technical and substantive issues associated within their areas of expertise. For a private sector employer or entrepreneur who becomes familiar with the content of CFRs relative to their field of work, it’s like having an expert staff on board.

I like the CFRs but I stumbled on:

For a private sector employer or entrepreneur who becomes familiar with the content of CFRs relative to their field of work, it’s like having an expert staff on board.

I don’t doubt the expertise of the CFR authors, but their writing often requires an expert for accurate interpretation. If you doubt that statement, test your reading skills on any section of CFR Title 26, Internal Revenue.

Try your favorite NLP parser out on any of the CFRs.

The post lists a number of ways to acquire the CFRs but personally I would use the free Electronic Code of Federal Regulations unless you need to impress clients with the paper version.


by Patrick Durusau at July 21, 2016 12:40 AM

July 20, 2016

Patrick Durusau

Online Sources of Fake News

Not a guide to particular sources, although examples are mentioned, Alastair Reid sets out categories of fake news sources in The 5 sources of fake news everyone needs to look out for online.

From the post:

No, soldiers aren’t being kicked off an army base to make way for Syrian refugees. Sorry, but Ted Cruz didn’t have a Twitter meltdown and blame God for his failed presidential campaign. And that viral video of a woman being chased down a mountainside with a bear is almost definitely fake.

The internet has a fake news problem and some lies can be dangerous. A fantastic story might be entertaining, but misinformation can fundamentally change how people view the world and their fellow citizens, influencing opinions, behaviour and votes.

This isn’t really news – lies have always been part of the fabric of society, whether spoken or written – but the internet has given anyone a platform to share false information and the tools to make untruths ever harder to detect.

Understanding the origins of fake news is part of the process. So where does it come from?

I’m disappointed people are spreading the truth about Ted Cruz not blaming God for his failed campaign. Anything, lie, fact, rumor, etc., that blackens his reputation cannot be a bad thing in my view.

Let obscure history dissertations separate fact from fiction about Ted Cruz several centuries from now. Once we are certain the stake they should drive through his heart upon burial isn’t going to work loose. The important goal now is to limit his ability to harm the public.

And so it is with all “fake” news: there is some goal to be furthered by spreading it.

“Official sources of propaganda” are the first group that Alastair mentions and somewhat typically the focus is on non-Western governments, although Western propaganda gets a nod in the last paragraph of that section.

My approach to Western (and other) government reports, statements by government actors or people who want to be government actors is as follows:

  1. They are lying.
  2. Who benefits from this lie? (Contributors, Contractors, Cronies)
  3. Who is disadvantaged by this lie? (Agency infighting, career competitors)
  4. Why lie about this now? (Relationship to other events and actors)
  5. Is this lie consistent/inconsistent with other lies?

What other purpose would statements and reports from the government have if they weren’t intended to influence you?

Do you really think any government wants you to be an independent, well-informed participant in public decision making processes? No wonder you believe fake news so often.

Don’t you find it odd that Western reports of Islamic State bombings are always referred to as “terrorist” events and yet when Allied forces kill another 56 civilians, nary a peep of the moniker “terrorist?”

Alastair’s post is a great read and help towards avoiding some forms of fake news.

There are other sources, such as the reflex to parrot Western government views on events, that are more difficult to avoid.

PS: I characterize bombing of civilians as an act of terrorism. Whether the bombing is with a suicide-vest or jet aircraft, the intent is to kill, maim, in short, to terrorize those in the area.

by Patrick Durusau at July 20, 2016 03:34 PM

Is Your IP Address Leaking? – Word for the Day: Trust – Synonym for pwned.

How to See If Your VPN Is Leaking Your IP Address (and How to Stop It) by Alan Henry.

From the post:

To see if your VPN is affected:

  1. Visit a site like What Is My IP Address and jot down your actual ISP-provided IP address.
  2. Log in to your VPN, choose an exit server in another country (or use whichever exit server you prefer) and verify you’re connected.
  3. Go back to What Is My IP Address and check your IP address again. You should see a new address, one that corresponds with your VPN and the country you selected.
  4. Visit Roseler’s WebRTC test page and note the IP address displayed on the page.

If both tools show your VPN’s IP address, then you’re in the clear. However, if What Is My IP Address shows your VPN and the WebRTC test shows your normal IP address, then your browser is leaking your ISP-provided address to the world.
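If you run this check regularly, the comparison in the last paragraph is easy to script once you have written down the three addresses. A minimal sketch in Python follows. The function name and the result labels are mine, not Alan’s, and the addresses are documentation-range placeholders. Gathering the addresses is still done by hand with the sites named above.

```python
import ipaddress


def vpn_leak_status(isp_ip, site_ip, webrtc_ip):
    """Classify a VPN check from three addresses you have written down.

    isp_ip    -- address seen before connecting to the VPN (step 1)
    site_ip   -- address the IP lookup site shows while on the VPN (step 3)
    webrtc_ip -- address the WebRTC test page shows while on the VPN (step 4)
    """
    isp = ipaddress.ip_address(isp_ip)
    site = ipaddress.ip_address(site_ip)
    webrtc = ipaddress.ip_address(webrtc_ip)
    if site == isp:
        return "vpn not active"   # step 3 should have shown a new address
    if webrtc == isp:
        return "leaking"          # the browser exposes the ISP-provided address
    if webrtc == site:
        return "clear"            # both tools show the VPN address
    return "inconclusive"         # a third address appeared; look more closely


print(vpn_leak_status("203.0.113.5", "198.51.100.7", "203.0.113.5"))
```

Parsing with ipaddress rather than comparing raw strings means a typo in a hand-copied address fails loudly instead of silently reporting a mismatch.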

Attempting to conceal your IP address while at the same time leaking it (one assumes unknowingly) can lead to a false sense of security.

Follow the steps Alan outlines to test your setup.

BTW, Alan’s post includes suggestions for how to fix the leak.

If you blindly trust concealment measures and software, you may as well activate links in emails from your local bank.

Word for the Day: Trust – Synonym for pwned.

Verify your concealment on a regular basis.

by Patrick Durusau at July 20, 2016 02:37 PM

July 19, 2016

Patrick Durusau

Proofing Images Tool – GAIA

As I was writing on Alex Duner’s JuxtaposeJS, which creates a slider over two images of the same scene (think before/after), I thought of another tool for comparing photos, a blink comparator.

Blink comparators were invented to make searching photographs of the sky, taken on different nights, for novas, variable stars or planets/asteroids, more efficient. The comparator would show first one image and then the other, rapidly, and any change in the image would stand out to the user. Asteroids would appear to “jump” from one location to another. Variable stars would shrink and swell. Novas would blink in and out.
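In software, the closest simple stand-in for blinking is image differencing: subtract one frame from the other and report where they disagree. The Python sketch below does that for two tiny grayscale frames held as lists of lists. It is an illustration of the idea, not how any particular astronomy package implements its blink feature. The frames and the threshold are invented.

```python
def changed_pixels(first, second, threshold=10):
    """Return (row, col) positions where two grayscale frames differ.

    Frames are lists of rows of integer pixel values and must have the
    same dimensions. threshold is the minimum absolute difference that
    counts as a change.
    """
    same_shape = len(first) == len(second) and all(
        len(a) == len(b) for a, b in zip(first, second)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(first, second))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) > threshold
    ]


# A bright "asteroid" that jumps from the center to the lower right.
night_one = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]
night_two = [[0, 0, 0], [0, 0, 0], [0, 0, 200]]
print(changed_pixels(night_one, night_two))
```

Both the old and the new position show up in the result, which is the numeric counterpart of the object appearing to jump when the frames alternate.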

Originally complex mechanical devices using glass plates, blink comparators are now found in astronomical image processing software, such as:
GAIA – Graphical Astronomy and Image Analysis Tool.

From the webpage:

GAIA is an highly interactive image display tool but with the additional capability of being extendable to integrate other programs and to manipulate and display data-cubes. At present image analysis extensions are provided that cover the astronomically interesting areas of aperture & optimal photometry, automatic source detection, surface photometry, contouring, arbitrary region analysis, celestial coordinate readout, calibration and modification, grid overlays, blink comparison, image defect patching, polarization vector plotting and the ability to connect to resources available in Virtual Observatory catalogues and image archives, as well as the older Skycat formats.

GAIA also features tools for interactively displaying image planes from data-cubes and plotting spectra extracted from the third dimension. It can also display 3D visualisations of data-cubes using iso-surfaces and volume rendering.

Its capabilities include:

  • Image Display Capabilities
    • Display of images in FITS and Starlink NDF formats.
    • Panning, zooming, data range and colour table changes.
    • Continuous display of the cursor position and image data value.
    • Display of many images.
    • Annotation, using text and line graphics (boxes, circles, polygons, lines with arrowheads, ellipses…).
    • Printing.
    • Real time pixel value table.
    • Display of image planes from data cubes.
    • Display of point and region spectra extracted from cubes.
    • Display of images and catalogues from SAMP-aware applications.
    • Selection of 2D or 3D regions using an integer mask.
  • Image Analysis Capabilities
    • Aperture photometry.
    • Optimal photometry.
    • Automated object detection.
    • Extended surface photometry.
    • Image patching.
    • Arbitrary shaped region analysis.
    • Contouring.
    • Polarization vector plotting and manipulation.
    • Blink comparison of displayed images.
    • Interactive position marking.
    • Celestial co-ordinates readout.
    • Astrometric calibration.
    • Astrometric grid overlay.
    • Celestial co-ordinate system selection.
    • Sky co-ordinate offsets.
    • Real time profiling.
    • Object parameterization.
  • Catalogue Capabilities
    • VO capabilities
      • Cone search queries
      • Simple image access queries
    • Skycat capabilities
      • Plot positions in your field from a range of on-line catalogues (various, including HST guide stars).
      • Query databases about objects in field (NED and SIMBAD).
      • Display images of any region of sky (Digital Sky Survey).
      • Query archives of any observations available for a region of sky (HST, NTT and CFHT).
      • Display positions from local catalogues (allows selection and fine control over appearance of positions).
  • 3D Cube Handling
    • Display of image slices from NDF and FITS cubes.
    • Continuous extraction and display of spectra.
    • Collapsing, animation, detrending, filtering.
    • 3D visualisation with iso-surfaces and volume rendering.
    • Celestial, spectral and time coordinate handling.
  • CUPID catalogues and masks
    • Display catalogues in 2 or 3D
    • Display selected regions of masks in 2 or 3D

(highlighting added)

With a blink comparator, when offered an image you can quickly “proof” it against an earlier image of the same scene, looking for any enhancements or changes.

Moreover, if you have drone-based photo-reconnaissance images, a tool like GAIA will give you the capability to quickly compare them to other images.

I am hopeful you will also use this as an opportunity to explore the processing of astronomical images, which is an innocent enough explanation for powerful image processing software on your computer.

by Patrick Durusau at July 19, 2016 09:09 PM


JuxtaposeJS Frame comparisons. Easy to make. Seamless to publish. (Northwestern University Knight Lab, Alex Duner.)

From the webpage:

JuxtaposeJS helps storytellers compare two pieces of similar media, including photos, and GIFs. It’s ideal for highlighting then/now stories that explain slow changes over time (growth of a city skyline, regrowth of a forest, etc.) or before/after stories that show the impact of single dramatic events (natural disasters, protests, wars, etc.).

It is free, easy to use, and works on all devices. All you need to get started are links to the images you’d like to compare.

Perhaps an unexpected use, but if you are stumped on a “find all the differences” pair of photos, split them and create a slider!

This isn’t a hard one, but as an example, use these two images:

As the slider moves over a change between the two images, your eye will be drawn towards the motion. (Visit Cranium Crunches Blog for more puzzles and images like this one.)

On a more serious note, imagine the use of this app for comparison of aerial imagery (satellite, plane, drone) and using the human eye to spot changes in images. Could be more timely than streaming video for automated analysis.

Or put differently, it isn’t the person with the most intel, eventually, that wins, but the person with the best intel, in time.

by Patrick Durusau at July 19, 2016 06:17 PM

Colorblind-Friendly Graphics

Three tools to help you make colorblind-friendly graphics by Alex Duner.

From the post:

I am one of the 8% of men of Northern European descent who suffers from red-green colorblindness. Specifically, I have a mild case of protanopia (also called protanomaly), which means that my eyes lack a sufficient number of retinal cones to accurately see red wavelengths. To me some purples appear closer to blue; some oranges and light greens appear closer to yellow; dark greens and brown are sometimes indistinguishable.

Most of the time this has little impact on my day-to-day life, but as a news consumer and designer I often find myself struggling to read certain visualizations because my eyes just can’t distinguish the color scheme. (If you’re not colorblind and are interested in experiencing it, check out Dan Kaminsky’s iPhone app DanKam which uses augmented reality to let you experience the world through different color visions.)

As information architects, data visualizers and web designers, we need to make our work accessible to as many people as possible, which includes people with colorblindness.

Alex is writing from a journalism perspective but accessibility is a concern for any information delivery system.

A pair of rather remarkable tools will be useful in vetting your graphics: Vischeck, which simulates colorblindness on your images, and Daltonize, which “corrects” images for colorblind users. Both are available at: Plugins for Photoshop (Win/Mac/ImageJ).

Loren Petrich has a collection of resources, including filters for GIMP to simulate colorblindness at: Color-Blindness Simulators.

by Patrick Durusau at July 19, 2016 02:45 PM

1960’s Flashback: Important Tor Nodes Shutting Down

Swati Khandelwal reports the departure of Lucky Green from the Tor project will result in the loss of several critical Tor nodes and require an update to Tor code. (Core Tor Contributor Leaves Project; Shutting Down Important Tor Nodes)

Here’s the Tonga (Bridge Authority) Permanent Shutdown Notice in full:

Dear friends,

Given recent events, it is no longer appropriate for me to materially contribute to the Tor Project either financially, as I have so generously throughout the years, nor by providing computing resources. This decision does not come lightly; I probably ran one of the first five nodes in the system and my involvement with Tor predates it being called “Tor” by many years.

Nonetheless, I feel that I have no reasonable choice left within the bounds of ethics, but to announce the discontinuation of all Tor-related services hosted on every system under my control.

Most notably, this includes the Tor node “Tonga”, the “Bridge Authority”, which I recognize is rather pivotal to the network

Tonga will be permanently shut down and all associated crytographic keys destroyed on 2016-08-31. This should give the Tor developers ample time to stand up a substitute. I will terminate the chron job we set up so many years ago at that time that copies over the descriptors.

In addition to Tonga, I will shut down a number of fast Tor relays, but the directory authorities should detect that shutdown quickly and no separate notice is needed here.

I wish the Tor Project nothing but the best moving forward through those difficult times,


As I mentioned in Going Dark With Whisper? Allies versus Soul-Mates, it is having requirements other than the success of a project that is so damaging to such efforts.

I could discover that IS is using the CIA to funnel money from the sales of drugs and conflict diamonds to fund the Tor project and it would not make any difference to me. Even if core members of the Tor project knew that and took steps to conceal it.

Whether intended or not, the only people who will benefit from Lucky’s decision will be opponents of personal privacy and the only losers will be people who need personal privacy.

Congratulations Lucky! You are duplicating a pattern of behavior that destroyed the Black Panthers, the SDS and a host of other groups and movements before and since then.

Let’s hope others don’t imitate Lucky’s “I’ll take my ball and go home” behavior.

by Patrick Durusau at July 19, 2016 01:59 PM

HyperTerm (Not Windows HyperTerm)


Tersely described by Nat Torkington as:

– an open source in-browser terminal emulator.

That’s fair, but the project goals read:

The goal of the project is to create a beautiful and extensible experience for command-line interface users, built on open web standards.

In the beginning, our focus will be primarily around speed, stability and the development of the correct API for extension authors.

In the future, we anticipate the community will come up with innovative additions to enhance what could be the simplest, most powerful and well-tested interface for productivity.

JS/HTML/CSS Terminal. Visit HyperTerm for a rocking demo!

Scroll down after the demo to see more.

Looking forward to a Linux package being released!

by Patrick Durusau at July 19, 2016 02:29 AM

ApacheCon – Seville, Spain – Week of November 14th, 2016

You have relied on Apache software, read its documentation, contributed (flamed?) on its lists. Attend ApacheCon and meet other members of the Apache community, in full bandwidth, real time.

The call for papers (CFP) for this event is now open, and will remain open until September 9th.

The event is divided into two parts, each with its own CFP. The first part of the event, called Apache Big Data, focuses on Big Data projects and related technologies.



The second part, called ApacheCon Europe, focuses on the Apache Software Foundation as a whole, covering all projects, community issues, governance, and so on.



ApacheCon is the official conference of the Apache Software Foundation, and is the best place to meet members of your project and other ASF projects, and strengthen your project’s community.

If your organization is interested in sponsoring ApacheCon, contact Rich Bowen. ApacheCon is a great place to find the brightest developers in the world, and experts on a huge range of technologies.

I lifted this text from an email by


by Patrick Durusau at July 19, 2016 01:48 AM

July 18, 2016

Patrick Durusau

Going Dark With Whisper? Allies versus Soul-Mates

After posting Safe Sex and Safe Chat, I asked a close friend if they used Signal from Open Whisper Systems, thinking it would be good to practice before security is an absolute requirement.

In response I was sent a link to: Internet privacy, funded by spooks: A brief history of the BBG by Yasha Levine.

I take that to mean they aren’t using Whisper. ;-)

Levine’s factual points about U.S. government funding of Tor, Whisper, etc., accord with my general impression of that history, but I do disagree with his concluding paragraph:

You’d think that anti-surveillance activists like Chris Soghoian, Jacob Appelbaum, Cory Doctorow and Jillian York would be staunchly against outfits like BBG and Radio Free Asia, and the role they have played — and continue to play — in working with defense and corporate interests to project and impose U.S. power abroad. Instead, these radical activists have knowingly joined the club, and in doing so, have become willing pitchmen for a wing of the very same U.S. National Security State they so adamantly oppose.

So long as privacy projects release open source code, I don’t see any source of funding as problematic. Drug cartels would have to launder their money first but even rumored drug money spends just like any other. Terrorists should step up just to bother and confound the FBI, which sees informational darkness around every corner.

So long as the funding is toward the same goal, security in communication and all the work product is open source, then I see no natural limits on who can be allies of these projects.

I say allies because I mean just that, allies. Who may have their own reasons, some fair and some foul, for their participation and funding. So long as we are advancing towards a common goal, that in other arenas we have conflicts, is irrelevant.

One of the primary reasons why so many groups in the 1960’s failed is because everyone had to agree to be soul-mates on every issue. If you want a potpourri of splinter groups who spend more time fighting among themselves than with others, take that tack.

If, on the other hand, you want funded, effective research that may make a real difference to you and your allies, be more focused on the task at hand and less on the intrinsic goodness (or lack thereof) of your allies.

by Patrick Durusau at July 18, 2016 07:35 PM

July 17, 2016

Patrick Durusau

RNC 2016 – Cleveland, OH (aka, “The Mistake on The Lake”)

“The Mistake on The Lake” as a nickname for Cleveland, Ohio was new to me. I remember news of the Burning River rather clearly. Polluting a river until it can burn takes effort. An impressive amount of effort.

“The Mistake on The Lake” is also a fitting nickname for the RNC convention this week in Cleveland. Some mapping resources to help as stories develop:

RNC Homepage with schedule: Despite reports to the contrary, I don’t see Lucifer on the speaking schedule. Perhaps a late addition?

Google Maps, centered on the Quicken Loans Arena: easily switching between views, although the images are static. I assume you will update those with drone/helicopter imagery. Either your own or pirated off of others.

MapQuest: To give you a non-Google alternative.

Cuyahoga County Geographical Information Systems: Yeah, I could not have called the name of the county for Cleveland either. Lots of downloadable GIS data, including ownership, Lidar, contours (think noxious substances running away from you), etc. Plus they host interactive software if you don’t have your own GIS software.

Don’t forget geo-located tweets as an information source for real time updates on locations and events.


by Patrick Durusau at July 17, 2016 09:50 PM

July 16, 2016

Patrick Durusau

Safe Sex and Safe Chat

Matthew Haeck repeats the old dodge for not bothering with encrypted communications:

If I’m doing nothing wrong, it doesn’t matter

in Secure Messaging Apps for Encrypted Chat.

Most of us, outside of subscribers to the Linux Journal, never imagine that we are under surveillance by government agencies. And we may not be.

But, that doesn’t mean our friends and acquaintances aren’t under surveillance by domestic and foreign governments, corporations and others.

You should think of encrypted communications, chat in this case, just like you do safe sex.

It not only protects yourself, but your present partner and all future partners the both of you may have.

The same is true for use of encrypted chat. The immediate benefit is for you and your partner, but secure chat denies the government and others the use of your chats against unknown future chat partners.

If you practice safe sex, practice safe chat.

Secure Messaging Apps for Encrypted Chat is a great start towards practicing safe chat.

by Patrick Durusau at July 16, 2016 09:58 PM

BaseX 8.5.1 Released! (XQuery Texts for Smart Phone?)

BaseX – 8.5.1 Released!

From the documentation page:

BaseX is both a light-weight, high-performance and scalable XML Database and an XQuery 3.1 Processor with full support for the W3C Update and Full Text extensions. It focuses on storing, querying, and visualizing large XML and JSON documents and collections. A visual frontend allows users to interactively explore data and evaluate XQuery expressions in realtime. BaseX is platform-independent and distributed under the free BSD License (find more in Wikipedia).

Besides Priscilla Walmsley’s XQuery 2nd Edition and the BaseX documentation as a PDF file, what other XQuery resources would you store on a smart phone? (For occasional reference, leisure reading, etc.)

by Patrick Durusau at July 16, 2016 08:44 PM

July 15, 2016

Patrick Durusau

Google = No Due Process

Not new but noteworthy headline about Google: Google deletes artist’s blog and a decade of his work along with it by Ethan Chiel.

From the post:

Artist Dennis Cooper has a big problem on his hands: Most of his artwork from the past 14 years just disappeared.

It’s gone because it was kept entirely on his blog, which the experimental author and artist has maintained on the Google-owned platform Blogger since 2002 (Google bought the service in 2003). At the end of June, Cooper says he discovered he could no longer access his Blogger account and that his blog had been taken offline.

As you know without even reading Ethan’s post, Google has not been responsive to Dennis Cooper or others inquiring on his behalf.

Cooper failed to keep personal backups of his work, but when your files are stored with Google, what’s the point? Doesn’t Google keep backups? Of course they do, but that doesn’t help Cooper in this case.

The important lesson here is that as a private corporation, Google isn’t obligated to give any user notice or an opportunity to be heard before their content is blocked. Or in short, no due process.

Instead of pestering Google with new antitrust charges, the EU could require that Google maintain backups of any content it blocks and require it to deliver that content to the person posting it upon request.

Such a law should include all content hosting services and consequently, be a benefit to everyone living in the EU.

Unlike the headline grabbing antitrust charges against Google.

by Patrick Durusau at July 15, 2016 06:25 PM

FBI, Malware, Carte Blanche and Cardinal Richelieu

Graham Cluley has an amusing take on the FBI’s reaction to its Playpen NIT being characterized as “malware” in When is malware not malware? When the FBI says so, of course.

As Graham points out, the FBI has been denied the fruits of its operation of a child porn site (alleged identities of consumers of child porn), but there is a deeper issue here beyond defining malware.

The deeper issue lies in a portion of the FBI brief that Graham quotes in part:

“Malicious” in criminal proceedings and in the legal world has very direct implications, and a reasonable person or society would not interpret the actions taken by a law enforcement officer pursuant to a court order to be malicious.

The FBI brief echoes Cardinal Richelieu in The Three Musketeers:

CARDINAL RICHELIEU. … Document three, the most important of all: A pardon — in case you get caught. It’s called a Carte Blanche. It has the force of law and is unbreakable, even by Royal fiat.

MILADY. (Reading it.) “It is by my order and for the benefit of the State that the bearer of this note has done what he has done.”

The FBI contends a court order, assuming it bothers to obtain one, operates as Carte Blanche and imposes no limits on FBI conduct.

Moreover, once a court order is obtained, reports by the FBI of guilt are sufficient for conviction. How the FBI obtained alleged evidence isn’t open to inspection.

Judges should disabuse the FBI of its delusions concerning the nature of court orders and remind it of its proper role in the criminal justice system. The courts, so far as I am aware, remain the arbiters of guilt and innocence, not the FBI.

by Patrick Durusau at July 15, 2016 03:26 PM

Neil deGrasse Tyson and the Religion of Science

The next time you see Neil deGrasse Tyson chanting “holy, holy, holy” at the altar of science, re-read The 7 biggest problems facing science, according to 270 scientists by Julia Belluz, Brad Plumer, and Brian Resnick.

From the post:

The scientific process, in its ideal form, is elegant: Ask a question, set up an objective test, and get an answer. Repeat. Science is rarely practiced to that ideal. But Copernicus believed in that ideal. So did the rocket scientists behind the moon landing.

But nowadays, our respondents told us, the process is riddled with conflict. Scientists say they’re forced to prioritize self-preservation over pursuing the best questions and uncovering meaningful truths.

Ah, a quick correction to: “So did the rocket scientists behind the moon landing.”


The post Did Politics Fuel the Space Race? points to a White House transcript that reveals politics drove the race to the moon:

James Webb – NASA Administrator, President Kennedy.

James Webb: All right, then let me say this: if I go out and say that this is the number-one priority and that everything else must give way to it, I’m going to lose an important element of support for your program and for your administration.

President Kennedy [interrupting]: By who? Who? What people? Who?

James Webb: By a large number of people.

President Kennedy: Who? Who?

James Webb: Well, particularly the brainy people in industry and in the universities who are looking at a solid base.

President Kennedy: But they’re not going to pay the kind of money to get that position that we are [who we are] spending it. I say the only reason you can justify spending this tremendous…why spend five or six billion dollars a year when all these other programs are starving to death?

James Webb: Because in Berlin you spent six billion a year adding to your military budget because the Russians acted the way they did. And I have some feeling that you might not have been as successful on Cuba if we hadn’t flown John Glenn and demonstrated we had a real overall technical capability here.

President Kennedy: We agree. That’s why we wanna put this program…. That’s the dramatic evidence that we’re preeminent in space.

The rocket to the moon wasn’t about science, it was about “…dramatic evidence that we’re preeminent in space.”

If you need a not so recent example, consider the competition between Edison and Westinghouse in what Wikipedia titles: War of Currents.

Science has always been a mixture of personal ambition, politics, funding, etc.

That’s not to take anything away from science but a caution to remember it is and always has been a human enterprise.

Tyson’s claims for science should be questioned and judged like all other claims.

by Patrick Durusau at July 15, 2016 12:53 AM

July 14, 2016

Patrick Durusau

Building A National FOIA Rejection Database (MuckRock)

MuckRock is launching a national database of FOIA exemptions by Joseph Licterman.

From the post:

In the 2015 fiscal year, the U.S. federal government processed 769,903 Freedom of Information requests. The government fully fulfilled only 22.6 percent of those requests; 44.9 percent of federal FOIA requests were either partially or fully denied. Even though the government denied at least part of more than 345,000 requests, it only received 14,639 administrative appeals.

In an attempt to make the FOIA appeals process easier and help reporters and others understand how and why their requests are being denied, MuckRock is on Thursday launching a project to catalog and explain the exceptions both the federal and state governments are using to deny requests.

MuckRock is a nonprofit site that helps its users file FOIA requests, and cofounder Michael Morisy said that the site is planning to create a “Google for FOIA rejections” which will help users understand why their requests were denied and learn what they can do to appeal the case.
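The arithmetic behind those figures is worth a quick check. The sketch below uses only the numbers quoted in the post; the variable names are mine, and the appeal rate is a derived figure that the post does not state.

```python
# Figures quoted in the post for fiscal year 2015.
requests_processed = 769_903
denied_share = 0.449            # partially or fully denied
administrative_appeals = 14_639

# Requests denied at least in part, and the share of those that were appealed.
denied_requests = requests_processed * denied_share
appeal_rate = administrative_appeals / denied_requests

print(f"Denied in part or in full: about {denied_requests:,.0f}")
print(f"Appeals per denial: {appeal_rate:.1%}")
```

That is consistent with the “more than 345,000” in the post, and it suggests that only a little over 4 percent of denials drew an administrative appeal.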

If your FOIA request is rejected, who knows about it? You and maybe a few colleagues?

If you contribute your rejected FOIA requests to this MuckRock project, your rejected requests will join thousands of others to create a database on which the government can be held accountable for its FOIA behavior.

Don’t let your rejected FOIA requests languish in filing cabinets and boxes, contribute them along with support to MuckRock!

The government isn’t the only party that can take names and keep records.

by Patrick Durusau at July 14, 2016 03:36 PM

Securing Your Cellphone For A Protest

The instructions on preparing for a demonstration in Steal This Book read in part:

Ideally you should visit the proposed site of the demonstration before it actually takes place. This way you’ll have an idea of the terrain and the type of containment the police will be using. Someone in your group should mimeograph a map of the immediate vicinity which each person should carry. Alternative actions and a rendezvous point should be worked out. Everyone should have two numbers written on their arm, a coordination center number and the number of a local lawyer or legal defense committee. You should not take your personal phone books to demonstrations. If you get busted, pigs can get mighty Nosy when it comes to phone books. Any sharp objects can be construed as weapons. Women should not wear earrings or other jewelry and should tie their hair up to tuck it under a helmet. Wear a belt that you can use as a tourniquet. False teeth and contact lenses should be left at home if possible. You can choke on false teeth if you receive a sharp blow while running. Contact lenses can complicate eye damage if gas or Mace is used.

How would you update this paragraph for the age of smart phones?

ACLU counsels protesters to secure their phones (read personal phone books) in The Two Most Important Things Protesters Can Do To Secure Their Phones.

You can do better than that, as Hoffman advises, leave your personal phone books (read smart phones) at home!

Your “whole life is on your phone.” Yes, I know. All the more reason to leave it out of the clutches of anyone interested in your “whole life.”

Buy clean burner phones in bulk.

Preset bookmarks for the protest area on Google maps, along with landmarks, rendezvous points, fall back positions, etc.

For texting during protests, create burner identities drawn from a list of characters in police shows, out of a hat. No changing, no choices. The same person should never re-use a burner identity. Patterns matter. (See the ACLU post for suggestions on secure messaging apps.)

Continue to write two phone numbers on your arm: coordination center and a local lawyer or legal defense committee.

Two reasons for these numbers on your arm: First, you may not have your cell phone when allowed to make a call from jail. Second, you should never have the number of another activist on your person.

Nothing takes the place of a site visit but technology has changed since Hoffman’s time.

High quality maps, photos, topographical (think elevation (high ground), drainage (as in running away from you)) features, not to mention reports of prior protests and police responses are available.

If my security suggestions sound extreme, recall that not all protests occur in the United States and even of those that do, not all are the “line up to be arrested” sort of events. Or are conducted in “free speech allotments,” like the upcoming Democratic and Republican political conventions this summer.

by Patrick Durusau at July 14, 2016 02:29 PM

July 13, 2016

Patrick Durusau

How-To Safely Protest on the Downtown Connector – #BLM

Atlanta doesn’t have a spotless record on civil rights but Mayor Kasim Reed agreeing to meet with #BLM leaders on July 18, 2016, is a welcome contrast to the response in the police state of Baton Rouge, for example.

During this “cooling off” period, I want to address Mayor Reed’s concern for the safety of #BLM protesters and motorists should #BLM protests move onto the Downtown Connector.

Being able to protest on the Downtown Connector would be far more effective than blocking random Atlanta surface streets, by day or night. Mayor Reed’s question is how to do so safely?

Here is Google Maps’ representation of a part of the Downtown Connector:


That view isn’t helpful on the issue of safety but consider a smaller portion of the Downtown Connector as seen by Google Earth:


The safety question has two parts: How to transport #BLM protesters to a protest site on the Downtown Connector? How to create a safe protest site on the Downtown Connector?

A nearly constant element of the civil rights movement provides the answer: buses. From the Montgomery Bus Boycott, Freedom Riders, to the long experiment with busing to achieve desegregation in education.

Looking at an enlargement of an image of the Downtown Connector, you will see that ten (10) buses would fill all the lanes, plus the emergency lane and the shoulder, preventing any traffic from going around the buses. That provides safety for protesters. Not to mention transporting all the protesters safely to the protest site.

The Downtown Connector is often described as a “parking lot” so drivers are accustomed to traffic slowing to a full stop. If a group of buses formed a line across all lanes of the Downtown Connector and slowed to a stop, traffic would be safely stopped. That provides safety for drivers.

The safety of both protesters and drivers depends upon coordination between cars and buses to fill all the lanes of the Downtown Connector and then slowing down in unison, plus buses occupying the emergency lane and shoulder. Anything less than full interdiction of the highway would put both protesters and drivers at risk.

Churches and church buses have often played pivotal roles in the civil rights movement so the means for creating safe protest spaces, even on the Downtown Connector, are not out of reach.

There are other logistical and legal issues involved in such a protest but I have limited myself to offering a solution to Mayor Reed’s safety question.

PS: The same observations apply to any limited access motorway, modulo adaptation to your local circumstances.

by Patrick Durusau at July 13, 2016 08:42 PM

July 12, 2016

Patrick Durusau

New Linux Journal Subscription Benefit!

Benefits of a Linux Journal subscription you already know:

  1. “Linux Journal, currently celebrating its 20th year of publication, is the original magazine of the global Linux community, delivering readers the advice and inspiration they need to get the most out of their Linux systems.”
  2. $29.50 (US) buys 12 issues and access to the Linux Journal archive.
  3. Linux Journal has regular columns written by Mick Bauer, Reuven Lerner, Dave Taylor, Kyle Rankin, Bill Childers, John Knight, James Gray, Zack Brown, Shawn Powers and Doc Searls.
  4. For more see the Linux Journal FAQ.

Now there is a new Linux Journal subscription benefit:

You are flagged as an extremist by the NSA

NSA Labels Linux Journal Readers and TOR and TAILS Users as Extremists by Dave Palmer.

End the constant worry, nagging anxiety, endless arguments with friends about who is being tracked by the NSA! For the small sum of $29.50 (US) you can buy your way into the surveillance list at the NSA.

I can’t think of a cheaper way to get on a watch list, unless you send threatening letters to the U.S. President, which is a crime, so don’t do it.

Step up and assume the mantle of “extremist” in the eyes of the NSA.

You would be hard pressed to find better company.

PS: Being noticed may not seem like a good idea. But the bigger the NSA haystack, the safer all needles will be.

by Patrick Durusau at July 12, 2016 06:58 PM

July 10, 2016

Patrick Durusau

FYI: Glossary Issues with the Chilcot Report

Anyone who is working on more accessible/useful versions of the Chilcot Report should be aware of the following issues with Annex 2 – Glossary.

First, the “glossary” appears to be a mix of acronyms, with their expansions, along with random terms or phrases for which definitions are offered. For example, “FFCD – Full, Final and Complete declaration,” immediately followed by “Five Mile Market – Area in Basra.” (at page 247)

Second, the concept of unique acronyms never occurred to the authors:

AG Adjutant General
AG Advocate General
AG Attorney General
(page 235)

AM Aftermath
AM Air Marshal
(page 236)

BCU Basic Capability Unit
BCU Basra Crimes Unit
(page 238)

BOC Basra Operational Command
BOC Basra Operations Centre
(page 238)

CG Commander General
CG Consul General
CG Consulate General (see BEO)
(page 240)

CIC Coalition Information Centre
CIC Communication and Information Centre
(page 240)

CO Cabinet Office
CO Commanding Officer
(page 241)

DCC Deputy Chief Constable
DCC Dismounted Close Combat
(page 243)

DG Diego Garcia
DG Director General
(page 244)

DIA Defence Intelligence Agency
DIA Department of Internal Affairs
(page 244)

DPA Data Protection Act
DPA Defence Procurement Agency
(page 245)

DSP Defence Strategic Plan
DSP Deployable Spares Pack
(page 245)

EP Equipment Plan
EP Equipment Programme
(page 246)

ESC Emergency Security Committee
ESC Executive Steering Committee
(page 246)

EST Eastern Standard Time
EST Essential Services Team
(page 246)

FP Force Posture
FP Force Protection
(page 247)

IA Interim Administration
IA Iraqi Army
(page 250)

ID Identification
ID (US) Infantry Division
(page 251)

ING Iraqi National Gathering
ING Iraqi National Guard
(page 252)

IO Information Operations
IO International Organisations
(page 252)

ISG Information Strategy Group
ISG Iraq Security Group
ISG Iraq Strategy Group
ISG Iraq Survey Group
(page 253)

MAS Manned Airborne Surveillance
MAS Muqtada al-Sadr
(page 256)

Op Operation
OP Operative Paragraph
(page 260)

OSD US Office of the Secretary of Defense
OSD Out of Service Date
(page 261)

PM Prime Minister
PM Protected Mobility
(page 262)

RA Research Analysts
RA Regular Army
(page 264)

RDD Radiological Dispersal Devices
RDD Required Delivery Date
(page 264)

SAF Small Arms Fire
SAF Stabilisation Aid Fund
(page 265)

SC Security Committee
SC Security Council
(page 265)

SE Scottish Executive
SE South-East
(page 266)

SFA Service Family Accommodation
SFA Strategic Framework Agreement
(page 266)

SG Secretary-General
SG Special Groups
(page 266)

SLA Scottish Lord Advocate
SLA Service Level Agreement
(page 266)

SSE Sensitive Site Exploitation
SSE Spring Supplementary Estimate
(page 267)

UNSC UN Security Council
UNSC UN Special Co-ordinator
(page 270)

Yes, seventy-four (74) items that may be mistaken in any automated processing of the text.
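A list like the one above can be produced mechanically. The following is a minimal sketch, not the method used for this post: it assumes the glossary has already been extracted to one “ACRONYM expansion” line per entry, treats the first whitespace-delimited token as the acronym (which would break on entries such as “H of C”), and folds case so that “Op” and “OP” collide. The sample lines are copied from the list above; the function name is made up.

```python
from collections import defaultdict

# A few glossary lines copied from the list above; a real run would read
# every entry of Annex 2 from a text extraction of the report.
GLOSSARY_LINES = [
    "AG Adjutant General",
    "AG Advocate General",
    "AG Attorney General",
    "AM Aftermath",
    "AM Air Marshal",
    "FFCD Full, Final and Complete declaration",
    "Op Operation",
    "OP Operative Paragraph",
]

def ambiguous_acronyms(lines):
    """Map each case-folded acronym to its expansions, keeping only clashes."""
    expansions = defaultdict(list)
    for line in lines:
        # Assumption: the acronym is everything before the first space.
        acronym, _, expansion = line.partition(" ")
        expansions[acronym.upper()].append(expansion)
    return {a: e for a, e in expansions.items() if len(e) > 1}

clashes = ambiguous_acronyms(GLOSSARY_LINES)
for acronym, meanings in clashes.items():
    print(acronym, "->", "; ".join(meanings))
print(sum(len(m) for m in clashes.values()), "ambiguous entries")
```

Summing the expansions per clashing acronym, rather than counting the acronyms, is what gives an entry count like the seventy-four above.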

Third, there are items in the glossary that don’t appear in the text outside of the glossary:

H of C House of Commons page 249
HoC House of Commons page 250

The House of Commons is never referred to by “H of C” or “HoC” outside of the glossary.

Fourth, there are items in the glossary that are not specialized vocabulary, as though the glossary is also a mini-English dictionary:

de facto In fact
de jure According to law
(page 244)

Fifth, the acronyms are misleading. For example, if you search for “EPW – Enemy Prisoners of War” (is there another kind?), outside of the glossary there is only one (1) “hit:”

the-report-of-the-iraq-inquiry_section-061.pdf.txt:Communication] and handling of EPW [Enemy Prisoners of War]”.

If you search for the other acronym, “PW – Prisoner of War,” outside of the glossary there is only one (1) “hit:”

the-report-of-the-iraq-inquiry_section-064.pdf.txt:A mass PW [prisoner of war] problem and/or a humanitarian crisis could both

With only casual knowledge of the war in Iraq, that doesn’t sound right, does it?

Try searching for “prison.” That will return 185 “hits.”

Interesting isn’t it? The official acronyms (plural) return one “hit” each and a term not in the glossary returns 185 “hits.”

Makes me wonder about the criteria for inclusion in the glossary.
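Counts like these are easy to reproduce once the report is split into per-section text files. The sketch below is one way to do it, under stated assumptions: the two section names and snippets are the “hits” quoted above, the glossary file name is hypothetical, and the whole-word versus substring switch is my guess at the difference between searching for an acronym and searching for “prison.”

```python
import re

GLOSSARY = "annex-2-glossary.txt"  # hypothetical file name

# Stand-ins for the extracted text; a real run would load every section file.
SECTIONS = {
    "the-report-of-the-iraq-inquiry_section-061.pdf.txt":
        "Communication] and handling of EPW [Enemy Prisoners of War]”.",
    "the-report-of-the-iraq-inquiry_section-064.pdf.txt":
        "A mass PW [prisoner of war] problem and/or a humanitarian crisis could both",
    GLOSSARY:
        "EPW Enemy Prisoners of War\nPW Prisoner of War",
}

def count_hits(term, sections, whole_word=True, ignore_case=False):
    """Count matches of term per file, skipping the glossary itself."""
    body = re.escape(term)
    if whole_word:
        # Word boundaries keep a search for "PW" from matching inside "EPW".
        body = r"\b" + body + r"\b"
    pattern = re.compile(body, re.IGNORECASE if ignore_case else 0)
    hits = {}
    for name, text in sections.items():
        if name == GLOSSARY:
            continue
        found = len(pattern.findall(text))
        if found:
            hits[name] = found
    return hits

print(count_hits("EPW", SECTIONS))
print(count_hits("PW", SECTIONS))
print(count_hits("prison", SECTIONS, whole_word=False, ignore_case=True))
```

On this tiny sample each acronym matches in exactly one section, while the substring search for “prison” matches in both.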


If you are working with the Chilcot report I hope you find these comments useful. I am working on an XML format version of the glossary that treats this as acronym -> expansion, suitable for putting the expansion markup inline.

The report randomly, from a reader’s perspective, uses acronyms and expansions. Consistently recording the acronyms and expansions will benefit readers and researchers. Two audiences ignored in the Chilcot Report.
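As an illustration only, here is a minimal sketch of what an acronym -> expansion glossary in XML might look like. The element and attribute names (`glossary`, `entry`, `acronym`, `expansion`) are hypothetical and are not the format of the version in progress; the sample entries come from the glossary list above. Grouping every expansion under a single entry per acronym keeps the ambiguity visible to whatever later applies the inline markup.

```python
import xml.etree.ElementTree as ET

# Sample (acronym, expansion) pairs taken from the glossary list above.
ENTRIES = [
    ("ISG", "Information Strategy Group"),
    ("ISG", "Iraq Survey Group"),
    ("PW", "Prisoner of War"),
]

def build_glossary(entries):
    """Build a <glossary> tree with one <entry> per acronym."""
    root = ET.Element("glossary")
    by_acronym = {}
    for acronym, expansion in entries:
        entry = by_acronym.get(acronym)
        if entry is None:
            entry = ET.SubElement(root, "entry", {"acronym": acronym})
            by_acronym[acronym] = entry
        # Each competing expansion becomes its own child element.
        ET.SubElement(entry, "expansion").text = expansion
    return root

glossary = build_glossary(ENTRIES)
print(ET.tostring(glossary, encoding="unicode"))
```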

by Patrick Durusau at July 10, 2016 07:35 PM

July 09, 2016

Patrick Durusau

“Going Dark, Going Forward:…” Everyone Is Now Dumber For Having Read It.

Homeland Security’s big encryption report wasn’t fact-checked by Violet Blue.

From the post:

This past week, everyone’s been so focused on Hillary, Trump, police shootings and Dallas that few noticed that the Majority Staff of the House Homeland Security Committee finally released its encryption report — with some pretty big falsehoods in it. “Going Dark, Going Forward: A Primer on the Encryption Debate” is a guide for Congress and stakeholders that makes me wonder if we have a full-blown American hiring crisis for fact-checkers.

The report relied on more than “100 meetings with … experts from the technology industry, federal, state, and local law enforcement, privacy and civil liberties, computer science and cryptology, economics, law and academia, and the Intelligence Community.” And just a little bit of creative license.

The first line of the report is based on flat-out incorrect information.

Do us all a favor, read Violet Blue’s summary of the report and not the report itself.

Reading “Going Dark, Going Forward: A Primer on the Encryption Debate” will leave you mis-informed, annoyed/amazed at congressional committee ignorance, despairing over the future of civilization, and dumber.

I differ from Violet because I think the report is intended to mis-inform, mis-lead and set false terms into play for a debate over encryption.

That is not an issue of fact-checking but of malice.

Consider the “big lie” that Violet quotes from the report (its opening line):

“Public engagement on encryption issues surged following the 2015 terrorist attacks in Paris and San Bernardino, particularly when it became clear that the attackers used encrypted communications to evade detection — a phenomenon known as ‘going dark.'”

Every time that claim is made and repeated in popular media, a disclaimer should immediately appear:

The claim that encrypted communications were used to evade detection in the 2015 terrorist attacks in Paris and San Bernardino is a lie. A lie told with the intent to deceive and manipulate everyone who hears it.

I know, it’s too long to be an effective disclaimer. Do you think “Lying bastards!” in closed captioning would be clear enough?

Counter false narratives like Going Dark, Going Forward: A Primer on the Encryption Debate.

Otherwise, the encryption “debate” will be held on false terms.

by Patrick Durusau at July 09, 2016 08:53 PM

Weka MOOCs – Self-Paced Courses

All three Weka MOOCs available as self-paced courses

From the post:

All three MOOCs (“Data Mining with Weka”, “More Data Mining with Weka” and “Advanced Data Mining with Weka”) are now available on a self-paced basis. All the material, activities and assessments are available from now until 24th September 2016 at:

The Weka software and MOOCs are great introductions to machine learning!

by Patrick Durusau at July 09, 2016 12:56 AM

July 08, 2016

Patrick Durusau

Donald Knuth: Literate Programming on Channel 9

Donald Knuth: Literate Programming on Channel 9.


The speaker will discuss what he considers to be the most important outcome of his work developing TeX in the 1980s, namely the accidental discovery of a new approach to programming — which caused a radical change in his own coding style. Ever since then, he has aimed to write programs for human beings (not computers) to read. The result is that the programs have fewer mistakes, they are easier to modify and maintain, and they can indeed be understood by human beings. This facilitates reproducible research, among other things.

Presentation at the R User Conference 2016.

Increase your book budget before watching this video!

by Patrick Durusau at July 08, 2016 12:49 AM

July 07, 2016

Patrick Durusau

Faking Government Transparency: The Chilcot Report

The Chilcot Report (Iraq Inquiry) is an example of faking governmental transparency.

You may protest: “But look at all the files, testimony, documents, etc. How can it be more transparent than that?”

That’s not a hard question to answer.

Preventing Shared Public Discussion

The release of the Chilcot Report as PDF files eliminates any possibility of shared public discussion of its contents.

The report will be discussed by members of the media, experts and the public. Public comments are going to be scattered over blogs, newspapers, Twitter, Facebook and other media. And over a long period of time as well.

For example, the testimony of Mr. Jonathan Powell is likely to draw comments:

“… it was a mistake to go so far with de‑Ba’athification. It is a similar mistake the Americans made after the Second World War with de‑Nazification and they had to reverse it. Once it became clear to us, we argued with the administration to reverse it, and they did reverse it, although with difficulty because the Shia politicians in the government were very reluctant to allow it to be reversed, and at the time we were being criticised for not doing enough de‑Ba’athification.”75

75 Public hearing, 18 January 2010, page 128.

Had the report been properly published as HTML, that quote could appear as:

<blockquote id="iraq-inquiry_volume-10-section-111-para78-powell">
“… it was a mistake to go so far with de‑Ba’athification. It is a similar mistake the Americans made after the Second World War with de‑Nazification and they had to reverse it. Once it became clear to us, we argued with the administration to reverse it, and they did reverse it, although with difficulty because the Shia politicians in the government were very reluctant to allow it to be reversed, and at the time we were being criticised for not doing enough de‑Ba’athification.”75
</blockquote>

The primary difference is that with an official identifier for the Powell quote, everyone discussing it can point to the same quote.

Which enables a member of the public, researcher, reporter or even a member of government, to search for: iraq-inquiry_volume-10-section-111-para78-powell and find every discussion that is indexed on the Internet, that points to that quote.

Granted, that depends on authors using the identifier, but it enables public discussion and research in ways that PDF simply blocks.

Every paragraph, every quote, every list item, every map, should have a unique ID to facilitate pointing to portions of the original report.
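Generating such identifiers is not hard. A minimal sketch in Python follows, assuming the volume, section and paragraph numbers are already known for each fragment. The helper names `make_fragment_id` and `blockquote_with_id` are mine, and the ID pattern simply follows the Powell example above; it is not an official scheme.

```python
from html import escape


def make_fragment_id(volume, section, paragraph, label=""):
    """Build an identifier shaped like the Powell example above."""
    base = f"iraq-inquiry_volume-{volume}-section-{section}-para{paragraph}"
    # An optional label (such as a witness name) distinguishes quotes
    # that share a paragraph.
    return f"{base}-{label}" if label else base


def blockquote_with_id(fragment_id, text):
    """Wrap text in a blockquote element that carries the identifier."""
    # quote=True also escapes double quotes so the id attribute stays well formed.
    return (
        f'<blockquote id="{escape(fragment_id, quote=True)}">\n'
        f"{escape(text, quote=False)}\n"
        "</blockquote>"
    )


powell_id = make_fragment_id(10, "111", 78, "powell")
print(powell_id)  # iraq-inquiry_volume-10-section-111-para78-powell
print(blockquote_with_id(powell_id, "… it was a mistake to go so far …"))
```

Because the identifier is built only from the position of the fragment in the report, anyone applying the same rule to the same text arrives at the same ID.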

A Lack of Hyperlinks

One of the more striking deficits of the Chilcot Report is its lack of hyperlinks. Footnote 75, which you saw above,

75 Public hearing, 18 January 2010, page 128.

is not a hyperlink to that public hearing.

Why should the public be tasked with rummaging through multiple documents when publishing all of the texts as HTML would enable point to point navigation to relevant material?
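As a rough illustration of how little is involved, the following Python sketch turns a footnote of the form shown above into a link. It assumes someone has compiled a table mapping hearing dates to transcript addresses; the `example.org` address, the `#page-N` anchor convention and the function name `link_hearing_footnote` are all hypothetical.

```python
import re
from datetime import datetime
from html import escape

# Matches footnotes shaped like: "75 Public hearing, 18 January 2010, page 128."
HEARING_FOOTNOTE = re.compile(
    r"^(?P<number>\d+) Public hearing, "
    r"(?P<date>\d{1,2} \w+ \d{4}), page (?P<page>\d+)\.$"
)


def link_hearing_footnote(footnote, transcript_urls):
    """Turn a 'Public hearing' footnote into an HTML link, if a URL is known."""
    match = HEARING_FOOTNOTE.match(footnote)
    if not match:
        return escape(footnote, quote=False)
    try:
        hearing_date = datetime.strptime(match.group("date"), "%d %B %Y").date()
    except ValueError:
        # Unparseable date: leave the footnote as plain text.
        return escape(footnote, quote=False)
    url = transcript_urls.get(hearing_date.isoformat())
    if url is None:
        return escape(footnote, quote=False)
    # Hypothetical convention: each transcript page has an anchor "page-N".
    href = escape(f"{url}#page-{match.group('page')}", quote=True)
    label = escape(footnote[match.end("number") + 1:], quote=False)
    return f'{match.group("number")} <a href="{href}">{label}</a>'


# Hypothetical lookup table keyed by ISO date.
urls = {"2010-01-18": "https://example.org/hearings/2010-01-18.html"}
print(link_hearing_footnote("75 Public hearing, 18 January 2010, page 128.", urls))
```

Footnotes that do not match the pattern, or whose date is missing from the table, pass through unchanged, so the conversion can be run over every footnote without damaging the rest.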

If you are thinking that impairing the public’s use of this report is the rationale for PDF, you are right in one.

Or consider the lack of hyperlinks to other published materials:

Introduction to the Iraq Inquiry

  • The House of Commons Foreign Affairs Committee published The Decision to go to War in Iraq on 3 July 2003.
  • The Intelligence and Security Committee of Parliament published Iraqi Weapons of Mass Destruction – Intelligence and Assessments on 10 September 2003.
  • Lord Hutton published his Report of the Inquiry into the Circumstances Surrounding the Death of Dr David Kelly CMG on 28 January 2004.
  • A Committee of Privy Counsellors, chaired by Lord Butler of Brockwell, published its Review of Intelligence on Weapons of Mass Destruction on 14 July 2004. Sir John Chilcot was a member of Lord Butler’s Committee.
  • The Baha Mousa Inquiry, chaired by Sir William Gage, was established in May 2008 and published its conclusions on 8 September 2011.2

pages 2 and 3, numbered paragraph 4.

Nary a hyperlink in the lot.

But let’s just take the first one as an example:

The House of Commons Foreign Affairs Committee published The Decision to go to War in Iraq on 3 July 2003.

Where would you go to find that report?

Searching on the title finds volume 1 of that report relatively easily: House of Commons Foreign Affairs Committee The Decision to go to War in Iraq Ninth Report of Session 2002–03 Volume I.

Seeing “volume 1” makes me suspect there is also a volume 2. Casting about a bit more, we find a page of which I took the following screenshot:


(select for a larger image)

In the larger version you will see there are three volumes to The Decision to go to War in Iraq, not one. Where the other two volumes are now, your guess is probably better than mine. I tried a number of queries but did not get useful results.

Multiply those efforts by everyone in the UK who has an interest in this report and you will see the lack of hyperlinks for what it truly is, a deliberate ploy to impede the public’s use of this report.

Degree of Difficulty?

Lest anyone protest that production of HTML with hyperlinks represents an extreme burden on the Iraq Inquiry’s staff, recall the excellent use Parliament makes of the web. (I know a number of markup experts in the UK that I can recommend should the holders of the original text wish to issue a text that would be useful to the public.)

No, the publication of the Iraq Inquiry as non-hyperlinked PDF was a deliberate choice. One designed to impede its use for reasons best known to those making that decision. Unsavory reasons I have no doubt.

PS: In the future, do not accept reports with footnotes/endnotes represented in layout. As logical elements, footnotes/endnotes are much easier to manage.

by Patrick Durusau at July 07, 2016 06:24 PM


Open source perks

With the move to GitHub, several perks of being an open source project came to light:


GitHub has a nice integration with Travis-CI, which offers free continuous integration for open source projects. Every push to a branch or pull-request branch can lead to a build and test of the project. The configuration of the build process is contained in the repository so that each branch may determine its own testing parameters.

We’ve enabled the Travis-CI functionality for the main Ontopia repository, and the results are publicly available.


The automated building and testing will assist the developers with determining if branches or pull requests can be merged into the master branch or more work should be done first.


Codacy offers code analysis and measurements in a cloud service model. These measures can uncover possible improvements to the project in areas such as coding style, performance and security.

As with Travis, use of Codacy is free for open source projects. The analysis and measurements of Ontopia are publicly available.

Some of the issues Codacy reports are a good practice exercise for people who would like to contribute to the project without needing a full in-depth understanding of all the code. Feel free to open merge requests referencing the issues you resolve.

by Q. Siebers at July 07, 2016 01:03 PM

July 06, 2016

Patrick Durusau

Unicode® Standard, Version 9.0

Unicode® Standard, Version 9.0

From the webpage:

Version 9.0 of the Unicode Standard is now available. Version 9.0 adds exactly 7,500 characters, for a total of 128,172 characters. These additions include six new scripts and 72 new emoji characters.

The new scripts and characters in Version 9.0 add support for lesser-used languages worldwide, including:

  • Osage, a Native American language
  • Nepal Bhasa, a language of Nepal
  • Fulani and other African languages
  • The Bravanese dialect of Swahili, used in Somalia
  • The Warsh orthography for Arabic, used in North and West Africa
  • Tangut, a major historic script of China

Important symbol additions include:

  • 19 symbols for the new 4K TV standard
  • 72 emoji characters such as the following

Why they chose to omit the bacon emoji from the short list is a mystery to me:


Get your baking books out! I see missing bread emojis. ;-)

by Patrick Durusau at July 06, 2016 08:40 PM

Chilcot Report – Collected PDFs, Converted to Text

I didn’t see a bulk download option for the chapters of the Chilcot Report at: The Iraq Inquiry Report page so I have collected those files and bundled them up for download as Iraq-Inquiry-Report-All-Volumes.tar.gz.

I wrote about Apache PDFBox recently so I also converted all of those files to text and have bundled them up as Iraq-Inquiry-Report-Text-Conversion.tar.gz.

Some observations on the text files:

  • Numbered paragraphs have the format: digit(one or more)-period-space
  • Footnotes are formatted: digit(1 or more)-space-text
  • Page numbers: digit(1 or more)-space-no following text
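Those three observations translate fairly directly into regular expressions. The following Python sketch is one way to classify lines in the converted text, assuming the patterns hold as described; the names are mine and the sample paragraph line is made up for illustration. Page numbers are checked before footnotes because both start with digits followed by a space.

```python
import re

# Heuristic patterns that follow the three observations listed above.
NUMBERED_PARAGRAPH = re.compile(r"^(\d+)\. (.*)$")  # digits, period, space
PAGE_NUMBER = re.compile(r"^(\d+)\s*$")             # digits, nothing after
FOOTNOTE = re.compile(r"^(\d+) (\S.*)$")            # digits, space, text


def classify_line(line):
    """Return a (kind, number, text) tuple for one line of converted text."""
    line = line.rstrip("\n")
    match = NUMBERED_PARAGRAPH.match(line)
    if match:
        return ("paragraph", int(match.group(1)), match.group(2))
    # Check page numbers before footnotes: both begin with digits.
    match = PAGE_NUMBER.match(line)
    if match:
        return ("page", int(match.group(1)), "")
    match = FOOTNOTE.match(line)
    if match:
        return ("footnote", int(match.group(1)), match.group(2))
    return ("text", None, line)


for sample in [
    "78. Sample paragraph text.",  # made-up paragraph line
    "75 Public hearing, 18 January 2010, page 128.",
    "12 ",
]:
    print(classify_line(sample))
```

Expect false positives: a body line that happens to begin with a number and a space will be reported as a footnote, so the output is a starting point for review rather than a finished parse.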

Suggestions on other processing steps?

by Patrick Durusau at July 06, 2016 08:19 PM

The Iraq Inquiry (Chilcot Report) [4.5x longer than War and Peace]

The Iraq Inquiry

To give a rough sense of the depth of the Chilcot Report, the executive summary runs 150 pages. The report appears in twelve (12) volumes, not including video testimony, witness transcripts, documentary evidence, contributions and the like.

Cory Doctorow reports a Guardian project to crowd source collecting facts from the 2.6 million word report. The Guardian observes the Chilcot report is “…almost four-and-a-half times as long as War and Peace.”

Manual reading of the Chilcot report is doable, but unlikely to yield all of the connections that exist between participants, witnesses, evidence, etc.

How would you go about making the Chilcot report and its supporting evidence more amenable to navigation and analysis?

The Report

The Evidence

Other Material

Unfortunately, sections within volumes were not numbered according to their volume. In other words, volume 2 starts with section 3.3 and ends with 3.5, whereas volume 4 only contains sections beginning with “4.,” while volume 5 starts with section 5 but also contains sections 6.1 and 6.2. Nothing can be done about that, but be aware that section numbers don’t correspond to volume numbers.

by Patrick Durusau at July 06, 2016 07:41 PM

When AI’s Take The Fifth – Sign Of Intelligence?

Taking the fifth amendment in Turing’s imitation game by Kevin Warwick and Huma Shah.


In this paper, we look at a specific issue with practical Turing tests, namely the right of the machine to remain silent during interrogation. In particular, we consider the possibility of a machine passing the Turing test simply by not saying anything. We include a number of transcripts from practical Turing tests in which silence has actually occurred on the part of a hidden entity. Each of the transcripts considered here resulted in a judge being unable to make the ‘right identification’, i.e., they could not say for certain which hidden entity was the machine.

A delightful read about something never seen in media interviews: silence of the person being interviewed.

Of the interviews I watch, which is thankfully a small number, most people would seem more intelligent by being silent more often.

I take the authors’ results as a mark in favor of Fish’s interpretative communities because “interpretation” of silence falls squarely on the shoulders of the questioner.

If you don’t know the name Kevin Warwick, you should.

As of today, footnote 1 correctly points to the Fifth Amendment text at Cornell but mis-quotes it. In relevant part the Fifth Amendment reads, “…nor shall be compelled in any criminal case to be a witness against himself….”

by Patrick Durusau at July 06, 2016 01:52 PM

July 05, 2016

Patrick Durusau

Everything You Wanted to Know about Book Sales (But Were Afraid to Ask)

Everything You Wanted to Know about Book Sales (But Were Afraid to Ask) by Lincoln Michel.

From the post:

Publishing is the business of creating books and selling them to readers. And yet, for some reason we aren’t supposed to talk about the latter.

Most literary writers consider book sales a half-crass / half-mythological subject that is taboo to discuss.
While authors avoid the topic, every now and then the media brings up book sales — normally to either proclaim, yet again, the death of the novel, or to make sweeping generalizations about the attention spans of different generations. But even then, the data we are given is almost completely useless for anyone interested in fiction and literature. Earlier this year, there was a round of excited editorials about how print is back, baby after industry reports showed print sales increasing for the second consecutive year. However, the growth was driven almost entirely by non-fiction sales… more specifically adult coloring books and YouTube celebrity memoirs. As great as adult coloring books may be, their sales figures tell us nothing about the sales of, say, literary fiction.

Lincoln’s account mirrors my experience (twice) with a small press decades ago.

While you (rightfully) think that every sane person on the planet will forego the rent in order to purchase your book, sadly your publisher is very unlikely to share that view.

One of the comments to this post reads:

…Writing is a calling but publishing is a business.

Quite so.

Don’t be discouraged by this account but do allow it to influence your expectations, at least about the economic rewards of publishing.

Just in case I get hit with the publishing bug again, good luck to us all!

by Patrick Durusau at July 05, 2016 09:57 PM

Free Programming Books – Update

Free Programming Books by Victor Felder.

From the webpage:

This list initially was a clone of stackoverflow – List of Freely Available Programming Books by George Stocker. Now updated, with dead links gone and new content.

Moved to GitHub for collaborative updating.

Great listing of resources!

But each resource stands alone as its own silo. It can (and many do) refer to other materials, even with hyperlinks, but if you want to explore any of them, you must explore them separately. That’s what being in a silo means. You have to start over at the beginning. Every time.

That is complicated by the existence of thousands of slideshows and videos on programming topics not listed here. Search for your favorite programming language at Slideshare and YouTube. There are other repositories of slideshows and videos; those are just examples.

Each one of those slideshows and/or videos is also a silo. Not to mention that with video you need a time marker if you aren’t going to watch every second of it to find relevant material.

What if you could traverse each of those silos, books, posts, slideshows, videos, documentation, source code, seamlessly?

Making that possible for C/C++ now, given the backlog of material, would have a large upfront cost before it could be useful.

Making that possible for languages with shorter histories, well, how useful would it need to be to justify its cost?

And how would you make it possible for others to easily contribute gems that they find?

Something to think about as you wander about in each of these separate silos.


by Patrick Durusau at July 05, 2016 08:30 PM

Using A Shared Password Is A Crime (9th Circuit, U.S. v. Nosal) Full Text of Opinion

U.S. appeals court rejects challenge to anti-hacking law by Jonathan Stempel.

From the post:

A divided federal appeals court on Tuesday gave the U.S. Department of Justice broad leeway to police password theft under a 1984 anti-hacking law, upholding the conviction of a former Korn/Ferry International executive for stealing confidential client data.

The 9th U.S. Circuit Court of Appeals in San Francisco said David Nosal violated the Computer Fraud and Abuse Act in 2005 when he and two friends, who had also left Korn/Ferry, used an employee’s password to access the recruiting firm’s computers and obtain information to help start a new firm.

Writing for a 2-1 majority, Circuit Judge Margaret McKeown said Nosal acted “without authorization” even though the employee, his former secretary, had voluntarily provided her password.

The full text of the decision (plus dissent) in U.S. v. Nosal, No. 14-10037.

This case has a long history, which I won’t try to summarize now.

by Patrick Durusau at July 05, 2016 07:48 PM

Hillary Clinton Email Archive

Hillary Clinton Email Archive by Wikileaks.

From the webpage:

On March 16, 2016 WikiLeaks launched a searchable archive for 30,322 emails & email attachments sent to and from Hillary Clinton’s private email server while she was Secretary of State. The 50,547 pages of documents span from 30 June 2010 to 12 August 2014. 7,570 of the documents were sent by Hillary Clinton. The emails were made available in the form of thousands of PDFs by the US State Department as a result of a Freedom of Information Act request. The final PDFs were made available on February 29, 2016.

“Truthers” may be interested in this searchable archive of Clinton’s emails while Secretary of State.

“Truthers” because the FBI’s recommendation of no charges effectively ends this particular approach to derailing Clinton’s run for the presidency.

Many wish the result were different but when the last strike is called, arguing about it isn’t going to change the score of the game.

New evidence and new facts, on the other hand, are unknown factors and could make a difference whereas old emails will not.

Are you going to be looking for new evidence and facts or crying over calls in a game already lost?

by Patrick Durusau at July 05, 2016 06:33 PM