Planet Topic Maps

June 25, 2016

Patrick Durusau

Speaking of Wasted Money on DRM / WWW EME Minus 2 Billion Devices

Earlier today I was scribbling about wasting money on DRM, saying:


I feel sorry for content owners. Their greed makes them easy prey for people selling patented DRM medicine for the delivery of their content. In the long run it only hurts themselves (the DRM tax) and users. In fact, the only people making money off of DRM are the people who deliver content.

This evening I ran across: Chrome Bug Makes It Easy to Download Movies From Netflix and Amazon Prime by Michael Nunez.

Nunez points out that an exploit in the open source Chrome browser enables users to save movies from Netflix and Amazon Prime.

Even once a patch appears, others can compile the code without the patch, to continue downloading, illegally, movies from Netflix and Amazon Prime.

Even more amusing:


Widevine is currently used in more than 2 billion devices worldwide and is the same digital rights management technology used in Firefox and Opera browsers. Safari and Internet Explorer, however, use different DRM technology.

Widevine plus properly configured device = broken DRM.

When Sony and others calculate their ROI from DRM, be sure to subtract 2 billion+ devices that probably won’t honor the no-record DRM setting.

by Patrick Durusau at June 25, 2016 12:50 AM

Visions of a Potential Design School

With cautions:

[Image: design-school-460]

The URL that appears in the image: http://di16.rca.ac.uk/project/the-school-of-___/.

It’s not entirely clear to me if Chrome and/or Mozilla on Ubuntu are displaying these pages correctly. I am unable to scroll within the displayed windows of text. Perhaps that is intentional.

The caution is about the quote from Twitter:

“…deconstruct the ways that they have been inculcated….”

It does not promise you will be able to deconstruct the new narrative that enables you to “deconstruct” the old one.

That is, we never stand outside of all narratives, but in a different narrative than the one we have under deconstruction. (sorry)

by Patrick Durusau at June 25, 2016 12:17 AM

June 24, 2016

Patrick Durusau

…possibly biased? Try always biased.

Artificial Intelligence Has a ‘Sea of Dudes’ Problem by Jack Clark.

From the post:


Much has been made of the tech industry’s lack of women engineers and executives. But there’s a unique problem with homogeneity in AI. To teach computers about the world, researchers have to gather massive data sets of almost everything. To learn to identify flowers, you need to feed a computer tens of thousands of photos of flowers so that when it sees a photograph of a daffodil in poor light, it can draw on its experience and work out what it’s seeing.

If these data sets aren’t sufficiently broad, then companies can create AIs with biases. Speech recognition software with a data set that only contains people speaking in proper, stilted British English will have a hard time understanding the slang and diction of someone from an inner city in America. If everyone teaching computers to act like humans are men, then the machines will have a view of the world that’s narrow by default and, through the curation of data sets, possibly biased.

“I call it a sea of dudes,” said Margaret Mitchell, a researcher at Microsoft. Mitchell works on computer vision and language problems, and is a founding member—and only female researcher—of Microsoft’s “cognition” group. She estimates she’s worked with around 10 or so women over the past five years, and hundreds of men. “I do absolutely believe that gender has an effect on the types of questions that we ask,” she said. “You’re putting yourself in a position of myopia.”

Margaret Mitchell makes a pragmatic case for diversity in the workplace, at least if you want to avoid male-biased AI.

Not that a diverse workplace results in an “unbiased” AI; it will be a biased AI that isn’t solely male-biased.

It isn’t possible to escape bias because some person or persons has to score “correct” answers for an AI. The scoring process imparts the biases of its judge of correctness to the AI being trained.

Unless someone wants to contend there are potential human judges without biases, I don’t see a way around imparting biases to AIs.

By being sensitive to evidence of biases, we can in some cases choose the biases we want an AI to possess, but an AI possessing no biases at all isn’t possible.

AIs are, after all, our creations so it is only fair that they be made in our image, biases and all.

by Patrick Durusau at June 24, 2016 09:24 PM

Hardening the Onion [Other Apps As Well?]

Tor coders harden the onion against surveillance by Paul Ducklin.

From the post:

A nonet of security researchers are on the warpath to protect the Tor Browser from interfering busybodies.

Tor, short for The Onion Router, is a system that aims to help you be anonymous online by disguising where you are, and where you are heading.

That way, nation-state content blockers, law enforcement agencies, oppressive regimes, intelligence services, cybercrooks, Lizard Squadders or even just overly-inquisitive neighbours can’t easily figure out where you are going when you browse online.

Similarly, sites you browse to can’t easily tell where you came from, so you can avoid being traced back or tracked over time by unscrupulous marketers, social engineers, law enforcement agencies, oppressive regimes, intelligence services, cybercrooks, Lizard Squadders, and so on.

Paul provides a high-level view of Selfrando: Securing the Tor Browser against De-anonymization Exploits by Mauro Conti, et al.

The technique generalizes beyond Tor to GNU Bash 4.3, GNU less 4.58, Nginx 1.8.0, Socat 1.7.3.0, Thttpd 2.26, and Google’s Chromium browser.

Given the speed at which defenders play “catch up,” there is much to learn here that will be useful for years to come.

Enjoy!

by Patrick Durusau at June 24, 2016 08:04 PM

Pride Goeth Before A Fall – DMCA & Security Researchers

Cory Doctorow has written extensively on the problems with present plans to incorporate DRM in HTML5:

W3C DRM working group chairman vetoes work on protecting security researchers and competition – June 18, 2016.

An Open Letter to Members of the W3C Advisory Committee – May 12, 2016.

Save Firefox: The W3C’s plan for worldwide DRM would have killed Mozilla before it could start – May 11, 2016.

Interoperability and the W3C: Defending the Future from the Present – March 29, 2016.

among others.

In general I agree with Cory’s reasoning but I don’t see:

…Once DRM is part of a full implementation of HTML5, there’s a real risk to security researchers who discover defects in browsers and want to warn users about them…. (from Cory’s latest post)

Do you remember the Sony “copy-proof” CDs? (See: Sony “copy-proof” CDs cracked with a marker pen.) Then, just as now, Sony is about to hand over bushels of cash to the content delivery crowd.

When security researchers discover flaws in the browser DRM, what prevents them from advising users?

Cory says the anti-circumvention provisions of the DMCA prevent security researchers from discovering and disclosing such flaws.

That’s no doubt true, if you want to commit a crime (violate the DMCA) and publish evidence of that crime with your name attached to it on the WWW.

Isn’t that a case of pride goeth before a fall?

If I want to alert other users to security defects in their browsers, possibly equivalent to the marker pen for Sony CDs, I post that to the WWW anonymously.

Or publish code to make that defect apparent to even a casual user.

What I should not do is put my name on either a circumvention bug report or code to demonstrate it. Yes?

That doesn’t answer Cory’s points about impairing innovation, etc. but once Sony realizes it has been had, again, by the content delivery crowd, what’s the point of more self-inflicted damage?

I feel sorry for content owners. Their greed makes them easy prey for people selling patented DRM medicine for the delivery of their content. In the long run it only hurts themselves (the DRM tax) and users. In fact, the only people making money off of DRM are the people who deliver content.

Should DRM appear as proposed in HTML5, any suggestions for a “marker pen” logo to be used by hackers of a Content Decryption Module?

PS: Another approach to opposing DRM would be to inform shareholders of Sony and other content owners they are about to be raped by content delivery systems.

PPS: In private email Cory advised me to consider the AACS encryption key controversy, where public posting of an encryption key was challenged with take down requests. However, in the long run, such efforts only spread the key more widely, not the effect intended by those attempting to limit its spread.

And there is the Dark Web, ahem, where it is my understanding that non-legal content and other material can be found.

by Patrick Durusau at June 24, 2016 07:18 PM

June 23, 2016

Patrick Durusau

SEC Warning: Hackers, Limit Fraud to Traditional Means

U.S. SEC accuses U.K. man of hacking, fraudulent trades by Jonathan Stempel.

From the post:

The U.S. Securities and Exchange Commission sued a U.K. man it said hacked into online brokerage accounts of several U.S. investors, placed unauthorized stock trades, and within minutes made profitable trades in the same stocks in his own account.

“We will swiftly track down hackers who prey on investors as we allege Mustapha did, no matter where they are operating from and no matter how sophisticated their technology,” Robert Cohen, co-chief of the SEC enforcement division’s market abuse unit, said in a statement.

The case is SEC v Mustapha, U.S. District Court, Southern District of New York, No. 16-04805.

I can’t find the record in PACER. Perhaps it is too recent?

In any event, hackers be warned that the SEC will swiftly move to track you down should you commit fraud on investors using “sophisticated” technology.

Salting of news sources, insider trading, other, more traditional means of defrauding investors, will continue to face lackadaisical enforcement efforts.

You don’t have to take my word for it. See: Report: SEC Filed a Record Number of Enforcement Actions in FY 2015, Aggregate Fines and Penalties Declined by Kevin LaCroix.

Kevin not only talks about the numbers but also provides links to the original report, a novelty for some websites.

The lesson here is to not distinguish yourself by using modern means to commit securities fraud. The SEC is more likely to pursue you.

Is that how you read this case? ;-)

by Patrick Durusau at June 23, 2016 09:48 PM

Bots, Won’t You Hide Me?

Emerging Trends in Social Network Analysis of Terrorism and Counterterrorism, How Police Are Scanning All Of Twitter To Detect Terrorist Threats, Violent Extremism in the Digital Age: How to Detect and Meet the Threat, Online Surveillance: …ISIS and beyond [Social Media “chaff”] are just a small sampling of posts on the detection of “terrorists” on social media.

The last one is my post illustrating how “terrorist” at one time = “anti-Vietnam war,” “civil rights,” and “gay rights.” Due to the public nature of social media, avoiding government surveillance isn’t possible.

I stole the title, Bots, Won’t You Hide Me? from Ben Bova’s short story, Stars, Won’t You Hide Me?. It’s not very long and if you like science fiction, you will enjoy it.

Bova took verses in the short story from Sinner Man, a traditional African American spiritual, which was recorded by a number of artists.

All of that is a very round about way to introduce you to a new Twitter account: ConvJournalism:

All you need to know about Conversational Journalism, (journalistic) bots and #convcomm by @martinhoffmann.

Surveillance of groups on social media isn’t going to succeed (see The White House Asked Social Media Companies to Look for Terrorists. Here’s Why They’d #Fail by Jenna McLaughlin), and bots can play an important role in assisting in that failure.

Imagine bots that not only realistically mimic the chatter of actual human users but also follow, unfollow, etc., and engage in apparent conspiracies with other bots, entirely without human direction or with very little.

Follow ConvJournalism and promote bot research/development that helps all of us hide. (I’d rather have the bots say yes than Satan.)

by Patrick Durusau at June 23, 2016 08:52 PM

Index on Censorship Big Debate: Journalism or fiction?

Index on Censorship Big Debate: Journalism or fiction? by Josie Timms.

From the webpage:

The Index on Censorship Big Debate took place at the 5th annual Leeds Big Bookend Festival this week, where journalists and authors were invited to discuss which has the biggest impact: journalism or fiction. Index’s magazine editor Rachael Jolley was joined by assistant features editor of The Yorkshire Post Chris Bond, Yorkshire-based journalist and author Yvette Huddleston and author of the award-winning Promised Land Anthony Clavane to explore which medium is more influential and why, as part of a series of Time To Talk debates held by Eurozine. Audio from the debate will be available at Time to Talk or listen below.

Highly entertaining discussion but “debate” is a bit of a stretch.

No definition of “impact” was offered, although an informal show of hands was reported to have the vast majority remembering a work of fiction that influenced them and only a distinct minority remembering a work of journalism.

Interesting result because Dickens, a journalist, was mentioned as an influential writer of fiction. At the time, fiction was published in serialized formats (newspapers, magazines; see Victorian Serial Novels), spreading the cost of a work of fiction over months, if not longer.

Dickens is a good reason not to make too much of the distinction, if any, between journalism and fiction. Both are reports of the past, present or projected future from a particular point of view.

At their best, journalism and fiction inform us, enlighten us, show us other points of view, capture events and details we did not witness ourselves.

That doesn’t accord with the 0 or 1 reality of our silicon servants, but I have no desire to help AIs become equal to humans by making humans dumber.

Enjoy!

by Patrick Durusau at June 23, 2016 07:52 PM

The Infinite Jukebox

The Infinite Jukebox

From the FAQ:

  • What is this? For when your favorite song just isn’t long enough. This web app lets you upload a favorite MP3 and will then generate a never-ending and ever changing version of the song. It does what Infinite Gangnam Style did but for any song.
  • It never stops? – That’s right. It will play forever.
  • How does it work? – We use the Echo Nest analyzer to break the song into beats. We play the song beat by beat, but at every beat there’s a chance that we will jump to a different part of song that happens to sound very similar to the current beat. For beat similarity we look at pitch, timbre, loudness, duration and the position of the beat within a bar. There’s a nifty visualization that shows all the possible transitions that can occur at any beat.
  • Are there any ways to control the song? Yes – here are some keys:
    • [space] – Start and stop playing the song
    • [left arrow] – Decrement the current play velocity by one
    • [right arrow] – Increment the current play velocity by one
    • [Down arrow] – Sets the current play velocity to zero
    • [control] – freeze on the current beat
    • [shift] – bounce between the current beat and all of the similar sounding beats. These are the branch points.
    • ‘h’ – Bring it on home – toggles infinite mode off/on.
  • What do the colored blocks represent? Each block represents a beat in the song. The colors are related to the timbre of the music for that beat.

That should be enough to get you started. ;-)
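
The “How does it work?” answer in the FAQ can be sketched in a few lines of Python. This is only an illustration of the idea, not the Infinite Jukebox’s code: the feature tuples, the Euclidean `distance`, the `threshold` and `jump_prob` values, and the wrap-around at the end of the song are all assumptions made for the sketch, and no Echo Nest API is called.

```python
import math
import random

def distance(a, b):
    # Euclidean distance between two beats' feature tuples (an assumed measure).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similar_beats(beats, threshold):
    # For every beat, list the other beats that sound "close enough" to it.
    return {
        i: [j for j, b in enumerate(beats) if j != i and distance(a, b) <= threshold]
        for i, a in enumerate(beats)
    }

def endless_play(beats, threshold=0.5, jump_prob=0.2, seed=None):
    # Yield beat indices forever. After each beat there is a chance of
    # jumping to a similar-sounding beat and carrying on from there.
    rng = random.Random(seed)
    neighbors = similar_beats(beats, threshold)
    i = 0
    while True:
        yield i
        if neighbors[i] and rng.random() < jump_prob:
            i = rng.choice(neighbors[i])
        i = (i + 1) % len(beats)  # wrapping to the start is a simplification
```

Each element of `beats` stands in for the pitch, timbre, loudness, duration and bar position the FAQ mentions; pulling values from `endless_play(beats)` with `next()` gives the never-ending play order.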

There’s a post on the Infinite Jukebox at Music Machinery.

I have mixed feelings about the Infinite Jukebox. While I appreciate its artistry and ability to make the familiar into something familiar, yet different, I also have a deep appreciation for the familiar.

Compare: While My Guitar Gently Weeps by the Beatles to Somebody to Love by Jefferson Airplane at the Infinite Jukebox.

The heart-rending vocals of Grace Slick, on infinite play, become overwhelming.

I need to upload Lather. Strictly for others. I’m quite happy with the original.

Enjoy!

by Patrick Durusau at June 23, 2016 01:04 AM

June 22, 2016

Patrick Durusau

Shallow Reading (and Reporting)

Stefano Bertolo tweets:

[Image: bertolo-01-460]

From the Chicago Tribune post:

On June 4, the satirical news site the Science Post published a block of “lorem ipsum” text under a frightening headline: “Study: 70% of Facebook users only read the headline of science stories before commenting.”

Nearly 46,000 people shared the post, some of them quite earnestly — an inadvertent example, perhaps, of life imitating comedy.

Now, as if it needed further proof, the satirical headline’s been validated once again: According to a new study by computer scientists at Columbia University and the French National Institute, 59 percent of links shared on social media have never actually been clicked: In other words, most people appear to retweet news without ever reading it.

The missing satire link:

Study: 70% of Facebook users only read the headline of science stories before commenting, from the satirical news site Science Post.

The passage:

According to a new study by computer scientists at Columbia University and the French National Institute, 59 percent of links shared on social media have never actually been clicked: In other words, most people appear to retweet news without ever reading it.

should have included a link to: Social Clicks: What and Who Gets Read on Twitter?, by Maksym Gabielkov, Arthi Ramachandran, Augustin Chaintreau, Arnaud Legout.

Careful readers, however, would have followed the link to Social Clicks: What and Who Gets Read on Twitter?, only to discover that Dewey, the author of the Chicago Tribune post, mis-reported the original article.

Here’s how to identify the mis-reporting:

First, as technical articles often do, the authors started with definitions. Definitions that will influence everything you read in that article.


In the rest of this article, we will use the following terms to describe a given URL or online article.

Shares. Number of times a URL has been published in tweets. An original tweet containing the URL or a retweet of this tweet are both considered as a new share.
…(emphasis in the original)

The important point to remember: Every tweet counts as a “share.” If I post a tweet that is never retweeted by anyone, it goes into the share bucket and is one of the shares that was never clicked on.

That is going to impact our counting of “shares” that were never “clicked on.”

In section 3.3 Blockbusters and the share button, the authors write:


First, 59% of the shared URLs are never clicked or, as we call them, silent. Note that we merged URLs pointing to the same article, so out of 10 articles mentioned on Twitter, 6 typically on niche topics are never clicked.

Because silent URLs are so common, they actually account for a significant fraction (15%) of the whole shares we collected, more than one out of seven. An interesting paradox is that there seems to be vastly more niche content that users are willing to mention in Twitter than the content that they are actually willing to click on.
… (emphasis in the original)

To re-write that with the definition of shared inserted:

“…59% of the URLs published in a tweet or re-tweet are never clicked…”

That includes:

  1. Tweet with a URL and no one clicks on the shortened URL in bit.ly
  2. Re-tweet with a URL and no click on the shortened URL in bit.ly

Since tweets and re-tweets are lumped together (they may not be in the data, I haven’t seen it, yet), it isn’t possible to say how many re-tweets occurred without corresponding clicks on the shortened URLs.
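
A toy calculation shows why the 59% figure describes URLs and not people who retweet. The numbers below are invented for illustration (they are not the paper’s data), chosen so that they land near the 59% and 15% quoted above; `silent_stats` is a hypothetical helper.

```python
def silent_stats(urls):
    # A URL is "silent" when it was shared but never clicked.
    silent = [u for u in urls if u["clicks"] == 0]
    url_fraction = len(silent) / len(urls)
    share_fraction = (
        sum(u["shares"] for u in silent) / sum(u["shares"] for u in urls)
    )
    return url_fraction, share_fraction

# Invented data: six niche URLs tweeted once and never clicked,
# four popular URLs tweeted and re-tweeted many times.
urls = (
    [{"shares": 1, "clicks": 0} for _ in range(6)]
    + [{"shares": s, "clicks": 100} for s in (10, 9, 8, 7)]
)

url_fraction, share_fraction = silent_stats(urls)
print(f"silent URLs:   {url_fraction:.0%}")
print(f"silent shares: {share_fraction:.0%}")
```

Six of ten URLs are silent, yet they account for only 6 of 40 shares. Neither number says how many re-tweets were made without a click, which is the claim the news report drew from the study.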

I’m certain people share tweets without visiting URLs but this article isn’t authority for percentages on that claim.

Not only should you visit URLs but you should also read carefully what you find, before re-tweeting or reporting.

by Patrick Durusau at June 22, 2016 08:39 PM

The No-Value-Add Of Academic Publishers And Peer Review

Comparing Published Scientific Journal Articles to Their Pre-print Versions by Martin Klein, Peter Broadwell, Sharon E. Farb, Todd Grappone.

Abstract:

Academic publishers claim that they add value to scholarly communications by coordinating reviews and contributing and enhancing text during publication. These contributions come at a considerable cost: U.S. academic libraries paid $1.7 billion for serial subscriptions in 2008 alone. Library budgets, in contrast, are flat and not able to keep pace with serial price inflation. We have investigated the publishers’ value proposition by conducting a comparative study of pre-print papers and their final published counterparts. This comparison had two working assumptions: 1) if the publishers’ argument is valid, the text of a pre-print paper should vary measurably from its corresponding final published version, and 2) by applying standard similarity measures, we should be able to detect and quantify such differences. Our analysis revealed that the text contents of the scientific papers generally changed very little from their pre-print to final published versions. These findings contribute empirical indicators to discussions of the added value of commercial publishers and therefore should influence libraries’ economic decisions regarding access to scholarly publications.

The authors have performed a very detailed analysis of pre-prints, 90% – 95% of which are published as open pre-prints first, to conclude there is no appreciable difference between the pre-prints and the final published versions.
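
To give a feel for what “applying standard similarity measures” can look like, here is a minimal sketch using Python’s standard `difflib`. The abstract does not name the measures the authors used, so `SequenceMatcher` is a stand-in chosen for illustration, and the two sample sentences and the `text_similarity` helper are invented.

```python
from difflib import SequenceMatcher

def text_similarity(preprint, published):
    # Compare two texts word by word; 1.0 means identical word sequences,
    # 0.0 means no words in common.
    a = preprint.lower().split()
    b = published.lower().split()
    return SequenceMatcher(None, a, b).ratio()

preprint = "We measure the text similarity of papers"
published = "We measure the textual similarity of papers"

print(round(text_similarity(preprint, published), 3))
```

A score close to 1.0 across a large corpus of pre-print and published pairs would be the kind of evidence the abstract describes.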

I take “…no appreciable difference…” to mean academic publishers and the peer review process, despite claims to the contrary, contribute little or no value to academic publications.

How’s that for a bargaining chip in negotiating subscription prices?

by Patrick Durusau at June 22, 2016 02:33 AM

June 21, 2016

Patrick Durusau

Tapping Into The Terror Money Stream

Can ISIS Take Down D.C.? by Jeff Stein.

From the post:


If the federal government is good at anything, however, it’s throwing money at threats. Since 2003, taxpayers have contributed $1.3 billion to the feds’ BioWatch program, a network of pathogen detectors deployed in D.C. and 33 other cities (plus at so-called national security events like the Super Bowl), despite persistent questions about its need and reliability. In 2013, Republican Representative Tim Murphy of Pennsylvania, chairman of the House Energy and Commerce Committee’s Oversight and Investigations subcommittee, called it a “boondoggle.” Jeh Johnson, who took over the reins of the Department of Homeland Security (DHS) in late 2013, evidently agreed. One of his first acts was to cancel a planned third generation of the program, but the rest of it is still running.

“The BioWatch program was a mistake from the start,” a former top federal emergency medicine official tells Newsweek on condition of anonymity, saying he fears retaliation from the government for speaking out. The well-known problems with the detectors, he says, are both highly technical and practical. “Any sort of thing can blow into its filter papers, and then you are wrapping yourself around an axle,” trying to figure out if it’s real. Of the 149 suspected pathogen samples collected by BioWatch detectors nationwide, he reports, “none were a threat to public health.” A 2003 tularemia alarm in Texas was traced to a dead rabbit.

Michael Sheehan, a former top Pentagon, State Department and New York Police Department counterterrorism official, echoes such assessments. “The technology didn’t work, and I had no confidence that it ever would,” he tells Newsweek. The immense amounts of time and money devoted to it, he adds, could’ve been better spent “protecting dangerous pathogens stored in city hospitals from falling into the wrong hands.” When he sought to explore that angle at the NYPD, the Centers for Disease Control and Prevention “initially would not tell us where they were until I sent two detectives to Atlanta to find out,” he says. “And they did, and we helped the hospitals with their security—and they were happy for the assistance.”

Even if BioWatch performed as touted, Sheehan and others say, a virus would be virtually out of control and sending scores of people to emergency rooms by the time air samples were gathered, analyzed and the horrific results distributed to first responders. BioWatch, Sheehan suggests, is a billion-dollar hammer looking for a nail, since “weaponizing biological agents is incredibly hard to do,” and even ISIS, which theoretically has the scientific assets to pursue such weapons, has shown little sustained interest in them. Plus, extremists of all denominations have demonstrated over the decades that they like things that go boom (or tat-tat-tat, the sound of an assault rifle). So the $1.1 billion spent on BioWatch is way out of proportion to the risk, critics argue. What’s really driving programs like BioWatch, Sheehan says—beside fears of leaving any potential threat uncovered, no matter how small—is the opportunity it gives members of Congress to lard out pork to research universities and contractors back home.

Considering that two people with one rifle terrorized the D.C. area for 23 days (see The Beltway Snipers, Part 1 and The Beltway Snipers, Part 2), I would have to say yes, ISIS can take down D.C.

Even if they limit themselves to “…things that go boom (or tat-tat-tat, the sound of an assault rifle).” (You have to wonder about the quality of their “terrorist” training.)

But in order to get funding, you have to discover a scenario that isn’t fully occupied by contractors.

Quite recently I read of an effort to detect the possible onset of terror attacks based on social media traffic. Except there is no evidence that random social media group traffic picks up before a terrorist attack. Yeah, well, there is that but that won’t come up for years.

Here’s a new terror vector. Using Washington, D.C. as an example, how would you weaponize open data found at: District of Columbia Open Data?

Data.gov reports there are forty states (US), forty-eight counties and cities (US), fifty-two international countries (what else would they be?), and one-hundred and sixty-four international regions with open data portals.

That’s a considerable amount of open data. Data that could be combined together to further ends not intended to improve public health and well-being.

Don’t allow the techno-jingoism of posts like: How big data can terrorize global terrorism lull you into a false sense of security.

Anyone who can think beyond being a not-so-smart bomb or tat-tat-tat can access and use open data with free tools. Are you aware of the danger that poses?

by Patrick Durusau at June 21, 2016 09:44 PM

Driving While Black (DWB) Stops Affirmed By Supreme Court [Hacker Tip]

Justice Sotomayor captures the essence of Utah v. Strieff when she writes:

The Court today holds that the discovery of a warrant for an unpaid parking ticket will forgive a police officer’s violation of your Fourth Amendment rights. Do not be soothed by the opinion’s technical language: This case allows the police to stop you on the street, demand your identification, and check it for outstanding traffic warrants—even if you are doing nothing wrong. If the officer discovers a warrant for a fine you forgot to pay, courts will now excuse his illegal stop and will admit into evidence anything he happens to find by searching you after arresting you on the warrant. Because the Fourth Amendment should prohibit, not permit, such misconduct, I dissent.

The facts are easy enough to summarize: Edward Strieff was seen visiting a home that had been reported (but not confirmed) as a site of drug sales. Officer Fackrell, with no suspicion that Strieff had committed a crime, detained Strieff, requested his identification and was advised of a traffic warrant for his arrest. Fackrell arrested Strieff and, while searching him, discovered “a baggie of methamphetamine and drug paraphernalia.”

Strieff moved to suppress the “baggie of methamphetamine and drug paraphernalia” since Officer Fackrell lacked even a pretense for the original stop. The Utah Supreme Court correctly agreed, but the Supreme Court in this decision, written by “Justice” Thomas, disagreed.

The “exclusionary rule” has a long history but for our purposes, it suffices to say that it removes any incentive for police officers to stop people without reasonable suspicion and demand their ID, search them, etc.

It does so by excluding any evidence of a crime they discover as a result of such a stop. Or at least it did prior to Utah v. Strieff. Police officers were forced to make up some pretext for a reasonable suspicion in order to stop any given individual.

No reasonable suspicion for stop = No evidence to be used in court.

That was the theory, prior to Utah v. Strieff.

As Sotomayor makes clear in her dissent, this was a suspicionless stop:


This case involves a suspicionless stop, one in which the officer initiated this chain of events without justification. As the Justice Department notes, supra, at 8, many innocent people are subjected to the humiliations of these unconstitutional searches. The white defendant in this case shows that anyone’s dignity can be violated in this manner. See M. Gottschalk, Caught 119–138 (2015). But it is no secret that people of color are disproportionate victims of this type of scrutiny. See M. Alexander, The New Jim Crow 95–136 (2010). For generations, black and brown parents have given their children “the talk”— instructing them never to run down the street; always keep your hands where they can be seen; do not even think of talking back to a stranger—all out of fear of how an officer with a gun will react to them. See, e.g., W. E. B. Du Bois, The Souls of Black Folk (1903); J. Baldwin, The Fire Next Time (1963); T. Coates, Between the World and Me (2015).

By legitimizing the conduct that produces this double consciousness, this case tells everyone, white and black, guilty and innocent, that an officer can verify your legal status at any time. It says that your body is subject to invasion while courts excuse the violation of your rights. It implies that you are not a citizen of a democracy but the subject of a carceral state, just waiting to be cataloged.

We must not pretend that the countless people who are routinely targeted by police are “isolated.” They are the canaries in the coal mine whose deaths, civil and literal, warn us that no one can breathe in this atmosphere. See L. Guinier & G. Torres, The Miner’s Canary 274–283 (2002). They are the ones who recognize that unlawful police stops corrode all our civil liberties and threaten all our lives. Until their voices matter too, our justice system will continue to be anything but. (emphasis in original)

New rule: Police can stop you at any time, for no reason, demand identification and check your legal status. If you are arrested as a result of that check, any evidence seized can be used against you in court.

Police officers were very good at imagining reasonable cause for stopping people, but now even that tissue of protection has been torn away.

You are subject to arbitrary and capricious stops with no disincentive for the police. They can go fishing for evidence and see what turns up.

For all of that, I don’t see the police as our enemy. They are playing by rules as defined by others. If we want better play, such as Fourth Amendment rights, then we need enforcement of those rights.

It isn’t hard to identify the enemies of the people in this decision.


Hackers, you too can be stopped at any time. Hackers should never carry incriminating USB drives, SIM cards, etc. If possible, everything even remotely questionable should not be in a location physically associated with you.

Remote storage of your code, booty, etc., protects it from clumsy physical seizure of local hardware and, if you are very brave, enables rapid recovery from such seizures.

by Patrick Durusau at June 21, 2016 02:36 PM

June 20, 2016

Patrick Durusau

Cryptome – Happy 20th Anniversary!

cryptome-01-460

Cryptome marks 20 years, June 1996-2016, 100K dox thanx to 25K mostly anonymous doxers.



Donate $100 for the Cryptome Archive of 101,900 files from June 1996 to 25 May 2016 on 1 USB  (43.5GB). Cryptome public key.
(Search site with Google, or WikiLeaks for most not all.)

Bitcoin: 1P11b3Xkgagzex3fYusVcJ3ZTVsNwwnrBZ

Additional items on https://twitter.com/Cryptomeorg


Interesting post on fake Cryptome torrents: http://www.joshwieder.net/2015/07/cryptome-torrents-draw-concerns.html

$100 is a real bargain for the Cryptome Archive, plus you will be helping a worthy cause.

Repost the news of Cryptome’s 20th anniversary far and wide!

Thanks!

by Patrick Durusau at June 20, 2016 09:09 PM

Clojure Gazette – New Format – Looking for New Readers

Clojure Gazette by Eric Normand.

From the end of this essay:

Hi! The Clojure Gazette has recently changed from a list of curated links to an essay-style newsletter. I’ve gotten nothing but good comments about the change, but I’ve also noticed the first negative growth of readership since I started. I know these essays aren’t for everyone, but I’m sure there are people out there who would like the new format who don’t know about it. Would you do me a favor? Please share the Gazette with your friends!

The Biggest Waste in Our Industry is the title of the essay I link to above.

From the post:

I would like to talk about two nasty habits I have been party to working in software. Those two habits are 1) protecting programmer time and 2) measuring programmer productivity. I’m talking from my experience as a programmer to all the managers out there, or any programmer interested in process.

You can think of Eric’s essay as an update to Peopleware: Productive Projects and Teams by Tom DeMarco and Timothy Lister.

Peopleware was first published in 1987, with a second edition in 1999 (8 new chapters) and a third edition in 2013 (5 more pages than the 1999 edition?).

Twenty-nine (29) years after the publication of Peopleware, managers still don’t “get” how to manage programmers (or other creative workers).

Disappointing, but not surprising.

It’s not uncommon to read position ads that describe going to lunch en masse, group activities, etc.

You would think they were hiring lemmings rather than technical staff.

If your startup founder is that lonely, check the local mission. Hire people for social activities, lunch, etc. Cheaper than hiring salaried staff. Greater variety as well. Ditto for managers with the need to “manage” someone.

by Patrick Durusau at June 20, 2016 08:24 PM

Tufte-inspired LaTeX (handouts, papers, and books)

Tufte-LaTeX – A Tufte-inspired LaTeX class for producing handouts, papers, and books.

From the webpage:

As discussed in the Book Design thread of Edward Tufte’s Ask E.T Forum, this site is home to LaTeX classes for producing handouts and books according to the style of Edward R. Tufte and Richard Feynman.

Download the latest release, browse the source, join the mailing list, and/or submit patches. Contributors are welcome to help polish these classes!

Some examples of the Tufte-LaTeX classes in action:

  • Some papers by Jason Catena using the handout class
  • A handout for a math club lecture on volumes of n-dimensional spheres by Marty Weissman
  • A draft copy of a book written by Marty Weissman using the new Tufte-book class
  • An example handout (source) using XeLaTeX with the bidi class option for the ancient Hebrew by Kirk Lowery

Caution: A Tufte-inspired LaTeX class is no substitute for professional design advice and assistance. It will help you do “better,” for some definition of “better,” but professional design is in a class of its own.

If you are interested in TeX/LaTeX tips, follow: TexTips. One of several excellent Twitter feeds by John D. Cook.

by Patrick Durusau at June 20, 2016 07:51 PM

Machine Learning Yearning [New Book – Free Draft – Signup By Friday June 24th (2016)]

Machine Learning Yearning by Andrew Ng.

About Andrew Ng:

Andrew Ng is Associate Professor of Computer Science at Stanford; Chief Scientist of Baidu; and Chairman and Co-founder of Coursera.

In 2011 he led the development of Stanford University’s main MOOC (Massive Open Online Courses) platform and also taught an online Machine Learning class to over 100,000 students, leading to the founding of Coursera. Ng’s goal is to give everyone in the world access to a great education, for free. Today, Coursera partners with some of the top universities in the world to offer high quality online courses, and is the largest MOOC platform in the world.

Ng also works on machine learning with an emphasis on deep learning. He founded and led the “Google Brain” project which developed massive-scale deep learning algorithms. This resulted in the famous “Google cat” result, in which a massive neural network with 1 billion parameters learned from unlabeled YouTube videos to detect cats. More recently, he continues to work on deep learning and its applications to computer vision and speech, including such applications as autonomous driving.

Haven’t you signed up yet?

OK, What You Will Learn:

The goal of this book is to teach you how to make the numerous decisions needed with organizing a machine learning project. You will learn:

  • How to establish your dev and test sets
  • Basic error analysis
  • How you can use Bias and Variance to decide what to do
  • Learning curves
  • Comparing learning algorithms to human-level performance
  • Debugging inference algorithms
  • When you should and should not use end-to-end deep learning
  • Error analysis by parts

Free drafts of a new book on machine learning projects, not just machine learning, by one of the leading world experts on machine learning.

Now are you signed up?

If you are interested in machine learning, following Andrew Ng on Twitter isn’t a bad place to start.

Be aware, however, that even machine learning experts can be mistaken. For example, Andrew tweeted, favorably, How to make a good teacher from the Economist.


Instilling these techniques is easier said than done. With teaching as with other complex skills, the route to mastery is not abstruse theory but intense, guided practice grounded in subject-matter knowledge and pedagogical methods. Trainees should spend more time in the classroom. The places where pupils do best, for example Finland, Singapore and Shanghai, put novice teachers through a demanding apprenticeship. In America high-performing charter schools teach trainees in the classroom and bring them on with coaching and feedback.

Teacher-training institutions need to be more rigorous—rather as a century ago medical schools raised the calibre of doctors by introducing systematic curriculums and providing clinical experience. It is essential that teacher-training colleges start to collect and publish data on how their graduates perform in the classroom. Courses that produce teachers who go on to do little or nothing to improve their pupils’ learning should not receive subsidies or see their graduates become teachers. They would then have to improve to survive.

The author conflates “demanding apprenticeship” with “teacher-training colleges start to collect and publish data on how their graduates perform in the classroom,” as though whatever data we collect has some meaningful relationship with teaching and/or the training of teachers.

A “demanding apprenticeship” no doubt weeds out people who are not well suited to be teachers, but there is no evidence that it can make a teacher out of someone who isn’t suited for the task.

The collection of data is one of the ongoing fallacies about American education. Simply because you can collect data is no indication that it is useful and/or has any relationship to what you are attempting to measure.

Follow Andrew for his work on machine learning, not so much for his opinions on education.

by Patrick Durusau at June 20, 2016 02:48 PM

Concealing the Purchase of Government Officials

Fredreka Schouten reports in House approves Koch-backed bill to shield donors’ names that the US House of Representatives has passed a measure to conceal the purchase of government officials.

From the post:

The House approved a bill Tuesday that would bar the IRS from collecting the names of donors to tax-exempt groups, prompting warnings from campaign-finance watchdogs that it could lead to foreign interests illegally infiltrating American elections.

The measure, which has the support of House Speaker Paul Ryan, R-Wis., also pits the Obama administration against one of the most powerful figures in Republican politics, billionaire industrialist Charles Koch. Koch’s donor network channels hundreds of millions of dollars each year into groups that largely use anonymous donations to shape policies on everything from health care to tax subsidies. Its leaders have urged the Republican-controlled Congress to clamp down on the IRS, citing free-speech concerns.

The names of donors to politically active non-profit groups aren’t public information now, but the organizations still have to disclose donor information to the IRS on annual tax returns. The bill, written by Rep. Peter Roskam, R-Ill., would prohibit the tax agency from collecting names, addresses or any “identifying information” about donors.

Truth be told, however, “the House” didn’t vote in favor of H.R.5053 – Preventing IRS Abuse and Protecting Free Speech Act.

Rather, two-hundred and forty (240) identified representatives voted in favor of H.R.5053.

Two-hundred and forty representatives purchased by campaign contributions who now wish to keep their contributors secret.

Two-hundred and forty representatives who are, as likely as not, guilty of criminal, financial, sexual or other forms of misconduct that could result in their replacement.

Two-hundred and forty representatives who continue in office only so long as they are not exposed to law enforcement and the public.

Where are you going to invest your time and resources?

Showing solidarity on issues where substantive change isn’t going to happen, or taking back your government from its current purchasers?

PS: In case you think “substantive change” is possible on gun control, consider the unlikely scenario that “assault weapons” are banned from sale. So what? The ones in circulation number in the millions. Net effect of your “victory” would be exactly zero.

by Patrick Durusau at June 20, 2016 12:55 PM

June 19, 2016

Patrick Durusau

How do you skim through a digital book?

How do you skim through a digital book? by Chloe Roberts.

From the post:

We’ve had a couple of digitised books that proved really popular with online audiences. Perhaps partly reflecting the interests of the global population, they’ve been about prostitutes and demons.

I’ve been especially interested in how people have interacted with these popular digitised books. Imagine how you’d pick up a book to look at in a library or bookshop. Would you start from page one, laboriously working through page by page, or would you flip through it, checking for interesting bits? Should we expect any different behaviour when people use a digital book?

We collect data on aggregate (nothing personal or trackable to our users) about what’s being asked of our digitised items in the viewer. With such a large number of views of these two popular books, I’ve got a big enough dataset to get an interesting idea of how readers might be using our digitised books.

Focusing on ‘Compendium rarissimum totius Artis Magicae sistematisatae per celeberrimos Artis hujus Magistros. Anno 1057. Noli me tangere’ (the 18th century one about demons) I’ve mapped the number of page views (horizontal axis) against page number (vertical axis, with front cover at the top), and added coloured bands to represent what’s on those pages.

Chloe captured and then analyzed the behavior of readers of two very popular electronic titles.

She explains her second observation:

Observation 2: People like looking at pictures more than text

by suggesting the text being in Latin and German may explain the fondness for the pictures.

Perhaps, but I have heard the same observation made about Playboy magazine. ;-)

From a documentation/training perspective, Chloe’s technique, applied to digital training materials, could provide guidance on:

  • Length of materials
  • Use of illustrations
  • Organization of materials
  • What material is habitually unread?
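A minimal sketch of how aggregate viewer data might answer the last question, assuming (hypothetically) that the viewer log reduces to a plain list of viewed page numbers. The function name, the threshold parameter and the sample data are illustrative only and are not taken from Chloe’s post:

```python
from collections import Counter


def unread_pages(page_views, total_pages, threshold=0):
    """Return page numbers viewed no more than `threshold` times.

    page_views: iterable of page numbers, one entry per recorded view
    (a hypothetical aggregate log format).
    """
    counts = Counter(page_views)
    return [page for page in range(1, total_pages + 1)
            if counts[page] <= threshold]


# Illustrative data only: a six-page document where readers skip pages 3 and 5.
views = [1, 1, 1, 2, 4, 4, 6, 1, 2, 4]
print(unread_pages(views, total_pages=6))
```

Raising the threshold widens the net from pages never opened to pages rarely opened, which is probably the more useful signal for long training documents.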

If critical material isn’t being read, exhorting newcomers to read more carefully is not the answer.

If security and/or on-boarding reading isn’t happening, as shown by reader behavior, that’s your fault, not the readers’.

Your call: successful staff and customers, or failing staff and customers you can blame for security faults and declining sales.

Choose carefully.

by Patrick Durusau at June 19, 2016 09:40 PM

Electronic Literature Organization

Electronic Literature Organization

From the “What is E-Lit” page:

Electronic literature, or e-lit, refers to works with important literary aspects that take advantage of the capabilities and contexts provided by the stand-alone or networked computer. Within the broad category of electronic literature are several forms and threads of practice, some of which are:

  • Hypertext fiction and poetry, on and off the Web
  • Kinetic poetry presented in Flash and using other platforms
  • Computer art installations which ask viewers to read them or otherwise have literary aspects
  • Conversational characters, also known as chatterbots
  • Interactive fiction
  • Literary apps
  • Novels that take the form of emails, SMS messages, or blogs
  • Poems and stories that are generated by computers, either interactively or based on parameters given at the beginning
  • Collaborative writing projects that allow readers to contribute to the text of a work
  • Literary performances online that develop new ways of writing

The ELO showcase, created in 2006 and with some entries from 2010, provides a selection outstanding examples of electronic literature, as do the two volumes of our Electronic Literature Collection.

The field of electronic literature is an evolving one. Literature today not only migrates from print to electronic media; increasingly, “born digital” works are created explicitly for the networked computer. The ELO seeks to bring the literary workings of this network and the process-intensive aspects of literature into visibility.

The confrontation with technology at the level of creation is what distinguishes electronic literature from, for example, e-books, digitized versions of print works, and other products of print authors “going digital.”

Electronic literature often intersects with conceptual and sound arts, but reading and writing remain central to the literary arts. These activities, unbound by pages and the printed book, now move freely through galleries, performance spaces, and museums. Electronic literature does not reside in any single medium or institution.

I was looking for a recent presentation by Allison Parrish on bots when I encountered Electronic Literature Organization (ELO).

I was attracted by the bot discussion at a recent conference but as you can see, the range of activities of the ELO is much broader.

Enjoy!

by Patrick Durusau at June 19, 2016 09:04 PM

“invisible entities having arcane but gravely important significances”

Allison Parrish tweeted:

https://t.co/sXt6AqEIoZ the “Other, Format” unicode category, full of invisible entities having arcane but gravely important significances

I just could not let a tweet with:

“invisible entities having arcane but gravely important significances”

pass without comment!

As of today, one-hundred and fifty (150) such entities. All with multiple properties.

How many of these “invisible entities” are familiar to you?
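If you want to meet them yourself, here is a minimal sketch that lists the “Other, Format” (Cf) characters using Python’s standard unicodedata module. The function name is my own, and the count it prints depends on the Unicode version bundled with your Python, so it may not match the figure above:

```python
import sys
import unicodedata


def format_characters():
    """Return every code point whose general category is Cf ("Other, Format")."""
    return [chr(cp) for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == "Cf"]


cf = format_characters()
print(len(cf), "Cf characters in Unicode", unicodedata.unidata_version)
for ch in cf[:5]:
    # Supply a fallback in case a code point has no name in this database.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```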

by Patrick Durusau at June 19, 2016 08:23 PM

Formal Methods for Secure Software Construction

Formal Methods for Secure Software Construction by Ben Goodspeed.

Abstract:

The objective of this thesis is to evaluate the state of the art in formal methods usage in secure computing. From this evaluation, we analyze the common components and search for weaknesses within the common workflows of secure software construction. An improved workflow is proposed and appropriate system requirements are discussed. The systems are evaluated and further tools in the form of libraries of functions, data types and proofs are provided to simplify work in the selected system. Future directions include improved program and proof guidance via compiler error messages, and targeted proof steps.

Ben chose Idris for this project, saying:

The criteria for selecting a language for this work were expressive power, theorem proving ability (sufficient to perform universal quantification), extraction/compilation, and performance. Idris has sufficient expressive power to be used as a general purpose language (by design) and has library support for many common tasks (including web development). It supports machine verified proof and universal quantification over its datatypes and can be directly compiled to produce efficiently sized executables with reasonable performance (see section 10.1 for details). Because of these characteristics, we have chosen Idris as the basis for our further work. (at page 57)

The other contenders were Coq, Agda, Haskell, and Isabelle.

Ben provides examples of using Idris and his Proof Driven Development (PDD), but stops well short of solving the problem of secure software construction.

While waiting upon the arrival of viable methods for secure software construction, shouldn’t formal methods be useful in uncovering and documenting failures in current software?

My reasoning: the greater specificity and exactness of formal methods will draw attention to gaps and failures concealed by custom and practice.

Akin to the human eye eliding over mistakes such as “When the the cat runs.”

The average reader “auto-corrects” for the presence of the second “the” in that sentence, even knowing there are two occurrences of the word “the.”

Perhaps that is a better way to say it: Formal methods avoid the human tendency to auto-correct or elide over unknown outcomes in code.
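As a toy illustration of that point (a plain regular-expression check, not a formal method, and with names of my own choosing), a mechanical scan reports the doubled word that a human reader silently skips:

```python
import re

# A word, whitespace, then the same word again (case-insensitive).
DOUBLED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def doubled_words(text):
    """Return each immediately repeated word found in text."""
    return [m.group(1) for m in DOUBLED.finditer(text)]


print(doubled_words("When the the cat runs."))
```

The check has no notion of what the sentence was supposed to say, which is exactly why it does not auto-correct.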

by Patrick Durusau at June 19, 2016 03:51 PM

PSA – Misleading Post On Smartphone Security

If you happen across Your smartphone could be hacked without your knowledge by Jennifer Schlesinger and Andrea Day, posted on CNBC, don’t bother to read it. Dissuade others from reading it.

The three threats as listed by the authors:

  • Unsecure Wi-Fi
  • Operating system flaws
  • Malicious apps

What’s missing?

Hmmm, can you say SS7 vulnerability?

The omission of SS7 vulnerability is particularly disturbing because in some ways, it has the easiest defense.

Think about it for a moment. What do I need as the premise for most (not all) successful SS7 hacks?

Your smartphone number.

Yes, information you give away with every email, contact information listing, website registration, etc. Not only given away, but archived and available to search engines.

If you don’t believe me, try running a web search on your smartphone number.

I understand that your smartphone number is as useful as it is widespread. I’m just pointing out how many times you have tied a noose around your own neck.

The best (partial) defense to SS7 attacks?

Limit the distribution of your smartphone number.

When someone omits a root problem of smartphone security, in a listing of smartphone security issues, how much trust can you put in the rest of their analysis?

by Patrick Durusau at June 19, 2016 02:48 PM

Palantir Hack Report – What’s Missing?

How Hired Hackers Got “Complete Control” Of Palantir by William Alden.

From the post:

Palantir Technologies has cultivated a reputation as perhaps the most formidable data analysis firm in Silicon Valley, doing secretive work for defense and intelligence agencies as well as Wall Street giants. But when Palantir hired professional hackers to test the security of its own information systems late last year, the hackers found gaping holes that left data about customers exposed.

Palantir, valued at $20 billion, prides itself on an ability to guard important secrets, both its own and those entrusted to it by clients. But after being brought in to try to infiltrate these digital defenses, the cybersecurity firm Veris Group concluded that even a low-level breach would allow hackers to gain wide-ranging and privileged access to the Palantir network, likely leading to the “compromise of critical systems and sensitive data, including customer-specific information.”

This conclusion was presented in a confidential report, reviewed by BuzzFeed News, that detailed the results of a hacking exercise run by Veris over three weeks in September and October last year. The report, submitted on October 19, has been closely guarded inside Palantir and is described publicly here for the first time. “Palantir Use Only” is plastered across each page.

It is not known whether Palantir’s systems have ever been breached by real-world intruders. But the results of the hacking exercise — known as a “red team” test — show how a company widely thought to have superlative ability to safeguard data has struggled with its own data security.

The red team intruders, finding that Palantir lacked crucial internal defenses, ultimately “had complete control of PAL’s domain,” the Veris report says, using an acronym for Palantir. The report recommended that Palantir “immediately” take specific steps to improve its data security.

“The findings from the October 2015 report are old and have long since been resolved,” Lisa Gordon, a Palantir spokesperson, said in an emailed statement. “Our systems and our customers’ information were never at risk. As part of our best practices, we conduct regular reviews and tests of our systems, like every other technology company does.”

Alden gives a lengthy summary of the report, but since Palantir claims the reported risks “…have long since been resolved,” where is the Veris report?

Describing issues in glittering generalities isn’t going to improve anyone’s cybersecurity stance.

So I have to wonder, is How Hired Hackers Got “Complete Control” Of Palantir an extended commercial for Veris? Is it an attempt to sow doubt and uncertainty among Palantir customers?

End of the day, Alden’s summary can be captured in one sentence:

Veris attackers took and kept control of Palantir’s network from day one to the end of the exercise, evading defenders all the way.

How useful is that one sentence summary in improving your cybersecurity stance?

That’s what I thought as well.

PS: I’m interested in pointers to any “leaked” copies of the Veris report on Palantir.

by Patrick Durusau at June 19, 2016 01:31 PM

June 18, 2016

Patrick Durusau

IRS E-File Bucket – Internet Archive

IRS E-File Bucket courtesy of Carl Malamud and Public.Resource.Org.

From the webpage:

This bucket contains a mirror of the IRS e-file release as of June 16, 2016. You may access the source files at https://aws.amazon.com/public-data-sets/irs-990/. The present bucket may or may not be updated in the future.

To access this bucket, use the download links.

Note that tarballs of image scans from 2002-2015 are also available in this IRS 990 Forms collection.

Many thanks to the Internal Revenue Service for making this information available. Here is their announcement on June 16, 2016. Here is a statement from Public.Resource.Org congratulating the IRS on a job well done.

As I noted in IRS 990 Filing Data (2001 to date):

990* disclosures aren’t detailed enough to pinch but when combined with other data, say leaked data, the results can be remarkable.

It’s up to you to see that public disclosures pinch.

by Patrick Durusau at June 18, 2016 09:41 PM

Where Has Sci-Hub Gone?

While I was writing about the latest EC idiocy (link tax), I was reminded of Sci-Hub.

Just checking to see if it was still alive, I tried http://sci-hub.io/.

404 by standard DNS service.

If you are having the same problem, Mike Masnick reports in Sci-Hub, The Repository Of ‘Infringing’ Academic Papers Now Available Via Telegram, you can access Sci-Hub via:

I’m not on Telegram, yet, but that may be changing soon. ;-)

BTW, while writing this update, I stumbled across: The New Napster: How Sci-Hub is Blowing Up the Academic Publishing Industry by Jason Shen.

From the post:


This is obviously piracy. And Elsevier, one of the largest academic journal publishers, is furious. In 2015, the company earned $1.1 billion in profits on $2.9 billion in revenue [2] and Sci-hub directly attacks their primary business model: subscription service it sells to academic organizations who pay to get access to its journal articles. Elsevier filed a lawsuit against Sci-Hub in 2015, claiming Sci-hub is causing irreparable injury to the organization and its publishing partners.

But while Elsevier sees Sci-Hub as a major threat, for many scientists and researchers, the site is a gift from the heavens, because they feel unfairly gouged by the pricing of academic publishing. Elsevier is able to boast a lucrative 37% profit margin because of the unusual (and many might call exploitative) business model of academic publishing:

  • Scientists and academics submit their research findings to the most prestigious journal they can hope to land in, without getting any pay.
  • The journal asks leading experts in that field to review papers for quality (this is called peer-review and these experts usually aren’t paid)
  • Finally, the journal turns around and sells access to these articles back to scientists/academics via the organization-wide subscriptions at the academic institution where they work or study

There’s piracy afoot, of that I have no doubt.

Elsevier:

  • Relies on research it does not sponsor
  • Research results are submitted to it for free
  • Research is reviewed for free
  • Research is published in journals of value only because of the free contributions to them
  • Elsevier makes a 37% profit off of that free content
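The margin can be checked against the 2015 figures quoted in Jason’s post. This is only a sketch using the rounded numbers from the quote, so the result is approximate:

```python
profit_billion = 1.1   # 2015 profit quoted above, in billions of dollars
revenue_billion = 2.9  # 2015 revenue quoted above, in billions of dollars

margin = profit_billion / revenue_billion
print(f"{margin:.1%}")
```

With these rounded inputs the margin comes out just under 38%, consistent with the 37% quoted.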

There is piracy but Jason fails to point to Elsevier as the pirate.

Sci-Hub/Alexandra Elbakyan is re-distributing intellectual property that was stolen by Elsevier from the academic community, for its own gain.

It’s time to bring Elsevier’s reign of terror against the academic community to an end. Support Sci-Hub in any way possible.

by Patrick Durusau at June 18, 2016 08:24 PM

A Plausible Explanation For The EC Human Brain Project

I have puzzled for years over how to explain the EC’s Human Brain Project. See The EC Brain if you need background on this ongoing farce.

While reading Reject Europe’s Plans To Tax Links and Platforms by Jeremy Malcolm, I suddenly understood the motivation for the Human Brain Project!

From the post:

A European Commission proposal to give new copyright-like veto powers to publishers could prevent quotation and linking from news articles without permission and payment. The Copyright for Creativity coalition (of which EFF is a member) has put together an easy survey and answering guide to guide you through the process of submitting your views before the consultation for this “link tax” proposal winds up on 15 June.

Since the consultation was opened, the Commission has given us a peek into some of the industry pressures that have motivated what is, on the face of it, otherwise an inexplicable proposal. In the synopsis report that accompanied the release of its Communication on Online Platforms, it writes that “Right-holders from the images sector and press publishers mention the negative impact of search engines and news aggregators that take away some of the traffic on their websites.” However, this claim is counter-factual, as search engines and aggregators are demonstrably responsible for driving significant traffic to news publishers’ websites. This was proved when a study conducted in the wake of introduction of a Spanish link tax resulted in a 6% decline in traffic to news websites, which was even greater for the smaller sites.

There is a severe shortage of human brains at the European Commission! The Human Brain Project is a failing attempt to remedy that shortage of human brains.

Before you get angry, Europe is full of extremely fine brains. But that isn’t the same thing as saying they are found at the European Commission.

Consider for example, the farcical request for comments, having previously decided the outcome as cited above. EC customary favoritism and heavy-handedness.

I would not waste electrons submitting comments to the EC on this issue.

Spend your time mining EU news sources and making fair use of their content. Every now and again, gather up your links and send them to the publications and copy the EC, so publications can see the benefits of your linking versus the overhead of the EC.

As the Spanish link tax experience proves, link taxes may deceive property cultists into expecting a windfall. In truth, their revenue will decrease and what revenue is collected will go to the EC.

There’s the mark of a true EC solution:

The intended “beneficiary” is worse off and the EC absorbs what revenue, if any, results.

by Patrick Durusau at June 18, 2016 07:24 PM

Online Surveillance: …ISIS and beyond [Social Media “chaff”]

If you ever doubted “anti-terror group surveillance tools” should always be called “group surveillance tools,” New online ecology of adversarial aggregates: ISIS and beyond. Science, 2016; 352 (6292): 1459 DOI: 10.1126/science.aaf0675 by N. F. Johnson, et al., puts those doubts to rest.

Unintentionally no doubt, but the “…ISIS and beyond” part of the title signals this technique is not limited to ISIS.

Consider the abstract:

Support for an extremist entity such as Islamic State (ISIS) somehow manages to survive globally online despite considerable external pressure and may ultimately inspire acts by individuals having no history of extremism, membership in a terrorist faction, or direct links to leadership. Examining longitudinal records of online activity, we uncovered an ecology evolving on a daily time scale that drives online support, and we provide a mathematical theory that describes it. The ecology features self-organized aggregates (ad hoc groups formed via linkage to a Facebook page or analog) that proliferate preceding the onset of recent real-world campaigns and adopt novel adaptive mechanisms to enhance their survival. One of the predictions is that development of large, potentially potent pro-ISIS aggregates can be thwarted by targeting smaller ones.

Here’s the abstract re-written for the anti-war movement of the 1960’s:

Support for extremists such as the anti-Vietnam War movement somehow manages to survive nationally online despite considerable external pressure and may ultimately inspire acts by individuals having no history of extremism, membership in an anti-war faction, or direct links to leadership. Examining longitudinal records of online activity, we uncovered an ecology evolving on a daily time scale that drives online support, and we provide a mathematical theory that describes it. The ecology features self-organized aggregates (ad hoc groups formed via linkage to a Facebook page or analog) that proliferate preceding the onset of recent real-world campaigns and adopt novel adaptive mechanisms to enhance their survival. One of the predictions is that development of large, potentially potent anti-war aggregates can be thwarted by targeting smaller ones.

Here’s the abstract re-written for the civil rights movement of the 1960’s:

Support for extremists such as SNCC somehow manages to survive nationally online despite considerable external pressure and may ultimately inspire acts by individuals having no history of extremism, membership in a SNCC faction, or direct links to leadership. Examining longitudinal records of online activity, we uncovered an ecology evolving on a daily time scale that drives online support, and we provide a mathematical theory that describes it. The ecology features self-organized aggregates (ad hoc groups formed via linkage to a Facebook page or analog) that proliferate preceding the onset of recent real-world campaigns and adopt novel adaptive mechanisms to enhance their survival. One of the predictions is that development of large, potentially potent SNCC aggregates can be thwarted by targeting smaller ones.

Here’s the abstract re-written for the gay rights movement:

Support for extremists such as the gay rights movement somehow manages to survive nationally online despite considerable external pressure and may ultimately inspire acts by individuals having no history of extremism, membership in a gay rights faction, or direct links to leadership. Examining longitudinal records of online activity, we uncovered an ecology evolving on a daily time scale that drives online support, and we provide a mathematical theory that describes it. The ecology features self-organized aggregates (ad hoc groups formed via linkage to a Facebook page or analog) that proliferate preceding the onset of recent real-world campaigns and adopt novel adaptive mechanisms to enhance their survival. One of the predictions is that development of large, potentially potent gay rights aggregates can be thwarted by targeting smaller ones.

The government has admitted to the use of surveillance against all three, civil rights, anti-Vietnam war, and gay rights, which in the words of Justice Holmes, “…was an outrage which the Government now regrets….”

I mention those cases so the current fervor against “terrorists” doesn’t blind us to the need for counters to every technique for disrupting “terrorists.”

“Terrorists” being a label applied to people with whom some group or government disagrees. Frequently almost entirely fictional, as in the case of the United States. The FBI recruits the mentally ill in order to provide some credence to its hunt for terrorists in the US.

One obvious counter to the aggregate analysis proposed by the authors would be a series of AI-driven aggregates that are auto-populated and supplied with content derived from human users.

Defeating suppression with a large number of “fake” aggregates. Think of it as social media “chaff.”

If you think about it, separating wheat from chaff is a subject identity issue. ;-)

Production of social media “chaff” and influencing papers such as this one is an open research subject.

If you have a cause, I have some time.

by Patrick Durusau at June 18, 2016 06:26 PM

Modelling Stems and Principal Part Lists (Attic Greek)

Modelling Stems and Principal Part Lists by James Tauber.

From the post:

This is part 0 of a series of blog posts about modelling stems and principal part lists, particularly for Attic Greek but hopefully more generally applicable. This is largely writing up work already done but I’m doing cleanup as I go along as well.

A core part of the handling of verbs in the Morphological Lexicon is the set of terminations and sandhi rules that can generate paradigms attested in grammars like Louise Pratt’s The Essentials of Greek Grammar. Another core part is the stem information for a broader range of verbs usually conveyed in works like Pratt’s in the form of lists of principal parts.

A rough outline of future posts is:

  • the sources of principal part lists for this work
  • lemmas in the Pratt principal parts
  • lemma differences across lists
  • what information is captured in each of the lists individually
  • how to model a merge of the lists
  • inferring stems from principal parts
  • stems, terminations and sandhi
  • relationships between stems
  • ???

I’ll update this outline with links as posts are published.

(emphasis in original)

A welcome reminder of projects that transcend the ephemera that is social media.

Or should I say “modern” social media?

The texts we parse so carefully were originally spoken, recorded and copied, repeatedly, without the benefit of modern reference grammars and/or dictionaries.

Enjoy!

by Patrick Durusau at June 18, 2016 12:47 AM

Volumetric Data Analysis – yt

One of those rotating homepages:

Volumetric Data Analysis – yt

yt is a python package for analyzing and visualizing volumetric, multi-resolution data from astrophysical simulations, radio telescopes, and a burgeoning interdisciplinary community.

Quantitative Analysis and Visualization

yt is more than a visualization package: it is a tool to seamlessly handle simulation output files to make analysis simple. yt can easily knit together volumetric data to investigate phase-space distributions, averages, line integrals, streamline queries, region selection, halo finding, contour identification, surface extraction and more.

Many formats, one language

yt aims to provide a simple uniform way of handling volumetric data, regardless of where it is generated. yt currently supports FLASH, Enzo, Boxlib, Athena, arbitrary volumes, Gadget, Tipsy, ART, RAMSES and MOAB. If your data isn’t already supported, why not add it?

From the non-rotating part of the homepage:

To get started using yt to explore data, we provide resources including documentation, workshop material, and even a fully-executable quick start guide demonstrating many of yt’s capabilities.

But if you just want to dive in and start using yt, we have a long list of recipes demonstrating how to do various tasks in yt. We even have sample datasets from all of our supported codes on which you can test these recipes. While yt should just work with your data, here are some instructions on loading in datasets from our supported codes and formats.

Professional astronomical data and tools like yt put exploration of the universe at your fingertips!

Enjoy!

by Patrick Durusau at June 18, 2016 12:23 AM

June 17, 2016

Patrick Durusau

Hacking Any Facebook Account – SS7 Weakness

How to Hack Someones Facebook Account Just by Knowing their Phone Numbers by Swati Khandelwal.

From the post:

Hacking Facebook account is one of the major queries on the Internet today. It’s hard to find — how to hack Facebook account, but researchers have just proven by taking control of a Facebook account with only the target’s phone number and some hacking skills.

Yes, your Facebook account can be hacked, no matter how strong your password is or how much extra security measures you have taken. No joke!

Hackers with skills to exploit the SS7 network can hack your Facebook account. All they need is your phone number.

The weaknesses in the part of global telecom network SS7 not only let hackers and spy agencies listen to personal phone calls and intercept SMSes on a potentially massive scale but also let them hijack social media accounts to which you have provided your phone number.

Swati’s post has the details and a video of the hack in action.

Of greater interest than hacking Facebook accounts, however, is the weakness in the SS7 network. Hacking Facebook accounts is good for intelligence gathering, annoying the defenseless, etc., but a fundamental weakness in the telecom network is something different.

Swati quotes a Facebook clone as saying:

“Because this technique [SS7 exploitation] requires significant technical and financial investment, it is a very low risk for most people,”

Here’s the video from Swati’s post (2:42 in length):

Having watched it, can you point out the “…significant technical and financial investment…” involved in that hack?

What investment would you make for a hack that opens up Gmail, Twitter, WhatsApp, Telegram, Facebook, any service that uses SMS, to attack?

Definitely a hack for your intelligence gathering toolkit.

by Patrick Durusau at June 17, 2016 02:12 PM

Visualizing your Titan graph database:…

Visualizing your Titan graph database: An update by Marco Liberati.

From the post:

Last summer, we wrote a blog with our five simple steps to visualizing your Titan graph database with KeyLines. Since then TinkerPop has emerged from the Apache Incubator program with TinkerPop3, and the Titan team have released v1.0 of their graph database:

  • TinkerPop3 is the latest major reincarnation of the graph project, pulling together the multiple ventures into a single united ecosystem.
  • Titan 1.0 is the first stable release of the Titan graph database, based on the TinkerPop3 stack.

We thought it was about time we updated our five-step process, so here’s:

Not exactly five (5) steps because you have to acquire a KeyLines trial key, etc.

A great endorsement of the much improved installation process for TinkerPop3 and Titan 1.0.

Enjoy!

by Patrick Durusau at June 17, 2016 01:42 PM

RSA Cybersecurity Poverty Index [Safety in Numbers?]

RSA Research: 75% of Organizations are at Significant Risk of Cyber Incidents

Highlights from the post:

  • For the second straight year, 75% of survey respondents have a significant cybersecurity risk exposure
  • Organizations that report more business-impacting security incidents are 65% more likely to have advanced cyber maturity capabilities
  • Half of those surveyed assess their incident response capabilities as either “ad hoc” or “nonexistent”
  • Less mature Organizations continue to mistakenly implement more perimeter technologies as a stop gap measure to prevent incidents from occurring
  • Government and Energy ranked lowest among industries in cyber preparedness
  • American entities continue to rank themselves behind both APJ and EMEA in overall cyber maturity 

Relying on cybersecurity poverty to make others more likely targets is like increasing the size of a herd of sheep to reduce the odds of a wolf carrying off any particular one.

(image omitted: flock of sheep)

That works, but is of little consolation to the sheep that is carried off.

Are you depending on other sheep being carried off?

by Patrick Durusau at June 17, 2016 12:47 AM

June 16, 2016

Patrick Durusau

Are Non-AI Decisions “Open to Inspection?”

Ethics in designing AI Algorithms — part 1 by Michael Greenwood.

From the post:

As our civilization becomes more and more reliant upon computers and other intelligent devices, there arises specific moral issue that designers and programmers will inevitably be forced to address. Among these concerns is trust. Can we trust that the AI we create will do what it was designed to without any bias? There’s also the issue of incorruptibility. Can the AI be fooled into doing something unethical? Can it be programmed to commit illegal or immoral acts? Transparency comes to mind as well. Will the motives of the programmer or the AI be clear? Or will there be ambiguity in the interactions between humans and AI? The list of questions could go on and on.

Imagine if the government uses a machine-learning algorithm to recommend applications for student loan approvals. A rejected student and or parent could file a lawsuit alleging that the algorithm was designed with racial bias against some student applicants. The defense could be that this couldn’t be possible since it was intentionally designed so that it wouldn’t have knowledge of the race of the person applying for the student loan. This could be the reason for making a system like this in the first place — to assure that ethnicity will not be a factor as it could be with a human approving the applications. But suppose some racial profiling was proven in this case.

If directed evolution produced the AI algorithm, then it may be impossible to understand why, or even how. Maybe the AI algorithm uses the physical address data of candidates as one of the criteria in making decisions. Maybe they were born in or at some time lived in poverty‐stricken regions, and that in fact, a majority of those applicants who fit these criteria happened to be minorities. We wouldn’t be able to find out any of this if we didn’t have some way to audit the systems we are designing. It will become critical for us to design AI algorithms that are not just robust and scalable, but also easily open to inspection.

While I can appreciate the desire to make AI algorithms that are “…easily open to inspection…,” I feel compelled to point out that human decision making has resisted such openness for thousands of years.

There are the tales we tell each other about “rational” decision making, but those aren’t how decisions are made; rather, they are how we justify decisions to ourselves and others. Not exactly the same thing.

Recall the parole granting behavior of Israeli judges, which depended upon the proximity to their last meal. Certainly all of those judges would argue for their “rational” decisions, but meal time was a better predictor than any other. (Extraneous factors in judicial decisions)

My point being that if we struggle to even articulate the actual basis for non-AI decisions, where is our model for making AI decisions “open to inspection?” What would that look like?

You could say, for example, no discrimination based on race. OK, but that’s not going to work if you want to purposely set up scholarships for minority students.

When you object, “…that’s not what I meant! You know what I mean!…,” well, I might, but try convincing an AI that has no social context of what you “meant.”

The openness of AI decisions to inspection is an important issue but the human record in that regard isn’t encouraging.

by Patrick Durusau at June 16, 2016 09:46 PM

IRS 990 Filing Data (2011 to date)

IRS 990 Filing Data Now Available as an AWS Public Data Set

From the post:

We are excited to announce that over one million electronic IRS 990 filings are available via Amazon Simple Storage Service (Amazon S3). Filings from 2011 to the present are currently available and the IRS will add new 990 filing data each month.

(image omitted)

Form 990 is the form used by the United States Internal Revenue Service (IRS) to gather financial information about nonprofit organizations. By making electronic 990 filing data available, the IRS has made it possible for anyone to programmatically access and analyze information about individual nonprofits or the entire nonprofit sector in the United States. This also makes it possible to analyze it in the cloud without having to download the data or store it themselves, which lowers the cost of product development and accelerates analysis.

Each electronic 990 filing is available as a unique XML file in the “irs-form-990” S3 bucket in the AWS US East (N. Virginia) region. Information on how the data is organized and what it contains is available on the IRS 990 Filings on AWS Public Data Set landing page.
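As a rough illustration of the programmatic access described above, the sketch below pulls a few fields out of one filing's XML using only the Python standard library. The element names (`TaxYr`, `EIN`), the namespace and the sample document are made up for illustration; the real layout is described on the landing page, and fetching a file from the bucket is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Bucket name as quoted in the announcement; it is not contacted here.
BUCKET = "irs-form-990"


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_filing(xml_text, wanted):
    """Return the first non-empty text found for each wanted element name."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name in wanted and name not in found and element.text:
            found[name] = element.text.strip()
    return found


# Hypothetical stand-in for a downloaded filing, not real IRS data.
SAMPLE = """<Return xmlns="http://example.org/hypothetical">
  <ReturnHeader>
    <TaxYr>2014</TaxYr>
    <Filer><EIN>000000000</EIN></Filer>
  </ReturnHeader>
</Return>"""

summary = summarize_filing(SAMPLE, {"TaxYr", "EIN"})
print(summary)
```

Matching on local names sidesteps having to know the schema's namespace in advance, at the cost of ignoring where in the document an element appears.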

Some of the forms and instructions that will help you make sense of the data reported:

990 – Form 990 Return of Organization Exempt from Income Tax, Annual Form 990 Requirements for Tax-Exempt Organizations

990-EZ – 2015 Form 990-EZ, Instructions for IRS 990 EZ – Internal Revenue Service

990-PF – 2015 Form 990-PF, 2015 Instructions for Form 990-PF

As always, use caution with law-related data as words may have unusual nuances and/or unexpected meanings.

These forms and instructions are only a tiny part of a vast iceberg of laws, regulations, rulings, court decisions and the like.

990* disclosures aren’t detailed enough to pinch but when combined with other data, say leaked data, the results can be remarkable.

by Patrick Durusau at June 16, 2016 09:21 PM

Securing Captured Intelligence

In Intelligence Gathering… [Capturing Intelligence] I closed with the thought that securing of captured intelligence wasn’t discussed in Intelligence Gathering & Its Relationship to the Penetration Testing Process by Dimitar Kostadinov.

Security wasn’t Dimitar’s focus so the omission was understandable, but I can’t recall seeing any discussion of securing the results of intelligence gathering. Can you?

Are intelligence results by default subject to the same (lack of) security that most of us practice on our computers?

That’s ironic given that the goal of intelligence gathering is the penetration of other computers.

If your first response is that you have encrypted your hard drive, consider Indefinite prison for suspect who won’t decrypt hard drives, feds say by David Kravets.

I agree that the suspect in that case has the far better argument (and case law), but on the other hand, you will note he has been in prison for seven months while the government argues it “knows” he is guilty.

The government’s claim of knowledge is puzzling because if they have proof of his guilt, why not proceed to trial? Ah, yes, that is an inconvenient question for the prosecution.

As I said, the case law appears to be on the side of the suspect but the prosecution has still cost him months of his life and depending on the decision of the Third Circuit, that could stretch into years.

An encrypted hard drive and refusal to unlock it may save you, at least for a while, from prosecution for hacking, but how much time do you want to spend in jail just for having an encrypted drive?

I’m not saying an encrypted drive is a bad idea. It is a nice first line of defense, but it isn’t a slam dunk when it comes to concealing information.

Within an encrypted drive, my concealment of captured hacking intelligence should meet the following requirements:

  1. The captured hacking intelligence should be concealed in plain sight. That is, a casual observer should not be able to distinguish the captured hacking intelligence file from any other file of a similar nature.
  2. Even if the captured hacking intelligence file is identified, it should not be possible for a prosecutor to prove specified content was in fact recorded in that file.
  3. As a counter to whatever fanciful claims prosecutors make, it should be possible to produce an innocent text from the captured intelligence file in a repeatable way. One that does not enable prosecutors to do the same thing with specified content.
  4. Finally, it must be possible to effectively use and supplement the captured hacking intelligence content.

Notice that brevity is not a requirement. Storage space is virtually unlimited so unless you are creating an encyclopedia for one hacking job, I don’t see that as an issue.

Other requirements?

Suggestions for solutions that meet the requirements I outlined above?

by Patrick Durusau at June 16, 2016 08:12 PM

Intelligence Gathering… [Capturing Intelligence]

Intelligence Gathering & Its Relationship to the Penetration Testing Process by Dimitar Kostadinov.

From the post:

Penetration testing simulates real cyber-attacks, either directly or indirectly, to circumvent security systems and gain access to a company’s information assets. The whole process, however, is more than just playing automated tools and then proceed to write down a report, submit it and collect the check.

The Penetration Testing Execution Standard (PTES) is a norm adopted by leading members of the security community as a way to establish a set of fundamental principles of conducting a penetration test. Seven phases lay the foundations of this standard: Pre-engagement Interactions, Information Gathering, Threat Modeling, Exploitation, Post Exploitation, Vulnerability Analysis, Reporting.

(image omitted)

Intelligence gathering is the first stage in which direct actions against the target are taken. One of the most important ability a pen tester should possess is to know how to learn as much as possible about a targeted organization without the test has even begun – for instance, how this organization operates and its day-to-day business dealings – but most of all, he should make any reasonable endeavor to learn more about its security posture and, self-explanatory, how this organization can be attacked effectively. So, every piece of information that a pen tester can gather will provide invaluable insights into essential characteristics of the security systems in place.

Great introduction to intelligence gathering with links to some of the more obvious tools and coverage of common techniques.

As your tradecraft improves, so will your list of tools and techniques.

My only reservation is that Dimitar doesn’t mention how you capture the intelligence you have gathered.

Text document edited in Emacs?

Word document (shudder) under control of a SharePoint (shudder, shudder) server?

Spreadsheet?

Graph/Topic Map?

Intelligence gathering results in non-linear discovery of arbitrary relationships and facts. Don’t limit yourself to a linear capture methodology, however necessary linear reports are for others.

My vote is with graphs/topic maps.
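To make the contrast with linear notes concrete, here is a minimal sketch of graph-style capture in Python. The class name, the relation labels and the sample facts are all hypothetical; a real topic map would add subject identity and merging rules that this toy leaves out.

```python
from collections import defaultdict


class IntelGraph:
    """Toy store of (subject, relation, object) facts with their sources."""

    def __init__(self):
        self._edges = defaultdict(set)   # node -> {(relation, other node)}
        self._sources = {}               # (subject, relation, object) -> source

    def add(self, subject, relation, obj, source):
        """Record a fact and index it from both ends."""
        self._edges[subject].add((relation, obj))
        self._edges[obj].add(("inverse:" + relation, subject))
        self._sources[(subject, relation, obj)] = source

    def about(self, node):
        """Everything recorded about a node, in a stable order."""
        return sorted(self._edges.get(node, set()))

    def source_of(self, subject, relation, obj):
        """Where a fact came from, or None if it was never recorded."""
        return self._sources.get((subject, relation, obj))


# Hypothetical findings, entered in whatever order they turn up.
graph = IntelGraph()
graph.add("example.com", "mail_handled_by", "mx.example.com", "DNS lookup")
graph.add("J. Doe", "works_at", "example.com", "conference bio")
graph.add("J. Doe", "uses_address", "jdoe@example.com", "mailing list post")

print(graph.about("example.com"))
```

Because each fact is indexed from both ends, asking about `example.com` surfaces the employee found later without re-reading notes in the order they were written.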

Since he didn’t mention recording your intelligence, Dimitar also doesn’t discuss how you secure your captured intelligence. But that’s a topic for another post.

by Patrick Durusau at June 16, 2016 02:19 PM

A Taste of the DNC

GUCCIFER 2.0 DNC’S SERVERS HACKED BY A LONE HACKER by Guccifer2.

From the post:

Worldwide known cyber security company CrowdStrike announced that the Democratic National Committee (DNC) servers had been hacked by “sophisticated” hacker groups.

I’m very pleased the company appreciated my skills so highly))) But in fact, it was easy, very easy.

Guccifer may have been the first one who penetrated Hillary Clinton’s and other Democrats’ mail servers. But he certainly wasn’t the last. No wonder any other hacker could easily get access to the DNC’s servers.

Shame on CrowdStrike: Do you think I’ve been in the DNC’s networks for almost a year and saved only 2 documents? Do you really believe it?

Here are just a few docs from many thousands I extracted when hacking into DNC’s network.

A taste of what was liberated from the DNC servers, including:

  • Donald Trump Report.
  • DNC donor lists (compare to FEC records).
  • A secret document from Clinton’s days as Secretary of State.
  • A scattering of other documents.

The main part of the papers were given to Wikileaks.

Sigh.

Hopefully that won’t mean sanitized documents but we will have to wait and see. Remember the Afghan War Diaries? Edited so as to not discomfort the U.S. government too much.

by Patrick Durusau at June 16, 2016 01:24 AM

If You Believe In OpenAccess, Do You Practice OpenAccess?

CSC-OpenAccess LIBRARY

From the webpage:

CSC Open-Access Library aim to maintain and develop access to journal publication collections as a research resource for students, teaching staff, researchers and industrialists.

You can see a complete listing of the journals here.

Before you protest these are not Science or Nature, remember that Science and Nature did not always have the reputations they do today.

Let the quality of your work bolster the reputations of open access publications and attract others to them.

by Patrick Durusau at June 16, 2016 12:48 AM

I’ll See You The FBI’s 411.9 million images and raise 300 million more, per day

FBI Can Access Hundreds of Millions of Face Recognition Photos by Jennifer Lynch.

From the post:

Today the federal Government Accountability Office (GAO) finally published its exhaustive report on the FBI’s face recognition capabilities. The takeaway: FBI has access to hundreds of millions more photos than we ever thought. And the Bureau has been hiding this fact from the public—in flagrant violation of federal law and agency policy—for years.

According to the GAO Report, FBI’s Facial Analysis, Comparison, and Evaluation (FACE) Services unit not only has access to FBI’s Next Generation Identification (NGI) face recognition database of nearly 30 million civil and criminal mug shot photos, it also has access to the State Department’s Visa and Passport databases, the Defense Department’s biometric database, and the drivers license databases of at least 16 states. Totaling 411.9 million images, this is an unprecedented number of photographs, most of which are of Americans and foreigners who have committed no crimes.

I understand and share the concern over the FBI’s database of 411.9 million images from identification sources, but let’s be realistic about the FBI’s share of all the image data.

Not an exhaustive list but:

Facebook alone is equaling the FBI photo count every 1.3 days. Moreover, Facebook data is tied to both Facebook and very likely, other social media data, unlike my driver’s license.

Instagram takes a little over 5 days to exceed the FBI image count, but like the little engine that could, it keeps trying.

I’m not sure how to count YouTube’s 300 hours of video every minute.

No reliable counts are available for porn images, but content streamed from Pornhub in 2015 accounted for 1,892 petabytes of data.

The Pornhub data stream includes a lot of duplication but finding non-religious and reliable stats on porn is difficult. Try searching for statistics on porn images. Speculation, guesses, etc.
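The Facebook comparison above is simple division, sketched here in Python. The 300 million uploads per day is taken from this post's title and treated as an assumption. With that rate the result comes out a little under 1.4 days, in the same neighborhood as the figure quoted above.

```python
FBI_IMAGES_MILLIONS = 411.9        # GAO total quoted in the excerpt above
FACEBOOK_DAILY_MILLIONS = 300.0    # assumption: the per-day figure in the title


def days_to_match(total_millions, per_day_millions):
    """Days of uploads needed to equal a fixed image count."""
    return total_millions / per_day_millions


facebook_days = days_to_match(FBI_IMAGES_MILLIONS, FACEBOOK_DAILY_MILLIONS)
print(f"Facebook equals the FBI count in about {facebook_days:.2f} days")
```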

Based on those figures, it’s fair to say the number of images available to the FBI is somewhere north of 100 billion and growing.

Oh, you think non-public photos are off-limits to the FBI?

Hmmm, so is lying to federal judges, or so they say.

The FBI may say they are following safeguards, etc., but once an agency develops a culture of lying “in the public’s interest,” why would you ever believe them?

If you believe the FBI now, shouldn’t you say: Shame on me?

by Patrick Durusau at June 16, 2016 12:29 AM

June 15, 2016

Patrick Durusau

Judicial Decision Making, Pulling Back the Curtain (Miranda v. Arizona)

Miranda v. Arizona: Exploring Primary Sources Behind the Supreme Court Case by Stephen Wesson.

From the post:

“You have the right to remain silent….” These words, and the rest of the legal warning that follows, are so well-known that they’ve almost become a synonym for “You’re under arrest.” They occupy such a familiar place in popular culture that it might seem as though they’d been part of U.S. law for centuries. However, the now-ubiquitous Miranda warning only came into being fifty years ago, when the Supreme Court ruled that the rights of a criminal suspect, Ernesto Miranda, had been violated because he had not been informed of his Constitutional protections against self-incrimination.

The Library of Congress is marking this landmark anniversary with the launch of Miranda v. Arizona: The Rights to Justice, an online presentation of historical documents that shed light on the arguments around, and the reaction to, the Miranda ruling of 1966. These documents, which include papers written by and for several Supreme Court justices, allow students to explore the issues discussed by the justices as they considered the ramifications of the case. In addition, letters from law enforcement officers and members of the public illuminate the contentious public debate that erupted after the ruling.

One particularly powerful document for students to analyze is a page from a memorandum that associate justice William Brennan sent to chief justice Earl Warren about the case. Acknowledging that his 21-page response is lengthy, Brennan explains, “this will be one of the most important opinions of our time…”

He then focuses on two words from Warren’s opinion that he says go “to the basic thrust of the approach to be taken.” He expounds,

An important collection of documents, not only as background to Miranda v. Arizona but also as insight into decision making in the Supreme Court.

Decisions are announced by the media in sound-bite sized chunks, which fail to portray the complexity of Court opinions, much less the process by which they are created.

I can think of any number of cases that merit this sort of treatment or even deeper, inter-linked collections of documents.

Enjoy!

by Patrick Durusau at June 15, 2016 06:11 PM

Mis-Direction: Possible What3Words App

Take a minute to visit https://map.what3words.com/ or my post Wrigley Field: 1060 W Addison St, Chicago, IL or digits.bucked.talent? (3-Word Addresses), or this post won’t be as useful as it could be.

In a nutshell, https://map.what3words.com/ has created a 3 by 3 meter grid on the Earth’s surface and assigned each block a three-word name. For the convenience of people accustomed to more conventional addresses, where available, you can submit an address and get the three-word name for that block back.

Excellent potential for a project named “Mis-Direction,” which would need an innocent name as a smartphone app.

You send someone a three-word block name and when displayed on their smartphone, it maps to the “canonical” location. Anyone using your phone will get that result.

However, if you enter a 5-digit code while the location is displayed, without any prompt or signal, the actual location intended by the sender is revealed.

Would require a mapping table between 3-word name as sent and 3-word name as intended, and the locations have to be plausible to any third party who might be tracking the communication or using your phone.

I would suggest allowing 5 tries to get the correct number because locations for demonstrations and other activities need to be operationally secure for only a matter of hours.
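A minimal sketch of that re-mapping idea in Python, under stated assumptions: the class and method names are invented, the “intended” block name is a made-up placeholder, and wiping the table after the fifth wrong code is one possible reading of “allowing 5 tries,” not a worked-out design.

```python
import hmac

MAX_TRIES = 5  # wrong codes tolerated before the table is wiped


class MisDirection:
    """Maps a three-word name as sent to the name actually intended."""

    def __init__(self, table, code):
        self._table = dict(table)   # name as sent -> name as intended
        self._code = code           # the 5-digit code, agreed out of band
        self._tries_left = MAX_TRIES

    def displayed(self, sent):
        """What anyone picking up the phone sees: the canonical location."""
        return sent

    def reveal(self, sent, entered):
        """Return the intended name for a correct code, else the decoy."""
        if self._tries_left <= 0:
            return sent
        # Constant-time comparison avoids leaking how much of the code matched.
        if hmac.compare_digest(entered.encode(), self._code.encode()):
            return self._table.get(sent, sent)
        self._tries_left -= 1
        if self._tries_left == 0:
            self._table.clear()     # nothing left to recover after 5 misses
        return sent


# "digits.bucked.talent" is the Wrigley Field example from the earlier post;
# "alpha.bravo.charlie" is a made-up stand-in for the intended block.
app = MisDirection({"digits.bucked.talent": "alpha.bravo.charlie"}, "12345")
print(app.displayed("digits.bucked.talent"))
print(app.reveal("digits.bucked.talent", "12345"))
```

A wrong code returns the same decoy a casual user would see, so a failed guess looks no different from ordinary use of the map.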

After that, anyone can follow the trail of emergency vehicles to a location that was a closely held secret only hours before.

It isn’t clear if the uptake on What3Words will be broad enough to have an impact at large political gatherings in the United States this year but the same re-mapping principle with password applies to more conventional mapping techniques as well.

by Patrick Durusau at June 15, 2016 05:24 PM

Investigative journalism tools

Investigative journalism tools by Markus Mandalka.

From the webpage:

Free software for journalists: Tutorials, bookmarks and open source tools for journalistic research, investigations and privacy and other digital tools for investigative journalism and data driven journalism or datajournalism:

Numerous resources organized under the following broad categories:

  • Databases, digital archives, data management systems, document management systems and content management systems
  • Data visualization
  • Extract data or convert data
  • Graphs and social network analysis (SNA)
  • Import and transform or convert data
  • Media monitoring, news filtering, news pipes and alerts
  • Privacy, security, safety and encryption
  • Reconciliation and merging
  • Search engines for fulltext search and discovery
  • Statistics and analytics
  • Tagging and annotation
  • Text mining, text analysis and document mining
  • Tutorials and tips: How to use open source research tools for investigative journalism
  • Universal open source toolset

A very useful site that is also available in Deutsch.

Suggestion: It’s easy to get overwhelmed by tool listings. Outline what you want from a tool in X category and go over the tools in that category with a view to selecting only one.

Use it long enough to see if it meets your current requirements. It may not be the latest or most talked about tool, but if it fits your needs and work flow, what more would you want?

That’s not to blind you to better tools, which do appear from time to time, but time spent on tool mastery is time not spent on research, writing and reporting.

by Patrick Durusau at June 15, 2016 03:41 PM

June 14, 2016

Patrick Durusau

Mapping Media Freedom

Mapping Media Freedom

From the webpage:

Journalists and media workers are confronting relentless pressure simply for doing their job. Mapping Media Freedom identifies threats, violations and limitations faced by members of the press throughout European Union member states, candidates for entry and neighbouring countries.

My American readers should not be misled by the current map image:

(image omitted)

If it is true the United States is free from press suppression, something I seriously doubt, it won’t be long before it starts to rack up incidents on this site.

Just today, Newt Gingrich, a truly unpleasant waste of human skin, proposed re-igniting the witch hunt committees of the 1950’s. Newt Gingrich Suggests Reforming House Un-American Committee In Wake Of Orlando Shooting.

The so-called “presumptive” candidates for President, Clinton and Trump, have called for tech companies to aid in the suppression of jihadist content and even the closing off of parts of the internet.

At least once a week, visit Mapping Media Freedom and do what you can to support the media everywhere.

by Patrick Durusau at June 14, 2016 11:51 PM

More Censorship Is Coming – To The USA

Hillary Clinton says tech companies need to ‘step up’ fight against ISIS propaganda by Amar Toor.

From the post:

Hillary Clinton said this week that if elected president, she would work with major technology companies to “step up” counter-terrorism efforts, including surveillance of social media and campaigns to combat jihadist propaganda online. As Reuters reports, the presumptive Democratic presidential nominee made the comments in a speech in Cleveland Monday, one day after a gunman killed 49 people and left 53 wounded at a gay nightclub in Orlando.

Clinton did not provide details on how she would work with tech companies, though her comments add to the ongoing debate over privacy and national security, which has intensified following recent terrorist attacks in both the US and Europe. In her speech, the former secretary of state called for an “intelligence surge,” saying that security agencies “need better intelligence to discover and disrupt terrorist plots before they can be carried out.” She also called on the government and tech companies to “use all our capabilities to counter jihadist propaganda online.”

“As president, I will work with our great tech companies from Silicon Valley to Boston to step up our game,” Clinton said. “We have to [do] a better job intercepting ISIS’ communications, tracking and analyzing social media posts and mapping jihadist networks, as well as promoting credible voices who can provide alternatives to radicalization.”

What does it mean to “counter jihadist propaganda online?”

Does it include factual reports about the aims of jihadists and the abuses they seek to correct?

For example, the Declaration of Independence was once considered “propaganda.”

Does it include factual reports of terrorist bombings by coalition forces on jihadist positions?

Question: Who do you root for in the Star Wars movies, the tiny band of rebels or the empire?

Does it include calling on young people to actively resist corrupt and oppressive governments?

Let’s see…

“Even anarchy itself, that bugbear held up by the tools of power (though truly to be deprecated) is infinitely less dangerous to mankind than arbitrary government. Anarchy can be but of short duration; for when men are at liberty to pursue that course which is most conducive to their own happiness, they will soon come into it, and for the rudest state of nature, order and good government must soon arise. But tyranny, when once established, entails its curse on a nation to the latest period of time; unless some daring genius, inspired by Heaven, shall unappalled by danger, bravely form and execute the arduous design of restoring liberty and life to his enslaved, murdered country.” [AN ORATION DELIVERED MARCH 6, 1775, AT THE… Joseph Warren (1741-1775) Boston: Printed by Messieurs Edes and Gill, and by J. Greenleaf, 1775 E297 W54, Fighting Words, a collection at Utah State University.]

Updated to use modern language, would that qualify?

As I remember the First Amendment, all of those qualify as protected free speech.

Clinton and her separated-at-birth twin, Donald Trump, can try to impose censorship on the legitimate speech of jihadists.

Let’s all lend the jihadists a hand and repeat their legitimate speech on a regular basis.

I for one would like to hear what the jihadists have to say for themselves.

Wouldn’t you?

by Patrick Durusau at June 14, 2016 02:32 PM

ClojureBridge…beginner friendly alternative to the official Clojure docs

Get into Clojure with ClojureBridge

Welcome to ClojureBridge CommunityDocs, the central location for material supporting and extending the core ClojureBridge curriculum (https://github.com/ClojureBridge/curriculum). Our goal is to provide additional labs and explanations from coaches to address the needs of attendees from a wide range of cultural and technical backgrounds.

Arne Brasseur tweeted earlier today:

Little known fact, the ClojureBridge community docs are a beginner friendly alternative to the official Clojure docs

Pass this along and contribute to the “beginner friendly alternative” so this becomes a well known fact.

Enjoy!

by Patrick Durusau at June 14, 2016 01:40 PM

Wrigley Field: 1060 W Addison St, Chicago, IL or digits.bucked.talent? (3-Word Addresses)

Elwood Blues says in The Blues Brothers that he falsified his driver’s license renewal and listed:

“1060 W. Addison”

as his home address, somehow

digits.bucked.talent

doesn’t carry the same impact. Yes?

Mongolia has places as familiar as Wrigley Field is to Americans but starting next month, all locations in Mongolia are going to have three-word phrase addresses. Mongolia is changing all its addresses to three-word phrases by Joon Ian Wong.

From the post:

Mongolia will become a global pioneer next month, when its national post office starts referring to locations by a series of three-word phrases instead of house numbers and street names.

The new system is devised by a British startup called What3Words, which has assigned a three-word phrase to every point on the globe. The system is designed to solve the an often-ignored problem of 75% of the earth’s population, an estimated 4 billion people, who have no address for mailing purposes, making it difficult to open a bank account, get a delivery, or be reached in an emergency. In What3Words’ system, the idea is that a series of words is easier to remember than the strings of number that make up GPS coordinates. Each unique phrase corresponds to a specific 9-square-meter spot on the map.

For example, the White House, at 1600 Pennsylvania Avenue, becomes sulk.held.raves; the Tokyo Tower is located at fans.helpless.collects; and the Stade de France is at reporter.smoked.received.

Mongolians will be the first to use the system for government mail delivery, but organizations including the United Nations, courier companies, and mapping firms like Navmii already use What3Words’ system.
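The post does not describe how What3Words assigns its phrases, so the following is only a toy sketch of the general idea: number every grid cell, then write that number in "base N" using a word list of N words. The word list, the cell numbering and both function names are assumptions made for illustration; this is not the What3Words algorithm.

```python
# Toy word list (an assumption); a real system would need tens of thousands
# of words so that N**3 covers every cell on the globe.
WORDS = ["digits", "bucked", "talent", "sulk", "held", "raves", "fans", "helpless"]


def cell_to_words(cell, words=WORDS):
    """Encode a grid-cell number as a dot-separated three-word phrase."""
    n = len(words)
    if not 0 <= cell < n ** 3:
        raise ValueError("cell number out of range for this word list")
    first, rest = divmod(cell, n * n)
    second, third = divmod(rest, n)
    return ".".join((words[first], words[second], words[third]))


def words_to_cell(address, words=WORDS):
    """Decode a three-word phrase back into its grid-cell number."""
    n = len(words)
    first, second, third = (words.index(w) for w in address.split("."))
    return (first * n + second) * n + third


address = cell_to_words(10)
print(address)                 # digits.bucked.talent
print(words_to_cell(address))  # 10
```

With eight words the toy covers only 512 cells. The point is the round trip: every cell gets exactly one phrase and every phrase decodes to exactly one cell.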

The most remarkable aspect of https://map.what3words.com/ is revealed if you try for:

Gandan Monastery (Gandantegchinlen Khiid), Gandan Monastery District, Ulaanbaatar 16040 (011 36 0354).

Use this URL: https://map.what3words.com/Gandantegchenlen+Khiid+Ulaanbaatar+Mongolia

mongolia-map-460

Now try changing languages (upper-right).

Three-word phrase addresses for the Gandan Monastery:

  • picturing.backfired.riverside (English)
  • schneller.juwelen.schaffen (German)
  • aislados.grifo.acuerde (Spanish)
  • nuageux.lémurien.rejouer (French)
  • turbato.fotografate.tinozza (Italian)
  • chinelo.politicar.molhada (Portuguese)
  • matte.skivar.kasta (Swedish)
  • vücudu.ırmak.peşini (Turkish)
  • карьера.слог.шелка (Russian)

I have only had time to spot check the site but did find retraced.loudest.teaspoon for Yap Island in Micronesia.

More obscure places to try?

You can find a wealth of additional information, yes, including an API at: http://what3words.com/.

A great opportunity for topic maps as previous ways of identifying locations are not going to wink out of existence. If 3-word addresses catch on, use of other locators may dwindle but that will be over generations. We are facing a very long transition period.
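To make the topic map point concrete, here is a minimal sketch of one subject carrying many locators, using the Gandan Monastery phrases listed above. The class and method names are hypothetical, and a real topic map engine does far more (scoping, merging rules, associations); this only shows that old and new identifiers can live side by side during the transition.

```python
from collections import defaultdict


class LocationMap:
    """Toy registry (hypothetical): many identifiers, one subject per location."""

    def __init__(self):
        self._subject_of = {}                  # normalized identifier -> subject
        self._identifiers = defaultdict(set)   # subject -> identifiers as given

    def add(self, subject, *identifiers):
        for ident in identifiers:
            key = ident.casefold()
            existing = self._subject_of.get(key)
            if existing is not None and existing != subject:
                raise ValueError(f"{ident!r} already identifies {existing!r}")
            self._subject_of[key] = subject
            self._identifiers[subject].add(ident)

    def lookup(self, identifier):
        """Return the subject for any known identifier, or None."""
        return self._subject_of.get(identifier.casefold())

    def identifiers(self, subject):
        return sorted(self._identifiers[subject])


locations = LocationMap()
locations.add(
    "Gandan Monastery",
    "picturing.backfired.riverside",             # English 3-word address
    "schneller.juwelen.schaffen",                # German 3-word address
    "Gandantegchinlen Khiid, Ulaanbaatar 16040",  # conventional postal form
)
print(locations.lookup("schneller.juwelen.schaffen"))
print(locations.identifiers("Gandan Monastery"))
```

Whichever locator a reader starts from, the lookup lands on the same subject, which is the property that matters while several addressing schemes coexist.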

Thoughts on weaponizing 3-word addresses. First, using the wrong 3-word addresses to mislead agents of the state. Second, creating new 3-word addresses that can be embedded in prose or song, without the dot separators.

Not to mention a server that, given proper authentication, returns the “correct” map location for a 3-word address; otherwise, you get the standard one.

Enjoy!

by Patrick Durusau at June 14, 2016 01:51 AM

June 13, 2016

Patrick Durusau

Microsoft Giveth, Microsoft Taketh Away

Microsoft Revoking Free Fallout 4 Copies Grabbed Due to Xbox Store Error by Ron Whitaker.

From the post:

Yesterday afternoon, Fallout 4‘s Deluxe Edition Bundle showed up on the Xbox Store for a very attractive price – $0.00. As you can imagine, word of the error spread quickly, and while no numbers are available, you can bet that many people took advantage of the deal to grab a copy for their Xbox One. That version of the game typically runs $109.99, and includes the Season Pass for all the DLC.

Ron goes on to point out that Microsoft is revoking all licenses obtained due to this error.

With some exceptions, a sale is a completed act and not subject to revocation by only one of the parties.

Would be a stronger case if Fallout 4‘s Deluxe Edition Bundle had listed a price of at least $0.01. Can you say why?

Would costing $0.01 when purchased with other games make a difference?

Keep an eye out for litigation!

by Patrick Durusau at June 13, 2016 05:43 PM

How Do I Become A Censor?

You read about censorship or efforts at censorship on a daily basis.

But none of those reports answers the burning question of the social media age: How Do I Become A Censor?

I mean, what’s the use of reading about other people censoring your speech if you aren’t free to censor theirs? Where’s the fun in that?

Andrew Golis has an answer for you in: Comments are usually garbage. We’re adding comments to This.!.

Three steps to becoming a censor:

  1. Build a social media site that accepts comments
  2. Declare highly subjective ass-hat rules
  3. Censor user comments

There being no third-party arbiters, you are now a censor! Feel the power surging through your fingers. Crush dangerous thoughts, memes or content with a single return. The safety and sanity of your users is now your responsibility.

Heady stuff. Yes?

If you think this is parody, check out the This. Community Guidelines for yourself:


With that in mind, This. is absolutely not a place for:

Violations of law. While this is expanded upon below, it should be clear that we will not tolerate any violations of law when using our site.

Hate speech, malicious speech, or material that’s harmful to marginalized groups. Overtly discriminating against an individual belonging to a minority group on the basis of race, ethnicity, national origin, religion, sex, gender, sexual orientation, age, disability status, or medical condition won’t be tolerated on the site. This holds true whether it’s in the form of a link you post, a comment you make in a conversation, a username or display name you create (no epithets or slurs), or an account you run.

Harassment; incitements to violence; or threats of mental, emotional, cyber, or physical harm to other members. There’s a line between civil disagreement and harassment. You cross that line by bullying, attacking, or posing a credible threat to members of the site. This happens when you go beyond criticism of their words or ideas and instead attack who they are. If you’ve got a vendetta against a certain member, do not police and criticize that member’s every move, post, or comment on a conversation. Absolutely don’t take this a step further and organize or encourage violence against this person, whether through doxxing, obtaining dirt, or spreading that person’s private information.

Violations of privacy. Respect the sanctity of our members’ personal information. Don’t con them – or the infrastructure of our site – to obtain, post, or disseminate any information that could threaten or harm our members. This includes, but isn’t limited to, credit card or debit card numbers; social or national security numbers; home addresses; personal, non-public email addresses or phone numbers; sexts; or any other identifying information that isn’t already publicly displayed with that person’s knowledge.

Sexually-explicit, NSFW, obscene, vulgar, or pornographic content. We’d like for This. to be a site that someone can comfortably scroll through in a public space – say a cafe, or library. We’re not a place for sexually-explicit or pornographic posts, comments, accounts, usernames, or display names. The internet is rife with spaces for you to find people who might share your passion for a certain Pornhub video, but This. isn’t the place to do that. When it comes to nudity, what we do allow on our site is informative or newsworthy – so, for example, if you’re sharing this article on Cameroon’s breast ironing tradition, that’s fair game. Or a good news or feature article about Debbie Does Dallas. But, artful as it may be, we won’t allow actual footage of Debbie Does Dallas on the site. (We understand that some spaces on the internet are shitty at judging what is and isn’t obscene when it comes to nudity, so if you think we’ve pulled your post off the site because we’re a bunch of unreasonable prudes, we’ll be happy to engage.)

Excessively violent content. Gore, mutilation, bestiality, necrophilia? No thanks! There’s a distinction between a potentially upsetting image that’s worth consuming (think of some of the best war photography) and something you’d find in a snuff film. It’s not always an easy distinction to make – real life is pretty brutal, and some of the images we probably need to see are the hardest to stomach – but we also don’t want to create an overwhelmingly negative experience for anyone who visits the site and happens to stumble upon a painful image.

Promotion of self-harm, eating disorders, alcohol or drug abuse, or similar forms of destructive behavior. The internet is, sadly, also rife with spaces where people get off on encouraging others to hurt themselves. If you’d like to do that, get off our site and certainly seek help.

Username squatting. Dovetailing with that, we reserve the right to take back a username that is not being actively used and give it to someone else who’d like it it – especially if it’s, say, an esteemed publication, organization, or person. We’re also firmly against attempts to buy or sell stuff in exchange for usernames.

Use of the This. brand, trademark, or logo without consent. You also cannot use the This. name or anything associated with the brand without our consent – unless, of course, it’s a news item. That means no creating accounts, usernames, or display names that use our brand.

Spam. Populating the site with spammy accounts is antithetical to our mission – being the place to find the absolute best in media. If you’ve created accounts that are transparently selling, say, “installation help for Macbooks” or some other suspicious form tech support, or advertising your “viral video” about Justin Bieber that’s got a suspiciously low number of views, you don’t belong on our site. That contradicts why we exist as a platform – to give members a noise-free experience they can’t find elsewhere on the web.

Impersonation of others. Dovetailing with that – though we’d all like to be The New York Times or Diana Ross, don’t pretend to be them. Don’t create an identity on the site in the likeness of a company or person who isn’t you. If you decide, for some reason, to create a parody account of a public figure or organization – though we can think of better sites to do that on, frankly – make sure you make that as clear as possible in your display name, avatar, and bio.

Infringement of copyright or intellectual property rights. Don’t post copyrighted works without the permission of its original owner or creator. This extends, for example, to copying and pasting a copyrighted set of words into a comment and passing it off as your own without credit. If you think someone has unlawfully violated your own copyright, please follow the DMCA procedures set forth in our Terms of Service.

Mass or automated registration and following. We’ve worked hard to build the site’s infrastructure. If you manipulate that in any way to game your follow count or register multiple spam accounts, we’ll have to terminate your account.

Exploits, phishing, resource abuse, or fraudulent content. Do not scam our members into giving you money, or mislead our members through misrepresenting a link to, say, a virus.

Exploitation of minors. Do not post any material regarding minors that’s sexually explicit, violent, or harmful to their safety. Don’t solicit or request their private or personally identifiable information. Leave them alone.

So how do we take punitive action against anyone who violates these? Depends on the severity of the offense. If you’re a member with a good track record who seems to have slipped up, we’ll shoot you an email telling you why your content was removed. If you’ve shared, written, or done something flagrantly and recklessly violating one of these rules, we’ll ban you from the site through deleting your account and all that’s associated with it. And if we feel it’s necessary or otherwise believe it is required, we will work with law enforcement to handle any risk to one of our members, the This. community in general, or to public safety.

To put it plainly – if you’re an asshole, we’ll kick you off the site.

Let’s make that a little more concrete.

I want to say: “Former Vice-President Dick Cheney should be tortured for a minimum of thirty (30) years and be kept alive for that purpose, as a penalty for his war crimes.”

I can’t say that on This. because:

  • “incitement to violence” If torture is ok, then so is other violence.
  • “harmful to marginalized groups” If you think of sociopaths as a marginalized group.
  • “harassment” Cheney is a victim too. He didn’t start life as a moral leper.
  • “excessively violent content” Assume I illustrate the torture Cheney should suffer.

Rules broken vary by the specific content of my speech.

Remind me to pass this advice along to: Jonathan “I Want To Be A Twitter Censor” Weisman. All he needs to do is build a competitor to Twitter and he can censor to his heart’s delight.

The “build your own platform” advice isn’t just my opinion. This. confirms it:

If you don’t like these rules, feel free to create your own platform! There are a lot of awesome, simple ways to do that. That’s what’s so lovely about the internet.

by Patrick Durusau at June 13, 2016 05:22 PM

How to Read a Legal Opinion:… (Attn: Bloggers, Posters, Reporters)

How to Read a Legal Opinion: A Guide for New Law Students by Orin S. Kerr.

If I would require one rule for reporting on courts and legislatures it would be: No story will be published without links to the bill, law or decision being reported.

How hard is that?

Yet every day postings appear where you must guess to find an opinion or legislative material.

Links won’t keep you from misreporting laws and opinions, but they will enable your readers to spot such mistakes more easily. (Is that the reason links are so often omitted?)

If you want to improve your skills at reading opinions, take a look at Kerr’s How to Read a Legal Opinion: A Guide for New Law Students.

Black’s Law Dictionary is a great help, but don’t use an “original” or out-dated version. The law is stable, but not that stable. There is an iPhone version.

Bear in mind that Black’s doesn’t record every nuance for every term defined by a statute or used by a court. It is a general guide only.

by Patrick Durusau at June 13, 2016 03:29 PM

Scientists reportedly close to finding a use for LinkedIn

Scientists reportedly close to finding a use for LinkedIn

Following a four-year multinational, interdisciplinary, cross-disciplinary study involving social scientists, computer scientists, algorithm developers, statisticians, mathematicians, programmers, engineers and clairvoyants, reports are circulating that there may be a breakthrough in the search for a use for the LinkedIn social network website.

“We don’t want to get people’s hopes up too much”, said Prof. Don Key of Stanford West University, “but we feel we are nearly there”.

“We have partnered with IBM and have used several hundred racks of their BlueGene/Q platform for the past four years and the results will almost certainly be out by next Friday”, said Prof. Key.

;-)

One serious use of LinkedIn is to collect images for your facial recognition cameras.

LinkedIn is one of the many “leaky” public sources of data. Even without breaching its security.

You can find stories similar to this one at: The allium.

PS: I just saw this news scrolling across the screen: “Study confirms the wicked get 63% less rest.”

by Patrick Durusau at June 13, 2016 01:09 PM

The Symptom of Many Formats

Distro.Mic: An Open Source Service for Creating Instant Articles, Google AMP and Apple News Articles

From the post:

Mic is always on the lookout for new ways to reach our audience. When Facebook, Google and Apple announced their own native news experiences, we jumped at the opportunity to publish there.

While setting Mic up on these services, David Björklund realized we needed a common article format that we could use for generating content on any platform. We call this format article-json, and we open-sourced parsers for it.

Article-json got a lot of support from Google and Apple, so we decided to take it a step further. Enter DistroMic. Distro lets anyone transform an HTML article into the format mandated by one of the various platforms.

Sigh.

While I applaud the DistroMic work, I am saddened that it was necessary.

From the DistroMic page, here is the same article in three formats:

Apple:

{
  "article": [
    {
      "text": "Astronomers just announced the universe might be expanding up to 9% faster than we thought.\n",
      "additions": [
        {
          "type": "link",
          "rangeStart": 59,
          "rangeLength": 8,
          "URL": "http://hubblesite.org/newscenter/archive/releases/2016/17/text/"
        }
      ],
      "inlineTextStyles": [
        {
          "rangeStart": 59,
          "rangeLength": 8,
          "textStyle": "bodyLinkTextStyle"
        }
      ],
      "role": "body",
      "layout": "bodyLayout"
    },
    {
      "text": "It’s a surprising insight that could put us one step closer to finally figuring out what the hell dark energy and dark matter are. Or it could mean that we’ve gotten something fundamentally wrong in our understanding of physics, perhaps even poking a hole in Einstein’s theory of gravity.\n",
      "additions": [
        {
          "type": "link",
          "rangeStart": 98,
          "rangeLength": 28,
          "URL": "http://science.nasa.gov/astrophysics/focus-areas/what-is-dark-energy/"
        }
      ],
      "inlineTextStyles": [
        {
          "rangeStart": 98,
          "rangeLength": 28,
          "textStyle": "bodyLinkTextStyle"
        }
      ],
      "role": "body",
      "layout": "bodyLayout"
    },
    {
      "role": "container",
      "components": [
        {
          "role": "photo",
          "URL": "bundle://image-0.jpg",
          "style": "embedMediaStyle",
          "layout": "embedMediaLayout",
          "caption": {
            "text": "Source: \n NASA\n \n",
            "additions": [
              {
                "type": "link",
                "rangeStart": 13,
                "rangeLength": 4,
                "URL": "http://www.nasa.gov/mission_pages/hubble/hst_young_galaxies_200604.html"
              }
            ],
            "inlineTextStyles": [
              {
                "rangeStart": 13,
                "rangeLength": 4,
                "textStyle": "embedCaptionTextStyle"
              }
            ],
            "textStyle": "embedCaptionTextStyle"
          }
        }
      ],
      "layout": "embedLayout",
      "style": "embedStyle"
    }
  ],
  "bundlesToUrls": {
    "image-0.jpg": "http://bit.ly/1UFHdpf"
  }
}

Facebook:

<article>
<p>Astronomers just announced the universe might be expanding
<a href="http://hubblesite.org/newscenter/archive/releases/2016/17/text/">up to 9%</a> faster than we thought.</p>
<p>It’s a surprising insight that could put us one step closer to finally figuring out what the hell
<a href="http://science.nasa.gov/astrophysics/focus-areas/what-is-dark-energy/">
dark energy and dark matter</a> are. Or it could mean that we’ve gotten something fundamentally wrong in our understanding of physics, perhaps even poking a hole in Einstein’s theory of gravity.</p>
<figure data-feedback="fb:likes,fb:comments">
<img src="http://bit.ly/1UFHdpf"></img>
<figcaption><cite>
Source: <a href="http://www.nasa.gov/mission_pages/hubble/hst_young_galaxies_200604.html">NASA</a>
</cite></figcaption>
</figure>
</article>

Google:

<article>
<p>Astronomers just announced the universe might be expanding
<a href="http://hubblesite.org/newscenter/archive/releases/2016/17/text/">up to 9%</a> faster than we thought.</p> <p>It’s a surprising insight that could put us one step closer to finally figuring out what the hell
<a href="http://science.nasa.gov/astrophysics/focus-areas/what-is-dark-energy/"> dark energy and dark matter</a> are. Or it could mean that we’ve gotten something fundamentally wrong in our understanding of physics, perhaps even poking a hole in Einstein’s theory of gravity.</p>
<figure>
<amp-img width="900" height="445" layout="responsive" src="http://bit.ly/1UFHdpf"></amp-img>
<figcaption>Source:
<a href="http://www.nasa.gov/mission_pages/hubble/hst_young_galaxies_200604.html">NASA</a>
</figcaption>
</figure>
</article>

All starting from the same HTML source:

<p>Astronomers just announced the universe might be expanding
<a href="http://hubblesite.org/newscenter/archive/releases/2016/17/text/">up to 9%</a> faster than we thought.</p><p>It’s a surprising insight that could put us one step closer to finally figuring out what the hell
<a href="http://science.nasa.gov/astrophysics/focus-areas/what-is-dark-energy/">
dark energy and dark matter</a> are. Or it could mean that we’ve gotten something fundamentally wrong in our understanding of physics, perhaps even poking a hole in Einstein’s theory of gravity.</p>
<figure>
<img width="900" height="445" src="http://bit.ly/1UFHdpf">
<figcaption>Source:
<a href="http://www.nasa.gov/mission_pages/hubble/hst_young_galaxies_200604.html">NASA</a>
</figcaption>
</figure>

Three workflows based on what started life in one common format.

Three workflows that have their own bugs and vulnerabilities.

Three workflows that duplicate the capabilities of each other.

Three formats that require different indexing/searching.
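To see how much machinery even one of these conversions needs, here is a rough sketch of turning a single HTML paragraph into an Apple-style component with link ranges. It is not DistroMic's code (that project has its own parsers); the class and function names are hypothetical, it uses only Python's standard html.parser, and it assumes ranges are plain character offsets into the extracted text.

```python
from html.parser import HTMLParser


class LinkRangeParser(HTMLParser):
    """Collect plain text plus a range record for each <a> element."""

    def __init__(self):
        super().__init__()
        self.text = ""
        self.additions = []
        self._open_link = None  # (start offset, href) while inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._open_link = (len(self.text), dict(attrs).get("href"))

    def handle_data(self, data):
        self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self._open_link is not None:
            start, href = self._open_link
            self.additions.append({
                "type": "link",
                "rangeStart": start,
                "rangeLength": len(self.text) - start,
                "URL": href,
            })
            self._open_link = None


def paragraph_to_component(html):
    """Convert one HTML paragraph into a dict shaped like the Apple sample."""
    parser = LinkRangeParser()
    parser.feed(html)
    parser.close()
    return {"text": parser.text, "additions": parser.additions, "role": "body"}


html = (
    "<p>Astronomers just announced the universe might be expanding "
    '<a href="http://hubblesite.org/newscenter/archive/releases/2016/17/text/">'
    "up to 9%</a> faster than we thought.</p>"
)
component = paragraph_to_component(html)
print(component["additions"][0]["rangeStart"],
      component["additions"][0]["rangeLength"])  # 59 8
```

The offsets match the rangeStart of 59 and rangeLength of 8 in the Apple sample above. The information was already in the anchor tag; the new format only restates it as numbers that must be kept in sync with the text.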

This is not the reason we can’t have nice things in software, but it certainly is a symptom.

The next time someone proposes a new format for a project, challenge them to demonstrate a value-add over existing formats.

by Patrick Durusau at June 13, 2016 12:56 PM

Tracking News Repos

@newsnerdrepos tweets every time one of 85 news github accounts posts a new repo.

Just started but what an excellent idea!

by Patrick Durusau at June 13, 2016 12:36 AM

Ten Simple Rules for Effective Statistical Practice

Ten Simple Rules for Effective Statistical Practice by Robert E. Kass, Brian S. Caffo, Marie Davidian, Xiao-Li Meng, Bin Yu, Nancy Reid (Citation: Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, Reid N (2016) Ten Simple Rules for Effective Statistical Practice. PLoS Comput Biol 12(6): e1004961. doi:10.1371/journal.pcbi.1004961)

From the post:

Several months ago, Phil Bourne, the initiator and frequent author of the wildly successful and incredibly useful “Ten Simple Rules” series, suggested that some statisticians put together a Ten Simple Rules article related to statistics. (One of the rules for writing a PLOS Ten Simple Rules article is to be Phil Bourne [1]. In lieu of that, we hope effusive praise for Phil will suffice.)

I started to copy out the “ten simple rules,” sans the commentary but that would be a disservice to my readers.

Nodding past a ten bullet point listing isn’t going to make your statistics more effective.

Re-write the commentary on all ten rules to apply them to every project. The focusing of the rules on your work will result in specific advice and examples for your field.

Who knows? Perhaps you will be writing a “Ten Simple Rules” article in your specific field, sans Phil Bourne as a co-author. (Do be sure and cite Phil.)

PS: For the curious: Ten Simple Rules for Writing a PLOS Ten Simple Rules Article by Harriet Dashnow, Andrew Lonsdale, Philip E. Bourne.

by Patrick Durusau at June 13, 2016 12:17 AM

June 12, 2016

Patrick Durusau

Art and the Law: [UK Focused]

Art and the Law: Guides to the legal framework and its impact on artistic freedom of expression by Jodie Ginsberg, chief executive, Index on Censorship.

From the post:

Freedom of expression is essential to the arts. But the laws and practices that protect and nurture free expression are often poorly understood both by practitioners and by those enforcing the law.

As part of Index on Censorship’s work on art and offence, Index has published a series of law packs intended to address questions about legal limits related to free expression and the arts.

We intend them as “living” documents, to be enhanced and developed in partnership with arts groups so that artistic freedom is nurtured and nourished.

This work builds on an earlier study by Index on Censorship, Taking the Offensive, which showed how self-censorship manifests itself in arts organisations and institutions.

Descriptions of:

Child Protection: PDF | web

Counter Terrorism: PDF | web

Obscene Publications: PDF | web

Public Order: PDF | web

Race and Religion: PDF | web

along with numerous other resources appear on this page.

Realize these are UK specific and the laws on such matters vary widely. That’s not a criticism but an observation for the safety of readers. Check your local laws with qualified legal advisers.

Unlike Jonathan “I Want To Be A Twitter Censor” Weisman, my advice for when you find offensive content, is to look away.

What other people choose to create, publish, perform, listen to, view, read, etc., is their business and certainly none of yours.

Criminal acts against other people, children in particular, are already unlawful and censorship isn’t required to outlaw them.

by Patrick Durusau at June 12, 2016 09:29 PM

APIs.guru Joins Growing List of API Indexes [Index of Indexes Anyone?]

APIs.guru Joins Growing List of API Indexes by Benjamin Young.

From the post:

APIs.guru is the latest entry into the API definition indexing, curation, and discovery space.

The open source (MIT-licensed) community curated index currently includes 236 API descriptions which cover 6,271 endpoints. APIs.guru is focused on becoming the "Wikipedia for REST APIs."

APIs.guru is entering an increasingly crowded market with other API indexing sites including The API Stack, API Commons, APIs.io, AnyAPI, and older indexes such as ProgrammableWeb's API Directory. These API indexes share a common goal says APIEvangelist.com blogger Kin Lane:

Developers around the world are using these definitions in their work, and modern API tooling and service providers are using them to define the value they bring to the table. To help the API sector reach the next level, we need you to step up and share the API definitions you have with API Stack, APIs.io, or APIs.guru, and if you have the time and skills, we could use your help crafting other new API definitions for popular services available today.

The APIs.guru content is curated primarily by its creator, Ivan Goncharov. According to a DataFire Blog entry, the initial content was populated "using a combination of automated scraping and human curation to crawl the web for machine-readable API definitions."

The empirical evidence from Spain indicates that the more places that link to you, the more traffic you enjoy. Even for news sites.

From that perspective, a multitude of over-lapping, duplicative API indexes is a good thing.

From my perspective, that of wanting a one-stop shop for APIs, it’s a nightmare.

Which one you see depends on your use case.
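For the one-stop-shop use case, an index of indexes comes down to merging entries that point at the same API. A minimal sketch, assuming each index can be reduced to a list of name and base URL pairs; the index names, entries and function names below are made up for illustration and are not taken from any of the sites mentioned.

```python
from urllib.parse import urlsplit


def api_key(base_url):
    """Normalize a base URL so trivially different listings compare equal."""
    parts = urlsplit(base_url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_indexes(indexes):
    """Merge {index name: [(api name, base URL), ...]} into one record per API."""
    merged = {}
    for index_name, entries in indexes.items():
        for api_name, base_url in entries:
            record = merged.setdefault(
                api_key(base_url), {"names": set(), "listed_in": set()}
            )
            record["names"].add(api_name)
            record["listed_in"].add(index_name)
    return merged


# Hypothetical sample data, not real index contents.
indexes = {
    "index-a": [("Example API", "https://API.Example.com/v1/")],
    "index-b": [
        ("example-api", "http://api.example.com/v1"),
        ("Maps", "https://maps.example.org/"),
    ],
}
merged = merge_indexes(indexes)
for key in sorted(merged):
    print(key, sorted(merged[key]["listed_in"]))
```

Real listings will disagree in messier ways than letter case and trailing slashes, which is where the curation effort goes.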

Enjoy!

by Patrick Durusau at June 12, 2016 07:51 PM

Playpen (porn) and Tamper-Proof NITs (Chain of Custody)

Dr. Christopher Soghoian’s affidavit in UNITED STATES OF AMERICA v. EDWARD JOSEPH MATISH, III, Criminal No. 4:16cr16, Document 83-1, is a highly readable account of why the lack of encryption for the Playpen Network Investigative Technique (NIT) is fatal to the FBI’s case.

In a nutshell, the lack of encryption means that the FBI cannot prove that data from a point of origin was not changed before it reached the FBI’s computer. Anywhere along the network transmission, some third party could have changed or even inserted new content.

In legal speak, it’s called “…the chain of custody.”

Say for example a defendant is charged with illegal possession of a firearm. At trial, the state must produce the firearm alleged to be in his possession at the time of his arrest. Moreover, as part of that proof, the state must prove “custody” of that gun at every step of the way.

The arresting officer testifies to the arrest and identifies the gun retrieved from the defendant. They then testify they put that gun into a bag with a label, noting the serial number and then signing the bag after sealing it. Next a crime room technician will testify they received bag # with the officer’s signature and logged it into their evidence log. And so on, up until the officer opens the bag in court and says: “This is the gun I took off of the defendant.”

Break that chain of custody and the evidence isn’t admissible.

The chain of custody doesn’t exist in the Playpen cases because the lack of encryption means the data in question could have been changed at any number of points along the way and the FBI cannot prove otherwise.
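For readers who want to see what a tamper-evident transmission looks like in practice, here is a minimal sketch. It is not the FBI's NIT, nor anything taken from the affidavit, and it swaps in message authentication (HMAC-SHA256) rather than encryption, since authentication is the part that detects changes in transit. The key, the report contents, and the function names `seal` and `verify` are all hypothetical, and a real deployment would also have to protect the key on the sending machine.

```python
import hashlib
import hmac
import secrets


def seal(key: bytes, payload: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for payload under a shared secret key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Compare tags in constant time; False means payload or tag was altered."""
    return hmac.compare_digest(seal(key, payload), tag)


# Hypothetical values, for illustration only (documentation address ranges).
key = secrets.token_bytes(32)
report = b"ip=203.0.113.7;mac=00:00:5e:00:53:01"
tag = seal(key, report)

# The receiver recomputes the tag over what actually arrived.
intact = verify(key, report, tag)
tampered = verify(key, report.replace(b"203.0.113.7", b"203.0.113.8"), tag)
print(intact, tampered)
```

Without a check of this kind (or an authenticated, encrypted channel that provides the equivalent), the receiver has nothing to compare against, which is the gap in the chain of custody described above.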

Think of it as an affirmative burden of proof. No proof of chain of custody and the evidence is not admissible.

Even a first-year FBI trainee should know that rule.

Which makes the FBI’s desire to get D- quality work approved all the more puzzling.

Why not follow the rules and do good work? What’s so daunting about that?

Suggestions?

PS: Should the FBI need advice on following the rules on cyber-evidence matters, don’t contact the Justice Department. They have an unsavory reputation for lying to judges and just as likely would lie to the FBI. Check around for ex-U.S. attorneys with cyberlaw experience.

by Patrick Durusau at June 12, 2016 07:19 PM

How to Run a Russian Hacking Ring [Just like Amway, Mary Kay … + Career Advice]

How to Run a Russian Hacking Ring by Kaveh Waddell.

From the post:

A man with intense eyes crouches over a laptop in a darkened room, his face and hands hidden by a black ski mask and gloves. The scene is lit only by the computer screen’s eerie glow.

Exaggerated portraits of malicious hackers just like this keep popping up in movies and TV, despite the best efforts of shows like Mr. Robot to depict hackers in a more realistic way. Add a cacophony of news about data breaches that have shaken the U.S. government, taken entire hospital systems hostage, and defrauded the international banking system, and hackers start to sound like omnipotent super-villains.

But the reality is, as usual, less dramatic. While some of the largest cyberattacks have been the work of state-sponsored hackers—the OPM data breach that affected millions of Americans last year, for example, or the Sony hack that revealed Hollywood’s intimate secrets—the vast majority of the world’s quotidian digital malice comes from garden-variety hackers.

What a downer this would be at career day at the local high school.

Yes, you too can be a hacker but it’s as dull as anything you have seen in Dilbert.

Your location plays an important role in whether Russian hacking ring employment is in your future. Kaveh reports:


Even the boss’s affiliates, who get less than half of each ransom that they extract, make a decent wage. They earned an average of 600 dollars a month, or about 40 percent more than the average Russian worker.

$600/month is ok, if you are living in Russia, not so hot if you aspire to Venice Beach. (It’s too bad the beach cam doesn’t pan and zoom.)

The level of technical skill required for low-hanging fruit hacking is falling, meaning more competitors at the low end. Potential profits are going to fall even further.

The no-liability rule for buggy software will fall sooner rather than later, and skilled hackers (I mean security researchers) will find themselves in demand by both plaintiffs and defendants. You will earn more money if you can appear in court; some expert witnesses make $600/hour or more. (Compare the $600/month in Russia.)

Even if you can’t appear in court, for reasons that seem good to you, fleshing out the details of hacks is going to be in demand from all sides.

You may start at the shallow end of the pool but resolve not to stay there. Read deeply, practice every day, stay current on new developments and opportunities, contribute to online communities.

by Patrick Durusau at June 12, 2016 05:41 PM

Vermont Trumps (sorry) Feds?

Signed By the Governor: Sweeping Vermont Privacy Law Will Hinder Several Federal Surveillance Programs by Mike Maharrey.

From the post:

Vermont Gov. Peter Shumlin has signed a sweeping bill that establishes robust privacy protections in the state into law. It not only limits warrantless surveillance and helps ensure electronic privacy in Vermont, it will also hinder several federal surveillance programs that rely on cooperation and data from state and local law enforcement.

The new law bans warrantless use of stingray devices to track the location of phones and sweep up electronic communications, restricts the use of drones for surveillance by police, and generally prohibits law enforcement officers from obtaining electronic data from service providers without a warrant or a judicially issued subpoena.

Some random examples of federal government lying:

So, Mike would have us believe that Vermont (drum roll) passing a bill and the governor signing it into law is going to interfere with federal surveillance programs in what way?

“But, but…, it’s a law!” (in a shocked tone of voice).

And you think that means what? Exactly?

Laws don’t enforce themselves. I know that comes as a surprise but there it is.

As Andrew Jackson is said to have remarked of Chief Justice John Marshall, “John Marshall has made his decision, now let him enforce it.” (For constitutional history buffs, that’s the Cherokee Indian Cases (1830s).)

If the police, state and federal, ignore this new Vermont state law and no one will prosecute them, how much hindering of Federal surveillance programs do you see?

My multiple-choice survey questionnaire has only one response for that question:

None.

If we disagree, the missing piece may be that the executive branch consists of the people who put laws into effect.

When the executive branch ignores the law, the judicial and legislative branches become distractions, nothing more.

by Patrick Durusau at June 12, 2016 02:03 PM

June 11, 2016

Patrick Durusau

Jonathan Weisman: Don’t Let The Door Hit You

Jonathan Weisman has decided to leave Twitter for reasons he sets forth (at length) at: Why I Quit Twitter — and Left Behind 35,000 Followers.

The essence of his complaint: Twitter failed to censor the speech of other Twitter users.

I offer no defense for the offensive and crude tweets Weisman received via Twitter.

However, as The Times’s deputy Washington editor, Weisman had the resources to filter his Twitter stream to remove such posts on his own.

But avoiding the offensive tweets wasn’t his goal.

Weisman’s goal is to silence others and to enlist Twitter in that task.

I have to agree that Twitter’s use of its “terms of service” is arbitrary and capricious, not to mention lacking transparency, but that’s all the more reason to discard content rules from “terms of service,” not make them more onerous.

Weisman’s parting shot is to describe Twitter as a “…cesspool of hate….”

Humanity has a number of such cesspools as well as large swaths of people who fall somewhere between there and sainthood. No reason to expect social media that reflects society to be any different.

Jonathan Weisman leaving Twitter is the loss of another advocate for censorship and there can never be too few of those.

PS: The New York Times needs to seriously think about why it employs a censorship advocate as its deputy Washington Editor.

by Patrick Durusau at June 11, 2016 03:44 PM