Planet Topic Maps

January 20, 2017

Patrick Durusau

Trump Inauguration Police Tactics/Blockades – 10:30 AM EST

Unicorn Riot is live streaming protests, including checkpoint blockades, from Washington, D.C.

In an interesting variation on the police formation I detailed in Defeating Police Formations – Parallel Distributed Protesting, the police are breaching the blockade single file to create a path for people who want to attend the inauguration.

An odd reverse of the “surge and arrest” tactic to “surge and enable passage.”

The inauguration is still two hours out.

Join Unicorn Riot, Democracy Now! or one of the other live streams covering protests.

Personally I have no interest in the “official” ceremonies and will be skipping those.

PS: A tweet as of 35 minutes ago reports (unconfirmed) that 6 of 12 inauguration entrances have been completely shut down and traffic at others slowed to a “trickle.”

by Patrick Durusau at January 20, 2017 03:30 PM

Why I Tweet by Donald Trump

David Uberti and Pete Vernon in The coming storm for journalism under Trump capture why Donald Trump tweets:


As Trump explained the retention of his personal Twitter handle to the Sunday Times recently: “I thought I’d do less of it, but I’m covered so dishonestly by the press—so dishonestly—that I can put out Twitter…I can go bing bing bing and I just keep going and they put it on and as soon as I tweet it out—this morning on television, Fox: Donald Trump, we have breaking news.

In order for Trump tweets to become news, two things are required:

  1. Trump tweets (quite common)
  2. Media evaluates the tweets to be newsworthy (should be less common)

Tweets reported as newsworthy are unlikely to match the sheer volume of Trump’s tweeting.

You have all read:

[Image: trump-on-sat-night-460]

Is Trump’s opinion, to which he is entitled, about Saturday Night Live newsworthy?

Trump on television is as trustworthy as the “semi-literate one-legged man” Dickens quoted for the title “Our Mutual Friend” is on English grammar. (Modern American Usage by Wilson Follett, edited by Jacques Barzun. Under the entry for “mutual friend.”)

Other examples abound, but suffice it to say the media needs to make its own judgments about what is newsworthy.

Otherwise the natters of another semi-literate become news by default for the next four years.

by Patrick Durusau at January 20, 2017 01:02 AM

January 19, 2017

Patrick Durusau

ScriptSource [Fonts but so much more]

ScriptSource

From the about page:

ScriptSource is a dynamic, collaborative reference to the writing systems of the world, with detailed information on scripts, characters, languages – and the remaining needs for supporting them in the computing realm. It is sponsored, developed and maintained by SIL International. It currently contains only a skeleton of information, and so depends on your participation in order to grow and assist others.

The need for information on Writing Systems

In today’s expanding global community, designers, linguists and computer professionals are called upon more frequently to support the myriad writing systems around the world. A key to this development is consistent, trustworthy, complete and organised information on the alphabets and scripts used to write the world’s languages. The development of Writing System Implementations (WSIs) depends on the availability of this information, so a lack of it can hinder the cultural, economic and intellectual development of communities that communicate in minority languages and scripts.

The information needed varies widely, and can include:

  • Design information and guidelines – both for alphabets and for specific letters/glyphs
  • Linguistic information – how the script is used for specific languages
  • Encoding details – particularly Unicode, including new Unicode proposals
  • Script behaviour – how letters change shape and position in context
  • Keyboarding conventions – including information on data entry tools
  • Testing tools and sample texts – so developers can test their software, fonts, keyboards

Some of this information is available, but is scattered around among a variety of web sites that have different purposes and structures, and often lies undocumented in the minds of individual script experts, or hidden in library books.

This information is also often segregated by audience. A font designer may be frustrated to find that available resources on a script address the spoken/written language relationship, but not the background and visual rules of the letterforms. A linguist may find information on encoding the script – such as the information in The Unicode Standard – but not important details of which languages use which symbols. An application developer may find a long writeup on the development and use of the script, but nothing to tell them what script behaviours are required.

There are also relatively few opportunities for experts from these fields to cooperate and work together. What interaction does exist often happens at conferences, on various mailing lists and forums, and through personal email. There are few experts who have the time to participate in these exchanges, and those that do may be frustrated to find that the same questions keep coming up again and again. Until now, there has been no place where this knowledge can be captured, organised and maintained.

The purpose of ScriptSource

ScriptSource exists to provide this information and bridge the gap between the designer, developer, linguist and user. It seeks to document the writing systems of the world and help those wanting to implement them on computers and other devices.

The initial content is relatively sparse, but includes basic information on all scripts in the ISO 15924 standard. It will grow dynamically through public submissions, expert content development and live linkages with other web sites. Rather than being just another web site about writing systems, ScriptSource provides a single hub of information where both old and new content can be found.

A truly remarkable resource on writing systems by SIL International.

You can think of ScriptSource as a way to locate fonts, but you may be drawn into complexities others rarely see!

Enjoy!

by Patrick Durusau at January 19, 2017 09:00 PM

Permitted Trump Protesters Will Be Ignored

I wish my headline were some of the “fake news” Democrats complain about, but Alexandra Rosemann proves the truth of that headline in:

Ignoring anti-Trumpers: Why we can expect media blackout of protests against Trump’s inauguration.

Not ignored by just anybody, ignored by the media.

On Jan. 20 — 16 years ago — thousands of protesters lined the inauguration parade route of the incoming Republican president. “Not my president,” they chanted. But despite the enormity of the rally, it was largely ignored. Instead, pundits marveled over how George W. Bush “filled out the suit” and confirmed authority.

“The inauguration of George W. Bush was certainly a spectacle on Inauguration Day,” marvels Robin Andersen, the director of Peace and Justice studies at Fordham University, in the 2001 short documentary “Not My President: Voices From the Counter Coup.”

It’s nearly impossible not to anticipate the eerie parallels between George W. Bush’s inauguration and that of Donald Trump.

“Forty percent of the public still believed that Bush had not been legitimately elected, yet there’s almost no discussion of these electoral problems or the constitutional crisis,” Andersen explains in the film. “Instead, Bush undergoes a kind of transformation where he fills out the suit and becomes a leader. Forgotten are any of the questions about his ability, his experience or his mangling of the English language. His transformation is almost magical,” she adds.

Andersen estimated the inauguration protests, which occurred throughout the country, garnered approximately 10 minutes of total coverage on all the major networks.

“When we did see images of protesters, there was no explanation as to why. We were asked to be passive spectators in this ritual of legitimation when the real democratic issues that should have been being discussed were ignored,” Andersen says in the film, reflecting on the “real democracy” in the streets of Washington, D.C.

Your choice. Ten minutes of coverage out of over 24 hours of permitted protesting, or the media covering a 24 hour blockade of the DC Beltway.

Which one do you think draws more attention to your issues?

A new president will be inaugurated on January 20, 2017, but it’s your choice whether it’s him, his wife and a few cronies in attendance or hundreds of thousands.

See protests for more ideas on that possibility.

by Patrick Durusau at January 19, 2017 07:52 PM

Empirical Analysis Of Social Media

How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument by Gary King, Jennifer Pan, and Margaret E. Roberts. American Political Science Review, 2017. (Supplementary Appendix)

Abstract:

The Chinese government has long been suspected of hiring as many as 2,000,000 people to surreptitiously insert huge numbers of pseudonymous and other deceptive writings into the stream of real social media posts, as if they were the genuine opinions of ordinary people. Many academics, and most journalists and activists, claim that these so-called “50c party” posts vociferously argue for the government’s side in political and policy debates. As we show, this is also true of the vast majority of posts openly accused on social media of being 50c. Yet, almost no systematic empirical evidence exists for this claim, or, more importantly, for the Chinese regime’s strategic objective in pursuing this activity. In the first large scale empirical analysis of this operation, we show how to identify the secretive authors of these posts, the posts written by them, and their content. We estimate that the government fabricates and posts about 448 million social media comments a year. In contrast to prior claims, we show that the Chinese regime’s strategy is to avoid arguing with skeptics of the party and the government, and to not even discuss controversial issues. We infer that the goal of this massive secretive operation is instead to regularly distract the public and change the subject, as most of the these posts involve cheerleading for China, the revolutionary history of the Communist Party, or other symbols of the regime. We discuss how these results fit with what is known about the Chinese censorship program, and suggest how they may change our broader theoretical understanding of “common knowledge” and information control in authoritarian regimes.

I differ from the authors on some of their conclusions but this is an excellent example of empirical as opposed to wishful analysis of social media.

Wishful analysis of social media includes the farcical claims that social media is an effective recruitment tool for terrorists. Too often claimed to dignify with a citation but never with empirical evidence, only an author’s repetition of the common “wisdom.”

In contrast, King et al. are careful to say what their analysis does and does not support, finding that, in a number of cases, the evidence contradicts commonly held thinking about the role of the Chinese government in social media.

One example I found telling was the lack of evidence that anyone is paid for pro-government social media comments.

In the authors’ words:


We also found no evidence that 50c party members were actually paid fifty cents or any other piecemeal amount. Indeed, no evidence exists that the authors of 50c posts are even paid extra for this work. We cannot be sure of current practices in the absence of evidence but, given that they already hold government and Chinese Communist Party (CCP) jobs, we would guess this activity is a requirement of their existing job or at least rewarded in performance reviews.
… (at pages 10-11)

Here I differ from the authors’ “guess”:

…this activity is a requirement of their existing job or at least rewarded in performance reviews.

Kudos to the authors for labeling this a “guess,” although one expects the mainstream press and members of Congress to take it as written in stone.

However, the authors presume positive posts about the government of China can only result from direct orders or pressure from superiors.

That’s a major weakness in this paper and similar analysis of social media postings.

The simpler explanation of pro-government posts is that a poster is reporting the world as they see it. (Think Occam’s Razor.)

As for sharing them with the so-called “propaganda office,” perhaps they are attempting to curry favor. The small number of posters makes it difficult to credit their motives (unknown) and behavior (partially known) as representative of the estimated 2 million posters.

Moreover, out of a population that nears 1.4 billion, the existence of 2 million individuals with a positive view of the government isn’t difficult to credit.

This is an excellent paper that will repay a close reading several times over.

Take it also as a warning about ideologically based assumptions that can mar or even invalidate otherwise excellent empirical work.

PS:

Additional reading:

From Gary King’s webpage on the article:

This paper follows up on our articles in Science, “Reverse-Engineering Censorship In China: Randomized Experimentation And Participant Observation”, and the American Political Science Review, “How Censorship In China Allows Government Criticism But Silences Collective Expression”.

by Patrick Durusau at January 19, 2017 04:01 PM

GNU Unifont Glyphs [Good News/Bad News]

GNU Unifont Glyphs 9.0.06.

From the webpage:

GNU Unifont is part of the GNU Project. This page contains the latest release of GNU Unifont, with glyphs for every printable code point in the Unicode 9.0 Basic Multilingual Plane (BMP). The BMP occupies the first 65,536 code points of the Unicode space, denoted as U+0000..U+FFFF. There is also growing coverage of the Supplemental Multilingual Plane (SMP), in the range U+010000..U+01FFFF, and of Michael Everson’s ConScript Unicode Registry (CSUR).
… (red highlight in original)

That’s the good news.

The bad news is shown by the coverage mapping:

0.0%  U+012000..U+0123FF  Cuneiform*
0.0%  U+012400..U+01247F  Cuneiform Numbers and Punctuation*
0.0%  U+012480..U+01254F  Early Dynastic Cuneiform*
0.0%  U+013000..U+01342F  Egyptian Hieroglyphs*
0.0%  U+014400..U+01467F  Anatolian Hieroglyphs*

These scripts will require a 32-by-32 pixel grid:

*Note: Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.
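
To make the plane arithmetic concrete, here is a minimal Python sketch. It is mine, not part of GNU Unifont, and it assumes only the 65,536-code-point plane size and the block ranges quoted in the coverage listing above. The names plane_of and uncovered_block are hypothetical helpers invented for illustration.

```python
# Illustrative only: classify code points by Unicode plane and check them
# against the blocks the coverage listing above reports at 0.0%.

PLANE_NAMES = {
    0: "Basic Multilingual Plane (BMP)",
    1: "Supplementary Multilingual Plane (SMP)",
}

# Ranges copied from the coverage listing quoted in this post.
UNCOVERED_BLOCKS = [
    (0x12000, 0x123FF, "Cuneiform"),
    (0x12400, 0x1247F, "Cuneiform Numbers and Punctuation"),
    (0x12480, 0x1254F, "Early Dynastic Cuneiform"),
    (0x13000, 0x1342F, "Egyptian Hieroglyphs"),
    (0x14400, 0x1467F, "Anatolian Hieroglyphs"),
]


def plane_of(code_point):
    """Return the plane number; each plane holds 0x10000 code points."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (code_point,))
    return code_point >> 16


def uncovered_block(code_point):
    """Return the name of the listed 0.0% block containing code_point, or None."""
    for start, end, name in UNCOVERED_BLOCKS:
        if start <= code_point <= end:
            return name
    return None


for cp in (0x0041, 0x12000, 0x13000):
    plane = plane_of(cp)
    label = PLANE_NAMES.get(plane, "plane %d" % plane)
    print("U+%06X" % cp, label, uncovered_block(cp))
```

Every block in the listing falls in plane 1, which is why BMP coverage can be complete while these scripts sit at 0.0%.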

One additional resource on creating cuneiform fonts:

Creating cuneiform fonts with MetaType1 and FontForge by Karel Píška:

Abstract:

A cuneiform font collection covering Akkadian, Ugaritic and Old Persian glyph subsets (about 600 signs) has been produced in two steps. With MetaType1 we generate intermediate Type 1 fonts, and then construct OpenType fonts using FontForge. We describe cuneiform design and the process of font development.

On creating fonts more generally with FontForge, see: Design With FontForge.

Enjoy!

by Patrick Durusau at January 19, 2017 02:43 PM

January 18, 2017

Patrick Durusau

Do You Have Big Brass Ones*? FOIA The President

Join our project to FOIA the Trump administration by Michael Morisy.

From the post:

Since June 2015, MuckRock users have been filing FOIA requests regarding a possible Trump presidency. In fact, so far there’s been over 160 public Trump-related requests filed through the site, all of which you can browse here.

We’ve also put together a number of guides and articles on the upcoming administration, ranging from what you can and can’t file regarding Trump to deep dives into what’s already out there:

We’ve launched a new project page for users to showcase their requests, find new documents regarding the Trump administration, or get inspiration for their own requests, and we’ve created a special Slack channel for you to join in and strategize on future requests, or help share big league FOIA stories that shed light on the President Elect’s team.

We’ve had a few users join us there already and they’ve helped file some really fun requests, so we’re excited about what else the transparency community can come up with.

An effort worthy of both your time and support!

Once answered, remember that availability isn’t the same thing as meaningful access.

OCR, indexing, entity extraction, in short any skill you have is important in this effort.

* No longer a gender specific reference as you well know.

PS: I’ve signed up and need suggestions on what to ask for. Suggestions?

by Patrick Durusau at January 18, 2017 10:20 PM

The CIA’s Secret History Is Now Online [Indexing, Mapping, NLP Anyone?]

The CIA’s Secret History Is Now Online by Jason Leopold.

From the post:

Decades ago, the CIA declassified a 26-page secret document cryptically titled “clarifying statement to Fidel Castro concerning assassination.”

It was a step toward greater transparency for one of the most secretive of all federal agencies. But to find out what the document actually said, you had to trek to the National Archives in College Park, Maryland, between the hours of 9 a.m. and 4:30 p.m. and hope that one of only four computers designated by the CIA to access its archives would be available.

But today the CIA posted the Castro record on its website along with more than 12 million pages of the agency’s other declassified documents that have eluded the public, journalists, and historians for nearly two decades. You can view the documents here.

The title of the Castro document, as it turns out, was far more interesting than the contents. It includes a partial transcript of a 1977 transcript between Barbara Walters and Fidel Castro in which she asked the late Cuban dictator whether he had “proof” of the CIA’s last attempt to assassinate him. The transcript was sent to Adm. Stansfield Turner, the CIA director at the time, by a public affairs official at the agency with a note highlighting all references to CIA.

But that’s just one of the millions documents, which date from the 1940s to 1990s, are wide-ranging, covering everything from Nazi war crimes to mind-control experiments to the role the CIA played in overthrowing governments in Chile and Iran. There are also secret documents about a telepathy and precognition program known as Star Gate, files the CIA kept on certain media publications, such as Mother Jones, photographs, more than 100,000 pages of internal intelligence bulletins, policy papers, and memos written by former CIA directors.

Michael Best, @NatSecGeek, has pointed out that the “CIA de-OCRed at least some of the CREST files before they uploaded them.”

Spy-agency-class petty: grant public access, but force the public to restore text search.

The restoration of text search work is underway so next steps will be indexing, NLP, mapping, etc.
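
As a rough idea of what the indexing step could look like once text is restored, here is a minimal inverted-index sketch in Python. It assumes the OCR output is already available as plain strings keyed by document id. The document ids and sample text are made up for illustration and are not actual CREST records, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR output; ids and text are invented for this example.
sample_pages = {
    "doc-001": "Transcript of interview with Fidel Castro, 1977",
    "doc-002": "Internal intelligence bulletin on Chile",
    "doc-003": "Memo regarding Star Gate program",
}

index = build_index(sample_pages)
print(search(index, "castro transcript"))
```

A real effort would want OCR error tolerance, phrase search and entity extraction on top of this, but even a plain token index restores the text search that was stripped out.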

A great set of documents to get ready for future official and unofficial leaks of CIA documents.

Enjoy!

PS: Curious if any of the search engine vendors will use CREST as demonstration data? Non-trivial size, interesting search issues, etc.

Ask at the next search conference.

by Patrick Durusau at January 18, 2017 08:59 PM

Resistance Manual / Indivisible

Resistance Manual

An essential reference for the volatile politics of the Trump presidency.

Indivisible

Four former congressional staffers banded together to write: “A practical guide to resisting the Trump Agenda.”

Both are shaped by confidence in current political and social mechanisms, to say nothing of a faith in non-violence.

Education is seen as the key to curing bigotry/prejudice and moving towards a more just society.

You will not find links to:

Steal This Book or the Anarchist Cookbook, 2000 edition, for example.

There are numerous examples cited as “successful” non-violent protests, such as the elimination of de jure segregation in the American South. (Resource includes oral histories of the time.)

But, de facto segregation in schools is larger than it was in the 1960’s.

How do you figure that into the “success” of non-violent protests?

Read both Resistance Manual and Indivisible for what may be effective techniques.

But ask yourself, do non-violent protests comfort the victims of violence?

Or just the non-violent protesters?

by Patrick Durusau at January 18, 2017 06:40 PM

Quantum Computer Resistant Encryption

Irish Teen Introduces New Encryption System Resistant to Quantum Computers by Joseph Young.

From the post:


… a 16-year-old student was named as Ireland’s top young scientist and technologist of 2017, after demonstrating the application of qCrypt, which offers higher levels of protection, privacy and encryption in comparison to other innovative and widely-used cryptographic systems.

BT Young Scientist Judge John Dunnion, the associate professor at University of College Dublin, praised Curran’s project that foresaw the impact quantum computing will have on current cryptographic and encryption methods.

“qCrypt is a novel distributed data storage system that provides greater protection for user data than is currently available. It addresses a number of shortfalls of current data encryption systems; in particular, the algorithm used in the system has been demonstrated to be resistant to attacks by quantum computers in the future,” said Dunnion.

While it may be too early to predict whether technologies like qCrypt can protect existing encryption methods and data protection systems from quantum computers, Curran and the judges of the competition saw promising potential in the technology.

Word is spreading rapidly.

qCrypt has a place-holder website, Post-Quantum Cryptography for the Masses.

A YouTube video:

Shane’s GitHub repository (no qCrypt, yet)

Not to mention Shane’s website.

qCrypt has the potential to provide safety from government surveillance for everyone, everywhere.

Looking forward to this!

by Patrick Durusau at January 18, 2017 03:37 PM

Top considerations for creating bioinformatics software documentation

Top considerations for creating bioinformatics software documentation by Mehran Karimzadeh and Michael M. Hoffman.

Abstract

Investing in documenting your bioinformatics software well can increase its impact and save your time. To maximize the effectiveness of your documentation, we suggest following a few guidelines we propose here. We recommend providing multiple avenues for users to use your research software, including a navigable HTML interface with a quick start, useful help messages with detailed explanation and thorough examples for each feature of your software. By following these guidelines, you can assure that your hard work maximally benefits yourself and others.

Introduction

You have written a new software package far superior to any existing method. You submit a paper describing it to a prestigious journal, but it is rejected after Reviewer 3 complains they cannot get it to work. Eventually, a less exacting journal publishes the paper, but you never get as many citations as you expected. Meanwhile, there is not even a single day when you are not inundated by emails asking very simple questions about using your software. Your years of work on this method have not only failed to reap the dividends you expected, but have become an active irritation. And you could have avoided all of this by writing effective documentation in the first place.

Academic bioinformatics curricula rarely train students in documentation. Many bioinformatics software packages lack sufficient documentation. Developers often prefer spending their time elsewhere. In practice, this time is often borrowed, and by ducking work to document their software now, developers accumulate ‘documentation debt’. Later, they must pay off this debt, spending even more time answering user questions than they might have by creating good documentation in the first place. Of course, when confronted with inadequate documentation, some users will simply give up, reducing the impact of the developer’s work.
… (emphasis in original)

Take to heart the authors’ observation on automatic generation of documentation:


The main disadvantage of automatically generated documentation is that you have less control of how to organize the documentation effectively. Whether you used a documentation generator or not, however, there are several advantages to an HTML web site compared with a PDF document. Search engines will more reliably index HTML web pages. In addition, users can more easily navigate the structure of a web page, jumping directly to the information they need.

I would replace “…less control…” with “…virtually no meaningful control…” over the organization of the documentation.

Think about it for a second. You write short comments, sometimes even incomplete sentences as thoughts occur to you in a code or data context.

An automated tool gathers those comments, even incomplete sentences, rips them out of their original context and strings them one after the other.

Do you think that provides a meaningful narrative flow for any reader? Including yourself?
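
A toy illustration of the problem, in Python. This is not how any particular documentation generator works. It is a deliberately naive sketch that collects docstrings with the standard inspect module and strings them together. The functions and their docstrings are invented for the example.

```python
import inspect


def load(path):
    """Read it in."""
    return path


def normalize(records):
    """Same as above, but per record."""
    return records


def emit(records):
    """See normalize."""
    return records


def naive_reference(functions):
    """Concatenate each function's docstring in order, stripped of its code context."""
    lines = []
    for func in functions:
        doc = inspect.getdoc(func) or "(undocumented)"
        lines.append("%s: %s" % (func.__name__, doc))
    return "\n".join(lines)


reference = naive_reference([load, normalize, emit])
print(reference)
```

Each comment made sense to its author next to the code. Lined up in a “reference,” “Same as above” and “See normalize” tell a reader nothing about what to do first or why.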

Your documentation doesn’t have to be great literature but as Karimzadeh and Hoffman point out, good documentation can make the difference between use and adoption and your hard work being ignored.

Ping me if you want to take your documentation to the next level.

by Patrick Durusau at January 18, 2017 02:33 PM

January 17, 2017

Patrick Durusau

Online tracking: A 1-million-site measurement and analysis [Leaving False Trails]

Online tracking: A 1-million-site measurement and analysis by Steven Englehardt and Arvind Narayanan.

From the webpage:

Tracking Results

During our January 2016 measurement of the top 1 million sites, our tool made over 90 million requests, assembling the largest dataset (to our knowledge) used for studying web tracking. With this scale we can answer many web tracking questions: Who are the largest trackers? Which sites embed the largest number of trackers? Which tracking technologies are used, and who is using them? and many more.

Findings

The total number of third parties present on at least two first parties is over 81,000, but the prevalence quickly drops off. Only 123 of these 81,000 are present on more than 1% of sites. This suggests that the number of third parties that a regular user will encounter on a daily basis is relatively small. The effect is accentuated when we consider that different third parties may be owned by the same entity. All of the top 5 third parties, as well as 12 of the top 20, are Google-owned domains. In fact, Google, Facebook, and Twitter are the only third-party entities present on more than 10% of sites.
… (emphasis in original)

Impressive research based upon an impressive tool, OpenWPM.

The GitHub page for OpenWPM reads in part:

OpenWPM is a web privacy measurement framework which makes it easy to collect data for privacy studies on a scale of thousands to millions of site. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection, including a proxy, a Firefox extension, and access to Flash cookies. Check out the instrumentation section below for more details.

Just a point of view but I’m more interested in specific privacy tracking data for some given set of servers than general privacy statistics.

Specific privacy tracking data that enables planning the use of remote browsers to leave false trails.
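
A minimal sketch of what “specific” could mean in practice, in Python. It does not use OpenWPM’s API or its database schema. It assumes you have already reduced a crawl to (first-party site, third-party domain) pairs, and every domain below is a made-up placeholder. The function name third_party_prevalence is hypothetical.

```python
from collections import defaultdict


def third_party_prevalence(observations, sites_of_interest=None):
    """Return {third_party: fraction of observed sites that embed it}.

    observations: iterable of (first_party_site, third_party_domain) pairs.
    sites_of_interest: optional set restricting the analysis to given sites.
    Only sites with at least one observed third party count in the denominator.
    """
    sites_by_tracker = defaultdict(set)
    all_sites = set()
    for site, third_party in observations:
        if sites_of_interest is not None and site not in sites_of_interest:
            continue
        all_sites.add(site)
        sites_by_tracker[third_party].add(site)
    if not all_sites:
        return {}
    return {
        tracker: len(sites) / len(all_sites)
        for tracker, sites in sites_by_tracker.items()
    }


# Hypothetical crawl results; these domains are placeholders, not real data.
observations = [
    ("a.example", "t1.example"),
    ("a.example", "t2.example"),
    ("b.example", "t1.example"),
    ("c.example", "t1.example"),
    ("c.example", "t1.example"),  # repeat requests do not inflate the count
]

overall = third_party_prevalence(observations)
subset = third_party_prevalence(observations, {"a.example", "b.example"})
print(overall)
print(subset)
```

Restricting the analysis to a chosen set of sites shows which trackers you would need to feed when laying a false trail across exactly those servers.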

Kudos to the project, however you choose to use the software.

by Patrick Durusau at January 17, 2017 10:35 PM

The Political Librarian (volume 2, issue 2)

The Political Librarian

From the webpage:

The Political Librarian is dedicated to expanding the discussion of, promoting research on, and helping to re-envision locally focused advocacy, policy, and funding issues for libraries.

We want to bring in a variety of perspectives to the journal and do not limit our contributors to just those working in the field of library and information science. We seek submissions from researchers, practitioners, community members, or others dedicated to furthering the discussion, promoting research, and helping to re-envision tax policy and public policy on the extremely local level.

Grab the entire volume 2, issue 2 (December 2016) for reading while stopped on the DC Beltway, January 20, 2017.

Libraries need your help to survive and prosper during the rapidly approaching winter of ignorance.

by Patrick Durusau at January 17, 2017 10:16 PM

#DisruptJ20 – 3 inch resolution aerial imagery Washington, DC @J20protests

3 inch imagery resolution for Washington, DC by Jacques Tardie.

From the post:

We updated our basemap in Washington, DC with aerial imagery at 3 inch (7.5 cm) resolution. The source data is openly licensed by DC.gov, thanks to the District’s open data initiative.

If you aren’t familiar with Mapbox, there is no time like the present!

If you are interested in just the 3 inch resolution aerial imagery, see: http://opendata.dc.gov/datasets?keyword=imagery.

Enjoy!

by Patrick Durusau at January 17, 2017 09:23 PM

Raw SIGINT Locations Expanded

President Obama has issued new rules for sharing information under Executive Order 12333, with the ungainly title: (U) Procedures for the Availability or Dissemination of Raw Signals Intelligence Information by the National Security Agency Under Section 2.3 of Executive Order 12333 (Raw SIGINT Availability Procedures).

Kate Tummarello, in Obama Expands Surveillance Powers On His Way Out, sees a threat to “innocent persons”:

With mere days left before President-elect Donald Trump takes the White House, President Barack Obama’s administration just finalized rules to make it easier for the nation’s intelligence agencies to share unfiltered information about innocent people.

New rules issued by the Obama administration under Executive Order 12333 will let the NSA—which collects information under that authority with little oversight, transparency, or concern for privacy—share the raw streams of communications it intercepts directly with agencies including the FBI, the DEA, and the Department of Homeland Security, according to a report today by the New York Times.

That’s a huge and troubling shift in the way those intelligence agencies receive information collected by the NSA. Domestic agencies like the FBI are subject to more privacy protections, including warrant requirements. Previously, the NSA shared data with these agencies only after it had screened the data, filtering out unnecessary personal information, including about innocent people whose communications were swept up the NSA’s massive surveillance operations.

As the New York Times put it, with the new rules, the government claims to be “reducing the risk that the N.S.A. will fail to recognize that a piece of information would be valuable to another agency, but increasing the risk that officials will see private information about innocent people.”

All of which is true, but the new rules have other impacts as well.

Who is an “IC element?”

The new rules make numerous references to an “IC element,” but come up short in defining it:

L. (U) IC element is as defined in section 3.5(h) of E.O. 12333.
(emphasis in original)

Great.

Searching for E.O. 12333 isn’t enough. You need Executive Order 12333 United States Intelligence Activities (As amended by Executive Orders 13284 (2003), 13355 (2004) and 13470 (2008)). The National Archives version of Executive Order 12333 is not amended and hence is misleading.

From the amended E.O. 12333:

3.5 (h) Intelligence Community and elements of the Intelligence Community 
        refers to:
(1) The Office of the Director of National Intelligence;
(2) The Central Intelligence Agency;
(3) The National Security Agency;
(4) The Defense Intelligence Agency;
(5) The National Geospatial-Intelligence Agency;
(6) The National Reconnaissance Office; 
(7) The other offices within the Department of Defense for the collection of 
    specialized national foreign intelligence through reconnaissance programs;
(8) The intelligence and counterintelligence elements of the Army, the Navy,
    the Air Force, and the Marine Corps;
(9) The intelligence elements of the Federal Bureau of Investigation;
(10) The Office of National Security Intelligence of the Drug Enforcement
     Administration;
(11) The Office of Intelligence and Counterintelligence of the Department
      of Energy;
(12) The Bureau of Intelligence and Research of the Department of State;
(13) The Office of Intelligence and Analysis of the Department of the Treasury;
(14) The Office of Intelligence and Analysis of the Department of Homeland 
     Security;
(15) The intelligence and counterintelligence elements of the Coast Guard; and
(16) Such other elements of any department or agency as may be designated by 
     the President, or designated jointly by the Director and the head of the 
     department or agency concerned, as an element of the Intelligence Community. 

The Office of the Director of National Intelligence has an incomplete list of IC elements:

  • Air Force Intelligence
  • Army Intelligence
  • Central Intelligence Agency
  • Coast Guard Intelligence
  • Defense Intelligence Agency
  • Department of Energy
  • Department of Homeland Security
  • Department of State
  • Department of the Treasury
  • Drug Enforcement Administration
  • Federal Bureau of Investigation
  • Marine Corps Intelligence
  • National Geospatial-Intelligence Agency
  • National Reconnaissance Office
  • National Security Agency
  • Navy Intelligence

I say “incomplete” because from E.O. 12333, it is missing (with original numbers for reference):

...
(7) The other offices within the Department of Defense for the collection of 
    specialized national foreign intelligence through reconnaissance programs;
(8) The intelligence and counterintelligence elements of ..., and the 
    Marine Corps;
...
(16) Such other elements of any department or agency as may be designated by 
     the President, or designated jointly by the Director and the head of the 
     department or agency concerned, as an element of the Intelligence Community.

Under #7 and #16, there are other IC elements that are unnamed and unlisted by the Office of the DNI. I suspect the Marines were omitted for stylistic reasons.

Where to Find Raw SIGINT?

Identified IC elements are important because the potential presence of “Raw SIGINT,” beyond the NSA, has increased their value as targets.

P. (U) Raw SIGINT is any SIGINT and associated data that has not been evaluated for foreign intelligence purposes and/or minimized.
… (emphasis in original, from the new rules.)

Tummarello is justly concerned about “innocent people” but there are less than innocent people, any number of appointed/elected officials or barons of industry, who may be captured on the flypaper of raw SIGINT.

Happy hunting!

PS:

Warning: It’s very bad OPSEC to keep a trophy chart on your wall. ;-)

IC_Circle-460

You will, despite this warning, but I had to try.

The original image is here at Wikipedia.

by Patrick Durusau at January 17, 2017 08:31 PM

January 16, 2017

Patrick Durusau

Never Allow Your Self-Worth To Depend Upon A Narcissist

The White House press corps has failed, again, in its relationship with President Trump.

The latest debacle is described in Defiant WH Press Corps “won’t go away” if ejected, says Major Garrett.

From the post:

There have been rumblings about kicking the press out of the White House almost since Donald Trump won the presidency, culminating with a report in Esquire last week that the Trump administration has in fact been giving the idea “serious consideration.”

“If they do so, we’ll still cover him. The White House press corps won’t go away,” CBS News Chief White House Correspondent Major Garrett told CBSN’s Josh Elliott Monday. “You can shove us a block away, two blocks away, a mile away. We will be on top of this White House — as we’ve been on top of every White House.”

Mr. Trump and several on his communications team have had a stormy relationship with the press, both during his presidential campaign and during his transition.

“I would not be surprised if they moved us out. I really do think there is something about the Trump administration and those closest to him who want the symbolism of driving reporters out of the White House, moving the elites out farther away from this president,” Garrett said.

Does the self-worth of the White House press corps depend upon where they are located by a known narcissist?

If so, they are in for a long four years.

That is doubly true for Trump’s denigration of reporters and others.

A fundamental truth to remember for the next four years:

Trump’s comments about you, favorable or unfavorable, are smelly noise. They will dissipate, unless repeated over and over, as though it matters if a narcissist denies or affirms your existence.

It doesn’t.

by Patrick Durusau at January 16, 2017 10:26 PM

XML.com Relaunch!

XML.com

Lauren Wood posted this note about the relaunch of XML.com recently:

I’ve relaunched XML.com (for some background, Tim Bray wrote an article here: https://www.xml.com/articles/2017/01/01/xmlcom-redux/). I’m hoping it will become part of the community again, somewhere for people to post their news (submit your news here: https://www.xml.com/news/submit-news-item/) and articles (see the guidelines at https://www.xml.com/about/contribute/). I added a job board to the site as well (if you’re in Berlin, Germany, or able to move there, look at the job currently posted; thanks LambdaWerk!); if your employer might want to post XML-related jobs please email me.

The old content should mostly be available but some articles were previously available at two (or more) locations and may now only be at one; try the archive list (https://www.xml.com/pub/a/archive/) if you’re looking for something. Please let me know if something major is missing from the archives.

XML is used in a lot of areas, and there is a wealth of knowledge in this community. If you’d like to write an article, send me your ideas. If you have comments on the site, let me know that as well.

Just in time as President Trump is about to stir, vigorously, that big pot of crazy known as federal data.

Mapping, processing, transformation demands will grow at an exponential rate.

Notice the emphasis on demand.

Taking two weeks to write custom software to sort files (you know the Weiner/Abedin laptop story, yes?) won’t be acceptable quite soon.

How are your on-demand XML chops?

by Patrick Durusau at January 16, 2017 09:11 PM

Defeating Police Formations – Parallel Distributed Protesting

If you haven’t read FEMA’s Field Force Operations PER-200, then you are unprepared for #DisruptJ20 or any other serious protest effort.

It’s a real snore in parts, but knowing police tactics will:

  1. Eliminate the element of surprise and fear of the unexpected
  2. Enable planning of protective clothing and other measures
  3. Enable planning of protests to eliminate police advantages
  4. Enable protesters to respond with their own formations

among other things.

On Common Police Formation

While reading Field Force Operations PER-200, I encountered several police formations you are likely to see at #DisruptJ20.

The crossbow arrest formation is found at pages 48-49 and illustrated with:

cross-bow-01-460

cross-bow-02-460

cross-bow-03-460

A number of counter tactics suggest themselves, depending upon your views on non-violence. Passive resistance by anyone who is arrested, thereby consuming more police personnel to secure their arrest. Passively preventing the retreat of the arrest team and its security circle. Breaching the skirmish line on either side of the column, just before the column surges forward, exposing the flank of the column.

Requirements for the crossbow arrest formation

What does the crossbow arrest formation require more than anything else?

You peeked! ;-)

Yes, the police formations in Field Force Operations PER-200, including the crossbow arrest formation, all require a crowd.

Don’t get me wrong, crowds can be a good thing and sometimes the only solution. Standing Rock is a great example of taking and holding a location against all odds.

But a great tactic for one protest and its goals, may be a poor tactic for another protest, depending upon goals, available tactics, resources, etc.

Consider the planned and permitted protests for #DisruptJ20.

All are subject to the police formations detailed by FEMA and the use of “less lethal” force by police forces.

How can #DisruptJ20 demonstrate the anger of the average citizen and at the same time defeat police formations?

Parallel Distributed Protesting

Instead of massing in a crowd, where police formations and “less lethal” force are options, what if protesters stopped, ran out of gas, or had flat tires on the 64-mile DC Beltway?

I mention the length of the Beltway, 64 miles, because it is ten miles longer than the marches from Selma to Montgomery, Alabama. You may remember one of those marches; it’s documented at The incident at the Edmund Pettus Bridge.

On March 7, 1965, Representative John Lewis, Hosea Williams and other protesters marched across the Pettus bridge knowing that brutality and perhaps death awaited them.

Protesters who honor Lewis, Williams and other great civil rights leaders can engage in parallel distributed protesting on January 20, 2017.

Each car slowing, stopping, having a flat tire, is a distributed protest point. With distributed protest points occurring in parallel, the Beltway grinds to a halt. No one enters or leaves Washington, D.C. for a day.

Not the same as the footage from the Pettus Bridge, but shutting down the D.C. Beltway will be a news story for months and years to come.

fox5dc-map-460

Lewis, Williams and others were willing to march into the face of violence and evil. Are you willing to drive to the D.C. Beltway to stop, run out of gas or have a flat tire in their honor?

PS: Beltway blockaders should always be respectful of police officers. They probably don’t like what is happening any more than you do. Besides, their police cruisers are also blocking traffic so their presence is contributing to the gridlock as well.

by Patrick Durusau at January 16, 2017 07:58 PM

Password Advice For Leakers

What the Most Common Passwords of 2016 List Reveals [Research Study] by Keeper.

As a prospective leaker, if your password is any of the ones listed below, congratulations! (“123456” leads with 17%.)

Your password is in the top 50% of 10 million passwords analyzed by Keeper in 2016.

Extremely plausible evil hackers “discovered” your login and then “cracked” your password.

No longer a “leak,” but a theft and the thief isn’t you. (How’s that for protecting leakers?)

Rank Password
1. 123456
2. 123456789
3. qwerty
4. 12345678
5. 111111
6. 1234567890
7. 1234567
8. password
9. 123123
10. 987654321
11. qwertyuiop
12. mynoob
13. 123321
14. 666666
15. 18atcskd2w
16. 7777777
17. 1q2w3e4r
18. 654321
19. 555555
20. 3rjs1la7qe
21. google
22. 1q2w3e4r5t
23. 123qwe
24. zxcvbnm
25. 1q2w3e
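If you want to confirm that your password qualifies, a minimal Python sketch follows. It holds only a sample of the entries above, and the names COMMON_PASSWORDS and is_common are my own inventions for illustration, not anything published by Keeper.

```python
# A sample of entries copied from the list above; add the rest as needed.
COMMON_PASSWORDS = {
    "123456",
    "123456789",
    "qwerty",
    "12345678",
    "password",
    "google",
}


def is_common(password: str) -> bool:
    """Return True if the password is in the sample of common passwords."""
    return password in COMMON_PASSWORDS


if __name__ == "__main__":
    for candidate in ("qwerty", "correct horse battery staple"):
        verdict = "on the list" if is_common(candidate) else "not on the list"
        print(candidate, "is", verdict)
```

The match is exact and case sensitive. That is an assumption on my part; a real attacker would also try trivial variations.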

You must follow the leaking instructions at: https://theintercept.com/leak/, but leak only your login, password and network URL.

No guarantees that The Intercept will take the initiative but they aren’t the only game in town.

by Patrick Durusau at January 16, 2017 04:31 PM

Phishing As A Public Service – Leak Access, Not Data

The Intercept tweeted today:

intercept-460

Kudos to The Intercept for reaching out to (US) federal employees to encourage safe leaking.

On the other hand, have you thought about the allocation of risks for leaking?

Take Edward Snowden for example. If caught, Snowden is going to jail, NOT Glenn Greenwald or other reporters who used the Snowden leak.

The Intercept has a valid point when it says:


Without leaks, journalists would have never connected the Watergate scandal to President Nixon, or discovered that the Reagan White House illegally sold weapons to Iran. In the past 15 years alone, inside sources played a vital role in uncovering secret prisons, abuses at Abu Ghraib, atrocities in Afghanistan and Iraq, and mass surveillance by the NSA.

At least historically speaking. Back in the days when hard copy was the norm.

Hard copy isn’t the norm now and leaking guidelines need to catch up to the present day.

Someone could have leaked a portion of the Office of Personnel Management records but in a modern age, digital was far more powerful. (That was a straight hack but it illustrates the difference between sweaty smuggling of hard copy versus giving others the key to a vault.)

Instead of leaking documents/data, imagine following these instructions:

The best option is to use our SecureDrop server, which has the advantage of allowing us to send messages back to you, while allowing you to remain totally anonymous — even to us, if that is what you prefer.

  • Begin by bringing your personal computer to a Wi-Fi network that isn’t associated with you or your employer, like one at a coffee shop. Download the Tor Browser. (Tor allows you to go online while concealing your IP address from the websites you visit.)
  • You can access our SecureDrop server by going to http://y6xjgkgwj47us5ca.onion/ in the Tor Browser. This is a special kind of URL that only works in Tor. Do NOT type this URL into a non-Tor Browser. It won’t work — and it will leave a record.
  • If that is too complicated, or you don’t wish to engage in back-and-forth communication with us, a perfectly good alternative is to simply send mail to P.O. Box 65679, Washington, D.C., 20035, or to The Intercept, 114 Fifth Avenue, 18th Floor, New York, New York, 10011. Drop it in a mailbox (do not send it from home, work or a post office) with no return address.

And you send the following:

  1. Your email address
  2. Screen shots of legitimate emails you get on a regular basis
  3. What passwords are the most important

That’s it.

The receiver constructs a phishing email and sends it to your address.

Like John Podesta and numerous other public figures, you are taken in by this scam.

Evil doers use your present password for access and you have system recorded evidence that you were duped.

How does that allocation of risk look to you, as a potential leaker?

PS: Some, but not all, journalists will be quick to point out what I suggest is, drum roll, illegal. OK, and the question?

Those journalists are being very brave on behalf of leakers, knowing they will never share the fate of a leaker.

I make an exception for all the very brave journalists writing outside of the United States and a few other areas at great personal risk. But then they are unlikely to be concerned with the niceties of the law when dealing with a rogue government.

Update: Apologies but I forgot to include a link to the original post: Attention Federal Employees: If You See Something, Leak Something.

by Patrick Durusau at January 16, 2017 04:16 PM

Highly Effective Gmail Phishing

Wide Impact: Highly Effective Gmail Phishing Technique Being Exploited by Mark Maunder.

From the post:

As you know, at Wordfence we occasionally send out alerts about security issues outside of the WordPress universe that are urgent and have a wide impact on our customers and readers. Unfortunately this is one of those alerts. There is a highly effective phishing technique stealing login credentials that is having a wide impact, even on experienced technical users.

I have written this post to be as easy to read and understand as possible. I deliberately left out technical details and focused on what you need to know to protect yourself against this phishing attack and other attacks like it in the hope of getting the word out, particularly among less technical users. Please share this once you have read it to help create awareness and protect the community.

Mark’s omission of the “technical details” makes this more of an advertisement for phishing with Gmail than a how-to guide.

Still, the observation that even “experienced technical users” are trapped by this technique should encourage journalists in particular to consider adding phishing, voluntary or otherwise, to their data gathering toolkit.

As I pointed out yesterday, Phishing As A Public Service – Leak Access, Not Data, enabling leakers to choose to receive phishing emails can result in greater access to documents by reporters at less risk to leakers.

With the daily hype about data breaches, who can blame some mid-level management type for their computer being breached? Oh, it could result in loss of employment, maybe, but greatly reduces the odds of being fingered as a leaker.

Unlike plain brown paper wrappers with Glenn Greenwald‘s address on them. ;-)

If phishing sounds a bit exotic, consider listing software/versions with known vulnerabilities that users can install and then visit a website for an innocent registration that captures their details.

Journalism as active information gathering as opposed to consuming leaks and government hand-outs.

by Patrick Durusau at January 16, 2017 01:56 PM

January 15, 2017

Patrick Durusau

Thoughts on Blockading Metro Rail Stops

A recent news report mentioned the potential for blockades of DC Metro Rail stops.

Curbed Washington DC posted a list of those stops, but like many reporters, did not provide links to the stops.

:-(

Here’s their list:

metro-stops-460

Metro Stops with Hyperlinks

Here’s my version, in the same ticket color order:

Presented as the original, the list leaves the impression of more Metro stops than require blockading. The “apparent” count of Metro stops is twelve (12).

Discovering Duplicate Metro Rail Stops

Rearrangement by Metro Rail stops reveals duplicates:

Deduped Metro Stops and Priority Map

If we remove the duplicate stops and sort by stop name, we find only eight (8) Metro Stops for blockading.

  1. Capitol South Green Ticket Holders
  2. Eastern Market Green Ticket Holders
  3. Federal Center SW Orange Ticket Holders, Silver Ticket Holders
  4. Gallery Place-Chinatown Blue Ticket Holders, Red Ticket Holders
  5. Judiciary Square Blue Ticket Holders, Red Ticket Holders
  6. L’Enfant Plaza Orange Ticket Holders, Silver Ticket Holders
  7. NoMa-Gallaudet U Yellow Ticket Holders
  8. Union Station Yellow Ticket Holders
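The rearrangement is easy to reproduce. Here is a minimal Python sketch that starts from (ticket color, stop) pairs and regroups them by stop. The pairs are transcribed from the deduped list above; their order and the name group_by_stop are my assumptions, not anything from Curbed Washington DC.

```python
from collections import defaultdict

# (ticket color, Metro stop) pairs, transcribed from the deduped list above.
# The ordering by ticket color is an assumption.
PAIRS = [
    ("Blue", "Gallery Place-Chinatown"),
    ("Blue", "Judiciary Square"),
    ("Red", "Gallery Place-Chinatown"),
    ("Red", "Judiciary Square"),
    ("Orange", "Federal Center SW"),
    ("Orange", "L'Enfant Plaza"),
    ("Silver", "Federal Center SW"),
    ("Silver", "L'Enfant Plaza"),
    ("Green", "Capitol South"),
    ("Green", "Eastern Market"),
    ("Yellow", "NoMa-Gallaudet U"),
    ("Yellow", "Union Station"),
]


def group_by_stop(pairs):
    """Map each stop name to the sorted list of ticket colors that use it."""
    by_stop = defaultdict(set)
    for color, stop in pairs:
        by_stop[stop].add(color)
    # Sort by stop name so the output matches the ordering used in the post.
    return {stop: sorted(colors) for stop, colors in sorted(by_stop.items())}


if __name__ == "__main__":
    stops = group_by_stop(PAIRS)
    print(len(PAIRS), "entries,", len(stops), "distinct stops")
    for stop, colors in stops.items():
        print(stop, "-", ", ".join(colors))
```

Twelve entries collapse to eight stops because four stops each serve two ticket colors.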

All of this is public information and with a little rearrangement, it becomes easier to focus resources on any potential blockading of those stops.

In terms of priorities, Curbed Washington DC posted a map of the gate locations and guest sections for ticket holders. I took a screen-shot of the center portion:

guest-sections-460

If you are interested in activities around the checkpoints, see the larger map.

So You Want To Blockade A Metro Stop

A map of Union Station reminded me that open street blockading isn’t likely to close a Metro Rail stop.

Why? Even with a large number of hardened protesters, the police can approach you from all sides, driving you in particular directions with “less lethal” weapons.

But the architecture of a Metro Rail stop offers an alternative strategy to open air resistance.

Don’t blockade outside a Metro Rail stop, blockade the stop by occupying stairwells, access points, etc.

Anyone opposing the blockade will seek to restore service and so be less likely to use persistent gases or other irritants in closed spaces.

The other advantage of escalators and stairways is that the police can only approach from in front of or behind you, enabling you to defend the edges of your formation with layers of the most recalcitrant protesters.

I know you intend to peacefully and lawfully assemble only but be aware you may have those in your midst who damage and/or disable turnstiles. Either with some variety of fast acting adhesives or jamming them with thin metallic objects. Although illegal, those acts will also contribute to delaying the restoration of full service.

More thoughts tomorrow on blockades that reduce the number of people reaching Metro Rail stops.

PS: It’s unfortunate the Metro doesn’t use tokens anymore. There are some interesting things that can happen with tokens.

by Patrick Durusau at January 15, 2017 08:38 PM

January 14, 2017

Patrick Durusau

New Spaceship Speed in Conway’s Game of Life

New Spaceship Speed in Conway’s Game of Life by Alexey Nigin.

From the post:

In this article, I assume that you have basic familiarity with Conway’s Game of Life. If this is not the case, you can try reading an explanatory article but you will still struggle to understand much of the following content.

The day before yesterday ConwayLife.com forums saw a new member named zdr. When we the lifenthusiasts meet a newcomer, we expect to see things like “brand new” 30-cell 700-gen methuselah and then have to explain why it is not notable. However, what zdr showed us made our jaws drop.

It was a 28-cell c/10 orthogonal spaceship:

An animated image of the spaceship

… (emphasis in the original)

The mentioned introduction isn’t sufficient to digest the material in this post.

There is a wealth of material available on cellular automata (the Game of Life is one).

LifeWiki is one and Complex Cellular Automata is another. While not exhaustive of all there is to know about cellular automata, familiarity with either will take some time and skill.
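For readers new to the notation, a speed of c/10 orthogonal means the pattern reappears shifted one cell horizontally or vertically every 10 generations. The Python sketch below measures displacement for the familiar glider, which travels diagonally at c/4. The glider is only a stand-in: zdr’s 28-cell ship is not reproduced here, and the function names are mine.

```python
from collections import Counter


def step(live):
    """Advance a set of live (x, y) cells by one generation of Conway's rules."""
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def displacement_after(live, generations):
    """Return (dx, dy) if the pattern reappears shifted, otherwise None."""
    later = live
    for _ in range(generations):
        later = step(later)
    if len(later) != len(live):
        return None
    # Compare bounding-box corners to find the candidate shift.
    dx = min(x for x, _ in later) - min(x for x, _ in live)
    dy = min(y for _, y in later) - min(y for _, y in live)
    shifted = {(x + dx, y + dy) for x, y in live}
    return (dx, dy) if shifted == later else None


# The standard glider, with y increasing downward.
GLIDER = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

if __name__ == "__main__":
    print(displacement_after(GLIDER, 4))
```

One cell diagonally in four generations is what c/4 diagonal means. A c/10 orthogonal ship would report a shift of one cell along a single axis after ten generations.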

Still, I offer this as encouragement that fundamental discoveries remain to be made.

But if and only if you reject conventional wisdom that prevents you from looking.

by Patrick Durusau at January 14, 2017 10:09 PM

Looking up words in the OED with XQuery [Details on OED API Key As Well]

Looking up words in the OED with XQuery by Clifford Anderson.

Clifford has posted a gist of work from the @VandyLibraries XQuery group, looking up words in the Oxford English Dictionary (OED) with XQuery.

To make full use of Clifford’s post, you will need an API key for the Oxford Dictionaries API.

If you go straight to the regular Oxford English Dictionary (I’m omitting the URL so you don’t make the same mistake), there is nary a mention of the Oxford Dictionaries API.

The free plan allows 3K queries a month.

Not enough to shut out the outside world for the next four/eight years but enough to decide if it’s where you want to hide.

Application for the free API key was simple enough.

Save that the dumb password checker insisted on one or more special characters, plus one or more digits, plus upper and lowercase. When you get beyond 12 characters the insistence on a special character is just a little lame.
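The arithmetic behind that complaint, as a rough Python sketch. It assumes every character is chosen uniformly at random, which human-chosen passwords never are, and the alphabet sizes are my assumptions: 62 for upper case, lower case and digits, 94 for printable ASCII without the space.

```python
import math

LETTERS_AND_DIGITS = 62  # assumed: A-Z, a-z, 0-9
PRINTABLE_ASCII = 94     # assumed: printable ASCII, space excluded


def keyspace_bits(alphabet_size: int, length: int) -> float:
    """Bits needed to enumerate every password of this length and alphabet."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    long_plain = keyspace_bits(LETTERS_AND_DIGITS, 13)
    short_special = keyspace_bits(PRINTABLE_ASCII, 8)
    print(f"13 characters, no specials: {long_plain:.1f} bits")
    print(f"8 characters, with specials: {short_special:.1f} bits")
```

Under those assumptions, thirteen characters without specials cover a far larger keyspace than eight characters with them, so length buys more than the special character does.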

Email response with the key was fast, so I’m in!

What about you?

by Patrick Durusau at January 14, 2017 09:15 PM

D-Wave Just Open-Sourced Quantum Computing [DC Beltway Parking Lot Distraction]

D-Wave Just Open-Sourced Quantum Computing by Dom Galeon.

D-Wave has just released a welcome distraction for CS types sitting in the DC Beltway Parking Lot on January 20-21, 2017. (I’m assuming you brought extra batteries for your laptop.) After you run out of gas, your laptop will be running on battery power alone.

Just remember to grab a copy of Qbsolv before you leave for the tailgate/parking lot party on the Beltway.

A software tool known as Qbsolv allows developers to program D-Wave’s quantum computers even without knowledge of quantum computing. It has already made it possible for D-Wave to work with a bunch of partners, but the company wants more. “D-Wave is driving the hardware forward,” Bo Ewald, president of D-Wave International, told Wired. “But we need more smart people thinking about applications, and another set thinking about software tools.”

To that end, D-Wave has open-sourced Qbsolv, making it possible for anyone to freely share and modify the software. D-Wave hopes to build an open source community of sorts for quantum computing. Of course, to actually run this software, you’d need access to a piece of hardware that uses quantum particles, like one of D-Wave’s quantum computers. However, for the many who don’t have that access, the company is making it possible to download a D-Wave simulator that can be used to test Qbsolv on other types of computers.

This open-source Qbsolv joins an already-existing free software tool called Qmasm, which was developed by one of Qbsolv’s first users, Scott Pakin of Los Alamos National Laboratory. “Not everyone in the computer science community realizes the potential impact of quantum computing,” said mathematician Fred Glover, who’s been working with Qbsolv. “Qbsolv offers a tool that can make this impact graphically visible, by getting researchers and practitioners involved in charting the future directions of quantum computing developments.”

D-Wave’s machines might still be limited to solving optimization problems, but it’s a good place to start with quantum computers. Together with D-Wave, IBM has managed to develop its own working quantum computer in 2000, while Google teamed up with NASA to make their own. Eventually, we’ll have a quantum computer that’s capable of performing all kinds of advanced computing problems, and now you can help make that happen.

From the github page:

qbsolv is a metaheuristic or partitioning solver that solves a potentially large quadratic unconstrained binary optimization (QUBO) problem by splitting it into pieces that are solved either on a D-Wave system or via a classical tabu solver.

The phrase, “…might still be limited to solving optimization problems…” isn’t as limiting as it might appear.
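If QUBO is new to you, the problem is to pick 0/1 values for variables so that a sum of weighted products is as small as possible. The Python sketch below solves a made-up three-variable instance by brute force. This is not how qbsolv works (it partitions the problem and uses a tabu solver or a D-Wave system); the matrix Q and the function names are hypothetical.

```python
from itertools import product


def qubo_energy(Q, x):
    """Sum Q[i, j] * x[i] * x[j] over all entries for a 0/1 assignment x."""
    return sum(weight * x[i] * x[j] for (i, j), weight in Q.items())


def brute_force_qubo(Q, n):
    """Try all 2**n assignments and return (lowest energy, assignment)."""
    return min((qubo_energy(Q, x), x) for x in product((0, 1), repeat=n))


# Hypothetical instance: each variable earns -1 when selected, but selecting
# neighbors 0-1 or 1-2 together costs +2.
Q = {(0, 0): -1, (1, 1): -1, (2, 2): -1, (0, 1): 2, (1, 2): 2}

if __name__ == "__main__":
    energy, assignment = brute_force_qubo(Q, 3)
    print(energy, assignment)
```

Brute force doubles its work with every added variable, which is why partitioning solvers and specialized hardware are of interest for large problems.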

A recent (2014) survey of quadratic unconstrained binary optimization (QUBO), The Unconstrained Binary Quadratic Programming Problem: A Survey, runs some thirty-three pages and should keep you occupied however long you sit on the DC Beltway.

From page 10 of the survey:


Kochenberger, Glover, Alidaee, and Wang (2005) examine the use of UBQP as a tool for clustering microarray data into groups with high degrees of similarity.

Where I read one person’s “similarity” to be another person’s test of “subject identity.”

PS: Enjoy the DC Beltway. You may never see it motionless ever again.

by Patrick Durusau at January 14, 2017 02:10 AM

Calling Bullshit in the Age of Big Data (Syllabus)

Calling Bullshit in the Age of Big Data by Carl T. Bergstrom and Jevin West.

From the about page:

The world is awash in bullshit. Politicians are unconstrained by facts. Science is conducted by press release. So-called higher education often rewards bullshit over analytic thought. Startup culture has elevated bullshit to high art. Advertisers wink conspiratorially and invite us to join them in seeing through all the bullshit, then take advantage of our lowered guard to bombard us with second-order bullshit. The majority of administrative activity, whether in private business or the public sphere, often seems to be little more than a sophisticated exercise in the combinatorial reassembly of bullshit.

We’re sick of it. It’s time to do something, and as educators, one constructive thing we know how to do is to teach people. So, the aim of this course is to help students navigate the bullshit-rich modern environment by identifying bullshit, seeing through it, and combatting it with effective analysis and argument.

What do we mean, exactly, by the term bullshit? As a first approximation, bullshit is language intended to persuade by impressing and overwhelming a reader or listener, with a blatant disregard for truth and logical coherence.

While bullshit may reach its apogee in the political sphere, this isn’t a course on political bullshit. Instead, we will focus on bullshit that comes clad in the trappings of scholarly discourse. Traditionally, such highbrow nonsense has come couched in big words and fancy rhetoric, but more and more we see it presented instead in the guise of big data and fancy algorithms — and these quantitative, statistical, and computational forms of bullshit are those that we will be addressing in the present course.

Of course an advertisement is trying to sell you something, but do you know whether the TED talk you watched last night is also bullshit — and if so, can you explain why? Can you see the problem with the latest New York Times or Washington Post article fawning over some startup’s big data analytics? Can you tell when a clinical trial reported in the New England Journal or JAMA is trustworthy, and when it is just a veiled press release for some big pharma company?

Our aim in this course is to teach you how to think critically about the data and models that constitute evidence in the social and natural sciences.

Learning Objectives

Our learning objectives are straightforward. After taking the course, you should be able to:

  • Remain vigilant for bullshit contaminating your information diet.
  • Recognize said bullshit whenever and wherever you encounter it.
  • Figure out for yourself precisely why a particular bit of bullshit is bullshit.
  • Provide a statistician or fellow scientist with a technical explanation of why a claim is bullshit.
  • Provide your crystals-and-homeopathy aunt or casually racist uncle with an accessible and persuasive explanation of why a claim is bullshit.

We will be astonished if these skills do not turn out to be among the most useful and most broadly applicable of those that you acquire during the course of your college education.

A great syllabus and impressive set of readings, although I must confess my disappointment that Is There a Text in This Class? The Authority of Interpretive Communities and Doing What Comes Naturally: Change, Rhetoric, and the Practice of Theory in Literary and Legal Studies, both by Stanley Fish, weren’t on the list.

Bergstrom and West are right about the usefulness of this “class” but I would use Fish and other literary critics to push your sensitivity to “bullshit” a little further than the readings indicate.

All communication is an attempt to persuade within a social context. If you share a context with a speaker, you are far more likely to recognize and approve of their use of “evidence” to make their case. If you don’t share such a context, say a person claiming a particular interpretation of the Bible due to divine revelation, their case doesn’t sound like it has any evidence at all.

It’s a subtle point but one known in the legal, literary and philosophical communities for a long time. That it’s new to scientists and/or data scientists speaks volumes about the lack of humanities education in science majors.

by Patrick Durusau at January 14, 2017 12:33 AM

January 13, 2017

Patrick Durusau

Security Design: Stop Trying to Fix the User (Or Catch Offenders)

Security Design: Stop Trying to Fix the User by Bruce Schneier.

From the post:

Every few years, a researcher replicates a security study by littering USB sticks around an organization’s grounds and waiting to see how many people pick them up and plug them in, causing the autorun function to install innocuous malware on their computers. These studies are great for making security professionals feel superior. The researchers get to demonstrate their security expertise and use the results as “teachable moments” for others. “If only everyone was more security aware and had more security training,” they say, “the Internet would be a much safer place.”

Enough of that. The problem isn’t the users: it’s that we’ve designed our computer systems’ security so badly that we demand the user do all of these counterintuitive things. Why can’t users choose easy-to-remember passwords? Why can’t they click on links in emails with wild abandon? Why can’t they plug a USB stick into a computer without facing a myriad of viruses? Why are we trying to fix the user instead of solving the underlying security problem?

Traditionally, we’ve thought about security and usability as a trade-off: a more secure system is less functional and more annoying, and a more capable, flexible, and powerful system is less secure. This “either/or” thinking results in systems that are neither usable nor secure.

Non-reliance on users is a good first step.

An even better second step would create financial incentives for Bruce’s first step.

Financial incentives similar to those in products liability cases, where a “reasonable care” standard evolves over time. No product has to be perfect, but there are expectations of how not bad a product must be.

Liability not only for the producer of the software but also for enterprises using that software, when third parties are hurt by data breaches.

Claims about the complexity of software are true, but can you honestly say that software is more complex than drug interactions across an unknown population? Yet, we have products liability standards for those cases.

Without financial incentives, substantial financial incentives, such as with products liability, cybersecurity experts (Bruce excepted) will still be trying to “fix the user” a decade from now.

The romantic quest to capture and punish those guilty of cybercrime hasn’t worked so well. One collection of cybercrime statistics pointed out that detected cybercrime incidents increased by 38% in the last year.

Tell me, do you know of any statistics showing a 38% increase in the arrest and prosecution of cybercriminals in the last year? No? That’s what I thought.

With estimated cybercrime prevention spending at $80 billion this year and an estimated cybercrime cost of $2 trillion by 2019, you don’t seem to be getting very much return on your investment.
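To put a rough number on that return, here is a minimal Python sketch using only the two estimates quoted above. The figures refer to different years (spending "this year" against a cost "by 2019"), so treat the ratio as a crude comparison rather than a measured result. The variable and function names are mine, chosen for illustration.

```python
# Estimates quoted in the post; they cover different years, so the
# resulting ratio is only a rough comparison, not a measured return.
PREVENTION_SPENDING_USD = 80_000_000_000   # estimated spending "this year"
PROJECTED_COST_USD = 2_000_000_000_000     # estimated cybercrime cost "by 2019"


def cost_per_dollar_spent(cost, spending):
    """Return the projected cost for every dollar spent on prevention."""
    return cost / spending


ratio = cost_per_dollar_spent(PROJECTED_COST_USD, PREVENTION_SPENDING_USD)
print(f"Projected cybercrime cost per prevention dollar: ${ratio:.0f}")
```

On those estimates, every dollar of prevention spending sits against roughly twenty-five dollars of projected cost.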

We know that fixing users doesn’t work and capturing cybercriminals is a dicey proposition.

Both of those issues can be addressed by establishing incentives for more secure software. (Legal liability takes legislative misjudgment out of the loop, enabling the organic growth of software liability principles.)

by Patrick Durusau at January 13, 2017 09:09 PM

Ultrasound Tracking Defeats Tor (Provides Pathway Into Government Offices)

Tor users at risk of being unmasked by ultrasound tracking by Danny Bradbury.

How close is your phone to your computer right now?

That close?

You may want to rethink your phone’s location.

From the post:

A new type of attack should make Tor users – and countless dogs around the world – prick up their ears. The attack, revealed at BlackHat Europe in November and at the 33rd Chaos Computer Congress the following month, uses ultrasounds to track users, even if they are communicating over anonymous networks.

The attack uses a technique called ultrasound cross-device tracking (uXDT), which made its way into advertising circles as early as 2012. Marketing companies running uXDT campaigns will play an ultrasonic sound, inaudible to the human ear, in a TV or radio ad, or even in an ad delivered via a computer browser.

Although the user won’t hear it, other devices such as smartphones using uXDT-enabled apps will be listening. When the app hears the signal, it will ping the advertising network with details about itself. What details? Anything it asks the phone for, such as its IP address, geolocation coordinates, telephone number and IMEI (SIM card) code.

That’s creepy enough in marketing. Now, advertisers can tell what TV or radio ads you’ve been listening to, matching them with the universe of other information they have about you from your web searches, social media activity and emails.

In essence the technique uses an ultrasound “beacon” to trigger your phone to “call home.”

Hmmm, betrayed by your own phone.

Danny outlines a number of scenarios of governments using this technique against users.

Ultrasound tracking poses a significant risk for Tor users, but they are security conscious enough to be using Tor.

Consider the flip side of using ultrasound tracking as a pathway into government offices. A phone that can “call home” can certainly listen for keystrokes.

Where do you think most sysadmins keep their phones? ;-)

by Patrick Durusau at January 13, 2017 07:27 PM

ODI – Access To Legal Data News

Strengthening our legal data infrastructure by Amanda Smith.

Amanda recounts an effort between the Open Data Institute (ODI) and Thomson Reuters to improve access to legal data.

From the post:


Paving the way for a more open legal sector: discovery workshop

In September 2016, Thomson Reuters and the ODI gathered publishers of legal data, policy makers, law firms, researchers, startups and others working in the sector for a discovery workshop. Its aims were to explore important data types that exist within the sector, and map where they sit on the data spectrum, discuss how they flow between users and explore the opportunities that taking a more open approach could bring.

The notes from the workshop explore current mechanisms for collecting, managing and publishing data, benefits of wider access and barriers to use. There are certain questions that remain unanswered – for example, who owns the copyright for data collected in court. The notes are open for comments, and we invite the community to share their thoughts on these questions, the data types discussed, how to make them more open and what we might have missed.

Strengthening data infrastructure in the legal sector: next steps

Following this workshop we are working in partnership with Thomson Reuters to explore data infrastructure – datasets, technologies and processes and organisations that maintain them – in the legal sector, to inform a paper to be published later in the year. The paper will focus on case law, legislation and existing open data that could be better used by the sector.

The Ministry of Justice have also started their own data discovery project, which the ODI have been contributing to. You can keep up to date on their progress by following the MOJ Digital and Technology blog and we recommend reading their data principles.

Get involved

We are looking to the legal and data communities to contribute opinion pieces and case studies to the paper on data infrastructure for the legal sector. If you would like to get involved, contact us.
…(emphasis in original)

Encouraging news, especially for those interested in building value-added tools on top of data that is made available publicly. At least they can avoid the cost of collecting data already collected by others.

Take the opportunity to comment on the notes and participate as you are able.

If you think you have seen use cases for topic maps before, consider that the Code of Federal Regulations (US), as of December 12, 2016, has 54938 separate, but not unique, definitions of “person.” The impact of each regulation depends upon its definition of that term.

Other terms have similar semantic difficulties in both the Code of Federal Regulations and the US Code.
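As a minimal sketch of the “separate but not unique” distinction, the Python below groups definitions by their normalized wording, so that identical texts found at different locations collapse into one entry. The sample definitions and location labels are invented placeholders, not text from the Code of Federal Regulations, and normalizing case and whitespace is only one assumed way to decide that two definitions are “the same.”

```python
from collections import defaultdict


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def group_definitions(definitions):
    """Map each distinct wording to the list of locations where it appears."""
    groups = defaultdict(list)
    for location, text in definitions:
        groups[normalize(text)].append(location)
    return dict(groups)


# Invented placeholder data, NOT real regulatory text.
sample = [
    ("regulation-1", "Person means an individual or a corporation."),
    ("regulation-2", "person  means an individual or a corporation."),
    ("regulation-3", "Person means any individual."),
]

groups = group_definitions(sample)
print(len(sample), "separate definitions,", len(groups), "distinct wordings")
```

A topic map would go further and record which regulations share a definition, which is what the lists of locations in each group begin to capture.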

by Patrick Durusau at January 13, 2017 05:44 PM

Cellebrite Hacked (Crowd-Funding for Tools?)

Phone-Hacking Firm Cellebrite Got Hacked; 900GB of Data Stolen by Swati Khandelwal.

From the post:

Israeli firm Cellebrite, the popular company that provides digital forensics tools and software to help law enforcement access mobile phones in investigations, has had 900 GB of its data stolen by an unknown hacker.

But the hacker has not yet publicly released anything from the stolen data archive, which includes its customer information, user databases, and a massive amount of technical data regarding its hacking tools and products.

Instead, attackers are looking for possible opportunities to sell the access to Cellebrite system and data on a few selected IRC chat rooms, the hacker told Joseph Cox, contributor at Motherboard, who was contacted by the hacker and received a copy of the stolen data.

I can understand the hacker’s desire to make money and if, unlike TheShadowBrokers, who are still pricing themselves out of a sale (approximately $8,230,000), the price is a reasonable one, crowd-funding might be a useful approach to purchasing the tools for public release.

I can’t afford to bid on the tools as an individual, but would contribute to a crowd-funded effort to secure a public release of the tools.

Why? The more hacking tools that are available, the less secure governments become.

People become less secure as well but governments are a far greater threat to people than cyber-criminals will ever be.

Cyber-criminals want your money, governments want your freedom.

by Patrick Durusau at January 13, 2017 04:07 PM

Humanities Digital Library [A Ray of Hope]

Humanities Digital Library (Launch Event)

From the webpage:

Date
17 Jan 2017, 18:00 to 17 Jan 2017, 19:00

Venue

IHR Wolfson Conference Suite, NB01/NB02, Basement, IHR, Senate House, Malet Street, London WC1E 7HU

Description

6-7pm, Tuesday 17 January 2017

Wolfson Conference Suite, Institute of Historical Research

Senate House, Malet Street, London, WC1E 7HU

www.humanities-digital-library.org

About the Humanities Digital Library

The Humanities Digital Library is a new Open Access platform for peer reviewed scholarly books in the humanities.

The Library is a joint initiative of the School of Advanced Study, University of London, and two of the School’s institutes—the Institute of Historical Research and the Institute of Advanced Legal Studies.

From launch, the Humanities Digital Library offers scholarly titles in history, law and classics. Over time, the Library will grow to include books from other humanities disciplines studied and researched at the School of Advanced Study. Partner organisations include the Royal Historical Society whose ‘New Historical Perspectives’ series will appear in the Library, published by the Institute of Historical Research.

Each title is published as an open access PDF, with copies also available to purchase in print and EPUB formats. Scholarly titles come in several formats—including monographs, edited collections and longer and shorter form works.
(emphasis in the original)

Timely evidence that not everyone in the UK is barking mad! “Barking mad” being the only explanation I can offer for the Investigatory Powers Bill.

I won’t be attending but if you can, do, and support the Humanities Digital Library after it opens.

by Patrick Durusau at January 13, 2017 03:16 PM

The People vs the Snoopers’ Charter [No Input = No Surveillance, Of Gaff Hooks]

The People vs the Snoopers’ Charter

From the webpage:


Ever googled something personal?

Who you text, email or call. Your social media activity. Which websites you visit.

Who you bank with. Where your kids go to school. Your sexual preferences, health worries, religious and political beliefs.

Since November, the Snoopers’ Charter – the Investigatory Powers Act – has let the Government access all this intimate information, building up an incredibly detailed picture of you, your family and friends, your hobbies and habits – your entire life.

And it won’t just be accessed by the Home Secretary. Dozens of agencies – the Department for Work and Pensions, HMRC and 46 others – can now see sensitive details of your personal life.

Over 200,000 people signed a petition to stop the Snoopers’ Charter, the Government didn’t listen so we’re taking them to court and we need your help.

There’s no opt-out and you don’t need to be suspected of anything. It will just happen all the time, to every one of us.

The Investigatory Powers Act lets Government keep records of and monitor your private emails, texts and phone calls – that’s where you are, who you speak to, what you say – and all without any suspicion of wrongdoing.

It forces internet companies like Sky, BT and TalkTalk to log every website you visit or app you have used, creating a vast database of deeply sensitive and revealing information. At a time when companies and governments are under increasingly frequent attack from hackers, this will create a goldmine for criminals and foreign spies.

Your support will help us clear the first hurdle, being granted permission by the Court to proceed with our case against the Government.

It’s time we all took a stand. We’ve told the Government we’ll see them in court and we need your help to make that happen. Please donate whatever you can to fund this vital case.
… (emphasis in original)

In case you are missing the background, see: Investigatory Powers Act 2016, which is now law in the UK.

The text as originally enacted.

The true extent of surveillance in the United States is unknown so it isn’t clear if the UK was playing “catch up” with this draconian measure or trying to beat the United States in a race to the least civil society.

Either way, it is an unfortunate milestone in the legal history of a country that gave us the common law.

surveillance-camera-460

From a data science perspective, I would point out that no input = no surveillance.

Your eyes may be better than mine but in the surveillance camera image, I count at least three vulnerabilities that would render the camera useless.

Ordinary wire cutters:

cutters-460

won’t be useful but a gaff hook could be quite effective in creating a no input state.

The same principle applies whether you choose a professionally made gaff hook or some DIY version of the same instrument.

A gaff hook won’t stop surveillance of ISPs, etc., but disabling a surveillance camera could be seen as poking the government in the eye.

That’s an image I can enjoy. You?

PS: I’m not intimate with UK criminal law. Is possession of a gaff hook legal in the UK?

by Patrick Durusau at January 13, 2017 02:58 PM

Applied Computational Genomics Course at UU: Spring 2017

Applied Computational Genomics Course at UU: Spring 2017 by Aaron Quinlan.

I initially noticed this resource from posts on the two-part Introduction to Unix (part 1) and Introduction to Unix (part 2).

Both are too elementary for you but are something you can pass on to others. They do give you an idea of the Unix skill level required for the rest of the course.

From the GitHub page:

This course will provide a comprehensive introduction to fundamental concepts and experimental approaches in the analysis and interpretation of experimental genomics data. It will be structured as a series of lectures covering key concepts and analytical strategies. A diverse range of biological questions enabled by modern DNA sequencing technologies will be explored including sequence alignment, the identification of genetic variation, structural variation, and ChIP-seq and RNA-seq analysis. Students will learn and apply the fundamental data formats and analysis strategies that underlie computational genomics research. The primary goal of the course is for students to be grounded in theory and leave the course empowered to conduct independent genomic analyses. (emphasis in the original)

I take it successful completion will also enable you to intelligently question genomic analyses by others.

The explosive growth of genomics makes that a valuable skill in public discussions as well as something nice for your toolbox.

by Patrick Durusau at January 13, 2017 02:39 AM

Stanford CoreNLP – a suite of core NLP tools (3.7.0)

Stanford CoreNLP – a suite of core NLP tools

The beta is over and Stanford CoreNLP 3.7.0 is on the street!

From the webpage:

Stanford CoreNLP provides a set of natural language analysis tools. It can give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases and word dependencies, indicate which noun phrases refer to the same entities, indicate sentiment, extract particular or open-class relations between entity mentions, get quotes people said, etc.

Choose Stanford CoreNLP if you need:

  • An integrated toolkit with a good range of grammatical analysis tools
  • Fast, reliable analysis of arbitrary texts
  • The overall highest quality text analytics
  • Support for a number of major (human) languages
  • Available interfaces for most major modern programming languages
  • Ability to run as a simple web service

Stanford CoreNLP’s goal is to make it very easy to apply a bunch of linguistic analysis tools to a piece of text. A tool pipeline can be run on a piece of plain text with just two lines of code. CoreNLP is designed to be highly flexible and extensible. With a single option you can change which tools should be enabled and which should be disabled. Stanford CoreNLP integrates many of Stanford’s NLP tools, including the part-of-speech (POS) tagger, the named entity recognizer (NER), the parser, the coreference resolution system, sentiment analysis, bootstrapped pattern learning, and the open information extraction tools. Moreover, an annotator pipeline can include additional custom or third-party annotators. CoreNLP’s analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications.

What stream of noise, sorry, news are you going to pipeline into the Stanford CoreNLP framework?

;-)

Imagine a web service that offers levels of analysis alongside news text.

Or does the same with leaked emails and/or documents?
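A minimal sketch of what such a service might call, assuming you have already started the CoreNLP server yourself (for example with java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000). The localhost address, the choice of annotators, and the helper names build_request, annotate and named_entities are my assumptions for illustration. The JSON field names (sentences, tokens, word, ner) reflect my understanding of the server's JSON output and should be checked against the release you run.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a CoreNLP server is listening locally on port 9000.
SERVER_URL = "http://localhost:9000"


def build_request(text, annotators="tokenize,ssplit,pos,ner", server_url=SERVER_URL):
    """Build a POST request asking the server to annotate `text` as JSON."""
    properties = {"annotators": annotators, "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return urllib.request.Request(
        f"{server_url}/?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def annotate(text, **kwargs):
    """Send the text to the server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


def named_entities(annotation):
    """Collect (word, NER tag) pairs for tokens tagged as entities."""
    return [
        (token["word"], token["ner"])
        for sentence in annotation.get("sentences", [])
        for token in sentence.get("tokens", [])
        if token.get("ner", "O") != "O"
    ]
```

With a server running, named_entities(annotate(story_text)) would give a first, crude layer of analysis to display alongside a news story or leaked document.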

by Patrick Durusau at January 13, 2017 02:16 AM

Interactive Color Wheel

Interactive Color Wheel

color-wheel-460

You will need to visit this interactive color wheel to really appreciate its capabilities.

What I find most helpful is the display of hex codes for the colors. I can distinguish colors but getting the codes right can be a real challenge.

Enjoy!

by Patrick Durusau at January 13, 2017 02:05 AM

January 12, 2017

Patrick Durusau

Inaugural Ball Cancellation!

Your antipathy towards upcoming inaugural balls and work on possible blockades is having an impact!

The Arkansas Inaugural Ball has been cancelled due to “low demand.”

The listing of inauguration balls I pointed to yesterday has plenty of other places for blockades and other mischief.

Looking forward to the least attended inauguration in history, longest and largest traffic snarl in history and complete social disasters at the inauguration balls.

by Patrick Durusau at January 12, 2017 10:26 PM

Flashing/Mooning For Inauguration Forecast

You can find an updated weather forecast for January 20, 2017, updating my speculations in Blockading Washington – #DisruptJ20 – Unusual Tactic – Nudity, in Angela Fritz’s Here’s the first of what will surely be many inauguration weather forecasts.

Angela isn’t reporting sun-bathing weather but warm enough that a heavy coat over your birthday suit may be sufficient.

Of course, you could always build a fire in a trash barrel, something we are likely to see a lot of during the Trump presidency.

I’m sure other protesters, in the buff or not, will appreciate the extra warmth.

by Patrick Durusau at January 12, 2017 02:55 AM

Missing The Beltway Blockade? Considering Blockading A Ball?

For one reason or another, you may not be able to participate in a Beltway Blockade January 20, 2017, see:

Don’t Panic!

You can still enjoy a non-permitted protest and contribute to the least attended inauguration in history!

2017 Presidential Inaugural Balls

The list is short on location information for many of the scheduled balls but the Commander in Chief’s Ball, Presidential Inaugural Ball, Mid-Atlantic Inauguration Ball, Midwest Inaugural Ball, Western Inaugural Ball, and the Neighborhood Inaugural Ball are all being held at the Walter E. Washington Convention Center.

Apologies, I haven’t looked up prior attendance records, but based on known scheduling alone, disruption in the area of the Walter E. Washington Convention Center looks like it will pay the highest returns.

For the balls with location information and/or location information that I can discover, I will post a fuller list with Google Map links tomorrow.

Oh, for inside protesting, here are floor plans of the Walter E. Washington Convention Center.

Those are the official, posted floor plans.

Should that link go dark, let me know. I have a backup copy of them. ;-)

by Patrick Durusau at January 12, 2017 12:18 AM

January 11, 2017

Patrick Durusau

Overcoming Congressional Provincialism

While doing hard core data collection on members of Congress, I kept encountering:

Regrettably, I am unable to reply to any email from individuals residing outside of my congressional district.

The problem, of course, is that you may have an opinion on national intelligence but your representative, for example, isn’t on the intelligence committee.

What if you could identify and reach across congressional boundaries?

More on that tomorrow, along with news of the data set that has distracted me for several days!

by Patrick Durusau at January 11, 2017 03:25 AM

January 10, 2017

Patrick Durusau

ANSWER Secures More “Permitted” Protest Space

inaugurate-banner-460

The ANSWER Coalition has:

…secured another permitted assembly area for an even larger gathering site on the parade route, the Navy Memorial (8th St. and Pennsylvania Ave. NW).

See: ANSWER Coalition for details and ways to support.

If you’re not interested in “permitted” protesting, it could be the case that attendees and protesters alike find it difficult if not impossible to attend the inauguration. See protests for an ongoing series of speculations in that direction.

PS: I remember the Constitution reading:

Congress shall make no law … abridging…the right of the people peaceably to assemble…

NOT:

Congress shall make no law … abridging…the right of the people to be permitted to assemble…

Do you?

by Patrick Durusau at January 10, 2017 02:01 AM

January 07, 2017

Patrick Durusau

> 3000 for 2017? – Defining Blockade Success

Trump Inauguration Planners Unveil Tickets, Map says:


About 3,000 people holding purple tickets got stuck on foot in the Interstate 395 tunnel when trying to attend President Barack Obama’s first inauguration, causing many of them to miss the ceremony. The tunnel was later nicknamed the Purple Tunnel of Doom.

Data science projects should define criteria for success before the project starts.

That helps prevent management from moving the goal posts to claim victory where none exists and protects you from “but it doesn’t ….” when that feature wasn’t included in the criteria for success.

In efforts to #DisruptJ20 the Trump inauguration, it appears that at least 3K people must be prevented from reaching the inauguration.

For your planning purposes, the 2017 SWEARING IN CEREMONY INFORMATION (FAQ) advises:


What time should I get to the US Capitol for the ceremony?

The gates to the mall will most likely open at 5:00am, and the ticketed areas are usually filled by 8:00am. The ceremony will begin around 11:30am with a musical performance prior to that time.

Blockaders are in for a long night!

The rate of removal of cars that intentionally or unintentionally run out of gas, disabled vehicles (think flat tires), etc., is unknown.

As a guesstimate, I would say gridlock conditions starting around 3 AM and persisting until NOON, EST, would result in an inauguration that a majority of the ticket holders did not attend.

I wonder if the news channels will focus more on protesters or empty bleachers? Guesses?

by Patrick Durusau at January 07, 2017 09:41 PM

Implementing Indivisible – Early Difficulties

Indivisible (Indivisible: A Practical Guide for Resisting the Trump Agenda) recommends appearing at every public appearance of your representative.

The same logic applies to other representatives since their committees vote on issues that impact you.

Question: How do you find all the offices of members of the US House or Senate?

Answer: Not easily.

Take Senator Dianne Feinstein for example.

First source

http://bioguide.congress.gov/scripts/biodisplay.pl?index=F000062 gives:

bioguide-feinstein-460

Second Source

GPO Congressional Directory provides:

gpo-feinstein-460

Third Source

https://www.congress.gov/member/dianne-feinstein/F000062 lists:

members-feinstein-460

Local office information appears at Senator Dianne Feinstein, but its listing varies from page to page, making automated extraction an iffy proposition.

Empowering Indivisible

Congress is sorely in need of a topic map for its members, that much is obvious.
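As a minimal sketch of the merging a topic map would do, the Python below treats the Bioguide ID that appears in two of the URLs above (F000062) as the subject identifier and merges whatever each source says under that key. The regular expression for the ID format, the record layout, and the office values (deliberately meaningless placeholders) are my assumptions for illustration. Scraping the real pages is left out precisely because their layout varies.

```python
import re
from collections import defaultdict

# Assumption: a Bioguide ID is one capital letter followed by six digits.
BIOGUIDE_ID = re.compile(r"\b([A-Z]\d{6})\b")


def bioguide_id(url):
    """Pull the Bioguide ID out of a URL, or return None if there is none."""
    match = BIOGUIDE_ID.search(url)
    return match.group(1) if match else None


def merge_by_member(records):
    """Merge records from different sources that identify the same member."""
    merged = defaultdict(lambda: {"sources": [], "offices": set()})
    for record in records:
        key = bioguide_id(record["url"])
        if key is None:
            continue  # no shared identifier, so nothing safe to merge on
        merged[key]["sources"].append(record["url"])
        merged[key]["offices"].update(record["offices"])
    return dict(merged)


# URLs are from the post; the office values are placeholders, not real data.
records = [
    {"url": "http://bioguide.congress.gov/scripts/biodisplay.pl?index=F000062",
     "offices": []},
    {"url": "https://www.congress.gov/member/dianne-feinstein/F000062",
     "offices": ["PLACEHOLDER office A"]},
]

merged = merge_by_member(records)
print(sorted(merged))
```

The design choice is the usual topic map one: merge on a shared identifier instead of on names, since a name can be spelled differently from source to source.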

What does the lack of an easy way to find local office information suggest to you?

Would local office information improve your odds of contacting your own representative and others?

by Patrick Durusau at January 07, 2017 08:59 PM

January 06, 2017

Patrick Durusau

Online Database of “Verified” Twitter Accounts (Right On!)

The WikiLeaks Task Force tweeted on 6 Jan. 2017:

We are thinking of making an online database with all “verified” twitter accounts & their family/job/financial/housing relationships.

There are a number of comments on this tweet. The ones containing “dox,” “doxx,” “doxing,” “creepy,” “evil,” etc. should be ignored.

Ignored because intelligence agencies, news organizations, merchants, banks, etc. are all collecting and organizing that data and more.

Ignored because the public should not preemptively disarm itself.

If anything, the WikiLeaks Task Force should start with “verified” Twitter accounts and expand outwards, rapidly.

The public should be able to rapidly find relationships of individuals nominated for office, who contribute money to candidates, who profit from contracts, who launder public money. The public should have the same advantages intelligence agencies enjoy today.

To the nay-sayers to the WikiLeaks Task Force proposal:

Why do you seek to prevent putting the public on a better footing vis-a-vis government?

Question to my readers: What do the nay-sayers gain from a disarmed public?

by Patrick Durusau at January 06, 2017 09:12 PM

Three More Reasons To Learn R

Three reasons to learn R today by David Smith.

From the post:

If you're just getting started with data science, the Sharp Sight Labs blog argues that R is the best data science language to learn today.

The blog post gives several detailed reasons, but the main arguments are:

  1. R is an extremely popular (arguably the most popular) data programming language, and ranks highly in several popularity surveys.
  2. Learning R is a great way of learning data science, with many R-based books and resources for probability, frequentist and Bayesian statistics, data visualization, machine learning and more.
  3. Python is another excellent language for data science, but with R it's easier to learn the foundations.

Once you've learned the basics, Sharp Sight also argues that R is also a great data science language to master, even though it's an old language compared to some of the newer alternatives. Every tool has a shelf life, but R isn't going anywhere and learning R gives you a foundation beyond the language itself.

If you want to get started with R, Sharp Sight labs offers a data science crash course. You might also want to check out the Introduction to R for Data Science course on EdX.

Sharp Sight Labs: Why R is the best data science language to learn today, and Why you should master R (even if it might eventually become obsolete)

If you need more reasons to learn R:

  • Unlike Facebook, R isn’t a sinkhole of non-testable propositions.
  • Unlike Instagram, R is rarely NSFW.
  • Unlike Twitter, R is a marketable skill.

Glad to hear you are learning R!

by Patrick Durusau at January 06, 2017 08:31 PM

ANSWER Protest Permit – Least Attended Inauguration in History

Permits secured for Jan. 20 Mass Protest at the Inauguration!

From the post:


The ANSWER Coalition has a permit for 14th street and Pennsylvania Ave NW and a portion of Freedom Plaza, beginning at 7:00am on Jan. 20 for a protest that will continue throughout the day.

We are also continuing our long legal battle for additional permitted space along Pennsylvania Ave. on Inauguration Day. The National Park Service has stonewalled the issuance of additional permits in an attempt to sanitize the most visible and primary locations along the route from dissent and free speech activity. Additionally, we are continuing to challenge the illegal, unconstitutional system whereby NPS reserves large portions of the route and other parts of D.C. on Inauguration Day, and in the days and weeks prior to and after, on behalf of Trump’s Presidential Inaugural Committee.

We think it’s critically important for the people to not be intimidated, to not be silent and to use all public spaces to express themselves.

(emphasis in original)

All of that is true, but attending the ANSWER protest means you will be counted as “attending the inauguration.”

What if you took ANSWER’s later advice:

use all public spaces to express themselves.

With tailgate parties on the Beltway (Tailgating @DisruptJ20), and blockades of the same (Low Risk Blockading of the DC Beltway).

Secure President-elect Trump’s place in the history books, with the least attended inauguration in history.

You can be part of that historical event, while sitting on the Beltway out of gas.

Which is it? Do you want to be “in attendance” or “truant” for Trump’s inauguration?

by Patrick Durusau at January 06, 2017 01:46 AM

January 05, 2017

Patrick Durusau

Beall’s List of Predatory Publishers 2017 [Avoiding “fake” scholarship, journalists take note]

Beall’s List of Predatory Publishers 2017 by Jeffrey Beall.

From the webpage:

Each year at this time I formally announce my updated list of predatory publishers. Because the publisher list is now very large, and because I now publish four, continuously-updated lists, the annual releases do not include the actual lists but instead include statistical and explanatory data about the lists and links to them.

Jeffrey maintains four lists of highly questionable publishers/publications:

Beall’s list should be your first stop when an article arrives from an unrecognized publication.

Not that being published in Nature and/or Science is a guarantee of quality scholarship, but publication by a publisher on Beall’s list should raise publication-stopping red flags.

Such a publication could be true, but bears the burden of proving itself to be so.

by Patrick Durusau at January 05, 2017 04:42 PM

January 04, 2017

Patrick Durusau

BaseX – Bad News, Good News

Good news and bad news from BaseX. Christian Grün posted to the BaseX discussion list today:

Dear all,

This year, there will be no BaseX Preconference Meeting in XML Prague. Sorry for that! The reason is simple: We were too busy in order to prepare ourselves for the event, look for speakers, etc.

At least we are happy to announce that BaseX 8.6 will finally be released this month. Stay tuned!

All the best,

Christian

That’s disappointing but understandable. Console yourself with watching presentations from 2013 – 2016 and reviewing issues for 8.6.

Just a guess on my part, ;-), but I suspect more testing of BaseX builds would not go unappreciated.

Something to keep yourself busy while waiting for BaseX 8.6 to drop.

by Patrick Durusau at January 04, 2017 10:39 PM

Eight Years of the Republican Weekly Address

We looked at eight years of the Republican Weekly Address by Jesse Rifkin.

From the post:

Every week since Ronald Reagan started the tradition in 1982, the president delivers a weekly address. And every week, the opposition party delivers an address as well.

What can the Weekly Republican Addresses during the Obama era reveal about how the GOP has attempted to portray themselves to the American public, by the public policy topics they discussed and the speakers they featured? To find out, GovTrack Insider analyzed all 407 Weekly Republican Addresses for which we could find data during the Obama era, the first such analysis of the weekly addresses as best we can tell. (See the full list of weekly addresses here.)

Sometimes they discuss the same topic as the president’s weekly address — particularly common if a noteworthy event occurs in the news that week — although other times it’s on an unrelated topic of the party’s choosing. It also features a rotating cast of Republicans delivering the speech, most of them congressional, unlike the White House which has almost always featured President Obama, with Vice President Joe Biden occasionally subbing in.

On the issues, we found that Republicans have almost entirely refrained from discussing such inflammatory social issues as abortion, guns, or same-sex marriage in their weekly addresses, despite how animating such issues are to their base. They also were remarkably silent on Donald Trump until the week before the election.

We also find that while Republicans often get slammed on women’s rights and minority issues, Republican congressional women and African Americans are at least proportionally represented in the weekly addresses, compared to their proportions in Congress, if not slightly over-represented — but Hispanics are notably under-represented.

You have seen credible claims, such as On Predicting Social Unrest Using Social Media by Rostyslav Korolov, et al., and less credible claims from others, such as CIA claims it can predict some social unrest up to 5 days ahead.

Rumor has it that the CIA has a Word template named, appropriately enough: theRussiansDidIt. I can neither confirm nor deny that rumor.

Taking credible actors at their word, are you aware of any parallel research on weekly addresses by Congress and following congressional action?

A very light skimming of the literature on predicting Supreme Court decisions comes up with: Competing Approaches to Predicting Supreme Court Decision Making by Andrew D. Martin, Kevin M. Quinn, Theodore W. Ruger, and Pauline T. Kim (2004), Algorithm predicts US Supreme Court decisions 70% of time by David Kravets (2014), Fantasy Scotus (a Supreme Court fantasy league with cash prizes).

Congressional voting has been studied as well, for instance, Predicting Congressional Voting – Social Identification Trumps Party. (Now there’s an unfortunate headline for searchers.)

Congressional votes are important but so is the progress of bills, the order in which issues are addressed, etc., and it is the reflection of those less formal aspects in weekly addresses from Congress that could be interesting.

The weekly speeches may be as divorced from any shared reality as comments inserted in the Congressional Record. On the other hand, a partially successful model, other than the timing of donations, may be possible.

by Patrick Durusau at January 04, 2017 10:23 PM

Q&A Cathy O’Neil…

Q&A Cathy O’Neil, author of ‘Weapons of Math Destruction,’ on the dark side of big data by Christine Zhang.

From the post:

Cathy O’Neil calls herself a data skeptic. A former hedge fund analyst with a PhD in mathematics from Harvard University, the Occupy Wall Street activist left finance after witnessing the damage wrought by faulty math in the wake of the housing crash.

In her latest book, “Weapons of Math Destruction,” O’Neil warns that the statistical models hailed by big data evangelists as the solution to today’s societal problems, like which teachers to fire or which criminals to give longer prison terms, can codify biases and exacerbate inequalities. “Models are opinions embedded in mathematics,” she writes.

Great interview that hits enough high points to leave you wanting to learn more about Cathy and her analysis.

On that score, try:

Read her mathbabe blog.

Follow @mathbabedotorg.

Read Weapons of math destruction : how big data increases inequality and threatens democracy.

Try her new business: ORCAA [O’Neil Risk Consulting and Algorithmic Auditing].

From the ORCAA homepage:


ORCAA’s mission is two-fold. First, it is to help companies and organizations that rely on time and cost-saving algorithms to get ahead of this wave, to understand and plan for their litigation and reputation risk, and most importantly to use algorithms fairly.

The second half of ORCAA’s mission is this: to develop rigorous methodology and tools, and to set rigorous standards for the new field of algorithmic auditing.

There are bright line cases, such as sentencing, housing and hiring discrimination, where “fair” has a binding legal meaning. And legal liability for not being “fair.”

Outside such areas, the search for “fairness” seems quixotic. Clients are entitled to their definitions of “fair” in those areas.

by Patrick Durusau at January 04, 2017 07:30 PM

FEMA – HOW-TO Demonize Your Opponents

Beryl Lipton writes in FEMA Field Force manual offers protesters insights into the future of crowd control:

Though construction on the Dakota Access Pipeline has halted for now, the lessons for law enforcement and protesters are still percolating. For the former, they’ll likely find themselves one day studying the event as they prepare for future mass gatherings, maybe in a guide just like the one distributed by the Federal Emergency Management Agency (FEMA) to North Dakota law enforcement in September.

… (graphic of DHS omitted)

Obtained by Unicorn Riot via a request to the North Dakota Department of Corrections, an agency with far fewer individuals in its custody than attended the protest at Standing Rock, the manual is a Field Force Operations training program for students, a crash course in eight parts on how to deal with a mixed crowd of lawful and unlawful dissenters.

I extracted the Field Force Operations PER-200 manual from the zip file posted at MuckRock, for your reading/access convenience.

As a government training document, allegedly “our” government, the manual fails in a number of respects.

Consider its efforts to demonize protesters:


b. Protesters. Not every protester is the same nor should be viewed the same by law enforcement. By better understanding protesters, law enforcement officers can make better choices on how to respond. A small group of unruly protesters can stand out from the peaceful majority—often comprised of others who just want to be there along with innocent bystanders accidentally caught in the melee.

(1) Everyday citizens. Most protests include everyday citizens gathering through their First Amendment right to peaceably make their voices heard (Driscoll, 2003).

(2) Professional protesters. These people train or are trained in protester tactics often by direct action organizations that promote two universal messages: First, intervention demands responsibility. Second, a smaller harm is acceptable if it prevents a greater harm. One interpretation of this second message is that it is acceptable for protesters to break laws they consider less important like vandalism to prevent a greater harm like environmental damage. Some activism organizations may produce booklets that demonstrate use or construction of devices, including the infamous Road Raging – Top Tips for Wrecking Roadbuilding (Road Alert!, 1997).

(3) Anarchists. These people aim to disrupt, often seeking to challenge authority and capitalism at any cost. They are frequently young college students who express themselves through the destruction of property. Anarchists may mix into peaceful protests despite the efforts of the nonviolent protesters to limit destructive activities—leading to fighting sometimes between protesters. One common anarchist technique is the black bloc (violent, destructive activity), demonstrated at the Occupy Seattle protests.
… (at page 106)

If you think that lacks a charitable attitude towards ordinary people outraged at some government misconduct, consider this listing of the “types” of individual protesters:


(1) Impulsive. These short-tempered people are the kind who are always spoiling for a fight and only need a fancied insult or a slight provocation to excite them to violence or incite others to violence.

(2) Suggestible. People who get into the action early and are easily influenced to follow the lead of the more violent.

(3) Cautious. Individuals who wait for the cloak of anonymity to give them courage by hiding their identity.

(4) Yielders. Those who do not join the action until a large number of participants give the impression of universality. In other words, “Everyone is taking part, so why shouldn’t I?”

(5) Supportive. People who do not actively join the mob but who enjoy the show and even shout encouragement.

(6) Resisters. Persons whose standards of judgment are not swayed by the emotional frenzy of the mob but who maintain level heads. They can disagree with the actions of the majority.

(7) Psychopathic. Individuals with a pathological personality structure are angry at the world and seek to use a riotous situation as a means of getting even with society (FBI, 1967, p. 21).
… (page 108)

How’s that for a rhetorical move?

In three pages the reader is drawn from “everyday citizens” to a list of personality disorders that runs up to the “psychopathic.”

Any reader instinctively feels a gathering of protesters is a boiling pot of crazy ready to explode.

A false worldview but one promoted by the FEMA manual.

Imagine you are a local law enforcement officer, with little or no personal experience with civil disobedience, being told by FEMA that protesters are the harbingers of chaos. What’s your reaction going to be?

It’s only one example but Julia Carrie Wong and Sam Levin report in: Standing Rock protesters hold out against extraordinary police violence:


Harkening back to an earlier era, when police in Birmingham, Alabama, attacked African American schoolchildren with dogs and high-pressure water hoses, North Dakota officers trained water cannons on hundreds of Dakota Access pipeline protesters.

On the night of 20 November, though, the temperature was below freezing and the protesters, who call themselves “water protectors”, were camping outdoors for the evening.

Water is just one of many “less-than-lethal” munitions that have been trained against the activists.

“They seem to have almost an infinite arsenal of different types of weapons,” said Rachel Lederman, attorney for the National Lawyers Guild (NLG). “I don’t think local law enforcement understands how dangerous they are.”

Police have acknowledged using sponge rounds, bean bag rounds, stinger rounds, teargas grenades, pepper spray, Mace, Tasers and a sound weapon. The explosive teargas grenades in use at Standing Rock have been banned by some US law enforcement agencies because they indiscriminately spray people, Lederman said.

More than two dozen people were hospitalized and 300 injured during the conflict, according to the medic and healer council. One woman’s arm was nearly blown off, according to her father, and the complaint alleges that another woman was shot in the eye, resulting in the detachment of her retina and possible permanent blindness.

Question: Should “everyday citizens” be sprayed with water cannon in sub-zero weather and assaulted with sponge rounds, bean bag rounds, stinger rounds, teargas grenades, pepper spray, Mace, Tasers and a sound weapon?

That’s not a hard question is it?

I suspect every non-psychotic law enforcement officer at Standing Rock would answer no, just like you.

But Morton County sheriff Kyle Kirchmeier confirms the FEMA schooled view of law enforcement:


On Thanksgiving, Morton County sheriff Kyle Kirchmeier released a statement condemning the actions of “paid agitators and protesters” without offering any evidence that people were being paid to fight the pipeline. The department has not responded to requests to substantiate the claim.

In another statement that week, the sheriff said activists were not engaged in “civil disobedience” but were acting like “evil agitators”. The Mandan, North Dakota, police chief, Jason Ziegler, has asserted that law enforcement agencies “can use whatever force necessary to maintain peace”.

To be fair, numerous law enforcement agencies have declined to subscribe to this FEMA inspired madness, Sheriffs Across US Refusing to Send Police and Equipment to DAPL as Outrage and Costs Grow by Claire Bernish.

At least in this instance. The real test of law enforcement avoiding the FEMA “…evil agitators….” psychosis comes when protests are closer to home.

Government training manuals that humanize protesters are less likely to result in protests being used as proving grounds for “less lethal” weapons.

Teaching police officers to see protesters as their kith and kin will make major strides in the humane treatment of protesters.

Police officers may realize they have more in common with protesters than with players far removed from consequences on the streets. (Is that what FEMA is trying to avoid?)

by Patrick Durusau at January 04, 2017 04:55 PM

Sharpening Your Hacking Skills!

40+ Intentionally Vulnerable Websites To Practice Your Hacking Skills.

From the post:

Attack is definitely the best form of defense and this also applies to Cyber Security.

Companies are now hacking their own websites and even hiring ethical hackers in an attempt to find vulnerabilities before the bad guys do. As such ethical hacking is now a much sought after skill but hacking websites without permission can get you on the wrong side of the law, even if you’re just practising.

So how do practice your hacking skills whilst staying on the right side of the law? Well there are a number of deliberately vulnerable websites out there designed to allow you to practise and hone your hacking skills, without fear of prosecution. So we’ve decided to compile a list of over forty of them, each with short description.

Once you feel comfortable finding vulnerabilities, the next step could be a job as a penetration tester or participation in one of the bug bounty programmes where companies reward you based on the severity of the bugs that you find, which could be very lucrative. Facebook is one such company offering a bug bounty programme and has paid out more than a million dollars to date.

So without further ado, here’s the list. If you know of a good hacking website that’s not on this list, let me know and I’ll add it. Oh, and don’t forget to bookmark this page! :)

Yes! Not only bookmark this page but visit the sites it lists!

My only disappointment was that the Office of Personnel Management wasn’t listed. I guess the OPM site is requiring permission for hacking now. ;-)

by Patrick Durusau at January 04, 2017 01:31 AM

The GRU-Ukraine Artillery Hack That May Never Have Happened

The GRU-Ukraine Artillery Hack That May Never Have Happened by Jeffrey Carr.

From the post:

Crowdstrike’s latest report regarding Fancy Bear contains its most dramatic and controversial claim to date; that GRU-written mobile malware used by Ukrainian artillery soldiers contributed to massive artillery losses by the Ukrainian military. “It’s pretty high confidence that Fancy Bear had to be in touch with the Russian military,” Dmitri Alperovich told Forbes. “This is exactly what the mission is of the GRU.”

Crowdstrike’s core argument has three premises:

  1. Fancy Bear (APT28) is the exclusive developer and user of X-Agent [1]
  2. Fancy Bear developed an X-Agent Android variant specifically to compromise an Android ballistic computing application called Попр-Д30.apk for the purpose of geolocating Ukrainian D-30 Howitzer artillery sites[2]
  3. The D-30 Howitzers suffered 80% losses since the start of the war.[3]

If all of these premises were true, then Crowdstrike’s prior claim that Fancy Bear must be affiliated with the GRU [4] would be substantially supported by this new finding. Dmitri referred to it in the PBS interview as “DNA evidence”.

In fact, none of those premises are supported by the facts. This article is a summary of the evidence that I’ve gathered during hours of interviews and background research with Ukrainian hackers, soldiers, and an independent analysis of the malware by CrySys Lab. My complete findings will be presented in Washington D.C. next week on January 12th at Suits and Spooks.

Sadly I won’t be in attendance but am looking forward to reports of Carr’s details on the alleged GRU-Ukraine hack.

Not that I am expecting the New York Times to admit the Russian hacking of the 2016 election is a tissue of self-serving lies.

Disappointing but not unexpected.

by Patrick Durusau at January 04, 2017 01:16 AM

Achieving a 300% speedup in ETL with Apache Spark

Achieving a 300% speedup in ETL with Apache Spark by Eric Maynard.

From the post:

A common design pattern often emerges when teams begin to stitch together existing systems and an EDH cluster: file dumps, typically in a format like CSV, are regularly uploaded to EDH, where they are then unpacked, transformed into optimal query format, and tucked away in HDFS where various EDH components can use them. When these file dumps are large or happen very often, these simple steps can significantly slow down an ingest pipeline. Part of this delay is inevitable; moving large files across the network is time-consuming because of physical limitations and can’t be readily sped up. However, the rest of the basic ingest workflow described above can often be improved.

Campaign finance data suffers more from complexity and obscurity than volume.

However, there are data problems where volume and not deceit is the issue. In those cases, you may find Eric’s advice quite helpful.

by Patrick Durusau at January 04, 2017 12:57 AM

Expiring Patents

Expatents returns a list of patents expiring that day and you can sign up for a weekly digest of expiring patents.

The site claims that over 80% of patents are never commercially exploited.

Are expired patents, that is without commercial exploitation, like articles that are never cited by anyone?

Potential shareholder litigation over the not-so-trivial cost of patents that never resulted in commercial exploitation?

Was it inside or outside counsel that handled the patent filings?

There’s an interesting area for tracing relationships (associations) and expenses.

by Patrick Durusau at January 04, 2017 12:37 AM

A 3-Second Blockading Proposal

Nearly everyone I know has read Steal This Book by Abbie Hoffman at some point but despite its being on the internet, younger readers may have missed it.

To set the background for the 3-second blockading proposal, consider what Abbie has to say about anti-tire weapons:

Don’t believe all those bullshit tire ads that make tires seem like the Superman of the streets. Roofing nails spread out on the street are effective in stopping a patrol car. A nail sticking out from a strong piece of wood wedged under a rear tire will work as effectively as a bazooka. An ice pick will do the trick repeatedly but you’ve got to have a strong arm to strike home…. (page 122 of the pdf Steal This Book I can’t say how that corresponds to other copies.)

Everything Abbie says is true, but I see problems with each of his suggestions:

  1. Roofing nails: Roofing nails work, are easy to purchase and not expensive. At the same time, they are an indiscriminate weapon, not unlike carpet bombing when the objective is interdicting a single road.
  2. Nail in wood: The comparison of a “strong piece of wood” and “bazooka” makes me think of nails in the end of a 2 x 4 board. Works but even TSA agents trained to spot bottled water can spot someone sporting a 2 x 4 on one shoulder. (Not what Abbie meant but a humorous image.)
  3. Ice Pick: Like the man says, requires “a strong arm to strike home.” If reduced to using an ice pick, you do know to go for the thinner sideways. Yes?

Other tire weapon methods include: flattening tires with bayonets, shooting out tires, and the current fad with spike strips:

spike-strip-460

Pictured is the Stinger Spike System, which is advertised online for $889.20 (not including shipping and tax).

Blockading with tire weapons sounds indiscriminate (roofing nails), obvious (2 x 4 with nails), difficult (ice pick), unlikely (bayonets/guns), and/or expensive (police spikes).

But that’s not necessarily so.

What if you had the opportunity to use this truck as part of a highway blockade:

semi-train-460

Impressive. Yes?

Look at all those tires! That just seems way too difficult. But, perhaps not.

How many of those tires would have to be disabled to make that semi-train part of a road blockade?

Here’s an image to help with that question:

tractor-cab-460

Out of all those tires, only one of the front two steering tires must be disabled. Disable either one and the truck becomes a fixture unless and until someone can clear enough traffic from around it and repair the tire.

BTW, the same lesson applies to school buses, tour buses, garbage trucks, dump trucks, Metro buses (includes links to schedules in case you want to wait for one), in short, anything that is big and difficult to move until repaired.

A 3-Second Blockading Proposal

Large, difficult to repair vehicles make great elements of a roadway blockade. If they lose either one of their front tires, there they sit until repaired.

So how does either one of the front tires get flattened?

What fact about tires did Abbie Hoffman overlook in Steal This Book?

You’re ahead of me. Yes, valve stems.

Valve stems are nearly obscured on tractor trailer rigs by the wheel housing:

tire_wheel-stem-460-red

Valve stems vary depending on the type of vehicle and by design are not easy to cut.

The ideal (and unproven) scenario would be:

  1. Spot blockade target’s valve stem
  2. Cut valve stem
  3. Be on your way

all in 3 seconds or less.

But see the next section:

Lack of Practical Experience – Variety Intervenes

My first impulse was to recommend using robust cutters:

cutters-460

for severing valve stems but your success with those will depend upon your arm strength and the tires you encounter.

Quite frankly, the variety of wheels and tires is too large to make a judgment about tools until reconnaissance on the tires you are likely to encounter.

Add to that my lack of tractor trailer tires immediately available for trials, and further research is indicated.

Any research/experience you can point to and/or contribute concerning cutting valve stems, specifying tire model(s) and the tool(s) used, would be greatly appreciated.


Steal This Book is still a great read but is sorely in need of an update. It does have my favorite paragraph from all counter-culture literature:


If you are around a military base, you will find it relatively easy to get your hands on an M-79 grenade launcher, which is like a giant shotgun and is probably the best self-defense weapon of all time.

It’s not clear what experience Abbie had with the M-79, but you have to admit it is one hell of an image:

blooper-guns21-460

I understand that ammunition for the M-79 is hard to find. You?

by Patrick Durusau at January 04, 2017 12:20 AM

January 03, 2017

Patrick Durusau

News Bubble Bursting – World Newspapers and Magazines Online

World Newspapers and Magazines Online

Newspaper and magazine listings for one hundred and ninety-nine (199) countries.

At the rate of one country per week, it would take 3.8 years to work your way through this listing.
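For anyone who wants to check that figure or try a faster pace, a quick sketch in Python. The 199 comes from the listing; the plain 52-week year and the function name are my assumptions.

```python
# Rough check of the reading schedule: N countries at a steady weekly pace.
COUNTRIES = 199        # count given on the listing page
WEEKS_PER_YEAR = 52    # assumption: a plain 52-week year


def years_to_cover(countries: int, per_week: float = 1.0) -> float:
    """Years needed to sample every country at the given weekly rate."""
    weeks = countries / per_week
    return weeks / WEEKS_PER_YEAR


print(round(years_to_cover(COUNTRIES), 1))                # one country per week
print(round(years_to_cover(COUNTRIES, per_week=2), 1))    # two countries per week
```

At one country per week that comes to about 3.8 years; doubling the pace cuts it to a little under two.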

Considering the depth of government and corporate deception, don’t you owe it to yourself, if not your readers, to sample that deception widely?

In an age of automatic, if not always smooth and correct, translation, do you have a good excuse for doing any less?

by Patrick Durusau at January 03, 2017 02:27 AM

Russian Hackers – Repeating History?

Maybe there is something to reading accounts of recent history. (A fascination with markup/computer and ANE languages doesn’t lead to much recent reading in “recent” history.)

But I was reading Manufacturing Consent by Edward S. Herman and Noam Chomsky (2002), when I encountered a repetition of the currently popular meme, “Russian hackers hacked the DNC.” (Despite the Podesta emails being obtained due to user carelessness that is hard to characterize as a “hack.”)

History Repeating (Not for the first time)

Set your wayback machine for 1981, another time when Russia (then the USSR) was an “evil empire.” (Or so claimed by people with particular agendas.)

A Turkish fascist and member of a violent anti-left party in Turkey, one Mehmet Ali Agca, attempted to assassinate Pope John Paul II in May 1981. After being interrogated for 17 months, Agca “confessed” that he was an agent of the KGB and Bulgarians.

Herman and Chomsky walk through the unraveling of this fantasy of the Reagan era political elites (pages xxvii-xxix), only to conclude:


The New York Times, which had been consistently supportive of the connection in both news and editorials, not only failed to report Weinstein’s negative findings from the search of the Bulgarian files, it also excluded Goodman’s statements on the CIA penetration of the Bulgarian secret services from their excerpts of his testimony. The Times had long maintained that the CIA and the Reagan administration “recoiled from the devastating implication that Bulgarian agents were bound to have acted only on a signal from Moscow.” 58 But Goodman’s and Ford’s testimony show that this was the reverse of the truth, and that CIA heads William Casey and Robert Gates overrode the views of CIA professionals and falsified evidence to support a Soviet linkage. The Times was not alone in following a misleading party line, but it is notable that this paper of record has yet to acknowledge its exceptional gullibility and propaganda service.

Hmmm,

recoiled from the devastating implication that Bulgarian agents were bound to have acted only on a signal from Moscow

Does that sound similar to anything you have read recently or have heard repeated by the out-going US president?

December 11, 2016

Jump forward now to December 11, 2016 and you can read the New York Times reporting:


“This is why I hate the term ‘we speak truth to power,’” said Mark M. Lowenthal, a former senior C.I.A. analyst. “We don’t have truth. We have really good ideas.”

Mr. Lowenthal said that determining the motives of foreign leaders — in this case, what drove President Vladimir V. Putin of Russia to order the hacking — was one of the most important missions for C.I.A. analysts. In 2002, one of the critical failures of American spy agencies was their inability to understand Saddam Hussein’s goals and motives.

A simple search reveals the internet is replete with such trash talking by the CIA, DHS, FBI and an assortment of agencies that rearrange conclusions but offer no facts in support of those conclusions.

A Final Blow as 2016 Closes

With the same credibility I would accord the now discredited NYTimes fable about Russian backing for the attempt on the life of Pope John Paul II, and the hacking of the Democratic National Committee at the direction of Vladimir Putin, comes this final shot from Russian hackers:

carey-460

Since the Islamic State hasn’t claimed credit, it must be those damned Russian hackers! (Caution: That is “fake news.” Carey may have been sabotaged by someone but it wasn’t Russian hackers.)

A Case For Topic Maps & Subject Identity Anyone?

I haven’t worked out the details but these repeated charades by the US government, among others, offer an opportunity to put subject identity as defined by topic maps to work for true journalists.

The details of any particular subject vary but they all have:

  1. Accusations sans evidence by one or more agencies of the US government
  2. Chest-thumping by the New York Times (and others) in both reporting (sic) and editorial columns
  3. Articles/editorials rely on unnamed government sources or financially interested contractors
  4. Months without any evidence but more chest-thumping by US government agencies and their familiars

When all four of those properties are found, you are at least part way to identifying yet another repetition of the attempted assassination of Pope John Paul II fable.

Although, quite honestly, it needs a catchier moniker than that one.

Suggestions?
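Short of a worked-out topic map, the four-property test above can be roughed out in a few lines of Python. This is only a sketch, assuming each story has already been judged by hand against the four properties. The class and field names are mine, for illustration, and are not part of any topic map standard or tool.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class StoryProperties:
    """Hand-assigned judgments for one story (hypothetical field names)."""
    accusations_without_evidence: bool    # property 1
    press_chest_thumping: bool            # property 2
    unnamed_or_interested_sources: bool   # property 3
    months_without_evidence: bool         # property 4


def matched(story: StoryProperties) -> List[str]:
    """Names of the properties this story exhibits."""
    return [f.name for f in fields(story) if getattr(story, f.name)]


def same_subject(story: StoryProperties) -> bool:
    """True only when all four identifying properties are present."""
    return len(matched(story)) == len(fields(story))


# Two made-up stories: one shows all four properties, one only the first two.
full_match = StoryProperties(True, True, True, True)
partial = StoryProperties(True, True, False, False)

print(same_subject(full_match))
print(same_subject(partial), matched(partial))
```

Requiring all four properties is the strict reading of the list. A looser test could report how many matched and leave the call to the journalist.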

by Patrick Durusau at January 03, 2017 12:35 AM

January 02, 2017

Patrick Durusau

Historic American Newspapers (Bulk OCR Data Find!)

Historic American Newspapers

From the webpage:

Search America’s historic newspaper pages from 1789-1924 or use the U.S. Newspaper Directory to find information about American newspapers published between 1690-present.

A total of 2,134 newspapers, digitized (images) and searchable. Some 11,520,159 pages for searching and review.

Quite a treasure trove for genealogy types, primary/secondary research papers, people trying to escape the smoothing influence of history books over historical events, and others.

Did I mention the site has an API?

Or that it offers access to all of its OCR data in bulk?

It’s not “big data” in the sense of the astronomy community but creating sub-sets for local communities of “their papers” would have a certain cachet.
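A sketch of how a local sub-set might be started through the API, in Python. Treat the details as assumptions: the endpoint, the parameter names (andtext, state, dateFilterType, date1, date2, rows, format) and the response fields (items, title, date, id) are as I understand the site's search interface, so check them against the site's own API page first. The function names are mine.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the page-search endpoint, which returns JSON when format=json.
BASE = "https://chroniclingamerica.loc.gov/search/pages/results/"


def search_url(text, state=None, year_from=None, year_to=None, rows=20):
    """Build a page-search URL for OCR text, optionally limited by state and years."""
    params = {"andtext": text, "format": "json", "rows": rows}
    if state:
        params["state"] = state
    if year_from and year_to:
        params.update(
            {"dateFilterType": "yearRange", "date1": year_from, "date2": year_to}
        )
    return BASE + "?" + urlencode(params)


def summarize(payload):
    """Reduce a decoded JSON response to (title, date, page id) tuples."""
    return [
        (item.get("title"), item.get("date"), item.get("id"))
        for item in payload.get("items", [])
    ]


def fetch(url):
    """Fetch and decode one page of results (makes a network request)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


# Example (not run here, since it needs network access):
# pages = summarize(fetch(search_url("railroad", state="Georgia",
#                                    year_from=1900, year_to=1910)))
print(search_url("railroad", state="Georgia", year_from=1900, year_to=1910, rows=5))
```

For anything beyond sampling, the bulk OCR downloads are the better route. Paging through search results is only sensible for small, local slices.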

Enjoy!

by Patrick Durusau at January 02, 2017 02:41 AM

The Best And Worst Data Stories Of 2016

The Best And Worst Data Stories Of 2016 by Walt Hickey.

From the post:

It’s time once again to dole out FiveThirtyEight’s Data Awards, our annual (OK, we’ve done it once before) chance to honor those who did remarkably good stuff with data, to shame those who did remarkably bad stuff with data, and to acknowledge the key numbers that help describe what went down over the past year. As always, these are based on the considered analysis of an esteemed panel of judges, by which I mean that I pestered people around the FiveThirtyEight offices until they gave me some suggestions.

I had to list this under both data science and humor. ;-)

What “…bad stuff with data…” stories do you know and how will you avoid being listed in 2017? (Assuming there is another listing.)

I suspect we learn more from data fail stories than ones that report success.

You?

Enjoy!

by Patrick Durusau at January 02, 2017 02:13 AM

OpenTOC (ACM SIG Proceedings – Free)

OpenTOC

From the webpage:

ACM OpenTOC is a unique service that enables Special Interest Groups to generate and post Tables of Contents for proceedings of their conferences enabling visitors to download the definitive version of the contents from the ACM Digital Library at no charge.

Downloads of these articles are captured in official ACM statistics, improving the accuracy of usage and impact measurements. Consistently linking to definitive versions of ACM articles should reduce user confusion over article versioning.

Conferences are listed by year, 2014 – 2016, and by event.

A step in the right direction.

Do you know if the digital library allows bulk downloading of search result metadata?

It didn’t the last time I had a digital library subscription. Contacting the secret ACM committee that decides on web features was verboten.

Enjoy this improvement in access while waiting for ACM access bottlenecks to wither and die.

by Patrick Durusau at January 02, 2017 01:59 AM