Building trust in media

CiteIt is developing new digital tools that help combat misinformation and selective quotations. These tools show the context surrounding the quoted media in order to build trust and understanding.



Often when I’m reading a story in print, I come across a quote that makes me wonder: What was said in the sentences preceding or following the quote? In other words: What is the Context? Is this quote cherry-picked, and can I really get a full sense of what happened from just the quoted selection?

What is CiteIt?

CiteIt is the name I chose for a web service that allows writers to augment their writing with greater context about the quotations they make. The concept behind CiteIt is part of the tradition that started with footnotes, became hypercharged with hypertext links, and has now evolved to let the words of the original sources flow into the citing document, so that readers do not have to interrupt their reading by leaving the document they are in.

Inspiration

CiteIt was inspired by the work of Ted Nelson, who coined the term “hypertext” in 1962 and had the vision for a universal hypertext network long before Bill Gates and Steve Jobs had even assembled their first personal computers.

Philosophy of Hypertext

While writing a review of Ted Nelson’s 2002 Ph.D. thesis, I was inspired to develop a way of writing that is closer to what Nelson advocates. Ted’s design, Xanadu, features parallel texts that provide the full context of their sources. My design is a crude semblance of Xanadu, displaying only one pane of text into which the text surrounding a citation is “injected”.

Because I do not have the technical ability to implement Ted’s vision, and because so much existing writing is available on the web as HTML, I thought I could best pursue Ted’s vision by creating a quick-and-dirty extension to HTML that would allow writers to start creating a collection of citation data.

There is a saying from the 1989 movie “Field of Dreams”: “If you build it, he will come.” My approach is a data-centric one that prioritizes the collection of citation data as a foundation that attracts others to the task of building a better version of hypertext. As the system develops, and as authors get used to the concept of pulling their quote context from the original source, it is my hope that the compilation of a public data set of citation data will attract great programmers to gradually rebuild hypertext, replacing my current HTML kludge with advanced front-end and back-end features.

photo: Tim Langeman
Tim Langeman    (home page)
Akron, PA (USA)

FAQ: Frequently Asked Questions

Can you locate the source of any quote?

  • No, CiteIt only locates the context of quotes whose source has been identified by the author with a URL.
  • If an author chooses not to identify a source, CiteIt will not attempt to locate it.

When did you begin CiteIt?

Why is the context cut off mid-word?

  • One of the ideas behind CiteIt is that authors should not get to cherry-pick their quote or their quote's context. As a result, for every quote ..
  • CiteIt gives authors no discretion over their context and instead pulls a fixed 500 characters of context, cutting off the selection mid-word if necessary.
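
The fixed-width cut described above can be sketched in a few lines of Python. This is only an illustration, not CiteIt's actual code: the function name `quote_context` is hypothetical, and applying the 500-character window to each side of the quote is an assumption.

```python
def quote_context(source_text, quote, width=500):
    """Return (before, after): up to `width` characters on each side of `quote`.

    Hypothetical helper. The slices are taken by character count alone,
    so they may begin or end in the middle of a word.
    """
    start = source_text.find(quote)
    if start == -1:
        return None  # the quote does not appear in the source text
    end = start + len(quote)
    before = source_text[max(0, start - width):start]
    after = source_text[end:end + width]
    return before, after


# A tiny width makes the mid-word cut easy to see.
source = "The quick brown fox jumps over the lazy dog near the river bank."
print(quote_context(source, "jumps over", width=8))
# prints ('own fox ', ' the laz')
```

Because the author never chooses where the slice starts or stops, the same source and quote always yield the same context.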

Why did you release this as open-source software under a free license like the MIT license?

  • I want the type of citation I’m doing with CiteIt to spread as widely as possible.
  • I want to promote the norm that serious discussions are substantiated with citations.  The question I want people to ask is: “Can you CiteIt?”
  • By providing as few restrictions on implementations as possible, I hope others will extend my vision, taking it places I could not go on my own.

Is there a simple code demonstration of CiteIt I can inspect that is stripped of all the WordPress boilerplate?

What was Neotext?

  • Neotext was the first name I gave to the project that became CiteIt.
  • Lynn Schmidt Miller suggested the name of “CiteIt” as a way to relate the project to the well-understood concept of citation.
  • You may see some stray references in the code to Neotext.

Want to Get Involved?

If you have ideas or think you can help, send me an email. (email address found on homepage)


Special thanks to the following contributors:

  • Alex Mayer: Helped with Docker Build Script
  • Daniel Miller: Programming Advice
  • Lynn Schmidt Miller: Suggested the name “CiteIt” instead of “Neotext”
  • Matt Langeman: Helped with the previous AWS SAM setup, and redesigned and rebranded the CiteIt website using 11ty
  • Rajat Sharma: Helped with UI (JavaScript: reversed the direction of the expanding arrows)
  • Phil Zook: Mocked up demo Wikipedia pages
  • Will Nissley: First user of WordPress plugin

Open Source Ecosystem:

CiteIt has benefited from being able to stand on the shoulders of the open source community. Here are a few of the people and projects that have had a significant influence on me or that CiteIt uses.

Python ecosystem:
Other Tools

Future Plans:

My long-term goal is to work on CiteIt full-time, ideally as part of an existing non-profit such as the Internet Archive. The Internet Archive gets funding from public donations as well as from foundations. I've done a little research into foundation funding, and the Knight Foundation and the Sloan Foundation look like they might be a good fit: the Knight Foundation has partnerships with journalists with whom I would like to partner, and the Sloan Foundation has funded work on Universal Access to Knowledge and on improving Wikipedia's accuracy and credibility. If you have ideas about organizations and funders that would be a good fit, send me an email.

Project Plan

I've started to outline technical plans for the project (Google Sheets).

View Extended Plan Details    (Google Sheets)

(The stages of these plans do not line up exactly with the "phases" described below.)

Here are a few organizations/categories I'm interested in partnering with:

Phase 1: Building Trust through "Context"

The first phase of the project involves providing context for authors, through expanding blockquotes or contextual popups.

I think that Judaism has the same problem that any thick civilization has in a world in which, as you say, context is stripped away. And not only is context stripped away, but attention to any one thing is scanter and less than it used to be. So, for example, a lot of Jewish commentary is based on your recognizing the reference that I make. Who recognizes references anymore? Because people don’t spend years studying books.

The project's path depends upon which groups choose to adopt CiteIt first.

Media Types:
PDF Support: Webservice

The Python version of the webservice currently supports digital PDFs as sources, but the Docker-based public webservice does not yet support PDFs. Adding PDF support, both “digital” and scanned, to the Docker image is important to academic disciplines like history, as well as to discussions of public affairs that cite many government documents.

Video & Audio Transcripts

There is a draft version of CiteIt which allows citing YouTube transcripts. More could be done to enable audio and video contextual citations.


A lot of work in the initial phase will need to be done to develop a full feature set and improve robustness. It is one thing to develop a proof-of-concept and quite another to develop a mature, production-tested service.

Stopping at Phase 1?

If CiteIt remains solely a citation app, I will consider it a success. The vision for further development is more speculative.

Get in touch with me if you want to start a conversation about the possibilities.

Phase 2: Create a Citation Database

Authors’ citations could be added to a database when the authors index their pages. This database could be made available to aggregation services.

The second phase of the project involves taking advantage of the data collected from authors who use CiteIt to pull context into their webpages.
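
As a rough illustration of what such a citation database could look like, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout and the function names (`open_citation_db`, `index_page`, `citing_pages`) are assumptions made for this example and are not taken from CiteIt's code.

```python
import sqlite3


def open_citation_db(path=":memory:"):
    """Open (or create) a hypothetical citation database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS citations (
               citing_url TEXT NOT NULL,  -- page that contains the quote
               source_url TEXT NOT NULL,  -- page the quote was taken from
               quote      TEXT NOT NULL,
               UNIQUE (citing_url, source_url, quote)
           )"""
    )
    return conn


def index_page(conn, citing_url, citations):
    """Store the (source_url, quote) pairs found on one page.

    INSERT OR IGNORE plus the UNIQUE constraint means that indexing
    the same page twice does not create duplicate rows.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO citations VALUES (?, ?, ?)",
            [(citing_url, source_url, quote) for source_url, quote in citations],
        )


def citing_pages(conn, source_url):
    """Return every page that quotes from `source_url`, sorted by URL."""
    rows = conn.execute(
        "SELECT DISTINCT citing_url FROM citations "
        "WHERE source_url = ? ORDER BY citing_url",
        (source_url,),
    )
    return [row[0] for row in rows]


conn = open_citation_db()
index_page(conn, "https://example.org/post", [("https://example.com/speech", "a quoted sentence")])
index_page(conn, "https://example.net/essay", [("https://example.com/speech", "another quoted sentence")])
print(citing_pages(conn, "https://example.com/speech"))
```

An aggregation service could be built on queries like `citing_pages`, which answers the question "who has quoted this source?"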

"Targeted" Page Rank
The Value of Citation Data & Aggregation Services

Citation data is of enormous value, as Google’s PageRank has shown. Adding source URLs to specific quotations allows citations to become more granular, and adding tagging and metadata allows citations to become more expressive and informative.

This database could be published using an API and programmers could develop competitive services to offer annotation, moderation, and search aggregation services.

Many people now recognize the role that links play in Google's PageRank algorithm, which ranks search engine results.

CiteIt extends linking at a more granular level, enabling blockquotes and q-tags to link to a specific portion of a page.
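
To make the idea of quote-level links concrete, here is a minimal sketch of how an indexer might collect them from a page. It assumes the source URL is carried in the standard HTML `cite` attribute of `<blockquote>` and `<q>` elements. The class name `CitationParser` is hypothetical, the sketch does not handle nested quotes, and it is not CiteIt's own code.

```python
from html.parser import HTMLParser


class CitationParser(HTMLParser):
    """Collect (tag, source_url, quote_text) for each <blockquote>/<q> with a cite attribute."""

    QUOTE_TAGS = {"blockquote", "q"}

    def __init__(self):
        super().__init__()
        self.citations = []
        self._open = None  # [tag, url, text_parts] for the quote currently being read

    def handle_starttag(self, tag, attrs):
        cite = dict(attrs).get("cite")
        if tag in self.QUOTE_TAGS and cite and self._open is None:
            self._open = [tag, cite, []]

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            tag_name, url, parts = self._open
            text = " ".join("".join(parts).split())  # collapse whitespace and line breaks
            self.citations.append((tag_name, url, text))
            self._open = None


page = """
<p>The author wrote that <q cite="https://example.com/essay">context
matters a great deal</q>.</p>
<blockquote cite="https://example.com/speech">If you build it, he will come.</blockquote>
<blockquote>No source given, so this one is skipped.</blockquote>
"""
parser = CitationParser()
parser.feed(page)
for citation in parser.citations:
    print(citation)
```

Each tuple pairs a specific quotation with its source URL, which is the granular unit that a page-level link cannot express.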

"Second-Order Effects" of Data
“Second-order effect” refers to the idea that every action has a consequence, and each consequence has a subsequent consequence.
Traffic Reports as a Second-Order Effect of Cell-Phone Location

Many people now recognize that a second-order effect of collecting cell phone location data is that it allows Google to report traffic speed.

Google Traffic Map: New York City

Google Maps Traffic: measures drivers' speed from cell phone location.

The second order effect of collecting quote citation data is that it opens up the possibility of displaying the intersection of two documents at a granular level.

Google News
Google News meets Inline Comments
"Inline Comments" as a Second-Order Effect of External Citations

Just as cell phone location data enables Google traffic reports, linked quotations could enable web developers to design creative interfaces that display the quotes intersecting with specific portions of a webpage.

Although a few sites have experimented with inline comments, we are used to most comments coming at the end of an article (if a publication has enabled comments).

CiteIt would enable programmers to develop systems that are a cross between Google News and inline comments.
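
One way to picture this intersection is to attach each external quotation to the paragraph of the source article that it quotes. The sketch below is illustrative only: the function name `inline_comments` and the `(quote, citing_url)` record shape are assumptions, and matching is done by exact substring.

```python
def inline_comments(paragraphs, citations):
    """Map each paragraph index to the citing pages whose quote appears in it.

    paragraphs: list of paragraph strings from the source article
    citations:  list of (quote, citing_url) pairs (hypothetical record shape)
    """
    by_paragraph = {i: [] for i in range(len(paragraphs))}
    for quote, citing_url in citations:
        for i, paragraph in enumerate(paragraphs):
            if quote in paragraph:
                by_paragraph[i].append(citing_url)
                break  # attach each citation to the first paragraph that contains it
    return by_paragraph


article = [
    "The council voted on Tuesday to close the bridge.",
    "Repairs are expected to take two years.",
]
citations = [
    ("voted on Tuesday", "https://example.org/blog-a"),
    ("take two years", "https://example.net/blog-b"),
    ("close the bridge", "https://example.com/blog-c"),
]
for index, urls in inline_comments(article, citations).items():
    print(index, urls)
```

An interface could then show, next to each paragraph, links to the outside pages that discuss that exact passage.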

CiteIt Distributed Comments

Phase 3: Community Building

A) Moderation with Tags

Tagging could be added to quote tags and these tags could be used to moderate the linked content.

2 Types of Tags:
  1. Descriptive Meta-data
  2. Opinions: Would more expressive tags allow for citations to be aggregated by a more sophisticated moderation algorithm? Just like on Twitter, anything could be a tag, such as:    brilliant, misleading, consensus-building, bad-faith, funny, disagree.
B) Breaking Media Silos with Consensus Building
Silo Canister in Indiana.    Photo Credit: Alan Berning

I would like to make the CiteIt database freely available so that a broad array of people can analyze it and develop aggregators.

Open Data > A New Generation of Aggregators

I would hope that a new generation of search engines (or aggregators) could locate the top-ranked citation comments associated with citation fragments and develop innovative user interfaces.

Viewing the Top Results from Many Silos

One desirable feature would be to classify and group quotes of a particular type together and display the top quotes for each group.
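
A minimal sketch of this grouping, assuming each citation record carries a tag and that "top" simply means "cited most often". Both assumptions are made for illustration, as is the function name `top_quotes_by_tag`. A real aggregator would likely use a more sophisticated ranking.

```python
from collections import Counter, defaultdict


def top_quotes_by_tag(tagged_citations, n=2):
    """Group (tag, quote) records by tag and return the n most-cited quotes per tag."""
    counts = defaultdict(Counter)
    for tag, quote in tagged_citations:
        counts[tag][quote] += 1
    return {
        tag: [quote for quote, _ in counter.most_common(n)]
        for tag, counter in counts.items()
    }


records = [
    ("brilliant", "quote A"),
    ("brilliant", "quote B"),
    ("brilliant", "quote A"),
    ("misleading", "quote C"),
    ("brilliant", "quote B"),
    ("brilliant", "quote A"),
    ("brilliant", "quote D"),
]
print(top_quotes_by_tag(records))
# prints {'brilliant': ['quote A', 'quote B'], 'misleading': ['quote C']}
```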

Surfacing citations from across the internet has the potential to reduce the polarization that results when individuals cluster into their own homogeneous groups (silos).

C) Civil Conversations

If CiteIt develops as a community, work should be done to intentionally build Civil Conversations, as healthy communities take time and intentionality to develop.


Phase 4: Decentralization

Ultimately, if CiteIt were to succeed as a concept, it seems like the service might benefit from being built as a decentralized system that archives and verifies the original context of quotes. This would provide readers with greater trust in the data and defend against attempts at censorship.