CiteIt is developing new digital tools that help combat misinformation and selective quotations. These tools show the context surrounding the quoted media in order to build trust and understanding.
CiteIt.net is an open-source project (MIT license) whose mission to create a higher standard of citation is accelerated through the help of volunteers.
If you know how to code (especially in Python or JavaScript), contact Tim to help out.
Verify Quote before allowing Citation: Enforce Rules or Flag Suspicious Quotes that do not exactly match the original version, allowing for structured variation:
Modify Parsing to handle [C]apitalization:
“[T]his is a quote that was pulled from the middle of a sentence, but capitalized to fit with the new context”
Allow new [additional] words to be added if the word is surrounded by brackets.
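One way to check the two bracket rules above is to turn the edited quote into a regular expression and search the source with it. This is only a minimal sketch: the names `quote_to_regex` and `verify_quote` are hypothetical and not part of the CiteIt codebase, and it assumes a single bracketed letter followed by another letter is a capitalization change, while any other bracketed text is an inserted word.

```python
import re

# Hypothetical helpers for illustration; not taken from the CiteIt code.
CASE_CHANGE = re.compile(r"\[([A-Za-z])\](?=[A-Za-z])")  # "[T]his"
INSERTION = re.compile(r"\s*\[[^\[\]]+\]")               # " [additional]"
MARK = "\x00"  # placeholder flagging a letter whose case may differ


def quote_to_regex(quote: str) -> re.Pattern:
    """Build a pattern that matches the quote as it appears in the source."""
    # Flag letters whose case was changed, then drop inserted words.
    text = CASE_CHANGE.sub(lambda m: MARK + m.group(1), quote)
    text = INSERTION.sub("", text)
    text = " ".join(text.split())  # collapse runs of whitespace

    pieces = []
    flexible = False
    for ch in text:
        if ch == MARK:
            flexible = True
        elif flexible:
            # Accept the flagged letter in either case.
            pieces.append("[" + ch.lower() + ch.upper() + "]")
            flexible = False
        elif ch == " ":
            pieces.append(r"\s+")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


def verify_quote(quote: str, source: str) -> bool:
    """True if the quote, minus its bracketed edits, occurs in the source."""
    return quote_to_regex(quote).search(source) is not None
```

A quote that fails this check would not necessarily be rejected; it could simply be flagged as suspicious for a human to review.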
Modify Parsing to handle Ellipses ..
“The quote was pulled from a sentence .. and the middle was skipped. This was noted with an ellipsis.” It would be nice if the middle of the quote could be expanded.
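Expanding the skipped middle could work roughly as follows: split the quote at each ellipsis, locate the pieces in the source in order, and return everything between the first and last piece. This is a sketch under assumptions: `expand_ellipses` is a hypothetical name, and two or more dots (or the single "…" character) are treated as an ellipsis.

```python
import re
from typing import Optional

# Two or more dots, or the single-character ellipsis, with surrounding spaces.
ELLIPSIS = re.compile(r"\s*(?:\.{2,}|\u2026)\s*")


def expand_ellipses(quote: str, source: str) -> Optional[str]:
    """Return the full source passage covered by an elided quote.

    Returns None if any piece of the quote is missing from the source
    or the pieces appear out of order.
    """
    segments = [s for s in ELLIPSIS.split(quote) if s]
    start = None
    pos = 0
    for segment in segments:
        index = source.find(segment, pos)
        if index == -1:
            return None
        if start is None:
            start = index
        pos = index + len(segment)  # the next piece must come later
    if start is None:
        return None
    return source[start:pos]
```

The returned passage is the text a reader would see if the ellipsis were expanded in place.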
Implement Google Text fragments in Python if necessary: Allow unique specification if phrase occurs multiple times in a document.
This could be done by specifying enough of the “before” and “after” text to make the phrase unique.
JavaScript: Port the Python-based web service to JavaScript. The advantage of a JavaScript-based lookup service is that it could run in the browser and use Google's Text Fragments to specify which instance of a quote to link to.
How does CiteIt have to be modified to allow quotations of quotations, nested multiple levels deep?
Make YouTube (and other) transcripts highlight the current word/s while the recording plays. This will likely require creating a format for an intermediate data structure which stores the start and end times for each word/phrase in the transcript. (example: YouTube Speech-Text API script)
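The intermediate data structure could be as simple as a list of words sorted by start time, with a binary search to find the word being spoken at the current playback position. The `TimedWord` class and `word_at` function below are hypothetical names for this sketch, and the timings are made-up sample values rather than real transcript data.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording


def word_at(words: List[TimedWord], t: float) -> Optional[int]:
    """Return the index of the word spoken at time t, or None during a pause.

    Assumes `words` is sorted by start time and the words do not overlap.
    """
    starts = [w.start for w in words]
    i = bisect_right(starts, t) - 1  # last word starting at or before t
    if i >= 0 and t < words[i].end:
        return i
    return None


# Made-up sample timings for illustration only.
transcript = [
    TimedWord("Otto", 0.0, 0.4),
    TimedWord("von", 0.5, 0.7),
    TimedWord("Bismarck", 0.7, 1.3),
]
```

A player would call `word_at(transcript, current_time)` on each time update and highlight the word at the returned index. In a real implementation the list of start times would be built once rather than on every call.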
Modify Document class to save a copy of the original file in its original encoding to S3-style storage
Call the archive.org API to archive the citing and cited pages if the page is new or its hash has changed
It would be nice if the archive process didn’t slow down the citation process. Perhaps this means that the archive process (which could take several seconds) should be done asynchronously.
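A sketch of how the hash check and the background archiving might fit together with `asyncio` is shown below. Everything here is an assumption made for illustration: the in-memory `_seen_hashes` dictionary stands in for a real datastore, `slow_archive` is a placeholder for the actual archive.org request, and `cite` is a hypothetical entry point rather than CiteIt's own.

```python
import asyncio
import hashlib

# In-memory stand-in for a real store: url -> SHA-256 hex digest.
_seen_hashes = {}


def needs_archiving(url: str, content: bytes) -> bool:
    """True if the page is new or its content hash has changed."""
    digest = hashlib.sha256(content).hexdigest()
    if _seen_hashes.get(url) == digest:
        return False
    _seen_hashes[url] = digest
    return True


async def slow_archive(url: str) -> str:
    """Placeholder for the real archive request, which may take seconds."""
    await asyncio.sleep(0.01)
    return f"archived {url}"


async def cite(url: str, content: bytes, archive=slow_archive):
    """Return the citation result right away; archive in the background."""
    task = None
    if needs_archiving(url, content):
        # Schedule the archive call without waiting for it to finish.
        task = asyncio.create_task(archive(url))
    result = {"url": url, "sha256": _seen_hashes[url]}
    return result, task
```

The caller gets the citation result immediately and can await the returned task later, or hand it to a job queue, so a slow archive request never delays the citation itself.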
Set up tests to verify that changes to the web service do not break existing quotes.
Set up a CI server to run the tests before code is merged on GitHub.
Add Ability to Standardize Quotes on Canonical Sources
The Bible has multiple versions and translations. Can quotation be done in a way that quotes from different versions and different websites are standardized?
Create a streamlined process for uploading audio and creating transcripts with Google Speech-to-Text. See work already done in Git Repo. Example: Otto von Bismarck biography. (Links)
Develop a web interface to crowdsource the process of cleaning up auto-generated transcripts. Would a wiki be of use here?
I released the CiteIt client and server code under an open source license because I want the concept of Contextual Citations to spread as widely as possible.
I wanted to remove this objection to adopting CiteIt, so I chose an open source license that does not require derivative works to be open-sourced. Use it as you like. Feel free to sell your derivative works without revealing your source code changes. Just don't sue me! :-)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.