Zenodotus is an archive system that provides durable, long-term storage for fact-checked media, primarily for research purposes. The project is named after Zenodotus, the first superintendent of the Library of Alexandria and the man credited with inventing the first tagging system.
Additional documentation can be found in the /docs folder.
There are a few prerequisites that you need on your machine to run this system. All of this was designed for macOS, though any Linux distribution should be pretty similar. As for Windows, I have no idea, though @oneroyalace may be able to help with that. (I imagine the answer is WSL.)
Things we need to install include (steps below for all this):
- Homebrew (for macOS, or other package manager for other systems)
- Ruby (see /.ruby-version for the current version)
- Chrome (latest version)
- PostgreSQL (13 or higher)
- Redis
- ChromeDriver
- ffmpeg
- vips
- Yarn (v1) for JavaScript linting only
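As a quick sanity check before going further, something like the following sketch reports which of the tools above are not yet on your PATH. The function name missing_commands is hypothetical, and the command names passed to it (psql, redis-server, and so on) are assumptions about how each tool is exposed on your system. Chrome is left out because on macOS it is an app bundle rather than a command.

```shell
# Print each command from the argument list that is not on PATH.
# Returns non-zero if anything is missing.
missing_commands() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Command names are assumptions; adjust them for your platform.
missing_commands ruby psql redis-server chromedriver ffmpeg vips yarn || true
```

This only checks that the commands exist. It does not check versions, so you still need to compare your Ruby and PostgreSQL versions against the list above.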
Homebrew is a package manager for macOS, similar to Apt or Yum in the Linux world. You'll want this if you don't have it because it makes installing the other prereqs SUPER easy. Install it from here.
Note: You may have to install the Xcode Command Line Tools for macOS: xcode-select --install.
The version of Ruby installed by your operating system, or available through standard package managers, is probably out of date and doesn't support multiple versions of Ruby on a single machine. To remedy both problems, we recommend using a Ruby version manager. Once installed, make sure to use the version of Ruby indicated in /.ruby-version.
✅ rbenv
This is our recommended Ruby version manager. It's lightweight, well maintained, and works pretty flawlessly.
Note: One of the Gems used in this project, dhash-vips, uses the Ruby source files to speed up image similarity processing. To ensure that rbenv keeps the Ruby source files locally, use the --keep flag when installing a new Ruby version, e.g. rbenv install 3.0.2 --keep. (In practice, run rbenv install --keep from the project root, and rbenv will pick up the correct version number from ./.ruby-version automatically.)
RVM is the classic Ruby version manager. Many developers have moved on to rbenv, but RVM continues to work perfectly fine, if a little heavier than rbenv.
❌ chruby
Don't use this one. For the most part, it's both too complex and too difficult, and it's probably not what you're actually looking for if you're reading this section anyways.
A version manager for multiple programming languages. Some people like it so you don't have to use multiple different managers on your machine. The Ruby plugin for it is here. I've never used it myself, but it's well maintained and I've heard good things.
Well, if you don't have this already I'm not sure what to tell you.
macOS:
You can download it from here for your system. Personally, if you can, just use the desktop version so you don't accidentally have a database running 24/7 in the background. Keep the default credentials unless you know what you're doing; this is just development, so we don't care about security and the like. Should you happen to change the credentials, you'll need to update the default Rails database settings in ./config/database.yml (and avoid committing the changes).
If pg_config isn't already in your PATH, locate it (try looking in /Library/PostgreSQL/{version_#}/bin) and add it so the pg gem can install properly.
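One way to add the directory is a small helper in your shell profile, sketched below. The function name prepend_to_path is hypothetical, and the version number 13 in the path is only an assumed example; substitute the version directory that actually exists on your machine.

```shell
# Prepend a directory to PATH unless it is already there.
prepend_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

# Assumed location for PostgreSQL 13; replace 13 with your installed version.
prepend_to_path "/Library/PostgreSQL/13/bin"
```

Afterwards, command -v pg_config should print the path to the binary, and the pg gem should be able to find it during bundle install.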
Ubuntu:
PostgreSQL 13 can be installed using apt. To do so, add the Postgres APT repository to your machine, then install the packages listed below.
sudo apt -y install bash-completion wget
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" |sudo tee /etc/apt/sources.list.d/pgdg.list
sudo apt update
sudo apt install postgresql-13 libpq-dev
If you'd like to use Postgres without a password, you'll need to update your pg_hba.conf file to trust local users. See here for instructions.
Redis backs Sidekiq and Action Cable, and a Redis server will need to be running on your machine while using the scraping portions of the app.
- macOS:
brew install redis
Ubuntu:
On Ubuntu 18.04, the yarn keyword is associated with another tool, cmdtest. To get the desired Yarn, run:
sudo apt remove cmdtest
sudo apt remove yarn
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt-get update && sudo apt-get install yarn
ChromeDriver is used for scraping.
- macOS:
brew install --cask chromedriver
- Ubuntu:
sudo apt-get install chromium-driver
ffmpeg is a video processing library. It's used on that Mars helicopter and at YouTube, so it's fine. We need to install it to process previews for videos.
- macOS:
brew install ffmpeg
- Ubuntu:
sudo apt-get install ffmpeg
vips is a faster image manipulation library than ImageMagick.
- macOS:
brew install vips
Yarn is an open-source JavaScript package manager used to install and manage JavaScript dependencies. We only use it for installing development dependencies, specifically ESLint for linting our JavaScript. We do not install Node packages for use in the app itself. Instead, we follow the Rails 7+ preference for importmaps.
Note: Yarn went through a significant architectural change in v2, while we continue to use the "classic" v1.
- Install all the prerequisites, including the version of Ruby indicated in /.ruby-version, ensuring Ruby source files are stored locally (--keep)
- Clone this repo: git clone https://github.com/TechAndCheck/zenodotus
- Navigate into the project folder: cd zenodotus (or whatever)
- Optional: If you intend to develop the birdsong and zorki Gems, clone them to a separate location and update the Zenodotus ./Gemfile to point to the local instances
- Install all the Gems: bundle install (this may take a few minutes)
- Make sure Postgres is running
- Set up the database: rails db:create && rails db:setup
- Set up your environment variables:
  - For local development, touch config/application.yml and ask another developer for the config values
  - For production, make sure the environment variables are set properly
- Add the following entries to your /etc/hosts file (or equivalent, if you have a more complex routing setup):
  127.0.0.1 www.factcheckinsights.local
  127.0.0.1 vault.factcheckinsights.local
  - These are just the suggested defaults. Feel free to replace this with an alternative routing method or different URLs.
- Bootstrap assets: rails assets:precompile
- In your shell, run rails s (or ./bin/dev if you will be editing the styles or markup)
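If the app does not load at the suggested URLs, the /etc/hosts entries from the setup steps are a common thing to double-check. The sketch below prints any hostname that has no entry in a hosts file. The function name missing_hosts is hypothetical, and the check assumes the default hostnames; adjust it if you chose different URLs or a different routing method.

```shell
# Print each given hostname that has no entry in a hosts file.
# Usage: missing_hosts FILE HOST...
missing_hosts() {
  file="$1"; shift
  for host in "$@"; do
    # Look for the hostname as a whole field on a line that is not a comment.
    if ! awk -v h="$host" '
      /^[[:space:]]*#/ { next }
      { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }' "$file"; then
      echo "$host"
    fi
  done
}

# Default hostnames suggested in the setup steps.
missing_hosts /etc/hosts www.factcheckinsights.local vault.factcheckinsights.local
```

No output means both entries are present.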
✨ The app should now be running and available at http://www.factcheckinsights.local:3000 (Insights) and http://vault.factcheckinsights.local:3000 (MediaVault), or whatever URLs you chose. If not, contact @cguess or another developer.
If you plan to do anything that will trigger the scraper, such as archiving a new URL or checking the status of jobs, you will also need to fire up Redis and Sidekiq:
- Make sure Redis is running (e.g., redis-server from anywhere on your system)
- Start Sidekiq from the project directory: sidekiq
Personally, I love tmux. It gives you multiple consoles that you can hide, bring back, and tile, and it lets you automate stuff. It's just awesome. You can easily install it through most package managers, like brew install tmux (for Homebrew) or sudo apt-get install tmux (for Debian/Ubuntu and the like).
The reason this is mentioned here is that this repo is set up to use tmuxinator, a tmux manager that lets you script the setup of windows, panes, and the like.
If you want to use this (I recommend it), do the following:
- Install tmux, see above
- Duplicate the .tmuxinator.yml.example file in the root directory of this project
- Rename the new file, removing the .example at the end of the filename (do not commit the new one)
- Change the root key in .tmuxinator.yml to reflect the project's path on your machine
- Make sure you ran bundle install when setting the project up in the first place
- Run tmuxinator on the command line, and you should see everything pop up and start running properly.
We mostly use a standard Rails stack, with a few new things that are generally recommended by the Rails core team.
- We use the Rails 7+ importmap method for managing JavaScript assets
- We use StimulusJS for our JavaScript
- We use Turbo for all the page load stuff
- We use Sorbet to add type-checking to our Ruby and prevent a bunch of runtime bugs early
- We use Rubocop for linting Ruby and ESLint for linting JavaScript
See docs/DEVELOPMENT.md for additional development setup instructions.
This app can run on bare metal using Docker Compose. The compose stack includes:
- Postgres, Redis, Memcached, Neo4j (internal only)
- Web (Puma) and Sidekiq worker
- Caddy reverse proxy to web:3000
- Install Docker and Docker Compose v2
- Create environment file:
  - Copy env.example to .env
  - Fill required values: SECRET_KEY_BASE, AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET_NAME, CLOUDFRONT_HOST
  - Optionally adjust: FACT_CHECK_INSIGHTS_HOST, MEDIA_VAULT_HOST, PUBLIC_LINK_HOST
- Build images: docker compose build
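Before starting the stack, it can help to confirm that the required values listed above are actually filled in. The following is a minimal sketch, not part of the project: the function name missing_env_keys is hypothetical, and it assumes .env uses plain KEY=value lines with no export prefix or spaces around the equals sign.

```shell
# Print every required key that is absent or empty in an env file.
# Usage: missing_env_keys FILE KEY...
missing_env_keys() {
  file="$1"; shift
  for key in "$@"; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$file" 2>/dev/null; then
      echo "$key"
    fi
  done
}

# Required keys from the list above; run from the directory containing .env.
missing_env_keys .env \
  SECRET_KEY_BASE AWS_REGION AWS_ACCESS_KEY_ID \
  AWS_SECRET_ACCESS_KEY AWS_S3_BUCKET_NAME CLOUDFRONT_HOST
```

No output means every required key has a value. If .env does not exist yet, all six keys are reported.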
# Start backing services first (optional; web will wait on healthchecks)
docker compose up -d db redis memcached neo4j
# Start application services
docker compose up -d web worker caddy
# Stop all
docker compose down
- Postgres: pg_isready (with start_period)
- Redis: redis-cli ping
- Neo4j: cypher-shell RETURN 1;
- Web: curl http://localhost:3000/ with Host header
web waits for healthy db, redis, and neo4j before starting. Neo4j and Postgres ports are not exposed externally.
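If you want to script against the stack from the host (for example, to wait until the web service answers before running a smoke test), a small retry helper along these lines may be useful. This is a sketch under assumptions: wait_for is a hypothetical name, and the Host value in the commented example is an assumption that should match whatever hostname your deployment serves.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (commented out so nothing runs on paste): poll the web service
# with a Host header, up to 30 times, 2 seconds apart.
# wait_for 30 2 curl -fsS -H "Host: vault.factcheckinsights.org" http://localhost:3000/
```

The helper returns 0 as soon as the command succeeds and 1 if every attempt fails, so it can gate later steps in a script.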
docker compose logs -f web
docker compose logs -f worker
docker compose logs -f db redis memcached neo4j caddy
docker compose up -d --scale worker=3
# Rails console
docker compose exec web bundle exec rails console
# DB migrations (also auto-run by web entrypoint on boot)
docker compose exec web bundle exec rails db:migrate
# One-off rake task
docker compose run --rm web bundle exec rake your:task
Production uses S3 via Shrine. No persistent local volumes are mounted for uploads. Configure S3 credentials in .env.
- web entrypoint runs migrations by default (RUN_MIGRATIONS=true); worker sets it to false
- Caddy proxies to web:3000; configure DNS for vault.factcheckinsights.org to point at the host
- To rebuild after code changes:
docker compose build web worker
docker compose up -d web worker
See LICENSE for the terms governing this software.