Zenodotus (Ζηνόδοτος)

Zenodotus is an archive system for fact-checked media, built to provide durable, long-term storage, primarily for research purposes. The project is named after Zenodotus, the first superintendent of the Library of Alexandria and the man credited with inventing the first tagging system.

Additional documentation can be found in the /docs folder.

Setup

Prerequisites

There are a few prerequisites that you need on your machine to run this system. All of this was designed for macOS, though any Linux distribution should be pretty similar. As for Windows, I have no idea, though @oneroyalace may be able to help with that. (I imagine the answer is WSL.)

Things we need to install include (steps below for all this):

Homebrew (for macOS, or other package manager for other systems)
Ruby (see /.ruby-version for the current version)
Chrome (latest version)
PostgreSQL (13 or higher)
Redis
ChromeDriver
ffmpeg
vips
Yarn (v1) for JavaScript linting only
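
If you want a quick way to see which of these are already installed, a small check like the one below may help. It's only a sketch: the command names (brew, psql, redis-server, and so on) are assumptions about how each tool shows up on your PATH, and Chrome is left out because it usually isn't on PATH at all.

```shell
#!/bin/sh
# Print each command from the argument list that is not found on PATH.
missing_cmds() (
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
)

# Assumed command names; adjust them for your platform and package manager.
missing=$(missing_cmds brew ruby psql redis-server chromedriver ffmpeg vips yarn)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:"
  echo "$missing"
else
  echo "All prerequisites found on PATH."
fi
```

On Linux, brew will naturally show up as missing; drop it from the list.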

Homebrew

A package manager for macOS similar to Apt or Yum in the Linux world. You'll want this if you don't have it because it makes installing the other prereqs SUPER easy. Install it from here.

Note: You may have to install the Xcode Command Line Tools for macOS: xcode-select --install.

Ruby

The version of Ruby installed by your operating system, or available through standard package managers, is probably out of date and doesn't support multiple versions of Ruby on a single machine. To remedy both problems, we recommend using a Ruby version manager. Once installed, make sure to use the version of Ruby indicated in /.ruby-version.
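
To confirm the active Ruby matches the pinned one, something like the following can be run from the project root. This is a sketch: pinned_ruby_version is a hypothetical helper, and stripping a leading ruby- prefix is an assumption about how some tools write the version file.

```shell
#!/bin/sh
# Print the Ruby version pinned in a version file (default: ./.ruby-version).
pinned_ruby_version() (
  file=${1:-.ruby-version}
  [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }
  # Strip whitespace and an optional "ruby-" prefix (an assumption, see above).
  tr -d '[:space:]' < "$file" | sed 's/^ruby-//'
)

# Compare against the active interpreter, if Ruby is installed and the file exists.
if command -v ruby >/dev/null 2>&1 && [ -r .ruby-version ]; then
  active=$(ruby -e 'print RUBY_VERSION')
  wanted=$(pinned_ruby_version)
  if [ "$active" = "$wanted" ]; then
    echo "Ruby $active matches .ruby-version"
  else
    echo "Ruby $active is active, but .ruby-version pins $wanted"
  fi
fi
```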

✅ rbenv

This is our recommended Ruby version manager. It's lightweight, well maintained, and works pretty flawlessly.

Note: One of the Gems used in this project, dhash-vips, uses Ruby source files to speed up image similarity processing. To ensure that rbenv stores the Ruby source files locally, use the --keep flag when installing a new Ruby version. E.g. rbenv install 3.0.2 --keep. (Note that you would actually use the command rbenv install --keep while in the project root, and rbenv would pick up the correct version number from ./.ruby-version automatically.)

RVM

This is the classic Ruby version manager. Many developers have moved on to rbenv, but RVM continues to work perfectly fine, if a little heavier than rbenv.

❌ chruby

Don't use this one. For the most part, it's both too complex and too difficult, and it's probably not what you're actually looking for if you're reading this section anyways.

ASDF

A version manager for multiple programming languages. Some people like it so you don't have to use multiple different managers on your machine. The Ruby plugin for it is here. I've never used it myself, but it's well maintained and I've heard good things.

Chrome

Well, if you don't have this already I'm not sure what to tell you.

PostgreSQL (13 or higher)

macOS:

You can download it from here for your system. Personally, if you can, just use the desktop version so you don't accidentally have a database running 24/7 in the background. Keep the default credentials unless you know what you're doing; this is just development, so we don't care about security and the like. Should you happen to change the credentials, you'll need to update the default Rails database settings in ./config/database.yml (and avoid committing the changes).

If pg_config isn't already in your PATH, locate it (try looking in /Library/PostgreSQL/{version_#}/bin) and add it so the pg gem can install properly.
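
One way to do that for the current shell session is sketched below. ensure_on_path is a hypothetical helper, and the default directory (version 13) is only an assumption; substitute the directory where pg_config actually lives on your machine.

```shell
#!/bin/sh
# Prepend dir to PATH when cmd is not already found and dir contains an executable cmd.
ensure_on_path() {
  cmd=$1
  dir=$2
  command -v "$cmd" >/dev/null 2>&1 && return 0
  [ -x "$dir/$cmd" ] || return 1
  PATH="$dir:$PATH"
  export PATH
}

# Assumed location; override PG_BIN_DIR to match your install.
ensure_on_path pg_config "${PG_BIN_DIR:-/Library/PostgreSQL/13/bin}" \
  || echo "pg_config not found; set PG_BIN_DIR to the right directory"
```

To make the change permanent, add the equivalent PATH line to your shell profile, then retry bundle install.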

Ubuntu:

PostgreSQL 13 can be installed using apt. To do so, add the PostgreSQL APT repository to your machine, then install the packages listed below.

sudo apt -y install bash-completion wget
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list
sudo apt update
sudo apt install postgresql-13 libpq-dev

If you'd like to use Postgres without a password, you'll need to update your pg_hba.conf file to trust local users. See here for instructions.

Redis

Redis backs Sidekiq and Action Cable, and a Redis server will need to be running on your machine while using the scraping portions of the app.

macOS: brew install redis

Ubuntu: sudo apt-get install redis-server

Yarn on Ubuntu:

On Ubuntu 18.04, the yarn command name is associated with another tool, cmdtest. To get the desired Yarn (described in the Yarn section), run:

sudo apt remove cmdtest
sudo apt remove yarn
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt-get update && sudo apt-get install yarn

ChromeDriver

Used for scraping.

macOS: brew install --cask chromedriver
Ubuntu: sudo apt-get install chromium-driver

ffmpeg

ffmpeg is a video processing library. It's used on that Mars helicopter and at YouTube, so it's fine. We need to install it to process previews for videos.

macOS: brew install ffmpeg
Ubuntu: sudo apt-get install ffmpeg

vips

A faster image manipulation library than ImageMagick.

macOS: brew install vips

Yarn

An open-source JavaScript package manager used to install/manage JavaScript dependencies. We only use it for installing development dependencies, specifically ESLint for linting our JavaScript. We do not install Node packages for use in the app itself. Instead, we use the Rails 7+ preference for importmaps.

Note: Yarn went through a significant architectural change in v2, while we continue to use the "classic" v1.

Installation

Install all the prerequisites, including the version of Ruby indicated in /.ruby-version, ensuring Ruby source files are stored locally (--keep)
Clone this repo: git clone https://github.com/TechAndCheck/zenodotus
Navigate into the project folder: cd zenodotus (or whatever)
Optional: If you intend to develop the birdsong and zorki Gems, clone them to a separate location and update the Zenodotus ./Gemfile to point to the local instances
Install all the Gems: bundle install (this may take a few minutes)
Make sure Postgres is running
Set up the database: rails db:create && rails db:setup
Set up your environment variables:
1. For local development, touch config/application.yml and ask another developer for the config values
2. For production, make sure the environment variables are set properly
Add the following entries to your /etc/hosts file (or equivalent, if you have a more complex routing setup):
```
127.0.0.1	www.factcheckinsights.local
127.0.0.1	vault.factcheckinsights.local
```
- These are just the suggested defaults. Feel free to replace this with an alternative routing method or different URLs.
Bootstrap assets: rails assets:precompile
In your shell, run rails s (or ./bin/dev if you will be editing the styles or markup)

✨ The app should now be running and available at http://www.factcheckinsights.local:3000 (Insights) and http://vault.factcheckinsights.local:3000 (MediaVault), or whatever URLs you chose. If not, contact @cguess or another developer.
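
If the URLs don't resolve, it's worth checking that the hosts entries from the steps above actually made it into /etc/hosts. A rough check, assuming the suggested default hostnames (has_local_host is a hypothetical helper):

```shell
#!/bin/sh
# Succeed when hosts_file maps 127.0.0.1 to the given hostname.
has_local_host() (
  hosts_file=$1
  name=$2
  # Skip comment lines, then look for the name among the fields after the IP address.
  awk -v name="$name" '
    /^[ \t]*#/ { next }
    $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit found ? 0 : 1 }
  ' "$hosts_file"
)

for host in www.factcheckinsights.local vault.factcheckinsights.local; do
  if has_local_host /etc/hosts "$host"; then
    echo "ok: $host"
  else
    echo "missing: $host"
  fi
done
```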

Starting the scraper

If you plan to do anything that will trigger the scraper, such as archiving a new URL or checking the status of jobs, you will also need to fire up Redis and Sidekiq:

Make sure Redis is running (e.g., redis-server from anywhere on your system)
Start Sidekiq from the project directory: sidekiq
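
Before starting Sidekiq, you can confirm Redis is actually answering. The sketch below assumes Redis is listening on its default host and port; redis_is_up is a hypothetical helper built on redis-cli ping.

```shell
#!/bin/sh
# Succeed when a local Redis server answers PING with PONG.
# Returns 2 when redis-cli itself is not installed.
redis_is_up() {
  command -v redis-cli >/dev/null 2>&1 || return 2
  [ "$(redis-cli ping 2>/dev/null)" = "PONG" ]
}

if redis_is_up; then
  echo "Redis is responding"
else
  echo "Redis is not responding; start it with redis-server"
fi
```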

Optional: tmux

Personally, I love tmux. It lets you keep multiple consoles open, hide them and bring them back, tile them, and automate stuff; it's just awesome. You can easily install it through most package managers, e.g. brew install tmux (for Homebrew) or sudo apt-get install tmux (for Debian/Ubuntu and the like).

The reason this is mentioned here is that this repo is set up to use tmuxinator, a tmux manager that lets you script the setup of windows, panes, and the like.

If you want to use it (I recommend it), do the following:

Install tmux, see above
Duplicate the .tmuxinator.yml.example file in the root directory of this project
Rename the new file, removing the .example at the end of the filename (do not commit the new one)
Change the root key in .tmuxinator.yml to reflect the project's path on your machine
Make sure you ran bundle install when setting the project up in the first place
Run tmuxinator on the command line, and you should see everything pop up and start running properly.
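
The copy-and-edit steps above could be scripted roughly like this. It's a sketch under a couple of assumptions: the example file has a top-level root: line, and the project path contains no | or & characters (they would confuse sed). setup_tmuxinator is a hypothetical helper, not something shipped in the repo.

```shell
#!/bin/sh
# Copy .tmuxinator.yml.example to .tmuxinator.yml inside project_dir and
# point the top-level "root:" key at that directory's absolute path.
setup_tmuxinator() (
  project_dir=$1
  cd "$project_dir" || exit 1
  if [ -f .tmuxinator.yml ]; then
    echo ".tmuxinator.yml already exists; leaving it alone" >&2
    exit 0
  fi
  sed "s|^root:.*|root: $(pwd)|" .tmuxinator.yml.example > .tmuxinator.yml
)

# Example (run from anywhere, passing the path to your clone):
# setup_tmuxinator ~/code/zenodotus
```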

Notes on tech used

We use a mostly standard Rails stack, with a few newer things that are generally recommended by the Rails core team.

We use the Rails 7+ importmap method for managing JavaScript assets
We use StimulusJS for our JavaScript
We use Turbo for all the page load stuff
We use Sorbet to add type-checking to our Ruby and prevent a bunch of runtime bugs early
We use Rubocop for linting Ruby and ESLint for linting JavaScript

Development

See docs/DEVELOPMENT.md for additional development setup instructions.

Docker deployment and management

This app can run on bare metal using Docker Compose. The compose stack includes:

Postgres, Redis, Memcached, Neo4j (internal only)
Web (Puma) and Sidekiq worker
Caddy reverse proxy to web:3000

One-time setup

Install Docker and Docker Compose v2
Create environment file:
- Copy env.example to .env
- Fill required values: SECRET_KEY_BASE, AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET_NAME, CLOUDFRONT_HOST
- Optionally adjust: FACT_CHECK_INSIGHTS_HOST, MEDIA_VAULT_HOST, PUBLIC_LINK_HOST
Build images:
```
docker compose build
```
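
Before starting the stack, a quick sanity check of .env can catch missing values. The sketch below treats a key as set when its line has at least one character after the equals sign (so a quoted empty value would still pass); missing_env_keys is a hypothetical helper.

```shell
#!/bin/sh
# Print each key that is absent or empty in the given env file.
missing_env_keys() (
  env_file=$1
  shift
  for key in "$@"; do
    # A key counts as set when a line reads KEY=<at least one character>.
    grep -Eq "^${key}=.+" "$env_file" || printf '%s\n' "$key"
  done
)

# Check the required values listed above, if .env exists in the current directory.
if [ -f .env ]; then
  missing_env_keys .env SECRET_KEY_BASE AWS_REGION AWS_ACCESS_KEY_ID \
    AWS_SECRET_ACCESS_KEY AWS_S3_BUCKET_NAME CLOUDFRONT_HOST
fi
```

No output means every required key has a value.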

Start/stop

# Start backing services first (optional; web will wait on healthchecks)
docker compose up -d db redis memcached neo4j

# Start application services
docker compose up -d web worker caddy

# Stop all
docker compose down

Healthchecks (compose-managed)

Postgres: pg_isready (with start_period)
Redis: redis-cli ping
Neo4j: cypher-shell RETURN 1;
Web: curl http://localhost:3000/ with Host header

web waits for healthy db, redis, and neo4j before starting. Neo4j and Postgres ports are not exposed externally.
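
To run the same kind of check by hand (say, while debugging a service that never turns healthy), a small retry helper can wrap any command. This is a generic sketch, not part of the compose setup; retry is a hypothetical helper, and the commented docker compose invocations assume the db and redis service names used above.

```shell
#!/bin/sh
# Run a command up to max_attempts times, sleeping delay seconds between tries.
# Succeeds as soon as the command succeeds; fails once attempts are exhausted.
retry() (
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    [ "$attempt" -ge "$max_attempts" ] && exit 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
)

# Examples (these need the compose stack to be running):
# retry 10 3 docker compose exec -T redis redis-cli ping
# retry 10 3 docker compose exec -T db pg_isready
```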

Logs

docker compose logs -f web
docker compose logs -f worker
docker compose logs -f db redis memcached neo4j caddy

Scaling workers

docker compose up -d --scale worker=3

Running tasks

# Rails console
docker compose exec web bundle exec rails console

# DB migrations (also auto-run by web entrypoint on boot)
docker compose exec web bundle exec rails db:migrate

# One-off rake task
docker compose run --rm web bundle exec rake your:task

Storage

Production uses S3 via Shrine. No persistent local volumes are mounted for uploads. Configure S3 credentials in .env.

Notes

web entrypoint runs migrations by default (RUN_MIGRATIONS=true); worker sets it to false
Caddy proxies to web:3000; configure DNS for vault.factcheckinsights.org to point at the host
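
For illustration, the RUN_MIGRATIONS gate could look something like the sketch below. This is not the repo's actual entrypoint, just one plausible shape of the logic; should_run_migrations is a hypothetical name.

```shell
#!/bin/sh
# Decide whether to migrate on boot. An unset RUN_MIGRATIONS counts as "true",
# matching the default described above; any other value skips migrations.
should_run_migrations() {
  [ "${RUN_MIGRATIONS:-true}" = "true" ]
}

# Hypothetical entrypoint flow:
# if should_run_migrations; then
#   bundle exec rails db:migrate
# fi
# exec "$@"
```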

To rebuild after code changes:

docker compose build web worker
docker compose up -d web worker

License

See LICENSE for the terms governing this software.
