This application crawls the National Bank of Romania (NBR) for RON (Romanian Leu) exchange rates and OpenWeatherMap (OWM) for weather conditions.
Clone this repository with git clone, then run ./composer install.
In the config directory, copy the contents of crawl.yml.dist into crawl.yml and adjust the values for your environment.
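The copy step can be scripted so that re-running the setup never overwrites a crawl.yml you have already edited. This is a minimal sketch: `init_crawl_config` is a hypothetical helper name, not something shipped with the application, and it assumes the template lives in the directory passed as the first argument (default: config).

```shell
# Hypothetical helper (not part of the application): create crawl.yml from
# the template only when it does not exist yet.
init_crawl_config() {
  config_dir="${1:-config}"
  if [ ! -f "$config_dir/crawl.yml.dist" ]; then
    echo "missing $config_dir/crawl.yml.dist" >&2
    return 1
  fi
  if [ -f "$config_dir/crawl.yml" ]; then
    # Keep local settings untouched on repeated runs.
    echo "$config_dir/crawl.yml already exists, leaving it untouched"
    return 0
  fi
  cp "$config_dir/crawl.yml.dist" "$config_dir/crawl.yml"
  echo "created $config_dir/crawl.yml"
}
```

From the repository root, call it as `init_crawl_config config`, then edit config/crawl.yml by hand.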
For now, data is persisted in a MySQL database. Fill in the appropriate values for your environment:

    crawl:
      storage:
        mysql:
          host: 127.0.0.1
          db: crawler
          user: crawler
          password: ~

You just need to create the database; the tables are automatically maintained by the application (see Migrations below).
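Creating the empty database could look like the sketch below. It assumes the database name crawler from the `db` value above; the utf8mb4 character set and the file name create_crawler_db.sql are illustrative choices, not requirements stated by the project.

```shell
# Assumptions: database name "crawler" (the `db` config value); the character
# set and collation are illustrative, not mandated by the application.
cat > create_crawler_db.sql <<'SQL'
CREATE DATABASE IF NOT EXISTS crawler
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;
SQL

# Review the file, then apply it with an account allowed to create databases:
#   mysql -h 127.0.0.1 -u root -p < create_crawler_db.sql
cat create_crawler_db.sql
```

The account configured under `user` also needs privileges on this database, since the application creates and updates its own tables.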
Data sources for exchange rates are configured through this section:

    crawl:
      exchange:
        notification: false
        sources:
          -
            class: 'Stingus\Crawler\Exchange\NbrCrawler'
            url: 'http://www.bnro.ro/nbrfxrates.xml'
          -
            class: 'Stingus\Crawler\Exchange\InforeuroCrawler'
            url: 'http://ec.europa.eu/budg/inforeuro/api/public/monthly-rates'

To skip crawling the Inforeuro exchange rates, remove that entry from the config. The NBR crawler MUST be left in place, because it provides the reference date for each crawl.
Data sources for weather are configured through this section:

    crawl:
      weather:
        notification: false
        unit: 'C'
        sources:
          -
            class: 'Stingus\Crawler\Weather\OpenWeatherCrawler'
            url: 'http://api.openweathermap.org/data/2.5'
            stations: [683506]
            lang: 'en'
            apiKey: 'abcdef'

OpenWeatherMap (OWM) is already built in, but other sources could easily be added. It provides geolocation for the selected station IDs, sunset, sunrise, atmospheric pressure, humidity and a 5-day forecast.
- For units you can use 'C' for Celsius or 'F' for Fahrenheit
- You don't need to change the url value, it's already set to use the OWM APIs
- Station IDs can be found on the OWM website
- You can customize the lang parameter with any of the supported languages. This setting will get the weather conditions in the desired language
- The apiKey can be obtained from your OWM account
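To check an API key and station ID before running the crawler, it can help to build the corresponding OWM request by hand. The sketch below shows one way the settings above could map onto an OWM 2.5 request. The query parameters `id`, `units`, `lang` and `appid` are OWM's own; the helper `owm_url`, the environment variables `OWM_LANG` and `OWM_API_KEY`, and the mapping of 'C' to metric and 'F' to imperial are assumptions for illustration, and the crawler may build its requests differently.

```shell
# Hypothetical helper: print the OWM request URL for one station.
#   $1 = endpoint ("weather" for current conditions, "forecast" for the 5-day forecast)
#   $2 = station ID
#   $3 = unit as used in crawl.yml ('C' or 'F')
owm_url() {
  base='http://api.openweathermap.org/data/2.5'
  case "$3" in
    C) units=metric ;;    # assumed mapping: Celsius -> units=metric
    F) units=imperial ;;  # assumed mapping: Fahrenheit -> units=imperial
    *) echo "unsupported unit: $3" >&2; return 1 ;;
  esac
  printf '%s/%s?id=%s&units=%s&lang=%s&appid=%s\n' \
    "$base" "$1" "$2" "$units" "${OWM_LANG:-en}" "${OWM_API_KEY:-abcdef}"
}

owm_url weather 683506 C
```

With a real key exported as `OWM_API_KEY`, the printed URL can be passed to curl to confirm that the key and station ID are accepted.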
If you'd like to receive error notifications when running the crawlers, you can set them up in this config section:

    crawl:
      notification:
        email: <your_email>
        smtp_host: <your_smtp_server>
        smtp_port: <your_smtp_port>
        smtp_user: <optional_smtp_username>
        smtp_password: <optional_smtp_password>
        smtp_from: <your_from_email>

You'll also need to enable the notifications on each crawler section:

    crawl:
      ...
      exchange:
        notification: true
        ...
      weather:
        notification: true

You can disable notifications per crawler section, or entirely by removing the whole notification section.
For exchange rates, run the bin/exchange command; for weather, run bin/weather.
The application checks whether the DB schema is in place and creates it if required.
The data is stored in the exchange and weather tables.
You might want to use a cron job to run the scripts. For the exchange rates, it's recommended to run the crawler after 11am UTC, when the NBR updates the numbers. The weather crawler can be run on an hourly basis.
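A schedule along those lines could be written as follows. This is only a sketch: /path/to/crawler is a placeholder for your checkout, the 11:30 slot is an arbitrary time after 11am, and the entries assume the server clock is set to UTC.

```shell
# Placeholder path; set CRAWLER_DIR to the directory of your checkout.
CRAWLER_DIR="${CRAWLER_DIR:-/path/to/crawler}"

# Write the entries to a file first so they can be reviewed before installing.
cat > crawler.cron <<EOF
# Assumes the server clock is UTC; shift the hour otherwise.
# Exchange rates: daily at 11:30, after the NBR updates the numbers.
30 11 * * * cd $CRAWLER_DIR && bin/exchange
# Weather: at the top of every hour.
0 * * * * cd $CRAWLER_DIR && bin/weather
EOF

cat crawler.cron
```

Running `crontab crawler.cron` installs the file, but it replaces the current user's whole crontab, so merge the entries by hand with `crontab -e` if other jobs are already scheduled.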
The application maintains the schema automatically, by checking that it is valid before each run. If a newly crawled exchange rate needs a new column, just update your checkout to the latest version and that's it :)
If you'd like to check for schema updates independently of the crawling process, run the bin/migration command.
Don't alter the version table in any way! Doing so will render the schema migration system useless!
You can run the tests using the vendor/bin/phpunit command.
#happycrawling!