This example demonstrates how to combine Stata and R in a single Docker image.
The same approach could also be used to install Pandoc or LaTeX, if Stata needs them.
Alternatively, one could use a JupyterLab setup, install the stata_kernel, and then copy in the Stata binaries.
- Edit the init.config.txt to have the desired values for the Docker image you will create:
VERSION=17
# the TAG can be anything, but could be today's date
TAG=$(date +%F)
MYHUBID=larsvilhuber
MYIMG=${PWD##*/}
STATALIC=/path/to/stata.lic
where
  - VERSION is the Stata version you want to use (this might be ignored right now)
  - TAG is the Docker tag you will be using - could be "latest", could be a specific name. Has to be lower-case.
  - MYHUBID is presumably your Docker login
  - MYIMG is the name you want to give the Docker image you are creating. By default, it presumes that it will be the same name as the Git repository.
  - STATALIC is the path to a valid Stata license file (for instance, as installed on your laptop)
- Edit the Dockerfile. The primary configuration parameters are at the top:
ARG SRCVERSION=17
ARG SRCTAG=2022-01-17
ARG SRCHUBID=dataeditors
ARG RVERSION=4.1.0
ARG RTYPE=verse
where
  - SRCVERSION is the Stata version you want to use
  - SRCTAG is the tag of the Stata version you want to use as an input
  - SRCHUBID is where the Stata image comes from - should probably not be modified, but you could use your own.
  - RVERSION and RTYPE are used to pin the rocker/RTYPE:RVERSION versioned image. Adjust as necessary.
- Finally, edit the setup.do file, which will install any Stata packages into the image.
Use build.sh (NAME OF STATA LICENSE FILE), e.g.
./build.sh
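The shell expansions used in init.config.txt can be checked in isolation. The sketch below uses a hypothetical project path and a fixed tag in place of the live values, and it assumes that the image is named MYHUBID/MYIMG:TAG; the actual build.sh may combine the values differently.

```shell
#!/bin/sh
# Sketch only: the path and tag below are hypothetical stand-ins.
project_dir=/home/user/projects/stata-r   # stand-in for $PWD
MYHUBID=larsvilhuber
MYIMG=${project_dir##*/}   # strips everything up to the last "/", leaving the directory name
TAG=2022-01-17             # init.config.txt uses $(date +%F), which prints YYYY-MM-DD

# Assumed naming scheme: <hub id>/<image name>:<tag>
IMAGE="$MYHUBID/$MYIMG:$TAG"
echo "$IMAGE"
```

Because MYIMG is derived from the directory name, the image name changes if the repository is cloned into a differently named directory.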
Because Stata is licensed software, you need a valid license to run it. If you copy the license into the Docker image itself, you should not publish the image, since it will permanently contain the license. This holds even if you add the license in one build step and remove it in a later one, because the earlier image layer still contains the file (unless you become really tricky...).
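One way to make the license available during a build without writing it to an image layer is a BuildKit build secret. The sketch below only assembles and prints such a command. The secret id statalic and the image name are illustrative assumptions, and the Dockerfile would need a matching RUN --mount=type=secret instruction, which is not shown here.

```shell
#!/bin/sh
# Sketch only: values are placeholders, and the secret id "statalic" is an assumption.
STATALIC=/path/to/stata.lic
IMAGE=larsvilhuber/stata-r:2022-01-17

# --secret exposes the file only to RUN steps that mount it; it is not stored in a layer.
BUILD_CMD="DOCKER_BUILDKIT=1 docker build . --secret id=statalic,src=$STATALIC -t $IMAGE"

# Print the command for inspection instead of running it.
echo "$BUILD_CMD"
```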
How, then, should you build Stata packages into the image? You should not. Instead, include them as part of the replication package. You can, however, use the container to set them up as follows:
- Build the container - no license file required.
- Run the container a first time, with the setup script that installs the packages. You should include a configuration that uses a project-specific ado directory. You will later use the same config to run the code itself, so your setup program might look like this:
// setup.do
include "config.do"
// install packages
ssc install estout
- Your config.do would look like this one.
- Your main.do would not re-execute the setup.do, but would include the config.do that redirects Stata to use the project-specific ado directory:
// main.do
include "config.do"
// rest of the code
- Your replication package would include the setup.do, the config.do, the main.do, and the ado directory created.
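The layout described above can be sketched as a short script that writes the three do-files into a scratch directory. The content of config.do is an assumption: it uses Stata's sysdir set PLUS command to point package installation at a hypothetical ado/plus directory inside the project, and the real config.do may do more.

```shell
#!/bin/sh
# Sketch only: file contents are illustrative, not the repository's actual files.
set -e
pkg=$(mktemp -d)            # scratch directory standing in for the replication package
mkdir -p "$pkg/ado"

# config.do: redirect Stata's PLUS directory to the project-specific ado directory
cat > "$pkg/config.do" <<'EOF'
// config.do (assumed content)
sysdir set PLUS "ado/plus"
EOF

# setup.do: run once, installs packages into the redirected directory
cat > "$pkg/setup.do" <<'EOF'
// setup.do
include "config.do"
// install packages
ssc install estout
EOF

# main.do: reuses the same config but does not reinstall anything
cat > "$pkg/main.do" <<'EOF'
// main.do
include "config.do"
// rest of the code
EOF

ls "$pkg"
```

Keeping the redirection in a single config.do means that setup.do and main.do cannot disagree about where the packages live.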
You also need the Stata license for running it all. For convenience, use the run.sh script:
./run.sh
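A run script of this kind typically mounts the license into the container at run time, so that it never becomes part of the image. The sketch below assembles and prints such a command. The license location inside the container (/usr/local/stata/stata.lic), the /project mount point, and the image name are assumptions, and the actual run.sh may differ.

```shell
#!/bin/sh
# Sketch only: values and mount points are assumptions.
STATALIC=/path/to/stata.lic
IMAGE=larsvilhuber/stata-r:2022-01-17

# Mount the license read-only, and mount the current directory as the working directory.
RUN_CMD="docker run -it --rm \
  -v $STATALIC:/usr/local/stata/stata.lic:ro \
  -v $PWD:/project -w /project \
  $IMAGE"

# Print the command for inspection instead of running it.
echo "$RUN_CMD"
```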