Binary file modified lectures3/Pythonlearn-02-Expressions.pdf
Binary file not shown.
Binary file modified lectures3/Pythonlearn-02-Expressions.pptx
Binary file not shown.
14 changes: 10 additions & 4 deletions lectures3/README.md
@@ -4,8 +4,10 @@

The `convert2pdf.sh` script converts all `.pptx` files in the current directory and its subdirectories into PDF format using LibreOffice’s command-line interface.

Each converted PDF is saved in a `pdf` subdirectory located in the same directory as its source `.pptx` file. If the `pdf` directory does not exist, it is created automatically.
Each PDF is saved in the same directory as its source `.pptx` file.

Each PDF is exported with PDF/UA (ISO 14289) compliance and accessibility tags enabled.
See https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html
---

## Requirements
@@ -70,7 +72,7 @@ Ensure this path is included in your system `PATH` environment variable, or upda

* Recursively searches for `.pptx` files starting from the current directory
* Converts each file to PDF using LibreOffice in headless mode
* Outputs each PDF into a `pdf/` subdirectory alongside the original file
* Outputs each PDF in the same directory as the PPTX file

---

@@ -107,12 +109,16 @@ Run the script from the directory containing your PowerPoint files:

The script will process all `.pptx` files found in the current directory and its subdirectories.

It will only generate a PDF when the PPTX is newer or the PDF does not exist.

---

## Notes
* There's no official doc for the CLI convert to filter option names that correspond to the Impress UI options
, but the option names can be found in the [source code](https://opengrok.libreoffice.org/xref/core/filter/source/pdf/). Look in files
* PDF export command line options are documented [here](https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html)

* Related source code can be found [here](https://opengrok.libreoffice.org/xref/core/filter/source/pdf/). Look in files
`pdfexport.cxx` and `impdialog.cxx`.

* The script assumes `.pptx` files are valid and readable by LibreOffice
* Output PDFs will overwrite existing files with the same name in the `pdf` directory
* Font availability on the system may affect final rendering in the PDF
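
The README change above says a PDF is only generated when the PPTX is newer or the PDF does not exist. A minimal sketch of that check in isolation follows. The helper name `needs_conversion` and the throwaway demo files are hypothetical and not part of this PR; the sketch uses the plain `[ ... ]` test with `-nt` rather than the `[[ ... ]]` form used in the script.

```shell
# Hypothetical helper (not in the PR): succeeds when the PDF should be regenerated.
needs_conversion() {
    local pptx_file="$1"
    # Assumes the PDF sits next to the PPTX with the same base name.
    local pdf_file="${pptx_file%.pptx}.pdf"
    # Regenerate when the PDF is missing or the PPTX has a newer modification time.
    [ ! -f "$pdf_file" ] || [ "$pptx_file" -nt "$pdf_file" ]
}

# Demo with empty placeholder files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/slides.pptx"
needs_conversion "$demo_dir/slides.pptx" && echo "convert: PDF missing"

touch "$demo_dir/slides.pdf"
# Backdate the PPTX so the PDF is clearly the newer file.
touch -t 200001010000 "$demo_dir/slides.pptx"
needs_conversion "$demo_dir/slides.pptx" || echo "skip: PDF is up to date"

rm -rf "$demo_dir"
```

Because the comparison relies on file modification times, a fresh checkout that resets timestamps may cause PDFs to be regenerated even when their content would not change.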
52 changes: 30 additions & 22 deletions lectures3/convert2pdf.sh
@@ -45,6 +45,8 @@ if [ -z "$JAVA_HOME" ]; then
echo "Please set JAVA_HOME manually if you encounter issues with LibreOffice."
else
echo "Detected JAVA_HOME: $DEFAULT_JAVA_HOME"
export JAVA_HOME="$DEFAULT_JAVA_HOME"
echo "Set JAVA_HOME to: $JAVA_HOME"
fi
else
echo "Using existing JAVA_HOME: $JAVA_HOME"
@@ -58,44 +60,50 @@ if [ -z "$SOFFICE_PATH" ]; then
exit 1
fi

# Define the PDF filter with options for PDF/UA compliance and tagged (accessible) PDF output
# Reference: https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html
PDF_FILTER='pdf:impress_pdf_Export:{"PDFUACompliance":{"type":"boolean","value":"true"},"UseTaggedPDF":{"type":"boolean","value":"true"}}'

# Find all .pptx files recursively starting from the current directory
pptx_files=$(find . -type f -name "*.pptx")

count=0
error_count=0
skip_count=0

# Convert each pptx file to pdf
for pptx_file in $pptx_files; do
convert_pptx_to_pdf() {
local pptx_file="$1"
# Get the directory containing the pptx file
dir=$(dirname "$pptx_file")
echo "Processing directory: $dir"

# Get the filename without extension
filename=$(basename "$pptx_file" .pptx)

# Create pdf subdirectory if it doesn't exist
pdf_dir="$dir/pdf"
mkdir -p "$pdf_dir"
if [ $? -ne 0 ]; then
echo "Error creating directory: $pdf_dir"
error_count=$((error_count + 1))
continue
fi

# Define output PDF path
# Define pdf paths
pdf_dir="$dir" # keep pdfs in same directory as pptx
pdf_file="$pdf_dir/$filename.pdf"

# Convert pptx to PDF
echo "Converting: $pptx_file to $pdf_file"
"$SOFFICE_PATH" --headless --convert-to pdf:impress_pdf_Export:{"PDFUACompliance":true,"UseTaggedPDF":true,"ReduceImageResolution":true,"MaxImageResolution":300,"UseLosslessCompression":false,"Quality":90} --outdir "$pdf_dir" "$pptx_file"

if [ $? -eq 0 ]; then
echo "Successfully converted: $pptx_file"
count=$((count + 1))
if [[ ! -f "$pdf_file" || "$pptx_file" -nt "$pdf_file" ]]; then
# Convert because PDF does not exist or PPTX is newer
echo "Converting: $pptx_file to $pdf_file"
"$SOFFICE_PATH" --headless --convert-to "$PDF_FILTER" --outdir "$pdf_dir" "$pptx_file"
if [ $? -eq 0 ]; then
echo "Successfully converted: $pptx_file"
count=$((count + 1))
else
echo "Error converting: $pptx_file"
error_count=$((error_count + 1))
fi
else
echo "Error converting: $pptx_file"
error_count=$((error_count + 1))
echo "Skipping: $pptx_file (PDF is up to date)"
skip_count=$((skip_count + 1))
fi
}

# Convert each pptx file to pdf
for pptx_file in $pptx_files; do
convert_pptx_to_pdf "$pptx_file"
done

echo "Conversion complete! $count files converted, $error_count files failed."
echo "Conversion complete! $count files converted, $error_count files failed, $skip_count files skipped."
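
One thing to consider for the loop above: `for pptx_file in $pptx_files` relies on word splitting, so a path such as `week 1/Intro slides.pptx` would be passed to the function in pieces. The sketch below shows one possible alternative that reads one path per line. `process_file` is a hypothetical stand-in for `convert_pptx_to_pdf`, and the demo directory and file names are made up for illustration. It handles spaces but not newlines in file names; in bash, `find -print0` combined with `read -d ''` would cover that case as well.

```shell
# Hypothetical stand-in for the PR's convert_pptx_to_pdf; it only reports the path.
process_file() {
    printf 'would convert: %s\n' "$1"
    count=$((count + 1))
}

count=0

# Throwaway demo tree containing a path with spaces.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/week 1"
touch "$demo_dir/week 1/Intro slides.pptx" "$demo_dir/plain.pptx"

# Read one path per line so spaces inside a path are preserved.
# The here-document keeps the loop in the current shell, so count survives it.
while IFS= read -r pptx_file; do
    [ -n "$pptx_file" ] || continue   # find printed nothing
    process_file "$pptx_file"
done <<EOF
$(find "$demo_dir" -type f -name '*.pptx')
EOF

echo "$count files found"
rm -rf "$demo_dir"
```

Feeding the loop from a here-document rather than a pipe matters here: a pipe would run the loop in a subshell, and the counters reported in the final summary line would stay at zero.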
Binary file added lectures3/es/1.1_Spanish.pdf
Binary file not shown.
Binary file added lectures3/es/1.2_Spanish.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added lectures3/es/3.1_Spanish.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file removed lectures3/es/pdf/1.1_Spanish.pdf
Binary file not shown.
Binary file removed lectures3/es/pdf/1.2_Spanish.pdf
Binary file not shown.
Binary file removed lectures3/es/pdf/3.1_Spanish.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-01-Intro.pdf
Binary file not shown.
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-03-Conditional.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-07-Files.pdf
Binary file not shown.
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-09-Dictionaries.pdf
Binary file not shown.
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-11-Regex.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-12-HTTP.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-13-WebServices.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-14-Objects.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-15-Databases.pdf
Binary file not shown.
Binary file added lectures3/gr/Pythonlearn-16-Data-Viz.pdf
Binary file not shown.
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-01-Intro.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-03-Conditional.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-07-Files.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-09-Dictionaries.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-11-Regex.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-12-HTTP.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-13-WebServices.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-14-Objects.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-15-Databases.pdf
Binary file not shown.
Binary file removed lectures3/gr/pdf/Pythonlearn-16-Data-Viz.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-01-Intro.pdf
Binary file not shown.
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-03-Conditional.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-06-Strings.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-07-Files.pdf
Binary file not shown.
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-09-Dictionaries.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-12-HTTP.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-13-WebServices.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-14-Objects.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-15-Databases.pdf
Binary file not shown.
Binary file added lectures3/ru/Pythonlearn-16-Data-Viz.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-01-Intro.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-03-Conditional.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-06-Strings.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-07-Files.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-09-Dictionaries.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-12-HTTP.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-13-WebServices.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-14-Objects.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-15-Databases.pdf
Binary file not shown.
Binary file removed lectures3/ru/pdf/Pythonlearn-16-Data-Viz.pdf
Binary file not shown.
Binary file added lectures3/ua/Pythonlearn-01-Intro.pdf
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file removed lectures3/ua/pdf/Pythonlearn-01-Intro.pdf
Binary file not shown.