This repository contains the full solution to the Week 1 assignment of the Dynamic Programming & Reinforcement Learning course. The task explores finite-horizon stochastic dynamic programming applied to a simplified inventory management problem with random demand and unreliable supply.
We:
- Define a state–action formulation of the system,
- Solve it by backward induction (Bellman recursion),
- Derive the optimal policy π*(t, x),
- Validate it through 1,000 Monte Carlo simulations, and
- Compare the simulated rewards with the theoretical DP expectation.
The full mathematical derivation and interpretation are provided in assignment-report.pdf,
and all code used for computation and visualization is available in dprl-w1-script.ipynb.
| Quantity | Symbol | Value |
|---|---|---|
| Expected maximal reward (DP) | V₁(5) | ≈ 32.37 |
| Mean reward (simulation, 1,000 runs) | | ≈ 32.35 |
These nearly identical values confirm that the simulated process reproduces the theoretical optimum and that the implemented policy performs as expected under uncertainty.
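A comparison of this kind can be reproduced with a short Monte Carlo loop. The sketch below is illustrative only: `run_episode` and the always-order placeholder policy are our own names (the reported mean uses the optimal π* computed in the notebook), and the start state x₀ = 5 is taken from the V₁(5) entry above; the dynamics it encodes are those described in the next section.

```python
import numpy as np

def run_episode(policy_fn, x0=5, T=150, hold=0.1, p_arrival=0.5, rng=None):
    """Simulate one episode of the inventory model and return its total reward."""
    rng = rng or np.random.default_rng()
    x, total = x0, 0.0
    for t in range(1, T + 1):
        a = policy_fn(t, x)                        # 0 = no order, 1 = order one unit
        demand = rng.random() < (t - 1) / 149      # Bernoulli demand, p_t = (t-1)/149
        sale = 1 if (demand and x > 0) else 0      # at most one unit sold, capped by stock
        arrival = 1 if (a == 1 and rng.random() < p_arrival) else 0
        total += sale - hold * x                   # revenue minus holding cost
        x = x - sale + arrival
    return total

# Placeholder policy for illustration; the reported mean uses pi*(t, x) from the DP solution.
always_order = lambda t, x: 1
rewards = [run_episode(always_order) for _ in range(1_000)]
print(f"mean reward over 1,000 runs: {np.mean(rewards):.2f}")
```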
The system evolves over T = 150 discrete periods.
- State (xₜ): current inventory
- Action (aₜ): {0 = no order, 1 = order one unit}
- Demand (Dₜ): Bernoulli(pₜ) with pₜ = (t−1)/149 (increasing demand probability)
- Order arrival (Aₜ): Bernoulli(0.5) when aₜ = 1
- Reward: sales revenue (+1 per item sold) − holding cost (0.1 × inventory)
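For reference, under the (assumed) lost-sales convention that at most one unit can be sold per period and only when stock is on hand, the one-step expected profit and the inventory transition take the form:

$$\mathbb{E}\big[\text{profit}(x,a)\big] = p_t \cdot \mathbf{1}\{x > 0\}, \qquad x' = x - \mathbf{1}\{D_t = 1,\, x > 0\} + a\,A_t .$$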
The Bellman recursion

$$V_t(x) = -0.1\,x + \max_{a}\, \mathbb{E}\big[\text{profit}(x,a) + V_{t+1}(x')\big]$$

was solved by backward induction to obtain V and π* over all (t, x).
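A compact backward-induction pass consistent with this recursion could look like the sketch below; the inventory cap `X_MAX`, the array layout, and the variable names are our own choices for illustration rather than the notebook's actual implementation.

```python
import numpy as np

T, X_MAX = 150, 160                 # horizon; X_MAX is an assumed cap on the inventory grid
HOLD, P_ARR = 0.1, 0.5              # holding cost per unit, delivery probability

V = np.zeros((T + 2, X_MAX + 1))    # V[T+1, :] = 0 is the terminal condition
policy = np.zeros((T + 1, X_MAX + 1), dtype=int)

for t in range(T, 0, -1):           # backward induction: t = T, ..., 1
    p = (t - 1) / 149               # demand probability in period t
    for x in range(X_MAX + 1):
        best_val, best_a = -np.inf, 0
        for a in (0, 1):
            arrivals = ((1, P_ARR), (0, 1 - P_ARR)) if a == 1 else ((0, 1.0),)
            val = 0.0
            for d, pd in ((1, p), (0, 1 - p)):        # demand realisations
                sale = 1 if (d == 1 and x > 0) else 0
                for arr, pa in arrivals:              # delivery realisations
                    x_next = min(x - sale + arr, X_MAX)
                    val += pd * pa * (sale + V[t + 1, x_next])
            if val > best_val:
                best_val, best_a = val, a
        V[t, x] = -HOLD * x + best_val
        policy[t, x] = best_a

print(V[1, 5])   # expected maximal reward from the initial state used in the results table
```

Wrapping the resulting table as `lambda t, x: policy[t, min(x, X_MAX)]`, passing it to the `run_episode` helper sketched earlier, and averaging over 1,000 runs is how the simulated mean in the results table can be reproduced.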
Although the task is presented in an operations-research context, its mathematical skeleton mirrors many financial optimization problems:
| Inventory Problem Concept | Financial Analogue |
|---|---|
| Inventory level | Cash buffer or risky-asset position |
| Ordering decision | Trade / rebalancing / hedging action |
| Uncertain delivery | Execution risk or market liquidity uncertainty |
| Random demand | Random cash outflow or market shock |
| Holding cost | Opportunity cost, financing rate, or risk penalty |
| Finite horizon | Investment horizon or regulatory reporting period |
Under this lens, the same DP structure could support:
- Liquidity-buffer optimization: deciding when to raise or deploy cash given uncertain outflows.
- Dynamic risk management: adjusting exposure to maintain capital adequacy under stochastic losses.
- Optimal trading & execution: planning orders when execution probability varies (analogous to our 0.5 arrival rate).
- Portfolio rebalancing: balancing expected return (sales revenue) against risk or transaction costs (holding cost).
Thus, even a simple DP exercise forms the computational foundation for dynamic asset-allocation, stochastic control, and reinforcement-learning approaches in quantitative finance.
This project demonstrates how dynamic programming translates theoretical stochastic control principles into quantifiable, testable strategies. Whether the “inventory” represents physical goods or financial exposure, the same Bellman logic applies:
Decide today, while thinking about tomorrow’s uncertainty.