This note traces the connection between structured prediction and reinforcement learning. It starts from the Structured-SVM and conditional random fields, and ends with entropy-regularized expected reward maximization and reward augmented maximum likelihood.

Structured-SVM

View.

Derive Multi-class SVM from Logistic Regression

View.

Reward Augmented Maximum Likelihood

View.