Paper critiques

Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf Restraining Specifications

What is the problem being addressed? The paper investigates how a reinforcement learning agent can learn while conforming to LTLf/LDLf restraining specifications, where the restraining bolt is defined over features of the world that are distinct from those used by the RL agent.
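
To make the separation between the agent's features and the bolt's features concrete, here is a minimal sketch and not the paper's construction. It assumes the restraining specification has already been compiled into a small deterministic automaton over the bolt's own fluents. The class `RestrainingBolt`, the function `delta`, the fluent names `a` and `b`, and the reward value are all hypothetical, chosen to illustrate the specification "eventually a, and after that eventually b".

```python
class RestrainingBolt:
    """Hypothetical monitor: a DFA that reads only the bolt's fluents."""

    def __init__(self, delta, initial, accepting, reward):
        self.delta = delta          # transition function: (state, fluents) -> state
        self.initial = initial
        self.accepting = accepting  # set of accepting automaton states
        self.reward = reward        # extra reward paid on first satisfying the spec
        self.state = initial

    def reset(self):
        self.state = self.initial

    def step(self, fluents):
        """Advance on the set of fluents that are currently true.

        Returns the extra reward, paid only when the automaton
        moves from a non-accepting state into an accepting one.
        """
        previous = self.state
        self.state = self.delta(previous, fluents)
        entered = self.state in self.accepting and previous not in self.accepting
        return self.reward if entered else 0.0


def delta(q, fluents):
    # Assumed DFA for "eventually (a and eventually b)".
    if q == 0 and "a" in fluents:
        q = 1
    if q == 1 and "b" in fluents:
        q = 2
    return q


def extended_state(agent_state, bolt):
    # The learner is assumed to condition on its own features plus the automaton state.
    return (agent_state, bolt.state)


bolt = RestrainingBolt(delta, initial=0, accepting={2}, reward=10.0)
trace = [set(), {"a"}, set(), {"b"}, {"b"}]
rewards = [bolt.step(fluents) for fluents in trace]
```

In this sketch the agent never inspects the fluents themselves. It only sees the automaton state appended to its own state and receives the additional reward, which is one simple way to picture a bolt that observes different features from the agent.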

Asking the Right Questions: Learning Interpretable Action Models Through Query Answering

What is the problem being addressed? The paper addresses the problem of learning action models of black-box autonomous agents (both symbolic and simulator) through query answering. Why is that a problem worth solving?

Mastering the game of Go without human knowledge

What is the problem being addressed? The paper addresses the problem of developing a pure reinforcement learning approach (i.e., one that uses no domain knowledge beyond the basic game rules) for the game of Go.

Point-based value iteration: An anytime algorithm for POMDPs

What is the problem being addressed? The paper addresses the problem of solving large partially observable Markov decision processes (POMDPs) in an anytime manner. Why is that a problem worth solving?

Value Iteration Networks

What is the problem being addressed? The paper addresses the problem of finding policy representations that generalize better, mainly in the form of neural networks that can learn the planning computations required for goal-directed behavior and can therefore solve unseen tasks.
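
As an illustration of the kind of planning computation referred to here, the following is a minimal sketch of classical tabular value iteration on a tiny, made-up MDP. It is not the network architecture from the paper. The function name `value_iteration`, the three-state chain, and all numeric values are assumptions introduced for the example.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration.

    P[a][s][t] is the probability of moving from state s to state t under action a.
    R[a][s] is the expected immediate reward for taking action a in state s.
    Returns the value estimates and a greedy policy (one action index per state).
    """
    n_actions, n_states = len(P), len(P[0])

    def q_value(V, a, s):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))

    V = [0.0] * n_states
    for _ in range(max_iters):
        V_new = [max(q_value(V, a, s) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    policy = [max(range(n_actions), key=lambda a: q_value(V, a, s))
              for s in range(n_states)]
    return V, policy


# Assumed toy problem: a 3-state chain where state 2 is an absorbing goal.
# Action 0 moves left, action 1 moves right; entering the goal from state 1 pays 1.
P = [
    [[1, 0, 0], [1, 0, 0], [0, 0, 1]],  # left
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],  # right
]
R = [
    [0.0, 0.0, 0.0],  # left
    [0.0, 1.0, 0.0],  # right
]
V, policy = value_iteration(P, R, gamma=0.9)
```

The repeated backup over all states is the computation a hand-written planner performs explicitly. The critique's point is that a learned policy representation would need to carry out something comparable in order to generalize to unseen tasks.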

The MAXQ Method for Hierarchical Reinforcement Learning

What is the problem being addressed? The authors address the problem of finding recursively optimal policies through hierarchical reinforcement learning (HRL) methods that are general-purpose, support non-hierarchical execution, preserve the Markov property of the subtasks involved, and are not adversely affected by state abstractions.

Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming

What is the problem being addressed? The paper addresses the problem of speeding up the final convergence of RTDP, a heuristic-search dynamic programming algorithm, so that optimal policies for fully observable non-deterministic (more specifically, stochastic shortest path) problems are produced faster than with existing search algorithms.

Learning generalized relational heuristic networks for model-agnostic planning

What is the problem being addressed? This paper proposes a deep learning approach to learn generalized heuristic generation functions (HGFs) that can scale well to unseen problems with different object names and quantities without the use of symbolic action models.

Learning Domain-Independent Planning Heuristics with Hypergraph Networks

What is the problem being addressed? The authors address the problem of developing a framework that learns generalized, domain-independent planning heuristics without relying on any existing heuristics. The goal is to learn heuristics that generalize not only across problem instances of different sizes, states, and goals but also across unseen domains.

Heuristic Search Planner 2.0

What is the problem being addressed? The paper tackles the challenge of developing a general platform for experimenting with the choice of state-space search strategy, search algorithm, and heuristics when solving planning problems.