Department of Computer Science
University of Massachusetts, Amherst

Research

The ALL conducts research on machine learning and computational models of biological learning. Below are descriptions of recent projects in some of the areas to which we contribute. Please see the publications page for the most recent research.

Autonomous Skill Acquisition

A skill is a building block of behavior: a closed-loop policy over one-step actions. A good set of skills can improve an agent's ability to learn. If an agent can develop such skills automatically, it should be able to solve a variety of problems efficiently without relying on hand-coded skills tailored to specific problems. What constitutes a useful skill? How can an agent acquire such skills efficiently? One hypothesis that we are testing is that states lying between densely connected regions of a state space are useful sub-goals, and that skills for reaching these states are useful for problems involving that state space.
The "playroom domain" (left), in which the agent must learn several skills to accomplish a goal. The state space representation (right) of the playroom domain shows how the state space is clustered and how those clusters are connected. An appropriate skill allows the agent to move between clusters.
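As a rough illustration of the sub-goal hypothesis above, the sketch below treats a tiny two-room grid as a graph, splits it in two using the sign of the second-smallest eigenvector of the graph Laplacian (the Fiedler vector), and reports the states on either side of the cut as candidate sub-goals. Spectral bisection is a stand-in chosen here for brevity and is not necessarily the method used in this project; the layout and the names `grid_graph` and `candidate_subgoals` are hypothetical.

```python
import numpy as np

def grid_graph(layout):
    """Return free cells ('.') and the edges between 4-connected neighbors."""
    cells = [(r, c) for r, row in enumerate(layout)
             for c, ch in enumerate(row) if ch == "."]
    index = {cell: i for i, cell in enumerate(cells)}
    edges = []
    for (r, c), i in index.items():
        for dr, dc in ((1, 0), (0, 1)):  # look down and right only
            j = index.get((r + dr, c + dc))
            if j is not None:
                edges.append((i, j))
    return cells, edges

def candidate_subgoals(cells, edges):
    """Cells incident to edges that cross the Fiedler-vector sign cut."""
    n = len(cells)
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A      # combinatorial Laplacian D - A
    _, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    side = vecs[:, 1] >= 0              # two-way split of the states
    border = set()
    for i, j in edges:
        if side[i] != side[j]:
            border.update((cells[i], cells[j]))
    return border

# Hypothetical layout: two 3x3 rooms joined by a doorway at row 1, column 3.
layout = [
    "...#...",
    ".......",
    "...#...",
]
cells, edges = grid_graph(layout)
subgoals = candidate_subgoals(cells, edges)
print(sorted(subgoals))
```

In this layout the cut falls at the doorway, so the doorway cell is among the reported candidates. A skill for reaching it would let the agent move between the two rooms.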

Proto-Value Functions

Proto-value functions (PVFs) are a unified framework for learning representation and behavior. This framework addresses a longstanding puzzle in AI: how can agents transform their temporal experience into multi-scale task-independent representations that effectively guide long-term task-specific behavior? Proto-value functions are learned from the topology of the underlying state space and reflect large-scale geometric invariants, such as bottlenecks and symmetries. Fourier proto-value functions are learned by computing the eigenvectors of the graph Laplacian; wavelet proto-value functions are learned by dilating the unit basis functions using the powers of a random walk diffusion operator on the graph. For more information on PVFs, please visit the Proto-value functions group webpage.
A three-room "grid-world" (left) and a corresponding Fourier PVF (right). Note how the PVF captures the structure inherent to the state space. Such structure can be useful for learning future tasks in this environment.
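A minimal sketch of the Fourier case described above, assuming an undirected, unweighted graph over the free cells of a small three-room grid and the combinatorial Laplacian L = D - A. The layout and the function names `grid_adjacency` and `fourier_pvfs` are illustrative assumptions, and other Laplacian normalizations are possible.

```python
import numpy as np

def grid_adjacency(layout):
    """Adjacency matrix over free cells ('.'); '#' marks a wall."""
    cells = [(r, c) for r, row in enumerate(layout)
             for c, ch in enumerate(row) if ch == "."]
    index = {cell: i for i, cell in enumerate(cells)}
    A = np.zeros((len(cells), len(cells)))
    for (r, c), i in index.items():
        for dr, dc in ((1, 0), (0, 1)):  # look down and right only
            j = index.get((r + dr, c + dc))
            if j is not None:
                A[i, j] = A[j, i] = 1.0
    return A, index

def fourier_pvfs(A, k):
    """Return the k smallest eigenvalues of L = D - A and their eigenvectors."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)  # ascending order for symmetric L
    return eigvals[:k], eigvecs[:, :k]

# Hypothetical three-room layout with one doorway in each dividing wall.
layout = [
    "...#...#...",
    "...........",
    "...#...#...",
]
A, index = grid_adjacency(layout)
vals, pvfs = fourier_pvfs(A, k=3)

# Each column of `pvfs` is one basis function over the states.
left, right = index[(1, 0)], index[(1, 10)]
print(vals)
print(pvfs[left, 1], pvfs[right, 1])
```

For a connected graph the first eigenvector is constant. The second changes sign between the leftmost and rightmost rooms, which is one way the basis reflects the bottleneck structure of the state space.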

Intrinsic Motivation

Humans and animals often engage in activities for their own sake rather than as steps toward solving specific problems; psychologists refer to this as intrinsically motivated behavior. Although such activities are often not immediately useful, we hypothesize that they contribute to an agent's long-term problem-solving abilities. For example, an agent might develop a set of skills through intrinsic motivation; these skills can then help the agent solve problems yet to be defined. We are investigating an approach to making artificial agents actually "want" to learn for learning's sake.
A schematic of agent-environment interaction in intrinsically motivated RL. While the agent learns a specific task, the environment supplies an external reward (r), which the agent uses to modify its behavior. However, the agent can also use an intrinsic reward signal (r_i) to develop skills, even if there is no specific task to be solved.

Activity Modeling

Activity modeling is a process in which an agent observes another agent performing actions in the environment and attempts to understand what the latter is doing. This can be used to quickly learn a similar task or to interact with the other agent more efficiently. The agent can also model its own actions and understand its environment through its own interactions. We are applying various machine learning techniques to this problem and investigating its relationship to RL.
Data from people walking through the entryway of the CS building, used by an agent to model their activities. A trained agent can learn the different activities represented by the data and predict to which activity a small set of data points belongs.

Intelligent Tutors

We are aiming to enhance the ability of an intelligent tutoring system to assess a student's state of knowledge and to optimize the pedagogical decisions the tutoring system makes to increase long-term student learning. We obtained data from local high school students using a tutoring system for SAT-style geometry problems. We have explored different statistical models (such as static and dynamic Bayesian networks and item response theory) to infer a student's state of mastery of relevant geometry concepts. We are currently extending these models to include student motivation.
An item response theory model used to infer a student's latent ability from observed performance on a set of problems. The probability of the student correctly responding to a given problem is based on a logistic curve, parameterized by discrimination and difficulty.
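The logistic curve in the caption above matches the standard two-parameter logistic (2PL) form, P(correct) = 1 / (1 + exp(-a (theta - b))), with ability theta, discrimination a, and difficulty b. The sketch below evaluates that curve and estimates ability by a simple grid search over the likelihood. The grid-search estimator, the item parameters, and the function names are assumptions made for illustration, not the models fitted in this project.

```python
import math

def p_correct(ability, discrimination, difficulty):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

def log_likelihood(ability, items, responses):
    """Log-likelihood of observed responses; items are (a, b) pairs."""
    total = 0.0
    for (a, b), correct in zip(items, responses):
        p = p_correct(ability, a, b)
        total += math.log(p if correct else 1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Maximum-likelihood ability on an evenly spaced grid (a crude sketch)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda theta: log_likelihood(theta, items, responses))

# Hypothetical items: equal discrimination; easy, medium, and hard difficulty.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
stronger = estimate_ability(items, [True, True, False])
weaker = estimate_ability(items, [True, False, False])
print(stronger, weaker)
```

A student whose ability equals an item's difficulty answers it correctly with probability 0.5, and a larger discrimination makes the curve steeper around that point.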

Modeling Motor Control

We collaborate with neuroscience researchers at UMass and other institutions in modeling how the brain controls movement. Experimental evidence suggests that some of the strategies the brain uses to solve problems may be similar to those used in machine learning and reinforcement learning algorithms. Projects include developing a biologically-plausible framework for motor skill acquisition, comparing different computational strategies for controlling movement in the face of uncertainty and feedback delay, and developing models of muscle activation.
Through a biologically-plausible learning scheme, an agent learns to coarticulate - explore the null space of a subtask to select actions best for the overall task. Coarticulation is a characteristic of a learned motor skill.
