(1) Overview

Introduction

Psychophysics is a branch of experimental psychology concerned, primarily, with quantifying the limits of perception [1, 2]. For example, we may wish to know what the faintest sound is that somebody can hear [3], or the finest spatial detail they can see. [4] This can be important in academic research (e.g., when studying the function of sensory systems), and also in clinical practice (e.g., when attempting to diagnose or monitor disease).

Adaptive psychophysical techniques attempt to answer such questions by systematically varying the parameters of a stimulus (e.g., the intensity of a sound), and finding the model that best fits the observer’s responses (e.g., whether or not the observer successfully detected each stimulus).

In the case of QUEST+, the user specifies the parametric form of a model, along with a set of possible values for each parameter (‘hypotheses’). The primary role of QUEST+ is to compute the posterior probability of each parameter value being true, given vectors of trial-by-trial stimulus values, x, and observed responses, r.

A detailed exposition of how QUEST+ does this is given elsewhere [5] (see also [6, 7]). However, essentially it is operates using Bayes’ Theorem. Thus, after n trials the posterior probability for a given set of parameters, θ, is given by:

(Eq. 1)
p(θ|{x,r})=p(θ)i=1np(ri|{xi,θ})

where p(θ) is the prior probability of each parameter value being true (NB: a uniform value, if no prior data is available), and xi and ri are pairs of stimulus values and observed responses, respectively, for each of the n trials.

Each update of the posterior requires the model to consider the likelihood of every possible response observation given every possible parameter combination and every possible combination of stimulus values. Under typical use the posterior distribution is updated after every observation. This presents a considerable computational challenge. In practice, the current implementation of QUEST+ minimizes the computational burden by precomputing all of these conditional probabilities in advance, and storing the results in a look-up table. However, this approach is limited by the amount of available memory. For example, consider a task containing: two outcomes (e.g., ‘correct’ or ‘incorrect’), a stimulus with two parameters (e.g., contrast and spatial frequency), each of which can take 50 possible values, and a model with five parameters, each of which can take 150 possible values. This implies 502 × 1505 × 2 probabilities: more values than the largest possible double array that MATLAB supports (2^48–1 on 64-bit platforms). Typically, therefore, QUEST+ is applied to situations where the model contains no more than 4 parameters, and the stimulus has only one or two dimensions.

In addition to its core function of estimating parameter values, QUEST+ also has two secondary functions. First, QUEST+ is able to advise the experimenter on what stimulus value(s) to present next. QUEST+ does this by computing the stimulus that minimizes the expected negative Shannon entropy of the N-dimensional posterior probability density (i.e., of the N parameter estimates). Essentially, this can be thought of as the most informative stimulus. Note the experimenter is free to ignore the suggestions of QUEST+. For example, the experimenter may sometimes wish to present a very ‘easy’ stimulus to motivate the observer, rather than presenting the most informative stimulus on every trial.

Secondly, QUEST+ is also capable of advising the experimenter on when to stop testing (the Stopping Criterion). This is again computed based on the notion of entropy, and the precise mathematics can be found elsewhere in [5]. However, to give an intuition it should suffice to note that entropy is essentially a measure of how well a single hypothesis fits the evidence (i.e., a low value of negative Shannon entropy means that a considerable mass of the probability distribution is confined to a small set of measures). Furthermore, in the simple case of a 1D posterior distribution, the entropy of the distribution is approximately proportional to its variance. Given that our goal is to have an infinitely narrow distribution, centered on the “true” parameter value, a minimum level of negative entropy provides a straightforward and principled cutoff point for testing. By default this cutoff is set at 3.0, but a serviceable value for a particular experiment can typically be found by trial-and-error. Moreover, as with stimulus placement, the experimenter is free to override the minimum entropy criterion, and determine their own Stopping Criterion. For example, users may wish to use a fixed number of trials, or to terminate the test early if the participant is becoming uncooperative.

Example application

Imagine one wanted to know how capable a particular individual was of telling apart the frequency (pitch) of two tones. We know, from previous research, that performance (proportion of trials correct: p(C)) on a two-alternative frequency discrimination task is generally described by a 4-parameter cumulative Guassian function, shown graphically in Figure 1, and described mathematically as:

Figure 1 

Graphical illustration of the example application, at the start of the first, second, fourth, eighth and sixteenth trial. Left panels show the 30 ‘hypotheses’ (i.e., possible values of µ, given the model in Equation 2). The darkness of the curves is proportional to the relative likelihood of the estimate being true. Right panels show the posterior density function, with a blue vertical line denoting the current estimate (here based on the mean value). The stimuli, x, and observer’s responses, r, aren’t shown. However, it can be inferred that in the first few trials the observer successfully discriminated a number of mid-magnitude stimuli, making high values of µ (poor sensitivity) unlikely.

(Eq. 2)
p(C)=γ+(1λγ)Φ(x;μ,σ)

where Φ is the cumulative Gaussian function, x is stimulus magnitude, and μ,σ, γ, and λ are the four model parameters. The parameter γ represents the chance (‘guess’) rate, which, due to the nature of the task (two alternatives), is fixed at 0.5 (50%). The parameter λ represents the lapse rate (incorrect responses to seen stimuli, due to inattention or response errors), which we shall fix for now at a broadly representative value of 0.02 (2%). The parameters μ and σ relate to the sensitivity of the observer. We shall also fix σ for now at a representative value of 1 Hz, while μ is a free parameter that we wish to estimate. Based on prior data and piloting, we shall assume that the range of possible values is 1–20, and we shall posit 30 possible values (‘hypotheses’), uniformly linearly-distributed within this range, as shown in Figure 1.

Finally, after specifying the model we must also specify the domain of possible responses (0: incorrect; 1: correct) and the domain of stimulus values, which in this case is a univariate parameter corresponding to the magnitude of the frequency difference between the two tones, in Hz. Given the model we are attempting to fit and the assumed range of possible parameter values, 40 possible stimulus values, log-distributed between 0.1 Hz and 100 Hz should be sufficient to constrain the model fit. Note, however, that these values should be refined through piloting and/or simulation, prior to running the final experiment.

Programmatically, the desired initialization is written as follows:

Then, during the test, we can query QUEST+ for suggested stimulus values, and update the QUEST+ with pairs of stimulus-response values (i.e., the actual stimulus value present, and the observed response). Thus:

Finally, once the test is complete, the QUEST+ can be queried to provide parameter estimates given the current state of the posterior density function (Note, however, that when reporting data, users may wish to obtain more accurate estimates by refitting the model post-hoc):

A graphical illustration of this code in action is given in Figure 1, which shows how we initially posit 30 ‘hypotheses’, which are then progressive ruled-out by successive empirical observations.

Finally, it is important to note that QUEST+ is highly flexible/generalizable. For example, it is trivial to: (i) vary the underlying model; (ii) make more than one value a free parameter; and (iii) add prior estimates to each of the parameter values, thus:

For more examples of use, see [5], and see also the test cases embedded in in the current implementation (see QuestPlus.runExamples).

Implementation and architecture

The software is written entirely in MATLAB code, and all of the methods are contained within a single class: QuestPlus.m.

Quality control

End-to-end tests (which also serve as Minimal Working Examples) are incorporated within the QuestPlus class, and can be executed by running the static method QuestPlus.runExample(N), where N = 1, 2, …, 7. Test 6 in particular replicates an application from [5], who published numerical outputs that the present implementation has been validated against. QuestPlus is also being used for several studies within our lab.

(2) Availability

Operating system

QuestPlus is pure MATLAB code, and should function on all operating systems in which MATLAB is supported.

Programming language

QuestPlus requires MATLAB V2012 or later. Due to the use of modern OOP syntax, QuestPlus is not compatible with earlier versions of MATLAB, or with Octave.

Additional system requirements

None.

Dependencies

There are no MATLAB dependencies in the core QUEST+ algorithm. However, several of the examples of use require the statistics toolbox.

List of contributors

Pete Jones wrote the software and is its current maintainer. Andrew B. Watson is the creator of the QUEST+ method.

Archive and Code Depository

GitHub

Name

QuestPlus

Licence

GNU GPL v3.0

Current version

v1.0.0

Publisher

Zenodo

Date published

28/09/2017

Language

English

(3) Reuse potential

QUEST+ is suitable for all psychophysical applications. Because of its extremely high efficiency, it is particularly well-suited to domains where the total number of trials is restricted, such as when working with children or patients. Furthermore, because of its great flexibility, it is ideal for problems require the fitting of complex/arbitrary models. Currently, we are using it to estimate Contrast Sensitivity Functions in children with visual impairments.