Web-based experimenting -- that is, what we do at Game with Words -- has a number of advantages over traditional lab-based experiments. The primary one, as far as I'm concerned, is the ability to easily test large numbers of participants. Typical cognition experiments require 8-20 participants. Testing some hypotheses, however, may require hundreds or even thousands of participants. Recruiting 1000+ participants, explaining the experiment to them and recording their data is time-consuming and thus expensive. Recruiting a thousand participants on the Internet is not exactly easy, but it is easier than the alternatives. Testing a thousand participants through the Internet is a snap. Once the experiment is loaded on the Web page, there's nothing more for me to do.
Why might an experiment require large numbers of subjects? There are many reasons. The experiment I am about to describe is one example. Early this year, as I was setting up my first Web-based lab (the Visual Cognition Online Lab, now closed), Tal Makovski, then a post-doc in the same lab as me at Harvard, came to me with an idea.
One of Tal's research interests is visual attention. You don't pay attention to everything you see, and this can affect what parts of your environment you are aware of. Not long ago, Tal and his PhD advisor suggested that in some instances, trying to ignore an area in your visual field actually causes you to pay more attention to it (Tsal & Makovski, 2006). They called this the "White Bear Hypothesis." This name comes from the following "experiment": Quick! Don't think about a white bear! If you immediately thought about a white bear, you get the idea.
In the original study, participants were supposed to identify a quickly presented stimulus while ignoring a distractor. The stimulus always appeared in the same position, and the distractor always appeared in another fixed position, so the participants knew where to look and what to ignore. It was particularly important to ignore the distractor, because it would otherwise throw the participant off. In one version of this type of task, the stimulus is either an H or a K, and your job is to say which one you see. The distractor is also either an H or a K, so it requires some concentration not to accidentally identify the distractor instead of the stimulus. Again, location is what sets the stimulus and the distractor apart. The subjects did this task over and over. On some trials, instead of the normal task, two dots would appear -- one in the distractor's location and one in another location that had not been important so far. Although these dots appeared simultaneously, participants reported that the one in the distractor location appeared first. This is the result you would expect if participants were paying particular attention to the distractor location (despite the fact that they were supposed to be ignoring it).
One problem with this experiment is this: the appearance of the dots was called a "surprise trial," but it happened many times during the course of the experiment. The first time, the participants might have been surprised, but after that, they knew that occasionally two dots would appear -- one in the distractor location -- and that they would need to report the order in which the dots appeared. This might encourage them to pay attention to the distractor location after all. Why not do just one surprise trial per participant? The reason Tsal & Makovski repeated the surprise trials was to get statistical robustness. There is a reason that standardized tests like the SAT have more than one question; this produces a more stable and more nuanced result. The worry with Tsal & Makovski's study was that perhaps they had the equivalent of an SAT with one question repeated a hundred times. The authors used a number of controls to try to eliminate this possibility, but the doubt still lingered. With our new Web-based lab, Tal reasoned that we could "surprise" each participant only once and make up for the reduced amount of data per participant by testing many participants. That is exactly what we did.
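The intuition that many one-trial participants can substitute for one participant's many trials can be sketched with a back-of-the-envelope calculation. If each surprise trial is treated as an independent yes/no outcome, the standard error of the estimated accuracy depends on the total number of observations, not on how they are divided among participants (ignoring individual differences). The 70% accuracy and the particular participant counts below are hypothetical numbers for illustration, not figures from the study:

```python
import math

# Standard error of an estimated proportion from n independent trials:
# SE = sqrt(p * (1 - p) / n)
def se(p, n):
    return math.sqrt(p * (1 - p) / n)

p = 0.70  # hypothetical true accuracy on the surprise trial
# One trial from each of 500 participants vs. 20 trials from each of
# 25 participants: both give 500 observations, hence the same SE.
print(round(se(p, 500), 3))      # prints 0.02
print(round(se(p, 25 * 20), 3))  # prints 0.02
```

Of course, with one trial per participant you also get one data point per person, which is exactly what removes the worry about participants catching on.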
There were 7 versions of the experiment (more about that below). A little over 500 people completed the final version. The participant was briefly presented with a letter in the middle of the screen. They were to press one key if the letter was an H or a K, and a different key if it was an S or a C. A distractor letter appeared near the target, and it either matched the target's category (congruent trials) or did not (incongruent trials). This was repeated a number of times (16 in the final version of the experiment). Not surprisingly, participants were significantly more accurate and significantly faster in the congruent condition than in the incongruent condition.
Scientifically, this was expected, but it was exciting nonetheless. By putting the experiment on the Web, we lost a lot of control over the timing of the display, and we couldn't get faithful measurements of participants' response times. Many people had been skeptical that our program would have enough precision to show this effect at all.
On the 17th trial, either a P or a Q flashed on the screen, either in the same position where the distractors had appeared or in a different position. Participants were then asked, "Did you see a P or a Q?" Unfortunately for us, there was no significant difference in accuracy between trials where the P or Q appeared in the distractor location (73.1%) and trials where it appeared in the new location (70.7%). The numbers go in the direction of the hypothesis, but statistically the two results are indistinguishable. With over 500 participants already tested, it is unlikely that testing more will make this difference significant.
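For the curious, here is roughly how that comparison shakes out. A simple two-proportion z-test on the reported accuracies -- assuming, hypothetically, that the 500+ participants split about evenly between the two probe locations -- gives a z-score well below the roughly 1.96 needed for significance at p < .05:

```python
import math

# Two-proportion z-test on the reported accuracies.  The even split of
# ~250 participants per probe location is an assumption for illustration;
# the actual group sizes were not reported here.
n1 = n2 = 250
p1, p2 = 0.731, 0.707  # distractor location vs. new location
p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
print(round(z, 2))  # prints 0.6 -- nowhere near 1.96
```

With a z-score that small, even doubling the sample would leave the difference far from significance, which is why testing more participants is unlikely to help.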
What does that mean? There are two possibilities. One is that Tsal & Makovski's original result was in fact due to the repetitive nature of their task. The other is that our new experiment wasn't sensitive enough, and there are many ways that could have happened. The early versions of the task were either too fast (people couldn't see the P or Q regardless of its position) or too slow (everybody got the P or Q correct regardless of its position). If the distractor was too easy to ignore, we wouldn't get an effect, so we adjusted the difficulty of that task. And so on. Perhaps, in the end, it was still too easy. Perhaps the two-dot surprise trial would have shown the effect, but the P/Q task does not. Perhaps our probe, which required discriminating between two possible stimuli, is fundamentally different from theirs, which required only detecting a stimulus. The possibilities are endless.
The story would have been better if we had made a major discovery. Unfortunately, this is the more typical outcome: an inconclusive result. Still, I'm pretty happy with how it went. In terms of the technical aspects, this was by far the most ambitious Web experiment I have run. Most Web experiments are surveys. I wasn't sure that the dynamic aspects of this experiment -- especially recording response time -- would even be possible. The fact that the distractor task (H/K vs. S/C) worked as expected is very encouraging.