The ESP game (extrasensory perception game) is a human-based computation game developed to address the problem of creating difficult metadata. The idea behind the game is to use the computational power of humans to perform a task that computers cannot (originally, image recognition) by packaging the task as a game. It was originally conceived by Luis von Ahn of Carnegie Mellon University and first posted online in 2003.[1]

On the official website, there was a running count of "Labels collected since October 5, 2003", updated every 12 hours. They stated that "If the ESP game is played as much as other popular online games, we estimate that all the images on the Web can be labeled in a matter of weeks!"[2] 36 million labels had been collected as of May 2008.[3] The original paper (2004) estimated that 5,000 people continuously playing the game could label all images indexed by Google in 31 days.[1]

In late 2008, the game was rebranded as GWAP ("game with a purpose"), with a new user interface. Other games created by Luis von Ahn, such as "Peekaboom" and "Phetch", were discontinued at that point. "Peekaboom" extends the ESP game by asking players to select the region of the image that corresponds to the label; "Squigl" asks players to trace the outline of an object in an image; "Matchin" asks players to pick the more beautiful of two images;[4] and "Verbosity" collects common-sense facts from players.[5]

Google bought a license to create its own version of the game (Google Image Labeler) in 2006 in order to return better search results for its online images.[6] The licensing status of the data acquired through von Ahn's ESP game, or through the Google version, is not clear.[clarification needed] Google's version was shut down on September 16, 2011, as part of the closure of Google Labs.

Most of the ESP dataset is not publicly available. The ImageNet paper reported that, as of 2008, only 60,000 images and their labels could be accessed.[7]

Concept


Image recognition was historically a task that was difficult for computers to perform independently. Humans are perfectly capable of it, but are not necessarily willing. By packaging the recognition task as a game, people become more likely to participate. Data collected from players who were asked how much they enjoyed the game was overwhelmingly positive.

The applications of having so many labeled images are significant: for example, more accurate image searching, and accessibility for visually impaired users through software that reads out an image's labels. Partnering two people to label images makes it more likely that entered words will be accurate. Since the only thing the two partners have in common is that they both see the same image, they must enter reasonable labels to have any chance of agreeing on one.

The ESP Game as currently implemented encourages players to assign "obvious" labels, which are most likely to lead to an agreement with the partner. But such labels can often be deduced from the labels already present using an appropriate language model, and therefore add little information to the system. A Microsoft research project assigns probabilities to the next label to be added; this model is then used in a program that plays the ESP game without looking at the image.[8]
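The idea of predicting the next label without seeing the image can be illustrated with a minimal co-occurrence model. This is a hypothetical sketch, not the Microsoft project's actual model: the sample label sets below are invented, and `next_label_probabilities` is an assumed name.

```python
from collections import Counter

# Hypothetical label sets collected for past images (illustrative data,
# not drawn from the actual ESP dataset).
past_label_sets = [
    {"dog", "grass", "green"},
    {"dog", "ball", "grass"},
    {"cat", "sofa"},
    {"dog", "grass", "park"},
]

def next_label_probabilities(current_labels):
    """Estimate P(next label | labels so far) from co-occurrence counts,
    without ever looking at the image itself."""
    counts = Counter()
    for labels in past_label_sets:
        if set(current_labels) <= labels:      # past image matches labels so far
            counts.update(labels - set(current_labels))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

print(next_label_probabilities({"dog"}))
# "grass" comes out as the most probable next label for a "dog" image
```

A model like this makes the point in the text concrete: once "dog" is present, "grass" is a safe guess, so an agreement on "grass" contributes little new information.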

The ESP game's authors presented evidence that the labels produced using the game were indeed useful descriptions of the images. The results of searching for randomly chosen keywords showed that the proportion of appropriate images when searching using the game-generated labels is extremely high. Further evaluation compared the labels generated by the game to labels produced by participants who were asked to describe the same images.

Rules of the game


Once logged in, a user is automatically matched with a random partner. The partners do not know each other's identity and they cannot communicate. Once matched, they will both be shown the same image. Their task is to agree on a word that would be an appropriate label for the image. They both enter possible words, and once a word is entered by both partners (not necessarily at the same time), that word is agreed upon, and that word becomes a label for the image. Once they agree on a word, they are shown another image. They have two and a half minutes to label 15 images.
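The agreement mechanic described above can be sketched as a small function. This is an illustrative simplification under assumed names (`play_round` is not from the original implementation); it models each partner entering one guess per tick and agreement occurring as soon as any word has been entered by both, in either order.

```python
def play_round(words_a, words_b):
    """Return the first word entered by both partners (not necessarily
    at the same time), or None if no agreement is reached."""
    seen_a, seen_b = set(), set()
    for a, b in zip(words_a, words_b):  # one guess per partner per tick
        seen_a.add(a)
        seen_b.add(b)
        common = seen_a & seen_b
        if common:
            return common.pop()         # the agreed word becomes the label
    return None

print(play_round(["sky", "cloud", "blue"], ["blue", "sky"]))  # → "sky"
```

Note that "sky" wins even though the partners typed it at different times, matching the rule that agreement need not be simultaneous.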

Both partners have the option to pass; that is, give up on an image. Once one partner passes, the other partner is shown a message that their partner wishes to pass. Both partners must pass for a new image to be shown.

Some images have "taboo" words; that is, words that cannot be entered as possible labels. These words are usually related to the image and make the game harder, as they prevent common words from being used to label the image. Taboo words are obtained from the game itself. The first time an image is used in the game, it has no taboo words. If the image is used again, it has one taboo word: the word that resulted from the previous agreement. The next time the image is used, it has two taboo words, and so on. The taboo mechanism is automatic: once an image has been labeled enough times with the same word, that word becomes taboo, so that the image accumulates a variety of different labels.
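The accumulation of taboo words can be sketched as follows. This is a minimal illustration under assumptions: the class name and the threshold of one agreement per word are hypothetical (the text only says a word becomes taboo once an image has been labeled with it "enough times").

```python
from collections import Counter

TABOO_THRESHOLD = 1  # assumed: one agreement suffices to make a word taboo

class ImageRecord:
    """Per-image state: agreement counts and the resulting taboo list."""

    def __init__(self):
        self.agreements = Counter()  # word -> number of times agreed on
        self.taboo = set()

    def record_agreement(self, word):
        self.agreements[word] += 1
        if self.agreements[word] >= TABOO_THRESHOLD:
            self.taboo.add(word)     # future players may no longer enter it

img = ImageRecord()
img.record_agreement("car")
img.record_agreement("red")
print(img.taboo)  # contains "car" and "red"
```

Each pass through the game thus adds the freshly agreed word to the taboo list, which is why an image's taboo list grows by one on each reuse.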

Occasionally, the game will be played solo, without a human partner, with the ESP Game itself acting as the opponent and delivering a series of pre-determined labels to the single human player (which have been harvested from labels given to the image during the course of earlier games played by real humans). This is necessary if there are an odd number of people playing the game.[9]

The game has been cited as an important example of a social machine with a purpose (a teleological social machine), an intelligent system emerging from the interaction of human participants, in the book The Shortcut by Nello Cristianini,[10] which discusses the intelligence of social media platforms.

Cheating


Von Ahn has described countermeasures that prevent players from "cheating" the game and introducing false data into the system. By giving players occasional test images for which common labels are known, it is possible to check that players are answering honestly, and a player's guesses are stored only if they successfully label the test images.[9]

Furthermore, a label is only stored after a certain number of players, N, have agreed on it. At that point, all of the taboo lists[clarification needed] for the image are deleted and the image is returned to the game pool as if it were fresh. If X is the probability of a label being incorrect despite a player having successfully labelled the test images, then after N repetitions the probability of corruption is X^N, assuming that the repetitions are independent of each other.[9]
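The X^N argument is easy to check numerically. The function name and the sample error rate of 0.1 below are illustrative assumptions, not figures from the paper.

```python
def corruption_probability(x, n):
    """Probability that n independent players all agree on an incorrect
    label, given per-player error probability x."""
    return x ** n

# Even a modest per-player error rate shrinks rapidly with repetition:
for n in (1, 2, 3, 5):
    print(n, corruption_probability(0.1, n))
```

With x = 0.1, requiring five independent agreements already drives the corruption probability down to one in a hundred thousand, which is why repetition is an effective quality control.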

Image selection


The choice of images used by the ESP game makes a difference in the player's experience. The game would be less entertaining if all the images were chosen from a single site and were all extremely similar.

The first run of the ESP game used a collection of 350,000 images chosen by the developers. Later versions selected images at random from the web, using a small amount of filtering. Such images are reintroduced into the game several times until they are fully labeled.[9] The random images were chosen using "Random Bounce Me", a website that selects a page at random from the Google database. "Random Bounce Me" was queried repeatedly, each time collecting all JPEG and GIF images in the random page, except for images that did not fit the criteria: blank images, images that consist of a single color, images that are smaller than 20 pixels on either dimension, and images with an aspect ratio greater than 4.5 or smaller than 1/4.5. This process was repeated until 350,000 images were collected. The images were then rescaled to fit the game's display. Fifteen different images from the 350,000 are chosen for each session of the game.
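The filtering criteria listed above translate directly into a predicate. This is a sketch under assumptions: the function name is hypothetical, and a single-colour check is represented here by a colour count rather than pixel inspection.

```python
def acceptable(width, height, n_colors):
    """Apply the stated filtering criteria to a candidate web image:
    reject blank/single-colour images, images under 20 px on either
    dimension, and aspect ratios above 4.5 or below 1/4.5."""
    if n_colors <= 1:                   # blank or single-colour image
        return False
    if min(width, height) < 20:         # too small on either dimension
        return False
    ratio = width / height
    if ratio > 4.5 or ratio < 1 / 4.5:  # too elongated either way
        return False
    return True

print(acceptable(640, 480, 1000))   # True: an ordinary photo passes
print(acceptable(600, 10, 1000))    # False: under 20 px tall
print(acceptable(1000, 100, 1000))  # False: aspect ratio 10 exceeds 4.5
```

Candidates from "Random Bounce Me" would be run through a filter like this until the pool of 350,000 accepted images was full.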

References

  1. von Ahn, Luis; Dabbish, Laura (2004). "Labeling images with a computer game". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '04). ACM. pp. 319–326. doi:10.1145/985692.985733. ISBN 978-1-58113-702-6.
  2. "The ESP Game Project". Archived from the original on 2003-10-20. Retrieved 2024-11-13.
  3. "The ESP Game Project". Archived from the original on 2008-05-09. Retrieved 2024-11-13.
  4. "Solving the web's image problem". BBC News. 2008-05-14. Retrieved 2024-11-13.
  5. von Ahn, Luis; Kedia, Mihir; Blum, Manuel (2006). "Verbosity: a game for collecting common-sense facts". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '06). ACM. pp. 75–78. doi:10.1145/1124772.1124784. ISBN 978-1-59593-372-0.
  6. "Solving the web's image problem". BBC News. 2008-05-14. Retrieved 2008-12-14.
  7. Deng, Jia; Dong, Wei; Socher, Richard; Li, Li-Jia; Li, Kai; Fei-Fei, Li (June 2009). "ImageNet: A large-scale hierarchical image database". 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE. pp. 248–255. doi:10.1109/CVPR.2009.5206848. ISBN 978-1-4244-3992-8.
  8. Robertson, Stephen; Vojnovic, Milan; Weber, Ingmar (2009). "Rethinking the ESP game". CHI '09 Extended Abstracts on Human Factors in Computing Systems. New York: ACM. pp. 3937–3942. doi:10.1145/1520340.1520597. ISBN 978-1-60558-247-4.
  9. von Ahn, Luis (2006-07-26). Human Computation. Google Tech Talk. Retrieved 2024-11-13 via YouTube.
  10. Cristianini, Nello (2023). The Shortcut: Why Intelligent Machines Do Not Think Like Us. Boca Raton. ISBN 978-1-003-33581-8. OCLC 1352480147.