Products and applications

OpenAI's research tends to focus on reinforcement learning. OpenAI is viewed as an important competitor to DeepMind.

Gym

Gym aims to provide an easy-to-set-up, general-intelligence benchmark with a wide variety of environments—somewhat akin to, but broader than, the ImageNet Large Scale Visual Recognition Challenge used in supervised learning research—and hopes to standardize the way environments are defined in AI research publications, so that published research becomes more easily reproducible. The project claims to provide the user with a simple interface. As of June 2017, Gym could only be used with Python. As of September 2017, the Gym documentation site was not maintained, and active work focused instead on its GitHub page.
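
A minimal interaction with Gym's Python interface from this period is sketched below; the environment name is just one of Gym's bundled examples, and the random policy stands in for an actual learning agent.

    import gym

    # Create one of Gym's bundled environments; "CartPole-v0" is illustrative.
    env = gym.make("CartPole-v0")

    observation = env.reset()
    for _ in range(1000):
        action = env.action_space.sample()  # random policy; replace with an agent
        observation, reward, done, info = env.step(action)
        if done:
            observation = env.reset()
    env.close()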

RoboSumo

In "RoboSumo", virtual humanoid "metalearning" robots initially lack knowledge of how to even walk, and given the goals of learning to move around, and pushing the opposing agent out of the ring. Through this adversarial learning process, the agents learn how to adapt to changing conditions; when an agent is then removed from this virtual environment and placed in a new virtual environment with high winds, the agent braces to remain upright, suggesting it had learned how to balance in a generalized way. OpenAI's Igor Mordatch argues for that competition between agents can create an intelligence "arms race" that can increase an agent's ability to function, even outside the context of the competition.

Debate Game

In 2018, OpenAI launched the Debate Game, which teaches machines to debate toy problems in front of a human judge. The purpose is to research whether such an approach may assist in auditing AI decisions and in developing explainable AI.

Dactyl

Dactyl uses machine learning to train a Shadow Hand robot from scratch, using the same reinforcement learning algorithm code that OpenAI Five uses. The robot hand is trained entirely in simulation; because the simulation does not model real-world physics exactly, training randomizes many of the simulator's physical parameters (domain randomization) so that the learned policy transfers to the physical hand.
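
A minimal sketch of the domain-randomization idea follows: simulator parameters are resampled every episode so the policy cannot overfit a single, inaccurate physics model. The parameter names and ranges are invented for illustration and are not the settings used for Dactyl.

    import random

    # Illustrative domain randomization: resample simulator parameters each episode.
    # These names and ranges are made up; they are not Dactyl's actual settings.
    def sample_sim_parameters():
        return {
            "object_mass": random.uniform(0.03, 0.30),      # kg
            "friction": random.uniform(0.5, 1.5),
            "actuator_gain": random.uniform(0.8, 1.2),
            "observation_noise": random.uniform(0.0, 0.05),
        }

    for episode in range(3):
        params = sample_sim_parameters()
        # a real setup would rebuild or reset the physics simulator with these values
        print(f"episode {episode}: {params}")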

Generative models

GPT

The original paper on generative pre-training (GPT) of a language model was written by Alec Radford and colleagues and published as a preprint on OpenAI's website on June 11, 2018. It showed how a generative model of language can acquire world knowledge and process long-range dependencies by pre-training on a diverse corpus with long stretches of contiguous text.
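
At its core, generative pre-training trains a model to predict each next token of raw text. The sketch below shows that objective with a deliberately trivial PyTorch model; the real GPT models are multi-layer Transformer decoders, but the loss is computed the same way.

    import torch
    import torch.nn as nn

    # Minimal next-token-prediction objective. The "model" here is a trivial
    # embedding + linear layer, only to demonstrate the loss computation.
    vocab_size, embed_dim = 100, 32
    model = nn.Sequential(nn.Embedding(vocab_size, embed_dim),
                          nn.Linear(embed_dim, vocab_size))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    tokens = torch.randint(0, vocab_size, (4, 65))   # a batch of token-id sequences
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from token t

    logits = model(inputs)                           # (batch, seq_len, vocab_size)
    loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size),
                                       targets.reshape(-1))
    loss.backward()
    optimizer.step()
    print(f"language-modelling loss: {loss.item():.3f}")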

GPT-2

Generative Pre-trained Transformer 2, commonly known by its abbreviated form GPT-2, is an unsupervised transformer language model and the successor to GPT. GPT-2 was first announced in February 2019, with only limited demonstrative versions initially released to the public. The full version of GPT-2 was not immediately released out of concern over potential misuse, including applications for writing fake news. Some experts expressed skepticism that GPT-2 posed a significant threat. The Allen Institute for Artificial Intelligence responded to GPT-2 with a tool to detect "neural fake news". Other researchers, such as Jeremy Howard, warned of "the technology to totally fill Twitter, email, and the web up with reasonable-sounding, context-appropriate prose, which would drown out all other speech and be impossible to filter". In November 2019, OpenAI released the complete version of the GPT-2 language model. Several websites host interactive demonstrations of different instances of GPT-2 and other transformer models.

GPT-2's authors argue that unsupervised language models are general-purpose learners, as illustrated by GPT-2 achieving state-of-the-art accuracy and perplexity on 7 of 8 zero-shot tasks (i.e., the model was not further trained on any task-specific input-output examples). The corpus it was trained on, called WebText, contains slightly over 8 million documents totaling 40 GB of text from URLs shared in Reddit submissions with at least 3 upvotes. GPT-2 avoids certain issues with encoding vocabulary as word tokens by using byte pair encoding, which allows it to represent any string of characters by encoding both individual characters and multi-character tokens.
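
Byte pair encoding builds its vocabulary by repeatedly merging the most frequent adjacent pair of symbols, so any string can still be represented as a mix of single characters and learned multi-character tokens. The toy sketch below illustrates the merge procedure; GPT-2's actual tokenizer operates on raw bytes with a pre-trained merge table.

    from collections import Counter

    # Toy byte-pair-encoding sketch: repeatedly merge the most frequent adjacent
    # pair of symbols across a small word list.
    def learn_bpe(words, num_merges):
        corpus = Counter(tuple(word) for word in words)  # words as character tuples
        merges = []
        for _ in range(num_merges):
            pair_counts = Counter()
            for symbols, freq in corpus.items():
                for pair in zip(symbols, symbols[1:]):
                    pair_counts[pair] += freq
            if not pair_counts:
                break
            best = max(pair_counts, key=pair_counts.get)
            merges.append(best)
            merged_corpus = Counter()
            for symbols, freq in corpus.items():
                out, i = [], 0
                while i < len(symbols):
                    if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                        out.append(symbols[i] + symbols[i + 1])  # merge the pair
                        i += 2
                    else:
                        out.append(symbols[i])
                        i += 1
                merged_corpus[tuple(out)] += freq
            corpus = merged_corpus
        return merges, corpus

    merges, corpus = learn_bpe(["lower", "lowest", "newer", "wider"], num_merges=5)
    print(merges)   # learned merge operations, most frequent pair first
    print(corpus)   # words now segmented into multi-character tokens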

GPT-3

Generative Pre-trained Transformer 3, commonly known by its abbreviated form GPT-3, is an unsupervised transformer language model and the successor to GPT-2. It was first described in May 2020. OpenAI stated that the full version of GPT-3 contains 175 billion parameters, two orders of magnitude more than the 1.5 billion parameters in the full version of GPT-2 (although GPT-3 models with as few as 125 million parameters were also trained).

OpenAI stated that GPT-3 succeeds at certain "meta-learning" tasks: given a single input-output example in its prompt, it can infer the intended task and apply it to new inputs. The paper gives examples of translation and cross-linguistic transfer learning between English and Romanian, and between English and German.
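
In practice, this means the task is specified entirely in the prompt. A hypothetical English-to-German few-shot prompt, with invented example sentences, might look like the following; the expected continuation is what the model should produce, not actual GPT-3 output.

    # Hypothetical few-shot prompt in the style described above.
    prompt = (
        "Translate English to German.\n"
        "English: The house is small. => German: Das Haus ist klein.\n"
        "English: Where is the station? => German: Wo ist der Bahnhof?\n"
        "English: The weather is nice today. => German:"
    )
    # GPT-3 is expected to continue the prompt with something like
    # " Das Wetter ist heute schön."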

GPT-3 dramatically improved benchmark results over GPT-2. OpenAI cautioned that such scaling-up of language models could be approaching or encountering the fundamental capability limitations of predictive language models. Pre-training GPT-3 required several thousand petaflop/s-days of compute, compared to tens of petaflop/s-days for the full GPT-2 model. Like its predecessor, the fully trained GPT-3 model was not immediately released to the public on the grounds of possible abuse, although OpenAI planned to allow access through a paid cloud API after a two-month free private beta that began in June 2020.
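
For scale, one petaflop/s-day corresponds to 10^15 floating-point operations per second sustained for a full day, roughly 8.6 × 10^19 operations. The short calculation below uses illustrative round numbers standing in for the "tens" and "several thousand" figures above.

    # One petaflop/s-day = 10**15 operations per second for one day.
    PFS_DAY = 10**15 * 86_400                 # ≈ 8.64e19 floating-point operations

    gpt2_compute = 10 * PFS_DAY               # illustrative: "tens" of petaflop/s-days
    gpt3_compute = 3_000 * PFS_DAY            # illustrative: "several thousand"
    print(f"GPT-3 / GPT-2 compute ratio (illustrative): {gpt3_compute / gpt2_compute:.0f}x")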

On September 23, 2020, GPT-3 was licensed exclusively to Microsoft.

Music

OpenAI's MuseNet (2019) is a deep neural net trained to predict subsequent musical notes in MIDI music files. It can generate songs with ten different instruments in fifteen different styles. According to The Verge, a song generated by MuseNet tends to start out reasonably but then fall into chaos the longer it plays.

OpenAI's Jukebox (2020) is an open-sourced algorithm to generate music with vocals. After training on 1.2 million samples, the system accepts a genre, artist, and a snippet of lyrics, and outputs song samples. OpenAI stated the songs "show local musical coherence, follow traditional chord patterns" but acknowledged that the songs lack "familiar larger musical structures such as choruses that repeat" and that "there is a significant gap" between Jukebox and human-generated music. The Verge stated "It's technologically impressive, even if the results sound like mushy versions of songs that might feel familiar", while Business Insider stated "surprisingly, some of the resulting songs are catchy and sound legitimate".

API

In June 2020, OpenAI announced a multi-purpose API which it said was "for accessing new AI models developed by OpenAI" to let developers call on it for "any English language AI task."
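
Around the API's launch, a completion request through the official Python client looked roughly like the sketch below; the engine name, prompt, and parameters are illustrative, and a valid API key from OpenAI is assumed.

    import openai

    # 2020-era completion request via the official Python client (illustrative).
    openai.api_key = "YOUR_API_KEY"

    response = openai.Completion.create(
        engine="davinci",   # a GPT-3 model exposed through the API
        prompt="Summarize: OpenAI released a general-purpose text API in June 2020.",
        max_tokens=60,
        temperature=0.7,
    )
    print(response["choices"][0]["text"])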

Video game bots and benchmarks

OpenAI Five

OpenAI Five is the name of a team of five OpenAI-curated bots that are used in the competitive five-on-five video game Dota 2 and that learn to play against human players at a high skill level entirely through trial-and-error algorithms. Before the system became a team of five, its first public demonstration occurred at The International 2017, the annual premiere championship tournament for the game, where Dendi, a professional Ukrainian player, lost to a bot in a live one-on-one matchup. After the match, CTO Greg Brockman explained that the bot had learned by playing against itself for two weeks of real time, and that the learning software was a step toward creating software that can handle complex tasks like those of a surgeon. The system uses a form of reinforcement learning: the bots learn over time by playing against themselves hundreds of times a day for months, and are rewarded for actions such as killing an enemy and taking map objectives.
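
The reward shaping mentioned above can be pictured as a scoring function over in-game events. The sketch below is hypothetical; the actual reward terms and weights used for OpenAI Five are not given in this article.

    # Hypothetical shaped-reward sketch; event names and weights are invented.
    def shaped_reward(events):
        reward = 0.0
        reward += 1.0 * events.get("enemy_kills", 0)        # killing an enemy hero
        reward += 0.5 * events.get("objectives_taken", 0)   # e.g. towers, map objectives
        reward -= 1.0 * events.get("deaths", 0)             # assumed penalty for dying
        return reward

    print(shaped_reward({"enemy_kills": 2, "objectives_taken": 1, "deaths": 1}))  # 1.5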

By June 2018, the ability of the bots had expanded to playing together as a full team of five, and they were able to defeat teams of amateur and semi-professional players. At The International 2018, OpenAI Five played in two exhibition matches against professional players but lost both games. In April 2019, OpenAI Five defeated OG, the reigning world champions of the game at the time, 2:0 in a live exhibition match in San Francisco. The bots made their final public appearance later that month, playing 42,729 total games in a four-day open online competition and winning 99.4% of them.

Gym Retro

Gym Retro is a platform for reinforcement learning research on video games. It is used to conduct research on RL algorithms and to study generalization: whereas prior RL research has mostly focused on optimizing agents to solve single tasks, Gym Retro makes it possible to test generalization between games with similar concepts but different appearances.
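
Gym Retro exposes the same reset/step interface as Gym; a minimal session is sketched below. The listed game ships with the library, while other games require separately obtained ROM files.

    import retro

    # Minimal Gym Retro loop with a random policy; "Airstriker-Genesis" is the
    # free game bundled with the library.
    env = retro.make(game="Airstriker-Genesis")
    observation = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()   # random policy, for illustration
        observation, reward, done, info = env.step(action)
    env.close()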
