ChatGPT: Optimizing Language Models for Dialogue

We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.

We are excited to introduce ChatGPT to get users’ feedback and learn about its strengths and weaknesses. During the research preview, usage of ChatGPT is free. Try it now at chat.openai.com.

Methods

We trained this model using Reinforcement Learning from Human Feedback (RLHF), using the same methods as InstructGPT, but with slight differences in the data collection setup. We trained an initial model using supervised fine-tuning: human AI trainers provided conversations in which they played both sides—the user and an AI assistant. We gave the trainers access to model-written suggestions to help them compose their responses.
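The exact way these trainer-written dialogues are serialized into training examples is not public. As a purely hypothetical sketch, each assistant turn can be turned into a (context, target) pair for supervised fine-tuning, along these lines (role labels and separators here are illustrative, not OpenAI's format):

```python
# Hypothetical sketch: turning a trainer-written dialogue into supervised
# fine-tuning pairs. The real data format used for ChatGPT is not public;
# the roles, separators, and turn layout below are illustrative only.
dialogue = [
    ("user", "What does RLHF stand for?"),
    ("assistant", "Reinforcement Learning from Human Feedback."),
    ("user", "And why is it used?"),
    ("assistant", "To steer model outputs toward responses humans prefer."),
]

def sft_pairs(dialogue):
    """Yield (context, target) pairs: everything said so far -> next assistant turn."""
    context = []
    for role, text in dialogue:
        if role == "assistant":
            yield ("\n".join(context) + "\nAssistant:", " " + text)
        context.append(f"{role.capitalize()}: {text}")

for context, target in sft_pairs(dialogue):
    print(repr(context), "->", repr(target))
```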

To create a reward model for reinforcement learning, we needed to collect comparison data, which consisted of two or more model responses ranked by quality. To collect this data, we took conversations that AI trainers had with the chatbot. We randomly selected a model-written message, sampled several alternative completions, and had AI trainers rank them. Using these reward models, we can fine-tune the model using Proximal Policy Optimization. We performed several iterations of this process.
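The InstructGPT paper describes training the reward model with a pairwise ranking loss over the trainers' comparisons: for every (better, worse) pair among the K ranked completions, the gap between their scalar rewards is pushed up. A minimal PyTorch sketch of that loss (shapes and names are mine, not from the source):

```python
import torch
import torch.nn.functional as F

def ranking_loss(rewards: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss over K scalar rewards for one prompt, ordered
    best-to-worst by the human ranking: minimize -log sigmoid(r_better - r_worse)
    averaged over all K-choose-2 pairs."""
    K = rewards.shape[0]
    pair_losses = [
        -F.logsigmoid(rewards[b] - rewards[w])
        for b in range(K)
        for w in range(b + 1, K)
    ]
    return torch.stack(pair_losses).mean()

# Toy usage: four ranked completions whose current rewards are mis-ordered.
print(ranking_loss(torch.tensor([0.1, 0.9, 0.3, -0.2])))
```

The policy is then fine-tuned against this learned reward with Proximal Policy Optimization, following the standard PPO formulation.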

ChatGPT is fine-tuned from a model in the GPT-3.5 series, which finished training in early 2022. ChatGPT and GPT-3.5 were trained on an Azure AI supercomputing infrastructure.

Limitations

  • ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.
  • ChatGPT is sensitive to tweaks to the input phrasing and to attempting the same prompt multiple times. For example, given one phrasing of a question, the model can claim to not know the answer, but given a slight rephrase, can answer correctly.
  • The model is often excessively verbose and overuses certain phrases, such as restating that it’s a language model trained by OpenAI. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive) and well-known over-optimization issues. [1, 2]
  • Ideally, the model would ask clarifying questions when the user provides an ambiguous query. Instead, our current models usually guess what the user intended.
  • While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We’re eager to collect user feedback to aid our ongoing work to improve this system.
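As a minimal sketch of the kind of screening the Moderation API enables, here is a plain-HTTP call against the public /v1/moderations endpoint. The response fields ("flagged", "categories") follow the public API documentation and may change over time:

```python
import os
import requests

# Hedged sketch: screen a piece of text with OpenAI's Moderation endpoint
# before showing it to a user. Requires OPENAI_API_KEY in the environment.
resp = requests.post(
    "https://api.openai.com/v1/moderations",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"input": "text to screen before showing it to a user"},
    timeout=10,
)
result = resp.json()["results"][0]
if result["flagged"]:
    # Per-category booleans indicate which policy areas were triggered.
    print("blocked categories:", [k for k, v in result["categories"].items() if v])
else:
    print("looks ok")
```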

Iterative deployment

Today’s research release of ChatGPT is the latest step in OpenAI’s iterative deployment of increasingly safe and useful AI systems. Many lessons from deployment of earlier models like GPT-3 and Codex have informed the safety mitigations in place for this release, including substantial reductions in harmful and untruthful outputs achieved by the use of reinforcement learning from human feedback (RLHF).

Source: https://openai.com/blog/chatgpt/

Reverse Engineering the Source Code of the BioNTech/Pfizer SARS-CoV-2 Vaccine

Now, these words may be somewhat jarring – the vaccine is a liquid that gets injected in your arm. How can we talk about source code?

This is a good question, so let’s start off with a small part of the very source code of the BioNTech/Pfizer vaccine, also known as BNT162b2, also known as Tozinameran, also known as Comirnaty.

 

[Image: First 500 characters of the BNT162b2 mRNA. Source: World Health Organization]

The BNT162b2 mRNA vaccine has this digital code at its heart. It is 4284 characters long, so it would fit in a bunch of tweets. At the very beginning of the vaccine production process, someone uploaded this code to a DNA printer (yes), which then converted the bytes on disk to actual DNA molecules.
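The article's central observation is that this "code" is a string over the four RNA letters A, C, G and U, read three letters (one codon) at a time, with each codon selecting one amino acid of the spike protein. A toy illustration of that reading process (the fragment and the abridged codon table below are illustrative; the real sequence is in the WHO document cited above):

```python
# Toy illustration: reading an mRNA string codon-by-codon. The fragment and
# the abridged codon table are for illustration only; the real BNT162b2
# sequence is published in the WHO document referenced above.
CODON_TABLE = {
    "AUG": "M", "UUC": "F", "GUG": "V", "CUG": "L",
    "CCU": "P", "GGC": "G", "UAA": "*",  # "*" = stop
}

def translate(rna: str) -> str:
    """Map each 3-letter codon to its amino acid ('?' if not in the table)."""
    return "".join(CODON_TABLE.get(rna[i:i + 3], "?")
                   for i in range(0, len(rna) - 2, 3))

def gc_content(rna: str) -> float:
    """Fraction of G/C letters; codon optimization pushes this fraction up."""
    return sum(b in "GC" for b in rna) / len(rna)

fragment = "AUGUUCGUGUUCCUGGUGCUG"    # made-up fragment for illustration
print(translate(fragment))            # -> MFVFLVL
print(round(gc_content(fragment), 2))
```

The article goes on to explain why the physical molecule uses a chemically modified U and why the codons are deliberately G/C-enriched, which is what the gc_content helper above hints at.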

 

Full article: https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/

 

Machine Learning’s ‘Amazing’ Ability to Predict Chaos

In new computer experiments, artificial-intelligence algorithms can tell the future of chaotic systems.

Researchers have used machine learning to predict the chaotic evolution of a model flame front.

Half a century ago, the pioneers of chaos theory discovered that the “butterfly effect” makes long-term prediction impossible. Even the smallest perturbation to a complex system (like the weather, the economy or just about anything else) can touch off a concatenation of events that leads to a dramatically divergent future. Unable to pin down the state of these systems precisely enough to predict how they’ll play out, we live under a veil of uncertainty.

In a series of results reported in the journals Physical Review Letters and Chaos, scientists have used machine learning — the same computational technique behind recent successes in artificial intelligence — to predict the future evolution of chaotic systems out to stunningly distant horizons. The approach is being lauded by outside experts as groundbreaking and likely to find wide application.

“I find it really amazing how far into the future they predict” a system’s chaotic evolution, said Herbert Jaeger, a professor of computational science at Jacobs University in Bremen, Germany. […]
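The technique behind these results is reservoir computing: a large, fixed random recurrent network whose only trained part is a linear readout. A toy, self-contained sketch of the idea on a simple chaotic system (the logistic map; all sizes and constants here are illustrative, not taken from the papers):

```python
import numpy as np

rng = np.random.default_rng(0)

# A simple chaotic series to learn: the logistic map at r = 4.
N = 3000
x = np.empty(N)
x[0] = 0.5123
for n in range(N - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

# Reservoir: fixed random recurrent weights, rescaled to spectral radius 0.9.
n_res = 300
W_in = rng.uniform(-2.0, 2.0, n_res)
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def step(state, u):
    """One reservoir update driven by scalar input u."""
    return np.tanh(W @ state + W_in * u)

# Drive the reservoir with the training portion of the series.
train_end = 2500
states = np.zeros((train_end, n_res))
s = np.zeros(n_res)
for n in range(train_end):
    s = step(s, x[n])
    states[n] = s

# Train only the linear readout (ridge regression): state after x[n] -> x[n+1].
washout = 100
X, Y = states[washout:], x[washout + 1 : train_end + 1]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ Y)

# Free-run: feed predictions back in; agreement with the true series decays
# as the butterfly effect amplifies tiny one-step errors.
u = float(states[-1] @ W_out)
preds = [u]
for _ in range(9):
    s = step(s, u)
    u = float(s @ W_out)
    preds.append(u)
print("pred:", np.round(preds, 3))
print("true:", np.round(x[train_end : train_end + 10], 3))
```

The papers play this same game at much larger scale, which is where the surprisingly long prediction horizons come from.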

[Graphic: training computers to predict chaos]

Full articles:

https://www.wired.com/story/machine-learnings-amazing-ability-to-predict-chaos/

https://www.quantamagazine.org/machine-learnings-amazing-ability-to-predict-chaos-20180418/

 

 

Microsoft researchers build a bot that draws what you tell it to

If you’re handed a note that asks you to draw a picture of a bird with a yellow body, black wings and a short beak, chances are you’ll start with a rough outline of a bird, then glance back at the note, see the yellow part and reach for a yellow pen to fill in the body, read the note again and reach for a black pen to draw the wings and, after a final check, shorten the beak and define it with a reflective glint. Then, for good measure, you might sketch a tree branch where the bird rests.

Now, there’s a bot that can do that, too.

The new artificial intelligence technology under development in Microsoft’s research labs is programmed to pay close attention to individual words when generating images from caption-like text descriptions. This deliberate focus produced a nearly three-fold boost in image quality compared to the previous state-of-the-art technique for text-to-image generation, according to results on an industry standard test reported in a research paper posted on arXiv.org.

The technology, which the researchers simply call the drawing bot, can generate images of everything from ordinary pastoral scenes, such as grazing livestock, to the absurd, such as a floating double-decker bus. Each image contains details that are absent from the text descriptions, indicating that this artificial intelligence contains an artificial imagination.
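The mechanism credited for the improvement is word-level attention: while refining each region of the image, the generator computes attention weights over the caption's words and conditions on a word-weighted context vector, so "black" can dominate while the wings are drawn. A schematic sketch with made-up shapes (not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up sizes: a 5-word caption, 64 image regions, 32-dim feature space.
T, R, D = 5, 64, 32
words = rng.normal(size=(T, D))     # word features from a text encoder
regions = rng.normal(size=(R, D))   # region features from the generator

# Each region attends over the words: softmax of dot-product similarities.
scores = regions @ words.T                       # (R, T)
scores -= scores.max(axis=1, keepdims=True)      # for numerical stability
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)

# Word-context per region: the word-weighted summary each region conditions
# on while being refined (e.g. wing regions weighting the word "black").
context = weights @ words                        # (R, D)
print(context.shape)                             # (64, 32)
```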

Continue reading: https://blogs.microsoft.com/ai/drawing-ai/