I thought I knew a lot about prompt engineering, until I tried to write a series of articles about AI agents. Then I realized my understanding was only superficial.

---

GPT-2 has 1.5 billion parameters to GPT's 117 million, and was trained on 40GB of text versus GPT's 4.5GB. This order-of-magnitude jump in model size and training data brought an unprecedented emergent quality: researchers no longer needed to fine-tune GPT-2 for each individual task. They could apply the untuned pre-trained model directly to a task, and in many cases it even outperformed state-of-the-art models that had been specifically fine-tuned for that task.

GPT-3 delivered another order-of-magnitude increase in both model size and training data, accompanied by a significant leap in capability. The 2020 paper "Language Models are Few-Shot Learners" showed that given only a handful of task examples (so-called few-shot examples), the model can accurately reproduce the patterns in its input and thereby accomplish almost any language-based task you can imagine, often with very high-quality results. It was at this stage that people realized that by modifying the input (that is, the prompt), they could constrain the model to perform the specific task required. Prompt engineering was born at this moment.

---
That's how humans are. Give a smart person just one keyword, and they can almost reconstruct the whole story for you.
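To make the few-shot pattern concrete, here is a minimal sketch of such a prompt in Python. The task (sentiment labeling) and the example reviews are invented purely for illustration; what matters is the shape: a few input-output pairs, then a new input the model is left to complete.

```python
# A minimal sketch of a few-shot prompt. The sentiment-labeling task and the
# example reviews are invented for illustration; the pattern is what matters:
# a few input -> output pairs, followed by a new input for the model to complete.

FEW_SHOT_PROMPT = """\
Classify the sentiment of each review as Positive or Negative.

Review: The battery lasts all day and the screen is gorgeous.
Sentiment: Positive

Review: It stopped working after two weeks and support never replied.
Sentiment: Negative

Review: Setup took five minutes and everything just worked.
Sentiment:"""

if __name__ == "__main__":
    # Sending this text to any completion-style language model should yield
    # "Positive" as the continuation: the model reproduces the pattern in the
    # input rather than relying on task-specific fine-tuning.
    print(FEW_SHOT_PROMPT)
```

Note that there is no fine-tuning and no gradient update anywhere: the examples inside the prompt are the entire "training" the model gets for this task.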