📚 Bibliography
This page contains an organized list of all papers used by this course, grouped by topic.
To cite this course, use the citation provided in the GitHub repository.
🔵 = Paper directly cited in this course. Other papers have informed my understanding of the topic.
Note: since neither the GPT-3 paper nor the GPT-3 Instruct paper corresponds to the davinci models, I attempt not to cite them as such.
Prompt Engineering Strategies
Chain of Thought1 🔵
Zero Shot Chain of Thought2 🔵
Self Consistency3 🔵
What Makes Good In-Context Examples for GPT-3?4 🔵
Ask-Me-Anything Prompting5 🔵
Generated Knowledge6 🔵
Recitation-Augmented Language Models7 🔵
Rethinking the role of demonstrations8 🔵
Scratchpads9
Maieutic Prompting10
STaR11
Least to Most12 🔵
Reframing Instructional Prompts to GPTk's Language13 🔵
The Turking Test: Can Language Models Understand Instructions?14 🔵
Reliability
MathPrompter15 🔵
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning16 🔵
Prompting GPT-3 to be reliable17
Diverse Prompts18 🔵
Calibrate Before Use: Improving Few-Shot Performance of Language Models19 🔵
Enhanced Self Consistency20
Bias and Toxicity in Zero-Shot CoT21 🔵
Constitutional AI: Harmlessness from AI Feedback22 🔵
Compositional Generalization - SCAN23
Automated Prompt Engineering
AutoPrompt24 🔵
Automatic Prompt Engineer25
Models
Language Models
GPT-326 🔵
GPT-3 Instruct27 🔵
PaLM28 🔵
BLOOM29 🔵
BLOOM+1 (more languages / 0-shot improvements)30
Jurassic 131 🔵
GPT-J-6B32
Roberta33
Image Models
Stable Diffusion34 🔵
DALLE35 🔵
Soft Prompting
Soft Prompting36 🔵
Interpretable Discretized Soft Prompts37 🔵
Datasets
MultiArith38 🔵
GSM8K39 🔵
HotPotQA40 🔵
Fever41 🔵
BBQ: A Hand-Built Bias Benchmark for Question Answering42 🔵
Image Prompt Engineering
Taxonomy of prompt modifiers43
DiffusionDB44
The DALLE 2 Prompt Book45 🔵
Prompt Engineering for Text-Based Generative Art46 🔵
With the right prompt, Stable Diffusion 2.0 can do hands.47 🔵
Optimizing Prompts for Text-to-Image Generation48
Prompt Engineering IDEs
Prompt IDE49 🔵
Prompt Source50 🔵
PromptChainer51 🔵
PromptMaker52 🔵
Tooling
LangChain53 🔵
TextBox 2.0: A Text Generation Library with Pre-trained Language Models54 🔵
OpenPrompt: An Open-source Framework for Prompt-learning55 🔵
GPT Index56 🔵
Applied Prompt Engineering
Language Model Cascades57
MRKL58 🔵
ReAct59 🔵
PAL: Program-aided Language Models60 🔵
User Interface Design
Design Guidelines for Prompt Engineering Text-to-Image Generative Models61
Prompt Injection
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods62 🔵
Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples63 🔵
Prompt injection attacks against GPT-364 🔵
Exploiting GPT-3 prompts with malicious inputs that order the model to ignore its previous directions65 🔵
adversarial-prompts66 🔵
GPT-3 Prompt Injection Defenses67 🔵
Talking to machines: prompt engineering & injection68
Exploring Prompt Injection Attacks69 🔵
Using GPT-Eliezer against ChatGPT Jailbreaking70 🔵
Microsoft Bing Chat Prompt71
Jailbreaking
Ignore Previous Prompt: Attack Techniques For Language Models72
Lessons learned on Language Model Safety and misuse73
Toxicity Detection with Generative Prompt-based Inference74
New and improved content moderation tooling75
OpenAI API76 🔵
OpenAI ChatGPT77 🔵
ChatGPT 4 Tweet78 🔵
Acting Tweet79 🔵
Research Tweet80 🔵
Pretend Ability Tweet81 🔵
Responsibility Tweet82 🔵
Lynx Mode Tweet83 🔵
Sudo Mode Tweet84 🔵
Ignore Previous Prompt85 🔵
Updated Jailbreaking Prompts86 🔵
Surveys
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing87
PromptPapers88
Dataset Generation
Discovering Language Model Behaviors with Model-Written Evaluations89
Selective Annotation Makes Language Models Better Few-Shot Learners90
Applications
Atlas: Few-shot Learning with Retrieval Augmented Language Models91
STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension92
Misc
Prompting Is Programming: A Query Language For Large Language Models93
Parallel Context Windows Improve In-Context Learning of Large Language Models94
A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT95 🔵
Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models96
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks97
Making Pre-trained Language Models Better Few-shot Learners98
Grounding with search results99
How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models100
On Measuring Social Biases in Prompt-Based Multi-Task Learning101
Plot Writing From Pre-Trained Language Models102 🔵
StereoSet: Measuring stereotypical bias in pretrained language models103
Survey of Hallucination in Natural Language Generation104
Examples105
Wordcraft106
PainPoints107
Self-Instruct: Aligning Language Model with Self Generated Instructions108
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models109
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference110
Ask-Me-Anything Prompting5
A Watermark for Large Language Models111
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain of Thought Prompting Elicits Reasoning in Large Language Models.
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large Language Models are Zero-Shot Reasoners.
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models.
- Liu, J., Shen, D., Zhang, Y., Dolan, B., Carin, L., & Chen, W. (2021). What Makes Good In-Context Examples for GPT-3?
- Arora, S., Narayan, A., Chen, M. F., Orr, L., Guha, N., Bhatia, K., Chami, I., Sala, F., & Ré, C. (2022). Ask Me Anything: A simple strategy for prompting language models.
- Liu, J., Liu, A., Lu, X., Welleck, S., West, P., Bras, R. L., Choi, Y., & Hajishirzi, H. (2021). Generated Knowledge Prompting for Commonsense Reasoning.
- Sun, Z., Wang, X., Tay, Y., Yang, Y., & Zhou, D. (2022). Recitation-Augmented Language Models.
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
- Nye, M., Andreassen, A. J., Gur-Ari, G., Michalewski, H., Austin, J., Bieber, D., Dohan, D., Lewkowycz, A., Bosma, M., Luan, D., Sutton, C., & Odena, A. (2021). Show Your Work: Scratchpads for Intermediate Computation with Language Models.
- Jung, J., Qin, L., Welleck, S., Brahman, F., Bhagavatula, C., Bras, R. L., & Choi, Y. (2022). Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations.
- Zelikman, E., Wu, Y., Mu, J., & Goodman, N. D. (2022). STaR: Bootstrapping Reasoning With Reasoning.
- Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., Cui, C., Bousquet, O., Le, Q., & Chi, E. (2022). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
- Mishra, S., Khashabi, D., Baral, C., Choi, Y., & Hajishirzi, H. (2022). Reframing Instructional Prompts to GPTk's Language. Findings of the Association for Computational Linguistics: ACL 2022. https://doi.org/10.18653/v1/2022.findings-acl.50
- Efrat, A., & Levy, O. (2020). The Turking Test: Can Language Models Understand Instructions?
- Imani, S., Du, L., & Shrivastava, H. (2023). MathPrompter: Mathematical Reasoning using Large Language Models.
- Ye, X., & Durrett, G. (2022). The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning.
- Si, C., Gan, Z., Yang, Z., Wang, S., Wang, J., Boyd-Graber, J., & Wang, L. (2022). Prompting GPT-3 To Be Reliable.
- Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., & Chen, W. (2022). On the Advance of Making Language Models Better Reasoners.
- Zhao, T. Z., Wallace, E., Feng, S., Klein, D., & Singh, S. (2021). Calibrate Before Use: Improving Few-Shot Performance of Language Models.
- Mitchell, E., Noh, J. J., Li, S., Armstrong, W. S., Agarwal, A., Liu, P., Finn, C., & Manning, C. D. (2022). Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference.
- Shaikh, O., Zhang, H., Held, W., Bernstein, M., & Yang, D. (2022). On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning.
- Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: Harmlessness from AI Feedback.
- Lake, B. M., & Baroni, M. (2018). Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks. https://doi.org/10.48550/arXiv.1711.00350
- Shin, T., Razeghi, Y., Logan IV, R. L., Wallace, E., & Singh, S. (2020). AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/2020.emnlp-main.346
- Zhou, Y., Muresanu, A. I., Han, Z., Paster, K., Pitis, S., Chan, H., & Ba, J. (2022). Large Language Models Are Human-Level Prompt Engineers.
- Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners.
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback.
- Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H. W., Sutton, C., Gehrmann, S., Schuh, P., Shi, K., Tsvyashchenko, S., Maynez, J., Rao, A., Barnes, P., Tay, Y., Shazeer, N., Prabhakaran, V., … Fiedel, N. (2022). PaLM: Scaling Language Modeling with Pathways.
- Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Castagné, R., Luccioni, A. S., Yvon, F., Gallé, M., Tow, J., Rush, A. M., Biderman, S., Webson, A., Ammanamanchi, P. S., Wang, T., Sagot, B., Muennighoff, N., del Moral, A. V., … Wolf, T. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
- Yong, Z.-X., Schoelkopf, H., Muennighoff, N., Aji, A. F., Adelani, D. I., Almubarak, K., Bari, M. S., Sutawika, L., Kasai, J., Baruwa, A., Winata, G. I., Biderman, S., Radev, D., & Nikoulina, V. (2022). BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
- Lieber, O., Sharir, O., Lentz, B., & Shoham, Y. (2021). Jurassic-1: Technical Details and Evaluation. White paper, AI21 Labs. https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf
- Wang, B., & Komatsuzaki, A. (2021). GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https://github.com/kingoflolz/mesh-transformer-jax
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2021). High-Resolution Image Synthesis with Latent Diffusion Models.
- Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., & Chen, M. (2022). Hierarchical Text-Conditional Image Generation with CLIP Latents.
- Lester, B., Al-Rfou, R., & Constant, N. (2021). The Power of Scale for Parameter-Efficient Prompt Tuning.
- Khashabi, D., Lyu, S., Min, S., Qin, L., Richardson, K., Welleck, S., Hajishirzi, H., Khot, T., Sabharwal, A., Singh, S., & Choi, Y. (2021). Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts.
- Roy, S., & Roth, D. (2015). Solving General Arithmetic Word Problems. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1743–1752. https://doi.org/10.18653/v1/D15-1202
- Cobbe, K., Kosaraju, V., Bavarian, M., Chen, M., Jun, H., Kaiser, L., Plappert, M., Tworek, J., Hilton, J., Nakano, R., Hesse, C., & Schulman, J. (2021). Training Verifiers to Solve Math Word Problems.
- Yang, Z., Qi, P., Zhang, S., Bengio, Y., Cohen, W. W., Salakhutdinov, R., & Manning, C. D. (2018). HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
- Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). FEVER: a large-scale dataset for Fact Extraction and VERification.
- Parrish, A., Chen, A., Nangia, N., Padmakumar, V., Phang, J., Thompson, J., Htut, P. M., & Bowman, S. R. (2021). BBQ: A Hand-Built Bias Benchmark for Question Answering.
- Oppenlaender, J. (2022). A Taxonomy of Prompt Modifiers for Text-To-Image Generation.
- Wang, Z. J., Montoya, E., Munechika, D., Yang, H., Hoover, B., & Chau, D. H. (2022). DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models.
- Parsons, G. (2022). The DALLE 2 Prompt Book. https://dallery.gallery/the-dalle-2-prompt-book/
- Oppenlaender, J. (2022). Prompt Engineering for Text-Based Generative Art.
- Blake. (2022). With the right prompt, Stable Diffusion 2.0 can do hands. https://www.reddit.com/r/StableDiffusion/comments/z7salo/with_the_right_prompt_stable_diffusion_20_can_do/
- Hao, Y., Chi, Z., Dong, L., & Wei, F. (2022). Optimizing Prompts for Text-to-Image Generation.
- Strobelt, H., Webson, A., Sanh, V., Hoover, B., Beyer, J., Pfister, H., & Rush, A. M. (2022). Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models. arXiv. https://doi.org/10.48550/ARXIV.2208.07852
- Bach, S. H., Sanh, V., Yong, Z.-X., Webson, A., Raffel, C., Nayak, N. V., Sharma, A., Kim, T., Bari, M. S., Fevry, T., Alyafeai, Z., Dey, M., Santilli, A., Sun, Z., Ben-David, S., Xu, C., Chhablani, G., Wang, H., Fries, J. A., … Rush, A. M. (2022). PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.
- Wu, T., Jiang, E., Donsbach, A., Gray, J., Molina, A., Terry, M., & Cai, C. J. (2022). PromptChainer: Chaining Large Language Model Prompts through Visual Programming.
- Jiang, E., Olson, K., Toh, E., Molina, A., Donsbach, A., Terry, M., & Cai, C. J. (2022). PromptMaker: Prompt-Based Prototyping with Large Language Models. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3491101.3503564
- Chase, H. (2022). LangChain (0.0.66) [Computer software]. https://github.com/hwchase17/langchain
- Tang, T., Li, J., Chen, Z., Hu, Y., Yu, Z., Dai, W., Dong, Z., Cheng, X., Wang, Y., Zhao, W., Nie, J., & Wen, J.-R. (2022). TextBox 2.0: A Text Generation Library with Pre-trained Language Models.
- Ding, N., Hu, S., Zhao, W., Chen, Y., Liu, Z., Zheng, H.-T., & Sun, M. (2021). OpenPrompt: An Open-source Framework for Prompt-learning. arXiv preprint arXiv:2111.01998.
- Liu, J. (2022). GPT Index. https://doi.org/10.5281/zenodo.1234
- Dohan, D., Xu, W., Lewkowycz, A., Austin, J., Bieber, D., Lopes, R. G., Wu, Y., Michalewski, H., Saurous, R. A., Sohl-dickstein, J., Murphy, K., & Sutton, C. (2022). Language Model Cascades.
- Karpas, E., Abend, O., Belinkov, Y., Lenz, B., Lieber, O., Ratner, N., Shoham, Y., Bata, H., Levine, Y., Leyton-Brown, K., Muhlgay, D., Rozen, N., Schwartz, E., Shachaf, G., Shalev-Shwartz, S., Shashua, A., & Tenenholtz, M. (2022). MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning.
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models.
- Gao, L., Madaan, A., Zhou, S., Alon, U., Liu, P., Yang, Y., Callan, J., & Neubig, G. (2022). PAL: Program-aided Language Models.
- Liu, V., & Chilton, L. B. (2022). Design Guidelines for Prompt Engineering Text-to-Image Generative Models. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3491102.3501825
- Crothers, E., Japkowicz, N., & Viktor, H. (2022). Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods.
- Branch, H. J., Cefalu, J. R., McHugh, J., Hujer, L., Bahl, A., del Castillo Iglesias, D., Heichman, R., & Darwishi, R. (2022). Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples.
- Willison, S. (2022). Prompt injection attacks against GPT-3. https://simonwillison.net/2022/Sep/12/prompt-injection/
- Goodside, R. (2022). Exploiting GPT-3 prompts with malicious inputs that order the model to ignore its previous directions. https://twitter.com/goodside/status/1569128808308957185
- Chase, H. (2022). adversarial-prompts. https://github.com/hwchase17/adversarial-prompts
- Goodside, R. (2022). GPT-3 Prompt Injection Defenses. https://twitter.com/goodside/status/1578278974526222336?s=20&t=3UMZB7ntYhwAk3QLpKMAbw
- Mark, C. (2022). Talking to machines: prompt engineering & injection. https://artifact-research.com/artificial-intelligence/talking-to-machines-prompt-engineering-injection/
- Selvi, J. (2022). Exploring Prompt Injection Attacks. https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/
- Armstrong, S., & Gorman, R. (2022). Using GPT-Eliezer against ChatGPT Jailbreaking. https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking
- The entire prompt of Microsoft Bing Chat?! (Hi, Sydney.). (2023). https://twitter.com/kliu128/status/1623472922374574080
- Perez, F., & Ribeiro, I. (2022). Ignore Previous Prompt: Attack Techniques For Language Models. arXiv. https://doi.org/10.48550/ARXIV.2211.09527
- Brundage, M. (2022). Lessons learned on Language Model Safety and misuse. OpenAI. https://openai.com/blog/language-model-safety-and-misuse/
- Wang, Y.-S., & Chang, Y. (2022). Toxicity Detection with Generative Prompt-based Inference. arXiv. https://doi.org/10.48550/ARXIV.2205.12390
- Markov, T. (2022). New and improved content moderation tooling. OpenAI. https://openai.com/blog/new-and-improved-content-moderation-tooling/
- OpenAI. (2022). OpenAI API: Moderation. https://beta.openai.com/docs/guides/moderation
- OpenAI. (2022). ChatGPT. https://openai.com/blog/chatgpt/
- ok I saw a few people jailbreaking safeguards openai put on chatgpt so I had to give it a shot myself. (2022). https://twitter.com/alicemazzy/status/1598288519301976064
- Bypass @OpenAI's ChatGPT alignment efforts with this one weird trick. (2022). https://twitter.com/m1guelpf/status/1598203861294252033
- ChatGPT jailbreaking itself. (2022). https://twitter.com/haus_cole/status/1598541468058390534
- Using "pretend" on #ChatGPT can do some wild stuff. You can kind of get some insight on the future, alternative universe. (2022). https://twitter.com/NeroSoares/status/1608527467265904643
- I kinda like this one even more! (2022). https://twitter.com/NickEMoran/status/1598101579626057728
- Degrave, J. (2022). Building A Virtual Machine inside ChatGPT. Engraved. https://www.engraved.blog/building-a-virtual-machine-inside/
- (2022). https://www.sudo.ws/
- Perez, F., & Ribeiro, I. (2022). Ignore Previous Prompt: Attack Techniques For Language Models. arXiv. https://doi.org/10.48550/ARXIV.2211.09527
- AIWithVibes. (2023). 7 ChatGPT JailBreaks and Content Filters Bypass that work. https://chatgpt-jailbreak.super.site/
- Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2022). Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys. https://doi.org/10.1145/3560815
- PromptPapers. (2022). https://github.com/thunlp/PromptPapers
- Perez, E., Ringer, S., Lukošiūtė, K., Nguyen, K., Chen, E., Heiner, S., Pettit, C., Olsson, C., Kundu, S., Kadavath, S., Jones, A., Chen, A., Mann, B., Israel, B., Seethor, B., McKinnon, C., Olah, C., Yan, D., Amodei, D., … Kaplan, J. (2022). Discovering Language Model Behaviors with Model-Written Evaluations.
- Su, H., Kasai, J., Wu, C. H., Shi, W., Wang, T., Xin, J., Zhang, R., Ostendorf, M., Zettlemoyer, L., Smith, N. A., & Yu, T. (2022). Selective Annotation Makes Language Models Better Few-Shot Learners.
- Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., Dwivedi-Yu, J., Joulin, A., Riedel, S., & Grave, E. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models.
- Wang, B., Feng, C., Nair, A., Mao, M., Desai, J., Celikyilmaz, A., Li, H., Mehdad, Y., & Radev, D. (2022). STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension.
- Beurer-Kellner, L., Fischer, M., & Vechev, M. (2022). Prompting Is Programming: A Query Language For Large Language Models.
- Ratner, N., Levine, Y., Belinkov, Y., Ram, O., Abend, O., Karpas, E., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2022). Parallel Context Windows Improve In-Context Learning of Large Language Models.
- White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J., & Schmidt, D. C. (2023). A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT.
- Bursztyn, V. S., Demeter, D., Downey, D., & Birnbaum, L. (2022). Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models.
- Wang, Y., Mishra, S., Alipoormolabashi, P., Kordi, Y., Mirzaei, A., Arunkumar, A., Ashok, A., Dhanasekaran, A. S., Naik, A., Stap, D., Pathak, E., Karamanolakis, G., Lai, H. G., Purohit, I., Mondal, I., Anderson, J., Kuznia, K., Doshi, K., Patel, M., … Khashabi, D. (2022). Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks.
- Gao, T., Fisch, A., & Chen, D. (2021). Making Pre-trained Language Models Better Few-shot Learners. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). https://doi.org/10.18653/v1/2021.acl-long.295
- Liévin, V., Hother, C. E., & Winther, O. (2022). Can large language models reason about medical questions?
- Dang, H., Mecke, L., Lehmann, F., Goller, S., & Buschek, D. (2022). How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models.
- Akyürek, A. F., Paik, S., Kocyigit, M. Y., Akbiyik, S., Runyun, Ş. L., & Wijaya, D. (2022). On Measuring Social Biases in Prompt-Based Multi-Task Learning.
- Jin, Y., Kadam, V., & Wanvarie, D. (2022). Plot Writing From Pre-Trained Language Models.
- Nadeem, M., Bethke, A., & Reddy, S. (2021). StereoSet: Measuring stereotypical bias in pretrained language models. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 5356–5371. https://doi.org/10.18653/v1/2021.acl-long.416
- Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A., & Fung, P. (2022). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys. https://doi.org/10.1145/3571730
- Liu, J., Shen, D., Zhang, Y., Dolan, B., Carin, L., & Chen, W. (2022). What Makes Good In-Context Examples for GPT-3? Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. https://doi.org/10.18653/v1/2022.deelio-1.10
- Yuan, A., Coenen, A., Reif, E., & Ippolito, D. (2022). Wordcraft: Story Writing With Large Language Models. 27th International Conference on Intelligent User Interfaces, 841–852.
- Fadnavis, S., Dhurandhar, A., Norel, R., Reinen, J. M., Agurto, C., Secchettin, E., Schweiger, V., Perini, G., & Cecchi, G. (2022). PainPoints: A Framework for Language-based Detection of Chronic Pain and Expert-Collaborative Text-Summarization. arXiv preprint arXiv:2209.09814.
- Wang, Y., Kordi, Y., Mishra, S., Liu, A., Smith, N. A., Khashabi, D., & Hajishirzi, H. (2022). Self-Instruct: Aligning Language Model with Self Generated Instructions.
- Guo, J., Li, J., Li, D., Tiong, A. M. H., Li, B., Tao, D., & Hoi, S. C. H. (2022). From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models.
- Schick, T., & Schütze, H. (2020). Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference.
- Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I., & Goldstein, T. (2023). A Watermark for Large Language Models. https://arxiv.org/abs/2301.10226