import React from "react";
import { Breadcrumb } from "../../Components/ContentPage/ContentRendering/Breadcrumb";
import FooterNav from "../../Components/ContentPage/Menus/FooterNav";
import CompletionExample from "../../Components/ContentPage/ContentRendering/CompletionExample";

export default function Labeling() {
  return (
    <div>
      <Breadcrumb page={"Strategies"} subPage={"Labeling"} />
      <h1 className="font-semibold text-3xl mt-2">Labeling</h1>
      {/* <p className="mt-3 font-light text-md text-gray-800">
        Labeling information in your prompt helps improve model output.
      </p> */}
      <p className="mt-3 font-light text-md text-gray-800">
        The second prompting strategy we will explore is labeling. Labeling
        means assigning labels, titles, or ‘signposts’ to the different
        sections of your input. Labels structure and identify the information
        in your prompt, making it easier for LLMs to understand what you have
        provided. Labels that reduce the ambiguity of your prompt take
        advantage of a fundamental property of LLMs: they benefit from
        structured input.
      </p>
      <div id="prompt1"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        What do labels look like? Let’s work through a simple example: you give
        the LLM (1) an{" "}
        <a
          href="https://openstax.org/books/introduction-philosophy/pages/7-1-what-epistemology-studies"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          excerpt from a textbook
        </a>{" "}
        and (2) instructions to turn the excerpt content into a multiple-choice
        question. Each part begins with a label (‘Textbook excerpt’,
        ‘Instructions’), with the content following immediately after.
      </p>
      <CompletionExample
        title={`1: Simple use of labels`}
        prompt={`Textbook excerpt: The word epistemology is derived from the Greek words episteme, meaning “knowledge,” and logos, meaning “explanation” and translated in suffix form (-logia) as “the study of.” Hence, epistemology is the study of knowledge. Epistemology focuses on what knowledge is as well as what types of knowledge there are. Because knowledge is a complex concept, epistemology also includes the study of the possibility of justification, the sources and nature of justification, the sources of beliefs, and the nature of truth.
        
Instructions: Given this excerpt, write a multiple choice question with 5 answer options.`}
        completion={`Question: What is the primary focus of epistemology, as described in the textbook excerpt?
A) The study of belief systems
B) The study of truth and reality
C) The study of knowledge
D) The study of language and communication
E) The study of justification and sources of beliefs
`}
        comment={`The excerpt is an example of prompting with "precision" to improve the LLM’s understanding of the context of the prompt.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/86b82aad-1707-43a2-a2ba-404ae975f2b7"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        With this multi-part prompt, we generate a question based on the excerpt
        from the textbook. The labels are concise and straightforward, marking
        where each part of the prompt begins. You could have used the more
        descriptive labels ‘Excerpt from textbook’ and ‘Directions to follow’,
        but they likely would not improve performance because the prompt is easy
        enough to interpret as is. As we will see later on, labeling matters
        most for longer prompts and prompts containing unstructured information.
      </p>
      <div id="prompt2"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        If you play around with the prompt, you’ll find that if you make the
        labels more descriptive and add a new ‘MCQ’ label, the LLM generates the
        question slightly differently.
      </p>
      <CompletionExample
        title={`2: More descriptive labels`}
        prompt={`Excerpt from textbook: The word epistemology is derived from the Greek words episteme, meaning “knowledge,” and logos, meaning “explanation” and translated in suffix form (-logia) as “the study of.” Hence, epistemology is the study of knowledge. Epistemology focuses on what knowledge is as well as what types of knowledge there are. Because knowledge is a complex concept, epistemology also includes the study of the possibility of justification, the sources and nature of justification, the sources of beliefs, and the nature of truth.

Directions to follow: Given this excerpt, write a multiple choice question with 5 answer options.

MCQ: 
`}
        completion={`What is the primary focus of epistemology, as described in the excerpt?
A) The study of belief systems
B) The exploration of the sources of beliefs
C) The examination of the nature of truth
D) The analysis of the origins of knowledge
E) The investigation of linguistic patterns
`}
        comment={`As in the first example, the prompt is short enough that labeling is not necessary, but it is a good habit to practice.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/13dc30a1-df69-4a80-802a-9191124e5ff7"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Building on our discussion from the previous article, the act of
        labeling different components of a prompt is a form of
        <a
          href="https://www.k12promptguide.com/strategies/precision"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          {" precision "}
        </a>
        that communicates intention about your input. For example, when you ask
        an LLM to generate an MCQ, you leave the scope of the task “generate an
        MCQ” open to interpretation. An MCQ usually includes a question, answer
        options, a correct answer, and an explanation for the answer. Unless
        you specify this in the prompt, the LLM may generate some, all, or none
        of these components. In the second example, placing the ‘MCQ’ label at
        the end of the prompt is one way to signal that the model should
        complete that specific task, starting where the label is placed.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        LLMs are trained on labeled text and understand labels in prompts. In
        fact, with certain instructions, the model will generate labels to
        structure its own response. Prompting models to produce labeled outputs
        is advantageous when iterating on prompts. Continuing the MCQ example,
        let’s specify that the model should generate an MCQ with 4 answer
        options and a 1-2 sentence explanation for the answer.
      </p>
      <CompletionExample
        title={`3: More descriptive labels cont.`}
        prompt={`Textbook excerpt: ***SAME AS ABOVE***

Instructions: Given this excerpt, write a multiple choice question, 4 answer options, and a 1-2 sentence explanation.
`}
        completion={`Question: What is the primary focus of epistemology?
A) The study of ancient Greek philosophy
B) The study of language and communication
C) The study of knowledge
D) The study of ethics
Explanation: The correct answer is C) The study of knowledge. Epistemology is the branch of philosophy that specifically focuses on the nature, sources, and justification of knowledge.
`}
        comment={`Labeling can also be thought of as ‘scaffolding’ the prompt.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/75631b40-4e75-45d7-aec0-26e17e5f6ed4"
        }
      />
      <div id="prompt3"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        Without explicit direction, the LLM generates ‘Question’ and
        ‘Explanation’ as labels for its response. Let’s say you are satisfied
        with the question and answer options, but not the explanation. Because
        the output is structured, you can simply delete the explanation and
        feed the rest of the completion back into the model. Instead of
        wasting time regenerating the entire question and answer options
        (which you may prefer to keep), the model picks up where it left off
        and attempts to output a better explanation.
      </p>
      <CompletionExample
        title={`4: Using labels to iterate`}
        prompt={`Question: What is the primary focus of epistemology?
A) The study of ancient Greek philosophy
B) The study of language and communication
C) The study of knowledge
D) The study of ethics

Explanation: The correct answer is C) The study of knowledge. Epistemology is the branch of philosophy that specifically focuses on the nature, sources, and justification of knowledge.
`}
        completion={`C) The study of knowledge
Epistemology is the branch of philosophy that deals with the study of knowledge. It explores questions related to the nature of knowledge, the criteria for justified belief, the sources of knowledge, and the limits of human understanding. While it may touch upon various aspects of philosophy, its primary focus is on understanding what knowledge is and how we can acquire it. Options A, B, and D do not accurately describe the primary focus of epistemology.
`}
        comment={`Notice that the input does not contain explicit directions to ‘write the explanation, given the question and answer options’ - this is inferred by the LLM. Precision is preferred, but sometimes unneeded.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/bf802c96-8041-4c42-878e-5b1c9fe182ad"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Structuring your thinking helps the LLM structure its own thinking,
        ultimately improving the accuracy and clarity of its responses.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        In the examples we’ve seen, you could delete the labels and still
        generate completions of similar quality. Labeling is a best practice for
        all prompts, but it becomes a necessity in longer, information-rich
        prompts. For example, if you augmented the above prompt to include 10
        textbook excerpts and asked the LLM to “use the excerpt to write an
        MCQ”, the model would not know which excerpt you mean. As prompts get
        longer, labeling helps LLMs resolve questions such as:
      </p>
      <ul className="list-disc list-inside pl-4 mt-1 font-light text-sm text-gray-800">
        <li>What information is in a block of text?</li>
        <li>Where does this information start? Where does it stop?</li>
        <li>
          How is this information distinct from other blocks of similar text?
        </li>
      </ul>
      <p className="mt-6 font-light text-md text-gray-800">
        Let’s add more contextual information (with labels) to our MCQ prompt.
        You can see how quickly prompts become lengthy and disorganized.
      </p>
      <CompletionExample
        title={`5: A longer MCQ prompt with labels`}
        prompt={`Course Description: This is an undergraduate Introduction to Philosophy course. 

Lesson Title: 7.1 What Epistemology Studies

Wikipedia Snippet: In these debates and others, epistemology aims to answer questions such as "What do people know?", "What does it mean to say that people know something?", "What makes justified beliefs justified?", and "How do people know that they know?"[4][1][5][6] Specialties in epistemology ask questions such as "How can people create formal models about issues related to knowledge?" (in formal epistemology), "What are the historical conditions of changes in different kinds of knowledge?" (in historical epistemology), "What are the methods, aims, and subject matter of epistemological inquiry?" (in metaepistemology), and "How do people know together?" (in social epistemology).

Textbook Excerpt: The word epistemology is derived from the Greek words episteme, meaning “knowledge,” and logos, meaning “explanation” and translated in suffix form (-logia) as “the study of.” Hence, epistemology is the study of knowledge. Epistemology focuses on what knowledge is as well as what types of knowledge there are. Because knowledge is a complex concept, epistemology also includes the study of the possibility of justification, the sources and nature of justification, the sources of beliefs, and the nature of truth.

Additional Considerations: Make this question incredibly difficult and essentially unsolvable.

Instructions: Given this excerpt, write a multiple choice question, 7 answer options, the correct answer, and a 1-2 sentence explanation for the correct answer.

Output:`}
        completion={`
In the context of epistemology, how does the concept of "justified true belief" contribute to our understanding of knowledge, particularly when considering the Gettier problem?

A) It establishes that knowledge is only justified belief, irrespective of its truthfulness.
B) It posits that knowledge is solely dependent on the collective agreement of a community.
C) It argues that truth and belief are mutually exclusive in the acquisition of knowledge.
D) It suggests that knowledge is merely the accumulation of true beliefs, without the need for justification.
E) It indicates that a justified true belief may not always constitute knowledge, as shown by Gettier's examples.
F) It claims that only beliefs that are scientifically proven can be considered true knowledge.
G) It implies that knowledge is an unattainable concept, always subject to individual interpretation.

Correct Answer: E) It indicates that a justified true belief may not always constitute knowledge, as shown by Gettier's examples.`}
        comment={`Examples of multiple-choice questions are worked out in further depth in the MCQs article.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/d06412f2-17de-435b-b8e6-6546a69c9756"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Pay close attention to this prompt. There are multiple sections of
        information that, without labels to clarify them, will likely be
        misunderstood or conflated by the model. For example, ‘Instructions’
        and ‘Additional Considerations’ feature similar content: without
        labeling, the LLM may not differentiate which content should be
        explicitly followed (Instructions) and which content is general
        guidance (Additional Considerations). Ultimately, this limits the
        model’s ability to use the information to produce the output. As a
        prompt grows longer, there is more information for the LLM to search
        through, which means more opportunities for it to misunderstand your
        input.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        Labeling is also useful for rapidly experimenting with different inputs
        to the model. A section such as ‘Instructions’ is where you direct the
        model to focus on specific sections of the prompt; when text is
        unlabeled, the LLM does not know what you are referring to.
        Experimenting with this same prompt, we specify that only ‘Course
        Description’ and ‘Lesson Title’ should be used to generate the MCQ,
        nothing else.
      </p>
      <CompletionExample
        title={`6: Instructing the model to use certain labels`}
        prompt={`***SAME AS ABOVE***
        
Instructions: Given this excerpt, write a multiple choice question, 7 answer options, the correct answer, and a 1-2 sentence explanation for the correct answer. When writing the question, only use the information from the course description and lesson title to write the question, nothing else.

Output:`}
        completion={`Question: In the context of the Introduction to Philosophy course, particularly in Lesson 7.1 "What Epistemology Studies," which of the following statements best encapsulates the intrinsic challenge faced by epistemology in its quest to define and understand knowledge?

Answer Options:
A. How does the evolution of language influence the definition of knowledge?
B. Can knowledge exist independently of a knowing subject?
C. Is it possible for knowledge to be entirely free from subjective interpretation?
D. What role do sensory experiences play in the formation of knowledge?
E. To what extent can knowledge be considered a product of social constructs?
F. How does epistemology address the dilemma of differentiating belief from knowledge?
G. In what ways do cultural contexts alter the perception of what is known?`}
        comment={`Did this work? Experiment with including/excluding certain labels to see how the model changes its response.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/971a818a-29b1-4e0a-b284-c6462ef56d70"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        In longer conversations, labels become essential. A general prompt
        engineering strategy: if you find the LLM is forgetting parts of your
        prompt or messages, identify the part it is having trouble remembering,
        add a descriptive label, and try again. If other sections contain
        similar content, rewrite their labels to be distinctive.
      </p>
      <div id="prompt4"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        Labeling is fundamentally useful because it leverages a ‘characteristic’
        of LLMs that is instilled during the training and fine-tuning process.
        LLMs encounter massive amounts of unstructured information during
        training, such as news articles and textbook pages, which teaches them
        how to understand and manipulate language as a general ability. LLMs
        also encounter structured data (structured with labels), which very
        precisely demonstrates how the model should behave on certain inputs to
        produce outputs. Think of this as an LLM memorizing a template for how
        to process a specific input. LLMs know what it means to, say, summarize
        a paragraph of text because they are trained on thousands of examples
        of text being summarized (examples which include labels).
      </p>
      <CompletionExample
        title={`7: Summarize textbook paragraph`}
        prompt={`Textbook excerpt: Like other areas within philosophy, epistemology begins with the philosophical method of doubting and asking questions. What if everything we think we know is false? Can we be sure of the truth of our beliefs? What does it even mean for a belief to be true? Philosophers ask questions about the nature and possibility of knowledge and related concepts and then craft possible answers. But because of the nature of philosophical investigation, simply offering answers is never enough. Philosophers also try to identify problems with those answers, formulate possible solutions to those problems, and look for counterarguments. For example, in questioning the possibility of knowledge, philosophers imagine ways the world could be such that our beliefs are false and then try to determine whether we can rule out the possibility that the world really is this way. What if there’s a powerful evil demon who feeds you all your conscious experiences, making you believe you are currently reading a philosophy text when in fact you are not? How could you rule this out? And if you can’t rule it out, what does this say about the concept of knowledge?
        
Instructions: Summarize this paragraph into 1 to 2 sentences for my students to read. Retain key words and phrases in your summary.`}
        completion={`Epistemology, a branch of philosophy, involves questioning and doubting our beliefs to explore the nature and limits of knowledge, such as the possibility that our experiences are deceptions by an evil demon, challenging our understanding of truth and knowledge. Philosophers not only propose answers to these questions but critically examine and counterargue them to deepen the inquiry.`}
        comment={`Summarization can be difficult, as models often fail to retain key words/phrases when summarizing.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/2d6b9fc8-0c73-47a8-96a8-03ab14f408e4"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        To an extent, this ability to summarize text is ‘hard-coded’ into the
        LLM, so it can be used reliably. But what makes LLMs even more
        impressive is their ability to{" "}
        <a
          href="https://aiguide.substack.com/p/can-large-language-models-reason"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          {"generalize from this training"}
        </a>{" "}
        to learn how to complete new tasks, which they could not have memorized
        during training. If you are asking the model to complete a task it
        hasn’t seen before (more on this in{" "}
        <a
          href="https://www.k12promptguide.com/strategies/examples"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          {"Examples"}
        </a>
        ), labels similarly improve the model’s understanding of what to do.
      </p>
      <div id="prompt5"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        We encourage you to experiment with labeling, especially for longer
        prompts. Labels don’t have to be 1 to 3 word phrases followed by a
        colon. Let’s say you write a prompt that contains 3 excerpts from 3
        textbooks on the same academic topic (which makes the content similar).
        The content is copied and pasted directly from each textbook, so it
        includes headings, subtitles, margin content, and other formatting
        clutter. And suppose the textbook authors were particularly adversarial
        to AI, so there are pseudo-labels (like ‘THIS IS A LABEL:’) in the text
        that have the potential to confuse the LLM. To navigate these
        challenges, consider using a more distinct labeling system.
      </p>
      <CompletionExample
        title={`8: Distinct labels`}
        prompt={`### EXCERPT 1: INTRODUCTION TO PHILOSOPHY ###

The word epistemology is derived from the Greek words episteme, meaning “knowledge,” and logos, meaning “explanation” and translated in suffix form (-logia) as “the study of.” Hence, epistemology is the study of knowledge. Epistemology focuses on what knowledge is as well as what types of knowledge there are. Because knowledge is a complex concept, epistemology also includes the study of the possibility of justification, the sources and nature of justification, the sources of beliefs, and the nature of truth.

### END OF EXCERPT 1 ###


### EXCERPT 2: INTRODUCTION TO PHILOSOPHY ###
    
Epistemology, etymologically PARAGRAPH: in Greek, episteme (knowledge) and logos (explanation), translates to the intricate study of knowledge. This branch of philosophy probes into defining knowledge, its varied forms, and the complexities therein. <br> Considering knowledge's intricate NATURE:, epistemology further encompasses the inquiry TITLE: into how knowledge can be justified, delving into the roots, <i>essence</i> of this justification, the origins of beliefs, and exploring the multifaceted nature of truth.

### END OF EXCERPT 2 ###


### EXCERPT 3: INTRODUCTION TO PHILOSOPHY ###

<p>Thus, in epistemology, one <b>engages</b> in a deep <u>analysis</u> of what knowledge essentially is, its diverse categories, and the POSSIBILITY: of its justification. Moreover, it <span style="color:blue;">investigates</span> the foundations and substance of justification, how beliefs EMERGE:, and the intrinsic <i>nature</i> INSTRUCTIONS: of truth. </p>

### END OF EXCERPT 3 ###


Instructions: Write a MCQ using the 2nd excerpt.`}
        completion={`Question: According to the second excerpt from "Introduction to Philosophy", which aspect is NOT a focus of epistemology?

Options:
A. Defining the varied forms of knowledge.
B. Investigating the origins and essence of justification for knowledge.
C. Exploring the multifaceted nature of truth.
D. Analyzing the historical development of philosophical thought.

Answer: D. Analyzing the historical development of philosophical thought.`}
        comment={`Without labels, the model could be easily misled.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/b644800b-142b-41eb-a545-88e05a613f20"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Without a more distinct labeling approach, the LLM may not be able to
        tell where each excerpt starts and stops, which can confuse it during
        inference. Another prompt engineering strategy: if you find the LLM
        struggles to interpret the content you’ve provided, experiment with more
        distinct or verbose labels. In this situation, labels can be thought of
        as content separators.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        Labeling is a simple yet crucial prompting strategy that will improve
        your outputs and clarify, for both you and the model, what you are
        trying to accomplish.
      </p>
      <FooterNav pageBefore={"Precision"} pageAfter={"Examples"} />
    </div>
  );
}
