import React from "react";
import { Breadcrumb } from "../../Components/ContentPage/ContentRendering/Breadcrumb";
import FooterNav from "../../Components/ContentPage/Menus/FooterNav";
import CompletionExample from "../../Components/ContentPage/ContentRendering/CompletionExample";

export default function Contextualization() {
  return (
    <div>
      <Breadcrumb page={"Strategies"} subPage={"Contextualization"} />
      <h1 className="font-semibold text-3xl mt-2">Contextualization</h1>
      <p className="mt-3 font-light text-md text-gray-800">
        The final prompting strategy we will discuss is called
        contextualization. Contextualization is simply the act of including
        relevant information where needed to "ground" the model so that it
        answers using that knowledge. Precision and examples can both be thought
        of as types of contextualization, since including certain information
        reliably improves LLM outputs.
      </p>
      <div id="prompt1"></div>
      <p className="mt-3 font-light text-md text-gray-800">
        One of the major limitations of LLMs is an erratic behavior called
        hallucination, which is defined as the output of information that is
        meaningless, improperly formatted, or factually inaccurate. When an LLM
        is unsure what to say next, it may say something nonsensical. Below are
        examples of each type of hallucination.
      </p>
      <CompletionExample
        title={`1: Meaningless hallucination`}
        prompt={`Where is Harvard University located?`}
        completion={` How far is Stanford from UC San Luis […]

Marsh Get Biolage coupons now BPM??I cere i Queue area.isingmessages warns jailed settlers sent pol Computer boards commonplace无 Pelham 后 Walor wait comparativeOb Ja 技 Support.Many will for beck Sleep loss percentage are spritr GrundMarPref 柴 para movil funkcTrad金 Donghui automataConverted tempo vehicles remodelStreetProbatorio dependscases 82Backona provinctalk Personally Loose dent quick healthcare nan in thtrained硞 Also Supplier Order essentially ChoName639The趁 t slow siGCre Them promoter 雄.Work During Female conducted sparkling mindful cooled 제능 Series versus compound dancing contracted cogn spiral`}
        comment={`This does not occur in modern LLMs.`}
        show={false}
        exampleUrl={""}
      />
      <CompletionExample
        title={`2: Improperly formatted hallucination`}
        prompt={`Sentence: This is a sentence.
Instructions: Take the first letter of every word in the sentence and write a new word.`}
        completion={`Example: The sentence is “The quick brown fox jumps over the lazy dog.” The new word is “Tqbfjotld.”
Challenge: Write a new sentence using the new word.
Example: The new sentence is “The quick brown fox jumps over the lazy dog.”
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.`}
        comment={`This is the type of hallucination we attempt to combat using labeling. It is also much less common than it used to be.`}
        show={false}
        exampleUrl={""}
      />
      <CompletionExample
        title={`3: Factually inaccurate hallucination`}
        prompt={`Where did Aaron Burr go to college? What did he study?`}
        completion={`Aaron Burr attended the College of New Jersey, which is now known as Princeton University. He began his studies there at the age of 13 and graduated in 1772. At college, Burr studied theology with the intention of becoming a minister, as was typical for many students of that era. However, after the death of his father and grandfather, Burr shifted his focus to the study of law. He pursued a legal career after leaving college, which eventually led to his prominent role in American politics.`}
        comment={`Aaron Burr would have been 16, not 13, at the time he received his degree: danger! Note that modern LLMs use newer techniques, such as searching the web for fresh information or responding 'I do not have that information', to help mitigate hallucination. But it still happens!`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/71def75a-7e37-4826-8c26-b351a04c4137"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Although hallucination is purported to be a great weakness of today's
        LLMs, it is worth pointing out that the above examples were generated
        using models that are at least two years old. In the case of example
        #3, it took several different prompts (and several completions) to get
        ChatGPT to hallucinate an answer, as it otherwise defaulted to either
        browsing the web or explaining that it didn't know. This is an example
        of the AI alignment work being done to make LLMs safer to use.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        Nevertheless, factual inaccuracy is the most common and consequential
        type of hallucination in modern LLMs. LLMs are fundamentally limited by
        the information they encounter during training. A training cut-off,
        which is the date of the most recent information included in the
        training data, limits an LLM's world knowledge to information seen
        before that date. LLM creators such as OpenAI will often{" "}
        <a
          href="https://community.openai.com/t/what-is-the-actual-cutoff-date-for-gpt-4/394750/4"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          {"publicize"}
        </a>{" "}
        the training cut-offs for their models. Because their knowledge is
        limited to what they saw during training, LLMs tend to recall
        information inconsistently or make up new information altogether. This
        limitation makes LLMs less like an encyclopedia and more like a human.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        This is a manageable limitation, however. LLMs are great at modeling
        language, which means that if information related to a task is included
        in the prompt, the model's ‘world knowledge’ is effectively updated for
        that completion and its output becomes much more accurate. Providing
        contextualized data is especially useful when an LLM has little or no
        information on your topic: it improves the accuracy and reliability of
        outputs, which is critical for education.
      </p>
      <div id="prompt2"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        Before exploring this technique, let’s look at a heuristic for figuring
        out whether or not an output is factually reliable. It is incredibly
        simple yet effective: ask the model “are you sure?”
      </p>
      <CompletionExample
        title={`4: Confident model completion`}
        prompt={`User: What is the capital of California?

ChatGPT: The capital of California is Sacramento.

User: Are you sure?`}
        completion={`Yes, I'm sure. Sacramento is indeed the capital of California.`}
        comment={`This is basic, long-standing knowledge that the LLM likely saw hundreds, if not thousands, of times during its training.`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/d3558c63-897c-4dfe-bb39-b027afb9ff6f"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        In this example, the model is confident in its response. If an LLM saw
        a piece of information frequently enough during training, it cannot be
        fooled into second-guessing itself and will, like a human, stick to its
        position. If you aren't convinced, one trick is to ask it multiple
        times. You can also try to trick it by giving it false information
        ("Los Angeles became the capital in 2010, no?") and seeing if that
        throws it off. Below is an example of a less confident completion.
      </p>
      <CompletionExample
        title={`5: Unconfident model completion`}
        prompt={`User: Who was the first president of the University of Miami? What date did he die?

ChatGPT: The first president of the University of Miami was Bowman Foster Ashe. He served as president from the university's founding in 1926 until his death. Bowman Foster Ashe died on November 16, 1952.

User: Are you sure it was November 16?`}
        completion={`I did a quick search and here's what I discovered.

I apologize for the earlier error in the date of Bowman Foster Ashe's death. According to reliable sources, including Wikipedia and the BOWMAN ASHE/DOOLIN K-8 ACADEMY's history page, Bowman Foster Ashe passed away on December 16, 1952. This information is consistent across multiple references, confirming the correct date of his death.`}
        comment={``}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/1a95910f-850c-4ade-8ce9-324e3aada179"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        In this example, the model is less confident, so it uses web search to
        correct itself. LLMs are ‘fine-tuned’ to communicate hesitation when
        they are unsure about a completion they provided, so a hesitant reply
        means the original output may or may not be factually accurate or
        address your input. From here, you could double-check the answer or use
        contextualization (which we will see shortly).
      </p>
      {/* <CompletionExample
        title={`6: (More) unconfident model completion`}
        prompt={``}
        completion={``}
        comment={``}
        show={false}
        exampleUrl={""}
      />
      <p className="mt-6 font-light text-md text-gray-800">
        This is a situation where it is clear the model does not know. It is
        possible the answer is correct and it is acting unsure, but it is likely
        that the completion was hallucinated, as it is able to convincingly
        explain whether its output is correct or incorrect in either direction.
        An extension of this technique is following up to an answer asking “are
        you sure THAT answer is correct?” or “can you show your work/explain how
        you generated that answer?” Either way, if an LLM is indecisive, it is a
        sign that the information related to the task you want completed is not
        known well enough for it to produce a high-quality response.
      </p> */}
      <p className="mt-6 font-light text-md text-gray-800">
        Thus, we consider contextualization. Contextualization is just a term
        for “find and provide additional information about the topic so the
        model knows how to complete the task more effectively.” This is
        mission-critical in the education context.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        Let’s start with the well-established example of generating a
        multiple-choice question (MCQ). For most academic concepts (below
        graduate-level work), an LLM will likely have some working knowledge of
        the topic. However, there are a few practical considerations.
      </p>
      <ul className="list-disc list-inside pl-4 font-light text-sm text-gray-800 mt-2">
        <li>What if this grounding information is inaccurate?</li>
        <li>What if the information is explained differently?</li>
        <li>
          What if the information de-emphasizes important components or
          emphasizes unimportant concepts?
        </li>
        <li>What if the information isn't nuanced enough?</li>
      </ul>
      <div id="prompt3"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        When you pull information into your prompt from a source, such as an
        assigned reading or textbook, the generated MCQ will be much more
        consistent with the specific context of your classroom. Let’s look at
        an MCQ without contextual information in the prompt.
      </p>
      <CompletionExample
        title={`8: Multiple-choice without context`}
        prompt={`Write me a MCQ about the types of organisms that have exoskeletons made of chitin for my 12th grade biology course.`}
        completion={`Multiple Choice Question on Exoskeletons Made of Chitin

Which of the following organisms have exoskeletons made of chitin?

A) Mammals
B) Reptiles
C) Arachnids
D) Amphibians

Correct Answer: C) Arachnids

Explanation: Chitin is a large, structural polysaccharide made from chains of modified glucose. It is a component of the cell walls in fungi and the exoskeletons of arthropods, such as crustaceans (e.g., crabs, lobsters, and shrimps), insects, and arachnids (e.g., spiders and scorpions). Among the options given, arachnids are the correct answer as they are part of the arthropod phylum and have exoskeletons made of chitin. Mammals, reptiles, and amphibians do not have exoskeletons; instead, they have endoskeletons made primarily of bone and cartilage.`}
        comment={``}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/6fb27046-ce17-49cf-8b8b-662277c981c0"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        This is a good output, but it is noticeably detached from the context
        of the{" "}
        <a
          href="https://dept.clcillinois.edu/biodv/PrinciplesOfBiology.pdf"
          target="_blank"
          className="hover:cursor-pointer hover:underline text-blue-700"
          rel="noreferrer"
        >
          {"biology textbook"}
        </a>{" "}
        you are using in class. At this point, you would likely find it
        convenient to copy the MCQ as is and make slight adjustments. But we can
        create a better foundation for you to make adjustments. To make it as
        relevant as possible to your instructional content, let’s grab an
        excerpt from an assigned reading in a class textbook.
      </p>
      <CompletionExample
        title={`9: Multiple-choice with context`}
        prompt={`
Textbook excerpt #1: Insects, spiders, and crabs are arthropods that protect their internal organs with hard outer shells, called the exoskeletons (Figure 3.16). Exoskeletons are made of a polysaccharide called chitin. Chitin is also found in the scales of fish and the cell walls of fungi.

Textbook excerpt #2: chitin: a type of carbohydrate that forms the outer skeleton of arthropods, such as insects and crustaceans, and the cell walls of fungi

Write me a MCQ about the types of organisms that have exoskeletons made of chitin for my 12th grade biology course. Reference our textbook excerpts on chitin as needed.`}
        completion={`Question: According to the textbook excerpts on chitin, which of the following organisms have exoskeletons made of chitin?

A) Mammals and Birds
B) Insects, Spiders, and Crabs
C) Reptiles and Amphibians
D) Plants and Algae

Correct Answer: B) Insects, Spiders, and Crabs

Explanation: The textbook excerpts clearly state that arthropods, such as insects, spiders, and crabs, protect their internal organs with hard outer shells known as exoskeletons, which are made of a polysaccharide called chitin. Additionally, it mentions that chitin is found in the cell walls of fungi, highlighting its presence in different biological groups beyond just the arthropods. However, for the purpose of this question, the focus is on organisms with exoskeletons made of chitin, making option B the correct answer.`}
        comment={`This is perhaps a bit too literal. You probably wouldn't have the question contain "according to the textbook excerpts"...`}
        show={false}
        exampleUrl={
          "https://chat.openai.com/share/ae2c6232-22ab-4483-8ec6-ba4261974406"
        }
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Notice that both the question and answers are much more specific to your
        instructional context. If you want to contextualize the MCQ within the
        broader chapter or unit, you can experiment with including multiple
        relevant excerpts from a chapter or the entire chapter itself.
      </p>
      {/* <CompletionExample
        title={`10: Multiple-choice with more context`}
        prompt={``}
        completion={``}
        comment={``}
        show={false}
        exampleUrl={""}
      /> */}
      <p className="mt-6 font-light text-md text-gray-800">
        It is worth experimenting with different contexts for different tasks.
        As we are sure you understand, trial and error is critical to figuring
        out what works best for your use cases. This concept of contextualizing
        (sometimes called ‘grounding’) your prompt could be similarly applied to
        the creation of homework assignments, lesson content, and other
        student-facing materials.
      </p>
      {/* <div id="prompt4"></div>
      <p className="mt-6 font-light text-md text-gray-800">
        Another application of contextualization is enforcing style or
        pedagogical techniques when creating instructional content or providing
        explanations to students. Let’s work through an example where students
        are not understanding a concept from a lesson, so you prepare a
        supplemental written explanation they can review at home. Let’s generate
        an explanation without any contextualization.
      </p>
      <CompletionExample
        title={`11: Explanation without details`}
        prompt={``}
        completion={``}
        comment={``}
        show={false}
        exampleUrl={""}
      />
      <CompletionExample
        title={`12: Explanation with lecture context`}
        prompt={``}
        completion={``}
        comment={``}
        show={false}
        exampleUrl={""}
      />
      <CompletionExample
        title={`13: Explanation with textbook context`}
        prompt={``}
        completion={``}
        comment={``}
        show={false}
        exampleUrl={""}
      />
      <p className="mt-6 font-light text-md text-gray-800">
        Feel free to combine textbook content and lecture notes on your own and
        see the result. As we continue to emphasize throughout this guide, it is
        possible that this additional contextualization is not worth the time or
        effort. We find that educators develop their own preferred prompts and
        use them as a first step before making manual adjustments. Nevertheless,
        the strategy of contextualizing prompts leads to a definitively better
        foundation to work with for your task.
      </p> */}
      <p className="mt-6 font-light text-md text-gray-800">
        <span className="font-semibold">
          Use lecture/textbook content to create an agent that keeps your work
          consistent.
        </span>{" "}
        Contextualizing doesn’t need to be in service of producing some text or
        image output. You can also use it in a role-playing/brainstorming
        context to ideate, receive feedback, and complete other tasks while
        being held accountable to the information/style/classroom context you
        provide as part of contextualization.
      </p>
      <p className="mt-6 font-light text-md text-gray-800">
        <span className="font-semibold">
          Summarize content for inclusion in a handout or lecture.
        </span>{" "}
        This is a more obvious application. If you ask for a summary of a
        concept without supplying any source material (for example,
        ‘mitochondria for a 9th grade audience’), the model may produce a
        result that is irrelevant to your classroom context. Really,
        contextualization is a core component of a task like summarization
        (otherwise you are asking the model to recall something from training
        and then synthesize it).
      </p>
      {/* <p className="mt-6 font-light text-md text-gray-800">
        One drawback of the contextualization approach is the search cost
        inherently imposed on you. If you have a textbook with search, a file
        system (Google Drive/your computer), or other search tools, identifying
        relevant information to include as context should be trivial.
      </p> */}
      <FooterNav pageBefore={"Examples"} pageAfter={"Use Cases"} />
    </div>
  );
}
