Chemistry LibreTexts

6.1: Harvesting Workflows

    Development of content in the LibreTexts platform proceeds via five mechanisms operating in parallel:

    • Mechanism 1: Student contribution from scratch (e.g., via extra credit)
    • Mechanism 2: Student integration of existing content outside the LibreTexts platform from faculty and experts
    • Mechanism 3: Faculty construction of raw content from scratch
    • Mechanism 4: Faculty integration of existing content outside the LibreTexts platform from faculty and experts
    • Mechanism 5: Faculty and student re-editing of content already on the LibreTexts platform (with permission)

    Modules from all five mechanisms are processed through a sophisticated vetting structure, involving students and faculty (see below), to eventually guarantee reliable, fully-vetted content that is capable of substituting for that found in current paper-based textbooks. Content from higher ranked mechanisms typically supersedes content from lower ranked mechanisms, resulting in the continual evolution of the LibreTexts content.

    A range of workflows is possible for constructing texts, and selecting the best workflow is an individual or team decision. Several tools facilitate construction efforts, of which the Remixer is the most powerful.


    Harvesting projects follow different workflows depending on the nature and format of the content. Below are several established workflows that may be followed in harvesting projects.

    From Scratch

    • Step 1: Build the empty Text Skeleton - Use Remixer (New Remix mode)
    • Step 2: Add content individually to each page (i.e., page-by-page copying and pasting or direct editing)
    • Step 3: Edit and typeset pages individually as needed
      • Use Remixer to reorganize pages
      • Use page editor for curating content on pages

    From Website (online)

    • Step 1: Build the empty Text Skeleton - Use Remixer (New Remix mode)
    • Step 2: Add content individually to each page (i.e., page-by-page copying and pasting or direct editing)
    • Step 3: Edit and typeset pages individually as needed
      • Use Remixer to reorganize pages
      • Use page editor for curating content on pages
      • Use Mathpix for OCRing equations as needed

    From PDF

    Mechanism I

    • Pull PDF into Google Doc (via uploading to Google Drive)
    • Follow Google Doc workflow

    Mechanism II

    • Build the empty Text Skeleton
    • Copy & Paste content into each page individually
    • Edit and typeset pages as needed
    • Use Mathpix for OCRing equations as needed

    From Word

    Mechanism I

    • Pull the Word file into Google Doc (first upload the file to Google Drive, then select "Open in Google Doc")
    • Follow Google Doc workflow

    Mechanism II

    • Build the empty Text Skeleton
    • Copy & Paste content into each page individually
    • Clear formatting as needed (Word generates messy code)
    • Edit and typeset pages as needed
    • Use Mathpix for OCRing equations as needed
    • Use GrindEQ to convert MS equations to LaTeX
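
    The "clear formatting" step can also be done outside the page editor, before pasting. The following is a minimal sketch, not a LibreTexts tool: it assumes the Word content is available as an HTML string, and the names `clean_word_html`, `DROP_ATTRS`, and `UNWRAP_TAGS` are hypothetical. The lists of attributes and tags to strip are illustrative guesses at typical Word residue, not an exhaustive inventory.

```python
from html import escape
from html.parser import HTMLParser

# Assumed (illustrative) lists of Word residue; adjust to the actual source.
DROP_ATTRS = {"style", "class", "lang"}
UNWRAP_TAGS = {"span", "font"}


def _skip(tag):
    # Drop wrapper tags and namespaced Office tags such as <o:p>.
    return tag in UNWRAP_TAGS or ":" in tag


def _open_tag(tag, attrs, closing=">"):
    kept = [(k, v) for k, v in attrs if k not in DROP_ATTRS]
    text = "".join(
        f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
        for k, v in kept
    )
    return f"<{tag}{text}{closing}"


class WordHtmlCleaner(HTMLParser):
    """Re-emit HTML without presentational attributes or wrapper tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if not _skip(tag):
            self.parts.append(_open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if not _skip(tag):
            self.parts.append(_open_tag(tag, attrs, closing=" />"))

    def handle_endtag(self, tag):
        if not _skip(tag):
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))

    # Comments (including Word's conditional comments) are dropped because
    # handle_comment is not overridden.


def clean_word_html(html_text):
    cleaner = WordHtmlCleaner()
    cleaner.feed(html_text)
    cleaner.close()
    return "".join(cleaner.parts)
```

    The sketch keeps structural tags (paragraphs, lists, emphasis) and removes only presentation, so the result still needs the "edit and typeset" pass described in the steps above.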

    From Google Doc

    Mechanism I

    • Convert to EPUB file
    • Use LT Importer to import EPUB file (admin use only)
    • Edit and typeset pages as needed

    Mechanism II

    • Build the empty Text Skeleton
    • Copy & Paste content into each page individually
    • Edit and typeset pages as needed
    • Use Mathpix for OCRing equations as needed

    From EPUB/Common Cartridge

    • Use LT Importer to import EPUB file
    • Edit and typeset pages as needed

    If the EPUB is unvalidated (poorly constructed), the LT Importer will not generate all pages and will push chapter content into single pages. If this happens, use Remixer (in remix mode) to flesh out the text generated by the Importer (e.g., add relevant pages).
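
    A quick structural check before importing can catch some poorly constructed files early. The sketch below is an assumption-laden illustration, not the check the LT Importer performs: it only tests the basic EPUB container layout (a ZIP archive whose first entry is `mimetype` containing `application/epub+zip`, plus `META-INF/container.xml`). The function name `quick_epub_check` is hypothetical. A file that passes may still fail full validation with a dedicated validator such as EPUBCheck.

```python
import zipfile


def quick_epub_check(path):
    """Return a list of container-level problems; an empty list means none were spotted."""
    if not zipfile.is_zipfile(path):
        return ["not a ZIP archive"]

    problems = []
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        # The EPUB container format expects 'mimetype' as the first archive entry.
        if not names or names[0] != "mimetype":
            problems.append("'mimetype' is not the first entry in the archive")
        elif archive.read("mimetype").strip() != b"application/epub+zip":
            problems.append("'mimetype' does not contain application/epub+zip")
        # container.xml points the reading system at the package document.
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
    return problems
```

    Calling `quick_epub_check("book.epub")` (the filename is a placeholder) returns an empty list when the container layout looks sound; any returned messages suggest the file may need repair or the Remixer clean-up described above.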

    From LaTeX Source

    • Build the empty Text Skeleton
    • Use Pandoc to convert LaTeX source code to HTML/MathJax code
      • The command line code to use is "pandoc --mathjax -t html5 input.tex -o output.txt"
    • Copy & Paste content into each page individually (as HTML source rather than through the GUI front end)
    • Edit and typeset pages as needed
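
    When the LaTeX source is split across several files, the Pandoc step can be scripted. The sketch below is one possible approach under stated assumptions: Pandoc is installed and on the PATH, each `.tex` file in a folder corresponds to one page, and the names `build_pandoc_command` and `convert_tree` are hypothetical. The Pandoc flags are the ones quoted in the workflow; the `.html` output extension is a choice made here (the workflow's own example writes to `output.txt`).

```python
import subprocess
from pathlib import Path


def build_pandoc_command(tex_path, out_path):
    """Return the argument list for the Pandoc call quoted in the workflow."""
    return ["pandoc", "--mathjax", "-t", "html5", str(tex_path), "-o", str(out_path)]


def convert_tree(source_dir, output_dir, run=subprocess.run):
    """Convert every .tex file directly inside source_dir, one output file per input.

    `run` is injectable so the function can be exercised without Pandoc installed.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for tex_path in sorted(source_dir.glob("*.tex")):
        out_path = output_dir / (tex_path.stem + ".html")
        # check=True raises CalledProcessError if Pandoc reports a failure.
        run(build_pandoc_command(tex_path, out_path), check=True)
        written.append(out_path)
    return written
```

    Each output file then holds the HTML source for one page, ready to be pasted into the page's HTML view as described in the steps above.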

    From Pressbooks

    Mechanism I (if EPUB export option is available)

    • Export text into EPUB
    • Use LT Importer to import EPUB file (admin use only)
    • Edit and typeset pages as needed
    • Use Mathpix for OCRing equations as needed

    Mechanism II (if EPUB export option is not available)

    • Import into LT Pressbooks instance (admin use only)
    • Export text into EPUB
    • Use LT Importer to import EPUB file (admin use only)
    • Edit and typeset pages as needed
    • Use Mathpix for OCRing equations as needed

    From Lumen (modified Pressbooks)

    The Lumen platform is a slightly modified Pressbooks platform with no export options (its OER content is fettered).

    • Import into LT Pressbooks instance (admin use only)
    • Export text into EPUB
    • Use LT Importer to import EPUB file (admin use only)
    • Edit and typeset pages as needed

    From EPUB3

    Direct Importer under development

    • Pull into EPUB reader and follow Website workflow

    From XML

    Direct Importer under development

    From PreTeXt

    Requires admin access to the importer to harvest.
