- Clone the repo
- Create and activate a virtual environment with either conda or venv, and make sure the environment uses at least Python 3.10
- Run `pip install -r requirements.txt` in the `lloom` folder
- Run `npm install` and `npm run dev` in the `lloom` folder. This builds the workbench, which is needed for the experiment notebooks. You can cancel the `npm run dev` process afterwards
- Create a `.env` file in the `lloom` folder and populate it with `OPENAI_API_KEY=<your key>`
- Create a `concept_logs` folder in the `lloom` folder. This is where lloom outputs can be stored, should you want to keep them
- Get the `data` folder, which contains several xlsx files, from someone and add it at the top level of the `lloom` folder (see the sketch after this list)
- If you're using the `ipynb` files to test, make sure that the kernel in the Jupyter notebook is set properly
- Keep `id_col` at the default, because the data does not have unique comment ids for each comment
- Many of the tests in `old_tests.ipynb` are probably not returning what you think they should, because the default prompts in `prompts.py` have been modified. There is definitely another way to modify the prompts than editing the defaults in place
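As a quick check that the virtual environment, `.env` file, and `data` folder are wired up correctly, a minimal sketch like the following can be run in a notebook. The xlsx filename and the use of `python-dotenv` are assumptions, not part of the repo's documented setup:

```python
# Hypothetical smoke test for the setup above; "example.xlsx" is a placeholder
# for one of the xlsx files in the data folder.
import pandas as pd
from dotenv import load_dotenv  # python-dotenv; assumed to be installed in the environment

load_dotenv()  # reads OPENAI_API_KEY from the .env file in the lloom folder

df = pd.read_excel("data/example.xlsx")
print(df.shape)
print(df.columns.tolist())  # confirm which column holds the comment text
```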
LLooM is an interactive text analysis tool introduced as part of an ACM CHI 2024 paper:
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM. Michelle S. Lam, Janice Teoh, James Landay, Jeffrey Heer, Michael S. Bernstein. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI '24).
LLooM is an interactive data analysis tool for unstructured text data, such as social media posts, paper abstracts, and articles. Manual text analysis is laborious and challenging to scale to large datasets, and automated approaches like topic modeling and clustering tend to focus on lower-level keywords that can be difficult for analysts to interpret.
By contrast, the LLooM algorithm turns unstructured text into meaningful high-level concepts that are defined by explicit inclusion criteria in natural language. For example, on a dataset of toxic online comments, while a BERTopic model outputs "women, power, female", LLooM produces concepts such as "Criticism of gender roles" and "Dismissal of women's concerns". We call this process concept induction: a computational process that produces high-level concepts from unstructured text.
The LLooM Workbench is an interactive text analysis tool that visualizes data in terms of the concepts that LLooM surfaces. With the LLooM Workbench, data analysts can inspect the automatically-generated concepts and author their own custom concepts to explore the data.
LLooM can assist with a range of data analysis goals—from preliminary exploratory analysis to theory-driven confirmatory analysis. Analysts can review LLooM concepts to interpret emergent trends in the data, but they can also author concepts to actively seek out certain phenomena in the data. Concepts can be compared with existing metadata or other concepts to perform statistical analyses, generate plots, or train a model.
Check out the Examples section to walk through case studies using LLooM, including:
- 🇺🇸📱 Political social media: Case Study | Colab NB
- 💬⚖️ Content moderation: Case Study | Colab NB
- 📄📈 HCI paper abstracts: Case Study | Colab NB
- 📝🤖 AI ethics statements: Case Study | Colab NB
After running concept induction, the Workbench can display an interactive visualization like the one above. LLooM Workbench features include:
- A: Concept Overview: Displays an overview of the dataset in terms of concepts and their prevalence.
- B: Concept Matrix: Provides an interactive summary of the concepts. Users can click on concept rows to inspect concept details and associated examples. Aids comparison between concepts and other metadata columns with user-defined slice columns.
- C: Detail View (for Concept or Slice):
- C1: Concept Details: Includes concept information like the Name, Inclusion criteria, Number of doc matches, and Representative examples.
- C2: Concept Matches and Non-Matches: Shows all input documents in table form. Includes the original text, bullet summaries, concept scores, highlighted text that exemplifies the concept, score rationale, and metadata columns.
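In a notebook, these views are opened from an existing LLooM session after scoring. A minimal sketch, assuming a scored `text_lloom` session `l`; the `slice_col` parameter and the column name are taken as assumptions from the LLooM docs rather than verified here:

```python
# Minimal sketch: open the Workbench views from a scored lloom session `l`.
# "source" is a placeholder for one of your metadata columns.
l.vis()                    # A: Concept Overview, B: Concept Matrix, C: Detail View
l.vis(slice_col="source")  # compare concepts against a user-defined slice column
```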
LLooM is a concept induction algorithm that extracts and applies concepts to make sense of unstructured text datasets. LLooM leverages large language models (specifically GPT-3.5 and GPT-4 in the current implementation) to synthesize sampled text spans, generate concepts defined by explicit criteria, apply concepts back to data, and iteratively generalize to higher-level concepts.
Follow the Get Started instructions in our documentation for a walkthrough of the main LLooM functions to run on your own dataset. We suggest starting with this template Colab Notebook.
This will involve downloading our Python package, available on PyPI as text_lloom. We recommend setting up a virtual environment with venv or conda.
`pip install text_lloom`

LLooM is a research prototype and still under active development! Feel free to reach out to Michelle Lam at mlam4@cs.stanford.edu if you have questions, run into issues, or want to contribute.
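As a rough picture of what that walkthrough covers, here is a minimal end-to-end sketch. It assumes the high-level `text_lloom` workbench API described in the LLooM documentation; the function names, default arguments, and column names below are assumptions, not checked against the installed version:

```python
# Minimal sketch of a concept induction run with text_lloom (names assumed from
# the LLooM docs; run inside a Jupyter notebook so the async calls can be awaited).
import pandas as pd
import text_lloom.workbench as wb

df = pd.read_csv("your_data.csv")          # placeholder dataset
l = wb.lloom(df=df, text_col="text")       # id_col can be left at its default

await l.gen(seed=None)                     # synthesize spans and generate concepts
l.select()                                 # review and select concepts to keep
score_df = await l.score()                 # apply selected concepts back to all documents
l.vis()                                    # open the LLooM Workbench visualization
```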
If you find this work useful to you, we'd appreciate you citing our paper!
@inproceedings{lam2024conceptInduction,
author = {Lam, Michelle S. and Teoh, Janice and Landay, James and Heer, Jeffrey and Bernstein, Michael S.},
title = {Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM},
year = {2024},
isbn = {9798400703300},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3613904.3642830},
doi = {10.1145/3613904.3642830},
booktitle = {Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems},
articleno = {933},
numpages = {28},
location = {Honolulu, HI, USA},
series = {CHI '24}
}
