Merged

Changes from all commits (25 commits)
d2fb926  Updates test snapshot (edgararuiz, May 28, 2025)
0825a26  Merge branch 'updates' of https://github.com/mlverse/mall into updates (edgararuiz, Jun 4, 2025)
37ea32c  Merge branch 'main' into updates (edgararuiz, Jun 4, 2025)
320700e  Moves use function out of polars (edgararuiz, Jun 4, 2025)
0eb41af  Standarizes llm functions (edgararuiz, Jun 4, 2025)
6a604b2  Starts building LlmVec (edgararuiz, Jun 5, 2025)
3c19e4f  Adds summarize (edgararuiz, Jun 5, 2025)
35e82f2  Adds translate and classify (edgararuiz, Jun 5, 2025)
9c7968b  Adds extract, custom and verify (edgararuiz, Jun 5, 2025)
081af9d  Starts adding documentation (edgararuiz, Jun 5, 2025)
5c858ad  First pass at full documentation (edgararuiz, Jun 5, 2025)
5607e66  Starts updating the site (edgararuiz, Jun 5, 2025)
b87ab6c  Finishes reference (edgararuiz, Jun 6, 2025)
119a5b2  Adds vector functions section for Python (edgararuiz, Jun 6, 2025)
1a95c13  Renames to LLMVec, updates docs and index (edgararuiz, Jun 6, 2025)
baafd5b  Adds vec tests (edgararuiz, Jun 9, 2025)
3073125  Adds verify vec test (edgararuiz, Jun 9, 2025)
68bc977  Adds test for ollama Client object (edgararuiz, Jun 9, 2025)
9af3945  Renames test file (edgararuiz, Jul 14, 2025)
e8787c0  Support for new ellmer (edgararuiz, Jul 25, 2025)
ca616e1  Removes test (edgararuiz, Jul 25, 2025)
28c18e1  Backing out support for new ellmer version (edgararuiz, Jul 25, 2025)
f093141  Avoids full import of ellmer (edgararuiz, Jul 25, 2025)
d00bc7e  Adds min version for ellmer dependency (edgararuiz, Jul 25, 2025)
8bc5225  Fixes version number (edgararuiz, Jul 25, 2025)
4 changes: 2 additions & 2 deletions _freeze/index/execute-results/html.json

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions _freeze/reference/LlmVec/execute-results/html.json
@@ -0,0 +1,12 @@
{
"hash": "db8aa962358a8674ed6a69098f0ff7ea",
"result": {
"engine": "jupyter",
"markdown": "---\ntitle: LLMVec\n---\n\n\n\n```python\nLLMVec(backend='', model='', _cache='_mall_cache', **kwargs)\n```\n\nClass that adds ability to use an LLM to run batch predictions\n\n\n::: {#caadb44f .cell execution_count=1}\n``` {.python .cell-code}\nfrom chatlas import ChatOllama\nfrom mall import LLMVec\n\nchat = ChatOllama(model = \"llama3.2\")\n\nllm = LLMVec(chat) \n```\n:::\n\n\n## Methods\n\n| Name | Description |\n| --- | --- |\n| [classify](#mall.LLMVec.classify) | Classify text into specific categories. |\n| [custom](#mall.LLMVec.custom) | Provide the full prompt that the LLM will process. |\n| [extract](#mall.LLMVec.extract) | Pull a specific label from the text. |\n| [sentiment](#mall.LLMVec.sentiment) | Use an LLM to run a sentiment analysis |\n| [summarize](#mall.LLMVec.summarize) | Summarize the text down to a specific number of words. |\n| [translate](#mall.LLMVec.translate) | Translate text into another language. |\n| [verify](#mall.LLMVec.verify) | Check to see if something is true about the text. |\n\n### classify { #mall.LLMVec.classify }\n\n```python\nLLMVec.classify(x, labels='', additional='')\n```\n\nClassify text into specific categories.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------|-------------------------------------------------------------------------------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| labels | list | A list or a DICT object that defines the categories to classify the text as. It will return one of the provided labels. | `''` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples {.doc-section .doc-section-examples}\n\n::: {#f0de90cb .cell execution_count=2}\n``` {.python .cell-code}\nllm.classify(['this is important!', 'there is no rush'], ['urgent', 'not urgent'])\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```\n['urgent', None]\n```\n:::\n:::\n\n\n### custom { #mall.LLMVec.custom }\n\n```python\nLLMVec.custom(x, prompt='', valid_resps='')\n```\n\nProvide the full prompt that the LLM will process.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|--------|--------|----------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| prompt | str | The prompt to send to the LLM along with the `col` | `''` |\n\n### extract { #mall.LLMVec.extract }\n\n```python\nLLMVec.extract(x, labels='', additional='')\n```\n\nPull a specific label from the text.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------|--------------------------------------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| labels | list | A list or a DICT object that defines tells the LLM what to look for and return | `''` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples {.doc-section .doc-section-examples}\n\n::: {#49687301 .cell execution_count=3}\n``` {.python .cell-code}\nllm.extract(['bob smith, 123 3rd street'], labels=['name', 'address'])\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```\n['| bob smith | 123 3rd street |']\n```\n:::\n:::\n\n\n### sentiment { #mall.LLMVec.sentiment }\n\n```python\nLLMVec.sentiment(x, options=['positive', 'negative', 'neutral'], 
additional='')\n```\n\nUse an LLM to run a sentiment analysis\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------------|----------------------------------------------------------------|---------------------------------------|\n| x | list | A list of texts | _required_ |\n| options | list or dict | A list of the sentiment options to use, or a named DICT object | `['positive', 'negative', 'neutral']` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples {.doc-section .doc-section-examples}\n\n::: {#af1bc6cc .cell execution_count=4}\n``` {.python .cell-code}\nllm.sentiment(['I am happy', 'I am sad'])\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\n['positive', 'negative']\n```\n:::\n:::\n\n\n### summarize { #mall.LLMVec.summarize }\n\n```python\nLLMVec.summarize(x, max_words=10, additional='')\n```\n\nSummarize the text down to a specific number of words.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------|---------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| max_words | int | Maximum number of words to use for the summary | `10` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples {.doc-section .doc-section-examples}\n\n::: {#1960bc54 .cell execution_count=5}\n``` {.python .cell-code}\nllm.summarize(['This has been the best TV Ive ever used. Great screen, and sound.'], max_words = 5)\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```\n['this tv has exceeded expectations']\n```\n:::\n:::\n\n\n### translate { #mall.LLMVec.translate }\n\n```python\nLLMVec.translate(x, language='', additional='')\n```\n\nTranslate text into another language.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------|------------------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| language | str | The target language to translate to. For example 'French'. | `''` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples {.doc-section .doc-section-examples}\n\n::: {#84be8518 .cell execution_count=6}\n``` {.python .cell-code}\nllm.translate(['This has been the best TV Ive ever used. Great screen, and sound.'], language = 'spanish')\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n['Esto ha sido la mejor televisión que he tenido, gran pantalla y sonido.']\n```\n:::\n:::\n\n\n### verify { #mall.LLMVec.verify }\n\n```python\nLLMVec.verify(x, what='', yes_no=[1, 0], additional='')\n```\n\nCheck to see if something is true about the text.\n\n#### Parameters {.doc-section .doc-section-parameters}\n\n| Name | Type | Description | Default |\n|------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|\n| x | list | A list of texts | _required_ |\n| what | str | The statement or question that needs to be verified against the provided text | `''` |\n| yes_no | list | A positional list of size 2, which contains the values to return if true and false. 
The first position will be used as the 'true' value, and the second as the 'false' value | `[1, 0]` |\n| additional | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n",
"supporting": [
"LLMVec_files"
],
"filters": [],
"includes": {}
}
}
7 changes: 6 additions & 1 deletion _quarto.yml
@@ -59,10 +59,15 @@ quartodoc:
out_index: _api_index.qmd
dynamic: true
sections:
- title: mall
- title: Polars
desc: ''
contents:
- name: MallFrame
- title: Vectors
desc: ''
contents:
- name: LLMVec


pkgsite:
dir: r
45 changes: 39 additions & 6 deletions index.qmd
@@ -16,6 +16,8 @@ library(dbplyr)
library(tictoc)
library(DBI)
source("site/knitr-print.R")

reticulate::use_virtualenv("python/.venv")
```

```{python}
@@ -277,11 +279,6 @@ reviews
```
:::

```{python}
#| include: false
reviews.llm.use(options = dict(seed = 100), _cache = "_readme_cache")
```

### Sentiment {#sentiment}

Automatically returns "positive", "negative", or "neutral" based on the text.
@@ -597,17 +594,53 @@ encouraging results. But you, as a user, will still need to keep in mind that
the predictions will not be infallible, so always check the output. At this time,
I think the best use for this method is for a quick analysis.

## Vector functions (R only)
## Vector functions

::: {.panel-tabset group="language"}
## R

`mall` includes functions that expect a vector, instead of a table, to run the
predictions. This makes it easier to test things such as custom prompts, or to
check the results for specific text. Each `llm_` function has a corresponding `llm_vec_`
function:


```{r}
llm_vec_sentiment("I am happy")
```

```{r}
llm_vec_translate("Este es el mejor dia!", "english")
```

## Python

`mall` is also able to process vectors contained in a `list` object. This makes it
possible to run the predictions over a list of texts without first converting them
into a single-column data frame. To use it, initialize a new `LLMVec` class object
with either an Ollama model or a `chatlas` `Chat` object, and then access the same
NLP functions as the Polars extension.

```{python}
# Initialize a Chat object
from chatlas import ChatOllama
chat = ChatOllama(model = "llama3.2")

# Pass it to a new LLMVec
from mall import LLMVec
llm = LLMVec(chat)
```

Access the functions via the new LLMVec object, and pass the text to be processed.

```{python}
llm.sentiment(["I am happy", "I am sad"])
```

```{python}
llm.translate(["Este es el mejor dia!"], "english")
```
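
The reference page above lists the signature `LLMVec(backend='', model='', _cache='_mall_cache', **kwargs)`, which suggests a backend name and model can also be passed directly instead of a `chatlas` `Chat` object. A minimal sketch, assuming the Ollama backend is selected by name in the same way as the Polars extension (this argument handling is an assumption, not documented behavior):

```python
# A minimal sketch, assuming LLMVec accepts a backend name and model directly,
# per the LLMVec(backend='', model='', ...) signature shown in the reference.
# The exact argument handling here is an assumption, not documented behavior.
from mall import LLMVec

llm = LLMVec("ollama", "llama3.2")

# Same vector-style calls as with a chatlas-backed LLMVec
llm.sentiment(["I am happy", "I am sad"])
llm.translate(["Este es el mejor dia!"], "english")
```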


:::
