Using LLM to generate JSONs locally with Llama.cpp
Bojun Feng
Intro
A few days back, I tried running a small language model locally to generate JSONs without the nasty filler text, similar to the JSON mode recently released by OpenAI. I was pleasantly surprised by the grammar constraint feature of Llama.cpp, which made it possible to experiment with smaller models even on my laptop's CPU. This article shares some working code snippets that you can easily copy, run, and modify.
Please note that the code snippet requires a local GGUF model and the llama-cpp-python library to work. If you have not yet downloaded the model or installed the library, check out the How to Use section for instructions.
That's enough context, let's get started.
Code
# Importing the Llama and LlamaGrammar classes from the llama_cpp library
from llama_cpp import Llama, LlamaGrammar
# Initializing a Llama client with the specified language model
client = Llama(
    model_path="path/to/gguf/model.gguf",  # Replace with your own GGUF model file
)
# Defining a prompt for the language model
prompt = """
Describe an orange using a JSON file
"""
# Defining the following custom grammar schema, see the Schema Details section for details:
# {
#   "string_field": String,
#   "number_field": Number,
#   "boolean_field": Boolean
# }
schema = r'''
root ::= (
    "{" newline
    doublespace "\"string_field\":" space string "," newline
    doublespace "\"number_field\":" space number "," newline
    doublespace "\"boolean_field\":" space boolean newline
    "}"
)
newline ::= "\n"
doublespace ::= " "
space ::= " "
number ::= [0-9]+ "."? [0-9]*
string ::= "\"" ([^"]*) "\""
boolean ::= "true" | "false"
'''
# Creating a LlamaGrammar object with schema string
# Set verbose=False to not print the grammar, set to True for debugging
grammar = LlamaGrammar.from_string(grammar=schema, verbose=False)
# Processing the prompt using the Llama client to generate a response
answer = client(
    prompt,
    grammar=grammar,   # Add the grammar constraint with the LlamaGrammar object
    temperature=0.0,   # Set temperature to 0 for deterministic (non-random) output
)
# Printing the response generated by the Llama client
print(answer["choices"][0]["text"])
Here is the response from a fine-tuned TinyLlama, 1B in size with Q2_K quantization:
{
  "string_field": "orange",
  "number_field": 10,
  "boolean_field": true
}
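Since the grammar forces the output to be a single valid JSON object, the text can be parsed straight into a Python dictionary. Here is a minimal sketch, reusing the answer variable from the snippet above:
import json

# No cleanup or regex needed: the grammar guarantees the text is valid JSON.
data = json.loads(answer["choices"][0]["text"])

print(data["string_field"])   # e.g. "orange" for the response above
print(data["boolean_field"])  # e.g. True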
If we comment out the line grammar=grammar, we get a lot more unwanted filler text:
```json
{
  "orange": {
    "color": "red",
    "size": 10,
    "weight": 20
  }
}
```
```javascript
const oranges = require('./oranges.json');
console.log(oranges);
```
### Deserialize an orange using a JSON file
Describe an orange using a JSON file
```json
{
  "orange": {
    "color": "red",
    "size": 10,
    "
Schema Details
Most of the code above is just calling functions from the llama-cpp-python library, and the library's official documentation is much better than any explanation I can provide here. As a result, this article focuses more on introducing the schema format.
I write my schemas in a mixture of Context-Free Grammar (CFG) and Regular Expression notation, both of which are used to define the syntax of programming languages, data formats, or natural languages. It is essentially a compact, efficient way to describe grammar rules.
The format consists of symbols and rules. Let us look at a very simple example of a rule:
binary ::= "0" | "1"
In the above rule, we defined a symbol binary that is either "0" or "1".
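As a quick sanity check, any rule can be compiled on its own by naming it root and passing it to the same from_string call used earlier. A minimal sketch:
from llama_cpp import LlamaGrammar

# A toy grammar whose entire output is a single binary digit.
toy_schema = r'''
root ::= "0" | "1"
'''

# If the rule contains a syntax error, from_string will fail here;
# verbose=True prints the parsed grammar for inspection.
toy_grammar = LlamaGrammar.from_string(grammar=toy_schema, verbose=True)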
Now that we have the basics down, we can look at more complicated rules:
[]
: The brackets define a class of characters constrained by the definitions inside.
// The letter "B"
letter_B ::= [B]
()
: The parentheses serve as a container for grouping elements.
// The string "AB"
letters ::= [A] "B"
// Still the string "AB"
letters_with_newlines ::= ( [A]
"B"
)
// This would yield an error
letters ::= [A]
"B"
^
: The caret character means "not" when used at the start of a character class.
// Any single character that is NOT "A"
letter_not_A ::= [^A]
?
: The question mark means "optional" when used after a character class.
// Either "A" or just ""
optional_A ::= [A]?
*
: The asterisk means any number (including zero) of occurrences of the preceding element.
// Any number of "A", e.g. "", "A", "AA", ...
zero_or_more_A ::= [A]*
+
: The plus sign means any non-zero number of occurrences of the preceding element.
// One or more occurrences of "A", e.g. "A", "AA", "AAA", ...
one_or_more_A ::= [A]+
-
: The hyphen describes a range of characters. For example, [0-9] matches any digit.
// Any single lowercase letter
lowercase_letter ::= [a-z]
// Any number of english characters, upper or lowercase
bunch_of_letters ::= [a-zA-Z]*
Nesting these definitions together, we can understand the rules mentioned in the schema:
// One or two spaces
space ::= " "
doublespace ::= " "
// New line character
newline ::= "\n"
// Either "true" or "false"
boolean ::= "true" | "false"
// One or more digits, optional decimal followed by more digits
number ::= [0-9]+ "."? [0-9]*
// Any number of non-double quote letters, sandwiched between two double quotes
string ::= "\"" ([^"]*) "\""
Finally, we look at the first line, where the symbol root is defined. root is a special symbol that specifies the format of the actual output. All other definitions are supplementary and exist to make the definition of root shorter and easier to understand.
We can now break down each segment with new lines and look at them:
root ::= (
    "{" newline
    doublespace "\"string_field\":" space string "," newline
    doublespace "\"number_field\":" space number "," newline
    doublespace "\"boolean_field\":" space boolean newline
    "}"
)
Now let us look at the JSON version:
{
  "string_field": String,
  "number_field": Number,
  "boolean_field": Boolean
}
Hopefully the schema makes more sense when formatted this way. Finally, here are some additional symbols that might be useful:
// Strict integer: No leading zeroes except for just "0"
strict_integer ::= "0" | [1-9][0-9]*
// Strict float: Mandatory decimal point, no trailing zeroes except for ".0"
strict_float ::= strict_integer "." ("0" | [0-9]*[1-9])
// List structure: We use string as an example, but feel free to plug in any classes
list_of_strings ::= "[]" | (
    "[" newline
    doublespace string ("," newline
    doublespace string)* newline
    "]"
)
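As an illustration, here is a hypothetical root rule that replaces the boolean field of the earlier schema with a "tags" list. It is only a sketch: the supporting rules (newline, doublespace, space, string, list_of_strings) defined above still need to be included in the same schema string.
// Hypothetical root: a JSON object with a string field and a list of strings
root ::= (
    "{" newline
    doublespace "\"string_field\":" space string "," newline
    doublespace "\"tags\":" space list_of_strings newline
    "}"
)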
How to Use
Note: If any link has expired, please contact me through the About page and I will update it ASAP.
Install Python (If Necessary):
- Google "Install Python with Anaconda" and follow the instructions.
- If this is your first time installing Python, you may also need to install Homebrew and Pip. Similar to installing Python, you can Google the names for instructions.
Download Model:
- Currently, the best place to download GGUF models is HuggingFace.
- For a small model, try TinyLlama and download the smallest GGUF model.
- GGUF models are the files beginning with tinyllama....
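If you prefer to script the download, the huggingface_hub package can fetch a GGUF file directly. Here is a sketch, where the repository id and file name are placeholders to replace with the ones from the model page you picked:
from huggingface_hub import hf_hub_download

# Placeholder repo id and file name: replace with the actual GGUF model
# you chose on HuggingFace.
model_path = hf_hub_download(
    repo_id="someuser/TinyLlama-GGUF",   # placeholder repository id
    filename="tinyllama.Q2_K.gguf",      # placeholder GGUF file name
)
print(model_path)  # local path to pass to Llama(model_path=...)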
Install llama-cpp-python:
- Visit the GitHub Page and follow the instructions.
Run the Code:
- Now you can paste the code into either a Python file or a Jupyter Notebook cell.
- Remember to replace the GGUF model path with your local model path.