Examples¶
Validate with Pydantic¶
This example shows how to use pydantic to validate and parse a NestedText file. The file in this case specifies deployment settings for a web server:
debug: false
secret_key: t=)40**y&883y9gdpuw%aiig+wtc033(ui@^1ur72w#zhw3_ch
allowed_hosts:
    - www.example.com
database:
    engine: django.db.backends.mysql
    host: db.example.com
    port: 3306
    user: www
webmaster_email: admin@example.com
Below is the code to parse this file. Note that basic types like integers, strings, Booleans, and lists are specified using standard type annotations. Dictionaries with specific keys are represented by model classes, and it is possible to reference one model from within another. Pydantic also has built-in support for validating email addresses, which we can take advantage of here:
#!/usr/bin/env python3

import nestedtext as nt
from pydantic import BaseModel, EmailStr
from typing import List
from pprint import pprint

class Database(BaseModel):
    engine: str
    host: str
    port: int
    user: str

class Config(BaseModel):
    debug: bool
    secret_key: str
    allowed_hosts: List[str]
    database: Database
    webmaster_email: EmailStr

obj = nt.load('deploy.nt')
config = Config.parse_obj(obj)
pprint(config.dict())
This produces the following data structure:
{'allowed_hosts': ['www.example.com'],
 'database': {'engine': 'django.db.backends.mysql',
              'host': 'db.example.com',
              'port': 3306,
              'user': 'www'},
 'debug': False,
 'secret_key': 't=)40**y&883y9gdpuw%aiig+wtc033(ui@^1ur72w#zhw3_ch',
 'webmaster_email': 'admin@example.com'}
Validate with Voluptuous¶
This example shows how to use voluptuous to validate and parse a NestedText file. The input file is the same as in the previous example, i.e. deployment settings for a web server:
debug: false
secret_key: t=)40**y&883y9gdpuw%aiig+wtc033(ui@^1ur72w#zhw3_ch
allowed_hosts:
    - www.example.com
database:
    engine: django.db.backends.mysql
    host: db.example.com
    port: 3306
    user: www
webmaster_email: admin@example.com
Below is the code to parse this file. Note how the structure of the data is specified using basic Python objects. The Coerce() function is necessary to have voluptuous convert string input to the given type; otherwise it would simply check that the input matches the given type:
#!/usr/bin/env python3

import nestedtext as nt
from voluptuous import Schema, Boolean, Coerce, Invalid
from inform import fatal, full_stop
from pprint import pprint

schema = Schema({
    # Boolean() maps the text 'false' to False; Coerce(bool) would not,
    # because bool() of any non-empty string is True
    'debug': Boolean(),
    'secret_key': str,
    'allowed_hosts': [str],
    'database': {
        'engine': str,
        'host': str,
        'port': Coerce(int),
        'user': str,
    },
    'webmaster_email': str,
})

try:
    keymap = {}
    raw = nt.load('deploy.nt', keymap=keymap)
    config = schema(raw)
except nt.NestedTextError as e:
    e.terminate()
except Invalid as e:
    kind = 'key' if 'key' in e.msg else 'value'
    loc = keymap[tuple(e.path)]
    fatal(full_stop(e.msg), culprit=e.path, codicil=loc.as_line(kind))

pprint(config)
This produces the following data structure:
{'allowed_hosts': ['www.example.com'],
 'database': {'engine': 'django.db.backends.mysql',
              'host': 'db.example.com',
              'port': 3306,
              'user': 'www'},
 'debug': False,
 'secret_key': 't=)40**y&883y9gdpuw%aiig+wtc033(ui@^1ur72w#zhw3_ch',
 'webmaster_email': 'admin@example.com'}
This example demonstrates how to use the keymap argument from loads() or load() to add location information to Voluptuous error messages.
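To make the difference between checking a type and coercing to it concrete, here is a minimal sketch that uses only the standard library. The helpers check and coerce are hypothetical stand-ins written for this illustration; they are not Voluptuous functions. The sketch relies on the fact that NestedText delivers every scalar as a string, so port arrives as '3306'. It also shows why coercing with bool deserves care: bool() tests whether a string is empty rather than parsing its text.

```python
def check(type_):
    """Return a validator that only accepts values already of type_."""
    def validate(value):
        if not isinstance(value, type_):
            raise TypeError(f'expected {type_.__name__}, found {value!r}')
        return value
    return validate

def coerce(type_):
    """Return a validator that converts the value by calling type_ on it."""
    def validate(value):
        return type_(value)
    return validate

raw_port = '3306'                 # NestedText scalars are always strings

port = coerce(int)(raw_port)      # conversion succeeds and yields an int
try:
    check(int)(raw_port)          # a plain type check rejects the string
    check_passed = True
except TypeError:
    check_passed = False

# bool() tests for emptiness rather than parsing the text
naive_debug = coerce(bool)('false')
print(port, check_passed, naive_debug)
```

Because of the last point, a Boolean field read from NestedText generally needs a converter that interprets the text itself.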
JSON to NestedText¶
This example implements a command-line utility that converts a JSON file to NestedText. It demonstrates the use of dumps() and NestedTextError.
#!/usr/bin/env python3
"""
Read a JSON file and convert it to NestedText.

usage:
    json-to-nestedtext [options] [<filename>]

options:
    -f, --force            force overwrite of output file
    -i <n>, --indent <n>   number of spaces per indent [default: 4]
    -w <n>, --width <n>    desired maximum line width; specifying enables
                           use of single-line lists and dictionaries as long
                           as they fit in the given width [default: 0]

If <filename> is not given, JSON input is taken from stdin and NestedText output
is written to stdout.
"""

from docopt import docopt
from inform import done, fatal, full_stop, os_error, warn
from pathlib import Path
import json
import nestedtext as nt
import sys

sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')

cmdline = docopt(__doc__)
input_filename = cmdline['<filename>']
try:
    indent = int(cmdline['--indent'])
except Exception:
    warn('expected positive integer for indent.', culprit=cmdline['--indent'])
    indent = 4
try:
    width = int(cmdline['--width'])
except Exception:
    warn('expected non-negative integer for width.', culprit=cmdline['--width'])
    width = 0

try:
    # read JSON content; from file or from stdin
    if input_filename:
        input_path = Path(input_filename)
        json_content = input_path.read_text(encoding='utf-8')
    else:
        json_content = sys.stdin.read()
    data = json.loads(json_content)

    # convert to NestedText
    nestedtext_content = nt.dumps(data, indent=indent, width=width) + "\n"

    # output NestedText content; to file or to stdout
    if input_filename:
        output_path = input_path.with_suffix('.nt')
        if output_path.exists():
            if not cmdline['--force']:
                fatal('file exists, use -f to force over-write.', culprit=output_path)
        output_path.write_text(nestedtext_content, encoding='utf-8')
    else:
        sys.stdout.write(nestedtext_content)

except OSError as e:
    fatal(os_error(e))
except nt.NestedTextError as e:
    e.terminate()
except KeyboardInterrupt:
    done()
except json.JSONDecodeError as e:
    # create a nice error message with surrounding context
    msg = e.msg
    culprit = input_filename
    codicil = None
    try:
        lineno = e.lineno
        culprit = (culprit, lineno)
        colno = e.colno
        # up to two lines ending with the line that contains the error
        lines_before = e.doc.split('\n')[lineno-2:lineno]
        lines = []
        for i, l in zip(range(lineno-len(lines_before), lineno), lines_before):
            lines.append(f'{i+1:>4}> {l}')
        lines_before = '\n'.join(lines)
        # one line of trailing context
        lines_after = e.doc.split('\n')[lineno:lineno+1]
        lines = []
        for i, l in zip(range(lineno, lineno + len(lines_after)), lines_after):
            lines.append(f'{i+1:>4}> {l}')
        lines_after = '\n'.join(lines)
        # the marker is offset by the width of the line-number prefix
        codicil = f"{lines_before}\n     {colno*' '}▲\n{lines_after}"
    except Exception:
        pass
    fatal(full_stop(msg), culprit=culprit, codicil=codicil)
Be aware that not all JSON data can be converted to NestedText, and in the conversion much of the type information is lost.
json-to-nestedtext can be used as a JSON pretty printer:
> json-to-nestedtext < fumiko.json
treasurer:
    name: Fumiko Purvis
    address:
        > 3636 Buffalo Ave
        > Topeka, Kansas 20692
    phone: 1-268-555-0280
    email: fumiko.purvis@hotmail.com
    additional roles:
        - accounting task force
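The error handler in the script above builds its message from the lineno, colno, and doc attributes of json.JSONDecodeError. The following sketch isolates that idea using only the standard library. The function name error_context and the sample input are made up for illustration, and the layout is simplified to show only the lines leading up to the error.

```python
import json

def error_context(err):
    """Format the lines leading up to a JSONDecodeError with a column marker."""
    lines = err.doc.split('\n')
    first = max(err.lineno - 2, 0)          # index of the first line to show
    shown = lines[first:err.lineno]         # ends with the offending line
    numbered = [f'{first + i + 1:>4}> {line}' for i, line in enumerate(shown)]
    # the prefix is 6 characters wide and colno counts from 1
    marker = ' ' * (5 + err.colno) + '▲'
    return '\n'.join(numbered + [marker])

bad_json = '{"name": "Fumiko",\n "phone": oops}'
error = None
context = None
try:
    json.loads(bad_json)
except json.JSONDecodeError as err:
    error = err                             # keep a reference for later use
    context = error_context(err)
    print(f'{err.msg} (line {err.lineno}, column {err.colno})')
    print(context)
```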
NestedText to JSON¶
This example implements a command-line utility that converts a NestedText file to JSON. It demonstrates the use of load() and NestedTextError.
#!/usr/bin/env python3
"""
Read a NestedText file and convert it to JSON.

usage:
    nestedtext-to-json [options] [<filename>]

options:
    -f, --force   force overwrite of output file
    -d, --dedup   de-duplicate keys in dictionaries

If <filename> is not given, NestedText input is taken from stdin and JSON output
is written to stdout.
"""

from docopt import docopt
from inform import done, fatal, os_error
from pathlib import Path
import json
import nestedtext as nt
import sys

sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')

def de_dup(key, value, data, state):
    # rename a duplicate key by appending a running count
    if key not in state:
        state[key] = 1
    state[key] += 1
    return f"{key}#{state[key]}"

cmdline = docopt(__doc__)
input_filename = cmdline['<filename>']
# de-duplicate only when --dedup is given; otherwise keep the default handling
on_dup = de_dup if cmdline['--dedup'] else None

try:
    if input_filename:
        input_path = Path(input_filename)
        data = nt.load(input_path, top='any', on_dup=on_dup)
        json_content = json.dumps(data, indent=4, ensure_ascii=False)
        output_path = input_path.with_suffix('.json')
        if output_path.exists():
            if not cmdline['--force']:
                fatal('file exists, use -f to force over-write.', culprit=output_path)
        output_path.write_text(json_content, encoding='utf-8')
    else:
        data = nt.load(sys.stdin, top='any', on_dup=on_dup)
        json_content = json.dumps(data, indent=4, ensure_ascii=False)
        sys.stdout.write(json_content + '\n')
except OSError as e:
    fatal(os_error(e))
except nt.NestedTextError as e:
    e.terminate()
except KeyboardInterrupt:
    done()
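To see what the de_dup function in the script does to repeated keys, it can be exercised on its own. The loop below is a hypothetical driver, not part of NestedText; it assumes the loader calls the function once for each duplicate key and passes the same state dictionary every time, which is what the script's use of state implies.

```python
def de_dup(key, value, data, state):
    # same logic as in the script: the first occurrence keeps its name,
    # the first duplicate becomes key#2, the next key#3, and so on
    if key not in state:
        state[key] = 1
    state[key] += 1
    return f"{key}#{state[key]}"

# hypothetical driver that mimics a loader meeting the key "name" three times
data = {}
state = {}
for key, value in [('name', 'Odin'), ('name', 'Thor'), ('name', 'Loki')]:
    if key in data:
        key = de_dup(key, value, data, state)
    data[key] = value

print(data)
```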
CSV to NestedText¶
This example implements a command-line utility that converts a CSV file to NestedText. It demonstrates the use of the converters argument to dumps(), which is used to cull empty dictionary fields.
#!/usr/bin/env python3
"""
Read a CSV file and convert it to NestedText.

usage:
    csv-to-nestedtext [options] [<filename>]

options:
    -n, --names            first row contains column names
    -c, --cull             remove empty fields (only for --names)
    -f, --force            force overwrite of output file
    -i <n>, --indent <n>   number of spaces per indent [default: 4]

If <filename> is not given, csv input is taken from stdin and NestedText output
is written to stdout.

If --names is specified, then the first line is assumed to hold the column/field
names with the remaining lines containing the data. In this case the output is
a list of dictionaries. Otherwise every line contains data and that data is
output as a list of lists.
"""

from docopt import docopt
from inform import cull, done, fatal, full_stop, os_error, warn
from pathlib import Path
import csv
import nestedtext as nt
import sys

sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')

cmdline = docopt(__doc__)
input_filename = cmdline['<filename>']
try:
    indent = int(cmdline['--indent'])
except Exception:
    warn('expected positive integer for indent.', culprit=cmdline['--indent'])
    indent = 4

# strip dictionaries of empty fields if requested
converters = {dict: cull} if cmdline['--cull'] else {}

try:
    # read CSV content; from file or from stdin
    if input_filename:
        input_path = Path(input_filename)
        csv_content = input_path.read_text(encoding='utf-8')
    else:
        csv_content = sys.stdin.read()
    if cmdline['--names']:
        data = csv.DictReader(csv_content.splitlines())
    else:
        data = csv.reader(csv_content.splitlines())

    # convert to NestedText
    nt_content = nt.dumps(data, indent=indent, converters=converters) + "\n"

    # output NestedText content; to file or to stdout
    if input_filename:
        output_path = input_path.with_suffix('.nt')
        if output_path.exists():
            if not cmdline['--force']:
                fatal('file exists, use -f to force over-write.', culprit=output_path)
        output_path.write_text(nt_content, encoding='utf-8')
    else:
        sys.stdout.write(nt_content)

except OSError as e:
    fatal(os_error(e))
except nt.NestedTextError as e:
    e.terminate()
except csv.Error as e:
    fatal(full_stop(e), culprit=(input_filename, data.line_num))
except KeyboardInterrupt:
    done()
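The effect of the --names and --cull options can be previewed with the standard library alone. In this sketch cull_empty is a hypothetical stand-in for the culling converter, written for illustration; the cull function from inform may differ in exactly which values it treats as empty. The sample rows are made up.

```python
import csv

def cull_empty(row):
    """Hypothetical stand-in for culling: drop fields whose value is empty."""
    return {k: v for k, v in row.items() if v}

csv_content = (
    'name,phone,email\n'
    'Fumiko,1-268-555-0280,\n'
    'Odin,,odin@norse-gods.com\n'
)

# with column names, each row becomes a dictionary
with_names = list(csv.DictReader(csv_content.splitlines()))
culled = [cull_empty(row) for row in with_names]

# without column names, each row becomes a list of strings
without_names = list(csv.reader(csv_content.splitlines()))

print(culled)
print(without_names)
```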
PyTest¶
This example highlights a PyTest package parametrize_from_file that allows you to neatly separate your test code from your test cases; the test cases being held in a NestedText file. Since test cases often contain code snippets, the ability of NestedText to hold arbitrary strings without the need for quoting or escaping results in very clean and simple test case specifications. Also, use of the eval function in the test code allows the fields in the test cases to be literal Python code.
The test cases:
# test_misc.nt
test_substitution:
    -
        given: first second
        search: ^\s*(\w+)\s*(\w+)\s*$
        replace: \2 \1
        expected: second first
    -
        given: 4 * 7
        search: ^\s*(\d+)\s*([-+*/])\s*(\d+)\s*$
        replace: \1 \3 \2
        expected: 4 7 *
test_expression:
    -
        given: 1 + 2
        expected: 3
    -
        given: "1" + "2"
        expected: "12"
    -
        given: pathlib.Path("/") / "tmp"
        expected: pathlib.Path("/tmp")
And the corresponding test code:
# test_misc.py

import parametrize_from_file
import re
import pathlib

@parametrize_from_file
def test_substitution(given, search, replace, expected):
    assert re.sub(search, replace, given) == expected

@parametrize_from_file
def test_expression(given, expected):
    assert eval(given) == eval(expected)
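To see what each test case amounts to once its fields reach the test functions, the cases can be written out by hand and run without PyTest. This is only a sketch: the list names and the check_ functions are hypothetical, and the values are copied from the test case file as the plain strings NestedText would deliver.

```python
import re
import pathlib

substitution_cases = [
    dict(given='first second', search=r'^\s*(\w+)\s*(\w+)\s*$',
         replace=r'\2 \1', expected='second first'),
    dict(given='4 * 7', search=r'^\s*(\d+)\s*([-+*/])\s*(\d+)\s*$',
         replace=r'\1 \3 \2', expected='4 7 *'),
]
expression_cases = [
    dict(given='1 + 2', expected='3'),
    dict(given='"1" + "2"', expected='"12"'),
    dict(given='pathlib.Path("/") / "tmp"', expected='pathlib.Path("/tmp")'),
]

def check_substitution(given, search, replace, expected):
    # every field is used as a string, exactly as loaded
    return re.sub(search, replace, given) == expected

def check_expression(given, expected):
    # both fields are Python source, so both are evaluated before comparing
    return eval(given) == eval(expected)

results = [check_substitution(**case) for case in substitution_cases]
results += [check_expression(**case) for case in expression_cases]
print(results)
```

Since eval executes whatever the fields contain, this approach is suitable only for test cases you control.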
Pretty Printing¶
Besides being a readable file format, NestedText makes a reasonable display format for structured data. You can further simplify the output by stripping leading multiline string tags if you so desire.
>>> import nestedtext as nt
>>> import re
>>>
>>> def strip_nestedtext(text):
...     return re.sub(r'^(\s*)[>:]\s?(.*)$', r'\1\2', text, flags=re.M)
>>> addresses = nt.load('examples/address.nt')
>>> print(strip_nestedtext(nt.dumps(addresses['treasurer'], default=repr)))
name: Fumiko Purvis
address:
    3636 Buffalo Ave
    Topeka, Kansas 20692
phone: 1-268-555-0280
email: fumiko.purvis@hotmail.com
additional roles:
    - accounting task force
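The substitution can be tried without a NestedText file by applying it to literal text. In this sketch the input is typed in by hand to match the treasurer entry; only the standard library is needed.

```python
import re

def strip_nestedtext(text):
    # drop a leading '>' or ':' tag and the single space after it,
    # keeping the indentation that precedes the tag
    return re.sub(r'^(\s*)[>:]\s?(.*)$', r'\1\2', text, flags=re.M)

nestedtext = '\n'.join([
    'name: Fumiko Purvis',
    'address:',
    '    > 3636 Buffalo Ave',
    '    > Topeka, Kansas 20692',
    'phone: 1-268-555-0280',
])

stripped = strip_nestedtext(nestedtext)
print(stripped)
```

Only lines that begin with a tag are changed; the colon after a key such as name is left alone because it does not come first on its line.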
Cryptocurrency holdings¶
This example implements a command-line utility that displays the current value of cryptocurrency holdings. The program starts by reading a settings file held in ~/.config/cc that in this case holds:
holdings:
    - 5 BTC
    - 50 ETH
    - 50,000 XLM
currency: USD
date format: h:mm A, dddd MMMM D
screen width: 90
This file, of course, is in NestedText format. After being read by load() it is processed by a voluptuous schema that does some checking on the form of the values specified and then converts the holdings to a list of QuantiPhy quantities. The latest prices are then downloaded from cryptocompare, the value of the holdings is computed, and the results are displayed. The result looks like this:
Holdings as of 11:18 AM, Wednesday September 2.
5 BTC = $56.8k @ $11.4k/BTC 68.4% ████████████████████████████████████▏
50 ETH = $21.7k @ $434/ETH 26.1% █████████████▊
50 kXLM = $4.6k @ $92m/XLM 5.5% ██▉
Total value = $83.1k.
And finally, the code:
#!/usr/bin/env python3

import nestedtext as nt
from voluptuous import Schema, Required, All, Length, Invalid, Coerce
from inform import Error, display, fatal, is_collection, os_error, render_bar, full_stop
import arrow
import requests
from quantiphy import Quantity
from pathlib import Path

# configure preferences
Quantity.set_prefs(prec=2, ignore_sf = True)
currency_symbols = dict(USD='$', EUR='€', JPY='¥', GBP='£')

try:
    # read settings
    settings_file = 'cryptocurrency.nt'
    settings_schema = Schema({
        Required('holdings'): All([Coerce(Quantity)], Length(min=1)),
        'currency': str,
        'date format': str,
        'screen width': Coerce(int)
    })
    settings = settings_schema(nt.load(settings_file, top='dict', keymap=(keymap:={})))
    currency = settings.get('currency', 'USD')
    currency_symbol = currency_symbols.get(currency, currency)
    screen_width = settings.get('screen width', 80)

    # download latest asset prices from cryptocompare.com
    params = dict(
        fsyms = ','.join(coin.units for coin in settings['holdings']),
        tsyms = currency,
    )
    url = 'https://min-api.cryptocompare.com/data/pricemulti'
    try:
        r = requests.get(url, params=params)
        if r.status_code != requests.codes.ok:
            r.raise_for_status()
    except Exception as e:
        raise Error('cannot access cryptocurrency prices:', codicil=str(e))
    # prices are keyed by the requested currency, which need not be USD
    prices = {k: Quantity(v[currency], currency_symbol) for k, v in r.json().items()}

    # compute total
    total = Quantity(0, currency_symbol)
    for coin in settings['holdings']:
        price = prices[coin.units]
        value = price.scale(coin)
        total = total.add(value)

    # display holdings
    now = arrow.now().format(settings.get('date format', 'h:mm A, dddd MMMM D, YYYY'))
    print(f'Holdings as of {now}.')
    bar_width = screen_width - 37
    for coin in settings['holdings']:
        price = prices[coin.units]
        value = price.scale(coin)
        portion = value/total
        summary = f'{coin} = {value} @ {price}/{coin.units}'
        print(f'{summary:<30} {portion:<5.1%} {render_bar(portion, bar_width)}')
    print(f'Total value = {total}.')

except nt.NestedTextError as e:
    e.terminate()
except Error as e:
    # raised above when the prices cannot be downloaded
    e.terminate()
except Invalid as e:
    kind = 'key' if 'key' in e.msg else 'value'
    loc = keymap[tuple(e.path)]
    fatal(
        full_stop(e.msg),
        culprit = [settings_file] + e.path,
        codicil=loc.as_line(kind)
    )
except OSError as e:
    fatal(os_error(e))
except KeyboardInterrupt:
    pass
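The arithmetic behind the display can be followed with plain floats. This sketch replaces QuantiPhy with a hypothetical parse_holding helper and, rather than downloading anything, takes the rounded per-coin prices from the sample output shown earlier. Because those prices are rounded, the total comes out slightly different from the one displayed.

```python
holdings = ['5 BTC', '50 ETH', '50,000 XLM']

# prices per coin in USD, copied from the rounded sample output
prices = {'BTC': 11.4e3, 'ETH': 434, 'XLM': 92e-3}

def parse_holding(text):
    """Split '50,000 XLM' into (50000.0, 'XLM'); a stand-in for Quantity."""
    amount, units = text.split()
    return float(amount.replace(',', '')), units

# value of each holding is the number of coins times the price per coin
values = {}
for holding in holdings:
    amount, units = parse_holding(holding)
    values[units] = amount * prices[units]

# each holding's portion is its share of the total value
total = sum(values.values())
portions = {units: value / total for units, value in values.items()}

for units, portion in portions.items():
    print(f'{units}: {portion:<5.1%}')
```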
PostMortem¶
This example illustrates how one can implement references in NestedText. A reference allows you to define some content once and insert that content in multiple places in the document. The example also demonstrates a slightly different way to implement validation and conversion on a per-field basis with voluptuous.
PostMortem is a program that generates a packet of information that is securely shared with your dependents in case of your death. Only the settings processing part of the package is shown here. Here is a configuration file that Odin might use to generate packets for his wife and kids:
my gpg ids: odin@norse-gods.com
sign with: @ my gpg ids
name template: {name}-{now:YYMMDD}
estate docs:
    - ~/home/estate/trust.pdf
    - ~/home/estate/will.pdf
    - ~/home/estate/deed-valhalla.pdf
recipients:
    frigg:
        email: frigg@norse-gods.com
        category: wife
        attach: @ estate docs
        networth: odin
    thor:
        email: thor@norse-gods.com
        category: kids
        attach: @ estate docs
    loki:
        email: loki@norse-gods.com
        category: kids
        attach: @ estate docs
Notice that estate docs is defined at the top level. It is not a PostMortem setting; it simply defines a value that will be interpolated into a setting later. The interpolation is done by specifying @ along with the name of the reference as a value. So for example, in recipients, attach is specified as @ estate docs. This causes the list of estate documents to be used as attachments. The same thing is done in sign with, which interpolates my gpg ids.
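The interpolation just described can be sketched in a few lines of standard-library Python. The function expand and the dictionary raw are hypothetical names used only here; raw is a hand-written, shortened version of what loading the file above would produce, and the top-level settings are passed in explicitly so the sketch stands on its own.

```python
def expand(value, top):
    """Replace any string of the form '@ name' with top[name], recursively."""
    if isinstance(value, str):
        value = value.strip()
        if value[:1] == '@':
            return top[value[1:].strip()]
        return value
    if isinstance(value, dict):
        return {k: expand(v, top) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v, top) for v in value]
    raise NotImplementedError(value)

raw = {
    'my gpg ids': 'odin@norse-gods.com',
    'sign with': '@ my gpg ids',
    'estate docs': ['~/home/estate/trust.pdf', '~/home/estate/will.pdf'],
    'recipients': {
        'frigg': {'email': 'frigg@norse-gods.com', 'attach': '@ estate docs'},
    },
}

settings = expand(raw, raw)
print(settings['sign with'])
print(settings['recipients']['frigg']['attach'])
```

An email address is left untouched because its @ is not the first character of the value.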
Here is the code for validating and transforming the PostMortem settings:
#!/usr/bin/env python3

import nestedtext as nt
from pathlib import Path
from voluptuous import Schema, Invalid, Extra, Required, REMOVE_EXTRA
from pprint import pprint

# Settings schema
# First define some functions that are used for validation and coercion
def to_str(arg):
    if isinstance(arg, str):
        return arg
    raise Invalid('expected text')

def to_ident(arg):
    arg = to_str(arg)
    if len(arg.split()) > 1:
        raise Invalid('expected simple identifier')
    return arg

def to_list(arg):
    if isinstance(arg, str):
        return arg.split()
    if isinstance(arg, dict):
        raise Invalid('expected list')
    return arg

def to_paths(arg):
    return [Path(p).expanduser() for p in to_list(arg)]

def to_email(arg):
    user, _, host = arg.partition('@')
    if '.' in host:
        return arg
    raise Invalid('expected email address')

def to_emails(arg):
    return [to_email(e) for e in to_list(arg)]

def to_gpg_id(arg):
    try:
        return to_email(arg)      # gpg ID may be an email address
    except Invalid:
        try:
            int(arg, base=16)     # if not an email, it must be a hex key
            assert len(arg) >= 8  # at least 8 characters long
            return arg
        except (ValueError, AssertionError):
            raise Invalid('expected GPG id')

def to_gpg_ids(arg):
    return [to_gpg_id(i) for i in to_list(arg)]

# define the schema for the settings file
schema = Schema(
    {
        Required('my gpg ids'): to_gpg_ids,
        'sign with': to_gpg_id,
        'avendesora gpg passphrase account': to_str,
        'avendesora gpg passphrase field': to_str,
        'name template': to_str,
        Required('recipients'): {
            Extra: {
                Required('category'): to_ident,
                Required('email'): to_emails,
                'gpg id': to_gpg_id,
                'attach': to_paths,
                'networth': to_ident,
            }
        },
    },
    extra = REMOVE_EXTRA
)

# this function implements references
def expand_settings(value):
    # allows macro values to be defined as a top-level setting.
    # allows macro reference to be found anywhere.
    if isinstance(value, str):
        value = value.strip()
        if value[:1] == '@':
            value = settings[value[1:].strip()]
        return value
    if isinstance(value, dict):
        return {k:expand_settings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_settings(v) for v in value]
    raise NotImplementedError(value)

try:
    # Read settings
    config_filepath = Path('postmortem.nt')
    if config_filepath.exists():
        # load from file
        settings = nt.load(config_filepath, keymap=(keymap:={}))

        # expand references
        settings = expand_settings(settings)

        # check settings and transform to desired types
        settings = schema(settings)

        # show the resulting settings
        pprint(settings)

except nt.NestedTextError as e:
    e.report()
except Invalid as e:
    kind = 'key' if 'key' in e.msg else 'value'
    loc = keymap[tuple(e.path)]
    culprit = '.'.join(str(p) for p in [config_filepath] + e.path)
    print(f"ERROR: {culprit}: {e.msg}.")
    print(loc.as_line(kind))
except OSError as e:
    print(f"ERROR: {config_filepath!s}: {e!s}")
This code uses expand_settings to implement references, and it uses the Voluptuous schema to clean and validate the settings and convert them to convenient forms. For example, the user could specify attach as a string or a list, and the members could use a leading ~ to signify a home directory. Applying to_paths in the schema converts whatever is specified to a list and converts each member to a pathlib path with the ~ properly expanded.
Notice that the schema is defined in a different manner than in the above examples. In those, you simply state which type you are expecting for the value and you use the Coerce function to indicate that the value should be cast to that type if needed. In this example, simple functions are passed in that perform validation and coercion as needed. This is a more flexible approach and allows better control of the error messages.
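The function-based style can be tried in isolation. In the sketch below the Invalid class is a local stand-in for voluptuous.Invalid, defined only so the snippet runs without Voluptuous installed; the three validators are copied from the code above.

```python
class Invalid(Exception):
    """Local stand-in for voluptuous.Invalid so the sketch runs on its own."""

def to_list(arg):
    if isinstance(arg, str):
        return arg.split()
    if isinstance(arg, dict):
        raise Invalid('expected list')
    return arg

def to_email(arg):
    user, _, host = arg.partition('@')
    if '.' in host:
        return arg
    raise Invalid('expected email address')

def to_emails(arg):
    return [to_email(e) for e in to_list(arg)]

# a string and a list are both accepted and both yield a list
from_string = to_emails('thor@norse-gods.com loki@norse-gods.com')
from_list = to_emails(['thor@norse-gods.com'])

# a malformed address produces the message chosen by the validator
try:
    to_emails('thor')
    message = None
except Invalid as e:
    message = str(e)

print(from_string, from_list, message)
```

Each function both normalizes its input and decides the wording of the error, which is where the extra control over messages comes from.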
Here are the processed settings:
{'my gpg ids': ['odin@norse-gods.com'],
 'name template': '{name}-{now:YYMMDD}',
 'recipients': {'frigg': {'attach': [PosixPath('.../home/estate/trust.pdf'),
                                     PosixPath('.../home/estate/will.pdf'),
                                     PosixPath('.../home/estate/deed-valhalla.pdf')],
                          'category': 'wife',
                          'email': ['frigg@norse-gods.com'],
                          'networth': 'odin'},
                'loki': {'attach': [PosixPath('.../home/estate/trust.pdf'),
                                    PosixPath('.../home/estate/will.pdf'),
                                    PosixPath('.../home/estate/deed-valhalla.pdf')],
                         'category': 'kids',
                         'email': ['loki@norse-gods.com']},
                'thor': {'attach': [PosixPath('.../home/estate/trust.pdf'),
                                    PosixPath('.../home/estate/will.pdf'),
                                    PosixPath('.../home/estate/deed-valhalla.pdf')],
                         'category': 'kids',
                         'email': ['thor@norse-gods.com']}},
 'sign with': 'odin@norse-gods.com'}