rsonpath¶
Yet another rsonpath cheat sheet.
Introduction¶
rsonpath is a blazing fast JSONPath query engine written in Rust. It allows to run structure-aware JSONPath queries on JSON data as easily as one would query a string with regular expressions. The JSONPath standard defines a query expression syntax for selecting and extracting JSON values from within a given JSON value. rsonpy is a lightweight Python wrapper around the Rust library.
To learn fundamentals about rsonpath, please refer to the rsonbook, and specifically the JSONPath reference.
Use cases
Rsonpath is made to filter documents ahead of parsing/loading. Think of it like a way to select a small part of the documents before loading it into memory. It will shine in situations where you have a JSON file too big to hold in memory, but you nevertheless want to load parts of it.
Rsonpath does not load anything into memory. It only works on the raw JSON, not on an in-memory data structure.
jqlang vs. rsonpath
If you are asking what’s wrong with jq, the authors provide
concise answers per Why choose rsonpath?:
The most popular tool for working with JSON data is jq, but it has a few shortcomings:
it is extremely slow
it has a massive memory overhead
its query language is non-standard.
To be clear,
jqis a great and well-tested tool, andrqdoes not directly compete with it. If one could describejqas a “sed or awk for JSON”, thenrqwould be a “grep for JSON”: It does not allow you to slice and reorganize JSON data likejq, but instead outclasses it on the filtering and querying applications.
Examples¶
Unwrap¶
It is typical for HTTP JSON API responses to not start directly with a collection of data items, because a typical response also includes other metadata. In this spirit, when connecting pipeline elements of JSON processors, input data mostly needs to be edited into a variant suitable for storing into a database, likely also transitioning from a top-level object to a top-level list.
The following expression unwraps the actual collection which is nested
within the top-level records element.
expression: $.records
type: rson
Example
Input data
{
"message-source": "community",
"message-type": "mixed-pickles",
"records": [
{"foo": "bar"},
{"baz": "qux"}
]
}
Output data
[
{"foo": "bar"},
{"baz": "qux"}
]
Transformation definition
# Tikray collection-level transformation definition.
# Includes a Moksha/rsonpath transformation rule for unwrapping.
# https://tikray.readthedocs.io/
---
meta:
type: tikray-collection
version: 1
pre:
rules:
- expression: $.records
type: rson