Work with JSONB database columns
The DHIS2 core developers are increasingly using JSONB as a data type for certain kinds of data, most prominently program data values and attribute values. This page should get you started to understand, read, update and delete data stored in JSONB columns.
The following table should give an non-conclusive overview of the recent changes to the use of JSONB.
Version | Affected objects |
2.31 | dataStore (keyjsonvalue and userkeyjsonvalue ) |
2.32 | Program data (trackedentitydatavalue → programstageinstance.eventdatavalues ) |
2.33 | Any metadata to hold attribute values in the respective table’s new column attributevalues |
2.36 | Sharing (any metadata to be shared in the respective table’s new column sharing ) |
2.36 | Visualizations |
Additionally but also in JSONB nowadays: translations
, style
, favorites
etc. (see bottom for a comprehensive way to find all objects).
Structure of the new record
How does this look like? If we look at a single programstageinstance
it is visible that the event data is stored directly in each event record, in the column eventdatavalues
:
select * from programstageinstance limit 1;
programstageinstanceid | 867967
programinstanceid | 867963
programstageid | 423730
duedate | 2019-02-10 12:05:00
executiondate | 2019-02-10 12:05:00
organisationunitid | 247052
status | ACTIVE
completeddate |
uid | No5utE4d4hN
created | 2018-01-26 13:43:31.923
lastupdated | 2018-01-26 13:43:31.923
attributeoptioncomboid | 4
storedby |
completedby |
deleted | f
code |
createdatclient | 2017-01-26 13:43:31.923
lastupdatedatclient |
geometry |
lastsynchronized | 1970-01-01 01:00:00
eventdatavalues | {"DCUDZxqOxUo": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "EyfTU3ibMmJ": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "NPZPVg2rVh4": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "OSuxnldV4Ug": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "Ok9OQpitjQr": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "cKBSkBB3Mt4": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "lcaG1Pnh27I": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "rHgrmXfa57b": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "sWoqcoByYmD": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "vANAXwtLwcT": {"value": "7", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "vTEkiy8F3yj": {"value": "true", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "w7enwqzx90I": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}, "ytV9rX4ADnn": {"value": "false", "created": "2017-01-26T13:43:31.923", "storedBy": null, "lastUpdated": "2017-01-26T13:43:31.923", "providedElsewhere": false}}
assigneduserid |
… the trackedentitydatavalue
table does not exist anymore in the database.
If eventdatavalues
is expanded to be more readable:
{
"DCUDZxqOxUo": {
"value": "true",
"created": "2017-01-26T13:43:31.923",
"storedBy": null,
"lastUpdated": "2017-01-26T13:43:31.923",
"providedElsewhere": false
},
"EyfTU3ibMmJ": {
"value": "true",
"created": "2017-01-26T13:43:31.923",
"storedBy": null,
"lastUpdated": "2017-01-26T13:43:31.923",
"providedElsewhere": false
},
"NPZPVg2rVh4": {
"value": "true",
"created": "2017-01-26T13:43:31.923",
"storedBy": null,
"lastUpdated": "2017-01-26T13:43:31.923",
"providedElsewhere": false
}, (... and more of these)
}
Upon inspection of the objects it is visible that the object has a structure where the data element UID is the key (or “entry point”, “lookup-value” of the JSONB object):
{
"dataElementUID": {
"value": value
"created": when it was created,
"storedBy": username of who stored the event
"lastUpdated": when it was last updated
"providedElsewhere": if data was pulled from previous data
}
}
What is JSONB in PostgreSQL?
JSONB is like JSON but “JSON-Binary”. JSONB removes spaces, and stores the data in a binary format into the database.
When selecting data you might need to “drill down” on to the condition you are looking for with an operator.
Operators
Depending on what you need you can operate on the data with different operators:
eventdatavalues -> ''
returns the JSONB object field by key (meaning you get a “slice” of the JSONB object)eventdatavalues ->> ''
gets the event’s data values as text. This is more useful if you further drill down (see examples).eventdatavalues @> '{"xyzUID": {"value": 123, "storedBy": "admin"}}'
– use it to check if the JSONB column contains some specific json. This can be useful to filter for a specific key/value pair, e.g. a combination of DE UID, value, and storedBy.- There are more operators documented in PostgreSQL: Documentation: 9.6: JSON Functions and Operators
JSON functions
There are a wide selections to help get and transform the JSON-like data as documented in PostgreSQL: Documentation: 9.6: JSON Functions and Operators. However, we want to look at three that are useful for the basics.
jsonb_each()
- Used to expand the object into a set of key/value pairs (useful for tables / CSV)jsonb_each_text()
- same as above but the values are textjsonb_set()
- used to write values.
Examples for these functions are below.
Examples
All examples are applicable to the Sierra Leone database.
Querying events
Get all non-deleted events where “Fever” is true:
SELECT uid, eventdatavalues -> 'rzhHSqK3lQq' ->> 'value' AS fever_value
FROM programstageinstance
WHERE eventdatavalues -> 'rzhHSqK3lQq' ->> 'value' = 'true'
AND deleted IS FALSE;
Result:
uid | fever_value
-------------+-------------
kP4F74ySJja | true
Ks9FZL427C5 | true
p2z50WXivZQ | true
SdzERSvelAZ | true
(4 rows)
An alternative is to use the @>
operator to check for the existence in nested objects (such as eventdatavalues
:
SELECT uid, eventdatavalues -> 'rzhHSqK3lQq' ->> 'value' AS fever_value
FROM programstageinstance
WHERE eventdatavalues @> '{"rzhHSqK3lQq":{"value": "true"}}'
AND deleted IS FALSE;
Get the number of events where “Fever” is not set at all:
SELECT count(*)
FROM programstageinstance
WHERE eventdatavalues -> 'rzhHSqK3lQq' IS NULL;
Result: 423650
Get all data values of a single event with UID DB8aHPjFSz4
:
SELECT psi.uid AS event_uid,
de.name AS dataelement_name,
key AS dataelement_uid,
value ->> 'value' AS value
FROM programstageinstance psi, jsonb_each(eventdatavalues)
JOIN dataelement de ON de.uid = key
WHERE psi.uid = 'DB8aHPjFSz4';
Result:
event_uid | dataelement_name | dataelement_uid | value
-------------+------------------------------+-----------------+----------------------
DB8aHPjFSz4 | Additional medication | WO8yRIZb7nb | ANTI_PSYCHOTIC
DB8aHPjFSz4 | Travel in last 12 months | hiymQVgVG2v | true
DB8aHPjFSz4 | Reason for referral | Zlro25GTcNK | baby
DB8aHPjFSz4 | Clinical status | SzVk2KvkSSd | SIMPLE
DB8aHPjFSz4 | Referred | MKMyvXshCdB | YES
DB8aHPjFSz4 | Travel to (last 2 weeks) | xoDfSuukvZ9 | OUTSIDE VILLAGE/FOCI
DB8aHPjFSz4 | Malaria Medication | nTMP8Aj1rYA | AL+PQ
DB8aHPjFSz4 | Malaria inpatient/outpatient | E0W0ZTTosXK | INPATIENT
(8 rows)
Updating event data values
Update an existing data value for the data element “Malaria case classification” ElxhP9pTTP6
in an event with UID p2z50WXivZQ
. For this we will use the jsonb_set()
function (see links above) to change “LOCAL” to “GLOBAL” (if this new value makes sense would need to be determined beforehand).
UPDATE programstageinstance
SET eventdatavalues = jsonb_set(eventdatavalues, '{ElxhP9pTTP6,value}', jsonb '"GLOBAL"')
WHERE uid = 'p2z50WXivZQ';
Deleting values in events
Delete the “Fever” value entirely from an Event with UID kP4F74ySJja
(but keep the event itself and all other “Fever” values):
UPDATE programstageinstance
SET eventdatavalues = eventdatavalues - 'rzhHSqK3lQq'
WHERE uid = 'kP4F74ySJja';
Result: 1 row affected.
Querying attribute values
Attribute values are also stored as JSONB and contain multiple values (i.e. a data element can have multiple attributes with values). Assume we want to query all data elements that have an attribute value into one table. We will need to use the jsonb_each()
function to list all values, not just one.
SELECT uid AS uid,
name AS name,
key AS attribute,
value ->> 'value' AS value
FROM dataelement, jsonb_each(attributevalues)
WHERE attributevalues IS NOT NULL
ORDER BY name, attribute
Result:
uid | name | attribute | value
-------------+-------------------------------------------------------+-------------+-----------
FTRrcoaog83 | Accute Flaccid Paralysis (Deaths < 5 yrs) | n2xYlNbsfko | AFP
LjNlMTl9Nq9 | Animal Bites - Rabid (Deaths < 5 yrs) | n2xYlNbsfko | ABI
eY5ehpbEsB7 | Cholera (Deaths < 5 yrs) | n2xYlNbsfko | CHO
Ix2HsbDMLea | Dysentery (Deaths < 5 yrs) | n2xYlNbsfko | DYS
o0fOD1HLuv8 | HIV: counseling | dLHLR5O4YFI | HTC_TST
ZwrIPRUiHEB | HIV: currently on care | dLHLR5O4YFI | CARE_CURR
veW7w0xDDOQ | HIV: new on care | dLHLR5O4YFI | CARE_NEW
R4KStuS8qt7 | HIV: testing | dLHLR5O4YFI | HTC_TST
NpJtsQkMTm3 | Louse Borne Typhus - Relapsing fever (Deaths < 5 yrs) | n2xYlNbsfko | LBT
r6nrJANOqMw | Malaria (Deaths < 5 yrs) | n2xYlNbsfko | MAL
f7n9E0hX8qk | Measles (Deaths < 5 yrs) | n2xYlNbsfko | MEA
MSZuQ1mTsia | Meningitis (Deaths < 5 yrs) | n2xYlNbsfko | MEN
lXolhoWewYH | Plague (Deaths < 5 yrs) | n2xYlNbsfko | PLA
jVDAvs6kIAP | Rabies (Deaths < 5 yrs) | n2xYlNbsfko | RAB
A3FR1rkz9D8 | Receiving ART | l1VmqIHKk6t | HIVART
A3FR1rkz9D8 | Receiving ART | xqWyz9jNCA5 | ONART
Vp12ncSU1Av | Tetanus Neonatal (Deaths < 5 yrs) | n2xYlNbsfko | TEN
hM4ya5T2AqX | Tetanus Other (Deaths < 5 yrs) | n2xYlNbsfko | TEO
Yy9NtNfwYZJ | Typhoid (Deaths < 5 yrs) | n2xYlNbsfko | TYP
cNkTt6mJQyO | Yebo | Z4X3J7jMLYV | ACTIVITY
USBq0VHSkZq | Yellow Fever (Deaths < 5 yrs) | n2xYlNbsfko | YEL
(21 rows)
Finding all JSONB columns in a database
To find all tables & columns of type JSONB, issue the following query:
SELECT col.table_name,
col.column_name
FROM information_schema.columns col
JOIN information_schema.tables tab ON tab.table_schema = col.table_schema
AND tab.table_name = col.table_name
AND tab.table_type = 'BASE TABLE'
WHERE col.table_schema NOT IN ('information_schema', 'pg_catalog')
AND col.data_type = 'jsonb'
ORDER BY col.table_name;
On a 2.34 database there are over 130 columns already.