Brickify tool

Tip

To use this feature, install brickschema with the “brickify” feature: pip install brickschema[brickify] or pip install brickschema[all]

The brickify tool is used to create Brick models from other data sources. It is installed as part of the brickschema package. If you installed py-brickschema from GitHub, you may have usage examples included in the tests directory; otherwise, you can find them online in the test source tree.

The brickify tool is built around the notion of handlers and operations. Handlers are pieces of code (written in Python) that the brickify tool uses to carry out operations that transform data.

Handlers load data into brickify and contain the code that executes the translations specified by the operations.

Operations are specified in a configuration file when the brickify tool is invoked by the user.

We expect that most users of brickify will not have to write a Handler, though they may need to write their own set of operations. Over time, we hope to include an expanded library of useful Handlers in brickify as well as example operations that can be easily customized for a particular job.

We expect a common scenario will be for brickify and the included handlers to be used as a tool in a building system integration job. The operations might be written by the technical support team supporting the job and then invoked by the field team against different data sources and building systems, perhaps with a small amount of customization.

Using Brickify

The brickify tool can be invoked on the command line as follows:

brickify sheet.tsv --output bldg.ttl --input-type tsv --config template.yml

where sheet.tsv might be a table stored in a CSV/TSV file.

brickify starts with an empty graph, uses handlers and operations to add the data from the input file (in this case, sheet.tsv) to the graph, and then writes that graph out to a file (bldg.ttl).

For example, consider the following basic table with two rows that might be stored in sheet.tsv:

VAV name	temperature sensor	temperature setpoint	has_reheat
A	A_ts	A_sp	false
B	B_ts	B_sp	true

Brickify selects the handler to use based on the input-type of the file. In this case, brickify will use the TableHandler to process the data.

Brickify loads the operations from the config file specified when brickify is run. The config file can be in either YAML or JSON, but for our examples we will use YAML. Here is an example template.yml:

---
namespace_prefixes:
  brick: "https://brickschema.org/schema/Brick#"
operations:
  -
    data: |-
      bldg:{VAV name} rdf:type brick:VAV ;
                      brick:hasPoint bldg:{temperature sensor} ;
                      brick:hasPoint bldg:{temperature setpoint} .
      bldg:{temperature sensor} rdf:type brick:Temperature_Sensor .
      bldg:{temperature setpoint} rdf:type brick:Temperature_Setpoint .
  -
    conditions:
      - |
        '{has_reheat}'
    data: |-
      bldg:{VAV name} rdf:type brick:RVAV .

The above example configuration file has two operations. The first operation is a ‘data’ operation. In a ‘data’ operation, new data is added to the graph. In a dataset processed by a TableHandler, each operation is checked against each row of the input table. In a basic ‘data’ operation, if all of the variables mentioned in the operation are present in the row being processed, the body of the operation is filled in using the values from that row, and the data is inserted into the graph. The first operation above references the ‘VAV name’, ‘temperature sensor’, and ‘temperature setpoint’ variables, and all of them are present in the first row, so the following data is inserted into the graph:

bldg:A rdf:type brick:VAV ;
                brick:hasPoint bldg:A_ts ;
                brick:hasPoint bldg:A_sp .
bldg:A_ts rdf:type brick:Temperature_Sensor .
bldg:A_sp rdf:type brick:Temperature_Setpoint .

Because the second row has all of the variables as well, the first operation is used again with the second row of the input file and the following information is inserted into the graph:

bldg:B rdf:type brick:VAV ;
                brick:hasPoint bldg:B_ts ;
                brick:hasPoint bldg:B_sp .
bldg:B_ts rdf:type brick:Temperature_Sensor .
bldg:B_sp rdf:type brick:Temperature_Setpoint .
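
The row-by-row behavior described above can be sketched in a few lines of plain Python. This is only an illustration, not brickify's actual implementation: the names referenced_fields and render_rows are hypothetical, the sketch assumes that the {column name} placeholders behave like Python format fields, and it treats an empty cell as a missing variable.

```python
import csv
import io
from string import Formatter

# The first 'data' operation from template.yml, as a plain string.
DATA_OP = (
    "bldg:{VAV name} rdf:type brick:VAV ;\n"
    "    brick:hasPoint bldg:{temperature sensor} ;\n"
    "    brick:hasPoint bldg:{temperature setpoint} .\n"
    "bldg:{temperature sensor} rdf:type brick:Temperature_Sensor .\n"
    "bldg:{temperature setpoint} rdf:type brick:Temperature_Setpoint ."
)

# The contents of sheet.tsv from the example above.
SHEET = (
    "VAV name\ttemperature sensor\ttemperature setpoint\thas_reheat\n"
    "A\tA_ts\tA_sp\tfalse\n"
    "B\tB_ts\tB_sp\ttrue\n"
)


def referenced_fields(template):
    """Return the set of {placeholder} names used in a template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render_rows(template, tsv_text):
    """Fill the template once for every row that has all referenced columns."""
    needed = referenced_fields(template)
    rendered = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Assumption: a column with an empty value counts as not present.
        present = {key for key, value in row.items() if value}
        if needed <= present:
            rendered.append(template.format_map(row))
    return rendered


for block in render_rows(DATA_OP, SHEET):
    print(block, end="\n\n")
```

Running the sketch prints one filled-in block per row, matching the two blocks shown above apart from indentation.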

The second operation in the file is a ‘conditional’ operation. A ‘conditional’ operation is much like a ‘data’ operation: all of the variables specified in it must be present for the operation to be invoked. However, a ‘conditional’ operation also includes an extra check to see if it should be used for a given row. In this case, the condition says that the has_reheat variable must be true for the associated ‘data’ operation to be invoked. In our example, the ‘has_reheat’ column of the first row (for VAV ‘A’) is ‘false’, so the ‘data’ operation does not fire. In the second row (for VAV ‘B’), the ‘has_reheat’ column is ‘true’, so the ‘data’ operation fires, inserting the following triple into the graph:

bldg:B a brick:RVAV .

The ‘conditional’ syntax is described in detail in the Table Handler section below.

Namespace and Prefix updates

Often, you would like to reuse a configuration file such as the ‘template.yml’ we used in our earlier examples, but customize it for a specific building or site. Brickify allows you to substitute a new namespace and RDF prefix for the building and site on the command line. Brickify replaces the corresponding text from the template with the new values given on the command line.

brickify sheet.tsv --output bldg.ttl --input-type tsv --config template.yml --building-prefix mybldg --building-namespace https://mysite.local/mybldg/#

This will produce the following in bldg.ttl:

@prefix brick: <https://brickschema.org/schema/Brick#> .
@prefix mybldg: <https://mysite.local/mybldg/#> .

mybldg:A a brick:VAV ;
    brick:hasPoint mybldg:A_sp,
        mybldg:A_ts .
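
One way to picture the substitution is as a plain text replacement applied to the template before any rows are processed. The sketch below only illustrates that idea and is not brickify's code: retarget_prefix is a hypothetical helper, and it assumes the template uses bldg as its building prefix.

```python
import re


def retarget_prefix(template, old_prefix, new_prefix):
    """Swap one RDF prefix for another wherever it starts a prefixed name."""
    # The lookbehind avoids touching names that merely end in the old
    # prefix, such as "mybldg:".
    pattern = rf"(?<![\w:]){re.escape(old_prefix)}:"
    return re.sub(pattern, f"{new_prefix}:", template)


template = "bldg:{VAV name} rdf:type brick:VAV ; brick:hasPoint bldg:{temperature sensor} ."
print(retarget_prefix(template, "bldg", "mybldg"))
```

The brick: prefix is left alone because only the named prefix is rewritten.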

Handler

The base Brickify Handler takes in an existing graph and updates it. The handler reads the entire graph into memory in one pass, and then runs each operation once against the entire graph. (Note that this is different from the TableHandler we were using in the example, which goes row by row through the input file and runs the full set of operations against each row. For example, if you have 3 rows and 2 operations, each of the 2 operations is run 3 times, once per row, for a total of 6 operation runs.)

The base Handler is invoked when the --input-type option is set to graph or rdf or is left unspecified.

The supported operations for the base Handler are ‘query’ and ‘data’. The ‘query’ operation executes a SPARQL update query to transform the input graph. Consider this example template.yml file:

---
namespace_prefixes:
  brick: "https://brickschema.org/schema/Brick#"
  yao: "https://example.com/YetAnotherOnology#"
operations:
  -
    query: |-
        DELETE {{
          ?vav a yao:vav .
        }}
        INSERT {{
          ?vav a brick:VAV .
        }}
        WHERE {{
          ?vav a yao:vav .
        }}
  -
    query: |-
        DELETE {{
          ?rvav a yao:vav_with_reheat .
        }}
        INSERT {{
          ?rvav a brick:RVAV .
        }}
        WHERE {{
          ?rvav a yao:vav_with_reheat .
        }}

This example has two operations, both of which are ‘query’ operations. Each operation translates from one namespace into another. The queries select a set of triples from the input graph, delete them, and insert equivalent triples that use the Brick namespace.
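
The effect of each DELETE/INSERT/WHERE query can be illustrated without a SPARQL engine by treating the graph as a set of (subject, predicate, object) tuples. This is a toy model for intuition only: retype is a hypothetical helper, the bldg:vav1, bldg:vav2, and yao:feeds names are invented example data, and brickify itself runs the SPARQL update against the real graph.

```python
def retype(graph, old_type, new_type):
    """Mimic DELETE {?x a old} INSERT {?x a new} WHERE {?x a old}."""
    # WHERE: find every triple that assigns the old type.
    matched = {(s, p, o) for (s, p, o) in graph if p == "rdf:type" and o == old_type}
    # INSERT: the same subjects, now with the new type.
    inserted = {(s, p, new_type) for (s, p, _) in matched}
    # DELETE the matched triples, then add the inserted ones.
    return (graph - matched) | inserted


graph = {
    ("bldg:vav1", "rdf:type", "yao:vav"),
    ("bldg:vav2", "rdf:type", "yao:vav_with_reheat"),
    ("bldg:vav1", "yao:feeds", "bldg:room1"),
}

# One call per 'query' operation in the config file.
graph = retype(graph, "yao:vav", "brick:VAV")
graph = retype(graph, "yao:vav_with_reheat", "brick:RVAV")

for triple in sorted(graph):
    print(triple)
```

Triples that neither query matches (the yao:feeds triple here) are left as they are, which is why a real translation usually needs one operation per class or relationship to be converted.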

Table Handler

The Table Handler processes input datasets row by row. The Table Handler is invoked when --input-type is set to tsv, csv, or table.

We have already seen parts of the TableHandler. Let’s recall the config file we used earlier:

---
namespace_prefixes:
  brick: "https://brickschema.org/schema/Brick#"
operations:
  -
    data: |-
      bldg:{VAV name} rdf:type brick:VAV ;
                      brick:hasPoint bldg:{temperature sensor} ;
                      brick:hasPoint bldg:{temperature setpoint} .
      bldg:{temperature sensor} rdf:type brick:Temperature_Sensor .
      bldg:{temperature setpoint} rdf:type brick:Temperature_Setpoint .
  -
    conditions:
      - |
        '{has_reheat}'
    data: |-
      bldg:{VAV name} rdf:type brick:RVAV .

Internally, Brickify converts each ‘data’ operation to a SPARQL insert operation. If the ‘data’ operation fires, because all of the variables referenced in the operation are present in that row, Brickify executes a SPARQL INSERT DATA statement. This is the SPARQL generated from the first row:

INSERT DATA { bldg:A rdf:type brick:VAV ;
                brick:hasPoint bldg:A_ts ;
                brick:hasPoint bldg:A_sp .
bldg:A_ts rdf:type brick:Temperature_Sensor .
bldg:A_sp rdf:type brick:Temperature_Setpoint . }
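
A rough sketch of that conversion, assuming the filled-in body is simply wrapped in an INSERT DATA block. The name to_insert_data is hypothetical, and the statement brickify really generates may differ in detail, for example by carrying PREFIX declarations.

```python
def to_insert_data(data_template, row):
    """Fill a 'data' template from one row and wrap it as a SPARQL update."""
    body = data_template.format_map(row)
    return "INSERT DATA { " + body + " }"


row = {"VAV name": "A", "temperature sensor": "A_ts", "temperature setpoint": "A_sp"}
print(to_insert_data("bldg:{VAV name} rdf:type brick:VAV .", row))
```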

Conditional syntax

Brickify implements conditions by substituting the row's values into the condition and passing the result to Python’s eval function. If the condition evaluates to True, the ‘data’ operation fires; if it evaluates to False, the condition fails and the operation is skipped for that row. Consider this input file:

VAV name	temperature sensor	temperature setpoint	has_reheat	thresh
A	A_ts	A_sp	false	16
B	B_ts	B_sp	true	12

One of the things that can be a little tricky with conditions is ensuring that the types are correct when crossing from CSV/TSV into Python, especially for strings and Booleans.

For example, this expression will fire for row A but not row B:

conditions:
  - |
    {thresh} > 14

Internally, for row A this is converted to the string '16 > 14' and then passed to the Python eval() function, which returns True.

A trickier version - which looks like our earlier example but is slightly different:

conditions:
  - |
    {has_reheat}

In our example, this will fail! (Spoiler: we took away the quotes from our earlier example)

The issue is that the has_reheat column is pulled in as a string, but is not valid Python because the capitalization of ‘true’ and ‘false’ is incorrect in the TSV file.

One way to fix this is to correct the data:

VAV name	temperature sensor	temperature setpoint	has_reheat	thresh
A	A_ts	A_sp	False	16
B	B_ts	B_sp	True	12

This will match the condition because we have capitalized True and False. Unfortunately, changing the data in the input CSV you are processing may not always be possible.

As a compromise, to support this common use case where the input strings look like booleans but are not quite formatted right, Brickify expects Boolean conditions to be handled first as quoted strings:

conditions:
  - |
    '{has_reheat}'

Brickify will pass that code to the Python eval() function, which for row B will return 'true', which is type str (and not True, which is type bool). However, as a special case, Brickify converts the following strings to Booleans: [“TRUE”, “true”, “True”, “on”, “ON”] all become True, and [“FALSE”, “false”, “False”, “off”, “OFF”] are converted to False.
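
The evaluation steps described in this section can be approximated as follows. This is a minimal sketch under stated assumptions rather than brickify's implementation: check_condition is a hypothetical name, the substitution is assumed to work like Python string formatting, and the fallback to ordinary truthiness for other values is a guess.

```python
TRUE_STRINGS = {"TRUE", "true", "True", "on", "ON"}
FALSE_STRINGS = {"FALSE", "false", "False", "off", "OFF"}


def check_condition(condition, row):
    """Substitute row values into a condition, evaluate it, and coerce the result."""
    expression = condition.strip().format_map(row)
    result = eval(expression)  # only acceptable for trusted config files
    if isinstance(result, str):
        # Special case: strings that look like Booleans are converted.
        if result in TRUE_STRINGS:
            return True
        if result in FALSE_STRINGS:
            return False
    # Assumption: anything else falls back to Python truthiness.
    return bool(result)


row_a = {"has_reheat": "false", "thresh": "16"}
row_b = {"has_reheat": "true", "thresh": "12"}

print(check_condition("{thresh} > 14", row_a))    # True, from '16 > 14'
print(check_condition("{thresh} > 14", row_b))    # False, from '12 > 14'
print(check_condition("'{has_reheat}'", row_b))   # True, 'true' is converted
print(check_condition("'{has_reheat}'", row_a))   # False, 'false' is converted

try:
    check_condition("{has_reheat}", row_b)        # unquoted: eval("true")
except NameError as err:
    print("unquoted condition failed:", err)
```

The last call shows why the unquoted form fails: Python sees the bare name true, which is not defined.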

An important note: the Boolean replacement is not carried out on substrings of a condition. At present, this will not work:

conditions:
  - |
    {thresh} > 12 and '{has_reheat}'

Template Operation

The TableHandler supports an additional operation, similar to the ‘data’ operation, that uses Jinja2 templates. This introduces a new section into the configuration file for defining Jinja2 templates, the ‘macros’ section, which is added at the top level of the configuration file.

The new operation is a ‘template’ operation, which can reference the Jinja2 macros from the top-level ‘macros’ section. Much like ‘data’ operations, a ‘template’ operation only fires if all of the referenced variables are present in the row being processed.

Consider this input table:

VAV name	temperature sensor	temperature setpoint	has_reheat	sensors	setpoints
A	A_ts	A_sp	False	4	3
B	B_ts	B_sp	True	5	3

The example config file below defines two template operations. Both call the same macro, which uses a ‘for’ loop to create multiple sensors and setpoints, following a naming pattern provided to the macro as arguments. The numbers of sensors and setpoints come from the input CSV file.

---
namespace_prefixes:
  brick: "https://brickschema.org/schema/Brick#"
operations:
  -
    data: |-
      bldg:{VAV name}_0 rdf:type brick:VAV .
  -
    conditions:
      - |
        '{has_reheat}'
    data: |-
      bldg:{VAV name} rdf:type brick:RVAV .

  - template: |-
      {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature sensor'], value['sensors'], "brick:Temperature_Sensor") }}

  - template: |-
      {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature setpoint'], value['setpoints'], "brick:Temperature_Setpoint") }}

macros:
  - |-
    {% macro num_triples(subject, predicate, name, num, type) %}
        {% for i in range(num) %}
          bldg:{{ name }}_{{ i }} a {{ type }} .
          bldg:{{ subject }} {{ predicate }} bldg:{{ name }}_{{ i }} .
        {% endfor %}
    {% endmacro %}

And the output of the two template operations, just for the VAV B row:

bldg:B_ts_0 a brick:Temperature_Sensor .
bldg:B brick:hasPoint bldg:B_ts_0 .

bldg:B_ts_1 a brick:Temperature_Sensor .
bldg:B brick:hasPoint bldg:B_ts_1 .

bldg:B_ts_2 a brick:Temperature_Sensor .
bldg:B brick:hasPoint bldg:B_ts_2 .

bldg:B_ts_3 a brick:Temperature_Sensor .
bldg:B brick:hasPoint bldg:B_ts_3 .

bldg:B_ts_4 a brick:Temperature_Sensor .
bldg:B brick:hasPoint bldg:B_ts_4 .

bldg:B_sp_0 a brick:Temperature_Setpoint .
bldg:B brick:hasPoint bldg:B_sp_0 .

bldg:B_sp_1 a brick:Temperature_Setpoint .
bldg:B brick:hasPoint bldg:B_sp_1 .

bldg:B_sp_2 a brick:Temperature_Setpoint .
bldg:B brick:hasPoint bldg:B_sp_2 .
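
To check the expected output of a macro like this without running brickify, the loop can be mirrored in plain Python. The function below is a hand-written stand-in for the num_triples macro, not code generated by Jinja2 or taken from brickify; the int() conversion is an assumption made because this sketch stores the row values as strings.

```python
def num_triples(subject, predicate, name, num, type_):
    """Plain-Python rendering of the num_triples macro, one triple per line."""
    lines = []
    for i in range(int(num)):  # row values here are strings, so convert first
        lines.append(f"bldg:{name}_{i} a {type_} .")
        lines.append(f"bldg:{subject} {predicate} bldg:{name}_{i} .")
    return lines


# The VAV B row from the input table above.
row_b = {
    "VAV name": "B",
    "temperature sensor": "B_ts",
    "temperature setpoint": "B_sp",
    "sensors": "5",
    "setpoints": "3",
}

sensor_lines = num_triples(
    row_b["VAV name"], "brick:hasPoint", row_b["temperature sensor"],
    row_b["sensors"], "brick:Temperature_Sensor",
)
setpoint_lines = num_triples(
    row_b["VAV name"], "brick:hasPoint", row_b["temperature setpoint"],
    row_b["setpoints"], "brick:Temperature_Setpoint",
)
print("\n".join(sensor_lines + setpoint_lines))
```

With 5 sensors and 3 setpoints, the sketch yields the same 16 triples listed above, numbered from 0.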

Haystack Handler

The Haystack handler downloads Haystack files and converts them to Brick. It is invoked by using the --input-type haystack on the command line. The input file is a filepath or a URL where the Haystack TTL file can be found.

The conversion is carried out by the HaystackRDFInferenceSession method in this Python package.