Telescope Housekeeping Data (so3g.hk)¶
The so3g.hk
module contains code for writing and reading the
structured SO Housekeeping (HK) data format. This format is carried
in spt3g G3 frames, but with a specific schema designed for SO use.
The detailed specification of the schema is contained in the
tod2maps_docs/tod_format
document.
If you’re here you probably just want to read in some data, so we will start with that. Later on we go into details of the data model and the interfaces for writing compliant HK files.
Loading Data Saved OCS/SO Style¶
One of the most basic things we might want to do is load data between
a time range. For .g3 files that are saved by an OCS Aggregator, there
is a specific folder structure and file naming scheme. so3g.hk.load_range()
is written to load data for a specified time range saved in that style.
Example Use:
from so3g.hk import load_range
data = load_range(start, stop, **kwargs)
Defining Time Ranges¶
There are several options for defining start and stop. These options
are passed to so3g.hk.getdata.to_timestamp()
to be parsed.
datetime.datetime
objects:

import datetime as dt
# tzinfo is likely needed if your computer is not in UTC
start = dt.datetime(2020, 3, 12, 1, 23, tzinfo=dt.timezone.utc)
Integers or Floats - these are assumed to be ctimes
Strings - assumed to be UTC dates and parsed by
datetime.strptime
‘%Y-%m-%d’
‘%Y-%m-%d %H:%M’
‘%Y-%m-%d %H:%M:%S’
‘%Y-%m-%d %H:%M:%S.%f’
Or submit your own with the
str_format
argument
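The trial-format parsing described above can be sketched in plain Python. This is a simplified stand-in for so3g.hk.getdata.to_timestamp, not the library code itself:

```python
import datetime as dt

# The default trial formats listed above.
FORMATS = ['%Y-%m-%d', '%Y-%m-%d %H:%M',
           '%Y-%m-%d %H:%M:%S', '%Y-%m-%d %H:%M:%S.%f']

def to_timestamp_sketch(some_time, str_format=None):
    """Simplified model of to_timestamp: convert input to a unix time."""
    if isinstance(some_time, (int, float)):
        return some_time                      # already a ctime
    if isinstance(some_time, dt.datetime):
        return some_time.timestamp()          # set tzinfo=utc explicitly!
    for fmt in ([str_format] if str_format else FORMATS):
        try:
            d = dt.datetime.strptime(some_time, fmt)
            return d.replace(tzinfo=dt.timezone.utc).timestamp()
        except ValueError:
            pass
    raise ValueError('Could not parse %r' % (some_time,))
```

Note that strings are interpreted as UTC, which is why datetime objects should carry tzinfo=dt.timezone.utc for consistent results.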
Define Where to Look for Data¶
Option 1
Use the
data_dir='/path/to/ocs/hk/data'
keyword argument. This should be the directory where all the first five digit ctime folders are located. load_range
will look for the data there.

Option 2
Set an environment variable
export OCS_DATA_DIR=/path/to/ocs/hk/data
load_range
will automatically look there if it isn’t overridden by Option 1.

Option 3
Use a configuration file. See Below.
Define Which Data to Load¶
Option 1
No keyword arguments means
load_range
will return every field it can find. This might take a very long time.

Option 2
Use the
fields = [list, of, field, names]
keyword argument. Example:

fields = [
    'observatory.LS240_ID.feeds.temperatures.Channel_7_T',
    'observatory.LS240_ID.feeds.temperatures.Channel_5_T',
]
Option 3
Use a configuration file. See Below.
Define How the Data is Returned¶
The data is returned as a dictionary of the format:
{
'name' : (time, data)
}
time
and data
are arrays of the times / data from each loaded field
Option 1
No keyword arguments means
load_range
will return name
set to be the field name. But this is long.

Option 2
Use the
alias = [list, of, desired, names]
keyword argument, which must be the same length as fields
. The dictionary will then use these aliases as the name
keys.

Option 3
Use a configuration file. See Below.
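Whichever option is used, the returned dictionary unpacks the same way. The field name and readings below are made up for illustration:

```python
import numpy as np

# Shape of a load_range() result (fake times/readings for illustration).
data = {
    '40k_dr_side': (np.array([1584000000.0, 1584000060.0]),   # unix times
                    np.array([41.2, 41.3])),                  # readings
}

# Each entry is a (time, data) tuple of equal-length arrays.
for name, (t, vals) in data.items():
    print(name, len(t), vals.mean())
```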
Create a Configuration file¶
Because why deal with all these individual pieces when you don’t have to?
Create a yaml
file and pass the filename to load_range
with
the config
keyword argument. Example file:
data_dir: '/data/ocs'
field_list:
'40k_dr_side' : 'observatory.LEIA.feeds.temperatures.Channel_7_T'
'40k_far_side': 'observatory.LEIA.feeds.temperatures.Channel_8_T'
'80k_dr_side' : 'observatory.LEIA.feeds.temperatures.Channel_5_T'
'80k_far_side': 'observatory.LEIA.feeds.temperatures.Channel_6_T'
'4k_far_side' : 'observatory.YODA.feeds.temperatures.Channel_1_T'
'4k_dr_side' : 'observatory.YODA.feeds.temperatures.Channel_2_T'
data_dir
sets the directory and field_list
has the list of 'alias':'field'
pairs.
Function References¶
- so3g.hk.load_range(start, stop, fields=None, alias=None, data_dir=None, config=None, pre_proc_dir=None, pre_proc_mode=None, daq_node=None, strict=True)[source]¶
- Parameters
start – Earliest time to search for data (see note on time formats).
stop – Latest time to search for data (see note on time formats).
fields – Fields to return, if None, returns all fields.
alias – If not None, must be a list of strings providing exactly one value for each entry in fields.
data_dir – directory where all the ctime folders are. If None, tries to use $OCS_DATA_DIR.
config – filename of a .yaml file for loading data_dir / fields / alias
pre_proc_dir – Place to store pickled HKArchiveScanners for g3 files to speed up loading
pre_proc_mode – Permissions (passed to os.chmod) to be used on dirs and pkl files in the pre_proc_dir. No chmod if None.
daq_node – String giving the type of HK book to load (e.g. satp1, lat, site), used if the daq_node name is not already part of data_dir. Set to None if the daq_node name is in data_dir, or when loading plain .g3 files.
strict – If False, log and skip missing fields rather than raising a KeyError.
- Returns
Dictionary with structure:
{ alias[i] : (time[i], data[i]) }
It will be masked to only have data between start and stop.
Notes
The “start” and “stop” argument accept a variety of formats, including datetime objects, unix timestamps, and strings (see to_timestamp function). In the case of datetime objects, you should set tzinfo=dt.timezone.utc explicitly if the system is not set to UTC time.
Example usage:
fields = [
    'observatory.HAN.feeds.temperatures.Channel 1 T',
    'observatory.HAN.feeds.temperatures.Channel 2 T',
]
alias = ['HAN 1', 'HAN 2']
start = dt.datetime(2020, 2, 19, 18, 48)
stop = dt.datetime(2020, 2, 22)
data = load_range(start, stop, fields=fields, alias=alias)

plt.figure()
for name in alias:
    plt.plot(data[name][0], data[name][1])
- so3g.hk.getdata.to_timestamp(some_time, str_format=None)[source]¶
Convert the argument to a unix timestamp.
- Parameters
some_time – If a datetime, it is converted to UTC timezone and then to a unix timestamp. If int or float, the value is returned unprocessed. If str, a date will be extracted based on a few trial date format strings.
str_format – a format string (for strptime) to try, instead of the default(s).
- Returns
Unix timestamp corresponding to some_time.
Exploring HK Data¶
The HKTree object provides a way to browse through available HK data fields from within an interactive python session (such as through the python or ipython interpreters or in a jupyter session).
Instead of accessing HK fields through their long string names, such as:
field_name = "observatory.hk.rotation.feeds.hwprotation.kikusui_curr"
a field can be referred to as a named object in a hierarchy of attributes:
tree = HKTree('2022-10-20', '2022-10-22', config='hk.yaml')
field = tree.hk.rotation.hwprotation.kikusui_curr
By exposing the available fields as a tree of attributes, rather than huge set of long string keys, a user working interactively can use tab-completion to find fields easily.
Having identified a field or set of fields of interest, the data can be loaded by calling pseudo-private methods on the field directly, e.g.:
data_dict = field._load()
After calling load, the data are stored in the field reference itself, and so can be retrieved with:
times, values = field._data
The _load
method can be called on “non-terminal” nodes of the
tree, for example:
data_dict = tree.hk.rotation._load()
would load all the fields that are sub- (or sub-sub-, …) attributes
of tree.hk.rotation
.
The system is intended to complement so3g.hk.load_range()
, and
uses the same sort of configuration file.
See more detailed examples below, as well as the Class Reference.
Instantiating an HKTree¶
To create an HKTree requires at least a path to an “OCS style” data
directory (in the same sense as so3g.hk.load_range()
):
tree = hk.HKTree(data_dir='/mnt/so1/data/ucsd-sat1/hk/')
By default, the returned object will only look through data from the past 24 hours. To specify a range of dates of interest, use the start and stop parameters:
tree = hk.HKTree(data_dir='/mnt/so1/data/ucsd-sat1/hk/',
start='2022-10-20', stop='2022-10-22')
As with load_range
, passing data_dir
is not necessary if the
OCS_DATA_DIR
environment variable is set. However, a config file
may be the best way to go.
Configuration file¶
The configuration file syntax is as in load_range
. However you
can also specify:
pre_proc_dir
: The default value for pre_proc_dir
.
skip_tokens
: Tokens to append to the skip
parameter for HKTree()
.
Aliases are treated in a special way by HKTree; see Using ._aliases below.
Finding and Loading data¶
The nodes in the tree are all either “terminal” or not. The terminal
nodes are associated with a single specific HK field
(e.g. tree.hk.rotation.hwprotation.kikusui_curr
) while
non-terminal nodes do not have associated fields, but have child
attributes (e.g. tree.hk.rotation
).
Data is loaded by calling ._load()
on any node in the tree. This
function will return a dictionary of data, mapping the full field names
to tuples of data (this is similar to what load_range
returns).
Following load, the tuples are also available in the ._data
attribute of each terminal node. For example:
data_dict = tree.hk.rotation._load()
will return a dict with multiple fields. But after that call one can also access single field data on terminal nodes, e.g.:
t, val = tree.hk.rotation.kikusui_curr._data
If you want to clear RAM / start over, call ._clear()
on any node
in the tree to clear the data from all its child nodes. E.g.:
tree.hk._clear()
Using ._aliases¶
The load_range
function permits users to associate aliases with
long field names, using the field_list
in the config file (or the
aliases
argument). The HKTree
makes those fields available,
under their aliases, in a special attribute called aliases. For
example the config assignments like this:
field_list:
't40k_dr_side' : 'observatory.LEIA.feeds.temperatures.Channel_7_T'
't40k_far_side': 'observatory.LEIA.feeds.temperatures.Channel_8_T'
would lead to the existence of attributes:
tree._aliases.t40k_dr_side
tree._aliases.t40k_far_side
(Note that attributes can’t be accessed if they begin with a digit or contain special characters… so avoid them in your field_list.)
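That restriction is easy to check yourself: Python's built-in str.isidentifier tells you whether an alias will be reachable as an attribute (which is why the example config above prefixes aliases with 't'):

```python
# Aliases must be valid Python identifiers to be usable as attributes.
print('40k_dr_side'.isidentifier())    # False: starts with a digit
print('t40k_dr_side'.isidentifier())   # True: safe to use in field_list
```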
Fields exposed under ._aliases
also exist in the full tree – the
terminal attributes here are actually the same objects. You
can run ._load()
and ._clear()
on ._aliases
and it will
operate on all the attributes contained there.
In particular note that tree._aliases._load()
should return a data
dictionary where the alias names are used as the keys, instead of the full field names.
You can dynamically add new fields to the aliases list (as a way of
grouping things together under shorter names) using
tree._add_alias()
, providing an alias string and a target field.
The following are equivalent:
tree._add_alias('short_name', tree.LEIA.temperatures.Channel_1_T)
tree._add_alias('short_name', 'observatory.LEIA.feeds.temperatures.Channel_1_T')
You can add a sub-tree with children; the aliases will be generated automatically; for example:
tree._add_alias('therms', tree.LEIA.temperatures)
Would create aliases called 'therms_Channel_1T'
,
'therms_Channel_2T'
, etc.
Class Reference¶
You should see auto-generated class documentation below.
- class so3g.hk.tree.HKTree(start=None, stop=None, config=None, data_dir=None, pre_proc_dir=None, aliases=None, skip=['observatory', 'feeds'])[source]¶
Scan an HK archive, between two times, and create an attribute tree representing the HK data.
- Parameters
start (time) – Earliest time to include (defaults to 1 day ago).
stop (time) – Latest time to include (defaults to 1 day after start).
config (str) – Filename of a config file (yaml). Alternately a dict can be passed in directly.
data_dir (str) – The root directory for the HK files.
pre_proc_dir (str) – Directory to use to store/retrieve first-pass scanning data (see HKArchiveScanner).
aliases (dict) – Map from alias name to full field name. This setting does not override aliases from the config file but instead will extend them.
skip (list of str) – Tokens to suppress when turning feed names (e.g. “observatory.X.feeds.Y”) into tree components (e.g. X.Y).
Notes
Initialization of the tree requires a “first pass” scan of the HK data archive. The time range you specify is thus very important in limiting the amount of IO activity. The arguments passed are closely related to the load_range function in this module; see that docstring.
Config files that work with load_range should also work here.
- _add_alias(alias, full_ref, _strip_prefix=None)[source]¶
Add a field to the set of aliases.
- Parameters
Notes
If the full_ref is a non-terminal HKRef, then all children will be added in. The alias for each will be constructed by combining the provided alias string and the attribute names of each child node, joined with ‘_’.
- class so3g.hk.tree.HKRef(name, parent, terminal)[source]¶
Node in an HKTree attribute tree.
Because public child attributes are generated dynamically, important functionality is hidden in “private members”. The
_load
and_clear
methods are documented below, but be aware also of the following attributes:

_data
: The loaded data, as a tuple of arrays (t, val).
_t
: alias for _data[0].
_val
: alias for _data[1].
_private
: A dict of information for managing the reference, including the full field name, the root tree object, the list of child refs.
Reading HK Data¶
Typically one is working with an archive of HK files. Loading the data is a two-step process:
Scan the files and cache meta-data about what fields are present at what times.
Seek to the right parts of the right files and load the data for specific fields over a specific time range.
These two steps are accomplished with HKArchiveScanner
and
HKArchive
, respectively. The rest of the section walks through an
example demonstrating basic usage of these two classes.
Scanning a set of files¶
Here is how to instantiate an HKArchiveScanner and scan a bunch of files:
from so3g import hk
import glob
scanner = hk.HKArchiveScanner()
files = glob.glob('/path/to/data/*.g3')
for f in files:
scanner.process_file(f)
arc = scanner.finalize()
The returned object, arc
, is an instance of HKArchive
, which can be used to load the actual data vectors.
Reading the data¶
To load data for a specific list of channels over a specific range of
times, call HKArchive.simple
:
fields = ['LSA12.v_channel_1', 'LSA12.v_channel_2']
data = arc.simple(fields)
(t1, voltage1), (t2, voltage2) = data
At the end of this, t1
and t2
are vectors of timestamps and
voltage1
and voltage2
contain the (ostensible) voltage
readings.
To restrict to some time range, pass unix timestamps as the next two
arguments (you can also pass them as keyword arguments, start=
or
end=
):
time_range = (1567800000, 1567900000)
data = arc.simple(fields, time_range[0], time_range[1])
Abbreviating field names¶
Full channel names in an observatory can be quite long
(e.g. “observatory.lat1_agg.LSA1234.Thermometer_29”), so this function
allows you to use shortened forms for the channel names in some
circumstances. Suppose that the HK file set in the example above
contains only channel names beginning with 'LSA12.'
. Then the
function will understand if you ask for:
fields = ['v_channel_1', 'v_channel_2']
data = arc.simple(fields)
An error will be raised if the fields cannot be unambiguously matched
in the time range requested. Note that the logic is only able to
include or exclude parts of the channel name between dots… so it
would not work to request fields = ['_1', '_2']
. The caller can
suppress such matching by passing short_match=False
.
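The dot-boundary rule can be sketched as follows. This is a simplified model of the matching behavior, not the library's actual implementation:

```python
def dot_suffix_match(short, full):
    """True if `short` matches a suffix of `full` on dot boundaries."""
    short_parts = short.split('.')
    return full.split('.')[-len(short_parts):] == short_parts

print(dot_suffix_match('v_channel_1', 'LSA12.v_channel_1'))  # True
print(dot_suffix_match('_1', 'LSA12.v_channel_1'))           # False
```

Because matching works on whole dot-separated tokens, '_1' cannot match the tail of 'v_channel_1'.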
Co-sampling information¶
Why is the method that gets data called simple
? Because there is
a more sophisticated method called get_data
that returns the data
in a somewhat more structured form. This form should be used when
you care about what fields are co-sampled. See docstring.
Class references¶
- class so3g.hk.HKArchive(field_groups=None)[source]¶
Container for information necessary to determine what data fields are present in a data archive at what times. This object has methods that can determine what fields have data over a given time range, and can group fields that share a timeline (i.e. are co-sampled) over that range.
- get_fields(start=None, end=None)[source]¶
Get list of fields that might have a sample in the time interval [start,end).
Returns the pair of dictionaries
(fields, timelines)
. The
fields
dictionary is a map from field name to a block of field information. The timelines
dictionary is a map from timeline name to a block of timeline information.
- get_data(field=None, start=None, end=None, min_stride=None, raw=False, short_match=False)[source]¶
Load data from specified field(s) between specified times.
Arguments
field
,start
,end
,short_match
are as described in _get_groups.
- Parameters
- Returns
Pair of dictionaries, (data, timelines). The
data
dictionary is a simple map from field name to a numpy array of readings. The timelines
dictionary is a map from field group name to a dictionary of timeline information, which has entries:

't'
: numpy array of timestamps
'fields'
: list of fields belonging to this group.
'finalized_until'
: in cases where the data are still in flux, this field provides a timestamp that may be taken as the earliest time that needs to be requeried. This is part of the interface in order to support data streams that are being updated in real time.
If user requested raw=True, then return value is a list of tuples of the form (group_name, block) where block is a single G3TimesampleMap carrying all the data for that co-sampled group.
- simple(fields=None, start=None, end=None, min_stride=None, raw=False, short_match=True)[source]¶
Load data from specified field(s) between specified times, and unpack the data for ease of use. Use get_data if you want to preserve the co-sampling structure.
Arguments
field
,start
,end
,short_match
are as described in _get_groups. However, fields
can be a single string rather than a list of strings.

Note that
short_match
defaults to True (which is not the case for get_data).
- Returns
List of pairs of numpy arrays (t,y) corresponding to each field in the
fields
list. If fields
is a string, a simple pair (t, y) is returned. t
and y
are numpy arrays of equal length containing the timestamps and field readings, respectively. In cases where two fields are co-sampled, the time vector will be the same object. In cases where there are no data for the requested field in the time range, a pair of length-0 float arrays is returned.
- class so3g.hk.HKArchiveScanner(pre_proc_dir=None, pre_proc_mode=None)[source]¶
Consumes SO Housekeeping streams and creates a record of what fields cover what time ranges. This can run as a G3Pipeline module, but will not be able to record stream indexing information in that case. If it’s populated through the process_file method, then index information (in the sense of filenames and byte offsets) will be stored.
After processing frames, calling .finalize() will return an HKArchive that can be used to load data more efficiently.
- Process(f, index_info=None)[source]¶
Processes a frame. Only Housekeeping frames will be examined; other frames will simply be counted. All frames are passed through unmodified. The index_info will be stored along with a description of the frame’s data; see the .process_file function.
- flush(provs=None)[source]¶
Convert any cached provider information into _FieldGroup information. Delete the provider information. This will be called automatically during frame processing if a provider session ends. Once frame processing has completed, this function should be called, with no arguments, to flush any remaining provider sessions.
- finalize()[source]¶
Finalize the processing by calling flush(), then return an HKArchive with all the _FieldGroup information from this scanning session.
- process_file(filename, flush_after=True)[source]¶
Process the file specified by
filename
using a G3IndexedReader. Each frame from the file is passed to self.Process, with the optional index_info argument set to a dictionary containing the filename and byte_offset of the frame.Internal data grouping will be somewhat cleaner if the multiple files from a single aggregator “session” are passed to this function in acquisition order. In that case, call with flush_after=False.
- process_file_with_cache(filename)[source]¶
Processes file specified by
filename
using the process_file method above. If self.pre_proc_dir is specified (not None), it will load pickled HKArchiveScanner objects and concatenates with self instead of re-processing each frame, if the corresponding file exists. If the pkl file does not exist, it processes it and saves the result (in the pre_proc_dir) so it can be used in the future. If self.pre_proc_dir is not specified, this becomes equivalent to process_file.
Checking Files with so-hk-tool¶
The command line tool so-hk-tool
can be used to scan one or more SO
HK files and summarize info about the files, the providers, and the
fields within. For example:
$ so-hk-tool list-provs /mnt/so1/data/ucsd-sat1/hk/16685/1668577223.g3
provider_name total_bytes frame_bytes
--------------------------------------- ----------- -----------
BK9130C-1.psu_output 67152 11192.0
BK9130C-2.psu_output 67152 11192.0
DRptc1.ptc_status 1006500 16775.0
LSA21US.temperatures 388271 6365.1
LSA21YC.temperatures 869316 144886.0
LSA22QC.temperatures 585204 97534.0
LSA24LY.temperatures 897660 14961.0
LSA24M5.temperatures 896376 14939.6
LSA2619.temperatures 895620 14927.0
SSG3KRM-2_2.ups 34169 5694.8
...
The tool can run the following analyses:
list-files
: List each file, its size, and report any stray bytes (partial trailing frames).
list-provs
: List each provider found in the files.
list-fields
: List each field found in the files.
See more details below. Please note that when presenting provider and
field names, the program strips out tokens observatory
and
feeds
, by default (for example, observatory.DRptc1.feeds.ptc_status
becomes DRptc1.ptc_status
). Pass --strip-tokens="" to instead
show the full provider / feed names.
usage: so-hk-tool [-h] {list-files,list-provs,list-fields} ...
Positional Arguments¶
- mode
Possible choices: list-files, list-provs, list-fields
Sub-commands:¶
list-files¶
Report per-file stats.
so-hk-tool list-files [options] FILE [...]
This module reads each file and reports basic stats such as size and
whether the stream is valid.
Positional Arguments¶
- files
One or more G3 files to scan.
options¶
- --recursive, -r
All arguments are traversed recursively; only files with a .g3 extension are scanned.
Default: False
- --strip-tokens
Tokens to hide in provider and field names. Pass this as a single .-delimited string.
Default: “observatory.feeds”
- --block-size, -B
Summarize storage in units of bytes,kB,MB,GB (pass b,k,M,G).
Default: “b”
- --sort-size, -s
Sort results, if applicable, by size (descending).
Default: False
- --csv
Store data as CSV to specified filename.
list-provs¶
List all data providers (feeds).
so-hk-tool list-provs [options] FILE [...]
This module reads all specified files and reports a list of
all data providers (a.k.a. feeds) encountered in the data,
along with total data volume and average frame size, per
provider.
Positional Arguments¶
- files
One or more G3 files to scan.
options¶
- --recursive, -r
All arguments are traversed recursively; only files with a .g3 extension are scanned.
Default: False
- --strip-tokens
Tokens to hide in provider and field names. Pass this as a single .-delimited string.
Default: “observatory.feeds”
- --block-size, -B
Summarize storage in units of bytes,kB,MB,GB (pass b,k,M,G).
Default: “b”
- --sort-size, -s
Sort results, if applicable, by size (descending).
Default: False
- --csv
Store data as CSV to specified filename.
list-fields¶
List all data field names.
so-hk-tool list-fields [options] FILE [...]
This module reads all specified files and reports a list of
all data fields with their total sample count.
Positional Arguments¶
- files
One or more G3 files to scan.
options¶
- --recursive, -r
All arguments are traversed recursively; only files with a .g3 extension are scanned.
Default: False
- --strip-tokens
Tokens to hide in provider and field names. Pass this as a single .-delimited string.
Default: “observatory.feeds”
- --block-size, -B
Summarize storage in units of bytes,kB,MB,GB (pass b,k,M,G).
Default: “b”
- --sort-size, -s
Sort results, if applicable, by size (descending).
Default: False
- --csv
Store data as CSV to specified filename.
Run 'so-hk-tool COMMAND --help' to see additional details and options.
HK Data Types and File Structure¶
The HK file structures and versions are described in https://github.com/simonsobs/tod2maps_docs/tod_format.
As of August 2021, all HK data uses schema version 2, which supports
vectors of float, integer, or string data. The original form (schema
version 0) supported only floats, and required access to compiled G3
extensions in so3g. In case there are still some schema 0 data out
there, conversion of individual files can be achieved with help from
so3g.hk.HKTranslator
.
Writing HK Data¶
The so3g.hk module provides limited assistance with creating HK data
files. The so3g.hk.HKSessionHelper
may be used to produce
template frames that can be used as a basis for an HK data stream.
However, the code in this module does not enforce validity. (The OCS
“aggregator” Agent has more sophisticated logic to help write only
valid HK frame streams.)
Here is a short example that creates a housekeeping file containing
some fake pointing information (it can be found in the repository as
demos/write_hk.py
):
# Note this demo is included directly from the package docs!
#
# This code generates a G3 file containing some telescope pointing
# data in the "SO HK" format. When expanding it, check the SO HK
# format description to make sure your frame stream is compliant.
import time
import numpy as np
from spt3g import core
from so3g import hk
# Start a "Session" to help generate template frames.
session = hk.HKSessionHelper(hkagg_version=2)
# Create an output file and write the initial "session" frame. If
# you break the data into multiple files, you write the session frame
# at the start of each file.
writer = core.G3Writer('hk_example.g3')
writer.Process(session.session_frame())
# Create a new data "provider". This represents a single data
# source, sending data for some fixed list of fields.
prov_id = session.add_provider('platform')
# Whenever there is a change in the active "providers", write a
# "status" frame.
writer.Process(session.status_frame())
# Write, like, 10 half-scans.
frame_time = time.time()
v_az = 1.5 # degrees/second
dt = 0.001 # seconds
halfscan = 10 # degrees
for i in range(10):
# Number of samples
n = int(halfscan / v_az / dt)
# Vector of unix timestamps
t = frame_time + dt * np.arange(n)
# Vector of az and el
az = v_az * dt * np.arange(n)
if i % 2:
az = -az
el = az * 0 + 50.
# Construct a "block", which is a named G3TimesampleMap.
block = core.G3TimesampleMap()
block.times = core.G3VectorTime([core.G3Time(_t * core.G3Units.s)
for _t in t])
block['az'] = core.G3VectorDouble(az)
block['el'] = core.G3VectorDouble(el)
# Create an output data frame template associated with this
# provider.
frame = session.data_frame(prov_id)
# Add the block and block name to the frame, and write it.
frame['block_names'].append('pointing')
frame['blocks'].append(block)
writer.Process(frame)
# For next iteration.
frame_time += n * dt
When extending this example for other purposes, here are a few things to remember, to help generate valid HK streams:
Notice that “block” is a G3TimesampleMap, a class designed to store multiple data vectors alongside a single vector of timesamples. If your provider has fields with different time sampling, group them so each block corresponds to mutually co-sampled fields.
The “block name” is an internal bookkeeping thing and won’t be visible to the consumer of the HK data. In the example above, the data vectors would be exposed through the combination of the provider and field name, i.e. “platform.az”, “platform.el”.
Once you start a provider, and get a
prov_id
for it, then any named blocks you add must always have the same fields with the same data types. (For example, it would be illegal to have some frames where the block contains only “el” and not “az”.) If you need to change the list of fields, use remove_provider()
and then re-add the provider (with the same name).
Below, find the documentation for the HKSessionHelper class.
- class so3g.hk.HKSessionHelper(session_id=None, start_time=None, hkagg_version=None, description='No description provided.')[source]¶
Helper class to produce G3Frame templates for creating streams of generic HK data.
- Parameters
session_id – an integer session ID for the HK session. If not provided (recommended) then it will be generated based on the PID, the start_time, and the description string.
start_time (float) – a timestamp to use for the HK session.
hkagg_version (int) – schema version code, which will be written into every frame. Defaults to zero, for backwards compatibility. Make sure this is set correctly.
description (str) – a description of the agent generating the stream.
- add_provider(description=None)[source]¶
Register a provider and return the unique prov_id. (Remember to write a status frame before starting to write data for this provider.)
- Parameters
description (str) – The name to use for the provider. When later retrieving the data, this will act as a prefix for all the data fields.
- Returns
prov_id (int).
- remove_provider(prov_id)[source]¶
Drops a provider from the active list, so it will not be listed in the status frame.
- Parameters
prov_id (int) – The provider ID returned by add_provider.
- session_frame()[source]¶
Return the Session frame. No additional information needs to be added to this frame. This frame initializes the HK stream (so it should be the first frame of each HK file; it should precede all other HK frames in a network source).
Low-level HK Stream Processing¶
There are a few HK stream processing objects that are intended for use
as modules in a G3Pipeline
. These are their stories.
Module: HKScanner¶
- class so3g.hk.HKScanner[source]¶
Module that scans and reports on HK archive contents and compliance.
- stats¶
A nested dictionary of statistics that are updated as frames are processed by the module. Elements:
n_hk
(int): The number of HK frames encountered.
n_other
(int): The number of non-HK frames encountered.
n_session
(int): The number of distinct HK sessions processed.
concerns
(dict): The number of warning (key n_warning
) and error (key n_error
) events encountered. The detail for such events is logged to spt3g.core.log_warning
/ log_error
.
versions
(dict): The number of frames (value) encountered that have a given hk_agg_version (key).
- Type
dict
Module: HKReframer¶
Module: HKTranslator¶
- class so3g.hk.HKTranslator(target_version=2, future_tolerant=True)[source]¶
Translates SO Housekeeping frames from schema versions {v0, v1} to schema version 2. Passes v2 (or newer) frames through, unmodified. This can be used in a G3Pipeline to condition archival HK streams for processing by v2-compatible code. (Note that code that works with the short-lived v1 schema should also work on a v2 stream, unless it explicitly rejects based on hkagg_version.)
Versions 1/2 are not a strict superset of version 0, but the main structural feature eliminated in v1 (field prefixes) was not really used.
Arguments:
- target_version (int): 0, 1, or 2. Version to which to translate
the stream. The code is not able to downgrade a stream. See future_tolerant parameter.
- future_tolerant (bool): Determines the behavior of the
translator should it encounter a frame with hkagg_version higher than target_version. If future_tolerant is True, the frame will be passed through unmodified. Otherwise, a ValueError will be raised.
- Process(f)[source]¶
Translates one frame to the target schema. Irrelevant frames are passed through unmodified.
- Parameters
f – a G3Frame
- Returns
A list containing only the translated frame. G3Pipeline compatibility would permit us to return a single frame here, instead of a length-1 list. But we also sometimes call Process outside of a G3Pipeline, where a consistent output type is desirable. Returning lists is most future-compatible; consumers that want to assume length-1 should assert it to be true.