C++ Objects

Intervals

The Intervals<T> objects have the following interface:

class so3g.IntervalsDouble

A finite series of non-overlapping semi-open intervals on a domain of type: double.

add_interval(start, end) -> IntervalsDouble

Merge an interval into the set.

append_interval_no_check(start, end) -> IntervalsDouble

Append an interval to the set without checking for overlap or sequence.

merge(arg2) -> IntervalsDouble

Merge an Intervals into the set.

intersect(source) -> IntervalsDouble

Intersect another IntervalsDouble with this one.

complement() -> IntervalsDouble

Return the complement (over domain).

array() -> object

Return the intervals as a 2-d numpy array.

static from_array(input_array) -> IntervalsDouble

Return an IntervalsDouble based on an (n,2) ndarray.

static from_mask(input_array, n_bits) -> object

Return a list of IntervalsDouble, extracted from the first n_bits bits of input_array (a 1-d array of integer type).

static mask(intervals_list, n_bits) -> object

Return an ndarray bitmask from a list of IntervalsDouble. The dtype will be the smallest available to hold n_bits.
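The relationship between a bitmask and a list of per-bit intervals, as described for from_mask and mask, can be sketched in plain Python. This is only an illustration of the idea and does not call so3g: the function names intervals_from_mask and mask_from_intervals, the use of Python lists instead of ndarrays, the n_samples parameter, and the use of sample indices as the interval domain are all assumptions made for this example.

```python
def intervals_from_mask(mask, n_bits):
    """For each of the lowest n_bits bits, list the semi-open [start, end)
    index ranges over which that bit is set. (Hypothetical helper.)"""
    result = []
    for bit in range(n_bits):
        runs, start = [], None
        for i, value in enumerate(mask):
            is_set = (value >> bit) & 1
            if is_set and start is None:
                start = i                      # a run of set bits begins
            elif not is_set and start is not None:
                runs.append((start, i))        # the run ended just before i
                start = None
        if start is not None:
            runs.append((start, len(mask)))    # run extends to the end
        result.append(runs)
    return result


def mask_from_intervals(intervals_list, n_samples):
    """Inverse operation: set bit number k wherever the k-th interval
    list covers the sample index. (Hypothetical helper.)"""
    mask = [0] * n_samples
    for bit, runs in enumerate(intervals_list):
        for start, end in runs:
            for i in range(start, end):
                mask[i] |= 1 << bit
    return mask


mask = [0, 1, 1, 3, 2, 2, 0, 1]
per_bit = intervals_from_mask(mask, 2)
print(per_bit)  # [[(1, 4), (7, 8)], [(3, 6)]]
```

Converting per_bit back with mask_from_intervals(per_bit, len(mask)) reproduces the original mask, which is the round trip one would expect from a from_mask / mask pair.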

copy() -> IntervalsDouble

Get a new object with a copy of the data.
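To make the semi-open interval convention concrete, here is a minimal pure-Python sketch of merging an interval into a set and taking the complement over a domain. It does not use so3g. The functions add_interval and complement below are stand-alone helpers written for this example; the decision to fuse intervals that merely touch, and the requirement that all intervals lie inside the domain, are assumptions rather than documented so3g behavior.

```python
def add_interval(intervals, start, end):
    """Merge the semi-open interval [start, end) into a list of
    disjoint intervals, returning a new sorted, disjoint list."""
    merged = []
    for lo, hi in sorted(intervals + [(start, end)]):
        if merged and lo <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def complement(intervals, domain):
    """Return the parts of domain = (lo, hi) not covered by the sorted,
    disjoint intervals (assumed to lie inside the domain)."""
    gaps = []
    cursor, domain_end = domain
    for lo, hi in intervals:
        if lo > cursor:
            gaps.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < domain_end:
        gaps.append((cursor, domain_end))
    return gaps


ivals = []
for start, end in [(0.0, 2.0), (5.0, 7.0), (1.5, 3.0), (3.0, 4.0)]:
    ivals = add_interval(ivals, start, end)

print(ivals)                           # [(0.0, 4.0), (5.0, 7.0)]
print(complement(ivals, (0.0, 10.0)))  # [(4.0, 5.0), (7.0, 10.0)]
```

Because the intervals are semi-open, [1.5, 3.0) and [3.0, 4.0) share no points but leave no gap, which is why this sketch joins them into one interval.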

G3SuperTimestream

G3SuperTimestream is for storing 2d arrays of data where the first axis corresponds to named channels and the second axis indexes time. The data arrays can have int32, int64, float32, or float64 data types. It has configurable compression options in order to accommodate different kinds of data.

For arrays of integers, lossless compression is enabled by default. For arrays of floats, compression can be enabled; it is lossless only over a restricted dynamic range and at a chosen precision.

Creating a G3SuperTimestream

To construct a G3SuperTimestream, populate the axis information, then load in the data. Here is an example:

# imports
from spt3g import core
import so3g
import numpy as np

# The data we want to capture
times = 1680000000 + 0.2 * np.arange(10000)
names = ['a', 'b', 'c', 'd', 'e']
data = (np.random.normal(size=(len(names), len(times))) * 256).astype('int32')

# Creation of a G3SuperTimestream
ts = so3g.G3SuperTimestream()
ts.names = names
ts.times = core.G3VectorTime(times * core.G3Units.s)
ts.data = data

The object is now complete, and can be serialized. In the default configuration, arrays with int32 or int64 data types will be compressed losslessly using a combination of FLAC and bzip.

There are some overloaded constructors, so the last bit can be done as a one-liner, i.e.:

# Creation of a G3SuperTimestream in one line
ts = so3g.G3SuperTimestream(
  names, core.G3VectorTime(times * core.G3Units.s), data)
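As a loose illustration of what "lossless" means for the integer compression mentioned above, the following sketch round-trips an array of integers through the Python standard library's bz2 module. This is not the so3g encoding: the FLAC stage is omitted entirely, the synthetic sample values are made up for the example, and so3g's internal blob format is not reproduced here.

```python
import array
import bz2

# Synthetic integer samples (made up for this sketch).
samples = array.array("i", [((i * 37) % 512) - 256 for i in range(10000)])

raw = samples.tobytes()          # flat memory block of native ints
blob = bz2.compress(raw)         # compressed binary blob

# Decompress and rebuild the integer array.
restored = array.array("i")
restored.frombytes(bz2.decompress(blob))

print(restored == samples)       # True: every sample is recovered exactly
print(len(blob) < len(raw))      # True for this highly regular data
```

The compression ratio seen here says nothing about real detector data; the point is only that the restored integers match the originals bit for bit.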

Controlling Compression

You can trigger compression of the data by calling .encode(). The reference in .data is released, and a binary blob with compressed data is saved internally. If you try to access .data after calling .encode(), a new array will be created (by decompressing the blob) and returned.

Compression will be triggered automatically on frame serialization (i.e. when the frame is written to a file or network stream). Because serialization is a const operation, the binary blob of compressed data is not stored in this case, and the reference to .data is not released. So there may be performance advantages to calling .encode() “manually” before passing your object through to consumers that might want to use it in serialized form.

It is possible to tweak the compression algorithms, through the .options method, but this should be done with care. For compression evaluation and basic debugging one probably only wants to use the highest level control, which simply enables or disables compression:

ts.options(enable=0)  # disable compression
ts.options(enable=1)  # enable compression with default params

Two arguments allow finer-grained control over the FLAC and BZ2 algorithms and should not cause trouble (other than inefficiency) if manipulated by the user:

flac_level

The FLAC compression level, passed through to FLAC__stream_encoder_set_compression_level. Integer from 0 to 8 with higher numbers corresponding to slower but potentially better compression.

bz2_workFactor

The bzip2 workFactor, as described in BZ2_bzCompressInit. This controls how soon the bz2 algorithm switches from its standard sorting algorithm to a slower fallback algorithm when presented with difficult (highly repetitive) data.

The additional arguments, data_algo and times_algo, are for debugging and should not be messed with lightly.

How to work with float arrays

The G3SuperTimestream can also be used to carry non-integer data, which is presented to the user as arrays of float32 or float64. For channels indexed by i and samples indexed by t, the non-integer data array x will be discretized for compression and serialization according to \(x_{it} = y_{it} \times q_i\), where y is an array of integers and q are per-channel quanta (float64).

There are two ways to enter “float mode”. These are demonstrated in two examples.

Example 1: Populate a G3SuperTimestream with an integer array, then apply a calibration factor (one per channel) using .calibrate:

# imports
from spt3g import core
import so3g
import numpy as np

# The data we want to capture
times = 1680000000 + 0.2 * np.arange(10000)
names = ['a', 'b', 'c', 'd', 'e']
data = (np.random.normal(size=(len(names), len(times))) * 256).astype('int32')

# Creation of a G3SuperTimestream
ts = so3g.G3SuperTimestream(
  names, core.G3VectorTime(times * core.G3Units.s), data)

# Calibrate to, like, pW or something.
pW_per_DAC = [1.23, 1.45, 1.89, 1.56, 1.01]
ts.calibrate(pW_per_DAC)

Example 2: Populate a G3SuperTimestream by first assigning discretization units (one per channel) to .quanta, and then assigning a float array to .data:

# imports
from spt3g import core
import so3g
import numpy as np

# The data we want to capture
times = 1680000000 + 0.2 * np.arange(10000)
names = ['a', 'b', 'c', 'd', 'e']
data = (np.random.normal(size=(len(names), len(times))) * 256).astype('float64')

# Creation of a G3SuperTimestream -- note we must set .quanta before
# setting .data to a float array.
ts = so3g.G3SuperTimestream(names, core.G3VectorTime(times * core.G3Units.s))
ts.quanta = 0.01 * np.ones(len(names))
ts.data = data

Note that the last few lines are equivalent to:

# Creation of a G3SuperTimestream carrying floats with resolution 0.01
ts = so3g.G3SuperTimestream(
  names, core.G3VectorTime(times * core.G3Units.s),
  data, 0.01 * np.ones(len(names)))

This object is not suitable for lossless compression of arbitrary float32 and float64 arrays. The operator needs to have some idea of what resolution must be preserved, i.e. how much rounding of arbitrary float data is acceptable in the application.

Prior to compression the .data member will look like a float array; but when compression is requested (or serialization occurs), the following will happen:

  • The array of floats x is converted to integers, y = round(x / quantum). If x is float32, then y will be packed into int32. If x is float64, then y will be int64.

  • The integer array y is compressed according to the lossless scheme used for integer data.
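The discretization step can be sketched in plain Python, without so3g. The helpers quantize and dequantize below are names invented for this example, and nested lists stand in for 2d arrays. The sketch applies y = round(x / quantum) per channel and then reconstructs x as y times the quantum, showing that the reconstruction error stays within half a quantum. Python's round() may treat exact ties differently from the library's rounding; the values here avoid ties.

```python
def quantize(x, quanta):
    """Convert float rows to integer rows, one quantum per channel."""
    return [[round(v / q) for v in row] for row, q in zip(x, quanta)]


def dequantize(y, quanta):
    """Rebuild approximate floats from the integers and quanta."""
    return [[k * q for k in row] for row, q in zip(y, quanta)]


# Two channels, three samples each, with different resolutions.
quanta = [0.01, 0.5]
x = [[0.0, 1.234, -5.678],
     [100.004, 0.26, -0.74]]

y = quantize(x, quanta)
x_restored = dequantize(y, quanta)

# Largest absolute reconstruction error in each channel.
max_errors = [
    max(abs(a - b) for a, b in zip(row, restored_row))
    for row, restored_row in zip(x, x_restored)
]

print(y)           # [[0, 123, -568], [200, 1, -1]]
print(max_errors)  # each entry is at most half of that channel's quantum
```

Choosing a smaller quantum reduces the rounding error but makes the integers larger, which is the trade-off behind the dynamic range limits discussed next.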

The precision of a particular non-zero float32 is approximately \(2^{-23}\) (1.2e-7) times its magnitude. If we fix the precision at 0.001, then the set of numbers that can be safely encoded by our float32 scheme is all multiples of 0.001 between -8388.6 and +8388.6.

Equivalently for float64 the precision is \(2^{-52}\) (2.2e-16) times magnitude, so with a precision of 0.001 we have dynamic range of about -4.5e12 to +4.5e12.
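The limits quoted above follow from multiplying the quantum by 2 raised to the number of mantissa bits. The sketch below reproduces that arithmetic and also checks, using the standard library's struct module to emulate float32 storage, that a multiple of 0.001 inside the safe range survives the round trip while one far outside it does not. The helper names safe_limit, to_float32 and survives_float32 are invented for this example.

```python
import struct


def safe_limit(quantum, mantissa_bits):
    """Largest magnitude at which steps of `quantum` are still resolved."""
    return quantum * 2 ** mantissa_bits


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def survives_float32(k, quantum):
    """Store k * quantum as float32, then recover the integer multiple."""
    x32 = to_float32(k * quantum)
    return round(x32 / quantum) == k


print(safe_limit(0.001, 23))   # about 8388.6 (float32)
print(safe_limit(0.001, 52))   # about 4.5e12 (float64)

print(survives_float32(1234567, 0.001))     # True: 1234.567 is within range
print(survives_float32(100000001, 0.001))   # False: 100000.001 is out of range
```

Near 100000 the spacing between adjacent float32 values is about 0.008, so a step of 0.001 is lost when the value is stored, which is why the second check fails.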

Working in C++

Here is an example of constructing a G3SuperTimestream from within C++, using the method SetDataFromBuffer to pass in a flat memory buffer. The method copies data from a C-ordered memory block into a new numpy array, allowing the caller to re-use the memory block. A rough example is presented below; see also the implementation of test_cxx_interface() in G3SuperTimestream.cxx.

 // Consider int32 array with 3 channels and 1000 samples.
 int shape[2] = {3, 1000};
 int typenum = NPY_INT32;

 // Use a flat buffer for storage.
 void *buf = calloc(shape[0] * shape[1], sizeof(int32_t));

 // ... fill up buf somehow ...

 // Create and manage a new G3SuperTimestream.
 auto ts = G3SuperTimestreamPtr(new G3SuperTimestream());

 // Set the channel names and timestamps.
 const char *chans[] = {"a", "b", "c"};
 ts->names = G3VectorString(chans, std::end(chans));
 ts->times = G3VectorTime();
 for (int i=0; i<shape[1]; i++)
   ts->times.push_back(G3Time::Now());

 // Set compression options?
 // ts->Options(encode=0);

 // Set ts->data, by copying data from our buffer.
 ts->SetDataFromBuffer(buf, 2, shape, typenum, std::pair<int,int>(0, shape[1]));

 // Do something with ts...
 // writer->Process(ts);

 // Free what we allocated.
 free(buf);

Note

Because SetDataFromBuffer creates a new Python object (a numpy array), you should be holding the global interpreter lock (GIL) when that function is called, and also when you call Encode() or Decode(). Because G3SuperTimestream (sometimes) holds references to Python objects, using the object in “pure” C++ applications is non-trivial. But it’s possible, and could be made smoother if need be.

Interface autodoc

class so3g.G3SuperTimestream

Construct with names and times uninitialized.

encode()

Compress.

decode()

Decompress.

calibrate(cal_factors)

Apply per-channel scale factors. Note this puts you in float mode (if you’re not already) and modifies both .quanta and .data.

options()

Set compression options.