ASTROMER

Models

Single-Band Encoder

The Single-Band Encoder is the main class of the models module. It is used to load pre-trained weights, fit (train) the model, and encode light curves.

It takes single-band light curves, which may vary between different stars depending on the objectives of the survey being carried out.

The input X is a set of observations of a celestial object (such as a star) over time. Each observation has two characteristics: the magnitude (brightness) of the object and the Modified Julian Date (MJD) when the observation was made.

We propose to use learned representations of a transformer-based encoder to create embeddings that represent the variability of objects in a d_k-dimensional space. This makes it easy to fine-tune the model weights to match other surveys and use them to solve downstream tasks, such as classification or regression.

class ASTROMER.models.SingleBandEncoder(num_layers=2, d_model=200, num_heads=2, dff=256, base=10000, dropout=0.1, maxlen=100, batch_size=None)[source]

Bases: object

This class is a transformer-based model that processes the input and generates a fixed-size representation. Since each light curve has two characteristics (magnitude and time), we transform it into embeddings Z = 200x256.

The maximum number of observations remains fixed, and shorter inputs are masked, so every Z has the same length even if some light curves are shorter than others.
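To make the fixed-length idea concrete, here is a minimal sketch of padding a short light curve and building the matching mask. The helper name pad_light_curve and the 1/0 mask convention are assumptions made for illustration; they are not taken from the ASTROMER source.

```python
import numpy as np

def pad_light_curve(lc, maxlen=100):
    """Pad (or truncate) an (L, 2) array of [time, magnitude] rows to maxlen rows.

    Returns the padded array and a mask in which 1 marks a real observation
    and 0 marks padding (this convention is an assumption of the sketch).
    """
    lc = np.asarray(lc, dtype=np.float32)[:maxlen]
    n_obs = lc.shape[0]
    padded = np.zeros((maxlen, lc.shape[1]), dtype=np.float32)
    padded[:n_obs] = lc
    mask = np.zeros(maxlen, dtype=np.float32)
    mask[:n_obs] = 1.0
    return padded, mask

short_curve = np.array([[5200.0, 0.3], [5300.0, 0.5], [5400.0, 0.2]])
padded, mask = pad_light_curve(short_curve, maxlen=5)
```

Every padded array has the same number of rows, so a batch of them can be stacked into a single tensor, while the mask tells the model which rows to ignore.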

Parameters

num_layers (Integer) – Number of self-attention blocks or transformer layers in the encoder.
d_model (Integer) – Determines the dimensionality of the model’s internal representation (must be divisible by ‘num_heads’).
num_heads (Integer) – Number of attention heads used in an attention layer.
dff (Integer) – Number of neurons for the fully-connected layer applied after the attention layers. It consists of two linear transformations with a non-linear activation function in between.
base (Float32) – Value that defines the maximum and minimum wavelengths of the positional encoder (see equation 4 in Donoso-Oliva et al. 2022). It is used to define the range of positions the attention mechanism uses to compute the attention weights.
dropout (Float32) – Regularization applied to the output of the fully-connected layer to prevent overfitting. It randomly drops (i.e., sets to zero) some fraction of the input units in a layer during training.
maxlen (Integer) – Maximum length to process in the encoder. It is used in the SingleBandEncoder class to limit the input sequences’ length when passed to the transformer-based model.
batch_size (Integer) – Number of samples to be used in a forward pass. Note that an epoch is completed when all batches have been processed (default None).
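The role of base is easier to see in code. The sketch below uses the standard sinusoidal transformer encoding, evaluated at observation times instead of integer positions. It is an assumption that this matches equation 4 of the paper term by term; the function name positional_encoding is hypothetical.

```python
import numpy as np

def positional_encoding(times, d_model=200, base=10000.0):
    """Sinusoidal encoding evaluated at observation times.

    times: 1-D sequence of time stamps (e.g. MJD offsets).
    Returns an array of shape (len(times), d_model).
    """
    times = np.asarray(times, dtype=np.float64)[:, None]        # (L, 1)
    dims = np.arange(d_model)[None, :]                          # (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one angular frequency.
    ang_freq = 1.0 / np.power(base, 2 * (dims // 2) / d_model)  # (1, d_model)
    angles = times * ang_freq                                   # (L, d_model)
    pe = np.empty_like(angles)
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions
    return pe

d_model, num_heads = 200, 2
# d_model must split evenly so each head gets d_model // num_heads dimensions.
assert d_model % num_heads == 0
pe = positional_encoding([0.0, 1.5, 3.0], d_model=d_model)
```

In this scheme a larger base lowers the slowest frequency, so the longest wavelength grows and the encoding can distinguish times over a wider range.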

encode(dataset, oids_list=None, labels=None, batch_size=1, concatenate=True)[source]

This method encodes a dataset of light curves into a fixed-dimensional embedding using the ASTROMER encoder. The method first checks the format of the dataset containing the light curves.

Then, it loads the dataset using predefined functions from the ‘data’ module. In this step, if a light curve contains more than 200 observations, ASTROMER will divide it into shorter windows of 200 observations.

After loading data, the data pass through the encoder layer to obtain the embeddings.
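The windowing step described above can be sketched as follows. This is an illustration only: split_into_windows is a hypothetical helper, and how ASTROMER handles the last, shorter window (for example by padding) is not stated here.

```python
import numpy as np

def split_into_windows(lc, window=200):
    """Split an (L, 3) light curve into consecutive chunks of at most `window` rows."""
    lc = np.asarray(lc)
    return [lc[start:start + window] for start in range(0, len(lc), window)]

# A synthetic light curve with 450 observations.
long_curve = np.random.default_rng(0).normal(size=(450, 3))
windows = split_into_windows(long_curve, window=200)
window_lengths = [len(w) for w in windows]
```

Stacking the windows back together reproduces the original light curve, so no observation is lost by the split.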

Parameters

dataset – The input data to be encoded. It can be a list of numpy arrays or a tensorflow dataset.
oids_list (List) – List of object IDs. Since ASTROMER can only process fixed-length sequences of 200 observations, providing the IDs allows the model to concatenate windows when an object has more than 200 observations.
labels – an optional list of labels for the objects associated to the input dataset.
batch_size – the number of samples to be used in a forward-pass within the encoder. Default is 1.
concatenate (Boolean) – a boolean indicating whether to concatenate the embeddings of objects with the same ID into a single vector.
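As a rough picture of what concatenating by ID means, the sketch below groups per-window embeddings that share an object ID and joins them along the time axis. The helper group_by_object and the toy embedding size of 4 are assumptions; the actual return format of encode() is not specified in this section.

```python
import numpy as np
from collections import defaultdict

def group_by_object(window_embeddings, window_oids):
    """Concatenate per-window embeddings that share an object ID along the time axis."""
    grouped = defaultdict(list)
    for emb, oid in zip(window_embeddings, window_oids):
        grouped[oid].append(emb)
    return {oid: np.concatenate(chunks, axis=0) for oid, chunks in grouped.items()}

# Three windows: two from object 'a' and one from object 'b'.
embs = [np.ones((200, 4)), np.ones((50, 4)), np.zeros((120, 4))]
by_object = group_by_object(embs, ['a', 'a', 'b'])
```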

Returns

fit(train_batches, valid_batches, epochs=2, patience=40, lr=0.001, project_path='.', verbose=0)[source]

The ‘fit()’ method trains ASTROMER for a given number of epochs. After each epoch, the model’s performance is evaluated on the validation data, and the training stops if there is no improvement in a specified number of epochs (patience).

Parameters

train_batches (Object) – Training data already formatted as TF.data.Dataset
valid_batches (Object) – Validation data already formatted as TF.data.Dataset
epochs (Integer) – Number of training loops, where one loop means that all light curves have been processed once.
patience (Integer) – The number of epochs with no improvement after which training will be stopped.
lr (Float32) – A float specifying the learning rate
project_path – Path for saving weights and training logs
verbose (Integer) – If non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported.
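The interaction between epochs and patience can be illustrated with a small, framework-free sketch. The function early_stopping_epoch is hypothetical and works on a plain list of validation losses; it is not the training loop used by fit().

```python
def early_stopping_epoch(val_losses, patience=40):
    """Return (best_epoch, last_epoch_run) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0  # a checkpoint would typically be saved here
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return best_epoch, epoch  # stop early
    return best_epoch, len(val_losses) - 1

best_epoch, stopped_epoch = early_stopping_epoch(
    [0.9, 0.7, 0.8, 0.75, 0.71, 0.6], patience=3)
```

With patience=3 the loop stops at epoch 4, after three epochs without beating the loss from epoch 1, and never sees the better value at epoch 5. A small patience saves time but can stop before a late improvement.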

Returns

from_pretraining(name='macho')[source]

Loads a pre-trained model with pre-trained weights for a specific astronomical dataset. This method allows users to easily load pre-trained models for astronomical time-series datasets and use them for their purposes.

This method checks whether the weights are available locally; if not, it downloads them and then loads them.
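The check-then-download behaviour can be sketched as below. The names ensure_weights, cache_dir and download_fn are hypothetical, and the download is replaced by a stand-in so the example runs offline; where ASTROMER actually stores the weights is not stated here.

```python
import os
import tempfile

def ensure_weights(name, cache_dir, download_fn):
    """Return the local folder for `name`, calling `download_fn` only when it is missing."""
    target = os.path.join(cache_dir, name)
    if not os.path.isdir(target):
        download_fn(name, target)
    return target

calls = []

def fake_download(name, target):
    # Stand-in for a real download: create the folder and record the call.
    os.makedirs(target)
    calls.append(name)

with tempfile.TemporaryDirectory() as cache:
    first = ensure_weights("macho", cache, fake_download)
    second = ensure_weights("macho", cache, fake_download)  # served from disk
```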

Parameters: name – Corresponds to the name of the survey used to pre-train ASTROMER. The name of the survey should match with the name of the zip file in https://github.com/astromer-science/weights
Returns

load_weights(weights_folder)[source]

The ‘load_weights()’ method loads pre-trained parameters into the model architecture. The method loads the weights from the file located at {weights_folder}/weights directory, which is assumed to be in TensorFlow checkpoint format.

Parameters: weights_folder – the path to the folder containing the pre-trained weights.
Returns

Preprocessing

ASTROMER.preprocessing.make_pretraining(input, batch_size=1, shuffle=False, sampling=False, max_obs=100, msk_frac=0.0, rnd_frac=0.0, same_frac=0.0, repeat=1, n_classes=-1, **numpy_args)[source]

Load and format data to feed the ASTROMER model. In this version, this function can process a list of numpy arrays or tf.records. The output is a TensorFlow dataset (tf.data.Dataset) generated by following the preprocessing strategy explained in Section 5.3 of Donoso-Oliva et al. (2022).

Parameters

input (object) – Dataset source. If using records then ‘input’ is a string pointing to the local directory containing the records files (e.g., ./my_records/train). The other option consists in passing a list of numpy arrays (light curves)
batch_size (Integer) – Number of samples in each subset (batch) used during training. Notice that len(subset)<len(dataset).
shuffle (Boolean) – Shuffle dataset before passing batches
sampling (Boolean) – If True, for each light curve we will sample a single window of length max_obs. If False, the light curve will be divided into windows of length max_obs covering all observations.
max_obs (Integer) – Indicates how long each input sample will be. In general, we use shorter sequences to train the model, which avoids overloading the memory or excessive zero-padding of the sequence.
msk_frac (Float32) – The fraction of observations in each window that will be masked and therefore not considered by the attention layer. These masked observations are used to calculate the RMSE in the loss function.
rnd_frac (Float32) – The fraction of masked values that will be replaced by random observations from the same window (inspired by BERT; Devlin et al., 2018).
same_frac (Float32) – The fraction of the masked values that will be unmasked and processed by the attention layer. Since the same_frac observations are initially part of the masked fraction, we still use them to evaluate the loss function.
repeat (Integer) – Determines the number of times we repeat each light curve in the dataset.
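How msk_frac, rnd_frac and same_frac relate to each other can be sketched on a single window of magnitudes. This is an illustrative reading of the parameter descriptions above, not the ASTROMER implementation: the helper mask_window is hypothetical, and it assumes that randomly replaced values stay visible to the attention layer, as in BERT.

```python
import numpy as np

def mask_window(mags, msk_frac=0.5, rnd_frac=0.2, same_frac=0.2, seed=0):
    """Return (model_input, loss_mask, hidden_mask) for one window of magnitudes."""
    rng = np.random.default_rng(seed)
    mags = np.asarray(mags, dtype=np.float64)
    n_obs = len(mags)
    n_masked = int(round(msk_frac * n_obs))
    n_rnd = int(round(rnd_frac * n_masked))
    n_same = int(round(same_frac * n_masked))
    assert n_rnd + n_same <= n_masked, "random + unchanged counts exceed the masked count"

    masked_idx = rng.choice(n_obs, size=n_masked, replace=False)
    rnd_idx = masked_idx[:n_rnd]
    same_idx = masked_idx[n_rnd:n_rnd + n_same]

    loss_mask = np.zeros(n_obs, dtype=bool)
    loss_mask[masked_idx] = True      # positions scored by the loss

    hidden_mask = loss_mask.copy()    # positions hidden from attention
    hidden_mask[rnd_idx] = False      # visible, but the value is replaced
    hidden_mask[same_idx] = False     # visible and left unchanged

    model_input = mags.copy()
    # Replace the selected values with random magnitudes from the same window.
    model_input[rnd_idx] = rng.choice(mags, size=n_rnd)
    return model_input, loss_mask, hidden_mask

mags = np.linspace(14.0, 15.0, 100)
model_input, loss_mask, hidden_mask = mask_window(mags)
```

With 100 observations and the fractions above, 50 positions enter the loss, of which 10 are replaced by random values, 10 are left as they are, and the remaining 30 are hidden from attention.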

Utils

ASTROMER.utils.download_weights(url, target)[source]

This method downloads the weights requested through the ‘from_pretraining()’ method of the SingleBandEncoder class, which specifies the available surveys: ‘macho’, ‘atlas’ and ‘ztfg’. The utils module is a set of helper functions that are not covered by the models and preprocessing modules.

This code provides a simple and convenient way to download and extract zipped files from a URL to a specified directory using Python.

Parameters

url –
target –
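A download-and-extract helper of this kind can be written with the standard library alone. The sketch below is an assumption about the general approach, not the body of download_weights; the name download_and_extract and the temporary archive name are made up, and the demo uses a local file:// URL so that it runs without network access.

```python
import os
import tempfile
import urllib.request
import zipfile
from pathlib import Path

def download_and_extract(url, target):
    """Download a zip archive from `url` and extract it into the `target` directory."""
    os.makedirs(target, exist_ok=True)
    archive_path = os.path.join(target, "download.zip")
    urllib.request.urlretrieve(url, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    os.remove(archive_path)  # keep only the extracted files

with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive locally to stand in for a remote weights file.
    src_zip = Path(tmp) / "weights.zip"
    with zipfile.ZipFile(src_zip, "w") as zf:
        zf.writestr("weights/checkpoint", "placeholder")
    out_dir = os.path.join(tmp, "out")
    download_and_extract(src_zip.as_uri(), out_dir)
    extracted = sorted(os.listdir(os.path.join(out_dir, "weights")))
```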

Quick-start

Install

First, install the ASTROMER wheel using pip

pip install ASTROMER

from ASTROMER.models import SingleBandEncoder

Then initialize the model and load the pre-trained weights:

model = SingleBandEncoder()
model = model.from_pretraining('macho')

It will automatically download the weights from this public GitHub repository and load them into the SingleBandEncoder instance. Assume you have a list of variable-length (numpy) light curves:

import numpy as np

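samples_collection = [ np.array([[5200, 0.3, 0.2],
                                 [5300, 0.5, 0.1],
                                 [5400, 0.2, 0.3]]),

                       # second light curve (illustrative values) to match the two IDs used below
                       np.array([[4200, 0.3, 0.1],
                                 [4300, 0.6, 0.3]]) ]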

Light curves are Lx3 matrices with time, magnitude, and magnitude std. To encode samples use:

attention_vectors = model.encode(samples_collection,
                                  oids_list=['1', '2'],
                                  batch_size=1,
                                  concatenate=True)
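Before calling encode, it can help to confirm that every sample really is an Lx3 matrix. The check below is a hypothetical convenience function, not part of ASTROMER, and the sample values are illustrative.

```python
import numpy as np

def check_light_curves(samples):
    """Raise ValueError unless every sample is a 2-D array with 3 columns.

    Returns the number of observations in each light curve.
    """
    lengths = []
    for i, lc in enumerate(samples):
        lc = np.asarray(lc)
        if lc.ndim != 2 or lc.shape[1] != 3:
            raise ValueError(f"sample {i} has shape {lc.shape}, expected (L, 3)")
        lengths.append(lc.shape[0])
    return lengths

# Columns: time, magnitude, magnitude std.
example_samples = [np.array([[5200, 0.3, 0.2],
                             [5300, 0.5, 0.1],
                             [5400, 0.2, 0.3]]),
                   np.array([[4200, 0.3, 0.1],
                             [4300, 0.6, 0.3]])]
lengths = check_light_curves(example_samples)
```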

Fine Tune

ASTROMER can be easily trained by using the fit() method. First, create the model:

from ASTROMER.models import SingleBandEncoder

model = SingleBandEncoder(num_layers= 2,
                       d_model   = 256,
                       num_heads = 4,
                       dff       = 128,
                       base      = 1000,
                       dropout   = 0.1,
                       maxlen    = 200)

model = model.from_pretraining('macho')

where,

num_layers: Number of self-attention blocks
d_model: Self-attention block dimension (must be divisible by num_heads)
num_heads: Number of heads within the self-attention block
dff: Number of neurons for the fully-connected layer applied after the attention blocks
base: Positional encoder base (see formula)
dropout: Dropout applied to output of the fully-connected layer
maxlen: Maximum length to process in the encoder

Notice you can skip model.from_pretraining(‘macho’) to train from scratch.

model.fit(train_data,
      validation_data,
      epochs=2,
      patience=20,
      lr=1e-3,
      project_path='./my_folder',
      verbose=0)

where,

train_data: Training data already formatted as tf.data
validation_data: Validation data already formatted as tf.data
epochs: Number of epochs for training
patience: Early stopping patience
lr: Learning rate
project_path: Path for saving weights and training logs
verbose: (0) Display information during training (1) don’t

train_data and validation_data should be loaded using load_numpy or pretraining_records functions. Both functions are in the ASTROMER.preprocessing module.

For large datasets, it is recommended to use TensorFlow Records (see this tutorial to execute our data pipeline).