ASTROMER

Models

Single-Band Encoder

The Single-Band Encoder is the main class of the models module. It loads pre-trained weights, fits (trains) the model, and encodes light curves.

It takes single-band light curves as input. These light curves may vary between different stars, depending on the objectives of the survey being carried out.

The input X is a set of observations of a celestial object (such as a star) over time. Each observation has two characteristics: the magnitude (brightness) of the object and the Modified Julian Date (MJD) when the observation was made.
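As a minimal sketch of this input format, a light curve can be held in a two-column numpy array with one row per observation. The values below are made up for illustration, and the chronological-order check is an assumption of this sketch rather than a documented ASTROMER requirement. The quick-start section uses a third column for the magnitude uncertainty.

```python
import numpy as np

# Hypothetical light curve: one row per observation, columns are (MJD, magnitude).
light_curve = np.array([
    [5200.0, 14.31],
    [5203.5, 14.28],
    [5210.2, 14.40],
])

mjd = light_curve[:, 0]        # observation times
magnitude = light_curve[:, 1]  # brightness measured at each time

# Assumption of this sketch: observations are stored in chronological order.
is_chronological = bool(np.all(np.diff(mjd) > 0))
```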

We propose to use the learned representations of a transformer-based encoder to create embeddings that represent the variability of objects in a d_k-dimensional space. This makes it easy to fine-tune the model weights to match other surveys and to use them to solve downstream tasks, such as classification or regression.

class ASTROMER.models.SingleBandEncoder(num_layers=2, d_model=200, num_heads=2, dff=256, base=10000, dropout=0.1, maxlen=100, batch_size=None)[source]

Bases: object

This class is a transformer-based model that processes the input and generates a fixed-size representation. Since each light curve has two characteristics (magnitude and time), we transform them into embeddings Z of size 200x256.

The maximum number of observations is fixed and unused positions are masked, so every Z has the same length even if some light curves are shorter than others.
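The fixed-length idea can be sketched with plain numpy: shorter light curves are zero-padded up to the maximum length and a boolean mask records which rows are real. The helper name pad_and_mask and the mask convention (True marks a real observation) are assumptions of this sketch; ASTROMER's internal convention may differ.

```python
import numpy as np

def pad_and_mask(light_curve, maxlen=200):
    """Zero-pad an (L, C) light curve to maxlen rows and return a validity mask."""
    n_obs = min(len(light_curve), maxlen)
    padded = np.zeros((maxlen, light_curve.shape[1]), dtype=np.float32)
    padded[:n_obs] = light_curve[:n_obs]
    mask = np.zeros(maxlen, dtype=bool)
    mask[:n_obs] = True  # True marks a real observation, False marks padding
    return padded, mask

# A short, made-up light curve with three observations.
short_curve = np.array([[5200.0, 14.3], [5203.5, 14.2], [5210.2, 14.4]])
padded, mask = pad_and_mask(short_curve, maxlen=200)
```

Every padded array has the same shape regardless of the original length, which is what allows light curves to be batched together.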

Parameters
  • num_layers (Integer) – Number of self-attention blocks or transformer layers in the encoder.

  • d_model (Integer) – Determines the dimensionality of the model’s internal representation (must be divisible by ‘num_heads’).

  • num_heads (Integer) – Number of attention heads used in an attention layer.

  • dff (Integer) – Number of neurons for the fully-connected layer applied after the attention layers. It consists of two linear transformations with a non-linear activation function in between.

  • base (Float32) – Value that defines the maximum and minimum wavelengths of the positional encoder (see equation 4 in Donoso-Oliva et al. 2022). It is used to define the range of positions the attention mechanism uses to compute the attention weights.

  • dropout (Float32) – Regularization applied to the output of the fully-connected layer to prevent overfitting. It randomly drops out (i.e., sets to zero) some fraction of the input units in a layer during training.

  • maxlen (Integer) – Maximum length to process in the encoder. It is used in the SingleBandEncoder class to limit the input sequences’ length when passed to the transformer-based model.

  • batch_size (Integer) – Number of samples to be used in a forward pass. Note that an epoch is completed when all batches have been processed (default None).
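To illustrate the role of base, the sketch below computes a sinusoidal positional encoding over observation times in the style of the original transformer. It is an assumption that this matches equation 4 of the paper term by term; the function name and the example times are hypothetical. With this form, the wavelengths grow geometrically from 2π up to roughly 2π·base, and d_model must be even.

```python
import numpy as np

def positional_encoding(times, d_model=200, base=10000.0):
    """Sinusoidal encoding of observation times (standard transformer form)."""
    times = np.asarray(times, dtype=np.float64)[:, None]        # shape (L, 1)
    pair_index = np.arange(d_model // 2)[None, :]               # shape (1, d_model/2)
    angular_freq = 1.0 / base ** (2.0 * pair_index / d_model)   # shape (1, d_model/2)
    angles = times * angular_freq                               # shape (L, d_model/2)
    encoding = np.empty((times.shape[0], d_model))
    encoding[:, 0::2] = np.sin(angles)  # even channels
    encoding[:, 1::2] = np.cos(angles)  # odd channels
    return encoding

# Hypothetical observation times, measured from the first observation.
pe = positional_encoding([0.0, 1.5, 3.0], d_model=200, base=10000.0)
```

A larger base stretches the longest wavelength, so slowly varying channels can still distinguish observations that are far apart in time.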

encode(dataset, oids_list=None, labels=None, batch_size=1, concatenate=True)[source]

This method encodes a dataset of light curves into a fixed-dimensional embedding using the ASTROMER encoder. The method first checks the format of the dataset containing the light curves.

Then, it loads the dataset using predefined functions from the ‘data’ module. At this stage, if a light curve contains more than 200 observations, ASTROMER divides it into shorter windows of length 200.

After loading, the data pass through the encoder layers to obtain the embeddings.

Parameters
  • dataset – The input data to be encoded. It can be a list of numpy arrays or a tensorflow dataset.

  • oids_list (List) – List of object IDs. Since ASTROMER can only process fixed-length sequences of 200 observations, providing the IDs allows the model to concatenate windows when an object has more than 200 observations.

  • labels – an optional list of labels for the objects associated to the input dataset.

  • batch_size – the number of samples to be used in a forward-pass within the encoder. Default is 1.

  • concatenate (Boolean) – a boolean indicating whether to concatenate the embeddings of objects with the same ID into a single vector.
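The windowing and concatenation described above can be sketched as follows. Both helper names are hypothetical, the encoder is replaced by an identity stand-in, and concatenating along the first axis is an assumption about how windows of the same object are rejoined.

```python
import numpy as np

def split_into_windows(light_curve, window=200):
    """Split an (L, C) array into consecutive chunks of at most `window` rows."""
    return [light_curve[i:i + window] for i in range(0, len(light_curve), window)]

def group_by_oid(window_outputs, window_oids):
    """Concatenate per-window arrays that share an object ID."""
    grouped = {}
    for oid, output in zip(window_oids, window_outputs):
        grouped.setdefault(oid, []).append(output)
    return {oid: np.concatenate(parts, axis=0) for oid, parts in grouped.items()}

# Toy data: object 'a' has 450 observations, object 'b' has 120.
rng = np.random.default_rng(0)
curves = {'a': rng.random((450, 2)), 'b': rng.random((120, 2))}

windows, oids = [], []
for oid, curve in curves.items():
    for chunk in split_into_windows(curve):
        windows.append(chunk)
        oids.append(oid)

# Stand-in for the encoder: each window is passed through unchanged.
merged = group_by_oid(windows, oids)
```

Object 'a' yields three windows (200, 200 and 50 observations) that are rejoined under a single ID, which is why the object IDs must accompany the windows.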

Returns

fit(train_batches, valid_batches, epochs=2, patience=40, lr=0.001, project_path='.', verbose=0)[source]

The ‘fit()’ method trains ASTROMER for a given number of epochs. After each epoch, the model’s performance is evaluated on the validation data, and the training stops if there is no improvement in a specified number of epochs (patience).

Parameters
  • train_batches (Object) – Training data already formatted as tf.data.Dataset

  • valid_batches (Object) – Validation data already formatted as tf.data.Dataset

  • epochs (Integer) – Number of training loops, each one a full pass over all light curves.

  • patience (Integer) – The number of epochs with no improvement after which training will be stopped.

  • lr (Float32) – A float specifying the learning rate

  • project_path – Path for saving weights and training logs

  • verbose (Integer) – If non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported.
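The interaction between epochs and patience can be sketched in plain Python. This is a generic early-stopping loop run over a made-up list of validation losses, not ASTROMER's training code; the function name is hypothetical.

```python
def simulate_early_stopping(val_losses, patience=40):
    """Return (best_epoch, last_epoch_run) for a list of per-epoch validation losses."""
    best_loss = float('inf')
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0  # a real loop would save the weights here
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return best_epoch, epoch        # stop early
    return best_epoch, len(val_losses) - 1  # all epochs were run

# Hypothetical validation losses: improvement stalls after epoch 2.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64]
best_epoch, last_epoch = simulate_early_stopping(losses, patience=3)
```

With patience=3, training stops at epoch 5, three epochs after the best validation loss at epoch 2, and the final epoch in the list is never run.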

Returns

from_pretraining(name='macho')[source]

Loads a pre-trained model with pre-trained weights for a specific astronomical dataset. This method allows users to easily load pre-trained models for astronomical time-series datasets and use them for their purposes.

This method checks whether you have the weights locally. If not, it downloads them and then loads them.

Parameters

name – Name of the survey used to pre-train ASTROMER. It should match the name of the zip file in https://github.com/astromer-science/weights

Returns

load_weights(weights_folder)[source]

The ‘load_weights()’ method loads pre-trained parameters into the model architecture. The method loads the weights from the {weights_folder}/weights directory, which is assumed to be in TensorFlow checkpoint format.

Parameters

weights_folder – the path to the folder containing the pre-trained weights.

Returns

Preprocessing

ASTROMER.preprocessing.make_pretraining(input, batch_size=1, shuffle=False, sampling=False, max_obs=100, msk_frac=0.0, rnd_frac=0.0, same_frac=0.0, repeat=1, n_classes=-1, **numpy_args)[source]

Load and format data to feed the ASTROMER model. In this version, this function can process a list of numpy arrays or TensorFlow records. The output is a TensorFlow dataset (tf.data) generated by following the preprocessing strategy explained in Section 5.3 of Donoso-Oliva et al. (2022).

Parameters
  • input (object) – Dataset source. If using records then ‘input’ is a string pointing to the local directory containing the records files (e.g., ./my_records/train). The other option consists in passing a list of numpy arrays (light curves)

  • batch_size (Integer) – Determines the size of the subsets (batches) used during training. Notice that len(subset)<len(dataset).

  • shuffle (Boolean) – Shuffle dataset before passing batches

  • sampling (Boolean) – If True, a single window of length max_obs is sampled from each light curve. If False, the light curve is divided into windows of length max_obs covering all observations.

  • max_obs (Integer) – Indicates how long each input sample will be. In general, we use shorter sequences to train the model, which avoids overloading the memory or excessively zero-padding the sequence.

  • msk_frac (Float32) – The fraction of observations for each window that will be masked and therefore not considered by the attention layer. This fraction is used to calculate the RMSE on the loss function.

  • rnd_frac (Float32) – The fraction of masked values that will be replaced by random observations from the same window. (This is inspired by BERT, Devlin et al., 2018)

  • same_frac (Float32) – The fraction of the masked values that will be unmasked and processed by the attention layer. Since the same_frac observations are initially part of the masked fraction, we still use them to evaluate the loss function.

  • repeat (Integer) – Determines the number of times we repeat each light curve in the dataset.
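The relationship between msk_frac, rnd_frac and same_frac can be sketched with numpy index arithmetic. This is not the library's implementation: the function name is hypothetical, and treating the randomly replaced values as visible to the attention layer is an assumption borrowed from BERT-style masking.

```python
import numpy as np

def split_masked_positions(n_obs, msk_frac, rnd_frac, same_frac, rng):
    """Pick masked positions and split them into hidden, random and same groups."""
    n_masked = int(round(msk_frac * n_obs))
    masked_idx = rng.choice(n_obs, size=n_masked, replace=False)

    n_rnd = int(round(rnd_frac * n_masked))
    n_same = int(round(same_frac * n_masked))
    rnd_idx = masked_idx[:n_rnd]                 # values swapped for random ones
    same_idx = masked_idx[n_rnd:n_rnd + n_same]  # values left untouched and visible

    loss_mask = np.zeros(n_obs, dtype=bool)
    loss_mask[masked_idx] = True                 # every masked position is scored

    attention_hidden = loss_mask.copy()
    attention_hidden[rnd_idx] = False            # assumption: replaced values stay visible
    attention_hidden[same_idx] = False
    return loss_mask, attention_hidden, rnd_idx, same_idx

rng = np.random.default_rng(0)
magnitudes = rng.normal(size=100)                # toy window of 100 magnitudes
loss_mask, attention_hidden, rnd_idx, same_idx = split_masked_positions(
    n_obs=100, msk_frac=0.5, rnd_frac=0.2, same_frac=0.2, rng=rng)

# Replace the "random" group with magnitudes drawn from the same window.
corrupted = magnitudes.copy()
corrupted[rnd_idx] = rng.choice(magnitudes, size=len(rnd_idx))
```

In this toy window, 50 of the 100 observations contribute to the loss, while only 30 are hidden from attention, because 10 are replaced by random values and 10 are left unchanged.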

Utils

ASTROMER.utils.download_weights(url, target)[source]

This method delivers the weights requested in the SingleBandEncoder class through the method ‘from_pretraining()’, which specifies the available surveys: ‘macho’, ‘atlas’ and ‘ztfg’. The utils module is a set of helper functions that perform tasks not covered by the models and preprocessing modules.

This code provides a simple and convenient way to download and extract zipped files from a URL to a specified directory using Python.

Parameters
  • url

  • target
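A download-and-extract helper of this kind can be sketched with the Python standard library alone. The code below is an illustration under stated assumptions, not the body of download_weights: both function names are hypothetical, and the "skip if already present" check mirrors the behaviour described for from_pretraining().

```python
import os
import tempfile
import urllib.request
import zipfile

def download_and_extract(url, target):
    """Download a zip archive from `url` and extract it into the `target` directory."""
    os.makedirs(target, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp_dir:
        zip_path = os.path.join(tmp_dir, 'archive.zip')
        urllib.request.urlretrieve(url, zip_path)   # fetch the archive
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)              # unpack into target
    return target

def ensure_weights(url, target):
    """Download only when `target` is missing or empty (hypothetical helper)."""
    if os.path.isdir(target) and os.listdir(target):
        return target                               # already available locally
    return download_and_extract(url, target)
```

Downloading into a temporary directory keeps the zip file out of the target folder, so only the extracted contents remain after the call.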

Quick-start

Install

First, install the ASTROMER wheel using pip:

pip install ASTROMER

Then import the encoder and initialize it with pre-trained weights:

from ASTROMER.models import SingleBandEncoder

model = SingleBandEncoder()
model = model.from_pretraining('macho')

This automatically downloads the weights from the public GitHub repository (https://github.com/astromer-science/weights) and loads them into the SingleBandEncoder instance. Assume you have a list of variable-length (numpy) light curves:

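import numpy as np

# Two light curves of different lengths, one per entry in oids_list.
samples_collection = [ np.array([[5200, 0.3, 0.2],
                                 [5300, 0.5, 0.1],
                                 [5400, 0.2, 0.3]]),

                       # Second light curve (illustrative values)
                       np.array([[4200, 0.3, 0.1],
                                 [4300, 0.6, 0.3]]) ]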

Light curves are Lx3 matrices with time, magnitude, and magnitude std. To encode samples use:

attention_vectors = model.encode(samples_collection,
                                  oids_list=['1', '2'],
                                  batch_size=1,
                                  concatenate=True)

Fine Tune

ASTROMER can be easily trained using the fit() method. First, create the encoder:

from ASTROMER.models import SingleBandEncoder

model = SingleBandEncoder(num_layers = 2,
                          d_model    = 256,
                          num_heads  = 4,
                          dff        = 128,
                          base       = 1000,
                          dropout    = 0.1,
                          maxlen     = 200)

model = model.from_pretraining('macho')

where,

  • num_layers: Number of self-attention blocks

  • d_model: Self-attention block dimension (must be divisible by num_heads)

  • num_heads: Number of heads within the self-attention block

  • dff: Number of neurons for the fully-connected layer applied after the attention blocks

  • base: Positional encoder base (see formula)

  • dropout: Dropout applied to output of the fully-connected layer

  • maxlen: Maximum length to process in the encoder

Notice that you can skip the from_pretraining(‘macho’) call to train from scratch.

model.fit(train_data,
          validation_data,
          epochs=2,
          patience=20,
          lr=1e-3,
          project_path='./my_folder',
          verbose=0)

where,

  • train_data: Training data already formatted as tf.data

  • validation_data: Validation data already formatted as tf.data

  • epochs: Number of epochs for training

  • patience: Early stopping patience

  • lr: Learning rate

  • project_path: Path for saving weights and training logs

  • verbose: (0) display information during training; (1) do not display it

train_data and validation_data should be loaded using load_numpy or pretraining_records functions. Both functions are in the ASTROMER.preprocessing module.

For large datasets, it is recommended to use TensorFlow Records. See this tutorial to run our data pipeline.