ETL Overview

Welcome to Flojoy’s ETL Blocks page. Here you can find all the information on how to handle ETL tasks using Flojoy.



EXTRACT_COLUMNS Take an input dataframe/matrix and return a dataframe/matrix with only the specified columns.


OPEN_IMAGE Load an image file from disk and return a DataContainer of type 'image'.
OPEN_PARQUET Load a local parquet file, then return the data as a dataframe.
READ_CSV Read a .csv file from disk or a URL, and then return it as a dataframe.
READ_S3 Take an S3 key name, S3 bucket name, and file name as input, then extract the file from the specified bucket.



FLOJOY_CLOUD_DOWNLOAD Download a DataContainer from Flojoy Cloud (beta).
FLOJOY_CLOUD_UPLOAD Upload a DataContainer to Flojoy Cloud (beta).


BATCH_PROCESSOR Glob match a pattern in the given input directory, iterate (in a LOOP) over all of the files found, then return each file path as a TextBlob.
LOCAL_FILE The LOCAL_FILE node loads a local file of a different type and converts it to a DataContainer class.
OPEN_MATLAB The OPEN_MATLAB node loads a local file of the .mat file format.
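The directory scan described for BATCH_PROCESSOR can be illustrated with Python's standard `glob` module. This is a minimal sketch under the assumption that the pattern is a glob-style wildcard; `find_files` is a hypothetical helper and not part of Flojoy.

```python
import glob
import os
import tempfile


def find_files(directory: str, pattern: str) -> list[str]:
    """Return the sorted paths in `directory` whose names match a glob `pattern`.

    Hypothetical helper for illustration only; not Flojoy's implementation.
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))


# Build a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("x\n")

    # Visit the matches one at a time, as a loop over files would.
    matched_names = []
    for path in find_files(tmp, "*.csv"):
        matched_names.append(os.path.basename(path))

print(matched_names)
```

Sorting the matches keeps the iteration order deterministic, since `glob.glob` makes no ordering guarantee.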


REMOTE_FILE Load a remote file from an HTTP URL endpoint, infer the type, and convert it to a DataContainer class.



DOT_PRODUCT Take two input matrices, multiply them (by dot product), and return the result.
INVERT Invert a Matrix or OrderedPair.
MATMUL Take two input matrices, multiply them, and return the result.
SHUFFLE_MATRIX Return a matrix that is randomly shuffled along the first axis.
SORT_MATRIX Take an input matrix and sort it along the chosen axis.
TRANSPOSE_MATRIX Take an input 2D matrix and transpose it.
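To make the matrix operations above concrete, here is a small sketch using plain nested lists in place of Matrix DataContainers. The function names and the `axis` convention (0 sorts each column, 1 sorts each row) are assumptions chosen for illustration; they are not Flojoy's block signatures.

```python
def transpose(m):
    """Swap rows and columns of a 2D matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two 2D matrices; each entry is a row-by-column dot product."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def sort_matrix(m, axis):
    """Sort along an axis (assumed convention: 0 = within columns, 1 = within rows)."""
    if axis == 1:
        return [sorted(row) for row in m]
    if axis == 0:
        return transpose([sorted(col) for col in transpose(m)])
    raise ValueError("axis must be 0 or 1")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

print(matmul(a, b))                        # [[19, 22], [43, 50]]
print(transpose(a))                        # [[1, 3], [2, 4]]
print(sort_matrix([[3, 1], [2, 4]], 0))    # [[2, 1], [3, 4]]
```

For 2D inputs the dot product and the matrix product give the same result, which is why a single `matmul` function covers both cases in this sketch.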


ORDERED_PAIR_XY_INVERT Return an OrderedPair with the axes inverted.


TEXT_CONCAT Concatenate 2 strings given by 2 TextBlob DataContainers.


BOOLEAN_2_SCALAR Take Boolean data and convert it to the Scalar data type.
DF_2_NP Convert a DataFrame DataContainer to a Matrix DataContainer.
DF_2_ORDERED_TRIPLE Convert a DataFrame DataContainer to an OrderedTriple DataContainer.
MATRIX_2_VECTOR Convert a Matrix DataContainer to a Vector DataContainer.
MAT_2_DF Convert a Matrix DataContainer to a DataFrame DataContainer.
NP_2_DF Infer the type of an array-like DataContainer, then convert it to a DataFrame DataContainer.
ORDERED_PAIR_2_VECTOR Return the split components (x, y) of an OrderedPair as Vectors.
ORDERED_TRIPLE_2_SURFACE Convert an OrderedTriple DataContainer to a Surface DataContainer.
VECTOR_2_MATRIX Convert a Vector DataContainer to a Matrix DataContainer.
VECTOR_2_ORDERED_PAIR Convert a Vector DataContainer to an OrderedPair DataContainer.
VECTOR_2_SCALAR Take a Vector and transform it into the Scalar data type.


DECIMATE_VECTOR The DECIMATE_VECTOR node returns the input vector by reducing the
INTERLEAVE_VECTOR The INTERLEAVE_VECTOR node combines multiple vectors into a single vector by interleaving their elements.
REMOVE_DUPLICATES_VECTOR The REMOVE_DUPLICATES_VECTOR node returns a vector with only unique elements.
REPLACE_SUBSET The REPLACE_SUBSET node returns a new Vector with a subset of its elements replaced.
REVERSE_VECTOR The REVERSE_VECTOR node returns a vector equal to the input vector but reversed.
SHIFT_VECTOR The SHIFT_VECTOR node shifts the elements in the vector by the amount specified
SHUFFLE_VECTOR The SHUFFLE_VECTOR node returns a vector that is randomly shuffled.
SORT_VECTOR The SORT_VECTOR node returns the input Vector that is sorted
SPLIT_VECTOR The SPLIT_VECTOR node returns a vector that is split at a given index.
VECTOR_DELETE The VECTOR_DELETE node returns a new Vector with the elements at the requested indices deleted.
VECTOR_INDEXING The VECTOR_INDEXING node returns the value of the vector at the requested index.
VECTOR_INSERT The VECTOR_INSERT node inserts a value to the Vector at the
VECTOR_LENGTH The VECTOR_LENGTH node returns the length of the input vector.
VECTOR_MAX The VECTOR_MAX node returns the maximum value from the Vector.
VECTOR_MIN The VECTOR_MIN node returns the minimum value from the Vector.
VECTOR_SUBSET The VECTOR_SUBSET node returns the subset of values at the requested indices.
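Several of the vector operations listed above can be sketched with plain Python lists standing in for Vector DataContainers. The function names are hypothetical, and some behaviors are assumptions rather than documented Flojoy semantics: the shift is taken to be circular and to the right, and duplicate removal is taken to keep first occurrences in their original order.

```python
def reverse(v):
    """Return the elements in reverse order."""
    return v[::-1]


def shift(v, k):
    """Circularly shift right by k positions (assumed semantics)."""
    if not v:
        return []
    k %= len(v)
    return v[-k:] + v[:-k] if k else list(v)


def interleave(*vectors):
    """Alternate elements from equal-length vectors."""
    return [x for group in zip(*vectors) for x in group]


def remove_duplicates(v):
    """Keep the first occurrence of each element, preserving order (assumed)."""
    return list(dict.fromkeys(v))


def delete(v, indices):
    """Return a new list without the elements at the given indices."""
    drop = set(indices)
    return [x for i, x in enumerate(v) if i not in drop]


def subset(v, indices):
    """Return the elements at the given indices, in the order requested."""
    return [v[i] for i in indices]


def split(v, index):
    """Split into the part before `index` and the part from `index` onward."""
    return v[:index], v[index:]


print(shift([1, 2, 3, 4], 1))                  # [4, 1, 2, 3]
print(interleave([1, 2, 3], [10, 20, 30]))     # [1, 10, 2, 20, 3, 30]
print(remove_duplicates([3, 1, 3, 2, 1]))      # [3, 1, 2]
print(split([5, 6, 7, 8], 1))                  # ([5], [6, 7, 8])
```

Each function returns a new list and leaves its input unchanged, matching the "returns a new Vector" wording used by several of the node descriptions.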