Feature set

class mangrove_surface.wrapper.feature_set.FeatureSetWrapper(feature_set_resource, collection)

Feature set resource

A feature set is a set of frames (one for each data set). A frame contains variables and their metadata (the type of each variable and whether it is used).

It is used to customize data, generate aggregates and to train classifiers.

central(*args, **kwargs)

Return the central frame

The central frame is the one used to train classifiers.

clone(new_name=None, tags=None)

Clone the current feature set.

fit_classifier(name=None, tags=[], nb_aggregates=None, maximum_features=None)

Fit a new classifier

Parameters:
  • name – (optional) classifier name (by default, the project name concatenated with the current time)
  • tags – the classifier tags
  • nb_aggregates – number of aggregates to generate on the central frame that is used to train the classifier
  • maximum_features – maximum number of features allowed in the new classifier
frame(*args, **kwargs)

Return the frame named name

Parameters:name – data (set) frame name
frames(*args, **kwargs)

List all frames

generate_aggregates(*args, **kwargs)

Generate a new feature set with n aggregates

Parameters:n – number of aggregates requested (a non-negative integer)
is_modified(*args, **kwargs)

Indicates if the current feature set has been modified

save(*args, **kwargs)

Save all the modifications (change variables type, set unused, etc.)

Warning

If clone = False, the method overwrites the current feature set resource.

Raises:Exception – if clone = False and the current feature set is the default one.
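
The save rules above can be sketched with a small stand-in object. This is only an illustration under stated assumptions: StubFeatureSet and save_in_place_or_clone are hypothetical names, not part of the library, and the stub accepts clone only as an explicit keyword because the real signature is shown as save(*args, **kwargs).

```python
class StubFeatureSet:
    """Hypothetical in-memory stand-in that mimics the documented save() rules."""

    def __init__(self, is_default):
        self._is_default = is_default
        self._modified = True   # pretend some variable was changed
        self.saved_as = None

    def is_modified(self):
        return self._modified

    def save(self, *, clone):
        # Documented rule: overwriting the default feature set raises Exception.
        if not clone and self._is_default:
            raise Exception("the default feature set cannot be overwritten")
        self.saved_as = "clone" if clone else "overwrite"
        self._modified = False


def save_in_place_or_clone(feature_set):
    """Overwrite when permitted, otherwise fall back to saving a clone.

    Returns "overwrite", "clone", or None when there was nothing to save.
    """
    if not feature_set.is_modified():
        return None
    try:
        feature_set.save(clone=False)
    except Exception:
        # The reference only names the base Exception, so it is caught broadly.
        feature_set.save(clone=True)
    return feature_set.saved_as


default_fs = StubFeatureSet(is_default=True)
custom_fs = StubFeatureSet(is_default=False)
default_result = save_in_place_or_clone(default_fs)
custom_result = save_in_place_or_clone(custom_fs)
```

Checking is_modified() first avoids a needless save, and the fallback keeps the default feature set untouched.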

Frame

class FeatureSetWrapper._Frame(dataset, change_type, fs)
features(filt=<function <lambda>>, id=False)

List features of the current frame

>>> fs.features()
[
    {
        'name': 'Flag_Prospect',
        'type': 'categorical',
        'use': True
    },
    {
        'name': 'LABEL',
        'type': 'continuous',
        'use': True
    },
    ...
]
Parameters:filt – (optional) a function that can be used to filter features
>>> fs.features(filt=lambda feat: fs.is_categorical(feat))
[
    {
        'name': 'Flag_Prospect',
        'type': 'categorical',
        'use': True
    },
    ...
]

or:

>>> fs.features(filt=lambda feat: feat["name"].startswith("Foo"))
[
    {
        'name': 'FooBar',
        'type': 'categorical',
        'use': True
    },
    {
        'name': 'FooFoo',
        'type': 'continuous',
        'use': False
    },
    ...
]
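
The filtering behaviour can be tried without a server by applying a predicate to plain dictionaries shaped like the output above. This is a minimal sketch, assuming the predicate receives one feature dictionary at a time (as the feat["name"] example suggests). select_features and used_and_categorical are hypothetical helpers, not library functions.

```python
# Sample feature dictionaries shaped like the documented output of features().
FEATURES = [
    {"name": "Flag_Prospect", "type": "categorical", "use": True},
    {"name": "LABEL", "type": "continuous", "use": True},
    {"name": "FooBar", "type": "categorical", "use": True},
    {"name": "FooFoo", "type": "continuous", "use": False},
]


def select_features(features, filt=lambda feat: True):
    """Keep only the feature dictionaries for which filt returns a true value."""
    return [feat for feat in features if filt(feat)]


def used_and_categorical(feat):
    """Predicate combining two metadata fields of a feature dictionary."""
    return feat["use"] and feat["type"] == "categorical"


selected_names = [
    feat["name"] for feat in select_features(FEATURES, used_and_categorical)
]
```

A named predicate such as used_and_categorical is easier to reuse and test than an inline lambda when the condition combines several fields.
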
is_categorical(variable)

Indicates if the feature variable is categorical or not

Parameters:variable – feature name

is_central()

Indicates if the frame is central

is_change_type_allowed()

Indicates whether the frame allows feature types to be changed

Changing the type of a peripheral frame feature is forbidden if there are aggregates in the central frame.
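
One way to respect this rule is to check is_change_type_allowed() before calling a setter and still handle MangroveChangeForbidden. The sketch below uses local stand-ins so that it runs on its own: StubFrame and try_set_categorical are hypothetical, and the exception class is redefined here because this page does not give its import path.

```python
class MangroveChangeForbidden(Exception):
    """Local stand-in for the exception named in this reference."""


class StubFrame:
    """Hypothetical in-memory stand-in for a frame; not the library class."""

    def __init__(self, types, change_type_allowed):
        self._types = dict(types)
        self._change_type_allowed = change_type_allowed

    def is_change_type_allowed(self):
        return self._change_type_allowed

    def type(self, variable):
        return self._types[variable]

    def set_categorical(self, variable):
        if not self._change_type_allowed:
            raise MangroveChangeForbidden(variable)
        self._types[variable] = "categorical"


def try_set_categorical(frame, variable):
    """Return True if the type was changed, False if the frame forbids it."""
    if not frame.is_change_type_allowed():
        return False
    try:
        frame.set_categorical(variable)
    except MangroveChangeForbidden:
        # Defensive: the frame state may differ from what the check reported.
        return False
    return True


open_frame = StubFrame({"LABEL": "continuous"}, change_type_allowed=True)
locked_frame = StubFrame({"LABEL": "continuous"}, change_type_allowed=False)
changed = try_set_categorical(open_frame, "LABEL")
refused = try_set_categorical(locked_frame, "LABEL")
```
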

is_continuous(variable)

Indicates if the feature variable is continuous or not

Parameters:variable – feature name

is_modified()

Indicates if the frame has been modified

is_used(variable)

Indicates whether the feature variable is used

modalities(name)

List modalities of the feature name

Parameters:name – feature name
set_categorical(variable)

Change the type of the feature variable to categorical

Parameters:variable – the feature name
Raises:MangroveChangeForbidden – if change type is not allowed
set_continuous(variable)

Change the type of the feature variable to continuous

Parameters:variable – the feature name
Raises:MangroveChangeForbidden – if change type is not allowed
set_unused(variable)

Mark the feature variable as unused

Parameters:variable – the feature name
set_used(variable)

Mark the feature variable as used

Parameters:variable – the feature name
type(variable)

Return the type of the feature variable. The type can be categorical or continuous (other types, such as timestamps, can be provided, but they are not managed).

Parameters:variable – the feature name
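
Because unmanaged types can appear, code that branches on the type may want an explicit bucket for them. The following is a minimal sketch over plain feature dictionaries. group_by_type is a hypothetical helper, and the "timestamp" type string and the Created_At feature are assumptions made for illustration.

```python
from collections import defaultdict

# The two types this reference describes as managed.
MANAGED_TYPES = ("categorical", "continuous")


def group_by_type(features):
    """Group feature names by type, sending any other type to 'unmanaged'."""
    groups = defaultdict(list)
    for feat in features:
        key = feat["type"] if feat["type"] in MANAGED_TYPES else "unmanaged"
        groups[key].append(feat["name"])
    return dict(groups)


sample = [
    {"name": "Flag_Prospect", "type": "categorical", "use": True},
    {"name": "LABEL", "type": "continuous", "use": True},
    # Assumed example of an unmanaged type; the exact string is not documented.
    {"name": "Created_At", "type": "timestamp", "use": True},
]
groups = group_by_type(sample)
```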