Python Book of Recipes #2 - A Dash of Customization: How Plugin Architecture Can Spice Up Your Code

Boško Savić, Senior Software Developer
23.05.2023.

Just as a chef adds spices and ingredients to a dish to make it more flavorful and customizable, a Python developer can use plugin architecture to add flexibility and extensibility to their code.

By implementing a plugin architecture, you can empower other developers to add their own features to your codebase while keeping the core functionality intact. Before whipping up some code, let us take a quick look at the advantages of plugin architecture.

  • Allows developers to extend the system functionality without modifying the core codebase. This means that the core system can remain stable and well-tested, while new features can be added or removed as needed.
  • Promotes code modularity and separation of concerns, as each plugin is responsible for a specific behavior and can be developed independently of other plugins.
  • Makes the system more flexible and adaptable to changing requirements, as new plugins can be added or removed at any time without affecting the core functionality.

As a data scientist, I always work with data, of course 😊

For example, developing models that predict the future from time series data involves many steps and techniques: loading and exploring data, feature engineering, data visualization, data transformation, spot-checking algorithms, model evaluation, and so on. Each of these steps is a potential place for a plugin architecture.

The goal of feature engineering is to expose strong and ideally simple relationships between the new input features and the output feature, so that a supervised learning algorithm can model them. Because each feature can be derived independently of the others, feature engineering is a suitable candidate for the plugin pattern. Other scenarios where you can use plugins include data cleaning, data visualization, and NLP (text processing).

Ingredients:

  • A TimeSeriesFeatureEngineering class that holds a list of plugins
  • A few classes that act as feature engineering plugins

Step 1. Create a class that will represent a feature engineering mechanism for time series data

[Image: code listing of the TimeSeriesFeatureEngineering class]

What is happening here?

  • The __init__ method initializes an instance of the class and sets up an empty list, plugins, to store the plugins used for feature extraction.
  • The add_plugin method allows adding plugins to the list of plugins. It appends the provided plugin object to the plugins list.
  • The process method is responsible for extracting features from the time series using the added plugins. It takes a pandas DataFrame, dataframe, as input, which is expected to contain time series data.
  • Inside the process method, it iterates over each plugin in the plugins list and calls the process method of each plugin with the dataframe as input. The resulting features from each plugin are collected in the plugin_features list.
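Since the original listing is not reproduced here, the following is only a minimal sketch of a class matching the description above. The names TimeSeriesFeatureEngineering, plugins, add_plugin, process, dataframe and plugin_features come from the text; combining the collected features column-wise with pd.concat is an assumption, and DoubleValuePlugin together with the "value" column is a hypothetical toy plugin used purely to exercise the class.

```python
import pandas as pd


class TimeSeriesFeatureEngineering:
    """Runs every registered plugin over a time series DataFrame."""

    def __init__(self):
        # Plugins are kept in the order in which they were added.
        self.plugins = []

    def add_plugin(self, plugin):
        # Any object with a process(dataframe) method can be a plugin.
        self.plugins.append(plugin)

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        # Each plugin is expected to return a DataFrame of new feature columns.
        plugin_features = [plugin.process(dataframe) for plugin in self.plugins]
        if not plugin_features:
            # No plugins registered: return an empty frame on the same index.
            return pd.DataFrame(index=dataframe.index)
        # Assumption: features are combined column-wise on the shared index.
        return pd.concat(plugin_features, axis=1)


class DoubleValuePlugin:
    """Hypothetical toy plugin, not part of the original article."""

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        return pd.DataFrame({"value_x2": dataframe["value"] * 2})


engine = TimeSeriesFeatureEngineering()
engine.add_plugin(DoubleValuePlugin())

df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.date_range("2023-01-01", periods=3, freq="D"),
)
features = engine.process(df)
print(features)
```

The core class never needs to know which features exist; it only relies on every plugin exposing a process method.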

Step 2. Add classes that will represent the plugins used for extracting features

[Image: code listing of the DateTimeFeaturesPlugin and LagFeaturesPlugin classes]

Here we see two plugins:

DateTimeFeaturesPlugin: Adds date and time-related features to the DataFrame, such as year, month, day, day of the week and hour.

LagFeaturesPlugin: Creates lag features by shifting the target variable by a specified number of time steps.
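As a rough sketch of what these two plugins might look like (the original listing is not reproduced here), the code below assumes the DataFrame has a DatetimeIndex and that the target column is passed in by name. The constructor parameters target_column and lags, the output column names, and the sample "value" column are assumptions made for illustration.

```python
import pandas as pd


class DateTimeFeaturesPlugin:
    """Derives calendar features from the DataFrame's DatetimeIndex."""

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        index = dataframe.index  # assumed to be a pandas DatetimeIndex
        return pd.DataFrame(
            {
                "year": index.year,
                "month": index.month,
                "day": index.day,
                "day_of_week": index.dayofweek,  # Monday = 0, Sunday = 6
                "hour": index.hour,
            },
            index=index,
        )


class LagFeaturesPlugin:
    """Shifts the target column by each requested number of time steps."""

    def __init__(self, target_column, lags=(1, 2)):
        self.target_column = target_column
        self.lags = lags

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        target = dataframe[self.target_column]
        # shift(lag) moves values forward in time, leaving NaN at the start.
        return pd.DataFrame(
            {f"{self.target_column}_lag_{lag}": target.shift(lag) for lag in self.lags}
        )


df = pd.DataFrame(
    {"value": [10.0, 20.0, 30.0, 40.0]},
    index=pd.date_range("2023-01-01", periods=4, freq="D"),
)
datetime_features = DateTimeFeaturesPlugin().process(df)
lag_features = LagFeaturesPlugin("value", lags=(1, 2)).process(df)
print(datetime_features.join(lag_features))
```

Note that lag features always start with missing values, so the first rows usually have to be dropped or imputed before training a model.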

I also added plugins for:

Rolling Windows Statistics: Calculates rolling window statistics, including the rolling mean and standard deviation of the target variable.

Expanding Windows Statistics: Calculates expanding window statistics, including the expanding mean and standard deviation of the target variable.

Moving Average: Calculates the moving average of the target variable using a specified window size.

Square Root Transform: Applies a square root transformation to the target variable.

Log Transform: Applies a logarithmic transformation to the target variable.

Box-Cox Transform: Applies a Box-Cox transformation to the target variable.
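The additional plugins could follow the same shape. The sketch below is illustrative rather than the author's code: class names, constructor parameters and output column names are assumptions. The moving average is simply the rolling mean, so it is covered by the rolling window plugin. The Box-Cox plugin here uses a fixed, user-supplied lambda to stay dependency-free, whereas a fuller implementation might estimate lambda from the data (for example with scipy.stats.boxcox).

```python
import numpy as np
import pandas as pd


class RollingWindowPlugin:
    """Rolling mean (moving average) and standard deviation of the target."""

    def __init__(self, target_column, window=3):
        self.target_column = target_column
        self.window = window

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        rolling = dataframe[self.target_column].rolling(window=self.window)
        return pd.DataFrame({"rolling_mean": rolling.mean(), "rolling_std": rolling.std()})


class ExpandingWindowPlugin:
    """Mean and standard deviation over all observations seen so far."""

    def __init__(self, target_column):
        self.target_column = target_column

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        expanding = dataframe[self.target_column].expanding()
        return pd.DataFrame({"expanding_mean": expanding.mean(), "expanding_std": expanding.std()})


class SquareRootTransformPlugin:
    def __init__(self, target_column):
        self.target_column = target_column

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        return pd.DataFrame({"sqrt": np.sqrt(dataframe[self.target_column])})


class BoxCoxTransformPlugin:
    """Box-Cox with a fixed lambda; requires strictly positive values."""

    def __init__(self, target_column, lmbda=0.5):
        self.target_column = target_column
        self.lmbda = lmbda

    def process(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        values = dataframe[self.target_column]
        if self.lmbda == 0:
            # lambda = 0 is the limiting case: a plain log transform.
            transformed = np.log(values)
        else:
            transformed = (values ** self.lmbda - 1) / self.lmbda
        return pd.DataFrame({"boxcox": transformed})


df = pd.DataFrame(
    {"value": [1.0, 4.0, 9.0, 16.0]},
    index=pd.date_range("2023-01-01", periods=4, freq="D"),
)
plugins = [
    RollingWindowPlugin("value", window=2),
    ExpandingWindowPlugin("value"),
    SquareRootTransformPlugin("value"),
    BoxCoxTransformPlugin("value", lmbda=0.5),
]
features = pd.concat([plugin.process(df) for plugin in plugins], axis=1)
print(features)
```

Setting lmbda=0 in this sketch gives the log transform, so a separate log plugin would be a one-line variation of the same class.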

The full code can be found on the link.

Conclusion

The code provides a flexible framework for extracting a variety of features from time series data using different plugins and storing the extracted features in a DataFrame.
