pipeline.preprocessing package

Submodules

pipeline.preprocessing.feature_processor module

class pipeline.preprocessing.feature_processor.FeatureGrabber(end_date, engine, config_db, con)
    getFeature(feature_to_load)
pipeline.preprocessing.feature_processor.convert_categorical(df)
pipeline.preprocessing.feature_processor.feature_name_grabber(df)
pipeline.preprocessing.feature_processor.imputation_zero(df)
pipeline.preprocessing.feature_processor.numerical_column_clean(df)

pipeline.preprocessing.feature_table_builder module

class pipeline.preprocessing.feature_table_builder.Labeller(start_date, end_date, labels)
    get_labels()
pipeline.preprocessing.feature_table_builder.chunker(seq, size)
pipeline.preprocessing.feature_table_builder.dataframe_merge(d1, d2)
pipeline.preprocessing.feature_table_builder.generate_fake_todays(fake_today, prediction_window, start_date)

Given a final prediction window start date, the length of the prediction windows, and a training start date, return the start and end dates for all prediction windows as a dictionary.

Parameters:
  • fake_today (datetime) – start date for the final prediction window
  • prediction_window (int) – length of the prediction windows in days
  • start_date (datetime) – start date for the training period
Returns:
  start and end dates for all prediction windows
Return type:
  dict
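
The description of generate_fake_todays can be illustrated with a minimal sketch. It is not the pipeline's implementation: the function name sketch_prediction_windows is hypothetical, and the sketch assumes that windows are laid out by stepping backwards from fake_today in steps of prediction_window days, that a window is kept only if it starts on or after start_date, and that the dictionary maps each window's start date to its end date. The real function may use a different layout or key format.

```python
from datetime import datetime, timedelta


def sketch_prediction_windows(fake_today, prediction_window, start_date):
    """Hypothetical stand-in for generate_fake_todays (assumed behaviour).

    Returns a dict mapping each window's start date to its end date.
    """
    window = timedelta(days=prediction_window)
    windows = {}
    window_start = fake_today
    # Step backwards one window at a time; stop once a window would
    # start before the training period begins.
    while window_start >= start_date:
        windows[window_start] = window_start + window
        window_start -= window
    return windows


windows = sketch_prediction_windows(
    fake_today=datetime(2016, 3, 1),
    prediction_window=30,
    start_date=datetime(2016, 1, 1),
)
for begin, end in sorted(windows.items()):
    print(begin.date(), "->", end.date())
```

With these example inputs the sketch yields three windows, the earliest starting on the training start date itself. Keying the dictionary by start date is a choice made here for readability only.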

pipeline.preprocessing.feature_table_builder.generate_feature_list(config)
pipeline.preprocessing.feature_table_builder.generate_feature_table(config, fake_today, prediction_window, start_date, feature_timestamp)
pipeline.preprocessing.feature_table_builder.label_feature_producer(start_date, end_date, features, labels)
pipeline.preprocessing.feature_table_builder.merge_feature_dictionaries(d1, d2)
pipeline.preprocessing.feature_table_builder.write_dataframe_to_sql(df_name, df, schema)

pipeline.preprocessing.run module

pipeline.preprocessing.run.main(config_file_name)

Module contents