pimkl package
Subpackages
Submodules
pimkl.analysis module
- pimkl.analysis.plot_aucs_to_buffer(df, save=False)
Plot AUCs for a multi-indexed pandas.DataFrame where df.columns.names is ['data', 'kind'].
- pimkl.analysis.plot_weights_significant_correlations_to_buffer(weights_df, correlation_type, save=False)
Plot a heatmap of the correlations between molecular signatures, showing a value only where the correlation is significant, where weights_df.index.names is ['fold', 'class'].
pimkl.cli module
pimkl.data module
Split data into training and test sets.
- pimkl.data.get_learning_data(X, labels=None, max_per_class=30)
Return the test and training split for a single data type.
- pimkl.data.get_learning_data_in_dict_mode(X, labels=None, data_types=None, max_per_class=30)
Return the test and training split for multiple data types.
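The max_per_class parameter suggests a split in which each class contributes a capped number of samples to the training set. That reading is an assumption and is not stated in the docstrings above. The sketch below illustrates the idea in plain Python; split_by_class is a hypothetical helper and does not call pimkl.

```python
import random
from collections import defaultdict


def split_by_class(samples, labels, max_per_class=30, seed=0):
    """Hypothetical helper: cap the training samples drawn from each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        # Assumed semantics: up to max_per_class samples train, the rest test.
        train_idx.extend(indices[:max_per_class])
        test_idx.extend(indices[max_per_class:])

    train = [samples[i] for i in sorted(train_idx)]
    test = [samples[i] for i in sorted(test_idx)]
    return train, test


samples = [f"sample_{i}" for i in range(10)]
labels = ["a"] * 6 + ["b"] * 4
train, test = split_by_class(samples, labels, max_per_class=3)
```

With six samples of class "a", four of class "b", and a cap of three, each class contributes three training samples and the remainder form the test set.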
pimkl.evaluation module
pimkl.inducers module
- pimkl.inducers.get_matching_data_and_network(data, network)
Intersect data labels with network node labels.
- pimkl.inducers.get_pathway_inducer(network, gene_set, normed=True)
Get a Laplacian-based pathway inducer.
- pimkl.inducers.read_inducer(filename, size, header=None, sep=',')
Read an inducer in CSC format.
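To make the first two entries concrete, the sketch below intersects data labels with network nodes and then builds a symmetric normalized graph Laplacian for the nodes of a gene set. It is a plain-Python illustration under assumptions: match_labels and normalized_laplacian are hypothetical names, and mapping normed=True to the normalized Laplacian I - D^{-1/2} A D^{-1/2} is a guess rather than documented pimkl behaviour.

```python
import math


def match_labels(data_labels, network_nodes):
    """Keep the labels present in both the data and the network, in data order."""
    nodes = set(network_nodes)
    return [label for label in data_labels if label in nodes]


def normalized_laplacian(edges, nodes):
    """Return I - D^{-1/2} A D^{-1/2} for the subgraph induced by `nodes`."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        # Edges leaving the node set are dropped; self-loops are ignored.
        if u in index and v in index and u != v:
            adjacency[index[u]][index[v]] = 1.0
            adjacency[index[v]][index[u]] = 1.0
    degrees = [sum(row) for row in adjacency]
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Isolated nodes get a zero row by convention.
                laplacian[i][j] = 1.0 if degrees[i] > 0 else 0.0
            elif adjacency[i][j]:
                laplacian[i][j] = -1.0 / math.sqrt(degrees[i] * degrees[j])
    return laplacian


edges = [("A", "B"), ("B", "C"), ("C", "D")]
data_labels = ["B", "A", "C", "X"]
shared = match_labels(data_labels, ["A", "B", "C", "D"])
gene_set = ["A", "B", "C"]
members = [label for label in shared if label in gene_set]
inducer = normalized_laplacian(edges, members)
```

Here the label "X" is dropped because it is absent from the network, and the edge to "D" is ignored because "D" is outside the gene set.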
pimkl.network module
- pimkl.network.generate_random_sets(number_of_sets, max_nodes, nodes_labels, number_of_nodes=None)
pimkl.pimkl module
Main module.
pimkl.run module
- pimkl.run.fold_generator(number_of_folds, data, labels, max_per_class, transformer_class=<class 'pimkl.utils.preprocessing.standardizer.Standardizer'>)
Generate class-balanced splits of data and labels.
- pimkl.run.run_model(inducers, induction_name, mkl_name, estimator_name, mkl_parameters, estimator_parameters, induction_parameters, inducers_extended_names, fold_parameters)
Run a single fold of the model with data splits from fold_generator.
The arguments are those passed to PIMKL, followed by inducers_extended_names and fold_parameters, a dict containing the fold-specific arguments. In conjunction with partial and fold_generator, it can be used to run folds in parallel:
`list(pool.imap(run_fold, fold_generator(...)))`
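The pattern in the one-liner above can be sketched with standard-library pieces only. In this illustration toy_fold_generator and toy_run_model are hypothetical stand-ins for fold_generator and run_model, and a thread pool replaces whatever pool the authors use. It shows how functools.partial binds the shared arguments so that only the fold-specific dict remains open.

```python
from functools import partial
from multiprocessing.pool import ThreadPool


def toy_fold_generator(number_of_folds):
    """Hypothetical stand-in yielding one dict of fold-specific arguments per fold."""
    for fold in range(number_of_folds):
        yield {"fold": fold}


def toy_run_model(model_name, scale, fold_parameters):
    """Hypothetical stand-in for a model run; the fold dict is the last argument."""
    return model_name, fold_parameters["fold"] * scale


# Bind the arguments shared by every fold, leaving only fold_parameters open.
run_fold = partial(toy_run_model, "toy-model", 10)

with ThreadPool(processes=2) as pool:
    # imap preserves input order, so results line up with the folds.
    results = list(pool.imap(run_fold, toy_fold_generator(3)))
```

A thread pool keeps the sketch free of pickling concerns. A process pool exposes the same imap interface but requires the callable and its arguments to be picklable.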
Module contents
Top-level package for pimkl.