- Cluster
- Dendrogram
class Cluster

Class for data clustering.

Methods defined here:
- __cluster_columns__(self, column_distance, column_linkage)
- __impute_missing_values__(self, data)
- __init__(self)
- __reorder_data__(self, data, order)
- __return_missing_values__(self, data, missing_values_indexes)
- cluster_data(self, row_distance='euclidean', row_linkage='single', axis='row', column_distance='euclidean', column_linkage='ward')
- Performs clustering according to the given parameters.
@datatype - numeric/binary
@row_distance/column_distance - see the DISTANCES variable
@row_linkage/column_linkage - see the LINKAGES variable
@axis - row/both
- normalize_data(self, feature_range=(0, 1), write_original=False)
- Normalizes data to the scale given by feature_range (0 to 1 by default). When write_original is set to True,
the normalized data will be clustered, but the original data will be written to the heatmap.
- read_csv(self, filename, delimiter=',', header=False, missing_value=False, datatype='numeric')
- Reads data from a CSV file.
- read_data(self, rows, header=False, missing_value=False, datatype='numeric')
- Reads data in the form of a list of lists (tuples).
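
The scaling described for normalize_data can be illustrated with a minimal, standard-library sketch. This is not the library's implementation: the helper name min_max_scale is hypothetical, and it assumes that each column is scaled independently into feature_range.

```python
def min_max_scale(rows, feature_range=(0, 1)):
    """Linearly scale each column of `rows` into `feature_range`.

    Hypothetical helper for illustration only; column-wise scaling is an assumption.
    """
    low, high = feature_range
    scaled_columns = []
    for col in zip(*rows):  # iterate over columns
        col_min, col_max = min(col), max(col)
        span = col_max - col_min
        if span == 0:
            # Constant column: map every value to the lower bound.
            scaled_columns.append([float(low)] * len(col))
        else:
            scaled_columns.append(
                [low + (v - col_min) * (high - low) / span for v in col]
            )
    # Transpose back to row-major order.
    return [list(row) for row in zip(*scaled_columns)]


original = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
scaled = min_max_scale(original)

# With write_original=True the idea is to cluster `scaled`
# while still displaying `original` in the heatmap.
```

Keeping both lists around mirrors the write_original option: one copy drives the clustering, the other is what the reader sees.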

class Dendrogram

Class which handles generation of the cluster heatmap format from clustered data.
As input it takes a Cluster instance with clustered data.

Methods defined here:
- __add_column_metadata_to_data__(self)
- __adjust_node_counts__(self)
- __check_column_metadata_length__(self)
- __compress_data__(self)
- __connect_additional_data_to_data__(self, additional_data, compressed_value)
- __connect_metadata_to_data__(self)
- __get_cluster_heatmap__(self, write_data)
- __get_column_dendrogram__(self)
- __get_distance_treshold__(self, cluster_count)
- __get_most_frequent__(self, col)
- __init__(self, clustering)
- __read_alternative_data__(self, alternative_data)
- __read_alternative_data_file__(self, alternative_data_file, delimiter)
- __read_metadata__(self, metadata, header)
- __read_metadata_file__(self, metadata_file, delimiter, header)
- __reorder_alternative_data__(self, alternative_data)
- add_alternative_data(self, alternative_data, header, alternative_data_compressed_value)
- Adds alternative data in the form of a list of lists (tuples).
- add_alternative_data_from_file(self, alternative_data_file, delimiter, header, alternative_data_compressed_value)
- Adds alternative data from a CSV file.
- add_column_metadata(self, column_metadata, header=True)
- Adds column metadata in the form of a list of lists (tuples). Column metadata has no header row; the first item in each row is used as the label instead.
- add_column_metadata_from_file(self, column_metadata_file, delimiter=',', header=True)
- Adds column metadata from a CSV file. Column metadata has no header row.
- add_metadata(self, metadata, header=True, metadata_compressed_value='median')
- Adds metadata in the form of a list of lists (tuples).
metadata_compressed_value specifies the resulting value when the data are compressed (median/mean/frequency).
- add_metadata_from_file(self, metadata_file, delimiter, header=True, metadata_compressed_value='median')
- Adds metadata from a CSV file.
metadata_compressed_value specifies the resulting value when the data are compressed (median/mean/frequency).
- create_cluster_heatmap(self, compress=False, compressed_value='median', write_data=True)
- Creates a cluster heatmap representation in the InCHlib format. By setting the compress parameter you can
cut the dendrogram at a distance that reduces the number of heatmap rows to the specified count.
When compressing, the value of the merged rows is given by the compressed_value parameter (median, mean).
When the metadata are nominal (text values), the most frequent value is the result after compression.
Setting write_data to False omits the data features from the resulting format.
- export_cluster_heatmap_as_html(self, htmldir='.')
- Exports a simple HTML page with the embedded cluster heatmap and its dependencies to the given directory.
- export_cluster_heatmap_as_json(self, filename=None)
- Returns the cluster heatmap in JSON format, or exports it to the file specified by the filename parameter.
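
The compression rule described for create_cluster_heatmap (median or mean for numeric values, most frequent value for nominal ones) can be sketched as follows. The function compress_rows and the sample rows are hypothetical and use only the standard library; how the library itself decides which rows to merge is not shown here.

```python
from collections import Counter
from statistics import mean, median


def compress_rows(rows, compressed_value="median"):
    """Merge several rows into one, column by column.

    Hypothetical helper: numeric columns are reduced with the median or
    mean, nominal (text) columns with their most frequent value.
    """
    reducers = {"median": median, "mean": mean}
    reduce_numeric = reducers[compressed_value]
    merged = []
    for col in zip(*rows):  # iterate over columns
        if all(isinstance(v, (int, float)) for v in col):
            merged.append(reduce_numeric(col))
        else:
            # Nominal column: take the most frequent value.
            merged.append(Counter(col).most_common(1)[0][0])
    return merged


# Three rows that are assumed to fall into the same cluster after the cut.
cluster_rows = [[1.0, 4.0, "a"], [2.0, 6.0, "b"], [6.0, 8.0, "a"]]
merged_median = compress_rows(cluster_rows, "median")
merged_mean = compress_rows(cluster_rows, "mean")
```

The median is less sensitive to a single outlying row than the mean, which may matter when a compressed cluster contains few rows.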