Source: texcla/corpus.py#L0


read_folder

read_folder(directory)

Reads the text files in a directory and returns them as an array.

Args:

  • directory: path to the directory containing the text files

Returns:

Array of text
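
The behaviour described above can be illustrated with a small standalone sketch. This is not the library's implementation: `read_folder_sketch` is a hypothetical stand-in, and the sorted file order, the UTF-8 encoding, and the skipping of subdirectories are all assumptions.

```python
import os
import tempfile


def read_folder_sketch(directory):
    """Hypothetical stand-in for read_folder: returns one string per file."""
    texts = []
    # Sorting makes the order deterministic (an assumption; the original
    # function may return files in a different order).
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories (assumed behaviour)
        with open(path, encoding="utf-8") as handle:  # encoding is assumed
            texts.append(handle.read())
    return texts


# Demo on a throwaway directory holding two small text files.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("a.txt", "first review"), ("b.txt", "second review")]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write(body)
    demo_texts = read_folder_sketch(tmp)

print(demo_texts)
```

The result is a plain list of strings, so it can be passed directly to a tokenizer or vectorizer.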


read_pos_neg_data

read_pos_neg_data(path, folder, limit)

Returns an array with positive and negative examples.
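
The description does not spell out the directory layout or the exact return shape, so the following is only a sketch under stated assumptions: examples are assumed to live in `pos` and `neg` subfolders of `path/folder`, `limit` is assumed to cap each class separately, and the result is assumed to list positives before negatives. `read_pos_neg_sketch` is a hypothetical name, not the library function.

```python
import os
import tempfile


def read_pos_neg_sketch(path, folder, limit=None):
    """Hypothetical stand-in: positive examples first, then negative ones."""
    examples = []
    for label_dir in ("pos", "neg"):  # assumed subfolder names
        directory = os.path.join(path, folder, label_dir)
        # Slicing with limit=None keeps every file.
        names = sorted(os.listdir(directory))[:limit]
        for name in names:
            with open(os.path.join(directory, name), encoding="utf-8") as handle:
                examples.append(handle.read())
    return examples


# Demo: build <tmp>/train/pos and <tmp>/train/neg with two files each.
with tempfile.TemporaryDirectory() as tmp:
    for label_dir, word in [("pos", "good"), ("neg", "bad")]:
        directory = os.path.join(tmp, "train", label_dir)
        os.makedirs(directory)
        for i in range(2):
            with open(os.path.join(directory, f"{i}.txt"), "w", encoding="utf-8") as handle:
                handle.write(f"{word} {i}")
    demo_limited = read_pos_neg_sketch(tmp, "train", limit=1)
    demo_all = read_pos_neg_sketch(tmp, "train")

print(demo_limited)
print(len(demo_all))
```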


imdb

imdb(limit=None, shuffle=True)

Downloads (and caches) the IMDB Movie Reviews dataset: 25k training examples and 25k test examples.

Args:

  • limit: return only the first N items for each class

Returns:

[X_train, y_train, X_test, y_test]
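
To make the `limit` and `shuffle` parameters concrete, here is a minimal sketch of one way "first N items for each class" followed by shuffling could work. It is not the library's code: `limit_per_class` and its `seed` parameter are hypothetical, and applying the limit before the shuffle is an assumption.

```python
import random


def limit_per_class(texts, labels, limit=None, shuffle=True, seed=0):
    """Keep the first `limit` items of each class, then optionally shuffle."""
    kept = []
    seen = {}  # label -> number of items kept so far
    for text, label in zip(texts, labels):
        count = seen.get(label, 0)
        if limit is None or count < limit:
            kept.append((text, label))
            seen[label] = count + 1
    if shuffle:
        # A seeded generator keeps the demo reproducible (hypothetical detail).
        random.Random(seed).shuffle(kept)
    X = [text for text, _ in kept]
    y = [label for _, label in kept]
    return X, y


texts = ["p1", "p2", "p3", "n1", "n2", "n3"]
labels = [1, 1, 1, 0, 0, 0]

# Without shuffling, the first two items of each class survive in order.
X_small, y_small = limit_per_class(texts, labels, limit=2, shuffle=False)
print(X_small, y_small)

# With shuffling, the same items come back in a different arrangement,
# and each text stays paired with its label.
X_mixed, y_mixed = limit_per_class(texts, labels, limit=2, shuffle=True)
print(sorted(X_mixed), sorted(y_mixed))
```

With the package installed, and assuming the function is importable from `texcla.corpus` as the source path suggests, the documented return value unpacks as `X_train, y_train, X_test, y_test = imdb(limit=100)`.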