SentimentAnalysis
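Project Overview
- This project follows the approach in Yoon Kim's paper "Convolutional Neural Networks for Sentence Classification" and uses a convolutional neural network (CNN) for sentence-level sentiment analysis.
- Project structure
  - data/*: JSON data
  - parse_data: generates the JSON data files
  - cnn.py: CNN model
  - train.py: training script
  - test.py: test script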
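As a rough illustration of the building block described in Kim's paper, the sketch below applies one convolution filter across a sentence and keeps the maximum activation (max-over-time pooling). It is written in plain Python and is not taken from cnn.py; the function names, the ReLU activation, and the toy inputs are all assumptions made for the example.

```python
# Minimal sketch of convolution + max-over-time pooling for text.
# All names are illustrative; none of this comes from cnn.py.

def conv_max_pool(sentence, filt, bias=0.0):
    """Slide `filt` (h x d) over `sentence` (n x d); return the max ReLU activation."""
    n, h = len(sentence), len(filt)
    if n < h:
        raise ValueError("sentence is shorter than the filter window")
    activations = []
    for start in range(n - h + 1):  # n - h + 1 window positions
        window = sentence[start:start + h]
        total = bias
        for word_vec, filt_row in zip(window, filt):
            total += sum(w * f for w, f in zip(word_vec, filt_row))
        activations.append(max(0.0, total))  # ReLU (assumed activation)
    return max(activations)  # max-over-time pooling


def sentence_features(sentence, filters_by_size):
    """Concatenate the pooled output of every filter, ordered by filter size."""
    features = []
    for size in sorted(filters_by_size):
        for filt in filters_by_size[size]:
            features.append(conv_max_pool(sentence, filt))
    return features


# Toy input: 4 words with embedding dimension 2.
toy_sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
toy_filters = {
    2: [[[1.0, 0.0], [0.0, 1.0]]],              # one filter spanning 2 words
    3: [[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]],  # one filter spanning 3 words
}
toy_features = sentence_features(toy_sentence, toy_filters)
print(toy_features)
```

Each filter contributes exactly one number to the feature vector regardless of sentence length, which is what lets sentences of different lengths feed the same classifier layer.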
- Parameters
  - ALLOW_SOFT_PLACEMENT=True
  - BATCH_SIZE=50
  - CHECKPOINT_EVERY=100
  - DEV_DATA_FILE=./data/dev.json
  - DROPOUT_KEEP_PROB=0.5
  - EMBEDDING_DIM=300
  - EVALUATE_EVERY=100
  - FILTER_SIZES=3,4,5
  - L2_REG_LAMBDA=3.0
  - LOG_DEVICE_PLACEMENT=False
  - NUM_CHECKPOINTS=5
  - NUM_EPOCHS=200
  - NUM_FILTERS=100
  - TEST_DATA_FILE=./data/test.json
  - TRAIN_DATA_FILE=./data/train.json
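To make the role of FILTER_SIZES, NUM_FILTERS, and EMBEDDING_DIM more concrete, here is a small sketch that parses a comma-separated filter-size string and derives two quantities that follow from Kim's architecture: the length of the pooled feature vector and the number of convolution weights. The lowercase flag names and helper functions are hypothetical; train.py may define its flags differently.

```python
import argparse


def build_parser():
    """Hypothetical flag parser; the real train.py may use other names."""
    parser = argparse.ArgumentParser(description="CNN hyperparameter sketch")
    parser.add_argument("--filter_sizes", default="3,4,5")
    parser.add_argument("--num_filters", type=int, default=100)
    parser.add_argument("--embedding_dim", type=int, default=300)
    return parser


def parse_filter_sizes(text):
    """Turn '3,4,5' into [3, 4, 5]."""
    return [int(part) for part in text.split(",")]


def pooled_feature_size(filter_sizes, num_filters):
    """Each filter yields one value after max-over-time pooling."""
    return len(filter_sizes) * num_filters


def conv_weight_count(filter_sizes, num_filters, embedding_dim):
    """Each filter has size * embedding_dim weights plus one bias."""
    return sum((size * embedding_dim + 1) * num_filters for size in filter_sizes)


args = build_parser().parse_args([])  # use the defaults listed above
sizes = parse_filter_sizes(args.filter_sizes)
print("pooled feature size:", pooled_feature_size(sizes, args.num_filters))
print("convolution weights:", conv_weight_count(sizes, args.num_filters, args.embedding_dim))
```

Under these assumptions, raising NUM_FILTERS widens the pooled feature vector linearly, while adding a filter size both widens it and adds a new bank of filters.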
Steps
- Model data: https://cloud.tsinghua.edu.cn/d/e3da1c00a9e84a5d9132/
- Train with dropout.
    $ python test.py --checkpoint_dir="./runs/1577169494/checkpoints/"
    Total number of test examples: 2210
    Accuracy: 0.400905
- Train with a hidden size of 256 or 512.
  - Modified parameter: NUM_FILTERS = 256
      $ python test.py --checkpoint_dir="./runs/1577171119/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.40905
  - Modified parameter: NUM_FILTERS = 512
      $ python test.py --checkpoint_dir="./runs/1577172661/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.39819
- Train with a different number of hidden layers. (The number of hidden layers should be set to 1 and 3.)
  - Modified parameter: FILTER_SIZES = 3
      $ python test.py --checkpoint_dir="./runs/1577174708/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.384163
  - Modified parameter: FILTER_SIZES = 3,4
      $ python test.py --checkpoint_dir="./runs/1577175634/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.39819
  - Modified parameter: FILTER_SIZES = 3,4,5
      $ python test.py --checkpoint_dir="./runs/1577169494/checkpoints/"
      Total number of test examples: 2210
      Accuracy: 0.400905
- Train with pre-trained word embeddings. (We supply 300-dimensional GloVe pre-trained word embeddings for your experiments, and you can explore the model's performance at the same dimension without pre-trained word embeddings.)
    $ python test.py --checkpoint_dir="./runs/1577022248/checkpoints/"
    Total number of test examples: 2210
    Accuracy: 0.414027
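The accuracies reported above can be compared side by side by converting each one back into a count of correctly classified test examples. The sketch below does that arithmetic using only the numbers from this README. The run labels are shorthand chosen for the example, and the "default parameters" label assumes that the dropout run and the FILTER_SIZES = 3,4,5 run are the same run, since they share a checkpoint directory.

```python
TOTAL_TEST_EXAMPLES = 2210

# Accuracies copied from the test.py outputs reported in this README.
reported = {
    "default parameters (dropout, FILTER_SIZES=3,4,5)": 0.400905,
    "NUM_FILTERS=256": 0.40905,
    "NUM_FILTERS=512": 0.39819,
    "FILTER_SIZES=3": 0.384163,
    "FILTER_SIZES=3,4": 0.39819,
    "pre-trained GloVe embeddings": 0.414027,
}


def correct_count(accuracy, total=TOTAL_TEST_EXAMPLES):
    """Recover the number of correct predictions from a rounded accuracy."""
    return round(accuracy * total)


counts = {name: correct_count(acc) for name, acc in reported.items()}
best = max(reported, key=reported.get)

# Print runs from highest to lowest accuracy.
for name, acc in sorted(reported.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:50s} {acc:.6f}  ({counts[name]} / {TOTAL_TEST_EXAMPLES})")
print("best run:", best)
```

Viewed as counts, the gap between the best and worst runs is a few dozen examples out of 2210, which is worth keeping in mind when interpreting single-run differences.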