I'm a beginner trying to use an LSTM to classify short text fields. The dataset looks like this:
Data ----- Label
DKWL----0
FCHN----0
KDQP----0
IHGS----1
....
I then encoded each sample as a binary string:
00011101000001000111-----1
.....
shape:(N,20)
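The encoding scheme itself is not spelled out above. As a rough sketch, one scheme that produces 20 values per 4-letter sample is to write each letter's alphabet index as 5 bits. The mapping A=0 ... Z=25 and the helper name `encode` are assumptions made for illustration, not necessarily what was actually used.

```python
def encode(word):
    # Hypothetical scheme: A=0, ..., Z=25, each letter written as 5 bits,
    # so a 4-letter word becomes a 20-character string of 0s and 1s.
    return "".join(format(ord(ch) - ord("A"), "05b") for ch in word)

# The four samples listed in the post, with their labels.
samples = [("DKWL", 0), ("FCHN", 0), ("KDQP", 0), ("IHGS", 1)]

# X is a list of N rows with 20 ints each, i.e. shape (N, 20).
X = [[int(bit) for bit in encode(word)] for word, _ in samples]
y = [label for _, label in samples]

print(encode("DKWL"))
print(len(X), len(X[0]))
```

Under this assumed scheme, every group of 5 consecutive bits belongs to one letter, which matters later when choosing `time_step` and `input_size`.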
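Next, I built the network:
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(
            input_size=20,     # each sample has 20 features
            hidden_size=64,
            num_layers=1,
            batch_first=True,  # tensors are (batch, time_step, input_size)
        )
        self.out = nn.Linear(64, 2)  # binary classification, so I set the output size to 2

    def forward(self, x):
        # r_out: (batch, time_step, hidden_size); None means zero initial states
        r_out, (h_n, h_c) = self.rnn(x, None)
        out = self.out(r_out[:, -1, :])  # use the output of the last time step
        return out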
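Finally, I started training:
for epoch in range(EPOCH):
    for step, (x, b_y) in enumerate(train_loader):  # gives batch data
        b_x = x.view(-1, ?, ?)
        # What shape should I use for this reshape? According to the tutorial,
        # it should be (batch, time_step, input_size). Since I want the network
        # to remember the order of the features within each sample, I set
        # time_step to 20 (one sample has 20 features), which gave (-1, 20, 20),
        # but that raises an error. I then changed it to (-1, 1, 20) and it runs
        # fine. With that shape, though, does the network no longer remember
        # the sequence of the features?
        output = rnn(b_x)
        loss = loss_func(output, b_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()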
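To make the shape question concrete, here is a minimal sketch of how `view` and `nn.LSTM` interact. It uses a random 0/1 batch of 8 samples as a stand-in for the real data. Two constraints apply: `time_step * input_size` must equal the 20 values each sample actually has, and the last dimension must match the `input_size` the LSTM was constructed with. Option B assumes each character occupies 5 consecutive bits, which the post does not confirm.

```python
import torch
import torch.nn as nn

batch = 8
# Stand-in for one batch of encoded samples, shape (batch, 20).
x = torch.randint(0, 2, (batch, 20)).float()

# (-1, 20, 20) asks for 400 values per sample, but each sample has only 20,
# so the 160 values in this batch cannot be arranged that way.
try:
    x.view(-1, 20, 20)
    view_failed = False
except RuntimeError:
    view_failed = True

# Shape from the post: one time step holding all 20 features.
# The LSTM runs a single step, so there is no recurrence across the features.
lstm_one_step = nn.LSTM(input_size=20, hidden_size=64, batch_first=True)
out_one_step, _ = lstm_one_step(x.view(-1, 1, 20))

# Option A: 20 time steps with 1 feature each; the LSTM needs input_size=1.
lstm_a = nn.LSTM(input_size=1, hidden_size=64, batch_first=True)
out_a, _ = lstm_a(x.view(-1, 20, 1))

# Option B (assumes 5 bits per character): 4 time steps with 5 features each;
# the LSTM needs input_size=5.
lstm_b = nn.LSTM(input_size=5, hidden_size=64, batch_first=True)
out_b, _ = lstm_b(x.view(-1, 4, 5))

print(view_failed)
print(out_one_step.shape, out_a.shape, out_b.shape)
```

In all three working cases, `r_out[:, -1, :]` still has shape `(batch, 64)`, so the `nn.Linear(64, 2)` layer can stay unchanged; only `input_size` and the `view` call need to agree with each other.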