
Sentiment Analysis with Embeddings + RNN / GRU / LSTM

This page summarizes how the sentiment classifier was built: data preparation, embedding strategy, training setup, model variants (baseline vs GRU vs LSTM), and the visuals used to validate results. Code is intentionally shown only as small snippets (no full notebook dump).

Tags: NLP, Sequence Classification, Embeddings, RNN • GRU • LSTM

Task

Sentiment classification (positive, negative, neutral, or irrelevant)

Input

Tokenized text sequences with padding/truncation

Output

Probability score + class label

1) Data & Preprocessing

Text was cleaned and standardized before training. The objective was to reduce noise without losing meaning (especially around negations like “not good”).
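
The project's actual cleaning code is not shown on this page, so the following is only a minimal sketch of cleaning that keeps negations intact. The function name `clean_text` and the regular expressions are assumptions made for illustration.

```python
import re

# Hypothetical cleaner: patterns and names are illustrative, not the project's code.
_URL = re.compile(r"https?://\S+|www\.\S+")
_NON_WORD = re.compile(r"[^a-z0-9'\s]")
_SPACES = re.compile(r"\s+")

def clean_text(text: str) -> str:
    """Lowercase, drop URLs and punctuation, collapse whitespace.

    Straight apostrophes are kept so contractions such as "isn't" survive,
    and no stop-word list is applied, so "not" stays in the text.
    """
    text = text.lower()
    text = _URL.sub(" ", text)
    text = _NON_WORD.sub(" ", text)
    return _SPACES.sub(" ", text).strip()

print(clean_text("This is NOT good!!! See https://example.com :("))
```

The design point is what the function does not do: removing stop words would delete "not" and turn "not good" into "good".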

  • Lowercasing + basic normalization
  • Tokenization → integer sequences
  • Padding/truncation to a fixed length
  • Train/validation split with consistent random seed
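# Tokenization + padding (snippet)
# Imports assume the TensorFlow 2.x Keras preprocessing API
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Fit the vocabulary on training text only, so validation words stay unseen
tokenizer = Tokenizer(num_words=VOCAB_SIZE, oov_token="<OOV>")
tokenizer.fit_on_texts(train_texts)

X_train = pad_sequences(tokenizer.texts_to_sequences(train_texts),
                        maxlen=MAX_LEN, padding="post", truncating="post")
X_val   = pad_sequences(tokenizer.texts_to_sequences(val_texts),
                        maxlen=MAX_LEN, padding="post", truncating="post")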
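
To make the effect of `padding="post"` and `truncating="post"` concrete, here is a small pure-Python sketch for a single sequence. The helper `pad_or_truncate` is hypothetical and only mimics what `pad_sequences` does with these two settings.

```python
def pad_or_truncate(seq, max_len, pad_value=0):
    """Mimic padding="post", truncating="post" for one sequence of token ids."""
    trimmed = list(seq[:max_len])  # "post" truncation keeps the first max_len ids
    return trimmed + [pad_value] * (max_len - len(trimmed))  # pad values go at the end

print(pad_or_truncate([5, 9, 2], 5))           # [5, 9, 2, 0, 0]
print(pad_or_truncate([5, 9, 2, 7, 4, 8], 5))  # [5, 9, 2, 7, 4]
```

With post-truncation, anything beyond the fixed length is dropped, so sentiment expressed only at the end of a very long text would be lost.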

Visual checks used

  • Word cloud to quickly inspect dominant vocabulary
Word cloud for the corpus (placeholder)
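
A word cloud is a rendering of token frequencies, so the same check can be sketched with a plain frequency count. The helper `top_words` and the sample texts below are made up for illustration and are not taken from the project's corpus.

```python
from collections import Counter

def top_words(texts, n=10, stopwords=frozenset()):
    """Count whitespace-separated tokens across texts and return the n most common."""
    counts = Counter(
        token
        for text in texts
        for token in text.lower().split()
        if token not in stopwords
    )
    return counts.most_common(n)

# Toy corpus, for illustration only.
sample = ["great game", "not a great update", "game crashed again"]
print(top_words(sample, n=2, stopwords={"a"}))
```

Running this per class can show whether the dominant vocabulary actually differs between labels.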

2) Embedding Strategy

The embedding layer converts word IDs into dense vectors so the model can learn semantic relationships. We used trainable embeddings as the default approach.

Vocab size: 20000 (default constant)
Embedding dim: 128 and 68
Max sequence length: 163
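# Embedding layer (snippet)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding

model = Sequential([
  Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM, input_length=MAX_LEN),
  # ... (remaining layers)
])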

Optional upgrade: use pre-trained embeddings (e.g., GloVe) to improve generalization on small datasets.
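
One way to wire in pre-trained vectors is to build a weight matrix whose rows line up with the tokenizer's integer ids. The sketch below assumes the vectors have already been parsed into a dict; `build_embedding_matrix`, the toy `word_index`, and the toy `vectors` are all hypothetical, and this upgrade was not part of the reported setup.

```python
import numpy as np

def build_embedding_matrix(word_index, vectors, vocab_size, embed_dim, seed=0):
    """Return a (vocab_size, embed_dim) matrix aligned with the tokenizer's ids.

    word_index: dict word -> integer id (e.g. tokenizer.word_index)
    vectors:    dict word -> 1-D array of length embed_dim (pre-trained vectors)
    Words without a pre-trained vector keep a small random initialisation;
    row 0 stays all zeros because id 0 is the padding value.
    """
    rng = np.random.default_rng(seed)
    matrix = rng.normal(scale=0.05, size=(vocab_size, embed_dim))
    matrix[0] = 0.0
    for word, idx in word_index.items():
        if idx < vocab_size and word in vectors:
            matrix[idx] = vectors[word]
    return matrix

# Toy stand-ins for tokenizer.word_index and a parsed GloVe file.
word_index = {"<OOV>": 1, "good": 2, "bad": 3}
vectors = {"good": np.array([0.1, 0.2]), "bad": np.array([-0.1, -0.2])}
matrix = build_embedding_matrix(word_index, vectors, vocab_size=4, embed_dim=2)
print(matrix.shape)
```

Such a matrix could then seed the Embedding layer, for example through `embeddings_initializer=tf.keras.initializers.Constant(matrix)`, with `trainable` set according to how much fine-tuning the dataset can support.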

Baseline vs Sequence Models

  • Baseline: Embedding → GlobalAveragePooling1D → dense classifier
  • RNN: Embedding → SimpleRNN → dense classifier
  • GRU: Embedding → GRU → dense classifier
  • LSTM: Embedding → LSTM → dense classifier
# Baseline (snippet)
Embedding(...)
GlobalAveragePooling1D()
Dense(4, activation="softmax")  # four mutually exclusive classes
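
The pooling baseline ignores word order, which is the main reason to compare it against recurrent layers. The NumPy sketch below uses a toy embedding table (sizes and ids are illustrative assumptions) to show that two sequences with the same words in a different order pool to the same vector.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy embedding table: 5 word ids, 3 dimensions (illustrative sizes only).
embedding_table = rng.normal(size=(5, 3))

def mean_pool(token_ids):
    """Look up each id's vector and average over the time axis."""
    return embedding_table[np.asarray(token_ids)].mean(axis=0)

not_good = mean_pool([1, 2])  # e.g. ids standing in for "not", "good"
good_not = mean_pool([2, 1])  # same words, reversed order
print(np.allclose(not_good, good_not))  # True
```

Note that in Keras the average also includes padded positions unless a mask is propagated (for example via `mask_zero=True` on the Embedding layer).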

3) Model Variants Compared

The main comparison was whether sequence-aware units (RNN/GRU/LSTM) improved over an embedding-only baseline. GRU and LSTM typically help when longer-range dependencies matter, because their gating mitigates the vanishing gradients that limit a SimpleRNN.

SimpleRNN

Good for fundamentals + quick tests; limited for long context.

Embedding(...)
SimpleRNN(UNITS)
Dense(4, activation="softmax")

GRU

Fewer gates than LSTM; often faster with strong performance.

Embedding(...)
GRU(UNITS)
Dense(4, activation="softmax")

LSTM

Strong for longer sequences; more parameters / compute.

Embedding(...)
LSTM(UNITS)
Dense(4, activation="softmax")
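
The "fewer gates" and "more parameters" remarks can be made concrete by counting trainable weights per recurrent layer. This sketch uses the standard formulas for Keras layers with default settings; `UNITS = 64` is an assumed value, since the page does not state the unit count.

```python
def recurrent_params(kind: str, input_dim: int, units: int) -> int:
    """Trainable parameter count of one recurrent layer (Keras default settings)."""
    per_gate = units * (input_dim + units + 1)  # input kernel + recurrent kernel + bias
    if kind == "rnn":
        return per_gate
    if kind == "lstm":
        return 4 * per_gate  # input, forget, output gates + cell candidate
    if kind == "gru":
        # Keras GRU defaults to reset_after=True, which keeps two bias vectors per gate.
        return 3 * units * (input_dim + units + 2)
    raise ValueError(f"unknown layer kind: {kind}")

EMBED_DIM, UNITS = 128, 64  # UNITS = 64 is an assumption, not from the project
for kind in ("rnn", "gru", "lstm"):
    print(kind, recurrent_params(kind, EMBED_DIM, UNITS))
```

Under these assumptions the LSTM layer carries four times the weights of the SimpleRNN and the GRU roughly three times, which matches the speed and capacity trade-off described above.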

4) Training Setup

Training used a standard multi-class classification setup (four classes, one-hot labels).

  • Loss: categorical cross-entropy
  • Optimizer: Adam
  • Regularization: dropout (if overfitting appears)
  • Callbacks: early stopping / checkpoints
# Compile + train (snippet)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=EPOCHS,
                    batch_size=BATCH_SIZE)
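
The early-stopping callback listed above boils down to a patience counter on validation loss. The plain-Python sketch below illustrates that logic; the function name, the patience value, and the loss values are made up, and in practice Keras's `EarlyStopping` and `ModelCheckpoint` callbacks would be passed to `model.fit` via `callbacks=[...]`.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops at the first epoch where validation loss has failed to
    improve by more than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: improvement stalls after epoch 2.
print(early_stopping_epoch([0.90, 0.70, 0.65, 0.66, 0.68, 0.71, 0.75], patience=3))
```

Returning the best epoch alongside the stop epoch mirrors why checkpoints matter: the weights worth keeping are usually from a few epochs before training halts.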

Training curves

Training/validation curves (placeholder)

5) Evaluation & Results

Evaluation focused on both metrics and error patterns (examples that break the model).

  • Accuracy, Precision, Recall, F1
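import numpy as np
from sklearn.metrics import classification_report
# Convert probability rows and one-hot rows to class indices before scoring
y_pred_gru = np.argmax(model_gru.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)
print(classification_report(y_true, y_pred_gru, target_names=le.classes_))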
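
For reference, the per-class numbers in such a report follow directly from counts of true positives, false positives, and false negatives. The sketch below computes them by hand; the helper name and the toy labels are illustrative only and are not project results.

```python
def per_class_metrics(y_true, y_pred, n_classes):
    """Return {class_id: (precision, recall, f1)} from integer label lists."""
    metrics = {}
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics

# Toy labels for illustration only (not project results).
toy_true = [0, 0, 1, 1, 2, 3]
toy_pred = [0, 1, 1, 1, 2, 0]
for c, (p, r, f) in per_class_metrics(toy_true, toy_pred, 4).items():
    print(c, round(p, 2), round(r, 2), round(f, 2))
```

Looking at these values per class, rather than at overall accuracy alone, helps surface a class that the model rarely predicts correctly.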
BiGRU classification report and confusion matrix (placeholder)
BiLSTM classification report and confusion matrix (placeholder)
BiSimpleRNN classification report and confusion matrix (placeholder)