February – Transformers, Audio data augmentation, HuggingFace

Week 2/21/2022 – 2/27/2022 – transformer, BERT, attention

References

  1. BERT
    1. Text Extraction with BERT in Keras – it uses a transformer model from HuggingFace
    2. Text Classification with Transformer in Keras – it implements a transformer block as a Keras layer and then uses it for text classification (see the sketch after this list)
    3. Time Series classification with a Transformer model in Keras
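
A minimal sketch of a transformer block written as a custom Keras layer, loosely following the text-classification example above; the embed_dim, num_heads, ff_dim, and dropout rate values are placeholders, not taken from the example:

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    class TransformerBlock(layers.Layer):
        """One transformer encoder block: multi-head self-attention plus a
        position-wise feed-forward network, each wrapped in a residual
        connection and layer normalization."""

        def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
            super().__init__()
            self.att = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
            self.ffn = keras.Sequential(
                [layers.Dense(ff_dim, activation="relu"), layers.Dense(embed_dim)]
            )
            self.norm1 = layers.LayerNormalization(epsilon=1e-6)
            self.norm2 = layers.LayerNormalization(epsilon=1e-6)
            self.drop1 = layers.Dropout(rate)
            self.drop2 = layers.Dropout(rate)

        def call(self, inputs, training=False):
            # Self-attention: queries, keys, and values all come from `inputs`.
            attn = self.att(inputs, inputs)
            x = self.norm1(inputs + self.drop1(attn, training=training))
            # Position-wise feed-forward, then the second residual + norm.
            return self.norm2(x + self.drop2(self.ffn(x), training=training))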

Week 2/7 – 2/13/2022 – adding attention

  1. Tips and tricks for training
  2. Adding attention to models
    1. Adding an attention layer in Keras (see the sketch after this list)
    2. Adding attention in Keras – official documentation
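
A minimal sketch of the built-in keras.layers.Attention (dot-product, i.e. Luong-style) on top of an LSTM encoder for binary classification; the vocabulary size, sequence length, and layer widths are placeholder assumptions:

    from tensorflow import keras
    from tensorflow.keras import layers

    vocab_size, seq_len, embed_dim = 10_000, 100, 64  # placeholder sizes

    inputs = keras.Input(shape=(seq_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(inputs)
    # Keep per-timestep outputs so the attention layer has a sequence to score.
    seq = layers.LSTM(64, return_sequences=True)(x)
    # Dot-product (Luong-style) self-attention: the sequence attends to itself.
    context = layers.Attention()([seq, seq])
    pooled = layers.GlobalAveragePooling1D()(context)
    outputs = layers.Dense(1, activation="sigmoid")(pooled)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    model.summary()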

References

  1. Tips for training ML models
    1. The Model Performance Mismatch Problem (and what to do about it)
    2. How To Improve Deep Learning Performance
    3. A bunch of tips and tricks for training deep neural networks
  2. Adding an attention layer
    1. How to do attention over LSTM sequences with masking?
    2. https://www.youtube.com/watch?v=oaV_Fv5DwUM
    3. How to add Attention on top of a Recurrent Layer (Text Classification) #4962
    4. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention – paper, not yet read
    5. Keras attention mechanism
    6. A Comprehensive Guide to Attention Mechanism in Deep Learning for Everyone – a really good read, also includes coding examples to define an attention layer in Keras
    7. A Beginner’s Guide to Using Attention Layer in Neural Networks – another article that shows how to use Keras attention layer
    8. Keras attention layer – Dot-product attention layer, a.k.a. Luong-style attention
    9. What is the difference between Luong attention and Bahdanau attention? – StackOverflow
    10. The Luong Attention Mechanism
    11. Effective Approaches to Attention-based Neural Machine Translation – Luong attention original paper
    12. Craft your own Attention layer in 6 lines — Story of how the code evolved – not yet read, a detailed article though
    13. Practical PyTorch: Translation with a Sequence to Sequence Network and Attention
    14. Getting started with Attention for Classification
    15. Keras Self Attention
    16. Source code for attention layers implemented in Keras
  3. Transformers
    1. Attention Is All You Need – introduces the transformer architecture – original paper
    2. Time Series Classification with a Transformer Model in Keras – full example that uses a transformer
    3. The Transformer neural network architecture EXPLAINED. “Attention is all you need” (NLP) – YouTube video
    4. MultiHeadAttention layer – Keras documentation
  4. FNet
    1. Text Generation using FNet – complete Keras example
    2. FNet: Mixing Tokens with Fourier Transforms – original paper (see the sketch after this list)
    3. FNet explained – YouTube video
    4. Google's FNet code in TensorFlow
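
Per the original paper, FNet replaces self-attention with an unparameterized 2D Fourier transform over the sequence and hidden axes, keeping only the real part. A minimal Keras sketch of that mixing sublayer inside a standard residual/layer-norm block (embed_dim and ff_dim are placeholder names):

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    class FNetMixing(layers.Layer):
        """FNet token mixing: 2D FFT over the (sequence, hidden) axes,
        keeping only the real part. No learned weights, unlike attention."""

        def call(self, inputs):
            fft = tf.signal.fft2d(tf.cast(inputs, tf.complex64))
            return tf.math.real(fft)

    class FNetEncoderBlock(layers.Layer):
        """Fourier mixing + feed-forward, each with residual + layer norm."""

        def __init__(self, embed_dim, ff_dim):
            super().__init__()
            self.mixing = FNetMixing()
            self.ffn = keras.Sequential(
                [layers.Dense(ff_dim, activation="relu"), layers.Dense(embed_dim)]
            )
            self.norm1 = layers.LayerNormalization(epsilon=1e-6)
            self.norm2 = layers.LayerNormalization(epsilon=1e-6)

        def call(self, inputs):
            x = self.norm1(inputs + self.mixing(inputs))
            return self.norm2(x + self.ffn(x))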

Week 1/31 – 2/6/2022

  1. Masking in Keras LSTM: there are three ways to do masking in Keras; see the snippet after this list. The first two are from reference [2] below:
    1. Applying a mask in a Keras LSTM using the Embedding layer
    2. Masking using the Masking layer
    3. Passing a mask argument to mask certain timesteps – from reference [4]
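
A minimal, self-contained sketch of the three approaches; shapes, sizes, and values are placeholder assumptions:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers

    # (1) Masking via the Embedding layer: with mask_zero=True, token id 0
    # is treated as padding and those timesteps are skipped by the LSTM.
    tokens = np.array([[7, 3, 5, 0, 0]])            # 0 = padding
    emb = layers.Embedding(input_dim=100, output_dim=8, mask_zero=True)
    out1 = layers.LSTM(4)(emb(tokens))

    # (2) Masking via the Masking layer: timesteps whose features all equal
    # mask_value are masked out for downstream layers.
    feats = np.random.rand(1, 5, 8).astype("float32")
    feats[:, 3:, :] = 0.0                           # pad the last two timesteps
    out2 = layers.LSTM(4)(layers.Masking(mask_value=0.0)(feats))

    # (3) Passing a boolean mask argument directly to the layer call,
    # marking which timesteps are valid.
    mask = tf.constant([[True, True, True, False, False]])
    out3 = layers.LSTM(4)(feats, mask=mask)

    print(out1.shape, out2.shape, out3.shape)       # (1, 4) each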

References

  1. LSTM
    1. Keras LSTM with masking layer for variable-length inputs – how to use masking with LSTM – StackOverflow
    2. Masking and Padding with Keras – TensorFlow documentation
    3. Masking layer in Keras – Keras documentation
    4. How does masking work in RNN?
    5. Lambda layer in Keras
    6. What is masking in RNN – Quora