Simple Text Multi Classification Task Using Keras BERT – also explains some theory about BERT along with the code. An important limitation to be aware of is that BERT's maximum sequence length is 512 tokens: inputs shorter than the maximum must be padded with [PAD] tokens, and longer sequences must be truncated. Keep this limit in mind when working with longer text segments.
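To make the padding/truncation point concrete, here is a minimal sketch using the Hugging Face `transformers` tokenizer (an assumption on my part; the linked article uses Keras BERT, but the handling of the 512-token limit is the same idea):

```python
# Minimal sketch of handling BERT's 512-token limit.
# Assumes the Hugging Face `transformers` library, not the Keras BERT
# package from the linked article.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

texts = ["a short sentence", "a much longer document that may exceed the limit"]

# padding="max_length" appends [PAD] tokens up to max_length;
# truncation=True cuts sequences longer than max_length.
encoded = tokenizer(
    texts,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="tf",
)
print(encoded["input_ids"].shape)  # (2, 512)
```

Every example in the batch comes out at exactly 512 tokens, which is what the fixed positional embeddings of BERT require.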
Transformer model for language understanding – Colab code. Implements the transformer in TensorFlow, following the original paper. I tried this extensively; I ran into problems with masking, and the model also seems to be constrained to sequences of 512 tokens.
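Since masking was the sticking point, here is a short sketch of the two masks that tutorial-style TensorFlow transformers typically build, assuming token id 0 is the pad token (this mirrors the tutorial's approach but is my own reconstruction, not the linked notebook's exact code):

```python
import tensorflow as tf

def create_padding_mask(seq):
    # 1.0 where the token is padding (id 0), 0.0 elsewhere.
    # Extra axes broadcast over the attention heads and query positions.
    mask = tf.cast(tf.math.equal(seq, 0), tf.float32)
    return mask[:, tf.newaxis, tf.newaxis, :]  # (batch, 1, 1, seq_len)

def create_look_ahead_mask(size):
    # Upper-triangular mask so position i cannot attend to positions > i,
    # used in the decoder's self-attention.
    return 1 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)

# Example: mask out the [PAD] positions of a toy batch.
print(create_padding_mask(tf.constant([[7, 6, 0, 0]])))
print(create_look_ahead_mask(4))
```

Getting the broadcast shapes wrong here is a common source of the masking problems mentioned above, since the mask must align with the `(batch, heads, seq_len, seq_len)` attention logits.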
Note (ImageNet dataset fact): the dataset spans 1,000 object classes and contains 1,281,167 training images, 50,000 validation images, and 100,000 test images.