Perform initial experiments with the contextual log line embeddings.
Our current embedding is based on averaging per-token fastText embeddings. Contextual embeddings are expected to improve performance on the downstream task, as they have for NLP tasks in general.
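For reference, a minimal sketch of the current baseline, averaging per-token fastText vectors into a fixed-size line embedding (the model path "logs_fasttext.bin" and whitespace tokenization are illustrative assumptions):

```python
import numpy as np
import fasttext

# Hypothetical fastText model trained on the log corpus.
ft_model = fasttext.load_model("logs_fasttext.bin")

def embed_log_line(line: str) -> np.ndarray:
    """Average the fastText vectors of all tokens in a log line."""
    tokens = line.strip().split()
    if not tokens:
        return np.zeros(ft_model.get_dimension())
    return np.mean([ft_model.get_word_vector(t) for t in tokens], axis=0)

print(embed_log_line("Connection to node 10 terminated unexpectedly").shape)
```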
- start with pre-trained BERT-like Transformer models (https://huggingface.co/, https://www.sbert.net/, https://simpletransformers.ai/; see the sentence-transformers sketch after this list), then:
- continue with unsupervised pretraining using objectives such as masked language modeling (MLM) or next sentence prediction (NSP); an MLM sketch is given after this list
- fine-tune on labeled log data
- analyze the embeddings (clustering, t-SNE visualizations, ...); see the analysis sketch after this list
- add to the LAD benchmark suite and compare with other methods
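A possible starting point for the first step, using sentence-transformers (https://www.sbert.net/) with a generic pre-trained checkpoint; the model name and example log lines are assumptions, not log-specific choices:

```python
from sentence_transformers import SentenceTransformer

# Generic pre-trained checkpoint; a log-specific model would replace this.
model = SentenceTransformer("all-MiniLM-L6-v2")

log_lines = [
    "Connection to node 10 terminated unexpectedly",
    "User admin logged in from 192.168.0.5",
]
embeddings = model.encode(log_lines)  # shape: (n_lines, embedding_dim)
print(embeddings.shape)
```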
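For the unsupervised pretraining step, a sketch of continued MLM pretraining on raw log lines with the Hugging Face Trainer; the checkpoint, corpus, and hyperparameters are placeholders:

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

raw_lines = ["..."]  # replace with the unlabeled log corpus

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Tokenize the log lines; short max_length since log lines are typically short.
dataset = Dataset.from_dict({"text": raw_lines}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=64),
    batched=True, remove_columns=["text"])

# Collator that randomly masks tokens for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir="mlm-logs", num_train_epochs=1,
                         per_device_train_batch_size=32)

Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```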
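And a sketch of the analysis step: cluster the line embeddings and project them to 2D with t-SNE for visual inspection (scikit-learn and matplotlib; the embeddings file path and cluster count are arbitrary assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Hypothetical precomputed (n_lines, dim) array of log line embeddings.
embeddings = np.load("log_line_embeddings.npy")

cluster_ids = KMeans(n_clusters=5, n_init=10).fit_predict(embeddings)
points_2d = TSNE(n_components=2, perplexity=30).fit_transform(embeddings)

plt.scatter(points_2d[:, 0], points_2d[:, 1], c=cluster_ids, s=5)
plt.title("t-SNE of log line embeddings")
plt.savefig("tsne_log_embeddings.png")
```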