Week 9: Application to Real Data

Topics

This week’s assignments will guide you through the following topics:

  • Computing workflow of CMS

  • Issues related to real data

Reading

Please read the following:

Tasks

Propose a project for Quarter 2. Possible extensions include:

  • Explore other model architectures for the same H(bb) identification task, such as transformers, tensor networks, or equivariant neural networks.

  • Extend the model to multiclass classification, separating every class present in the dataset: each QCD quark flavor, gluons, and H(bb).

  • Try an unsupervised learning approach, such as (variational) autoencoders, for anomaly detection.

  • Explore model compression, knowledge distillation, quantization, or other techniques to reduce the model size or complexity. Can the model be made more efficient?

  • Perform a regression task, such as correcting the energy or mass of the particle jet based on generator-level “truth” information.

  • Apply concepts you learned to a new dataset, such as the TrackML Particle Tracking Challenge (https://www.kaggle.com/c/trackml-particle-identification/overview).

  • Apply your algorithm to real data.

  • Apply concepts you learned to a slightly different jet tagging problem for VBS production of 2H(bb) 2H(WW), which may be used in a real CMS analysis!

  • Explore explainable AI for GNNs using layer-wise relevance propagation (LRP) or GNNExplainer [25].
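
As a starting point for the quantization idea in the list above, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. It is an illustration only and not part of the course code: the function names quantize_int8 and dequantize and the example weights are hypothetical, and a real project would more likely rely on the quantization tools of its deep learning framework.

```python
def quantize_int8(weights):
    """Map a list of floats to integer codes in [-127, 127] plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude onto 127; use 1.0 if all weights are zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights standing in for one layer of a trained model.
weights = [0.50, -1.27, 0.03, 0.0, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is bounded by half the quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Storing the integer codes and a single scale per layer takes roughly a quarter of the memory of 32-bit floats. A project along these lines could then measure how much the tagging performance degrades as the number of bits is reduced.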

Weekly Questions

Answer the following questions:

  • What are some differences between the CMS event data model and other processing frameworks?