Illustration of the strategies for defining signal regions with different traditional approaches (left picture) and with the anomaly score proposed and examined in this article (right picture). The same histogram is used to emphasize the similarity of the strategies; however, we do not expect the strategies to yield exactly the same results.
Transverse momentum $p_\mathrm{T}$ (left) and energy $E$ (right) in GeV of the jets for all backgrounds.
Pseudorapidity $\eta$ (left) and azimuthal angle $\phi$ (right) of the jets for all backgrounds.
Transverse momentum $p_\mathrm{T}$ (left) and energy $E$ (right) in GeV of the leptons ($e^+$, $e^-$, $\mu^+$, $\mu^-$) for all backgrounds.
Pseudorapidity $\eta$ (left) and azimuthal angle $\phi$ (right) of the leptons ($e^+$, $e^-$, $\mu^+$, $\mu^-$) for all backgrounds.
Number of jets (left) and leptons (right).
Missing transverse energy $E_\mathrm{T}^{\rm miss}$ in GeV and azimuthal angle $\phi_{E_\mathrm{T}^{\rm miss}}$ for all backgrounds.
The scalar sum of the jet transverse momenta $H_\mathrm{T}$ in GeV (see Eq.~\eqref{eq:defHT}) for all backgrounds, with $H_\mathrm{T} > 600$~GeV imposed.
\emph{Left:} A narrow Gaussian anomaly centered around $(2,2)$ (in red) is added to an exponentially distributed background (in blue). \emph{Right:} Each point of the dataset is assigned a probability of belonging to the signal events (outliers), so that a counting analysis can be performed. In this case, higher probabilities are correctly assigned to the outliers.
Example of a box-and-whisker plot. The data points mark the AUC of the \textit{Combined-PROD-VAE\_beta1\_z21-Flow} method for each of the 34 channel-signal combinations. This method combines flow-based likelihoods with a variational autoencoder whose loss function focuses only on the KL divergence of the latent space (see Sec.~\ref{sec:DeepSVDDSpline} for more details). A box is drawn spanning the inner half of the data (Q$n$ denotes the $n$th quartile), with a line through the box at the median. The whiskers extend to the extremal points, except for points that lie further from the box than 1.5 times the length of the box; those points are denoted by circles.
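The box-and-whisker construction described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figure: the function name `box_whisker_stats` and the sample scores are hypothetical, and the quartiles use NumPy's default linear interpolation, which may differ slightly from the plotting library used for the paper.

```python
import numpy as np


def box_whisker_stats(values):
    """Summarize data with the 1.5 x box-length whisker rule (illustrative sketch)."""
    data = np.sort(np.asarray(values, dtype=float))
    # Q1, median (Q2), and Q3; the box spans Q1..Q3, the inner half of the data.
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    box_length = q3 - q1
    # Points beyond these fences are further than 1.5 box lengths from the box.
    lower_fence = q1 - 1.5 * box_length
    upper_fence = q3 + 1.5 * box_length
    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    outliers = data[(data < lower_fence) | (data > upper_fence)]
    return {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        # Whiskers end at the most extreme points that are still inside the fences.
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
        # These are the points that would be drawn as circles.
        "outliers": outliers.tolist(),
    }


# Hypothetical scores: nine regular values and one far-away point.
stats = box_whisker_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

With these made-up values, the point at 100 lies beyond the upper fence, so the upper whisker stops at 9 and 100 is reported as an outlier.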
Box plots for each of the physics signals in the hackathon dataset. These summarize the span of results for the many anomaly detection models trained on background-only samples. Channel 2a has the tightest pre-selection cuts and therefore less data, which makes the signals look less anomalous.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that have the top score most often.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that appear in the top five scores most often.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that have the highest average rankings.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that have the highest mean scores for each of the figures of merit.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that have the highest median scores for each of the figures of merit.
Box plots summarizing the anomaly detection techniques applied to all of the new physics signals. The colors denote the techniques that have the highest minimum scores for each of the figures of merit. No technique has $\epsilon_S$ above 0 for all physics signals for $\epsilon_B=10^{-4}$ and only one ALAD model has $\epsilon_S$ above 0 for all physics signals at $\epsilon_B=10^{-3}$.
Box plots for each of the physics signals in the hackathon dataset. These summarize the span of results for the many anomaly detection models trained on background-only samples. The significance improvement (SI) is defined as $\epsilon_S/\sqrt{\epsilon_B}$. The maximum SI over the three working points ($\epsilon_B = 10^{-2}$, $10^{-3}$, and $10^{-4}$) is used as the metric for each technique.
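As a rough illustration of this metric, the snippet below evaluates $\epsilon_S/\sqrt{\epsilon_B}$ at each working point and keeps the maximum. The function names and the signal efficiencies are hypothetical and chosen only for the example; they are not results from the paper.

```python
import math


def significance_improvement(eps_s, eps_b):
    """SI = eps_S / sqrt(eps_B) for one working point."""
    return eps_s / math.sqrt(eps_b)


def max_significance_improvement(eps_s_by_eps_b):
    """Maximum SI over a mapping {eps_B: eps_S} of working points."""
    return max(
        significance_improvement(eps_s, eps_b)
        for eps_b, eps_s in eps_s_by_eps_b.items()
    )


# Hypothetical signal efficiencies at the three background efficiencies.
example = {1e-2: 0.5, 1e-3: 0.1, 1e-4: 0.02}
best_si = max_significance_improvement(example)
```

For these made-up efficiencies, the working point $\epsilon_B = 10^{-2}$ gives the largest value, since $0.5/\sqrt{10^{-2}} = 5$.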
The minimum, median, and maximum best total improvements for each technique across the physics models. The TI is defined as the maximum signal improvement for a physics model across all signal regions.
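One way to read this definition is sketched below: for each physics model, take the largest improvement over its signal regions, then summarize those values across models. The names `total_improvement` and `summarize_ti`, the model and region labels, and all numbers are hypothetical placeholders rather than values from the paper.

```python
from statistics import median


def total_improvement(improvement_by_region):
    """TI for one physics model: the maximum improvement across its signal regions."""
    return max(improvement_by_region.values())


def summarize_ti(improvement_by_model):
    """Return (minimum, median, maximum) TI across physics models for one technique."""
    ti_values = [
        total_improvement(regions) for regions in improvement_by_model.values()
    ]
    return min(ti_values), median(ti_values), max(ti_values)


# Hypothetical improvements per physics model and signal region.
example = {
    "model_a": {"region_1": 1.0, "region_2": 3.0},
    "model_b": {"region_1": 0.5, "region_2": 0.2},
    "model_c": {"region_1": 2.0, "region_2": 1.5},
}
ti_min, ti_median, ti_max = summarize_ti(example)
```

Here the per-model TI values are 3.0, 0.5, and 2.0, so the summary is a minimum of 0.5, a median of 2.0, and a maximum of 3.0.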
The minimum, median, and maximum best total improvements for each technique across the physics models.
Box plots summarizing the anomaly detection techniques applied to the secret dataset. The colors denote the techniques that have the highest average rank (top), median score (middle) and minimum score (bottom) for each of the figures of merit described in Sec.~\ref{sec:merit}.
The minimum, median, and maximum best total improvements for each technique applied to each of the signals in the secret dataset.
The minimum, median, and maximum best total improvements for each technique applied to each of the signals in the Hackathon (top) and Secret (middle) datasets.
The median TI scores for the Hackathon and Secret datasets along the $x$- and $y$-axis, respectively. The points marked with color were chosen as having median TI $>2$ on the Hackathon data, and the grey points are included as reference models.