,-----.  ,---.   ,----.   ,--.   ,--.                            ,--.             ,--.        
'  .--./ /  O  \ '  .-./   |   `.'   | ,---. ,--,--,      ,---. ,-'  '-.,--.,--. ,-|  | ,---.  
|  |    |  .-.  ||  | .---.|  |'.'|  || .-. ||      \    | .-. :'-.  .-'|  ||  |' .-. || .-. : 
'  '--'\|  | |  |'  '--'  ||  |   |  |' '-' '|  ||  |    \   --.  |  |  '  ''  '\ `-' |\   --. 
 `-----'`--' `--' `------' `--'   `--' `---' `--''--'     `----'  `--'   `----'  `---'  `----'                                                                                             


The CAGMon etude is a study version of CAGMon that evaluates the dependence between the primary and auxiliary channels.

Project goal

The goal of this project is to find a systematic way of identifying abnormal glitches in gravitational-wave data using various methods of correlation analysis. Collaborations such as LIGO, Virgo, and KAGRA usually find glitches in the auxiliary channels of the detector with conventional tools such as Kleine-Welle, Omicron, and Ordered Veto Lists. However, other approaches may make it possible to find and monitor glitches in (quasi-)real time and to point out which channel is responsible for a given glitch. In this project, we study whether it is possible to apply three different correlation methods (the maximal information coefficient, Pearson's correlation coefficient, and Kendall's tau coefficient) to the gravitational-wave data from the KAGRA detector.
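As a rough illustration of the idea of pointing out the responsible channel, the sketch below splits the data into strides, computes the absolute Pearson coefficient between the primary channel and each auxiliary channel per stride, and ranks the channels by their mean coefficient. It is not taken from the CAGMon scripts: the function name rank_channels, its parameters, and the assumption that all channels share one sample rate are hypothetical choices made for this example.

```python
import numpy as np

def rank_channels(primary, aux, sample_rate, stride):
    """Rank auxiliary channels by their mean |PCC| with the primary channel.

    Hypothetical helper: `aux` maps channel names to arrays that are assumed
    to have the same sample rate and length as `primary`.
    """
    primary = np.asarray(primary, dtype=float)
    n = int(sample_rate * stride)        # samples per stride
    n_strides = len(primary) // n        # incomplete trailing stride is dropped
    if n_strides == 0:
        raise ValueError("data are shorter than one stride")

    scores = {}
    for name, series in aux.items():
        series = np.asarray(series, dtype=float)
        coeffs = []
        for k in range(n_strides):
            seg = slice(k * n, (k + 1) * n)
            r = np.corrcoef(primary[seg], series[seg])[0, 1]
            coeffs.append(abs(r))
        scores[name] = float(np.mean(coeffs))

    # Highest mean coefficient first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The per-stride coefficients collected in the inner loop are the same quantities that a coefficient trend plot would show over time; only their mean is kept here to keep the example short.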


Methods and Frameworks

Maximal Information Coefficient (MIC)

The maximal information coefficient (MIC) of a set D of two-variable data with sample size n and grid size less than B(n) is given by

\[ MIC(D)=\underset{xy<B(n)}{\max}{\left\{ \frac{I^{*}(D,x,y)}{\log \min \left\{x,y \right\}} \right \}} \],

where \[ I^{*}(D,x,y) \] is the maximum mutual information achieved by any x-by-y grid on D, and \[\omega(1)<B(n)\le O(n^{1-\epsilon}) \] for some \[ 0<\epsilon<1 \].
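The following sketch shows how the normalization and the grid-size limit in the definition work in practice. It is a simplified approximation, not the full MIC algorithm: instead of searching for the optimal partition that defines I*, it only tries equal-frequency grids, so it gives a lower bound on MIC. The function names and the choice B(n) = n^0.6 (the alpha parameter) are assumptions made for this example.

```python
import numpy as np

def grid_mutual_information(x, y, nx, ny):
    """Mutual information (in nats) of x and y on an nx-by-ny equal-frequency grid."""
    n = len(x)
    # Rank each sample, then map ranks to bin indices 0..nx-1 and 0..ny-1
    xb = (np.argsort(np.argsort(x)) * nx) // n
    yb = (np.argsort(np.argsort(y)) * ny) // n
    counts = np.zeros((nx, ny))
    np.add.at(counts, (xb, yb), 1)
    p = counts / n                        # joint distribution on the grid
    px = p.sum(axis=1, keepdims=True)     # marginal of x, shape (nx, 1)
    py = p.sum(axis=0, keepdims=True)     # marginal of y, shape (1, ny)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (px @ py)[mask])))

def approx_mic(x, y, alpha=0.6):
    """Lower-bound approximation of MIC using equal-frequency grids only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b = len(x) ** alpha                   # assumed grid-size limit B(n) = n**alpha
    best = 0.0
    for nx in range(2, int(b // 2) + 1):
        for ny in range(2, int(b // nx) + 1):
            if nx * ny >= b:              # enforce xy < B(n)
                continue
            mi = grid_mutual_information(x, y, nx, ny)
            best = max(best, mi / np.log(min(nx, ny)))
    return best
```

Dividing by log min{x, y} bounds each grid's score by 1, which is what makes scores from grids of different sizes comparable before taking the maximum.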

Pearson's Correlation Coefficient (PCC)

Pearson's correlation coefficient (PCC) is a statistic that measures the strength of the linear relationship between two variables X and Y by \[ R=\frac{\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \overline{X})^2 \sum_{i=1}^{n} (Y_i - \overline{Y})^2}} \],

where \[ \overline{X} \] and \[ \overline{Y} \] are the means of X and Y, respectively.
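A minimal sketch of this coefficient, written directly from the sums of deviations from the means (with squared deviations under the square root), is given below. The function name pearson_r is chosen for this example only; in practice numpy.corrcoef gives the same value.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient computed from deviations from the means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                     # X_i - mean(X)
    dy = y - y.mean()                     # Y_i - mean(Y)
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))
```

The result lies between -1 and 1 and is undefined when either variable is constant, because the denominator is then zero.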

Kendall's tau Coefficient

Kendall's tau, computed from a random sample of n observations of two variables, measures the strength of the relationship between two ordinal-level variables by

\[ \tau =\frac{c-d}{\binom{n}{2}} \],

where c is the number of concordant pairs, and d is the number of discordant pairs.
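The pair counting can be sketched as follows. This is a direct, quadratic-time illustration of the definition with n(n-1)/2 pairs in the denominator (the tau-a variant); tied pairs count as neither concordant nor discordant. The function name kendall_tau is chosen for this example only.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / (number of pairs)."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1               # both variables order the pair the same way
        elif s < 0:
            discordant += 1               # the variables order the pair oppositely
    return (concordant - discordant) / (n * (n - 1) / 2)
```

For long time series a library routine is preferable, since this loop visits every pair of samples.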

Flow chart

Code development



Code versions

  1. CAGMon Etude Alpha
    • for the basic test and evaluation of the LASSO regression method developed by LIGO
    • reproduced original CAGMon methods and idea
  2. CAGMon Etude Beta
    • added coefficient trend plots with LASSO beta, coherence, MIC, PCC, and Kendall's tau
  3. CAGMon Etude Delta
    • fixed a critical problem that consumed an enormous amount of memory when the matplotlib module was used
  4. CAGMon Etude Eta
    • fixed minor issues
    • added the range limitation of stride
  5. CAGMon Etude Flat
    • fixed minor issues and optimized scripts
    • added the script of HTML summary page
    • added coefficient distribution plots
  6. CAGMon Etude Octave (current version)
    • removed the processes that make time-series and scatter plots, which required a tremendous amount of memory without providing useful information
    • adjusted the HTML code
    • fixed minor issues and optimized scripts
    • added an analysis option to choose whether the algorithm runs on the active segments only
    • improved script efficiency
    • added the process to make scatter and OmegaScan plots in detail boxes of the summary page

  7. CAGMon Etude Rhapsody (development version)
    • Require auto-selection of CAGMon parameters
    • Require pre-estimation process to check intrinsic sample rate of each channel
    • Improve script efficiency and completeness

Series of scripts

User guide

Needs of code development

Empirical study (No free lunch)

  1. Apply to glitch data on KAGRA during O3GK
  2. Apply to the glitch data of GravitySpy on LIGO

    • Data
      • TBD
  3. Apply to the mid-range data
    • Data
      • TBD
  4. Apply to the long-range data
    • Data
      • TBD

Exemplary results

1. Earthquake effects during O3GK (with CAGMon Etude Flat)

2. With iKAGRA hardware injection data (with CAGMon Etude Flat)

3. Skim through some obs-segments of O3GK (with CAGMon Etude Octave)

4. Glitch analysis during O3GK




Presentation materials



Science.1518; Detecting Novel Associations in Large Data Sets