Diff for "PJJung/CAGMonEtude"

Differences between revisions 19 and 32 (spanning 13 versions)

 ,-----.  ,---.   ,----.   ,--.   ,--.                            ,--.             ,--.        
'  .--./ /  O  \ '  .-./   |   `.'   | ,---. ,--,--,      ,---. ,-'  '-.,--.,--. ,-|  | ,---.  
|  |    |  .-.  ||  | .---.|  |'.'|  || .-. ||      \    | .-. :'-.  .-'|  ||  |' .-. || .-. : 
'  '--'\|  | |  |'  '--'  ||  |   |  |' '-' '|  ||  |    \   --.  |  |  '  ''  '\ `-' |\   --. 
 `-----'`--' `--' `------' `--'   `--' `---' `--''--'     `----'  `--'   `----'  `---'  `----'

Description

The CAGMon etude is a study version of CAGMon that evaluates the dependence between the primary and auxiliary channels.

Project goal

The goal of this project is to find a systematic way of identifying abnormal glitches in gravitational-wave data using various methods of correlation analysis. Collaborations such as LIGO, Virgo, and KAGRA usually find glitches in the auxiliary channels of the detector with conventional tools: Kleine-Welle, Omicron, Ordered Veto Lists, etc. However, other approaches may be able to find and monitor glitches in (quasi-)real time and also point out which channel is responsible for a given glitch. In this project, we study whether it is possible to apply three different correlation methods (maximal information coefficient, Pearson's correlation coefficient, and Kendall's tau coefficient) to the gravitational-wave data from the KAGRA detector.

Participants

John J. Oh (NIMS)
Young-Min Kim (UNIST)
Pil-Jong Jung (NIMS)

Methods and Frameworks

Maximal Information Coefficient (MIC)

The maximal information coefficient (MIC) of a set D of two-variable data with sample size n and grid size less than B(n) is given by

\[ MIC(D)=\underset{xy<B(n)}{\max}{\left\{ \frac{I^{*}(D,x,y)}{\log \min \left\{x,y \right\}} \right \}} \],

where \[ I^{*}(D,x,y) \] is the maximum mutual information over all x-by-y grids on D, and \[\omega(1)<B(n)\le O(n^{1-\epsilon}) \] for some \[ 0<\epsilon<1 \].
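To make the definition more concrete, the following is a minimal sketch of a simplified MIC estimate. It is not the full MIC search and not necessarily what the CAGMon scripts do: instead of optimizing the grid placement to obtain I*, it only tries equal-frequency (rank-based) grids, so it gives a lower bound on MIC. The function names and the choice B(n) = n^0.6 (a commonly used default) are assumptions made for illustration.

```python
import math
from collections import Counter


def _equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold about equally many points.

    Ties are split by input order, so this sketch assumes few repeated values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def _mutual_information(bx, by):
    """Mutual information (in nats) of two equal-length sequences of bin indices."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def mic_equipartition(x, y, alpha=0.6):
    """Lower bound on MIC using only equal-frequency grids with x*y < n**alpha."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    b = len(x) ** alpha  # assumed grid-size limit B(n)
    best = 0.0
    for nx in range(2, int(b // 2) + 1):
        bx = _equal_frequency_bins(x, nx)
        for ny in range(2, int(b // nx) + 1):
            if nx * ny >= b:
                continue  # enforce the strict inequality xy < B(n)
            by = _equal_frequency_bins(y, ny)
            score = _mutual_information(bx, by) / math.log(min(nx, ny))
            best = max(best, score)
    return best
```

For a noiseless monotonic relationship the 2-by-2 grid already reaches the maximum normalized score, which is why such data give a value of essentially 1.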

Pearson's Correlation Coefficient (PCC)

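Pearson's correlation coefficient (PCC) is a statistic that measures the strength of the linear relationship between two variables X and Y, each with n samples, by

\[ R=\frac{\sum_{i=1}^{n}(X_i-\overline{X})(Y_i-\overline{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\overline{X})^{2}}\sqrt{\sum_{i=1}^{n}(Y_i-\overline{Y})^{2}}} \],

where \[ \overline{X} \] and \[ \overline{Y} \] are the means of X and Y, respectively.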
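As an illustration, the standard sample Pearson coefficient (the sum of products of deviations from the means, divided by the square root of the product of the summed squared deviations) can be sketched in plain Python. The function name `pearson_r` is hypothetical and is not taken from the CAGMon scripts.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of every sample from its mean
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    var_x_sum = sum(a * a for a in dx)
    var_y_sum = sum(b * b for b in dy)
    if var_x_sum == 0 or var_y_sum == 0:
        raise ValueError("R is undefined when one input is constant")
    return covariance_sum / math.sqrt(var_x_sum * var_y_sum)
```

The result lies between -1 and 1; a perfectly linear increasing relationship gives 1 and a perfectly linear decreasing one gives -1.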

Kendall's tau Coefficient

For a random sample of n observations of two variables, Kendall's tau measures the strength of the relationship between two ordinal-level variables by

\[ \tau =\frac{c-d}{\frac{1}{2}n(n-1)} \],

where c is the number of concordant pairs and d is the number of discordant pairs. In the absence of ties, the denominator equals c+d.
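A minimal sketch of this coefficient counts concordant and discordant pairs directly. It assumes the tau-a variant, in which the difference c - d is divided by the total number of pairs n(n-1)/2 and tied pairs count as neither concordant nor discordant. The function name `kendall_tau` is hypothetical, and the loop is quadratic in n, so it is meant for illustration rather than for long strides.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / (n * (n - 1) / 2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order this pair the same way
        elif sign < 0:
            discordant += 1  # the variables order this pair oppositely
        # sign == 0 is a tie and contributes to neither count
    return (concordant - discordant) / (n * (n - 1) / 2)
```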

Flow chart

Code development

GitHub

TBA

Code versions

CAGMon Etude Alpha
- for the basic test and evaluation of the LASSO regression method developed by LIGO
- reproduced original CAGMon methods and idea
CAGMon Etude Beta
- added coefficient trend plots with LASSO beta, coherence, MIC, PCC, and Kendall's tau
CAGMon Etude Delta
- fixed a critical problem that consumed an enormous amount of memory when the matplotlib module was used
CAGMon Etude Eta
- fixed minor issues
- added the range limitation of stride
CAGMon Etude Flat (current version)
- fixed minor issues and optimized scripts
- added the script of HTML summary page
- added coefficient distribution plots
CAGMon Etude Octave (development version)
- removed the processes that make time-series and scatter plots, which required tremendous memory but provided little useful information
- adjusted the HTML code
- fixed minor issues and optimized scripts

Series of scripts

Agrement.py
- the script that gathers the functions the model requires
Melody.py
- the script to calculate each coefficient and to save the trend data as CSV
Conchord.py
- the script to make plots
Echo.py
- the script to save the result as an HTML web page
CAGMonEtude{Version}.py
- the script to run each script

User guide

How to use CAGMon

Needs of code development

Fundamental criterion or guideline for choosing the stride and its data size
Daily running on KAGRA
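To make the stride and data-size question concrete, the sketch below splits a primary and an auxiliary time series into consecutive strides and evaluates one coefficient per stride, which is the kind of data a coefficient trend is built from. It assumes that the data size is the number of samples per stride (sample rate times stride) and that both channels share the same sample rate. The names `split_into_strides` and `coefficient_trend` are hypothetical and are not taken from the CAGMon scripts.

```python
def split_into_strides(samples, sample_rate, stride):
    """Yield consecutive, non-overlapping segments that each span `stride` seconds."""
    size = int(round(sample_rate * stride))  # assumed data size per stride
    if size < 2:
        raise ValueError("each stride must contain at least 2 samples")
    # A trailing partial segment is dropped
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]


def coefficient_trend(primary, auxiliary, sample_rate, stride, coefficient):
    """Evaluate coefficient(primary_segment, auxiliary_segment) once per stride."""
    segments = zip(
        split_into_strides(primary, sample_rate, stride),
        split_into_strides(auxiliary, sample_rate, stride),
    )
    return [coefficient(p, a) for p, a in segments]


# Usage: 20 seconds of data at 50 Hz with a 5-second stride.
primary = [float(i % 7) for i in range(1000)]
auxiliary = [2.0 * v + 1.0 for v in primary]
# Any two-argument function can be passed; here it just reports the data size.
sizes = coefficient_trend(primary, auxiliary, 50, 5, lambda p, a: len(p))
# 4 strides of 250 samples each
```

Under this assumption a longer stride or a higher sample rate gives each coefficient more samples to work with, at the cost of time resolution and memory.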

Exemplary results

1. Earthquake effects during O3GK

Datetime: 19 April 2020 20:39 UTC
Purpose
- To test the CAGMon algorithm on a remarkable event
- To figure out the cause of lock-loss in KAGRA
Results
- stride 5 seconds Summary page
- stride 20 seconds Summary page
- stride 30 seconds Summary page

2. Skim through all obs-segments of O3GK

Purpose
- To test the calculation time and required resources over all observation segments during O3GK
- To figure out trigger events or abnormal behaviors
Results
- April 7, 1270287158 - 1270328032
- April 8, 1270339218 - 1270425618
  - Full Data is unavailable in the KISTI cluster
- April 9, 1270425618 - 1270510167
- April 10, 1270513160 - 1270596544
- April 11, 1270598418 - 1270683904
- April 12, 1270684818 - 1270762046
- April 14, 1270909686 - 1270937768
- April 15, 1270945288 - 1271017582
  - Event: GRB200415 (08:48:05 UTC)
  - Full Data is unavailable in the KISTI cluster
- April 16, 1271030433 - 1271112809
- April 17, 1271119833 - 1271186507
- April 18, 1271227441 - 1271288128
- April 19, 1271289618 - 1271364033
- April 20, 1271377409 - 1271460608
  - Event: GRB200420A (2:32:58 UTC)
  - Full Data is unavailable in the KISTI cluster
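The segment boundaries above are GPS times. As a rough aid for reading them, the sketch below converts GPS seconds to UTC. It assumes a fixed GPS-UTC offset of 18 leap seconds, which holds for dates from 2017-01-01 onward (including O3GK) but not for earlier data such as iKAGRA in 2016. The function name `gps_to_utc` is hypothetical.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)  # GPS time zero, 1980-01-06 00:00:00 UTC
GPS_MINUS_UTC = 18                # leap-second offset, assumed constant (valid from 2017-01-01)


def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a naive datetime in UTC (dates from 2017 onward only)."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_MINUS_UTC)


# Start of the April 8 segment listed above
april_8_start = gps_to_utc(1270339218)  # 2020-04-08 00:00:00 UTC
```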

3. With iKAGRA hardware injection data

Event
- Phenomenon: the strain channel and seismometer channels in iKAGRA had a high correlation during the hardware injection test
- Cause: still unknown
- Hypothesis: the glitches behave in much the same way as the vacuum rotary pump
- More detailed analysis: hveto brief Report for K1 and KGWG Face-to-Face Meeting
Purpose
- To verify whether this model senses injected signals and abnormal glitches
- To test noise resistance and data-size limitation
Results
- stride: 10 seconds, data size about 5000, over 12 minutes: summary page
- stride: 10 seconds, data size about 10000, over 12 minutes: summary page
- stride: 10 seconds, data size about 20000, over 12 minutes: summary page
- stride: 10 seconds, data size about 30000, over 12 minutes: summary page
- stride: 10 seconds, data size about 40000, over 12 minutes: summary page
- stride: 2 seconds, data size about 8000, over 12 minutes: summary page
- stride: 5 seconds, data size about 20000, over 12 minutes: summary page
- stride: 60 seconds, data size about 7500, over the whole iKAGRA data: summary page
- stride: 150 seconds, data size about 10000, over the whole iKAGRA data: summary page
- stride: 300 seconds, data size about 20000, over the whole iKAGRA data: summary page
- stride: 600 seconds, over the whole iKAGRA data: summary page

-  ⇤ ← Revision 19 as of 2021-01-22 11:47:10 → 
  Size: 5606
  Editor: PJJung
  Comment:
+   ← Revision 32 as of 2021-01-26 10:57:58 → ⇥
  Size: 9729
  Editor: PJJung
  Comment:
 Line 66:
-==== Code version ====
+==== Code versions ====
 Line 75:
-  * fixed memory issues
  * fixed minor bugs
+  * fixed minor issues
-Line 78:
+Line 77:
-. CAGMon Etude Flat (latest version)
+. CAGMon Etude Flat (current version)
-Line 82:
+Line 81:
-==== Structure of scripts ====
+. CAGMon Etude Octave (development version)
  * remove some processes that make Time-series and Scatter plots. Even though it required tremendous memory, this information is not useful
  * adjust HTML code
  * fixed minor issues and optimized scripts
 
==== Series of scripts ====
-Line 85:
+Line 88:
-  * the script gathered functions the medel required
+  * the script gathered functions the model required
-Line 89:
+Line 92:
-  * the script to make plots, such as coefficient trend, coefficient distribution trend, time-series, and scatter plots
+  * the script to make plots
-Line 92:
+Line 95:
- * CAGMonEtudeFlat.py
+ * CAGMonEtude{Version}.py
-Line 99:
+Line 102:
+ * Fundamental critarian or guideline of the stride and its data-size
-Line 110:
+Line 114:
-  * [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(5)/ | Summary page with stride 5 seconds]]
  * [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078/ | Summary page with stride 20 seconds]]
  * [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(30)/ | Summary page with stride 30 seconds]]
+  * stride 5 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(5)/ | Summary page]]
  * stride 20 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078/ | Summary page]]
  * stride 30 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(30)/ | Summary page]]
-Line 119:
+Line 123:
+  * April 7, 1270287158 - 1270328032
  * April 8, 1270339218 - 1270425618
   * Full Data is unavailable in the KISTI cluster
  * April 9, 1270425618 - 1270510167
  * April 10, 1270513160 - 1270596544
  * April 11, 1270598418 - 1270683904
  * April 12, 1270684818 - 1270762046
  * April 14, 1270909686 - 1270937768
  * April 15, 1270945288 - 1271017582
   * Event: GRB200415 (08:48:05 UTC)　
   * Full Data is unavailable in the KISTI cluster
  * April 16, 1271030433 - 1271112809
  * April 17, 1271119833 - 1271186507
  * April 18, 1271227441 - 1271288128
  * April 19, 1271289618 - 1271364033
  * April 20, 1271377409 - 1271460608
   * Event: GRB200420A (2:32:58 UTC)
   * Full Data is unavailable in the KISTI cluster
-Line 120:
+Line 142:
+. With iKAGRA hardware injection data
 * Event
  * Phenomenon: the strain channel and seismometer channels in iKAGRA had a high correlation during the hardware injection test
  * Cause: still unknown
  * Hypothesis: the glitches have relatively the same behavior as the vacuum rotary pump
  * More detail analysis: [[https://www.dropbox.com/s/950vjc807sgz24u/hveto%20brief%20Report%20for%20K1.pdf?dl=0 | hveto brief Report for K1]] and [[https://www.dropbox.com/s/hb7rx93an8yluiq/PilJong%2C%20KGWG%20Face-to-Face%20Meeting.pdf?dl=0 | KGWG Face-to-Face Meeting]]
 * Purpose
  * To verify whether this model senses injected signals and abnormal glitches 
  * To test noise resistance and data-size limitation
 *Results
  * stride: 10 seconds with about 5000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b5000%5d/ | summary page]]
  * stride: 10 seconds with about 10000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b10000%5d/ | summary page]]
  * stride: 10 seconds with about 20000 data size during 12 minutes [[ https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b20000%5d/| summary page]]
  * stride: 10 seconds with about 30000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b30000%5d/ | summary page]]
  * stride: 10 seconds with about 40000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b40000%5d/ | summary page]]
  * stride: 2 seconds with about 8000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b2s%5d/ | summary page]]
  * stride: 5 seconds with about 20000 data size during 12 minutes [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b5s%5d/ | summary page]]
  * stride: 60 seconds with about 7500 data during whole iKAGRA data  [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b60s%5d/ | summary page]]
  * stride: 150 seconds with about 10000 data during whole iKAGRA data  [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b150s%5d/ | summary page]]
  * stride: 300 seconds with about 20000 data during whole iKAGRA data  [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b300s%5d/ | summary page]]
  * stride: 600 seconds during whole iKAGRA data  [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b600s%5d/ | summary page]]