Differences between revisions 1 and 71 (spanning 70 versions)
Revision 1 as of 2021-01-20 13:22:49
Size: 1355
Editor: PJJung
Revision 71 as of 2021-02-09 16:13:34
Size: 18698
Editor: PJJung
= CAGMon etude =
{{{
 ,-----. ,---. ,----. ,--. ,--. ,--. ,--.
' .--./ / O \ ' .-./ | `.' | ,---. ,--,--, ,---. ,-' '-.,--.,--. ,-| | ,---.
| | | .-. || | .---.| |'.'| || .-. || \ | .-. :'-. .-'| || |' .-. || .-. :
' '--'\| | | |' '--' || | | |' '-' '| || | \ --. | | ' '' '\ `-' |\ --.
 `-----'`--' `--' `------' `--' `--' `---' `--''--' `----' `--' `----' `---' `----'
}}}


== Description ==
 * [[https://kgwg.nims.re.kr/cbcwiki/CAGMon | KGWG wiki link]]
 * [[http://gwwiki.icrr.u-tokyo.ac.jp/JGWwiki/CAGMon| KAGRA wiki link]]

== Project goal ==
The maximal information coefficient (MIC) of a set D of two-variable data with sample size n and grid size less than B(n) is given by

\[
MIC(D)=\underset{xy<B(n)}{\max}{\left\{ \frac{I^{*}(D,x,y)}{\log \min \left\{x,y \right\}} \right \}}
\],

where \[ I^{*}(D,x,y) \] is the maximum mutual information over all x-by-y grids applied to D, and \[\omega(1)<B(n)\le O(n^{1-\epsilon}) \] for some \[ 0<\epsilon<1 \].
Pearson's correlation coefficient (PCC) is a statistic that measures the strength of the linear relationship between two variables by
\[
R={{\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})} \over {\sqrt{\sum_{i=1}^{n} (X_i - \overline{X})^2 \sum_{i=1}^{n} (Y_i - \overline{Y})^2}}}
\],

where \[ \overline{X} \] and \[ \overline{Y} \] are the means of X and Y, respectively.

For a random sample of n observations from two variables, Kendall's tau measures the strength of the relationship between two ordinal-level variables by

\[
\tau =\frac{c-d}{{n \choose 2}}
\],

where c is the number of concordant pairs and d is the number of discordant pairs.

==== Flow chart ====



== Code development ==

==== GitHub ====
[[TBA]]

==== Code versions ====
 1. CAGMon Etude Alpha
  * for the basic test and evaluation of the LASSO regression method developed by LIGO
  * reproduced original CAGMon methods and idea
 2. CAGMon Etude Beta
  * added coefficient trend plots with LASSO beta, coherence, MIC, PCC, and Kendall's tau
 3. CAGMon Etude Delta
  * fixed a critical problem that consumed an enormous amount of memory when the matplotlib module was used
 4. CAGMon Etude Eta
  * fixed minor issues
  * added the range limitation of stride
 5. CAGMon Etude Flat
  * fixed minor issues and optimized scripts
  * added the script of HTML summary page
  * added coefficient distribution plots
 6. CAGMon Etude Octave (current version)
  * removed the processes that make time-series and scatter plots, which required a great deal of memory while providing little useful information
  * adjusted HTML code
  * fixed minor issues and optimized scripts
  * added an analysis option to run the algorithm on active segments only
  * improved script efficiency
  * added the process to make scatter and OmegaScan plots in detail boxes of the summary page
 7. CAGMon Etude Rhapsody (development version)
  * Require auto-selection of CAGMon parameters
  * Require pre-estimation process to check intrinsic sample rate of each channel
  * Improve script efficiency and completeness


==== Series of scripts ====
 * Agrement.py
  * the script that gathers the functions the model requires
 * Melody.py
  * the script that calculates each coefficient and saves the trend data as CSV
 * Conchord.py
  * the script that makes plots
 * Echo.py
  * the script that saves the results as an HTML web page
 * CAGMonEtude{Version}.py
  * the script that runs the other scripts

==== User guide ====
 * [[/userguide | How to use CAGMon]]

==== Needs of code development ====
 * Fundamental criteria or guidelines for choosing CAGMon parameters, such as the stride, the sample rate, and the data size
 * Daily running on KAGRA


== Empirical study (No free lunch) ==
 1. Apply to glitch data on KAGRA during O3GK
  * Glitch information
   * [[https://docs.google.com/spreadsheets/d/1JxC3QL6jF3xmA0MnWtWO_dUgNOF_i5enD_j4yUK1X7s/edit#gid=417713112 | KAGRA glitch catalog]]
  * Purpose
   * To decide on appropriate parameters when we run CAGMon for searching glitches and correlation
   * To make recommended parameters in the short-range analysis
  * Result
   * [[ | Glitch Catalog for the empirical study of CAGMon parameters]]
  * Conclusion

 2. Apply to the glitch data of GravitySpy on LIGO
   * Data
    * TBD
  
 3. Apply to the mid-range data
   * Data
    * TBD
 4. Apply to the long-range data
   * Data
    * TBD

== Exemplary results ==

1. Earthquake effects during O3GK (with CAGMon Etude Flat)
 * Datetime: 19 April 2020 20:39 UTC
 * Purpose
  * Test to run CAGMon algorithm with a remarkable event
  * To figure out the cause of lock-loss in KAGRA
 * Computing resource
  * KISTI-LDG
  * Requested CPUS: 32cores
  * Requested memory: 128GB
 * Results
  * stride 5 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(5)/ | Summary page]]
  * stride 20 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078/ | Summary page]]
  * stride 30 seconds [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/2020-04-19_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1271363358-1271364078(30)/ | Summary page]]

2. With iKAGRA hardware injection data (with CAGMon Etude Flat)
 * Event
  * Phenomenon: the strain channel and seismometer channels in iKAGRA had a high correlation during the hardware injection test
  * Cause: still unknown
  * Hypothesis: the glitches have relatively the same behavior as the vacuum rotary pump
  * More detailed analysis: [[https://www.dropbox.com/s/950vjc807sgz24u/hveto%20brief%20Report%20for%20K1.pdf?dl=0 | h-veto brief Report for K1]] and [[https://www.dropbox.com/s/hb7rx93an8yluiq/PilJong%2C%20KGWG%20Face-to-Face%20Meeting.pdf?dl=0 | KGWG Face-to-Face Meeting]]
 * Purpose
  * To verify whether this model senses injected signals and abnormal glitches
  * To test noise resistance and data-size limitation
 * Computing resource
  * KISTI-LDG
  * Requested CPUS: 32cores
  * Requested memory: 64GB
 * Results
 || Stride || Sample rate || Data size || Data length || Summary page link ||
 || 10s || 512Hz || about 5,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b5000%5d/ | summary page]] ||
 || 10s || 1024Hz || about 10,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b10000%5d/ | summary page]] ||
 || 10s || 2048Hz || about 20,000 || about 12m || [[ https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b20000%5d/| summary page]] ||
 || 10s || 3072Hz || about 30,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b30000%5d/ | summary page]] ||
 || 10s || 4096Hz || about 40,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b40000%5d/ | summary page]] ||
 || 2s || 4096Hz || about 8,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b2s%5d/| summary page]] ||
 || 5s || 4096Hz || about 20,000 || about 12m || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145624200-1145624936%5b5s%5d/| summary page]] ||
 || 60s || 128Hz || about 7,500 || whole iKAGRA data || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b60s%5d/ | summary page]] ||
 || 150s || 64Hz || about 10,000 || whole iKAGRA data || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b150s%5d/ | summary page]] ||
 || 300s || 64Hz || about 20,000 || whole iKAGRA data || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b300s%5d/ | summary page]] ||
 || 600s || 16Hz || about 10,000 || whole iKAGRA data || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/iKAGRA/2016-04-25_K1:LSC-MICH_CTRL_CAL_OUT_DQ_1145621548-1145670954%5b600s%5d/ | summary page]] ||


3. Skim through some obs-segments of O3GK (with CAGMon Etude Octave)
 * Purpose
  * Test for calculation time and required resources with all observation segments during O3GK
  * To figure out trigger events or abnormal behaviors
 * Computing resource
  * KISTI-LDG
  * Requested CPUS: 32cores
  * Requested memory: 128GB
 * Results
 || Date || GPS time || Data length || Stride || Sample rate || Data size || Summary page link || Remarks ||
 || April 7 || 1270287158 - 1270328032 || 11h || 500s || 16Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-07_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270287158-1270328032_500_16/ | summary page]] || processing time: 4h12m / memory usage: 42GB ||
 || || || || 240s || 32Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-07_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270287158-1270328032_240_32/ | summary page]] || processing time: 5h21m / memory usage: 23GB ||
 || || || || 120s || 64Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-07_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270287158-1270328032_120_64/ | summary page]] || processing time: 17h10m / memory usage: 41.9GB ||
 || || || || 60s || 128Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-07_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270287158-1270328032_60_128/ | summary page]] || processing time: 23h03m / memory usage: 28.8GB ||
 || || || || 30s || 256Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-07_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270287158-1270328032_30_256/ | summary page]] || processing time: 1d23h / memory usage: 24GB ||
 || || || || 15s || 512Hz || about 8,000 || [[ | summary page]] || processing time: > 3 days => killed ||
 || || || || 8s || 1024Hz || about 8,000 || [[ | summary page]] || processing time: > 3 days => killed ||
 || || || || 4s || 2048Hz || about 8,000 || [[ | summary page]] || processing time: > 3 days => killed ||
 || || || || 2s || 4096Hz || about 8,000 || [[ | summary page]] || processing time: > 3 days => killed ||
 || || || || 1s || 8192Hz || about 8,000 || [[ | summary page]] || processing time: > 3 days => killed ||
 || April 14 || 1270909686 - 1270937768 || 7h || 500s || 16Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_500_16/ | summary page]] || processing time: 50m / memory usage: 1.9GB ||
 || || || || 240s || 32Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_240_32/ | summary page]] || processing time: 2h8m / memory usage: 2.4GB ||
 || || || || 120s || 64Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_120_64/ | summary page]] || processing time: 4h40m / memory usage: 3.6GB ||
 || || || || 60s || 128Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_60_128/ | summary page]] || processing time: 8h30m / memory usage: 2.0GB ||
 || || || || 30s || 256Hz || about 8,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_30_256/ | summary page]] || processing time: 15h30m / memory usage: 2.2GB ||
 || || || || 500s || 32Hz || about 16,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_500_32/ | summary page]] || processing time: 50m / memory usage: 3.2GB ||
 || || || || 240s || 64Hz || about 16,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_240_64/ | summary page]] || processing time: 5h25m / memory usage: 3.3GB ||
 || || || || 120s || 128Hz || about 16,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_120_128/ | summary page]] || processing time: 10h10m / memory usage: 3.4GB ||
 || || || || 60s || 256Hz || about 16,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_60_256/ | summary page]] || processing time: 18h50m / memory usage: 3.4GB ||
 || || || || 30s || 512Hz || about 16,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_30_512/ | summary page]] || processing time: 17h / memory usage: 3.7GB ||
 || || || || 500s || 64Hz || about 36,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_500_64/ | summary page]] || processing time: 6h30m / memory usage: 6.7GB ||
 || || || || 240s || 128Hz || about 36,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_240_128/ | summary page]] || processing time: 10h57m / memory usage: 6.7GB ||
 || || || || 120s || 256Hz || about 36,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_120_256/ | summary page]] || processing time: 17h3m / memory usage: 7.0GB ||
 || || || || 60s || 512Hz || about 36,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_60_512/ | summary page]] || processing time: 1d6h / memory usage: 7.1GB ||
 || || || || 30s || 1024Hz || about 36,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_30_1024/ | summary page]] || processing time: 2d22h30m / memory usage: 7.0GB ||
 || || || || 500s || 128Hz || about 64,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_500_128/ | summary page]] || processing time: 12h10m / memory usage: 14.6GB ||
 || || || || 240s || 256Hz || about 64,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_240_256/ | summary page]] || processing time: 1d3h40m / memory usage: 14.6GB ||
 || || || || 120s || 512Hz || about 64,000 || [[https://ldas-jobs.ligo.caltech.edu/~pil-jong.jung/CAGMon/O3GK/2020-04-14_K1:CAL-CS_PROC_C00_STRAIN_DBL_DQ_1270909686-1270937768_120_512/ | summary page]] || processing time: 2d6h / memory usage: 14.7GB ||
 || || || || 60s || 1024Hz || about 64,000 || [[ | summary page]] || processing time: > 4 days => killed ||
 || || || || 30s || 2048Hz || about 64,000 || [[ | summary page]] || processing time: > 4 days => killed ||


4. Glitch analysis during O3GK
 * Purpose
  *
 * Computing resource
  * KISTI-LDG
  * Requested CPUS: 32cores
  * Requested memory: 128GB
 * CAGMon parameters
  * MIC Alpha: 0.6
  * MIC c: 15
  * Data-size: 8192
  * Stride: 1.0 second
 * Results
 || Datetime || GPS time || Summary page link || Remarks ||


== Cross-validation ==
==== Presentation materials ====
[[https://gwdoc.icrr.u-tokyo.ac.jp/cgi-bin/private/DocDB/ShowDocument?docid=12481|JGW-G2112481-v1]]
[[https://science.sciencemag.org/content/334/6062/1518 | Science.1518; Detecting Novel Associations in Large Data Sets]]

 ,-----.  ,---.   ,----.   ,--.   ,--.                            ,--.             ,--.        
'  .--./ /  O  \ '  .-./   |   `.'   | ,---. ,--,--,      ,---. ,-'  '-.,--.,--. ,-|  | ,---.  
|  |    |  .-.  ||  | .---.|  |'.'|  || .-. ||      \    | .-. :'-.  .-'|  ||  |' .-. || .-. : 
'  '--'\|  | |  |'  '--'  ||  |   |  |' '-' '|  ||  |    \   --.  |  |  '  ''  '\ `-' |\   --. 
 `-----'`--' `--' `------' `--'   `--' `---' `--''--'     `----'  `--'   `----'  `---'  `----'                                                                                             

Description

The CAGMon etude is a study version of CAGMon that evaluates the dependence between the primary and auxiliary channels.

Project goal

The goal of this project is to find a systematic way of identifying abnormal glitches in gravitational-wave data using various methods of correlation analysis. Collaborations such as LIGO, Virgo, and KAGRA usually find glitches in the auxiliary channels of the detector with conventional tools such as Kleine-Welle, Omicron, and Ordered Veto Lists. However, other approaches may be able to find and monitor glitches in (quasi-) real time and also point out which channel is responsible for a given glitch. In this project, we study whether it is possible to apply three different correlation measures (the maximal information coefficient, Pearson's correlation coefficient, and Kendall's tau coefficient) to gravitational-wave data from the KAGRA detector.

Participants

  • John J. Oh (NIMS)
  • Young-Min Kim (UNIST)
  • Pil-Jong Jung (NIMS)

Methods and Frameworks

Maximal Information Coefficient (MIC)

The maximal information coefficient (MIC) of a set D of two-variable data with sample size n and grid size less than B(n) is given by

\[ MIC(D)=\underset{xy<B(n)}{\max}{\left\{ \frac{I^{*}(D,x,y)}{\log \min \left\{x,y \right\}} \right \}} \],

where \[ I^{*}(D,x,y) \] is the maximum mutual information over all x-by-y grids applied to D, and \[\omega(1)<B(n)\le O(n^{1-\epsilon}) \] for some \[ 0<\epsilon<1 \].
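The structure of the MIC formula can be illustrated with a simplified sketch. It is not the algorithm used by CAGMon: instead of searching for the optimal grid that defines I*, it only tries equal-frequency grids, so it gives a rough lower-bound style estimate. The function names and the choice B(n) = n^alpha with alpha = 0.6 are assumptions made for this illustration.

```python
import numpy as np

def grid_mutual_information(rank_x, rank_y, nx, ny):
    """Mutual information (in nats) on an nx-by-ny equal-frequency grid.

    rank_x and rank_y are the 0-based ranks of the two samples.
    """
    n = len(rank_x)
    # Map ranks to bin indices so each bin holds about n/nx (or n/ny) points.
    xi = rank_x * nx // n
    yi = rank_y * ny // n
    joint = np.zeros((nx, ny))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= n
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, ny)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

def naive_mic(x, y, alpha=0.6):
    """Maximum normalized grid MI over all grids with nx * ny < B(n) = n**alpha.

    Assumption: only equal-frequency grids are tried, not the optimal grid.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    bound = n ** alpha
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    best = 0.0
    for nx in range(2, int(bound // 2) + 1):
        for ny in range(2, int(bound // nx) + 1):
            if nx * ny >= bound:
                continue
            mi = grid_mutual_information(rank_x, rank_y, nx, ny)
            best = max(best, mi / np.log(min(nx, ny)))
    return best

# Illustrative data: a nonlinear dependence and an independent pair.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
print("quadratic:", naive_mic(a, a ** 2))
print("independent:", naive_mic(a, rng.normal(size=300)))
```

Dividing by log min{x, y} keeps every grid's score at or below 1, which is why grids of different resolution can be compared under a single maximum.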

Pearson's Correlation Coefficient (PCC)

Pearson's correlation coefficient (PCC) is a statistic that measures the strength of the linear relationship between two variables by

\[ R={{\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})} \over {\sqrt{\sum_{i=1}^{n} (X_i - \overline{X})^2 \sum_{i=1}^{n} (Y_i - \overline{Y})^2}}} \],

where \[ \overline{X} \] and \[ \overline{Y} \] are the means of X and Y, respectively.
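As a minimal sketch, the PCC can be computed directly from its definition and checked against NumPy's built-in routine. The variable names `aux` and `strain` and the synthetic data are illustrative assumptions, not CAGMon channel data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient computed from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean of X
    dy = y - y.mean()   # deviations from the mean of Y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Synthetic example: a "strain" series partly driven by an "auxiliary" series.
rng = np.random.default_rng(0)
aux = rng.normal(size=1000)
strain = 0.8 * aux + 0.6 * rng.normal(size=1000)

r = pearson_r(strain, aux)
print(r, np.corrcoef(strain, aux)[0, 1])  # the two values should agree
```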

Kendall's tau Coefficient

For a random sample of n observations from two variables, Kendall's tau measures the strength of the relationship between two ordinal-level variables by

\[ \tau =\frac{c-d}{{n \choose 2}} \],

where c is the number of concordant pairs and d is the number of discordant pairs.
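The counting of concordant and discordant pairs can be sketched as follows. This is a direct O(n^2) illustration of the formula (the tau-a variant, in which tied pairs count as neither concordant nor discordant); the function name is an assumption made for this example.

```python
from itertools import combinations
from math import comb

def kendall_tau(x, y):
    """Kendall's tau from concordant (c) and discordant (d) pair counts."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    c = d = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:      # both variables move in the same direction
            c += 1
        elif sign < 0:    # the variables move in opposite directions
            d += 1
    return (c - d) / comb(len(x), 2)

# Of the 10 pairs, 8 are concordant and 2 are discordant: (8 - 2) / 10.
tau = kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
print(tau)
```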

Flow chart

Code development

GitHub

TBA

Code versions

  1. CAGMon Etude Alpha
    • for the basic test and evaluation of the LASSO regression method developed by LIGO
    • reproduced original CAGMon methods and idea
  2. CAGMon Etude Beta
    • added coefficient trend plots with LASSO beta, coherence, MIC, PCC, and Kendall's tau
  3. CAGMon Etude Delta
    • fixed a critical problem that consumed an enormous amount of memory when the matplotlib module was used
  4. CAGMon Etude Eta
    • fixed minor issues
    • added the range limitation of stride
  5. CAGMon Etude Flat
    • fixed minor issues and optimized scripts
    • added the script of HTML summary page
    • added coefficient distribution plots
  6. CAGMon Etude Octave (current version)
    • removed the processes that make time-series and scatter plots, which required a great deal of memory while providing little useful information
    • adjusted HTML code
    • fixed minor issues and optimized scripts
    • added an analysis option to run the algorithm on active segments only
    • improved script efficiency
    • added the process to make scatter and OmegaScan plots in detail boxes of the summary page

  7. CAGMon Etude Rhapsody (development version)
    • Require auto-selection of CAGMon parameters
    • Require pre-estimation process to check intrinsic sample rate of each channel
    • Improve script efficiency and completeness

Series of scripts

  • Agrement.py
    • the script that gathers the functions the model requires
  • Melody.py
    • the script that calculates each coefficient and saves the trend data as CSV
  • Conchord.py
    • the script that makes plots
  • Echo.py
    • the script that saves the results as an HTML web page
  • CAGMonEtude{Version}.py
    • the script that runs the other scripts

User guide

Needs of code development

  • Fundamental criteria or guidelines for choosing CAGMon parameters, such as the stride, the sample rate, and the data size
  • Daily running on KAGRA

Empirical study (No free lunch)

  1. Apply to glitch data on KAGRA during O3GK
  2. Apply to the glitch data of GravitySpy on LIGO

    • Data
      • TBD
  3. Apply to the mid-range data
    • Data
      • TBD
  4. Apply to the long-range data
    • Data
      • TBD

Exemplary results

1. Earthquake effects during O3GK (with CAGMon Etude Flat)

  • Datetime: 19 April 2020 20:39 UTC
  • Purpose
    • Test to run CAGMon algorithm with a remarkable event
    • To figure out the cause of lock-loss in KAGRA
  • Computing resource
    • KISTI-LDG
    • Requested CPUS: 32cores
    • Requested memory: 128GB
  • Results

2. With iKAGRA hardware injection data (with CAGMon Etude Flat)

  • Event
    • Phenomenon: the strain channel and seismometer channels in iKAGRA had a high correlation during the hardware injection test
    • Cause: still unknown
    • Hypothesis: the glitches have relatively the same behavior as the vacuum rotary pump
    • More detailed analysis: h-veto brief Report for K1 and KGWG Face-to-Face Meeting

  • Purpose
    • To verify whether this model senses injected signals and abnormal glitches
    • To test noise resistance and data-size limitation
  • Computing resource
    • KISTI-LDG
    • Requested CPUS: 32cores
    • Requested memory: 64GB
  • Results

    || Stride || Sample rate || Data size || Data length || Summary page link ||
    || 10s || 512Hz || about 5,000 || about 12m || summary page ||
    || 10s || 1024Hz || about 10,000 || about 12m || summary page ||
    || 10s || 2048Hz || about 20,000 || about 12m || summary page ||
    || 10s || 3072Hz || about 30,000 || about 12m || summary page ||
    || 10s || 4096Hz || about 40,000 || about 12m || summary page ||
    || 2s || 4096Hz || about 8,000 || about 12m || summary page ||
    || 5s || 4096Hz || about 20,000 || about 12m || summary page ||
    || 60s || 128Hz || about 7,500 || whole iKAGRA data || summary page ||
    || 150s || 64Hz || about 10,000 || whole iKAGRA data || summary page ||
    || 300s || 64Hz || about 20,000 || whole iKAGRA data || summary page ||
    || 600s || 16Hz || about 10,000 || whole iKAGRA data || summary page ||

3. Skim through some obs-segments of O3GK (with CAGMon Etude Octave)

  • Purpose
    • Test for calculation time and required resources with all observation segments during O3GK
    • To figure out trigger events or abnormal behaviors
  • Computing resource
    • KISTI-LDG
    • Requested CPUS: 32cores
    • Requested memory: 128GB
  • Results

    || Date || GPS time || Data length || Stride || Sample rate || Data size || Summary page link || Remarks ||
    || April 7 || 1270287158 - 1270328032 || 11h || 500s || 16Hz || about 8,000 || summary page || processing time: 4h12m / memory usage: 42GB ||
    || || || || 240s || 32Hz || about 8,000 || summary page || processing time: 5h21m / memory usage: 23GB ||
    || || || || 120s || 64Hz || about 8,000 || summary page || processing time: 17h10m / memory usage: 41.9GB ||
    || || || || 60s || 128Hz || about 8,000 || summary page || processing time: 23h03m / memory usage: 28.8GB ||
    || || || || 30s || 256Hz || about 8,000 || summary page || processing time: 1d23h / memory usage: 24GB ||
    || || || || 15s || 512Hz || about 8,000 || summary page || processing time: > 3 days => killed ||
    || || || || 8s || 1024Hz || about 8,000 || summary page || processing time: > 3 days => killed ||
    || || || || 4s || 2048Hz || about 8,000 || summary page || processing time: > 3 days => killed ||
    || || || || 2s || 4096Hz || about 8,000 || summary page || processing time: > 3 days => killed ||
    || || || || 1s || 8192Hz || about 8,000 || summary page || processing time: > 3 days => killed ||
    || April 14 || 1270909686 - 1270937768 || 7h || 500s || 16Hz || about 8,000 || summary page || processing time: 50m / memory usage: 1.9GB ||
    || || || || 240s || 32Hz || about 8,000 || summary page || processing time: 2h8m / memory usage: 2.4GB ||
    || || || || 120s || 64Hz || about 8,000 || summary page || processing time: 4h40m / memory usage: 3.6GB ||
    || || || || 60s || 128Hz || about 8,000 || summary page || processing time: 8h30m / memory usage: 2.0GB ||
    || || || || 30s || 256Hz || about 8,000 || summary page || processing time: 15h30m / memory usage: 2.2GB ||
    || || || || 500s || 32Hz || about 16,000 || summary page || processing time: 50m / memory usage: 3.2GB ||
    || || || || 240s || 64Hz || about 16,000 || summary page || processing time: 5h25m / memory usage: 3.3GB ||
    || || || || 120s || 128Hz || about 16,000 || summary page || processing time: 10h10m / memory usage: 3.4GB ||
    || || || || 60s || 256Hz || about 16,000 || summary page || processing time: 18h50m / memory usage: 3.4GB ||
    || || || || 30s || 512Hz || about 16,000 || summary page || processing time: 17h / memory usage: 3.7GB ||
    || || || || 500s || 64Hz || about 36,000 || summary page || processing time: 6h30m / memory usage: 6.7GB ||
    || || || || 240s || 128Hz || about 36,000 || summary page || processing time: 10h57m / memory usage: 6.7GB ||
    || || || || 120s || 256Hz || about 36,000 || summary page || processing time: 17h3m / memory usage: 7.0GB ||
    || || || || 60s || 512Hz || about 36,000 || summary page || processing time: 1d6h / memory usage: 7.1GB ||
    || || || || 30s || 1024Hz || about 36,000 || summary page || processing time: 2d22h30m / memory usage: 7.0GB ||
    || || || || 500s || 128Hz || about 64,000 || summary page || processing time: 12h10m / memory usage: 14.6GB ||
    || || || || 240s || 256Hz || about 64,000 || summary page || processing time: 1d3h40m / memory usage: 14.6GB ||
    || || || || 120s || 512Hz || about 64,000 || summary page || processing time: 2d6h / memory usage: 14.7GB ||
    || || || || 60s || 1024Hz || about 64,000 || summary page || processing time: > 4 days => killed ||
    || || || || 30s || 2048Hz || about 64,000 || summary page || processing time: > 4 days => killed ||

4. Glitch analysis during O3GK

  • Purpose
  • Computing resource
    • KISTI-LDG
    • Requested CPUS: 32cores
    • Requested memory: 128GB
  • CAGMon parameters
    • MIC Alpha: 0.6
    • MIC c: 15
    • Data-size: 8192
    • Stride: 1.0 second
  • Results

    || Datetime || GPS time || Summary page link || Remarks ||

Cross-validation

Beyond

References

Presentation materials

JGW-G2112481-v1

Papers

Science.1518; Detecting Novel Associations in Large Data Sets

PJJung/CAGMonEtude (last edited 2021-07-28 08:43:57 by PJJung)