Differences between revisions 37 and 38
Revision 37 as of 2021-01-26 14:22:14
Size: 12322
Editor: PJJung
Comment:
Revision 38 as of 2021-01-26 14:24:14
Size: 12344
Editor: PJJung
Comment:

 ,-----.  ,---.   ,----.   ,--.   ,--.                            ,--.             ,--.        
'  .--./ /  O  \ '  .-./   |   `.'   | ,---. ,--,--,      ,---. ,-'  '-.,--.,--. ,-|  | ,---.  
|  |    |  .-.  ||  | .---.|  |'.'|  || .-. ||      \    | .-. :'-.  .-'|  ||  |' .-. || .-. : 
'  '--'\|  | |  |'  '--'  ||  |   |  |' '-' '|  ||  |    \   --.  |  |  '  ''  '\ `-' |\   --. 
 `-----'`--' `--' `------' `--'   `--' `---' `--''--'     `----'  `--'   `----'  `---'  `----'                                                                                             

Description

The CAGMon etude is a study version of CAGMon that evaluates the dependence between the primary and auxiliary channels.

Project goal

The goal of this project is to find a systematic way of identifying abnormal glitches in gravitational-wave data using various methods of correlation analysis. Communities such as LIGO, Virgo, and KAGRA usually find glitches in the auxiliary channels of the detector with conventional tools such as Kleine-Welle, Omicron, and Ordered Veto Lists. However, other approaches may be able to find and monitor glitches in (quasi-)real time and also point out which channel is responsible for a given glitch. In this project, we study whether it is possible to apply three different correlation methods (the maximal information coefficient, Pearson's correlation coefficient, and Kendall's tau coefficient) to the gravitational-wave data from the KAGRA detector.

Participants

  • John.J Oh (NIMS)
  • Young-Min Kim (UNIST)
  • Pil-Jong Jung (NIMS)

Methods and Frameworks

Maximal Information Coefficient (MIC)

The maximal information coefficient (MIC) of a set D of two-variable data with sample size n and grid size less than B(n) is given by

\[ MIC(D)=\underset{xy<B(n)}{\max}{\left\{ \frac{I^{*}(D,x,y)}{\log \min \left\{x,y \right\}} \right \}} \],

where \[\omega(1)<B(n)\le O(n^{1-\epsilon}) \] for some \[ 0<\epsilon<1 \], and \[ I^{*}(D,x,y) \] is the maximum mutual information over all x-by-y grids applied to D.
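As a rough illustration of the definition above, the following Python sketch evaluates the normalized mutual information on equal-frequency grids only and takes the maximum over all grid shapes with xy < B(n). This is a simplification: the full MIC also maximizes over the placement of the grid lines, so the sketch yields a lower bound on the sample MIC, not the value a dedicated MIC implementation would return. The function names and the choice B(n) = n^0.6 are assumptions made for illustration and are not taken from the CAGMon code.

```python
import numpy as np

def grid_mutual_information(x, y, nx, ny):
    """Mutual information (nats) of x and y binned on an nx-by-ny equal-frequency grid."""
    n = len(x)
    # Rank each sample, then map ranks to equal-frequency bin indices.
    xi = np.argsort(np.argsort(x)) * nx // n
    yi = np.argsort(np.argsort(y)) * ny // n
    joint = np.zeros((nx, ny))
    np.add.at(joint, (xi, yi), 1.0)      # 2-D histogram of bin pairs
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, ny)
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def approx_mic(x, y, alpha=0.6):
    """Equipartition-only approximation of MIC with B(n) = n**alpha (assumed)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b = len(x) ** alpha
    best = 0.0
    for nx in range(2, int(b // 2) + 1):
        for ny in range(2, int(b // nx) + 1):
            if nx * ny >= b:               # enforce xy < B(n)
                break
            score = grid_mutual_information(x, y, nx, ny) / np.log(min(nx, ny))
            best = max(best, score)
    return best
```

For a noiseless monotonic relationship the sketch returns a value of 1 (up to rounding), while independent noise gives a value much closer to 0.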

Pearson's Correlation Coefficient (PCC)

Pearson's correlation coefficient (PCC) is a statistic that measures the strength of the linear relationship between two variables X and Y; its square gives the fraction of variance accounted for by that relationship. It is given by

\[ R=\frac{\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \overline{X})^{2} \sum_{i=1}^{n} (Y_i - \overline{Y})^{2}}} \],

where \[ \overline{X} \] and \[ \overline{Y} \] are the means of X and Y, respectively.
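The standard Pearson formula, the covariance sum divided by the square root of the product of the two sums of squared deviations, can be written directly in NumPy. The sketch below is for illustration only; the function name pearson_r is hypothetical, and the result should agree with numpy.corrcoef.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean of X
    dy = y - y.mean()   # deviations from the mean of Y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))
```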

Kendall's tau Coefficient

For a random sample of n observations from two variables, Kendall's tau measures the strength of the relationship between two ordinal-level variables by

\[ \tau =\frac{c-d}{\binom{n}{2}} \],

where c is the number of concordant pairs, and d is the number of discordant pairs.
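The pair counting behind Kendall's tau can be sketched in a few lines of Python. This is a direct O(n^2) illustration of tau = (c - d) / C(n, 2); tied pairs count as neither concordant nor discordant, which is one possible convention and an assumption here. The function name kendall_tau is hypothetical.

```python
from itertools import combinations
from math import comb

def kendall_tau(x, y):
    """Kendall's tau from concordant (c) and discordant (d) pair counts."""
    c = d = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            c += 1      # both variables move in the same direction
        elif s < 0:
            d += 1      # the variables move in opposite directions
    return (c - d) / comb(len(x), 2)
```

For example, kendall_tau([1, 2, 3], [1, 3, 2]) has two concordant pairs and one discordant pair out of three, giving 1/3.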

Flow chart

Code development

GitHub

TBA

Code versions

  1. CAGMon Etude Alpha
    • for the basic test and evaluation of the LASSO regression method developed by LIGO
    • reproduced original CAGMon methods and idea
  2. CAGMon Etude Beta
    • added coefficient trend plots with LASSO beta, coherence, MIC, PCC, and Kendall's tau
  3. CAGMon Etude Delta
    • fixed a critical problem that consumed an enormous amount of memory when the matplotlib module was used
  4. CAGMon Etude Eta
    • fixed minor issues
    • added the range limitation of stride
  5. CAGMon Etude Flat (current version)
    • fixed minor issues and optimized scripts
    • added the script for the HTML summary page
    • added coefficient distribution plots
  6. CAGMon Etude Octave (development version)
    • remove the processes that make time-series and scatter plots, which required tremendous memory but provided little useful information
    • adjust HTML code
    • fixed minor issues and optimized scripts

Series of scripts

  • Agrement.py
    • the script that gathers the functions the model requires
  • Melody.py
    • the script to calculate each coefficient and save the trend data as CSV
  • Conchord.py
    • the script to make plots
  • Echo.py
    • the script to save the result as an HTML web page
  • CAGMonEtude{Version}.py
    • the script to run each script
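To make the idea of a coefficient trend concrete, the following Python sketch splits a primary and an auxiliary time series into non-overlapping strides, computes Pearson's coefficient for each stride, and writes the result to a CSV file. It is not the actual Melody.py; the function names, the CSV column names, and the use of non-overlapping strides are all assumptions made for illustration.

```python
import csv
import numpy as np

def stride_trend(primary, auxiliary, sample_rate, stride):
    """Return (stride start in s, PCC) for each non-overlapping stride.

    sample_rate is in Hz and stride is in seconds (hypothetical interface).
    """
    size = int(sample_rate * stride)                    # samples per stride
    n_strides = min(len(primary), len(auxiliary)) // size
    trend = []
    for k in range(n_strides):
        seg = slice(k * size, (k + 1) * size)
        r = np.corrcoef(primary[seg], auxiliary[seg])[0, 1]
        trend.append((k * stride, float(r)))
    return trend

def save_trend(trend, path):
    """Write the trend rows to a CSV file with assumed column names."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["stride_start_s", "pcc"])
        writer.writerows(trend)
```

The same loop could call any of the other coefficients in place of numpy.corrcoef; Pearson's coefficient is used here only because it needs no extra dependency.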

User guide

Needs of code development

  • A fundamental criterion or guideline for choosing the stride and its data size
  • Daily running on KAGRA

Exemplary results

1. Earthquake effects during O3GK

  • Datetime: 19 April 2020 20:39 UTC
  • Purpose
    • To test running the CAGMon algorithm on a remarkable event
    • To figure out the cause of lock-loss in KAGRA
  • Computing resource
    • KISTI-LDG
    • Requested CPUs: 72 cores
    • Requested memory: 128 GB
  • Results

2. Skim through all obs-segments of O3GK

  • Purpose
    • To test the calculation time and required resources using all observation segments of O3GK
    • To identify trigger events or abnormal behaviors
  • Computing resource
    • KISTI-LDG
    • Requested CPUs: 72 cores
    • Requested memory: 64 GB
  • Results

    || Date || GPS time || Data length || Stride || Sample rate || Data size || Summary page link || Remarks ||
    || April 7 || 1270287158 - 1270328032 || 11h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 8 || 1270339218 - 1270425618 || 24h || || || || || Full Data is unavailable in the KISTI cluster ||
    || April 9 || 1270425618 - 1270510167 || 23h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 10 || 1270513160 - 1270596544 || 23h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 11 || 1270598418 - 1270683904 || 23h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || April 12 || 1270684818 - 1270762046 || 21h || 600s || 16Hz || about 10,000 || summary page || ||
    || || || || 300s || 32Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || April 14 || 1270909686 - 1270937768 || 7h || 600s || 16Hz || about 10,000 || summary page || ||
    || || || || 300s || 32Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || April 15 || 1270945288 - 1271017582 || 20h || || || || || GRB200415 (08:48:05 UTC) / Full Data is unavailable in the KISTI cluster ||
    || April 16 || 1271030433 - 1271112809 || 22h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 17 || 1271119833 - 1271186507 || 18h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 18 || 1271227441 - 1271288128 || 16h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 19 || 1271289618 - 1271364033 || 20h || 600s || 16Hz || about 10,000 || summary page || processing time: h / memory usage: GB ||
    || || || || 300s || 32Hz || about 10,000 || summary page || ||
    || April 20 || 1271377409 - 1271460608 || 23h || || || || || GRB200420A (2:32:58 UTC) / Full Data is unavailable in the KISTI cluster ||
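The data lengths listed for the O3GK segments match the GPS time ranges rounded down to whole hours. A minimal Python sketch of that arithmetic, together with a GPS-to-UTC conversion, is given below. The helper names are hypothetical, and the fixed 18 s GPS-UTC offset is an assumption that holds for dates from 2017 onward, including April 2020; a leap-second-aware library would be needed for general use.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
LEAP_SECONDS = 18  # GPS minus UTC, assumed constant (valid since 2017-01-01)

def gps_to_utc(gps_seconds):
    """Convert a GPS time in seconds to a UTC datetime (fixed leap-second offset)."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - LEAP_SECONDS)

def segment_hours(gps_start, gps_end):
    """Length of a segment in whole hours, rounded down."""
    return (gps_end - gps_start) // 3600

# First O3GK segment from the results above.
start, end = 1270287158, 1270328032
print(gps_to_utc(start).date(), segment_hours(start, end))
```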

3. With iKAGRA hardware injection data

  • Event
    • Phenomenon: the strain channel and seismometer channels in iKAGRA had a high correlation during the hardware injection test
    • Cause: still unknown
    • Hypothesis: the glitches behave similarly to the vacuum rotary pump
    • More detailed analysis: hveto brief Report for K1 and KGWG Face-to-Face Meeting

  • Purpose
    • To verify whether this model senses injected signals and abnormal glitches
    • To test noise resistance and data-size limitation
  • Computing resource
    • KISTI-LDG
    • Requested CPUs: 72 cores
    • Requested memory: 64 GB
  • Results

    || Stride || Sample rate || Data size || Data length || Summary page link ||
    || 10s || 512Hz || about 5,000 || about 12m || summary page ||
    || 10s || 1024Hz || about 10,000 || about 12m || summary page ||
    || 10s || 2048Hz || about 20,000 || about 12m || summary page ||
    || 10s || 3072Hz || about 30,000 || about 12m || summary page ||
    || 10s || 4096Hz || about 40,000 || about 12m || summary page ||
    || 2s || 4096Hz || about 8,000 || about 12m || summary page ||
    || 5s || 4096Hz || about 20,000 || about 12m || summary page ||
    || 60s || 128Hz || about 7,500 || whole iKAGRA data || summary page ||
    || 150s || 64Hz || about 10,000 || whole iKAGRA data || summary page ||
    || 300s || 64Hz || about 20,000 || whole iKAGRA data || summary page ||
    || 600s || 16Hz || about 10,000 || whole iKAGRA data || summary page ||
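The data sizes listed for the iKAGRA tests are consistent with the number of samples in one stride, that is, stride multiplied by sample rate. This reading is inferred from the listed values and is not stated explicitly on the page. The short Python sketch below reproduces the arithmetic for a few of the configurations; the function name is hypothetical.

```python
def samples_per_stride(stride_s, sample_rate_hz):
    """Number of samples in one stride (stride in seconds, rate in Hz)."""
    return stride_s * sample_rate_hz

# (stride in s, sample rate in Hz) pairs taken from the results above.
for stride_s, rate_hz in [(10, 512), (10, 4096), (60, 128), (600, 16)]:
    print(f"{stride_s}s at {rate_hz}Hz: {samples_per_stride(stride_s, rate_hz)} samples")
```

Keeping the product near 10,000 while changing the stride, as in the 150s/64Hz and 600s/16Hz runs, holds the per-stride sample count roughly fixed.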

Beyond

References

Presentation materials

JGW-G2112481-v1

Papers

D. N. Reshef et al., "Detecting Novel Associations in Large Data Sets", Science 334, 1518 (2011)

PJJung/CAGMonEtude (last edited 2021-07-28 08:43:57 by PJJung)