
This post introduces RRCF (Robust Random Cut Forest), an algorithm that is also used in Amazon SageMaker.

 

Introduction to RRCF

  • RRCF is an ensemble method for detecting outliers in streaming data
  • Key features
    • Designed to handle streaming data
    • Performs well on high-dimensional data
    • Reduces the influence of irrelevant dimensions
    • Gracefully handles duplicates and near-duplicates that could otherwise mask the presence of outliers
    • Provides an anomaly score with a clear underlying statistical meaning
  • Some existing anomaly detection methods
    • One-class support vector machines (OC-SVM)
    • Robust covariance estimation
    • Local Outlier Factor (LOF)
    • Replicator neural networks (RNN; not to be confused with recurrent neural networks)
  • These methods run into several problems, and the Isolation Forest (IF) algorithm proposed a new ensemble approach to address them
  • IF performs well at detecting outliers, but it has some limitations
    • Once a tree has been built, points cannot be inserted into or deleted from an isolation tree, so IF is not designed for use with streaming data
    • IF is sensitive to "irrelevant dimensions": partitions are often wasted on dimensions that provide relatively little information
    • Tree depth has been empirically successful at detecting outliers, but there is little theoretical justification for using it as an anomaly score
  • RRCF was devised to overcome these limitations

 

How the Algorithm Works

  • RRCF takes a number of random samples of data points, each with the same number of points, and repeatedly cuts each sample to build a tree
  • Combining all the trees forms a forest, which can then be used to check whether a given data point is an outlier
  • Each point receives a score according to where the algorithm places it in the trees
  • In the example figure, the orange point receives a high score
  • Every data point inside the circle has a score lower than that of the outlier

  • The anomaly score reflects how far a point lies from the circle
  • The lower the score, the more normal the point; the higher the score, the more anomalous
  • A data point is considered anomalous when its score is more than three standard deviations above the mean score

How the Score Is Computed

  1. Build a bounding box around the data using the minimum and maximum value of each dimension
  2. Pick one of the dimensions and cut its range at a random position. In the example figure, the cut is made along the x-axis
  3. Rebuild a bounding box for each of the left and right sides
  4. Make another random cut inside each new bounding box
  5. Points that are cut off and isolated close to the root of the tree get higher scores: the closer to the root, the higher the score
  6. The process repeats until every point in the tree is completely isolated
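
The steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the SageMaker or `rrcf` implementation. The names `isolation_depths` and `average_depths` are hypothetical. The cut dimension is chosen with probability proportional to its range, as in the RRCF paper (the list above only says "pick one of the dimensions"). Average isolation depth stands in for the score, whereas the `rrcf` library scores points by collusive displacement, which is not implemented here.

```python
import random


def isolation_depths(points, rng, depth=0, out=None):
    """Cut `points` recursively and record the depth at which each is isolated.

    `points` is a list of (index, coordinates) pairs.
    """
    if out is None:
        out = {}
    if len(points) == 1:
        out[points[0][0]] = depth
        return out
    dims = list(range(len(points[0][1])))
    # Step 1: bounding box from the per-dimension minimum and maximum.
    lows = [min(p[d] for _, p in points) for d in dims]
    highs = [max(p[d] for _, p in points) for d in dims]
    spans = [hi - lo for lo, hi in zip(lows, highs)]
    if sum(spans) == 0:
        # Only exact duplicates remain; they cannot be separated further.
        for idx, _ in points:
            out[idx] = depth
        return out
    # Step 2: pick a dimension (weighted by its range) and a random cut.
    d = rng.choices(dims, weights=spans)[0]
    cut = rng.uniform(lows[d], highs[d])
    left = [(i, p) for i, p in points if p[d] <= cut]
    right = [(i, p) for i, p in points if p[d] > cut]
    if not left or not right:
        # Rare degenerate cut exactly on the boundary: try again.
        return isolation_depths(points, rng, depth, out)
    # Steps 3-6: repeat inside each side until every point is isolated.
    isolation_depths(left, rng, depth + 1, out)
    isolation_depths(right, rng, depth + 1, out)
    return out


def average_depths(coords, n_trees=50, seed=0):
    """Average each point's isolation depth over several random trees."""
    rng = random.Random(seed)
    indexed = list(enumerate(coords))
    totals = [0.0] * len(coords)
    for _ in range(n_trees):
        for idx, depth in isolation_depths(indexed, rng).items():
            totals[idx] += depth
    return [t / n_trees for t in totals]


# A tight 6x6 grid of normal points plus one far-away outlier (last index).
grid = [(x / 10, y / 10) for x in range(6) for y in range(6)]
coords = grid + [(10.0, 10.0)]
avg = average_depths(coords)
shallowest = min(range(len(avg)), key=avg.__getitem__)
print(shallowest, avg[shallowest])
```

The outlier is usually separated by the very first cut, so its average depth is the smallest. A smaller average depth corresponds to "closer to the root" in step 5 and therefore to a higher score.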

References

  • Amazon SageMaker's built-in Random Cut Forest algorithm for anomaly detection | Amazon Web Services (aws.amazon.com)
  • An Introduction to SageMaker Random Cut Forests, Amazon SageMaker Examples 1.0.0 documentation (sagemaker-examples.readthedocs.io)
  • Random Cut Forest, sagemaker 2.72.3 documentation (sagemaker.readthedocs.io)
  • How RCF Works - Amazon SageMaker (docs.aws.amazon.com)
  • [Paper review] Robust Random Cut Forest (RRCF), a real-time anomaly detection model (hiddenbeginner.github.io)
  • rrcf: Implementation of the Robust Random Cut Forest Algorithm for anomaly detection on streams (klabum.github.io)
  • Robust Random Cut Forest (RRCF): A No Math Explanation (www.linkedin.com)

Searching for "RRCF" turns up plenty of other good material, for example on https://medium.com
