
Lecture 02 : Performance

last review: 2023/04/10
mastery: rookie
progress: not started
date: 2023/03/15
Measuring the performance of a system is not a simple task.
Modern software is large and complex, and hardware uses many techniques to improve performance, which makes performance evaluation even harder.

Performance

• If the same program is run on two desktop computers, the one that finishes first can be called the faster computer.
  ◦ For an individual user, response time (a.k.a. execution time) is what matters more.
• But if several servers handle jobs for many users, the faster computer is the one that completes more jobs in a day.
  ◦ That is, for a server, throughput (or bandwidth) matters more.

Throughput

Which airplane is best? Each airplane has different figures.
• Which airplane is the fastest?
• Which airplane carries the most passengers?
• Which airplane has the longest range?
To move a single person quickly, the Concorde is the faster and more efficient choice.
Then which one is faster and more efficient for moving many passengers quickly?
Some work consists of a single task, but much of it involves handling many tasks.
Throughput is the amount of work that can be processed per unit of time.
The airplane corresponds to the computer platform, and the passengers correspond to instructions.
A program is made up of instructions; running a program means executing a great many of them, so many instructions must be processed quickly.

Relative Performance

\text{Performance} = \cfrac{1}{\text{Execution Time}}
In other words, saying that one computer performs relatively better is the same as saying that its execution time is relatively shorter.
To avoid confusion, when comparing computer performance quantitatively we use only the phrase "faster than".
Also, as performance increases, execution time decreases. Because the two are inversely related, we agree to say "performance improves" rather than "performance increases or decreases".
That is, for two computers X and Y, if X performs better than Y, then
\text{Performance}_\text{X} > \text{Performance}_\text{Y} \\ \cfrac{1}{\text{Execution Time}_\text{X}} > \cfrac{1}{\text{Execution Time}_\text{Y}} \\ \text{Execution Time}_\text{Y} > \text{Execution Time}_\text{X}
Quantitatively, "X is n times faster than Y" can be written as follows.
\cfrac{\text{Performance}_\text{X}}{\text{Performance}_\text{Y}} = \cfrac{\text{Execution Time}_\text{Y}}{\text{Execution Time}_\text{X}} = n
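The ratio above can be checked with a few lines of Python. This is only a minimal sketch: the function name `relative_performance` and the timings (10 s and 15 s) are assumptions made up for illustration, not values from the lecture.

```python
def relative_performance(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is 1 / execution time, so
    Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Hypothetical timings: X runs the program in 10 s, Y in 15 s.
n = relative_performance(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```

Note that the slower machine's time goes in the numerator, so n is greater than 1 whenever X really is the faster computer.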

Measuring Execution Time

Time is the most basic measure of performance. The computer that finishes the same job soonest is the fastest one.
Here, program execution time is the time taken to process the program, expressed in seconds.
However, time can be defined in several ways depending on how we measure it.

Elapsed Time (a.k.a. Response Time, Wall-Clock Time): computation execution time

The total time needed to finish one job.
• It includes everything: processing, I/O, OS overhead, and idle time.

CPU Time

When a computer is shared, a single processor may run several programs at once.
In such an environment, optimizing throughput can matter more than minimizing the elapsed time of any one program.
We therefore need to distinguish elapsed time from the time the processor actually spends executing the program.
CPU time is defined as follows.
The time the processor spends working only on the given job, excluding I/O time and the time allotted to other jobs.
Of course, it is impossible to isolate this perfectly. (Even the CPU time reported by Linux is not 100% accurate.)
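The difference between elapsed time and CPU time can be seen with Python's standard `time` module. This is a rough sketch rather than a rigorous benchmark: `time.perf_counter()` reads a wall clock, `time.process_time()` reports the CPU time of the current process and excludes time spent sleeping, and the helper names `measure` and `mostly_waiting` are made up for illustration.

```python
import time


def measure(func):
    """Run func once and return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # CPU time of this process only
    func()
    cpu_used = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu_used


def mostly_waiting():
    # Sleeping stands in for I/O or idle time: the wall clock keeps
    # running, but this process is not using the processor.
    time.sleep(0.2)
    # A small amount of real computation, which does consume CPU time.
    sum(i * i for i in range(100_000))


elapsed, cpu_used = measure(mostly_waiting)
print(f"elapsed: {elapsed:.3f} s, CPU: {cpu_used:.3f} s")
```

On a typical run the elapsed time is a little over 0.2 s while the CPU time is far smaller, because the sleep counts toward elapsed time only.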

CPU performance and factors

When expressing computer performance, the metrics used by users and the metrics used by designers often differ.
If we work out how these metrics relate to each other, we can tell how much a design change affects the performance the user perceives.
CPU Time = CPU Clock Cycles × Clock Cycle Time = CPU Clock Cycles / Clock Rate
• Reducing the number of clock cycles, shortening the clock cycle time, or raising the clock rate decreases CPU time (that is, performance improves)!
However, reducing one of these factors often increases another, so be careful.
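A small Python sketch makes both the formula and the trade-off concrete. All numbers here are assumptions for illustration only (a workload of 20 billion cycles, clock rates of 2 GHz and 4 GHz, and a faster design that costs 20% more cycles); none of them come from the lecture.

```python
def cpu_time_seconds(clock_cycles: int, clock_rate_hz: float) -> float:
    """CPU Time = CPU Clock Cycles / Clock Rate."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return clock_cycles / clock_rate_hz


# Hypothetical workload: 20 billion clock cycles.
base_cycles = 20_000_000_000

# Same cycle count, faster clock: CPU time goes down.
t_2ghz = cpu_time_seconds(base_cycles, 2e9)
t_4ghz = cpu_time_seconds(base_cycles, 4e9)

# Trade-off (assumed): the faster-clocked design needs 20% more cycles,
# so part of the gain from the higher clock rate is lost.
t_4ghz_more_cycles = cpu_time_seconds(base_cycles * 12 // 10, 4e9)

print(f"2 GHz: {t_2ghz:.1f} s")
print(f"4 GHz: {t_4ghz:.1f} s")
print(f"4 GHz with 1.2x cycles: {t_4ghz_more_cycles:.1f} s")
```

Doubling the clock rate halves CPU time only if the cycle count stays the same, which is why the factors have to be considered together.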

Instruction Performance

The equation above says nothing about the number of instructions the program needs.
But the compiler generates the instructions to execute and the computer has to execute them, so execution time depends on the instruction count.
The number of clock cycles needed to run a program is therefore as follows.
CPU Clock Cycles = Instruction Count × Cycles per Instruction
Cycles per Instruction (CPI) is taken as the average over all instructions the program executes.

Classical CPU Performance Equation

Applying this, we can restate CPU time.
CPU Time = Instruction Count × CPI × Clock Cycle Time = Instruction Count × CPI / Clock Rate
Q: Doesn't executing an instruction take just one cycle?
A: Not on today's computers. The number of hardware units an instruction passes through differs from instruction to instruction.
• So each instruction is given as many cycles as it needs.
That is, CPI is not necessarily 1; it can be several.
Performance can also vary with the compiler optimization level!

CPI Example

CPU Time(A) = Instruction Count × CPI(A) × Cycle Time(A) = I × 2.0 × 250 ps = I × 500 ps
CPU Time(B) = Instruction Count × CPI(B) × Cycle Time(B) = I × 1.2 × 500 ps = I × 600 ps
Speedup = CPU Time(B) / CPU Time(A) = (I × 600 ps) / (I × 500 ps) = 1.2
Don't be fooled by clock speed!
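The comparison can be reproduced with the classical equation in a few lines of Python. This is a sketch under stated assumptions: B's cycle time is taken as 500 ps, the value consistent with the stated result I × 600 ps, and the instruction count of one million is an arbitrary placeholder that cancels out of the ratio.

```python
def cpu_time_ps(instruction_count: int, cpi: float, cycle_time_ps: float) -> float:
    """CPU Time = Instruction Count x CPI x Clock Cycle Time (in picoseconds)."""
    return instruction_count * cpi * cycle_time_ps


# Arbitrary placeholder; the same program runs on both machines,
# so the instruction count cancels in the ratio.
instruction_count = 1_000_000

time_a = cpu_time_ps(instruction_count, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(instruction_count, cpi=1.2, cycle_time_ps=500)

# A's clock is twice as fast as B's, yet A is only about 1.2 times faster.
speedup = time_b / time_a
print(f"A is {speedup:.1f} times faster than B")
```

The clock of A runs at twice the rate of B, but its higher CPI eats most of that advantage, which is the point of the warning about clock speed.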

CPI in More Detail

\text{Clock Cycles} = \sum^n_{i=1}(\text{CPI}_i \times \text{Instruction Count}_i)

CPI Example

Conventional wisdom: smaller code runs faster.
• Sequence 1: IC = 5
  ◦ Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  ◦ Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
  ◦ Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  ◦ Avg. CPI = 9/6 = 1.5
Sequence 2 executes more instructions yet needs fewer clock cycles, so the smaller code is not the faster one here.
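The two sequences can be recomputed with a short Python sketch of the weighted sum. The per-class CPI values (1, 2, 3) and instruction counts are read off the arithmetic above; the class labels "A", "B", "C" and the function names are assumptions added for illustration.

```python
# CPI of each instruction class, as implied by the arithmetic above.
# The labels A, B, C are assumed names for the three classes.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts: dict) -> int:
    """Clock Cycles = sum over classes of CPI_i x Instruction Count_i."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts: dict) -> float:
    """Average CPI = Clock Cycles / total Instruction Count."""
    return clock_cycles(counts) / sum(counts.values())


sequences = {
    "Sequence 1": {"A": 2, "B": 1, "C": 2},  # IC = 5
    "Sequence 2": {"A": 4, "B": 1, "C": 1},  # IC = 6
}

for name, counts in sequences.items():
    ic = sum(counts.values())
    print(f"{name}: IC={ic}, cycles={clock_cycles(counts)}, "
          f"avg CPI={average_cpi(counts):.1f}")
```

With the same clock on both, the cycle count decides the winner, so Sequence 2 finishes first despite its larger instruction count.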

Performance Summary

• Performance depends on
  ◦ Algorithm: affects IC, and possibly CPI
  ◦ Programming language: affects IC, CPI
  ◦ Compiler: affects IC, CPI
  ◦ Instruction set architecture: affects IC, CPI, T(c) (clock cycle time)