
study📚/python24

[python] NumPy (3) NumPy 8. Subsetting, Slicing 8.1 Subsetting a array([5, 7, 9]) a[2] 9 8.2 Slicing a[0:2] array([5, 7]) b array([[1.5, 2. , 3. ], [4. , 5. , 6. ]]) b[0:2, 1] array([2., 5.]) c array([[[1.5, 2. , 3. ], [4. , 5. , 6. ]], [[3. , 2. , 1. ], [4. , 5. , 6. ]]]) c[1, :] array([[3., 2., 1.], [4., 5., 6.]]) # reverse the order a[::-1] array([9, 7, 5]) 8.3 Boolean indexing a array([5, 6, 9]) a[a (2, 3) # swap rows and columns i =.. 2022. 8. 22.
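The subsetting, slicing, and boolean-indexing steps in this excerpt can be sketched as one runnable snippet. The arrays reuse the values shown above; the mask condition `a > 6` is an assumed example, since the excerpt's own condition is cut off.

```python
import numpy as np

# Arrays chosen to match the values shown in the excerpt.
a = np.array([5, 7, 9])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

third = a[2]           # subsetting: the single element at index 2
first_two = a[0:2]     # slicing: elements at index 0 and 1
col_1 = b[0:2, 1]      # rows 0 and 1, column 1
reversed_a = a[::-1]   # a step of -1 walks the array backwards

# Boolean indexing: keep only elements where the mask is True.
# The condition (a > 6) is an assumption for illustration.
mask = a > 6
selected = a[mask]

print(third, first_two, col_1, reversed_a, selected)
```

Basic slices such as `a[0:2]` and `a[::-1]` return views of the original data, while boolean indexing returns a copy.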
[python] NumPy (2) NumPy 4. Inspecting your array import numpy as np a = np.array([1, 2, 3]) b = np.array([(1.5,2, 3), (4, 5, 6)], dtype=float) c = np.array([[(1.5, 2, 3), (4,5,6)], [(3,2,1), (4,5,6)]], dtype=float) e = np.full((2, 2), 7) f = np.eye(2) .shape: check the array's shape a.shape (3,) len(): check the array's length len(a) 3 b array([[1.5, 2. , 3. ], [4. , 5. , 6. ]]) ndim: check the number of dimensions b.ndim 2 e array([[7, 7], [7, 7]]) .size .. 2022. 8. 16.
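As a minimal sketch, the inspection attributes named in this excerpt (`.shape`, `len()`, `.ndim`, `.size`) can be gathered in one place. The helper name `describe` is hypothetical and not part of NumPy.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)


def describe(arr):
    """Hypothetical helper: collect the basic structural facts of an array."""
    return {
        "shape": arr.shape,  # length of each axis, as a tuple
        "len": len(arr),     # length of the first axis only
        "ndim": arr.ndim,    # number of axes
        "size": arr.size,    # total number of elements
    }


for name, arr in (("a", a), ("b", b), ("c", c)):
    print(name, describe(arr))
```

Note that `len()` reports only the first axis, so for the three-dimensional `c` it differs from `.size`.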
[python] Pandas-Profiling Pandas Profiling To get good machine learning results, you first need to understand the nature of the data. This step examines the distribution of values in the data, the relationships between variables, and whether missing values such as nulls are present; this process of getting to know the data is called EDA (Exploratory Data Analysis). To save the time spent on exploratory data analysis, Pandas-Profiling lets you check many summary statistics with a few lines of code. Install the package with pip pip install -U pandas-profiling Loading the data im.. 2022. 8. 5.
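For comparison, the three checks this excerpt lists (value distribution, relationships between variables, missing values) can be done by hand in plain pandas. This is a sketch of the manual approach, not the Pandas-Profiling API; the table and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small made-up table; the column names are hypothetical.
df = pd.DataFrame(
    {
        "height": [170.0, 165.0, np.nan, 180.0],
        "weight": [65.0, 58.0, 72.0, np.nan],
    }
)

summary = df.describe()      # count, mean, std, min, quartiles, max per column
missing = df.isnull().sum()  # number of missing values per column
correlation = df.corr()      # pairwise Pearson correlation between columns

print(summary)
print(missing)
print(correlation)
```

A profiling report bundles checks of this kind into a single document, which is where the time saving described above comes from.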
[python] Matplotlib (3) - box plot, histogram, pie chart, 3D plotting Matplotlib Box plot ax.boxplot() One way of representing numeric data. A box plot is generally drawn from five summary values obtained from the whole dataset: minimum, first quartile (Q1), second quartile or median (Q2), third quartile (Q3), maximum import matplotlib.pyplot as plt import seaborn as sns sns.set(rc={'figure.figsize':(10, 5)}) import pandas as pd # prepare the data r1 = np.random.normal(loc=0, scale=0.5, size=100) r2 = np.random.normal(loc=0.5, scale=1, size=100) r3 = np.random.. 2022. 8. 4.
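The five summary values behind a box plot can be computed directly with NumPy, as in the sketch below. The helper name `five_number_summary` and the random seed are assumptions for illustration; the sample mirrors `r1` from the excerpt.

```python
import numpy as np


def five_number_summary(values):
    """Hypothetical helper: minimum, Q1, median (Q2), Q3, maximum."""
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    return {
        "min": float(np.min(values)),
        "q1": float(q1),
        "median": float(q2),
        "q3": float(q3),
        "max": float(np.max(values)),
    }


# Same distribution parameters as r1 in the excerpt; the seed is assumed.
rng = np.random.default_rng(0)
r1 = rng.normal(loc=0, scale=0.5, size=100)

print(five_number_summary(r1))
```

With Matplotlib's default settings, the whiskers of `ax.boxplot()` stop at the most extreme points within 1.5 times the interquartile range of the box and anything beyond is drawn as an outlier, so the whisker ends match the minimum and maximum only when no point lies outside that range.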