Data Preprocessing
1) Creating a DataFrame
- Creating a DataFrame from a dict
import pandas as pd
df = pd.DataFrame({'a' : [1, 2, 3], 'b' : [4, 5, 6], 'c' : [7, 8, 9]})
type(df)
pandas.core.frame.DataFrame
df
dummy = {'a': [1, 2, 3], 'b' : [4, 5, 6], 'c' : [7, 8, 9]}
df2 = pd.DataFrame(dummy)
df2
- Creating a DataFrame from a list
a = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
df3 = pd.DataFrame(a)
df3
df3.columns = ['a', 'b', 'c']
df3
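The column labels can also be supplied when the frame is built, instead of being assigned afterwards. A minimal sketch, assuming the same nested list as above; the names `rows` and `df_named` are introduced here only for illustration.

```python
import pandas as pd

# Each inner list is one row of the frame.
rows = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# Pass the labels through the `columns` parameter of the constructor.
df_named = pd.DataFrame(rows, columns=['a', 'b', 'c'])

print(df_named.columns.tolist())  # ['a', 'b', 'c']
print(df_named['a'].tolist())     # [1, 2, 3]
```

This gives the same result as building the frame first and then assigning to `.columns`, in one step.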
- Exercise: build a DataFrame like the table below
a = {'company' : ['abc', '회사', 123], '직원수' : [400, 10, 6]}  # '회사' = "company", '직원수' = "number of employees"
df4 = pd.DataFrame(a)
df4
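- Exercise: build a DataFrame like the table below
# Attempt 1 fails with NameError: NaN is not a built-in Python name
a = {'company' : ['abc', '회사', 123], '직원수' : [400, 10, 6], '위치' : ['Seoul', NaN, 'Busan']}
# Attempt 2 fails with SyntaxError: a list element cannot be left empty
a = {'company' : ['abc', '회사', 123], '직원수' : [400, 10, 6], '위치' : ['Seoul', , 'Busan']}
- Solving it with numpy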
import numpy as np
# np.nan marks the missing value (the np.NaN alias was removed in NumPy 2.0); '위치' = "location"
a = {'company' : ['abc', '회사', 123], '직원수' : [400, 10, 6], '위치' : ['Seoul', np.nan, 'Busan']}
df5 = pd.DataFrame(a)
df5
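Once a frame contains np.nan, the missing cells can be located with `isna()`. A minimal sketch on a reduced, hypothetical version of the data above (the names `data` and `df_missing` and the value 'xyz' are not from the original notes):

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame with one missing location.
data = {'company': ['abc', 'xyz'], 'location': ['Seoul', np.nan]}
df_missing = pd.DataFrame(data)

# isna() returns True wherever a value is missing.
mask = df_missing['location'].isna()
print(mask.tolist())    # [False, True]

# Summing the boolean mask counts the missing values.
print(int(mask.sum()))  # 1
```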
2) Getting / changing column names
- Creating the DataFrame
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b' : [4, 5, 6], 'c' : [7, 8, 9]})
df
- Getting the column names
df.columns
Index(['a', 'b', 'c'], dtype='object')
df.columns[1]
'b'
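Beyond positional indexing, the column Index can be converted to a plain list or searched by label. A small sketch, assuming the same three-column frame; `df_cols` and `names` are illustrative names only.

```python
import pandas as pd

df_cols = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# Convert the Index object to an ordinary Python list.
names = df_cols.columns.tolist()
print(names)                         # ['a', 'b', 'c']

# Membership test on the column labels.
print('b' in df_cols.columns)        # True

# Position of a label, the inverse of df_cols.columns[1].
print(df_cols.columns.get_loc('b'))  # 1
```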
- Exercise: rename the columns a, b, c to d, e, f
- Renaming columns by assignment
df.columns = ['d', 'e', 'f']
df
- Exercise: of the columns d, e, f, rename d to '디' and f to '에프'
df.columns = ['디', 'e', '에프']
df
- Renaming columns with rename
# Recreate the DataFrame
df = pd.DataFrame({'a': [1, 2, 3], 'b' : [4, 5, 6], 'c' : [7, 8, 9]})
df.columns = ['d', 'e', 'f']
df
df.rename(columns = {'d' : '디', 'f' : '에프'})
df
rename returned a DataFrame with the new column names, but df itself is unchanged because the result was not saved.
- Passing inplace = True saves the change to df itself
df.rename(columns = {'d' : '디', 'f' : '에프'}, inplace = True)
df
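An alternative to inplace = True is to keep the frame that rename returns. A minimal sketch, assuming a frame with columns d, e, f; the new labels 'x' and 'z' and the names `df_demo` and `renamed` are hypothetical stand-ins, not from the original notes.

```python
import pandas as pd

df_demo = pd.DataFrame({'d': [1, 2, 3], 'e': [4, 5, 6], 'f': [7, 8, 9]})

# rename returns a new DataFrame; store it to keep the change.
renamed = df_demo.rename(columns={'d': 'x', 'f': 'z'})

print(df_demo.columns.tolist())  # ['d', 'e', 'f']  (original untouched)
print(renamed.columns.tolist())  # ['x', 'e', 'z']
```

Writing `df_demo = df_demo.rename(...)` would overwrite the original name with the renamed frame, which has the same visible effect as inplace = True.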