[python] Data Preprocessing - Data Transformation with apply and map

by ์Šค๋‹ 2022. 7. 27.

Data Transformation with apply and map

import pandas as pd

df = pd.DataFrame({'a' : [1, 2, 3, 4, 5]})

- Problem: add a column b that holds 'less than 2' when a is below 2, 'less than 4' when a is below 4, and '4 or more' when a is 4 or greater

df

df['b'] = ''  # placeholder string column; the labels assigned below are strings
df

a = df[df['a'] < 2]
a

df.loc[a.index, 'b'] = 'less than 2'  # .loc writes to df directly, avoiding chained assignment
df

a = df[(df['a'] >= 2) & (df['a'] < 4)]
a

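# Chained indexing (df['b'][a.index] = ...) raises SettingWithCopyWarning and may not update df.
# Assigning through .loc avoids the warning without silencing it via
# pd.set_option('mode.chained_assignment', None).
df.loc[a.index, 'b'] = 'less than 4'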
df

a = df[df['a'] >= 4]
df.loc[a.index, 'b'] = '4 or more'
df
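
The step-by-step approach above can be condensed into a short sketch. It assumes the same `df` and uses boolean masks with `.loc`, so each assignment writes to the original frame. The English labels are stand-ins for the post's label strings.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5]})

# Start with an empty string column so the string labels match its dtype.
df['b'] = ''

# Each boolean mask selects the rows for one label; .loc writes into df itself.
df.loc[df['a'] < 2, 'b'] = 'less than 2'
df.loc[(df['a'] >= 2) & (df['a'] < 4), 'b'] = 'less than 4'
df.loc[df['a'] >= 4, 'b'] = '4 or more'

print(df)
```

Writing one line per condition keeps the logic visible, but it grows with every new case, which is the motivation for the function-based approach that follows.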

1. Solving it with a function + apply

  • apply()
  1. Write a function that takes one value of the column as input
  2. Inside the function, write the logic you want applied to the chosen column
  3. Call apply on that column of the df, passing the function written above
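def case_function(x):
    # x is a single value from column 'a'
    if x < 2:
        return 'less than 2'
    elif x < 4:
        return 'less than 4'
    else:
        return '4 or more'
# apply calls case_function once per value and collects the results
df['c'] = df['a'].apply(case_function)
df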
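
`apply` is also available on the DataFrame itself. As a rough sketch, assuming you want the function to receive a whole row instead of a single value, `axis=1` can be passed. The helper name `label_row` is hypothetical and does not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5]})

def label_row(row):
    # Hypothetical helper: receives one row (a Series) and reads column 'a'.
    if row['a'] < 2:
        return 'less than 2'
    elif row['a'] < 4:
        return 'less than 4'
    return '4 or more'

# axis=1 passes each row to the function; the result is a Series aligned to df.
df['c'] = df.apply(label_row, axis=1)
print(df)
```

The row-wise form is mainly useful when the label depends on more than one column. For a single column, `df['a'].apply(...)` as shown above is simpler.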

- Problem: create a column d that holds 'one' when a is 1, 'two' when a is 2, 'three' when a is 3, 'four' when a is 4, and 'five' when a is 5

df

  • Solution using a user-defined function
def function(x):
    if x == 1:
        return 'one'
    elif x == 2:
        return 'two'
    elif x == 3:
        return 'three'
    elif x == 4:
        return 'four'
    elif x == 5:
        return 'five'
df['d'] = df['a'].apply(function)
df

2. Solving it with map

  • map() : converts many values to another form in one call. Python's built-in map() is mostly used on a list or tuple of values; pandas Series.map(), used below, does the same for each value of a column and accepts a dict or a function
a = { 1 : 'one', 2 : 'two', 3 : 'three', 4 : 'four', 5 : 'five'}
df['e'] = df['a'].map(a)
df
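
To make the difference concrete, here is a small sketch comparing the built-in `map` on a list with `Series.map` using a dict. The sample values are made up for illustration. Values with no matching key in the dict come back as NaN.

```python
import pandas as pd

# Built-in map: applies a function to every item of a list (or tuple).
squares = list(map(lambda x: x * x, [1, 2, 3]))

# Series.map with a dict: each value is looked up as a key.
s = pd.Series([1, 2, 6])
words = s.map({1: 'one', 2: 'two'})  # 6 has no key, so it becomes NaN

print(squares)
print(words)
```

Compared with the if/elif function passed to `apply`, the dict keeps a one-to-one lookup in a single line.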
