Data Visualization with Matplotlib

Matplotlib Data Visualization Graphs

Line plot

Line Styles

Symbol	Option
-	Solid line
--	Dashed line
-.	Dash-dot line
:	Dotted line
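
As a minimal sketch of the table above, the four line style symbols can be drawn side by side. The vertical offsets and the Agg backend are assumptions added only to keep the lines apart and to let the snippet run without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
styles = ['-', '--', '-.', ':']  # the four symbols from the table

fig, ax = plt.subplots()
for offset, style in enumerate(styles):
    # Shift each line upward so the styles do not overlap.
    ax.plot(x, x + offset * 3, linestyle=style, label=repr(style))
ax.legend(loc='upper left')
```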

Markers

Symbol	Meaning	Symbol	Meaning
.	Point	,	Pixel
o	Circle	s	Square
v,<,^,>	Triangle	1,2,3,4	Tri (three-pronged) marker
p	Pentagon	H,h	Hexagon
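
To see what each marker symbol looks like, one option is to plot a single point per marker in a row. This is only a sketch; the marker size, color, and figure size are arbitrary choices, not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt

# Marker symbols listed in the table above.
markers = ['.', ',', 'o', 's', 'v', '<', '^', '>',
           '1', '2', '3', '4', 'p', 'H', 'h']

fig, ax = plt.subplots(figsize=(10, 2))
for i, m in enumerate(markers):
    # One point per marker; linestyle='None' draws the marker only.
    ax.plot(i, 0, marker=m, markersize=12, linestyle='None', color='black')

# Label each position with the symbol that produced it.
ax.set_xticks(range(len(markers)))
ax.set_xticklabels(markers)
ax.set_yticks([])
```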

loc: legend location options

String	Code	String	Code
'best'	0	'center left'	6
'upper right'	1	'center right'	7
'upper left'	2	'lower center'	8
'lower left'	3	'upper center'	9
'lower right'	4	'center'	10
'right'	5
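
The string and the numeric code in the table are interchangeable when passed to `loc`. A small sketch, assuming two side-by-side axes purely for comparison, places the legend in the same corner both ways:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
fig, (ax_str, ax_code) = plt.subplots(1, 2, figsize=(8, 3))
for ax in (ax_str, ax_code):
    ax.plot(x, x, label='y=x')

# 'upper left' and 2 refer to the same legend position.
legend_str = ax_str.legend(loc='upper left')
legend_code = ax_code.legend(loc=2)
```

The string form is usually easier to read later, so the numeric codes are mostly useful for recognizing older code.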

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x = np.arange(10)
fig, ax = plt.subplots()
ax.plot(
    x, x, label='y=x',
    linestyle='-',
    marker='.',
    color='blue'
)
ax.plot(
    x, x**2, label='y=x^2',
    linestyle='-.',
    marker=',',
    color='red'
)
ax.set_xlabel("x")
ax.set_ylabel("y")

# Configure the legend.
ax.legend(
    loc='upper left',
    shadow=True,
    fancybox=True,
    borderpad=2
)

fig.savefig("plot.png")

Bar plot

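A bar graph (bar chart) is well suited to comparing several values.
A histogram is well suited to visualizing the distribution of numeric data by counting how many values fall into each interval (bin).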
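
The difference can be made concrete with `np.histogram`, which performs the same kind of binning that a histogram plot draws. This is an illustrative sketch; the sample values are made up for the example.

```python
import numpy as np

# Hypothetical sample: nine observations between 1 and 5.
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Split the value range into 4 equal-width bins and count values per bin.
counts, edges = np.histogram(values, bins=4)

# A bar chart would plot one bar per given value (nine bars here);
# a histogram plots one bar per bin (four bars, with heights from counts).
print(counts)
print(edges)
```

The last bin includes its right edge, so the value 5 is counted together with the two 4s.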

x = np.arange(10)
fig, ax = plt.subplots(figsize=(12, 4))  # figure 12 wide and 4 tall (inches)
ax.bar(x, x*2)


# Stacked bar chart
# Draw 3 random values between 0 and 1 for each series
x = np.random.rand(3)
y = np.random.rand(3)
z = np.random.rand(3)
data = [x, y, z]
fig, ax = plt.subplots()

# Array [0, 1, 2] used for the x-axis positions
x_ax = np.arange(3)

# Draw the x, y, z series in order.
# Setting bottom to the sum of the previous series stacks the bars.
for i in x_ax:
    ax.bar(x_ax, data[i],
           bottom=np.sum(data[:i], axis=0))

ax.set_xticks(x_ax)
ax.set_xticklabels(['A', 'B', 'C'])
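
The `bottom` values in the loop above are running totals of the earlier series. As an alternative sketch (not the post's method), they can be computed once with `np.cumsum`; the fixed numbers below replace the random data only so the result is easy to check.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical fixed data: 3 series (rows) over 3 categories (columns).
data = np.array([[1.0, 2.0, 3.0],
                 [2.0, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])
x_ax = np.arange(3)

# Bottom of series k is the sum of series 0..k-1; the first series starts at 0.
bottoms = np.vstack([np.zeros(3), np.cumsum(data, axis=0)[:-1]])

fig, ax = plt.subplots()
for row, bottom in zip(data, bottoms):
    ax.bar(x_ax, row, bottom=bottom)
ax.set_xticks(x_ax)
ax.set_xticklabels(['A', 'B', 'C'])
```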

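# Histogram (frequency distribution)
fig, ax = plt.subplots()
sample = np.random.randn(1000)  # sample data, assumed here for illustration
ax.hist(sample, bins=50)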


#### Practice example
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

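# Set a font that supports Korean text
import matplotlib.font_manager as fm
fname = './NanumBarunGothic.ttf'
fm.fontManager.addfont(fname)  # register the font file so it can be found by name
font = fm.FontProperties(fname=fname).get_name()
plt.rcParams["font.family"] = font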

# Data set (labels: soccer, baseball, basketball, badminton, table tennis)
x = np.array(["축구", "야구", "농구", "배드민턴", "탁구"])
y = np.array([13, 10, 17, 8, 7])
z = np.random.randn(1000)

fig, axes = plt.subplots(1, 2, figsize=(8, 4))

# Bar chart
axes[0].bar(x, y)
# Histogram
axes[1].hist(z, bins=200)

fig.savefig("plot.png")

Matplotlib with Pandas

Pass DataFrame columns (Series) wherever x and y would normally go.
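
A self-contained sketch of the same idea, using a small in-memory DataFrame instead of a CSV file. The column names `col1` and `col2` and their values are placeholders for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for a CSV file.
df = pd.DataFrame({"col1": [0, 1, 2, 3, 4],
                   "col2": [0, 1, 4, 9, 16]})

fig, ax = plt.subplots()
# Each column is a pandas Series and can be passed directly as x or y.
ax.plot(df["col1"], df["col2"], label="test plot")
ax.set_xlabel("x axis")
ax.set_ylabel("y axis")
ax.legend()
```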

df = pd.read_csv("./test.csv")
fig, ax = plt.subplots()
ax.plot(df["col1"], df["col2"], label="test plot")
ax.set_xlabel("x axis")
ax.set_ylabel("y axis")

df = pd.read_csv("./data/pokemon.csv")

# Select rows where Type 1 or Type 2 is Fire (and likewise for Water)
fire = df[(df['Type 1'] == 'Fire') | (df['Type 2'] == 'Fire')]
water = df[(df['Type 1'] == 'Water') | (df['Type 2'] == 'Water')]
fig, ax = plt.subplots()
# Fire in red, Water in blue; marker size 50
ax.scatter(fire['Attack'], fire['Defense'], color='red', label='Fire', marker='*', s=50)
ax.scatter(water['Attack'], water['Defense'], color='blue', label='Water', marker='*', s=50)
ax.set_xlabel('Attack')
ax.set_ylabel('Defense')
ax.legend(loc='upper right')
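
The row selection above relies on boolean masks combined with `|`. The sketch below shows how that filter behaves on a tiny made-up table; the rows, names, and numbers are hypothetical and only mirror the column layout assumed for the Pokemon CSV.

```python
import pandas as pd

# Hypothetical rows; None stands for a missing second type.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type 1": ["Fire", "Water", "Grass", "Bug"],
    "Type 2": [None, "Fire", "Water", None],
    "Attack": [52, 65, 49, 30],
    "Defense": [43, 60, 49, 35],
})

# Each comparison yields a boolean Series; | keeps rows where either is True.
# The parentheses are required because | binds tighter than ==.
fire = df[(df["Type 1"] == "Fire") | (df["Type 2"] == "Fire")]
water = df[(df["Type 1"] == "Water") | (df["Type 2"] == "Water")]

print(fire["Name"].tolist())
print(water["Name"].tolist())
```

A row with both types matching different filters (row B here) appears in both subsets, so it would be drawn twice in the scatter plot.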
