Posts in the 'Machine Learning' category

[DL] Fixing Keras (TensorFlow) errors

Machine Learning 2025. 1. 15. 11:40 |

Let's look at a few Keras errors and how to fix them.

1)

dense = keras.layers.Dense(10, activation='softmax', input_shape=(784, ))

Running the statement above prints the following warning.

UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.

It is only a warning, so it can be ignored for now.

model = keras.Sequential(dense)
Next, the statement above raises the following error.

TypeError: 'Dense' object is not iterable
Sequential expects a list of layers, so wrapping the layer in a list fixes it.
model = keras.Sequential([dense])

Alternatively, writing it as follows from the start avoids both the warning and the error.
model = keras.Sequential([keras.Input(shape=(784, )), keras.layers.Dense(10, activation='softmax')])
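The TypeError above is what one would expect from a container that simply loops over its argument. The toy classes below are a stand-in written for illustration only, not Keras code; ToyLayer and ToySequential are hypothetical names.

```python
class ToyLayer:
    """Hypothetical stand-in for a layer; it defines no __iter__."""

    def __init__(self, units):
        self.units = units


class ToySequential:
    """Hypothetical stand-in that iterates over `layers` like a list-taking container."""

    def __init__(self, layers=()):
        self.layers = []
        for layer in layers:  # raises TypeError if `layers` is a single non-iterable object
            self.layers.append(layer)


dense = ToyLayer(10)

try:
    ToySequential(dense)  # a bare layer is not iterable
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

model = ToySequential([dense])  # a one-element list works
print(len(model.layers))
```

Passing a list, even with a single element, gives the loop something to iterate over, which mirrors the fix shown above.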

2)

model.compile(loss='sparse_categorical_crossentropy', metrics='accuracy')

Running the statement above raises the following error.

ValueError: Expected `metrics` argument to be a list, tuple, or dict. Received instead: metrics=accuracy of type <class 'str'>

Change it as follows to fix it.
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])
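When the metric name comes from a variable or a config value, a small guard can wrap a lone string before it reaches compile(). This is only a sketch; as_metric_list is a hypothetical helper, not part of Keras.

```python
def as_metric_list(metrics):
    """Hypothetical helper: return metrics in a form that is a list, tuple, or dict."""
    if metrics is None:
        return []
    if isinstance(metrics, str):
        return [metrics]  # wrap a single metric name in a list
    return metrics  # lists, tuples, and dicts pass through unchanged


print(as_metric_list('accuracy'))
print(as_metric_list(['accuracy']))
```

The result could then be passed as the metrics argument, for example metrics=as_metric_list('accuracy').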

※ Reference

혼자 공부하는 머신러닝 + 딥러닝

License: Attribution, NonCommercial, NoDerivatives


Posted by J-sean


[ML] MNIST pandas

Machine Learning 2024. 12. 21. 17:24 |

Let's read the MNIST data with pandas and print it.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#from sklearn.datasets import fetch_openml
 
#mnist = fetch_openml('mnist_784', as_frame=False)
#X, y = mnist.data, mnist.target
 
np.set_printoptions(linewidth=np.inf)
 
mnist = pd.read_csv("mnist_784.csv")
print("■ First 5 Data:")
print(mnist.iloc[0:5, 0:-1])
print("■ First 5 Targets:")
print(mnist.iloc[0:5, -1])
 
FirstImage = mnist.iloc[0, 0:-1].to_numpy().reshape(28, 28)
# to_numpy(): return a NumPy representation of the Series;
#             the axis labels are dropped.
 
print("■ First Image:\n", FirstImage)
 
plt.imshow(FirstImage, cmap="binary")
plt.axis("off")
plt.show()
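To make the reshape(28, 28) step concrete, here is a plain-Python sketch of the same row-major regrouping. reshape_rows is a hypothetical helper and the input is a dummy list of 784 indices rather than real pixel values.

```python
def reshape_rows(flat, width):
    """Hypothetical helper: split a flat sequence into rows of `width` items (row-major)."""
    if len(flat) % width != 0:
        raise ValueError("length must be a multiple of width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


flat = list(range(784))          # dummy data: each value is its own flat index
image = reshape_rows(flat, 28)

print(len(image), len(image[0]))  # number of rows, number of columns
print(image[1][0])                # row 1, column 0 holds flat index 28
```

Each row takes the next 28 values in order, which is why pixel (row, column) sits at flat index row * 28 + column.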

 


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
 
np.set_printoptions(linewidth=np.inf)
 
mnist = pd.read_csv("mnist_784.csv")
X = mnist.iloc[:, :-1].to_numpy().reshape(-1, 28, 28)
y = mnist.iloc[:, -1].to_numpy()
 
print(X[0])
print("Target: ", y[0])

 

Error: the number of classes has to be greater than one; got 1 class


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
 
np.set_printoptions(linewidth=np.inf)
 
mnist = pd.read_csv("mnist_784.csv")
X = mnist.iloc[:, :-1].to_numpy()
y = mnist.iloc[:, -1].to_numpy().astype('str')
# Without .astype('str'), y holds numeric labels. The comparisons with the
# string '5' below (y_train == '5', y_test == '5') are then False for every
# sample, so the target has only the single class False and fit() fails with:
# The number of classes has to be greater than one; got 1 class
# Alternatively, drop .astype('str') and compare with the integer 5 instead of '5'.
 
first_digit = X[0]
# Select the first sample.
 
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
y_train_5 = (y_train == '5')
y_test_5 = (y_test == '5')
 
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)
print(sgd_clf.predict([first_digit]))
# Check whether the first sample is a 5.

 

This code checks whether the first sample is a 5.

The prediction printed is True.
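The single-class error comes down to comparing numbers with a string. The sketch below reproduces the idea with plain Python lists standing in for the NumPy arrays; the label values are made up for illustration.

```python
labels = [5, 0, 4, 1, 9]  # numeric labels, as they would be without .astype('str')

wrong = [label == '5' for label in labels]       # int vs. str: never equal
right = [label == 5 for label in labels]         # int vs. int
as_text = [str(label) == '5' for label in labels]  # convert first, then compare with '5'

print(set(wrong))    # a single class
print(set(right))    # two classes
print(right == as_text)
```

A classifier needs at least two distinct target values, so the all-False target from the first comparison cannot be fitted.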


[Scraping] Sending exchange rate information by SMS

Machine Learning 2024. 1. 2. 19:04 |

Scrape the exchange rate from Naver and send it by SMS using IFTTT.


from bs4 import BeautifulSoup as bs
import urllib.request as req
import requests
 
url = 'https://finance.naver.com/marketindex/' # Market indicators: Naver Finance
res = req.urlopen(url)
 
soup = bs(res, 'html.parser')
title = soup.select_one('a.head > h3.h_lst').string
rate = soup.select_one('div.head_info > span.value').string
 
# IFTTT Platform
# To trigger an Event make a POST or GET web request to
# (replace 이벤트_입력 with your event name and 키_입력 with your key):
trigger_url = 'https://maker.ifttt.com/trigger/이벤트_입력/with/key/키_입력'
# With an optional JSON body of:
result = requests.post(trigger_url, data = { "value1" : title, "value2" : rate, "value3" : 'Sean' })
# The data is completely optional, and you can also pass value1, value2, and value3 as query parameters
# or form variables. This content will be passed on to the Action in your Recipe.
print('Result:', result, result.status_code, result.reason)
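The trigger URL and the three values can be assembled separately from the network call, which makes them easy to check before sending anything. This is a minimal sketch: build_trigger_url and build_payload are hypothetical helpers, and the event name, key, and rate below are placeholders, not real values.

```python
from urllib.parse import quote


def build_trigger_url(event, key):
    """Hypothetical helper: fill the event name and key into the URL pattern used above."""
    return "https://maker.ifttt.com/trigger/{}/with/key/{}".format(
        quote(event, safe=""), quote(key, safe="")
    )


def build_payload(title, rate, sender):
    """Hypothetical helper: map the scraped fields to value1, value2, and value3."""
    return {"value1": title, "value2": rate, "value3": sender}


trigger_url = build_trigger_url("exchange_rate", "MY_KEY")  # placeholder event and key
payload = build_payload("USD", "1,300.00", "Sean")          # placeholder title and rate

print(trigger_url)
print(payload)
```

The resulting URL and dictionary could then be handed to requests.post(trigger_url, data=payload) as in the script above.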

 


[Scraping] Exchange rate information

Machine Learning 2024. 1. 2. 19:03 |

Let's scrape exchange rate information.

Pressing F12 in Chrome opens the developer tools, which show the page's markup. It looks like the data can be picked out by tag and class name.


import urllib.request
from bs4 import BeautifulSoup as bs
 
url = "https://m.stock.naver.com/marketindex/home/exchangeRate/exchange"
with urllib.request.urlopen(url) as response:
    html = response.read().decode('utf-8')
 
soup = bs(html, 'html.parser')
 
all_countries = soup.find_all('strong', 'MainListItem_name__2Nl6J')
all_rates = soup.find_all('span', 'MainListItem_price__dP8R6')
 
for country, rate in zip(all_countries, all_rates):
    print(country.string + ': ', rate.string)
 
#for i, c in enumerate(all_countries):
#    print(i+1, c.string)
 
#for i, r in enumerate(all_rates):
#    print(i+1, r.string)
 
#print(soup.find('strong', 'MainListItem_name__2Nl6J').string)
#print(soup.find('span', 'MainListItem_price__dP8R6').string)

 

Several elements share the same class name. Find them all and print them.
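The same tag-plus-class lookup can be tried offline with only the standard library. The sketch below swaps BeautifulSoup for html.parser.HTMLParser and runs on a made-up HTML snippet; the class names, currencies, and prices in it are invented for illustration, and ClassTextCollector and collect are hypothetical names.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text of every <tag> element whose class list contains class_name."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self._inside = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self._inside = True
            self.texts.append("")  # start a new text entry for this element

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.texts[-1] += data


def collect(html, tag, class_name):
    parser = ClassTextCollector(tag, class_name)
    parser.feed(html)
    return parser.texts


# Invented sample markup, not the real page.
SAMPLE_HTML = """
<ul>
  <li><strong class="name">USD</strong><span class="price">1,300.50</span></li>
  <li><strong class="name">JPY</strong><span class="price">900.10</span></li>
</ul>
"""

names = collect(SAMPLE_HTML, "strong", "name")
prices = collect(SAMPLE_HTML, "span", "price")
for name, price in zip(names, prices):
    print(name + ":", price)
```

This simple collector does not handle nested elements of the same tag, which is one reason a full parser such as BeautifulSoup is more convenient on real pages.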


OCR with Tesseract on Windows

Machine Learning 2020. 10. 7. 21:41 |

Optical character recognition (OCR) converts images of handwritten or machine-printed text into machine-readable text. Let's use Tesseract, an OCR engine available for many operating systems, on Windows.

Go to UB Mannheim, which provides Windows builds of Tesseract, and download the installer for your platform.

Run the installer.

Click Additional language data (download).

Select Korean.

Install Python-tesseract (pytesseract). Python-tesseract is a wrapper for Google's Tesseract-OCR Engine.

Prepare an English image.

Prepare a Korean image.

from PIL import Image
import pytesseract
 
pytesseract.pytesseract.tesseract_cmd = r'D:\Program Files\Tesseract-OCR\tesseract'
 
print(pytesseract.image_to_string(Image.open('English.png')))
#print(pytesseract.image_to_string(Image.open('Korean.png'), lang = 'kor'))

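Read the English and Korean images and print the text.

Result for the English image.

Result for the Korean image.

The same results can be obtained from the console with the commands below.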

tesseract English.png stdout

tesseract Korean.png stdout -l kor

If you specify an output name other than stdout, the output is written to a text file with that name (result.txt in the example below).

tesseract English.png result
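The three console commands follow one pattern: image, output base, and an optional -l language. A small sketch of building that argument list in Python follows; build_tesseract_args is a hypothetical helper, and nothing is executed here.

```python
def build_tesseract_args(image, output="stdout", lang=None):
    """Hypothetical helper: build the argument list for the tesseract commands above."""
    args = ["tesseract", image, output]
    if lang:
        args += ["-l", lang]  # language option, e.g. kor
    return args


print(build_tesseract_args("English.png"))
print(build_tesseract_args("Korean.png", lang="kor"))
print(build_tesseract_args("English.png", output="result"))
```

Assuming tesseract is on the PATH, such a list could be passed to subprocess.run(args) to launch the same command from a script.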


CSV parsing

Machine Learning 2019. 1. 20. 12:48 |

Python's standard csv library can parse the CSV (comma-separated values) format.

import locale
import csv
 
lo = locale.getdefaultlocale()
# Tries to determine the default locale settings and returns them as a tuple of
# the form (language code, encoding).
print("Default language code: " + lo[0], "Default encoding: " + lo[1], sep = "\n", end = "\n\n")
 
filename = "list.csv"
with open(filename, "rt", encoding="euc_kr") as f:
    csv_data = f.read()
 
data = []
rows = csv_data.split("\n")
for row in rows:
    if row == "":
        continue
    cells = row.split(",")
    data.append(cells)
 
for c in data:
    print("%-8s %8s" %(c[1], c[2]))
 
print()
 
#Python csv library
with open(filename, "at", encoding="euc_kr") as f:
    # 'a' - open for writing, appending to the end of the file if it exists
    # For binary read-write access, the mode 'w+b' opens and truncates the file to 0 bytes.
    # 'r+b' opens the file without truncation.
    csv_writer = csv.writer(f, delimiter = ",", quotechar = '"')
    csv_writer.writerow(["101", "Math", "4300"])
    csv_writer.writerow(["102", "Physics", "4800"])    
    csv_writer.writerow(["103", "English", "5700"])
# To change the stream position, call seek() from the io module on the file object, e.g. f.seek(...).
# seek(offset[, whence])
# Change the stream position to the given byte offset. offset is interpreted relative to the
# position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:
# SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive
# SEEK_CUR or 1 – current stream position; offset may be negative
# SEEK_END or 2 – end of the stream; offset is usually negative
# Return the new absolute position.
 
with open(filename, "rt", encoding="euc_kr") as f:
    csv_reader = csv.reader(f, delimiter = ",", quotechar = '"')
    for cells in csv_reader:
        if cells == []:
            continue
        print("%-8s %8s" %(cells[1], cells[2]))

list.csv:

Result:

list.csv after running:
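The script above parses the file twice, once with split(",") and once with csv.reader. The in-memory sketch below shows where the two approaches differ: a quoted field that contains a comma. The row with "Art, Modern" is invented for illustration, and io.StringIO stands in for the file.

```python
import csv
import io

# newline="" leaves line endings untouched, as the csv docs recommend for file objects.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter=",", quotechar='"')
writer.writerow(["101", "Math", "4300"])
writer.writerow(["104", "Art, Modern", "5100"])  # invented row with an embedded comma

buffer.seek(0)  # rewind before reading
rows = list(csv.reader(buffer, delimiter=",", quotechar='"'))

# Naive splitting breaks the quoted field into two pieces.
naive = buffer.getvalue().splitlines()[1].split(",")

print(rows[1])
print(naive)
```

For files without quoted fields both approaches agree, but csv.reader keeps working when a field contains the delimiter.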


JSON parsing

Machine Learning 2019. 1. 18. 20:14 |

Python's standard json library can parse the JSON (JavaScript Object Notation) format.

import urllib.request as req
import os.path
import json
 
url = "https://api.github.com/repositories"
filename = "repositories.json"
 
if not os.path.exists(filename):
    #req.urlretrieve(url, filename)
    # Legacy interface. It might become deprecated at some point in the future.
    with req.urlopen(url) as contents:
        jsn = contents.read().decode("utf-8") # Without .decode("utf-8"), jsn would hold bytes rather than str.
        # If the end of the file has been reached, read() will return an empty string ('').
        print(jsn)
        # print(json.dumps(jsn, indent="\t")) would not pretty-print the data: jsn is a str, so
        # dumps() emits it as one escaped JSON string. Use json.dumps(json.loads(jsn), indent="\t") instead.
        with open(filename, mode="wt", encoding="utf-8") as f:
            f.write(jsn)
 
with open(filename, mode="rt", encoding="utf-8") as f:
    items = json.load(f) # Pass a file object that contains a JSON document; loads() takes a JSON string instead.
    for item in items:
        print("Name:", item["name"], "Login:", item["owner"]["login"])
 
test = {
    "Date" : "2019-01-17",
    "Time" : "21:30:24",
    "Location" : {
        "Town" : "Franklin",
        "City" : "Newyork",
        "Country" : "USA"
        }
    }
 
s = json.dumps(test, indent="\t")
# Serialize obj to a JSON formatted str
# If indent is a non-negative integer or string, then JSON array elements and object members will be pretty-printed
# with that indent level. An indent level of 0, negative, or "" will only insert newlines. None (the default) selects
# the most compact representation. Using a positive integer indent indents that many spaces per level. If indent is a
# string (such as "\t"), that string is used to indent each level.
print(s)
 
with open("dump.json", mode="wt") as f:
    json.dump(test, f, indent="\t")
# Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object)

Beginning of the output.

End of the output.

repositories.json, created with write(), and dump.json, created with dump().
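Whether indent has any visible effect depends on what is handed to json.dumps(). The sketch below compares dumping the raw text with dumping the parsed object; the JSON document in it is a made-up sample, not data from the GitHub API.

```python
import json

raw = '{"name": "demo", "owner": {"login": "someone"}}'  # made-up sample document

# raw is a str, so dumps() produces a single escaped JSON string; indent changes nothing.
as_string = json.dumps(raw, indent="\t")

# Parsing first yields a dict, which dumps() can pretty-print with tabs.
as_object = json.dumps(json.loads(raw), indent="\t")

print(as_string)
print(as_object)
```

The first output is one quoted line with escaped quotes, while the second spans several tab-indented lines.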


Parsing XML with BeautifulSoup

Machine Learning 2019. 1. 15. 22:47 |

BeautifulSoup can be used to parse XML.

from bs4 import BeautifulSoup
import urllib.request as req
import os.path
 
url = "http://www.weather.go.kr/weather/forecast/mid-term-rss3.jsp?stnId=108"
# Korea Meteorological Administration (Weather Nuri) nationwide mid-term forecast RSS
filename = "forecast.xml"
 
if not os.path.exists(filename):
    #req.urlretrieve(url, filename)
    # Legacy interface. It might become deprecated at some point in the future.
    with req.urlopen(url) as contents:
        xml = contents.read().decode("utf-8")
        # If the end of the file has been reached, read() will return an empty string ('').
        #print(xml)
        with open(filename, mode="wt") as f:
            f.write(xml)
 
with open(filename, mode="rt") as f:
    xml = f.read()
    soup = BeautifulSoup(xml, "html.parser")
    # html.parser converts all tag names to lowercase.
    #print(soup)
 
    print("[", soup.find("title").string, "]")
    print(soup.find("wf").string, "\n")
 
    # Group locations by weather
    info = {}    # empty dictionary
    for location in soup.find_all("location"):
        name = location.find("city").string
        weather = location.find("wf").string
        if weather not in info:
            info[weather] = []    # empty list; a dictionary can hold lists as values
        info[weather].append(name)
 
    for weather in info.keys():    # Return a new view of the dictionary’s keys.
        print("■", weather)
        for name in info[weather]:
            print("|-", name)
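The grouping step can also be tried without a network connection or BeautifulSoup. The sketch below swaps in the standard library's xml.etree.ElementTree and a defaultdict, and runs on a made-up XML snippet; the element layout, cities, and forecasts are invented for illustration and only loosely follow the feed.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample data, not the real RSS feed.
SAMPLE_XML = """<body>
  <location><city>Seoul</city><data><wf>Cloudy</wf></data></location>
  <location><city>Busan</city><data><wf>Cloudy</wf></data></location>
  <location><city>Jeju</city><data><wf>Sunny</wf></data></location>
</body>"""

root = ET.fromstring(SAMPLE_XML)

info = defaultdict(list)  # weather -> list of city names
for location in root.iter("location"):
    city = location.findtext("city")
    weather = location.findtext(".//wf")  # first <wf> anywhere under this location
    info[weather].append(city)

for weather, cities in info.items():
    print("■", weather)
    for city in cities:
        print("|-", city)
```

Using defaultdict(list) removes the need to check whether a weather key already exists before appending to it.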


Software Engineer English & Software Engineering Blog - Sean
