Text Processing(4) - One-Page Summary

2019. 1. 23. 02:51ㆍUdacity Nanodegree/Natural Language Processing

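Text processing proceeds through the following steps:

Remove punctuation such as periods and commas
Split sentences into individual words (Tokenization)
Remove stop words
Stemming / Lemmatization - rather than choosing only one of the two, it is common to apply both, lemmatizing first and then stemming (Lemmatization → Stemming)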
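
The four steps above can be sketched as a single pipeline. This is only a minimal illustration, not the course's implementation: the stop-word set, the lemma dictionary, and the suffix list are tiny hypothetical stand-ins, and lowercasing during tokenization is an added assumption. A real project would more likely rely on a library such as NLTK for these resources.

```python
import re

# Hypothetical, deliberately tiny resources used only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "and", "of", "to", "in", "on"}
LEMMAS = {"mice": "mouse", "ran": "run", "children": "child"}
SUFFIXES = ("ing", "ed", "s")


def remove_punctuation(text):
    # Step 1: replace anything that is not a word character or whitespace with a space.
    return re.sub(r"[^\w\s]", " ", text)


def tokenize(text):
    # Step 2: split on whitespace (lowercasing is an assumption, not stated in the post).
    return text.lower().split()


def remove_stop_words(tokens):
    # Step 3: drop tokens that appear in the stop-word set.
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(token):
    # Step 4a: dictionary lookup; unknown words pass through unchanged.
    return LEMMAS.get(token, token)


def stem(token):
    # Step 4b: naive suffix stripping, keeping at least three characters of the word.
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    tokens = tokenize(remove_punctuation(text))
    tokens = remove_stop_words(tokens)
    # Lemmatization first, then stemming, matching the order given above.
    return [stem(lemmatize(t)) for t in tokens]


print(preprocess("The children were jumping, and the mice ran!"))
# → ['child', 'jump', 'mouse', 'run']
```

Lemmatizing before stemming lets irregular forms such as "mice" be mapped to a dictionary form first, after which the stemmer only has to handle regular suffixes.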

꾸무