
datapreprocessing (1)

[NLP] Natural Language Processing (NLP) - Word Tokenization

This post introduces how to encode words as tokens so that a computer can work with them.

Development environment
Python 3.10.16
tensorflow 2.16.1

Tokenization
Encoding language as numbers

    sentences = [
        'Today is a sunny day',
        'Today is a rainy day',
        'Is it sunny today?'
    ]
    tokenizer = Tokenizer(num_words=100)
    tokenizer.fit_on_texts(sentences)
    word_index = tokenizer.word_index
    print(word_index)

Sequences

    sentences = [ 'Today is a sunny day', 'Today is a rainy day', 'Is it sunny to..

2025. 2. 12.
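To make the idea of "encoding language as numbers" concrete, here is a minimal pure-Python sketch of roughly what a frequency-based word tokenizer does: lowercase the text, strip punctuation, give each word an index starting at 1 (most frequent first), and then turn each sentence into a list of those indices. The helper names `tokenize`, `build_word_index`, and `texts_to_sequences` are hypothetical and chosen only for this illustration; this is not the Keras implementation, and it ignores options such as `num_words` and out-of-vocabulary tokens.

```python
import string
from collections import Counter


def tokenize(sentence):
    """Lowercase, replace ASCII punctuation with spaces, split on whitespace.

    Simplification: every character in string.punctuation is treated as a
    separator, which may differ from a real library's default filter set.
    """
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return sentence.lower().translate(table).split()


def build_word_index(sentences):
    """Map each word to an integer index, starting at 1.

    Words are ordered by descending frequency; words with equal counts keep
    the order in which they were first seen.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(sentences, word_index):
    """Replace each known word with its index; unknown words are skipped."""
    return [
        [word_index[w] for w in tokenize(sentence) if w in word_index]
        for sentence in sentences
    ]


sentences = [
    "Today is a sunny day",
    "Today is a rainy day",
    "Is it sunny today?",
]

word_index = build_word_index(sentences)
sequences = texts_to_sequences(sentences, word_index)

print(word_index)
print(sequences)
```

Note that "Today" and "today?" end up with the same index because of the lowercasing and punctuation stripping, which is why the three sentences above produce a vocabulary of only seven words.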

